Operator Architecture

This document describes the architecture of the Nexus Kubernetes Operator.

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        EKS Cluster                               │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                nexus-system namespace                   │    │
│  │  ┌─────────────────────────────────────────────────┐    │    │
│  │  │              Nexus Operator                     │    │    │
│  │  │  - Watches NexusAICapability CRs                 │    │    │
│  │  │  - Provisions AWS resources                     │    │    │
│  │  │  - Deploys K8s workloads                        │    │    │
│  │  │  - Manages lifecycle (create/update/delete)     │    │    │
│  │  └─────────────────────────────────────────────────┘    │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │           {capability}-{env} namespace                   │    │
│  │  ┌──────────────┐    ┌──────────────┐                   │    │
│  │  │   Frontend   │    │   Backend    │                   │    │
│  │  │  Deployment  │    │  Deployment  │                   │    │
│  │  └──────────────┘    └──────────────┘                   │    │
│  │  ┌──────────────┐    ┌──────────────┐                   │    │
│  │  │   Service    │    │   Service    │                   │    │
│  │  │(LoadBalancer)│    │(LoadBalancer)│                   │    │
│  │  └──────────────┘    └──────────────┘                   │    │
│  └─────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                        AWS Services                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐        │
│  │ DynamoDB │  │    S3    │  │   Glue   │  │   IAM    │        │
│  │  Tables  │  │ Buckets  │  │ Database │  │  Roles   │        │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                      │
│  │   SSM    │  │ Secrets  │  │ Cognito  │                      │
│  │ Params   │  │ Manager  │  │ (opt)    │                      │
│  └──────────┘  └──────────┘  └──────────┘                      │
└─────────────────────────────────────────────────────────────────┘

Operator Components

Kubernetes Resources

Installing the operator creates these Kubernetes resources in the cluster:

Resource            Name                        Purpose
Namespace           nexus-system                Operator's home namespace
ServiceAccount      nexus-operator              Identity for operator with IRSA
ClusterRole         nexus-operator              Cluster-wide permissions
ClusterRoleBinding  nexus-operator              Binds role to service account
Deployment          nexus-operator              Operator pod deployment
Service             nexus-operator:8080         Health check endpoint
CRD                 nexuscapabilities.nexus.ai  Custom Resource Definition

AWS Resources

The operator installation creates these AWS resources:

Resource        Name                 Purpose
IAM Role        NexusAIOperatorRole  Operator's AWS permissions (IRSA)
ECR Repository  nexus-operator       Operator container image

Operator Flow

Phase 1: Operator Installation

sequenceDiagram
    participant User
    participant Script as Deploy Script
    participant K8s as Kubernetes API
    participant AWS as AWS Services

    User->>Script: ./build-and-deploy.sh
    Script->>AWS: Create ECR repo
    Script->>AWS: Push operator image
    Script->>K8s: Create namespace
    Script->>K8s: Apply CRD
    Script->>K8s: Create RBAC
    Script->>AWS: Create IAM role (IRSA)
    Script->>K8s: Deploy operator
    Script->>K8s: Verify health
    K8s-->>User: Operator ready

Phase 2: Capability Deployment

sequenceDiagram
    participant User
    participant K8s as Kubernetes API
    participant Op as Nexus Operator
    participant AWS as AWS Services

    User->>K8s: Apply NexusAICapability CR
    K8s->>Op: New CR detected
    Op->>AWS: Create DynamoDB tables
    Op->>AWS: Create S3 buckets
    Op->>AWS: Create Glue database
    Op->>AWS: Create SSM parameters
    Op->>AWS: Create Secrets
    Op->>AWS: Create application IAM role
    Op->>K8s: Create namespace
    Op->>K8s: Deploy backend
    Op->>K8s: Deploy frontend
    Op->>K8s: Update CR status
    K8s-->>User: Capability ready
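
The create sequence above can be sketched as an ordered pipeline. This is a minimal illustration under stated assumptions, not the operator's actual code: the step names mirror the diagram, while `run_create`, `CREATE_ORDER`, the stop-on-first-failure policy, and the status fields are hypothetical.

```python
from typing import Callable, Dict, List

# Order taken from the Phase 2 diagram: AWS resources first, then workloads.
CREATE_ORDER: List[str] = [
    "dynamodb", "s3", "glue", "ssm", "secrets", "iam",
    "namespace", "backend", "frontend",
]


def run_create(spec: Dict, actions: Dict[str, Callable[[Dict], None]]) -> Dict:
    """Run each provisioning step in order and return a status dict.

    Stops at the first failure so later steps never run against missing
    prerequisites (an assumed policy, chosen for illustration).
    """
    completed: List[str] = []
    for name in CREATE_ORDER:
        try:
            actions[name](spec)
        except Exception as exc:  # report which step failed
            return {"phase": "Failed", "failedStep": name,
                    "message": str(exc), "completed": completed}
        completed.append(name)
    return {"phase": "Ready", "completed": completed}


if __name__ == "__main__":
    # Stand-in actions that only record their name; real ones would call AWS/K8s.
    seen: List[str] = []
    fake = {n: (lambda spec, n=n: seen.append(n)) for n in CREATE_ORDER}
    print(run_create({"capability": "demo", "env": "dev"}, fake))
```

Passing the actions in as a dictionary keeps the ordering logic testable without AWS or cluster access.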

Phase 3: Capability Update

sequenceDiagram
    participant User
    participant K8s as Kubernetes API
    participant Op as Nexus Operator

    User->>K8s: Update NexusAICapability CR
    K8s->>Op: CR change detected
    Op->>Op: Calculate diff
    Op->>K8s: Rolling update deployments
    Op->>K8s: Update CR status
    K8s-->>User: Update complete
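
The "Calculate diff" step compares the previous and the new CR spec. The sketch below shows one way to do that with a recursive dictionary comparison. It is illustrative only: `diff_spec` and the spec fields in the usage example are hypothetical, and this page does not state whether the operator computes the diff itself or relies on its framework.

```python
from typing import Any, Dict, List, Tuple

# One change: (operation, path to the field, old value, new value)
Change = Tuple[str, Tuple[str, ...], Any, Any]


def diff_spec(old: Dict[str, Any], new: Dict[str, Any],
              path: Tuple[str, ...] = ()) -> List[Change]:
    """Return the changes needed to turn `old` into `new`."""
    changes: List[Change] = []
    for key in sorted(set(old) | set(new)):
        here = path + (key,)
        if key not in old:
            changes.append(("add", here, None, new[key]))
        elif key not in new:
            changes.append(("remove", here, old[key], None))
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            changes.extend(diff_spec(old[key], new[key], here))  # descend
        elif old[key] != new[key]:
            changes.append(("change", here, old[key], new[key]))
    return changes


if __name__ == "__main__":
    old = {"backend": {"image": "api:1.0", "replicas": 2}}
    new = {"backend": {"image": "api:1.1", "replicas": 2},
           "frontend": {"image": "web:1.0"}}
    for change in diff_spec(old, new):
        print(change)
```

An empty result means nothing changed, so the rolling update can be skipped.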

Phase 4: Capability Deletion

sequenceDiagram
    participant User
    participant K8s as Kubernetes API
    participant Op as Nexus Operator
    participant AWS as AWS Services

    User->>K8s: Delete NexusAICapability CR
    K8s->>Op: Deletion detected
    Op->>K8s: Delete K8s resources

    alt deletionPolicy: Delete
        Op->>AWS: Delete IAM roles
        Op->>AWS: Delete Secrets
        Op->>AWS: Delete SSM parameters
        Op->>AWS: Delete Glue resources
        Op->>AWS: Delete S3 buckets
        Op->>AWS: Delete DynamoDB tables
    else deletionPolicy: Retain
        Op->>Op: Skip AWS deletion
    end

    K8s-->>User: Deletion complete
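
The branch on `deletionPolicy` can be sketched as a function that plans the teardown steps. This is a hedged illustration: the AWS order is copied from the diagram, but `plan_deletion`, the location of `deletionPolicy` at the top level of the spec, and the default of `Retain` when the field is missing are all assumptions.

```python
from typing import Dict, List

K8S_STEPS: List[str] = ["k8s-resources"]

# AWS teardown order from the Phase 4 diagram (roughly the reverse of creation).
AWS_DELETE_ORDER: List[str] = [
    "iam", "secrets", "ssm", "glue", "s3", "dynamodb",
]


def plan_deletion(spec: Dict) -> List[str]:
    """Return the ordered teardown steps for a capability."""
    # Assumed default: keep AWS data unless deletion is requested explicitly.
    policy = spec.get("deletionPolicy", "Retain")
    if policy not in ("Delete", "Retain"):
        raise ValueError(f"unknown deletionPolicy: {policy!r}")
    steps = list(K8S_STEPS)  # Kubernetes resources are always removed
    if policy == "Delete":
        steps.extend(AWS_DELETE_ORDER)
    return steps


if __name__ == "__main__":
    print(plan_deletion({"deletionPolicy": "Delete"}))
    print(plan_deletion({"deletionPolicy": "Retain"}))
```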

Resource Provisioners

The operator uses a dedicated provisioner module for each resource type:

DynamoDB Provisioner

  • Creates transformation-system table
  • Creates license table
  • Creates wxcc-task-tracking table
  • Configures on-demand capacity
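
On-demand capacity corresponds to the `PAY_PER_REQUEST` billing mode in the DynamoDB `CreateTable` API. The sketch below builds the request arguments without calling AWS. The `{capability}-{env}-{table}` table name and the single string hash key `id` are assumptions made for illustration; the real naming scheme and key schema are not given on this page.

```python
from typing import Any, Dict


def table_request(capability: str, env: str, table: str,
                  hash_key: str = "id") -> Dict[str, Any]:
    """Build CreateTable arguments for an on-demand table."""
    return {
        "TableName": f"{capability}-{env}-{table}",  # assumed naming scheme
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": hash_key, "KeyType": "HASH"},
        ],
        # On-demand capacity: no ProvisionedThroughput block is needed.
        "BillingMode": "PAY_PER_REQUEST",
    }


if __name__ == "__main__":
    for name in ("transformation-system", "license", "wxcc-task-tracking"):
        print(table_request("demo", "dev", name)["TableName"])
```

With Boto3 the result could be passed as `boto3.client("dynamodb").create_table(**request)`.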

S3 Provisioner

  • Creates call-data bucket
  • Creates wxcc-simulator bucket
  • Creates journey-logs bucket
  • Creates journey-reports bucket
  • Configures encryption and versioning
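
Encryption and versioning are set with two separate S3 API calls after a bucket exists. The sketch below builds the arguments for Boto3's `put_bucket_encryption` and `put_bucket_versioning`. It assumes SSE-S3 (`AES256`) as the default encryption; the operator may use KMS instead, and `bucket_settings` is a hypothetical helper.

```python
from typing import Any, Dict


def bucket_settings(bucket: str) -> Dict[str, Dict[str, Any]]:
    """Map S3 client method names to the keyword arguments for each call."""
    return {
        "put_bucket_encryption": {
            "Bucket": bucket,
            "ServerSideEncryptionConfiguration": {
                "Rules": [{
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "AES256",  # assumed; could be aws:kms
                    },
                }],
            },
        },
        "put_bucket_versioning": {
            "Bucket": bucket,
            "VersioningConfiguration": {"Status": "Enabled"},
        },
    }


if __name__ == "__main__":
    for method, kwargs in bucket_settings("demo-dev-call-data").items():
        print(method, kwargs)
```

Given an S3 client `s3`, each entry could be applied with `getattr(s3, method)(**kwargs)`.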

Glue Provisioner

  • Creates analytics database
  • Creates wxcc_calls table
  • Configures crawler settings

SSM Provisioner

  • Creates configuration parameters
  • Follows naming convention: /{capability}/{env}/{category}/{key}
  • Stores non-sensitive configuration
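
The naming convention `/{capability}/{env}/{category}/{key}` can be captured in a small helper. This is a sketch: `parameter_name` is a hypothetical function, and the segment check reflects the characters SSM permits in parameter names (letters, digits, `_`, `.`, `-`), with `/` reserved for the hierarchy.

```python
import re

# Characters allowed in one path segment; "/" only separates levels.
_SEGMENT = re.compile(r"^[A-Za-z0-9_.-]+$")


def parameter_name(capability: str, env: str, category: str, key: str) -> str:
    """Build an SSM parameter path: /{capability}/{env}/{category}/{key}."""
    parts = (capability, env, category, key)
    for part in parts:
        if not _SEGMENT.match(part):
            raise ValueError(f"invalid path segment: {part!r}")
    return "/" + "/".join(parts)


if __name__ == "__main__":
    print(parameter_name("demo", "dev", "database", "table-name"))
```

Rejecting segments that contain `/` keeps every parameter exactly four levels deep, so a whole capability can be read back by path prefix.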

Secrets Provisioner

  • Creates API credentials
  • Creates webhook secrets
  • Stores sensitive data encrypted

IAM Provisioner

  • Creates application role with IRSA
  • Configures least-privilege policies
  • Sets up trust policy for EKS service account
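
A trust policy for IRSA lets one Kubernetes service account assume the role through the cluster's OIDC provider. The sketch below builds such a policy document in the standard form. `irsa_trust_policy` and the example values are hypothetical, and the `aws` partition in the ARN is an assumption.

```python
import json
from typing import Any, Dict


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> Dict[str, Any]:
    """Build an IAM trust policy bound to one service account.

    `oidc_provider` is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated":
                    f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}",
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Only this namespace/service account may assume the role.
                    f"{oidc_provider}:sub":
                        f"system:serviceaccount:{namespace}:{service_account}",
                    f"{oidc_provider}:aud": "sts.amazonaws.com",
                },
            },
        }],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        "123456789012",
        "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
        "demo-dev", "demo-backend",
    )
    print(json.dumps(policy, indent=2))
```

The `sub` condition is what limits the role to a single service account; without it, any service account in the cluster could assume the role.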

Cognito Provisioner (Optional)

  • Creates User Pool
  • Creates App Client with OAuth config
  • Creates Cognito domain
  • Sets up RBAC groups

Kubernetes Provisioner

  • Creates application namespace
  • Creates ServiceAccount with IRSA annotation
  • Deploys frontend and backend
  • Creates LoadBalancer services
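
The IRSA annotation on a ServiceAccount is `eks.amazonaws.com/role-arn`. The sketch below builds a ServiceAccount manifest as a plain dictionary in the `{capability}-{env}` namespace shown in the architecture diagram. The helper name, the default account name `app`, and the role ARN are hypothetical.

```python
from typing import Any, Dict


def service_account_manifest(capability: str, env: str, role_arn: str,
                             name: str = "app") -> Dict[str, Any]:
    """Build a ServiceAccount manifest annotated for IRSA."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,                        # assumed account name
            "namespace": f"{capability}-{env}",  # namespace pattern from the diagram
            "annotations": {
                # EKS injects web-identity credentials for this role into pods
                # that run under this service account.
                "eks.amazonaws.com/role-arn": role_arn,
            },
        },
    }


if __name__ == "__main__":
    print(service_account_manifest(
        "demo", "dev", "arn:aws:iam::123456789012:role/demo-dev-app"))
```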

Security Architecture

Container Security

The operator runs with a hardened security context:

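# Pod template spec. fsGroup is valid only at pod level; the last three
# settings are valid only at container level.
spec:
  securityContext:                      # pod level
    runAsNonRoot: true
    runAsUser: 1000                     # nexus user
    fsGroup: 1000
  containers:
    - name: nexus-operator              # container name assumed
      securityContext:                  # container level
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: [ALL]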

IRSA (IAM Roles for Service Accounts)

The operator uses IRSA for AWS authentication:

  • No credentials in containers: no AWS access keys are stored in the pod
  • Automatic token rotation: credentials are short-lived
  • Least-privilege policies: only the minimal required permissions are granted
  • ServiceAccount annotation: links the Kubernetes ServiceAccount to the IAM role

RBAC

The operator ClusterRole has permissions for:

  • NexusAICapability custom resource management
  • Namespace, Service, Deployment creation
  • ConfigMap, Secret management
  • CRD access

Directory Structure

kube-operator/
├── src/
│   └── nexus_operator/
│       ├── __init__.py
│       ├── main.py                 # Entry point, Kopf handlers
│       ├── resources/
│       │   ├── __init__.py
│       │   ├── cognito.py          # Cognito provisioner
│       │   ├── dynamodb.py         # DynamoDB provisioner
│       │   ├── glue.py             # Glue provisioner
│       │   ├── iam.py              # IAM provisioner
│       │   ├── kubernetes.py       # K8s resource creator
│       │   ├── s3.py               # S3 provisioner
│       │   ├── secrets.py          # Secrets Manager provisioner
│       │   └── ssm.py              # SSM provisioner
│       └── utils/
│           ├── __init__.py
│           ├── eks_utils.py        # EKS/OIDC utilities
│           └── naming.py           # Resource naming utility
├── manifests/
│   ├── crd.yaml                    # Custom Resource Definition
│   ├── deployment.yaml             # Operator deployment
│   └── rbac.yaml                   # RBAC configuration
├── examples/
│   ├── test-capability.yaml        # Test capability example
│   ├── capability-with-cognito.yaml
│   └── dual-service-backend.yaml
├── Dockerfile                      # Operator container image
├── pyproject.toml                  # Python project config
├── requirements.txt                # Python dependencies
└── operator-nexus-dev.sh           # Management script

Technology Stack

Component   Technology         Purpose
Framework   Kopf               Kubernetes operator framework
Language    Python 3.12        Operator implementation
AWS SDK     Boto3              AWS resource provisioning
K8s Client  kubernetes-python  Kubernetes API access
Web Server  aiohttp            Health check endpoints
Container   Docker             Operator packaging
Registry    Amazon ECR         Container image storage
