Deployment Guide¶
Step-by-step guide for deploying the Nexus Kubernetes Operator.
Prerequisites¶
Required¶
| Requirement | Version | Purpose |
|---|---|---|
| EKS Cluster | 1.27+ | Kubernetes platform |
| AWS CLI | 2.x | AWS access |
| Python | 3.9+ | Scripts and operator |
| OIDC Provider | - | IRSA authentication |
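The version floors in the table can be checked programmatically. The following is a minimal sketch, not part of the Nexus tooling: `parse_version` and `meets_minimum` are hypothetical helper names, and the cluster version `"1.29"` is a placeholder for a value that would normally come from the EKS API.

```python
import sys

def parse_version(text):
    """Turn '1.27' or '2.15.30' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))

def meets_minimum(found, minimum):
    """True when `found` is at least `minimum`, compared component by component."""
    return parse_version(found) >= parse_version(minimum)

# Python itself can be checked without shelling out.
print("Python 3.9+:", sys.version_info >= (3, 9))

# Placeholder cluster version; a real check would read it from the cluster.
print("EKS 1.27+:", meets_minimum("1.29", "1.27"))
```

Comparing integer tuples rather than strings matters here: as strings, "1.9" sorts after "1.27", which would wrongly accept an older cluster.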
Python Dependencies¶
AWS Permissions¶
The deploying user needs these permissions:
- EKS: DescribeCluster, ListClusters, AccessKubernetesApi
- IAM: CreateRole, AttachRolePolicy, CreateOpenIDConnectProvider
- ECR: CreateRepository, PutImage, GetAuthorizationToken
- STS: GetCallerIdentity, AssumeRole
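As an illustration, the permissions above can be expressed as a single IAM policy document. This is a sketch under assumptions: the service prefixes (`eks:`, `iam:`, `ecr:`, `sts:`) follow standard IAM action naming, and `"Resource": "*"` is a placeholder that should be narrowed to specific ARNs in a real account. `build_policy` is a hypothetical helper name.

```python
import json

# Actions listed in this guide, grouped by IAM service prefix.
REQUIRED_ACTIONS = {
    "eks": ["DescribeCluster", "ListClusters", "AccessKubernetesApi"],
    "iam": ["CreateRole", "AttachRolePolicy", "CreateOpenIDConnectProvider"],
    "ecr": ["CreateRepository", "PutImage", "GetAuthorizationToken"],
    "sts": ["GetCallerIdentity", "AssumeRole"],
}

def build_policy(actions_by_service):
    """Build an IAM policy document that allows every listed action."""
    actions = [
        f"{service}:{name}"
        for service, names in actions_by_service.items()
        for name in names
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Assumption: "*" is a placeholder; scope it down before use.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }

policy = build_policy(REQUIRED_ACTIONS)
print(json.dumps(policy, indent=2))
```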
Deployment Scripts¶
All scripts are located in /opt/mycode/nexus/nexus-deployer/kube-operator/.
Script Overview¶
| Script | Purpose |
|---|---|
| build-and-deploy.sh | Build image and deploy (all-in-one) |
| build-and-push.sh | Build and push image to ECR |
| operator-nexus-dev.sh | Manage operator lifecycle |
Deployment Methods¶
Method 1: One-Command Deployment (Recommended)¶
This script:
1. Builds the operator Docker image
2. Creates ECR repository if needed
3. Pushes image to ECR
4. Deploys operator to cluster
5. Verifies health
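The ordering of these steps matters, since each one depends on the previous one succeeding. The sketch below shows one way such a fail-fast sequence could be structured. It is illustrative only and does not reflect how build-and-deploy.sh is implemented; `run_pipeline` is a hypothetical name, and the lambdas stand in for the real work.

```python
def run_pipeline(steps):
    """Run (name, callable) steps in order and stop at the first failure.

    Returns the names of completed steps and the name of the failed step
    (or None when every step succeeded).
    """
    completed = []
    for name, step in steps:
        if not step():
            return completed, name
        completed.append(name)
    return completed, None

# Placeholder steps: each lambda stands in for a real build or deploy action.
steps = [
    ("build image", lambda: True),
    ("ensure ECR repository", lambda: True),
    ("push image", lambda: True),
    ("deploy operator", lambda: True),
    ("verify health", lambda: True),
]

completed, failed = run_pipeline(steps)
print("completed:", completed)
print("failed:", failed)
```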
Method 2: Step-by-Step Deployment¶
Step 1: Scan Cluster (Optional)¶
Verifies:
- Cluster exists and is active
- EKS version compatibility
- OIDC provider configured
- Your access level
- Node groups health
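Three of these checks can be derived from the response shape of the EKS `DescribeCluster` API. The sketch below works on a plain dictionary so it runs without AWS credentials; the sample response is fabricated for illustration, and `check_cluster` is a hypothetical helper, not the scan script itself. Access level and node group health are not covered here.

```python
def check_cluster(description, minimum_version=(1, 27)):
    """Evaluate basic readiness checks on a DescribeCluster-style response."""
    cluster = description["cluster"]
    version = tuple(int(part) for part in cluster["version"].split("."))
    issuer = cluster.get("identity", {}).get("oidc", {}).get("issuer")
    return {
        "active": cluster["status"] == "ACTIVE",
        "version_supported": version >= minimum_version,
        # An issuer URL is necessary for IRSA, but it does not by itself
        # prove that a matching IAM OIDC provider has been created.
        "oidc_issuer_present": bool(issuer),
    }

# Illustrative response only; real values come from the EKS API.
sample = {
    "cluster": {
        "name": "nexus-dev",
        "status": "ACTIVE",
        "version": "1.29",
        "identity": {
            "oidc": {"issuer": "https://oidc.eks.ap-southeast-1.amazonaws.com/id/EXAMPLE"}
        },
    }
}
print(check_cluster(sample))
```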
Step 2: Build and Push Image¶
This:
- Builds Docker image from Dockerfile
- Creates ECR repository nexus-operator
- Pushes image to ECR
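For orientation, the sketch below assembles the kind of commands these steps typically involve. It is an assumption about a conventional Docker and ECR workflow, not a transcript of build-and-push.sh. The account ID is a placeholder, `build_and_push_commands` is a hypothetical helper, and the registry login step (`aws ecr get-login-password` piped into `docker login`) is left out because it needs a shell pipe.

```python
def image_uri(account_id, region, repository="nexus-operator", tag="latest"):
    """Return the fully qualified ECR image URI."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"

def build_and_push_commands(account_id, region, repository="nexus-operator", tag="latest"):
    """List the commands for build, repository creation, and push, in order."""
    uri = image_uri(account_id, region, repository, tag)
    return [
        ["docker", "build", "-t", uri, "."],
        # create-repository fails if the repository already exists, so a
        # real script would check for it first or tolerate that error.
        ["aws", "ecr", "create-repository",
         "--repository-name", repository, "--region", region],
        ["docker", "push", uri],
    ]

# 123456789012 is a placeholder account ID.
for command in build_and_push_commands("123456789012", "ap-southeast-1"):
    print(" ".join(command))
```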
Step 3: Deploy Operator¶
This:
- Creates nexus-system namespace
- Applies CRD
- Creates RBAC resources
- Creates IAM role with IRSA
- Deploys operator
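The IRSA step relies on an IAM role whose trust policy lets a specific Kubernetes service account assume it through the cluster's OIDC provider. The sketch below builds such a trust policy in the standard IRSA form. The service account name `nexus-operator`, the account ID, and the issuer URL are assumptions for illustration; the role the deploy script actually creates may differ.

```python
import json

def irsa_trust_policy(account_id, oidc_issuer_url,
                      namespace="nexus-system", service_account="nexus-operator"):
    """Build an IAM trust policy that binds a role to one service account."""
    # The provider identifier is the issuer URL without its scheme.
    provider = oidc_issuer_url.removeprefix("https://")
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                    f"{provider}:aud": "sts.amazonaws.com",
                }
            },
        }],
    }

# Placeholder account ID and issuer URL.
trust = irsa_trust_policy(
    "123456789012",
    "https://oidc.eks.ap-southeast-1.amazonaws.com/id/EXAMPLE",
)
print(json.dumps(trust, indent=2))
```

Restricting the `sub` claim to a single namespace and service account keeps other pods in the cluster from assuming the role.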
Step 4: Verify Deployment¶
Expected output:
✓ Operator is installed
Status: installed
Version: latest
Namespace: nexus-system
Ready replicas: 1/1
✓ Operator is healthy and running
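If this output needs to be checked from automation, the `Key: value` lines can be parsed directly. The sketch below assumes the output format shown above stays stable; `parse_status` and `is_ready` are hypothetical helpers, not part of the operator tooling.

```python
def parse_status(output):
    """Collect 'Key: value' lines into a dictionary; other lines are ignored."""
    fields = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields

def is_ready(fields):
    """True when all desired replicas are ready and at least one is desired."""
    ready, _, desired = fields.get("Ready replicas", "0/0").partition("/")
    return ready == desired and int(desired) > 0

sample_output = """\
✓ Operator is installed
Status: installed
Version: latest
Namespace: nexus-system
Ready replicas: 1/1
✓ Operator is healthy and running
"""

status = parse_status(sample_output)
print(status)
print("ready:", is_ready(status))
```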
Using Python Scripts Directly¶
For more control, use the Python scripts directly:
deploy_operator_to_cluster.py¶
cd /opt/mycode/nexus/nexus-deployer/backend
python3 scripts/deploy_operator_to_cluster.py \
--cluster nexus-dev \
--region ap-southeast-1 \
--profile external-access
Options:
| Option | Description |
|---|---|
| --cluster | EKS cluster name (required) |
| --region | AWS region (default: ap-southeast-1) |
| --profile | AWS profile name |
| --session-id | AWS session ID (for UI integration) |
| --version | Operator version (default: latest) |
| --force | Force redeploy |
| --skip-verify | Skip cluster verification |
| --skip-access-check | Skip access check |
| --debug | Enable debug logging |
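To make the defaults concrete, the options above can be mirrored in an `argparse` parser. This is a hypothetical reconstruction from the table, not the script's actual source. In particular, it assumes that `--force`, `--skip-verify`, `--skip-access-check`, and `--debug` are boolean flags that take no value.

```python
import argparse

def build_parser():
    """Mirror the documented options; flag semantics are assumptions."""
    parser = argparse.ArgumentParser(prog="deploy_operator_to_cluster.py")
    parser.add_argument("--cluster", required=True, help="EKS cluster name")
    parser.add_argument("--region", default="ap-southeast-1", help="AWS region")
    parser.add_argument("--profile", help="AWS profile name")
    parser.add_argument("--session-id", help="AWS session ID (for UI integration)")
    parser.add_argument("--version", default="latest", help="Operator version")
    parser.add_argument("--force", action="store_true", help="Force redeploy")
    parser.add_argument("--skip-verify", action="store_true",
                        help="Skip cluster verification")
    parser.add_argument("--skip-access-check", action="store_true",
                        help="Skip access check")
    parser.add_argument("--debug", action="store_true", help="Enable debug logging")
    return parser

# Only --cluster is required; everything else falls back to its default.
args = build_parser().parse_args(["--cluster", "nexus-dev", "--force"])
print(args.cluster, args.region, args.version, args.force)
```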
cleanup_operator.py¶
# Safe cleanup (keeps IAM, ECR, access)
python3 scripts/cleanup_operator.py \
--cluster nexus-dev \
--profile external-access
# Keep IAM and ECR
python3 scripts/cleanup_operator.py \
--cluster nexus-dev \
--profile external-access \
--keep-iam --keep-ecr
# Complete cleanup (removes access)
python3 scripts/cleanup_operator.py \
--cluster nexus-dev \
--profile external-access \
--remove-aws-auth
Deploying Capabilities¶
Once the operator is running, deploy capabilities by applying custom resources (CRs) to the cluster.
Deploy Test Capability¶
Watch Deployment Progress¶
# Watch capability status
kubectl get nexuscapabilities -A -w
# View operator logs
kubectl -n nexus-system logs -f deployment/nexus-operator
Check Created Resources¶
# Kubernetes resources
kubectl get all -n nexus-ai-dev
# AWS resources
aws dynamodb list-tables | grep nexus-ai
aws s3 ls | grep nexus-ai
aws iam get-role --role-name nexus-ai-dev-app-role
Production Deployment Checklist¶
Pre-Deployment¶
- [ ] EKS cluster is healthy and accessible
- [ ] OIDC provider is configured
- [ ] IAM permissions are in place
- [ ] Container images are built and pushed
- [ ] SSL certificates are ready (if using HTTPS)
- [ ] DNS is configured (if using custom domain)
Deployment¶
- [ ] Operator deployed successfully
- [ ] Operator pod is running (1/1 Ready)
- [ ] CRD is installed
- [ ] RBAC is configured
Post-Deployment¶
- [ ] Health check endpoints responding
- [ ] Operator logs show no errors
- [ ] Test capability deploys successfully
- [ ] AWS resources are created correctly
Updating the Operator¶
Update with New Image¶
Update Without Rebuild¶
Force Redeployment¶
Deleting the Operator¶
Safe Delete (Recommended)¶
Preserves IAM roles, ECR repository, and access:
Complete Cleanup¶
Removes everything including access:
Access Loss
Using --full may remove your cluster access. Ensure you have another way to access the cluster.
Multi-Cluster Deployment¶
To deploy to multiple clusters, run the deploy script once per cluster with that cluster's name and region:
# Cluster 1
python3 scripts/deploy_operator_to_cluster.py \
--cluster cluster-1 \
--region us-east-1
# Cluster 2
python3 scripts/deploy_operator_to_cluster.py \
--cluster cluster-2 \
--region ap-southeast-1
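With more than a couple of clusters, the same invocation can be driven from a list. The sketch below is one possible wrapper, assuming it is run from the backend directory used earlier in this guide. `deploy_all` and its `dry_run` parameter are hypothetical; by default it only prints the commands it would run.

```python
import subprocess

# (cluster name, region) pairs taken from the example above.
CLUSTERS = [
    ("cluster-1", "us-east-1"),
    ("cluster-2", "ap-southeast-1"),
]

def deploy_command(cluster, region):
    """Build the argument list for one deploy script invocation."""
    return [
        "python3", "scripts/deploy_operator_to_cluster.py",
        "--cluster", cluster,
        "--region", region,
    ]

def deploy_all(clusters, dry_run=True):
    """Deploy to each cluster in turn; with dry_run, only print the commands."""
    commands = [deploy_command(cluster, region) for cluster, region in clusters]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises on a non-zero exit, stopping at the first failure.
            subprocess.run(command, check=True)
    return commands

commands = deploy_all(CLUSTERS)
```

Passing `dry_run=False` runs the deployments sequentially, so a failure on one cluster leaves the remaining clusters untouched.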
Configuration¶
Operator Environment Variables¶
| Variable | Default | Description |
|---|---|---|
| LOG_LEVEL | INFO | Logging level |
| OPERATOR_NAMESPACE | nexus-system | Operator namespace |
| AWS_REGION | ap-southeast-1 | AWS region |
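A process can resolve these variables by falling back to the documented default whenever one is unset. The sketch below shows that pattern. `load_settings` is a hypothetical helper, and the operator's own configuration code may be organized differently.

```python
import os

# Defaults as documented in the table above.
DEFAULTS = {
    "LOG_LEVEL": "INFO",
    "OPERATOR_NAMESPACE": "nexus-system",
    "AWS_REGION": "ap-southeast-1",
}

def load_settings(environ=os.environ):
    """Read each variable from the environment, using the default when unset."""
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}

# With an empty environment, every setting takes its default.
print(load_settings({}))

# An explicit value overrides only that one setting.
print(load_settings({"LOG_LEVEL": "DEBUG"}))
```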
Customizing Deployment¶
Edit manifests/deployment.yaml to customize:
- Resource limits
- Environment variables
- Node affinity
- Tolerations