---
title: "Senior Devops"
description: "Senior Devops - Claude Code skill from the Engineering - Core domain."
---

# Senior Devops
:material-code-braces: Engineering - Core :material-identifier: `senior-devops` :material-github: Source
Install: `claude /plugin install engineering-skills`
A toolkit for senior DevOps work: CI/CD pipeline generation, Terraform scaffolding, and deployment orchestration with rollback.

## Quick Start

### Main Capabilities

This skill provides three core capabilities through automated scripts:

```bash
# Script 1: Pipeline Generator - scaffolds CI/CD pipelines for GitHub Actions or CircleCI
python scripts/pipeline_generator.py ./app --platform=github --stages=build,test,deploy

# Script 2: Terraform Scaffolder - generates and validates IaC modules for AWS/GCP/Azure
python scripts/terraform_scaffolder.py ./infra --provider=aws --module=ecs-service --verbose

# Script 3: Deployment Manager - orchestrates container deployments with rollback support
python scripts/deployment_manager.py deploy --env=production --image=app:1.2.3 --strategy=blue-green
```

## Core Capabilities

### 1. Pipeline Generator

Scaffolds CI/CD pipeline configurations for GitHub Actions or CircleCI, with stages for build, test, security scan, and deploy.

**Example - GitHub Actions workflow:**

```yaml
# .github/workflows/ci.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm test -- --coverage
      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          token: ${{ secrets.CODECOV_TOKEN }}  # assumed secret name; v4 uploads need a token

  build-docker:
    needs: build-and-test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write  # required to push to ghcr.io
    steps:
      - uses: actions/checkout@v4
      - name: Log in to GHCR
        if: github.ref == 'refs/heads/main'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: ${{ github.ref == 'refs/heads/main' }}
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}

  deploy:
    needs: build-docker
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      # Assumes AWS credentials are already available to this job
      # (for example via aws-actions/configure-aws-credentials).
      - name: Deploy to ECS
        run: |
          aws ecs update-service \
            --cluster production \
            --service app-service \
            --force-new-deployment
```

**Usage:**

```bash
python scripts/pipeline_generator.py <project-path> --platform=github|circleci --stages=build,test,deploy
```

### 2. Terraform Scaffolder

Generates, validates, and plans Terraform modules. Enforces a consistent module structure and runs `terraform validate` and `terraform plan` before any apply.

**Example - AWS ECS service module:**

```hcl
# modules/ecs-service/main.tf
# aws_security_group.app and aws_lb_target_group.app are referenced below
# and are expected to be defined elsewhere in the module.
resource "aws_ecs_task_definition" "app" {
  family                   = var.service_name
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = var.cpu
  memory                   = var.memory
  # Fargate needs a task execution role to ship logs through the awslogs driver.
  # var.execution_role_arn is an assumed input, not part of the original example.
  execution_role_arn       = var.execution_role_arn

  container_definitions = jsonencode([{
    name      = var.service_name
    image     = var.container_image
    essential = true
    portMappings = [{
      containerPort = var.container_port
      protocol      = "tcp"
    }]
    environment = [for k, v in var.env_vars : { name = k, value = v }]
    logConfiguration = {
      logDriver = "awslogs"
      options = {
        awslogs-group         = "/ecs/${var.service_name}"
        awslogs-region        = var.aws_region
        awslogs-stream-prefix = "ecs"
      }
    }
  }])
}

resource "aws_ecs_service" "app" {
  name            = var.service_name
  cluster         = var.cluster_id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.app.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = var.service_name
    container_port   = var.container_port
  }
}
```

**Usage:**

```bash
python scripts/terraform_scaffolder.py <infra-path> --provider=aws|gcp|azure --module=ecs-service|gke-deployment|aks-service [--verbose]
```

### 3. Deployment Manager

Orchestrates deployments with blue/green or rolling strategies, health-check gates, and automatic rollback on failure.
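The health-check gate and automatic rollback described here can be sketched in a few lines. This is a minimal illustration, not the actual logic of `scripts/deployment_manager.py`: the names `http_probe`, `health_gate`, and `gated_release`, and the default thresholds, are all assumptions made for the example.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 3.0) -> bool:
    """Return True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts, and 4xx/5xx responses all count as failures.
        return False


def health_gate(
    probe: Callable[[], bool],
    required_successes: int = 3,   # hypothetical default
    max_attempts: int = 10,        # hypothetical default
    interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Pass once `required_successes` consecutive probes succeed.

    Returns False if that never happens within `max_attempts`.
    """
    streak = 0
    for attempt in range(max_attempts):
        streak = streak + 1 if probe() else 0
        if streak >= required_successes:
            return True
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return False


def gated_release(
    probe: Callable[[], bool],
    promote: Callable[[], None],
    rollback: Callable[[], None],
    **gate_kwargs,
) -> bool:
    """Promote the new version if the gate passes, otherwise roll back."""
    if health_gate(probe, **gate_kwargs):
        promote()
        return True
    rollback()
    return False
```

A caller would pass something like `lambda: http_probe("https://app.example.com/healthz")` as the probe, with `promote` and `rollback` wrapping whatever switches traffic in the target environment. Requiring consecutive successes, rather than a single one, avoids promoting a version that is only intermittently healthy.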
**Example - Kubernetes blue/green deployment (blue-slot specific elements):**

```yaml
# k8s/deployment-blue.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
  labels:
    app: myapp
    slot: blue  # slot label distinguishes blue from green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      slot: blue
  template:
    metadata:
      labels:
        app: myapp
        slot: blue
    spec:
      containers:
        - name: app
          image: ghcr.io/org/app:1.2.3
          readinessProbe:  # gate: pod must pass before traffic switches
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
```

**Usage:**

```bash
python scripts/deployment_manager.py deploy \
  --env=staging|production \
  --image=app:1.2.3 \
  --strategy=blue-green|rolling \
  --health-check-url=https://app.example.com/healthz

python scripts/deployment_manager.py rollback --env=production --to-version=1.2.2

python scripts/deployment_manager.py --analyze --env=production  # audit current state
```

## Resources

- Pattern Reference: `references/cicd_pipeline_guide.md` - detailed CI/CD patterns, best practices, anti-patterns
- Workflow Guide: `references/infrastructure_as_code.md` - IaC step-by-step processes, optimization, troubleshooting
- Technical Guide: `references/deployment_strategies.md` - deployment strategy configs, security considerations, scalability
- Tool Scripts: `scripts/` directory

## Development Workflow

### 1. Infrastructure Changes (Terraform)

```bash
# Scaffold or update module
python scripts/terraform_scaffolder.py ./infra --provider=aws --module=ecs-service --verbose

# Validate and plan; review the diff before applying
terraform -chdir=infra init
terraform -chdir=infra validate
terraform -chdir=infra plan -out=tfplan

# Apply only after plan review
terraform -chdir=infra apply tfplan

# Verify resources are healthy
aws ecs describe-services --cluster production --services app-service \
  --query 'services[0].{Status:status,Running:runningCount,Desired:desiredCount}'
```

### 2. Application Deployment

```bash
# Generate or update pipeline config
python scripts/pipeline_generator.py . --platform=github --stages=build,test,security,deploy

# Build and tag image (compute the tag once so build, push, and deploy agree)
TAG=$(git rev-parse --short HEAD)
docker build -t "ghcr.io/org/app:${TAG}" .
docker push "ghcr.io/org/app:${TAG}"

# Deploy with health-check gate
python scripts/deployment_manager.py deploy \
  --env=production \
  --image="app:${TAG}" \
  --strategy=blue-green \
  --health-check-url=https://app.example.com/healthz

# Verify pods are running
kubectl get pods -n production -l app=myapp
kubectl rollout status deployment/app-blue -n production

# Switch traffic after verification
kubectl patch service app-svc -n production \
  -p '{"spec":{"selector":{"slot":"blue"}}}'
```

### 3. Rollback Procedure

```bash
# Immediate rollback via deployment manager
python scripts/deployment_manager.py rollback --env=production --to-version=1.2.2

# Or via kubectl (app-blue is the Deployment from the blue/green example)
kubectl rollout undo deployment/app-blue -n production
kubectl rollout status deployment/app-blue -n production

# Verify rollback succeeded
kubectl get pods -n production -l app=myapp
curl -sf https://app.example.com/healthz || echo "ROLLBACK FAILED - escalate"
```

## Troubleshooting

See the troubleshooting section in `references/deployment_strategies.md`.
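As a first triage step, the `aws ecs describe-services` verification from the Infrastructure Changes workflow can be turned into a pass/fail check. The sketch below assumes the JSON shape produced by that command's `--query` expression (`Status`, `Running`, `Desired`); the function names are hypothetical, and treating a desired count of zero as unhealthy is a choice made for this example.

```python
import json


def service_is_healthy(summary: dict) -> bool:
    """True when the service is ACTIVE and every desired task is running."""
    desired = summary.get("Desired", 0)
    return (
        summary.get("Status") == "ACTIVE"
        and desired > 0  # assumption: a service scaled to zero is not serving
        and summary.get("Running") == desired
    )


def check_from_json(raw: str) -> bool:
    """Parse the CLI's JSON output and evaluate it."""
    return service_is_healthy(json.loads(raw))
```

For instance, `check_from_json('{"Status": "ACTIVE", "Running": 1, "Desired": 3}')` evaluates to `False`, which points at tasks failing to start rather than at the service definition itself.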