diff --git a/engineering-team/senior-computer-vision/SKILL.md b/engineering-team/senior-computer-vision/SKILL.md index f75d4d2..5028bef 100644 --- a/engineering-team/senior-computer-vision/SKILL.md +++ b/engineering-team/senior-computer-vision/SKILL.md @@ -1,226 +1,531 @@ --- name: senior-computer-vision -description: World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines. +description: Computer vision engineering skill for object detection, image segmentation, and visual AI systems. Covers CNN and Vision Transformer architectures, YOLO/Faster R-CNN/DETR detection, Mask R-CNN/SAM segmentation, and production deployment with ONNX/TensorRT. Includes PyTorch, torchvision, Ultralytics, Detectron2, and MMDetection frameworks. Use when building detection pipelines, training custom models, optimizing inference, or deploying vision systems. --- # Senior Computer Vision Engineer -World-class senior computer vision engineer skill for production-grade AI/ML/Data systems. +Production computer vision engineering skill for object detection, image segmentation, and visual AI system deployment. + +## Table of Contents + +- [Quick Start](#quick-start) +- [Core Expertise](#core-expertise) +- [Tech Stack](#tech-stack) +- [Workflow 1: Object Detection Pipeline](#workflow-1-object-detection-pipeline) +- [Workflow 2: Model Optimization and Deployment](#workflow-2-model-optimization-and-deployment) +- [Workflow 3: Custom Dataset Preparation](#workflow-3-custom-dataset-preparation) +- [Architecture Selection Guide](#architecture-selection-guide) +- [Reference Documentation](#reference-documentation) +- [Common Commands](#common-commands) ## Quick Start -### Main Capabilities - ```bash -# Core Tool 1 -python scripts/vision_model_trainer.py --input data/ --output results/ +# Generate training configuration for YOLO or Faster R-CNN +python scripts/vision_model_trainer.py models/ --task detection --arch yolov8 -# Core Tool 2 -python scripts/inference_optimizer.py --target project/ --analyze +# Analyze model for optimization opportunities (quantization, pruning) +python scripts/inference_optimizer.py model.pt --target onnx --benchmark -# Core Tool 3 -python scripts/dataset_pipeline_builder.py --config config.yaml --deploy +# Build dataset pipeline with augmentations +python scripts/dataset_pipeline_builder.py images/ --format coco --augment ``` ## Core Expertise -This skill covers world-class capabilities in: +This skill provides guidance on: -- Advanced production patterns and architectures -- Scalable system design and implementation -- Performance optimization at scale -- MLOps and DataOps best practices -- Real-time processing and inference -- Distributed computing frameworks -- Model deployment and monitoring -- Security and compliance -- Cost optimization -- Team leadership and mentoring +- **Object Detection**: YOLO family (v5-v11), Faster R-CNN, DETR, RT-DETR +- **Instance Segmentation**: Mask R-CNN, YOLACT, SOLOv2 +- **Semantic Segmentation**: DeepLabV3+, SegFormer, SAM (Segment Anything) +- **Image Classification**: ResNet, EfficientNet, Vision Transformers (ViT, DeiT) +- **Video Analysis**: Object tracking (ByteTrack, SORT), action recognition +- 
**3D Vision**: Depth estimation, point cloud processing, NeRF +- **Production Deployment**: ONNX, TensorRT, OpenVINO, CoreML ## Tech Stack -**Languages:** Python, SQL, R, Scala, Go -**ML Frameworks:** PyTorch, TensorFlow, Scikit-learn, XGBoost -**Data Tools:** Spark, Airflow, dbt, Kafka, Databricks -**LLM Frameworks:** LangChain, LlamaIndex, DSPy -**Deployment:** Docker, Kubernetes, AWS/GCP/Azure -**Monitoring:** MLflow, Weights & Biases, Prometheus -**Databases:** PostgreSQL, BigQuery, Snowflake, Pinecone +| Category | Technologies | +|----------|--------------| +| Frameworks | PyTorch, torchvision, timm | +| Detection | Ultralytics (YOLO), Detectron2, MMDetection | +| Segmentation | segment-anything, mmsegmentation | +| Optimization | ONNX, TensorRT, OpenVINO, torch.compile | +| Image Processing | OpenCV, Pillow, albumentations | +| Annotation | CVAT, Label Studio, Roboflow | +| Experiment Tracking | MLflow, Weights & Biases | +| Serving | Triton Inference Server, TorchServe | + +## Workflow 1: Object Detection Pipeline + +Use this workflow when building an object detection system from scratch. + +### Step 1: Define Detection Requirements + +Analyze the detection task requirements: + +``` +Detection Requirements Analysis: +- Target objects: [list specific classes to detect] +- Real-time requirement: [yes/no, target FPS] +- Accuracy priority: [speed vs accuracy trade-off] +- Deployment target: [cloud GPU, edge device, mobile] +- Dataset size: [number of images, annotations per class] +``` + +### Step 2: Select Detection Architecture + +Choose architecture based on requirements: + +| Requirement | Recommended Architecture | Why | +|-------------|-------------------------|-----| +| Real-time (>30 FPS) | YOLOv8/v11, RT-DETR | Single-stage, optimized for speed | +| High accuracy | Faster R-CNN, DINO | Two-stage, better localization | +| Small objects | YOLO + SAHI, Faster R-CNN + FPN | Multi-scale detection | +| Edge deployment | YOLOv8n, MobileNetV3-SSD | Lightweight architectures | +| Transformer-based | DETR, DINO, RT-DETR | End-to-end, no NMS required | + +### Step 3: Prepare Dataset + +Convert annotations to required format: + +```bash +# COCO format (recommended) +python scripts/dataset_pipeline_builder.py data/images/ \ + --annotations data/labels/ \ + --format coco \ + --split 0.8 0.1 0.1 \ + --output data/coco/ + +# Verify dataset +python -c "from pycocotools.coco import COCO; coco = COCO('data/coco/train.json'); print(f'Images: {len(coco.imgs)}, Categories: {len(coco.cats)}')" +``` + +### Step 4: Configure Training + +Generate training configuration: + +```bash +# For Ultralytics YOLO +python scripts/vision_model_trainer.py data/coco/ \ + --task detection \ + --arch yolov8m \ + --epochs 100 \ + --batch 16 \ + --imgsz 640 \ + --output configs/ + +# For Detectron2 +python scripts/vision_model_trainer.py data/coco/ \ + --task detection \ + --arch faster_rcnn_R_50_FPN \ + --framework detectron2 \ + --output configs/ +``` + +### Step 5: Train and Validate + +```bash +# Ultralytics training +yolo detect train data=data.yaml model=yolov8m.pt epochs=100 imgsz=640 + +# Detectron2 training +python train_net.py --config-file configs/faster_rcnn.yaml --num-gpus 1 + +# Validate on test set +yolo detect val model=runs/detect/train/weights/best.pt data=data.yaml +``` + +### Step 6: Evaluate Results + +Key metrics to analyze: + +| Metric | Target | Description | +|--------|--------|-------------| +| mAP@50 | >0.7 | Mean Average Precision at IoU 0.5 | +| mAP@50:95 | >0.5 | COCO primary metric | +| 
Precision | >0.8 | Low false positives | +| Recall | >0.8 | Low missed detections | +| Inference time | <33ms | For 30 FPS real-time | + +## Workflow 2: Model Optimization and Deployment + +Use this workflow when preparing a trained model for production deployment. + +### Step 1: Benchmark Baseline Performance + +```bash +# Measure current model performance +python scripts/inference_optimizer.py model.pt \ + --benchmark \ + --input-size 640 640 \ + --batch-sizes 1 4 8 16 \ + --warmup 10 \ + --iterations 100 +``` + +Expected output: + +``` +Baseline Performance (PyTorch FP32): +- Batch 1: 45.2ms (22.1 FPS) +- Batch 4: 89.4ms (44.7 FPS) +- Batch 8: 165.3ms (48.4 FPS) +- Memory: 2.1 GB +- Parameters: 25.9M +``` + +### Step 2: Select Optimization Strategy + +| Deployment Target | Optimization Path | +|-------------------|-------------------| +| NVIDIA GPU (cloud) | PyTorch → ONNX → TensorRT FP16 | +| NVIDIA GPU (edge) | PyTorch → TensorRT INT8 | +| Intel CPU | PyTorch → ONNX → OpenVINO | +| Apple Silicon | PyTorch → CoreML | +| Generic CPU | PyTorch → ONNX Runtime | +| Mobile | PyTorch → TFLite or ONNX Mobile | + +### Step 3: Export to ONNX + +```bash +# Export with dynamic batch size +python scripts/inference_optimizer.py model.pt \ + --export onnx \ + --input-size 640 640 \ + --dynamic-batch \ + --simplify \ + --output model.onnx + +# Verify ONNX model +python -c "import onnx; model = onnx.load('model.onnx'); onnx.checker.check_model(model); print('ONNX model valid')" +``` + +### Step 4: Apply Quantization (Optional) + +For INT8 quantization with calibration: + +```bash +# Generate calibration dataset +python scripts/inference_optimizer.py model.onnx \ + --quantize int8 \ + --calibration-data data/calibration/ \ + --calibration-samples 500 \ + --output model_int8.onnx +``` + +Quantization impact analysis: + +| Precision | Size | Speed | Accuracy Drop | +|-----------|------|-------|---------------| +| FP32 | 100% | 1x | 0% | +| FP16 | 50% | 1.5-2x | <0.5% | +| INT8 | 25% | 2-4x | 1-3% | + +### Step 5: Convert to Target Runtime + +```bash +# TensorRT (NVIDIA GPU) +trtexec --onnx=model.onnx --saveEngine=model.engine --fp16 + +# OpenVINO (Intel) +mo --input_model model.onnx --output_dir openvino/ + +# CoreML (Apple) +python -c "import coremltools as ct; model = ct.convert('model.onnx'); model.save('model.mlpackage')" +``` + +### Step 6: Benchmark Optimized Model + +```bash +python scripts/inference_optimizer.py model.engine \ + --benchmark \ + --runtime tensorrt \ + --compare model.pt +``` + +Expected speedup: + +``` +Optimization Results: +- Original (PyTorch FP32): 45.2ms +- Optimized (TensorRT FP16): 12.8ms +- Speedup: 3.5x +- Accuracy change: -0.3% mAP +``` + +## Workflow 3: Custom Dataset Preparation + +Use this workflow when preparing a computer vision dataset for training. 
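+
+Before running the full audit, a quick integrity pass with Pillow can surface unreadable files early. A minimal sketch (the `data/raw` path is illustrative):
+
+```python
+from pathlib import Path
+
+from PIL import Image
+
+bad_files = []
+for path in Path("data/raw").rglob("*"):
+    if path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
+        try:
+            with Image.open(path) as img:
+                img.verify()  # validates file structure without decoding pixels
+        except Exception:
+            bad_files.append(path)
+
+print(f"{len(bad_files)} unreadable image files")
+```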
+ +### Step 1: Audit Raw Data + +```bash +# Analyze image dataset +python scripts/dataset_pipeline_builder.py data/raw/ \ + --analyze \ + --output analysis/ +``` + +Analysis report includes: + +``` +Dataset Analysis: +- Total images: 5,234 +- Image sizes: 640x480 to 4096x3072 (variable) +- Formats: JPEG (4,891), PNG (343) +- Corrupted: 12 files +- Duplicates: 45 pairs + +Annotation Analysis: +- Format detected: Pascal VOC XML +- Total annotations: 28,456 +- Classes: 5 (car, person, bicycle, dog, cat) +- Distribution: car (12,340), person (8,234), bicycle (3,456), dog (2,890), cat (1,536) +- Empty images: 234 +``` + +### Step 2: Clean and Validate + +```bash +# Remove corrupted and duplicate images +python scripts/dataset_pipeline_builder.py data/raw/ \ + --clean \ + --remove-corrupted \ + --remove-duplicates \ + --output data/cleaned/ +``` + +### Step 3: Convert Annotation Format + +```bash +# Convert VOC to COCO format +python scripts/dataset_pipeline_builder.py data/cleaned/ \ + --annotations data/annotations/ \ + --input-format voc \ + --output-format coco \ + --output data/coco/ +``` + +Supported format conversions: + +| From | To | +|------|-----| +| Pascal VOC XML | COCO JSON | +| YOLO TXT | COCO JSON | +| COCO JSON | YOLO TXT | +| LabelMe JSON | COCO JSON | +| CVAT XML | COCO JSON | + +### Step 4: Apply Augmentations + +```bash +# Generate augmentation config +python scripts/dataset_pipeline_builder.py data/coco/ \ + --augment \ + --aug-config configs/augmentation.yaml \ + --output data/augmented/ +``` + +Recommended augmentations for detection: + +```yaml +# configs/augmentation.yaml +augmentations: + geometric: + - horizontal_flip: { p: 0.5 } + - vertical_flip: { p: 0.1 } # Only if orientation invariant + - rotate: { limit: 15, p: 0.3 } + - scale: { scale_limit: 0.2, p: 0.5 } + + color: + - brightness_contrast: { brightness_limit: 0.2, contrast_limit: 0.2, p: 0.5 } + - hue_saturation: { hue_shift_limit: 20, sat_shift_limit: 30, p: 0.3 } + - blur: { blur_limit: 3, p: 0.1 } + + advanced: + - mosaic: { p: 0.5 } # YOLO-style mosaic + - mixup: { p: 0.1 } # Image mixing + - cutout: { num_holes: 8, max_h_size: 32, max_w_size: 32, p: 0.3 } +``` + +### Step 5: Create Train/Val/Test Splits + +```bash +python scripts/dataset_pipeline_builder.py data/augmented/ \ + --split 0.8 0.1 0.1 \ + --stratify \ + --seed 42 \ + --output data/final/ +``` + +Split strategy guidelines: + +| Dataset Size | Train | Val | Test | +|--------------|-------|-----|------| +| <1,000 images | 70% | 15% | 15% | +| 1,000-10,000 | 80% | 10% | 10% | +| >10,000 | 90% | 5% | 5% | + +### Step 6: Generate Dataset Configuration + +```bash +# For Ultralytics YOLO +python scripts/dataset_pipeline_builder.py data/final/ \ + --generate-config yolo \ + --output data.yaml + +# For Detectron2 +python scripts/dataset_pipeline_builder.py data/final/ \ + --generate-config detectron2 \ + --output detectron2_config.py +``` + +## Architecture Selection Guide + +### Object Detection Architectures + +| Architecture | Speed | Accuracy | Best For | +|--------------|-------|----------|----------| +| YOLOv8n | 1.2ms | 37.3 mAP | Edge, mobile, real-time | +| YOLOv8s | 2.1ms | 44.9 mAP | Balanced speed/accuracy | +| YOLOv8m | 4.2ms | 50.2 mAP | General purpose | +| YOLOv8l | 6.8ms | 52.9 mAP | High accuracy | +| YOLOv8x | 10.1ms | 53.9 mAP | Maximum accuracy | +| RT-DETR-L | 5.3ms | 53.0 mAP | Transformer, no NMS | +| Faster R-CNN R50 | 46ms | 40.2 mAP | Two-stage, high quality | +| DINO-4scale | 85ms | 49.0 mAP | SOTA transformer | + +### 
Segmentation Architectures + +| Architecture | Type | Speed | Best For | +|--------------|------|-------|----------| +| YOLOv8-seg | Instance | 4.5ms | Real-time instance seg | +| Mask R-CNN | Instance | 67ms | High-quality masks | +| SAM | Promptable | 50ms | Zero-shot segmentation | +| DeepLabV3+ | Semantic | 25ms | Scene parsing | +| SegFormer | Semantic | 15ms | Efficient semantic seg | + +### CNN vs Vision Transformer Trade-offs + +| Aspect | CNN (YOLO, R-CNN) | ViT (DETR, DINO) | +|--------|-------------------|------------------| +| Training data needed | 1K-10K images | 10K-100K+ images | +| Training time | Fast | Slow (needs more epochs) | +| Inference speed | Faster | Slower | +| Small objects | Good with FPN | Needs multi-scale | +| Global context | Limited | Excellent | +| Positional encoding | Implicit | Explicit | ## Reference Documentation ### 1. Computer Vision Architectures -Comprehensive guide available in `references/computer_vision_architectures.md` covering: +See `references/computer_vision_architectures.md` for: -- Advanced patterns and best practices -- Production implementation strategies -- Performance optimization techniques -- Scalability considerations -- Security and compliance -- Real-world case studies +- CNN backbone architectures (ResNet, EfficientNet, ConvNeXt) +- Vision Transformer variants (ViT, DeiT, Swin) +- Detection heads (anchor-based vs anchor-free) +- Feature Pyramid Networks (FPN, BiFPN, PANet) +- Neck architectures for multi-scale detection ### 2. Object Detection Optimization -Complete workflow documentation in `references/object_detection_optimization.md` including: +See `references/object_detection_optimization.md` for: -- Step-by-step processes -- Architecture design patterns -- Tool integration guides -- Performance tuning strategies -- Troubleshooting procedures +- Non-Maximum Suppression variants (NMS, Soft-NMS, DIoU-NMS) +- Anchor optimization and anchor-free alternatives +- Loss function design (focal loss, GIoU, CIoU, DIoU) +- Training strategies (warmup, cosine annealing, EMA) +- Data augmentation for detection (mosaic, mixup, copy-paste) ### 3. 
Production Vision Systems -Technical reference guide in `references/production_vision_systems.md` with: +See `references/production_vision_systems.md` for: -- System design principles -- Implementation examples -- Configuration best practices -- Deployment strategies -- Monitoring and observability - -## Production Patterns - -### Pattern 1: Scalable Data Processing - -Enterprise-scale data processing with distributed computing: - -- Horizontal scaling architecture -- Fault-tolerant design -- Real-time and batch processing -- Data quality validation -- Performance monitoring - -### Pattern 2: ML Model Deployment - -Production ML system with high availability: - -- Model serving with low latency -- A/B testing infrastructure -- Feature store integration -- Model monitoring and drift detection -- Automated retraining pipelines - -### Pattern 3: Real-Time Inference - -High-throughput inference system: - -- Batching and caching strategies -- Load balancing -- Auto-scaling -- Latency optimization -- Cost optimization - -## Best Practices - -### Development - -- Test-driven development -- Code reviews and pair programming -- Documentation as code -- Version control everything -- Continuous integration - -### Production - -- Monitor everything critical -- Automate deployments -- Feature flags for releases -- Canary deployments -- Comprehensive logging - -### Team Leadership - -- Mentor junior engineers -- Drive technical decisions -- Establish coding standards -- Foster learning culture -- Cross-functional collaboration - -## Performance Targets - -**Latency:** -- P50: < 50ms -- P95: < 100ms -- P99: < 200ms - -**Throughput:** -- Requests/second: > 1000 -- Concurrent users: > 10,000 - -**Availability:** -- Uptime: 99.9% -- Error rate: < 0.1% - -## Security & Compliance - -- Authentication & authorization -- Data encryption (at rest & in transit) -- PII handling and anonymization -- GDPR/CCPA compliance -- Regular security audits -- Vulnerability management +- ONNX export and optimization +- TensorRT deployment pipeline +- Batch inference optimization +- Edge device deployment (Jetson, Intel NCS) +- Model serving with Triton +- Video processing pipelines ## Common Commands +### Ultralytics YOLO + ```bash -# Development -python -m pytest tests/ -v --cov -python -m black src/ -python -m pylint src/ - # Training -python scripts/train.py --config prod.yaml -python scripts/evaluate.py --model best.pth +yolo detect train data=coco.yaml model=yolov8m.pt epochs=100 imgsz=640 -# Deployment -docker build -t service:v1 . 
-kubectl apply -f k8s/ -helm upgrade service ./charts/ +# Validation +yolo detect val model=best.pt data=coco.yaml -# Monitoring -kubectl logs -f deployment/service -python scripts/health_check.py +# Inference +yolo detect predict model=best.pt source=images/ save=True + +# Export +yolo export model=best.pt format=onnx simplify=True dynamic=True ``` +### Detectron2 + +```bash +# Training +python train_net.py --config-file configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml \ + --num-gpus 1 OUTPUT_DIR ./output + +# Evaluation +python train_net.py --config-file configs/faster_rcnn.yaml --eval-only \ + MODEL.WEIGHTS output/model_final.pth + +# Inference +python demo.py --config-file configs/faster_rcnn.yaml \ + --input images/*.jpg --output results/ \ + --opts MODEL.WEIGHTS output/model_final.pth +``` + +### MMDetection + +```bash +# Training +python tools/train.py configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py + +# Testing +python tools/test.py configs/faster_rcnn.py checkpoints/latest.pth --eval bbox + +# Inference +python demo/image_demo.py demo.jpg configs/faster_rcnn.py checkpoints/latest.pth +``` + +### Model Optimization + +```bash +# ONNX export and simplify +python -c "import torch; model = torch.load('model.pt'); torch.onnx.export(model, torch.randn(1,3,640,640), 'model.onnx', opset_version=17)" +python -m onnxsim model.onnx model_sim.onnx + +# TensorRT conversion +trtexec --onnx=model.onnx --saveEngine=model.engine --fp16 --workspace=4096 + +# Benchmark +trtexec --loadEngine=model.engine --batch=1 --iterations=1000 --avgRuns=100 +``` + +## Performance Targets + +| Metric | Real-time | High Accuracy | Edge | +|--------|-----------|---------------|------| +| FPS | >30 | >10 | >15 | +| mAP@50 | >0.6 | >0.8 | >0.5 | +| Latency P99 | <50ms | <150ms | <100ms | +| GPU Memory | <4GB | <8GB | <2GB | +| Model Size | <50MB | <200MB | <20MB | + ## Resources -- Advanced Patterns: `references/computer_vision_architectures.md` -- Implementation Guide: `references/object_detection_optimization.md` -- Technical Reference: `references/production_vision_systems.md` -- Automation Scripts: `scripts/` directory - -## Senior-Level Responsibilities - -As a world-class senior professional: - -1. **Technical Leadership** - - Drive architectural decisions - - Mentor team members - - Establish best practices - - Ensure code quality - -2. **Strategic Thinking** - - Align with business goals - - Evaluate trade-offs - - Plan for scale - - Manage technical debt - -3. **Collaboration** - - Work across teams - - Communicate effectively - - Build consensus - - Share knowledge - -4. **Innovation** - - Stay current with research - - Experiment with new approaches - - Contribute to community - - Drive continuous improvement - -5. 
**Production Excellence** - - Ensure high availability - - Monitor proactively - - Optimize performance - - Respond to incidents +- **Architecture Guide**: `references/computer_vision_architectures.md` +- **Optimization Guide**: `references/object_detection_optimization.md` +- **Deployment Guide**: `references/production_vision_systems.md` +- **Scripts**: `scripts/` directory for automation tools diff --git a/engineering-team/senior-computer-vision/references/computer_vision_architectures.md b/engineering-team/senior-computer-vision/references/computer_vision_architectures.md index ea5f5df..3e6a22a 100644 --- a/engineering-team/senior-computer-vision/references/computer_vision_architectures.md +++ b/engineering-team/senior-computer-vision/references/computer_vision_architectures.md @@ -1,80 +1,683 @@ # Computer Vision Architectures -## Overview +Comprehensive guide to CNN and Vision Transformer architectures for object detection, segmentation, and image classification. -World-class computer vision architectures for senior computer vision engineer. +## Table of Contents -## Core Principles +- [Backbone Architectures](#backbone-architectures) +- [Detection Architectures](#detection-architectures) +- [Segmentation Architectures](#segmentation-architectures) +- [Vision Transformers](#vision-transformers) +- [Feature Pyramid Networks](#feature-pyramid-networks) +- [Architecture Selection](#architecture-selection) -### Production-First Design +--- -Always design with production in mind: -- Scalability: Handle 10x current load -- Reliability: 99.9% uptime target -- Maintainability: Clear, documented code -- Observability: Monitor everything +## Backbone Architectures -### Performance by Design +Backbone networks extract feature representations from images. The choice of backbone affects both accuracy and inference speed. -Optimize from the start: -- Efficient algorithms -- Resource awareness -- Strategic caching -- Batch processing +### ResNet Family -### Security & Privacy +ResNet introduced residual connections that enable training of very deep networks. -Build security in: -- Input validation -- Data encryption -- Access control -- Audit logging +| Variant | Params | GFLOPs | Top-1 Acc | Use Case | +|---------|--------|--------|-----------|----------| +| ResNet-18 | 11.7M | 1.8 | 69.8% | Edge, mobile | +| ResNet-34 | 21.8M | 3.7 | 73.3% | Balanced | +| ResNet-50 | 25.6M | 4.1 | 76.1% | Standard backbone | +| ResNet-101 | 44.5M | 7.8 | 77.4% | High accuracy | +| ResNet-152 | 60.2M | 11.6 | 78.3% | Maximum accuracy | -## Advanced Patterns +**Residual Block Architecture:** -### Pattern 1: Distributed Processing +``` +Input + | + +---> Conv 1x1 (reduce channels) + | | + | Conv 3x3 + | | + | Conv 1x1 (expand channels) + | | + +-----> Add <----+ + | + ReLU + | + Output +``` -Enterprise-scale data processing with fault tolerance. +**When to use ResNet:** +- Standard detection/segmentation tasks +- When pretrained weights are important +- Moderate compute budget +- Well-understood, stable architecture -### Pattern 2: Real-Time Systems +### EfficientNet Family -Low-latency, high-throughput systems. +EfficientNet uses compound scaling to balance depth, width, and resolution. 
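+
+The compound scaling rule ties depth, width, and input resolution to a single coefficient phi: depth scales as alpha^phi, width as beta^phi, and resolution as gamma^phi, with alpha * beta^2 * gamma^2 ~= 2 so that FLOPs roughly double per step of phi. A sketch of the multipliers using the paper's coefficients (alpha=1.2, beta=1.1, gamma=1.15); note the released B1-B7 models round the resulting input sizes:
+
+```python
+ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth/width/resolution bases from the paper
+
+def compound_scale(phi: int):
+    """Return (depth, width, resolution) multipliers for coefficient phi."""
+    return ALPHA ** phi, BETA ** phi, GAMMA ** phi
+
+for phi in range(8):  # roughly B0 through B7
+    d, w, r = compound_scale(phi)
+    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, input ~{round(224 * r)} px")
+```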
-### Pattern 3: ML at Scale +| Variant | Params | GFLOPs | Top-1 Acc | Relative Speed | +|---------|--------|--------|-----------|----------------| +| EfficientNet-B0 | 5.3M | 0.4 | 77.1% | 1x | +| EfficientNet-B1 | 7.8M | 0.7 | 79.1% | 0.7x | +| EfficientNet-B2 | 9.2M | 1.0 | 80.1% | 0.6x | +| EfficientNet-B3 | 12M | 1.8 | 81.6% | 0.4x | +| EfficientNet-B4 | 19M | 4.2 | 82.9% | 0.25x | +| EfficientNet-B5 | 30M | 9.9 | 83.6% | 0.15x | +| EfficientNet-B6 | 43M | 19 | 84.0% | 0.1x | +| EfficientNet-B7 | 66M | 37 | 84.3% | 0.05x | -Production ML with monitoring and automation. +**Key innovations:** +- Mobile Inverted Bottleneck (MBConv) blocks +- Squeeze-and-Excitation attention +- Compound scaling coefficients +- Swish activation function -## Best Practices +**When to use EfficientNet:** +- Mobile and edge deployment +- When parameter efficiency matters +- Classification tasks +- Limited compute resources -### Code Quality -- Comprehensive testing -- Clear documentation -- Code reviews -- Type hints +### ConvNeXt -### Performance -- Profile before optimizing -- Monitor continuously -- Cache strategically -- Batch operations +ConvNeXt modernizes ResNet with techniques from Vision Transformers. -### Reliability -- Design for failure -- Implement retries -- Use circuit breakers -- Monitor health +| Variant | Params | GFLOPs | Top-1 Acc | +|---------|--------|--------|-----------| +| ConvNeXt-T | 29M | 4.5 | 82.1% | +| ConvNeXt-S | 50M | 8.7 | 83.1% | +| ConvNeXt-B | 89M | 15.4 | 83.8% | +| ConvNeXt-L | 198M | 34.4 | 84.3% | +| ConvNeXt-XL | 350M | 60.9 | 84.7% | -## Tools & Technologies +**Key design choices:** +- 7x7 depthwise convolutions (like ViT patch size) +- Layer normalization instead of batch norm +- GELU activation +- Fewer but wider stages +- Inverted bottleneck design -Essential tools for this domain: -- Development frameworks -- Testing libraries -- Deployment platforms -- Monitoring solutions +**ConvNeXt Block:** -## Further Reading +``` +Input + | + +---> DWConv 7x7 + | | + | LayerNorm + | | + | Linear (4x channels) + | | + | GELU + | | + | Linear (1x channels) + | | + +-----> Add <----+ + | + Output +``` -- Research papers -- Industry blogs -- Conference talks -- Open source projects +### CSPNet (Cross Stage Partial) + +CSPNet is the backbone design used in YOLO v4-v8. + +**Key features:** +- Gradient flow optimization +- Reduced computation while maintaining accuracy +- Cross-stage partial connections +- Optimized for real-time detection + +**CSP Block:** + +``` +Input + | + +----> Split ----+ + | | + | Conv Block + | | + | Conv Block + | | + +----> Concat <--+ + | + Output +``` + +--- + +## Detection Architectures + +### Two-Stage Detectors + +Two-stage detectors first propose regions, then classify and refine them. + +#### Faster R-CNN + +Architecture: +1. **Backbone**: Feature extraction (ResNet, etc.) +2. **RPN (Region Proposal Network)**: Generate object proposals +3. **RoI Pooling/Align**: Extract fixed-size features +4. 
**Classification Head**: Classify and refine boxes + +``` +Image → Backbone → Feature Map + | + +→ RPN → Proposals + | | + +→ RoI Align ← + + | + FC Layers + | + Class + BBox +``` + +**RPN Details:** +- Sliding window over feature map +- Anchor boxes at each position (3 scales × 3 ratios = 9) +- Predicts objectness score and box refinement +- NMS to reduce proposals (typically 300-2000) + +**Performance characteristics:** +- mAP@50:95: ~40-42 (COCO, R50-FPN) +- Inference: ~50-100ms per image +- Better localization than single-stage +- Slower but more accurate + +#### Cascade R-CNN + +Multi-stage refinement with increasing IoU thresholds. + +``` +Stage 1 (IoU 0.5) → Stage 2 (IoU 0.6) → Stage 3 (IoU 0.7) +``` + +**Benefits:** +- Progressive refinement +- Better high-IoU predictions +- +3-4 mAP over Faster R-CNN +- Minimal additional cost per stage + +### Single-Stage Detectors + +Single-stage detectors predict boxes and classes in one pass. + +#### YOLO Family + +**YOLOv8 Architecture:** + +``` +Input Image + | + Backbone (CSPDarknet) + | + +--+--+--+ + | | | | + P3 P4 P5 (multi-scale features) + | | | + Neck (PANet + C2f) + | | | + Head (Decoupled) + | + Boxes + Classes +``` + +**Key YOLOv8 innovations:** +- C2f module (faster CSP variant) +- Anchor-free detection head +- Decoupled classification/regression heads +- Task-aligned assigner (TAL) +- Distribution focal loss (DFL) + +**YOLO variant comparison:** + +| Model | Size (px) | Params | mAP@50:95 | Speed (ms) | +|-------|-----------|--------|-----------|------------| +| YOLOv5n | 640 | 1.9M | 28.0 | 1.2 | +| YOLOv5s | 640 | 7.2M | 37.4 | 1.8 | +| YOLOv5m | 640 | 21.2M | 45.4 | 3.5 | +| YOLOv8n | 640 | 3.2M | 37.3 | 1.2 | +| YOLOv8s | 640 | 11.2M | 44.9 | 2.1 | +| YOLOv8m | 640 | 25.9M | 50.2 | 4.2 | +| YOLOv8l | 640 | 43.7M | 52.9 | 6.8 | +| YOLOv8x | 640 | 68.2M | 53.9 | 10.1 | + +#### SSD (Single Shot Detector) + +Multi-scale detection with default boxes. + +**Architecture:** +- VGG16 or MobileNet backbone +- Additional convolution layers for multi-scale +- Default boxes at each scale +- Direct classification and regression + +**When to use SSD:** +- Edge deployment (SSD-MobileNet) +- When YOLO alternatives needed +- Simple architecture requirements + +#### RetinaNet + +Focal loss to handle class imbalance. + +**Key innovation:** +```python +FL(p_t) = -α_t * (1 - p_t)^γ * log(p_t) +``` + +Where: +- γ (focusing parameter) = 2 typically +- α (class weight) = 0.25 for background + +**Benefits:** +- Handles extreme foreground-background imbalance +- Matches two-stage accuracy +- Single-stage speed + +--- + +## Segmentation Architectures + +### Instance Segmentation + +#### Mask R-CNN + +Extends Faster R-CNN with mask prediction branch. + +``` +RoI Features → FC Layers → Class + BBox + | + +→ Conv Layers → Mask (28×28 per class) +``` + +**Key details:** +- RoI Align (bilinear interpolation, no quantization) +- Per-class binary mask prediction +- Decoupled mask and classification +- 14×14 or 28×28 mask resolution + +**Performance:** +- mAP (box): ~39 on COCO +- mAP (mask): ~35 on COCO +- Inference: ~100-200ms + +#### YOLACT / YOLACT++ + +Real-time instance segmentation. + +**Approach:** +1. Generate prototype masks (global) +2. Predict mask coefficients per instance +3. Linear combination: mask = Σ(coefficients × prototypes) + +**Benefits:** +- Real-time (~30 FPS) +- Simpler than Mask R-CNN +- Global prototypes capture spatial info + +#### YOLOv8-Seg + +Adds segmentation head to YOLOv8. 
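+
+A minimal inference sketch with the Ultralytics API (pretrained weights download on first use):
+
+```python
+from ultralytics import YOLO
+
+model = YOLO("yolov8m-seg.pt")
+results = model.predict("image.jpg", conf=0.25)
+
+r = results[0]
+print(r.boxes.xyxy.shape)      # (num_instances, 4) boxes in xyxy format
+if r.masks is not None:        # masks is None when nothing is detected
+    print(r.masks.data.shape)  # (num_instances, H, W) binary masks
+```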
+ +**Performance:** +- mAP (box): 44.6 +- mAP (mask): 36.8 +- Speed: 4.5ms + +### Semantic Segmentation + +#### DeepLabV3+ + +Atrous convolutions for multi-scale context. + +**Key components:** +1. **ASPP (Atrous Spatial Pyramid Pooling)** + - Parallel atrous convolutions at different rates + - Captures multi-scale context + - Rates: 6, 12, 18 typically + +2. **Encoder-Decoder** + - Encoder: Backbone + ASPP + - Decoder: Upsample with skip connections + +``` +Image → Backbone → ASPP → Decoder → Segmentation + ↘ ↗ + Low-level features +``` + +**Performance:** +- mIoU: 89.0 on Cityscapes +- Inference: ~25ms (ResNet-50) + +#### SegFormer + +Transformer-based semantic segmentation. + +**Architecture:** +1. **Hierarchical Transformer Encoder** + - Multi-scale feature maps + - Efficient self-attention + - Overlapping patch embedding + +2. **MLP Decoder** + - Simple MLP aggregation + - No complex decoders needed + +**Benefits:** +- No positional encoding needed +- Efficient attention mechanism +- Strong multi-scale features + +### Promptable Segmentation + +#### SAM (Segment Anything Model) + +Zero-shot segmentation with prompts. + +**Architecture:** +1. **Image Encoder**: ViT-H (632M params) +2. **Prompt Encoder**: Points, boxes, masks, text +3. **Mask Decoder**: Lightweight transformer + +**Prompts supported:** +- Points (foreground/background) +- Bounding boxes +- Rough masks +- Text (via CLIP integration) + +**Usage patterns:** +```python +# Point prompt +masks = sam.predict(image, point_coords=[[500, 375]], point_labels=[1]) + +# Box prompt +masks = sam.predict(image, box=[100, 100, 400, 400]) + +# Multiple points +masks = sam.predict(image, point_coords=[[500, 375], [200, 300]], + point_labels=[1, 0]) # 1=foreground, 0=background +``` + +--- + +## Vision Transformers + +### ViT (Vision Transformer) + +Original vision transformer architecture. + +**Architecture:** + +``` +Image → Patch Embedding → [CLS] + Position Embedding + ↓ + Transformer Encoder ×L + ↓ + [CLS] token + ↓ + Classification Head +``` + +**Key details:** +- Patch size: 16×16 or 14×14 typically +- Position embeddings: Learned 1D +- [CLS] token for classification +- Standard transformer encoder blocks + +**Variants:** + +| Model | Patch | Layers | Hidden | Heads | Params | +|-------|-------|--------|--------|-------|--------| +| ViT-Ti | 16 | 12 | 192 | 3 | 5.7M | +| ViT-S | 16 | 12 | 384 | 6 | 22M | +| ViT-B | 16 | 12 | 768 | 12 | 86M | +| ViT-L | 16 | 24 | 1024 | 16 | 304M | +| ViT-H | 14 | 32 | 1280 | 16 | 632M | + +### DeiT (Data-efficient Image Transformers) + +Training ViT without massive datasets. + +**Key innovations:** +- Knowledge distillation from CNN teachers +- Strong data augmentation +- Regularization (stochastic depth, label smoothing) +- Distillation token (learns from teacher) + +**Training recipe:** +- RandAugment +- Mixup (α=0.8) +- CutMix (α=1.0) +- Random erasing (p=0.25) +- Stochastic depth (p=0.1) + +### Swin Transformer + +Hierarchical transformer with shifted windows. + +**Key innovations:** +1. **Shifted Window Attention** + - Local attention within windows + - Cross-window connection via shifting + - O(n) complexity vs O(n²) for global attention + +2. 
**Hierarchical Feature Maps** + - Patch merging between stages + - Similar to CNN feature pyramids + - Direct use in detection/segmentation + +**Architecture:** + +``` +Stage 1: 56×56, 96-dim → Patch Merge +Stage 2: 28×28, 192-dim → Patch Merge +Stage 3: 14×14, 384-dim → Patch Merge +Stage 4: 7×7, 768-dim +``` + +**Variants:** + +| Model | Params | GFLOPs | Top-1 | +|-------|--------|--------|-------| +| Swin-T | 29M | 4.5 | 81.3% | +| Swin-S | 50M | 8.7 | 83.0% | +| Swin-B | 88M | 15.4 | 83.5% | +| Swin-L | 197M | 34.5 | 84.5% | + +--- + +## Feature Pyramid Networks + +FPN variants for multi-scale detection. + +### Original FPN + +Top-down pathway with lateral connections. + +``` +P5 ← C5 (1/32) + ↓ +P4 ← C4 + Upsample(P5) (1/16) + ↓ +P3 ← C3 + Upsample(P4) (1/8) + ↓ +P2 ← C2 + Upsample(P3) (1/4) +``` + +### PANet (Path Aggregation Network) + +Bottom-up augmentation after FPN. + +``` +FPN top-down → Bottom-up augmentation +P2 → N2 ↘ +P3 → N3 → N3 ↘ +P4 → N4 → N4 → N4 ↘ +P5 → N5 → N5 → N5 → N5 +``` + +**Benefits:** +- Shorter path from low-level to high-level +- Better localization signals +- +1-2 mAP improvement + +### BiFPN (Bidirectional FPN) + +Weighted bidirectional feature fusion. + +**Key innovations:** +- Learnable fusion weights +- Bidirectional cross-scale connections +- Repeated blocks for iterative refinement + +**Fusion formula:** +``` +O = Σ(w_i × I_i) / (ε + Σ w_i) +``` + +Where weights are learned via fast normalized fusion. + +### NAS-FPN + +Neural architecture search for FPN design. + +**Searched on COCO:** +- 7 fusion cells +- Optimized connection patterns +- 3-4 mAP improvement over FPN + +--- + +## Architecture Selection + +### Decision Matrix + +| Requirement | Recommended | Alternative | +|-------------|-------------|-------------| +| Real-time (>30 FPS) | YOLOv8s | RT-DETR-S | +| Edge (<4GB RAM) | YOLOv8n | MobileNetV3-SSD | +| High accuracy | DINO, Cascade R-CNN | YOLOv8x | +| Instance segmentation | Mask R-CNN | YOLOv8-seg | +| Semantic segmentation | SegFormer | DeepLabV3+ | +| Zero-shot | SAM | CLIP+segmentation | +| Small objects | YOLO+SAHI | Cascade R-CNN | +| Video real-time | YOLOv8 + ByteTrack | YOLOX + SORT | + +### Training Data Requirements + +| Architecture | Minimum Images | Recommended | +|--------------|----------------|-------------| +| YOLO (fine-tune) | 100-500 | 1,000-5,000 | +| YOLO (from scratch) | 5,000+ | 10,000+ | +| Faster R-CNN | 1,000+ | 5,000+ | +| DETR/DINO | 10,000+ | 50,000+ | +| ViT backbone | 10,000+ | 100,000+ | +| SAM (fine-tune) | 100-1,000 | 5,000+ | + +### Compute Requirements + +| Architecture | Training GPU | Inference GPU | +|--------------|--------------|---------------| +| YOLOv8n | 4GB VRAM | 2GB VRAM | +| YOLOv8m | 8GB VRAM | 4GB VRAM | +| YOLOv8x | 16GB VRAM | 8GB VRAM | +| Faster R-CNN R50 | 8GB VRAM | 4GB VRAM | +| Mask R-CNN R101 | 16GB VRAM | 8GB VRAM | +| DINO-4scale | 32GB VRAM | 16GB VRAM | +| SAM ViT-H | 32GB VRAM | 8GB VRAM | + +--- + +## Code Examples + +### Load Pretrained Backbone (timm) + +```python +import timm + +# List available models +print(timm.list_models('*resnet*')) + +# Load pretrained +backbone = timm.create_model('resnet50', pretrained=True, features_only=True) + +# Get feature maps +features = backbone(torch.randn(1, 3, 224, 224)) +for f in features: + print(f.shape) +# torch.Size([1, 64, 56, 56]) +# torch.Size([1, 256, 56, 56]) +# torch.Size([1, 512, 28, 28]) +# torch.Size([1, 1024, 14, 14]) +# torch.Size([1, 2048, 7, 7]) +``` + +### Custom Detection Backbone + +```python +import torch.nn as 
nn +from torchvision.models import resnet50 +from torchvision.ops import FeaturePyramidNetwork + +class DetectionBackbone(nn.Module): + def __init__(self): + super().__init__() + backbone = resnet50(pretrained=True) + + self.layer1 = nn.Sequential(backbone.conv1, backbone.bn1, + backbone.relu, backbone.maxpool, + backbone.layer1) + self.layer2 = backbone.layer2 + self.layer3 = backbone.layer3 + self.layer4 = backbone.layer4 + + self.fpn = FeaturePyramidNetwork( + in_channels_list=[256, 512, 1024, 2048], + out_channels=256 + ) + + def forward(self, x): + c1 = self.layer1(x) + c2 = self.layer2(c1) + c3 = self.layer3(c2) + c4 = self.layer4(c3) + + features = {'feat0': c1, 'feat1': c2, 'feat2': c3, 'feat3': c4} + pyramid = self.fpn(features) + return pyramid +``` + +### Vision Transformer with Detection Head + +```python +import timm + +# Swin Transformer for detection +swin = timm.create_model('swin_base_patch4_window7_224', + pretrained=True, + features_only=True, + out_indices=[0, 1, 2, 3]) + +# Get multi-scale features +x = torch.randn(1, 3, 224, 224) +features = swin(x) +for i, f in enumerate(features): + print(f"Stage {i}: {f.shape}") +# Stage 0: torch.Size([1, 128, 56, 56]) +# Stage 1: torch.Size([1, 256, 28, 28]) +# Stage 2: torch.Size([1, 512, 14, 14]) +# Stage 3: torch.Size([1, 1024, 7, 7]) +``` + +--- + +## Resources + +- [torchvision models](https://pytorch.org/vision/stable/models.html) +- [timm library](https://github.com/huggingface/pytorch-image-models) +- [Detectron2 Model Zoo](https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md) +- [MMDetection Model Zoo](https://github.com/open-mmlab/mmdetection/blob/main/docs/en/model_zoo.md) +- [Ultralytics YOLOv8](https://docs.ultralytics.com/) diff --git a/engineering-team/senior-computer-vision/references/object_detection_optimization.md b/engineering-team/senior-computer-vision/references/object_detection_optimization.md index 81a7c2d..cc7bca5 100644 --- a/engineering-team/senior-computer-vision/references/object_detection_optimization.md +++ b/engineering-team/senior-computer-vision/references/object_detection_optimization.md @@ -1,80 +1,885 @@ # Object Detection Optimization -## Overview +Comprehensive guide to optimizing object detection models for accuracy and inference speed. -World-class object detection optimization for senior computer vision engineer. +## Table of Contents -## Core Principles +- [Non-Maximum Suppression](#non-maximum-suppression) +- [Anchor Design and Optimization](#anchor-design-and-optimization) +- [Loss Functions](#loss-functions) +- [Training Strategies](#training-strategies) +- [Data Augmentation](#data-augmentation) +- [Model Optimization Techniques](#model-optimization-techniques) +- [Hyperparameter Tuning](#hyperparameter-tuning) -### Production-First Design +--- -Always design with production in mind: -- Scalability: Handle 10x current load -- Reliability: 99.9% uptime target -- Maintainability: Clear, documented code -- Observability: Monitor everything +## Non-Maximum Suppression -### Performance by Design +NMS removes redundant overlapping detections to produce final predictions. -Optimize from the start: -- Efficient algorithms -- Resource awareness -- Strategic caching -- Batch processing +### Standard NMS -### Security & Privacy +Basic algorithm: +1. Sort boxes by confidence score +2. Select highest confidence box +3. Remove boxes with IoU > threshold +4. 
Repeat until no boxes remain -Build security in: -- Input validation -- Data encryption -- Access control -- Audit logging +```python +def nms(boxes, scores, iou_threshold=0.5): + """ + boxes: (N, 4) in format [x1, y1, x2, y2] + scores: (N,) + """ + order = scores.argsort()[::-1] + keep = [] -## Advanced Patterns + while len(order) > 0: + i = order[0] + keep.append(i) -### Pattern 1: Distributed Processing + if len(order) == 1: + break -Enterprise-scale data processing with fault tolerance. + # Calculate IoU with remaining boxes + ious = compute_iou(boxes[i], boxes[order[1:]]) -### Pattern 2: Real-Time Systems + # Keep boxes with IoU <= threshold + mask = ious <= iou_threshold + order = order[1:][mask] -Low-latency, high-throughput systems. + return keep +``` -### Pattern 3: ML at Scale +**Parameters:** +- `iou_threshold`: 0.5-0.7 typical (lower = more suppression) +- `score_threshold`: 0.25-0.5 (filter low-confidence first) -Production ML with monitoring and automation. +### Soft-NMS -## Best Practices +Reduces scores instead of removing boxes entirely. -### Code Quality -- Comprehensive testing -- Clear documentation -- Code reviews -- Type hints +**Formula:** +``` +score = score * exp(-IoU^2 / sigma) +``` -### Performance -- Profile before optimizing -- Monitor continuously -- Cache strategically -- Batch operations +**Benefits:** +- Better for overlapping objects +- +1-2% mAP improvement +- Slightly slower than hard NMS -### Reliability -- Design for failure -- Implement retries -- Use circuit breakers -- Monitor health +```python +def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001): + """Gaussian penalty soft-NMS""" + order = scores.argsort()[::-1] + keep = [] -## Tools & Technologies + while len(order) > 0: + i = order[0] + keep.append(i) -Essential tools for this domain: -- Development frameworks -- Testing libraries -- Deployment platforms -- Monitoring solutions + if len(order) == 1: + break -## Further Reading + ious = compute_iou(boxes[i], boxes[order[1:]]) -- Research papers -- Industry blogs -- Conference talks -- Open source projects + # Gaussian penalty + weights = np.exp(-ious**2 / sigma) + scores[order[1:]] *= weights + + # Re-sort by updated scores + mask = scores[order[1:]] > score_threshold + order = order[1:][mask] + order = order[scores[order].argsort()[::-1]] + + return keep +``` + +### DIoU-NMS + +Uses Distance-IoU instead of standard IoU. + +**Formula:** +``` +DIoU = IoU - (d^2 / c^2) +``` + +Where: +- d = center distance between boxes +- c = diagonal of smallest enclosing box + +**Benefits:** +- Better for occluded objects +- Penalizes distant boxes less +- Works well with DIoU loss + +### Batched NMS + +NMS per class (prevents cross-class suppression). + +```python +def batched_nms(boxes, scores, classes, iou_threshold): + """Per-class NMS""" + # Offset boxes by class ID to prevent cross-class suppression + max_coordinate = boxes.max() + offsets = classes * (max_coordinate + 1) + boxes_for_nms = boxes + offsets[:, None] + + keep = torchvision.ops.nms(boxes_for_nms, scores, iou_threshold) + return keep +``` + +### NMS-Free Detection (DETR-style) + +Transformer-based detectors eliminate NMS. 
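+
+The core mechanism, summarized below, is one-to-one bipartite (Hungarian) matching between predictions and ground truth during training. A simplified matching sketch using SciPy; real DETR also adds a GIoU term to the cost and auxiliary decoder losses:
+
+```python
+import numpy as np
+from scipy.optimize import linear_sum_assignment
+
+def hungarian_match(pred_probs, pred_boxes, gt_classes, gt_boxes, l1_weight=5.0):
+    """Match N predictions to M ground truths (N >= M).
+
+    pred_probs: (N, C) class probabilities; boxes are normalized cxcywh.
+    Cost combines class likelihood and L1 box distance.
+    """
+    cls_cost = -pred_probs[:, gt_classes]                            # (N, M)
+    box_cost = np.abs(pred_boxes[:, None] - gt_boxes[None]).sum(-1)  # (N, M)
+    pred_idx, gt_idx = linear_sum_assignment(cls_cost + l1_weight * box_cost)
+    return pred_idx, gt_idx  # unmatched predictions train as "no object"
+```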
+ +**How DETR avoids NMS:** +- Object queries are learned embeddings +- Bipartite matching in training +- Each query outputs exactly one detection +- Set-based loss enforces uniqueness + +**Benefits:** +- End-to-end differentiable +- No hand-crafted post-processing +- Better for complex scenes + +--- + +## Anchor Design and Optimization + +### Anchor-Based Detection + +Traditional detectors use predefined anchor boxes. + +**Anchor parameters:** +- Scales: [32, 64, 128, 256, 512] pixels +- Ratios: [0.5, 1.0, 2.0] (height/width) +- Stride: Feature map stride (8, 16, 32) + +**Anchor assignment:** +- Positive: IoU > 0.7 with ground truth +- Negative: IoU < 0.3 with all ground truths +- Ignored: 0.3 < IoU < 0.7 + +### K-Means Anchor Clustering + +Optimize anchors for your dataset. + +```python +import numpy as np +from sklearn.cluster import KMeans + +def optimize_anchors(annotations, num_anchors=9, image_size=640): + """ + annotations: list of (width, height) for each bounding box + """ + # Normalize to input size + boxes = np.array(annotations) + boxes = boxes / boxes.max() * image_size + + # K-means clustering + kmeans = KMeans(n_clusters=num_anchors, random_state=42) + kmeans.fit(boxes) + + # Get anchor sizes + anchors = kmeans.cluster_centers_ + + # Sort by area + areas = anchors[:, 0] * anchors[:, 1] + anchors = anchors[np.argsort(areas)] + + # Calculate mean IoU with ground truth + mean_iou = calculate_anchor_fit(boxes, anchors) + print(f"Optimized anchors (mean IoU: {mean_iou:.3f}):") + print(anchors.astype(int)) + + return anchors + +def calculate_anchor_fit(boxes, anchors): + """Calculate how well anchors fit the boxes""" + ious = [] + for box in boxes: + box_area = box[0] * box[1] + anchor_areas = anchors[:, 0] * anchors[:, 1] + intersections = np.minimum(box[0], anchors[:, 0]) * \ + np.minimum(box[1], anchors[:, 1]) + unions = box_area + anchor_areas - intersections + max_iou = (intersections / unions).max() + ious.append(max_iou) + return np.mean(ious) +``` + +### Anchor-Free Detection + +Modern detectors predict boxes without anchors. + +**FCOS-style (center-based):** +- Predict (l, t, r, b) distances from center +- Centerness score for quality +- Multi-scale assignment + +**YOLO v8 style:** +- Predict (x, y, w, h) directly +- Task-aligned assigner +- Distribution focal loss for regression + +**Benefits of anchor-free:** +- No hyperparameter tuning for anchors +- Simpler architecture +- Better generalization + +### Anchor Assignment Strategies + +**ATSS (Adaptive Training Sample Selection):** +1. For each GT, select k closest anchors per level +2. Calculate IoU for selected anchors +3. IoU threshold = mean + std of IoUs +4. Assign positives where IoU > threshold + +**TAL (Task-Aligned Assigner - YOLO v8):** +``` +score = cls_score^alpha * IoU^beta +``` + +Where alpha=0.5, beta=6.0 (weights classification and localization) + +--- + +## Loss Functions + +### Classification Losses + +#### Cross-Entropy Loss + +Standard multi-class classification: +```python +loss = -log(p_correct_class) +``` + +#### Focal Loss + +Handles class imbalance by down-weighting easy examples. 
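+
+Formally, with p_t the model's probability for the true class:
+
+```
+FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)
+```
+
+Easy examples (p_t near 1) are suppressed by the (1 - p_t)^gamma factor, so the gradient signal concentrates on hard, misclassified examples. A simplified sketch: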
+ +```python +def focal_loss(pred, target, gamma=2.0, alpha=0.25): + """ + pred: (N, num_classes) predicted probabilities + target: (N,) ground truth class indices + """ + ce_loss = F.cross_entropy(pred, target, reduction='none') + pt = torch.exp(-ce_loss) # probability of correct class + + # Focal term: (1 - pt)^gamma + focal_term = (1 - pt) ** gamma + + # Alpha weighting + alpha_t = alpha * target + (1 - alpha) * (1 - target) + + loss = alpha_t * focal_term * ce_loss + return loss.mean() +``` + +**Hyperparameters:** +- gamma: 2.0 typical, higher = more focus on hard examples +- alpha: 0.25 for foreground class weight + +#### Quality Focal Loss (QFL) + +Combines classification with IoU quality. + +```python +def quality_focal_loss(pred, target, beta=2.0): + """ + target: IoU values (0-1) instead of binary + """ + ce = F.binary_cross_entropy(pred, target, reduction='none') + focal_weight = torch.abs(pred - target) ** beta + loss = focal_weight * ce + return loss.mean() +``` + +### Regression Losses + +#### Smooth L1 Loss + +```python +def smooth_l1_loss(pred, target, beta=1.0): + diff = torch.abs(pred - target) + loss = torch.where( + diff < beta, + 0.5 * diff ** 2 / beta, + diff - 0.5 * beta + ) + return loss.mean() +``` + +#### IoU-Based Losses + +**IoU Loss:** +``` +L_IoU = 1 - IoU +``` + +**GIoU (Generalized IoU):** +``` +GIoU = IoU - (C - U) / C +L_GIoU = 1 - GIoU +``` + +Where C = area of smallest enclosing box, U = union area. + +**DIoU (Distance IoU):** +``` +DIoU = IoU - d^2 / c^2 +L_DIoU = 1 - DIoU +``` + +Where d = center distance, c = diagonal of enclosing box. + +**CIoU (Complete IoU):** +``` +CIoU = IoU - d^2 / c^2 - alpha*v +v = (4/pi^2) * (arctan(w_gt/h_gt) - arctan(w/h))^2 +alpha = v / (1 - IoU + v) +L_CIoU = 1 - CIoU +``` + +**Comparison:** + +| Loss | Handles | Best For | +|------|---------|----------| +| L1/L2 | Basic regression | Simple tasks | +| IoU | Overlap | Standard detection | +| GIoU | Non-overlapping | Distant boxes | +| DIoU | Center distance | Faster convergence | +| CIoU | Aspect ratio | Best accuracy | + +```python +def ciou_loss(pred_boxes, target_boxes): + """ + pred_boxes, target_boxes: (N, 4) as [x1, y1, x2, y2] + """ + # Standard IoU + inter = compute_intersection(pred_boxes, target_boxes) + union = compute_union(pred_boxes, target_boxes) + iou = inter / (union + 1e-7) + + # Enclosing box diagonal + enclose_x1 = torch.min(pred_boxes[:, 0], target_boxes[:, 0]) + enclose_y1 = torch.min(pred_boxes[:, 1], target_boxes[:, 1]) + enclose_x2 = torch.max(pred_boxes[:, 2], target_boxes[:, 2]) + enclose_y2 = torch.max(pred_boxes[:, 3], target_boxes[:, 3]) + c_sq = (enclose_x2 - enclose_x1)**2 + (enclose_y2 - enclose_y1)**2 + + # Center distance + pred_cx = (pred_boxes[:, 0] + pred_boxes[:, 2]) / 2 + pred_cy = (pred_boxes[:, 1] + pred_boxes[:, 3]) / 2 + target_cx = (target_boxes[:, 0] + target_boxes[:, 2]) / 2 + target_cy = (target_boxes[:, 1] + target_boxes[:, 3]) / 2 + d_sq = (pred_cx - target_cx)**2 + (pred_cy - target_cy)**2 + + # Aspect ratio term + pred_w = pred_boxes[:, 2] - pred_boxes[:, 0] + pred_h = pred_boxes[:, 3] - pred_boxes[:, 1] + target_w = target_boxes[:, 2] - target_boxes[:, 0] + target_h = target_boxes[:, 3] - target_boxes[:, 1] + + v = (4 / math.pi**2) * ( + torch.atan(target_w / target_h) - torch.atan(pred_w / pred_h) + )**2 + alpha_term = v / (1 - iou + v + 1e-7) + + ciou = iou - d_sq / (c_sq + 1e-7) - alpha_term * v + return 1 - ciou +``` + +### Distribution Focal Loss (DFL) + +Used in YOLO v8 for regression. 
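+
+The underlying loss, from the Generalized Focal Loss paper, pulls probability mass toward the two bins that bracket the continuous target y (S_i is the predicted probability of bin i):
+
+```
+DFL(S_i, S_{i+1}) = -((y_{i+1} - y) * log(S_i) + (y - y_i) * log(S_{i+1}))
+```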
+ +**Concept:** +- Predict distribution over discrete positions +- Each regression target is a soft label +- Allows uncertainty estimation + +```python +def dfl_loss(pred_dist, target, reg_max=16): + """ + pred_dist: (N, reg_max) predicted distribution + target: (N,) continuous target values (0 to reg_max) + """ + # Convert continuous target to soft label + target_left = target.floor().long() + target_right = target_left + 1 + weight_right = target - target_left.float() + weight_left = 1 - weight_right + + # Cross-entropy with soft targets + loss_left = F.cross_entropy(pred_dist, target_left, reduction='none') + loss_right = F.cross_entropy(pred_dist, target_right.clamp(max=reg_max-1), + reduction='none') + + loss = weight_left * loss_left + weight_right * loss_right + return loss.mean() +``` + +--- + +## Training Strategies + +### Learning Rate Schedules + +**Warmup:** +```python +# Linear warmup for first N epochs +if epoch < warmup_epochs: + lr = base_lr * (epoch + 1) / warmup_epochs +``` + +**Cosine Annealing:** +```python +lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * epoch / total_epochs)) +``` + +**Step Decay:** +```python +# Reduce by factor at milestones +lr = base_lr * (0.1 ** (milestones_passed)) +``` + +**Recommended schedule for detection:** +```python +optimizer = SGD(model.parameters(), lr=0.01, momentum=0.937, weight_decay=0.0005) + +scheduler = torch.optim.lr_scheduler.CosineAnnealingLR( + optimizer, + T_max=total_epochs, + eta_min=0.0001 +) + +# With warmup +warmup_scheduler = torch.optim.lr_scheduler.LinearLR( + optimizer, + start_factor=0.1, + total_iters=warmup_epochs +) + +scheduler = torch.optim.lr_scheduler.SequentialLR( + optimizer, + schedulers=[warmup_scheduler, scheduler], + milestones=[warmup_epochs] +) +``` + +### Exponential Moving Average (EMA) + +Smooths model weights for better stability. + +```python +class EMA: + def __init__(self, model, decay=0.9999): + self.model = model + self.decay = decay + self.shadow = {} + for name, param in model.named_parameters(): + if param.requires_grad: + self.shadow[name] = param.data.clone() + + def update(self): + for name, param in self.model.named_parameters(): + if param.requires_grad: + self.shadow[name] = ( + self.decay * self.shadow[name] + + (1 - self.decay) * param.data + ) + + def apply_shadow(self): + for name, param in self.model.named_parameters(): + if param.requires_grad: + param.data.copy_(self.shadow[name]) +``` + +**Usage:** +- Update EMA after each training step +- Use EMA weights for validation/inference +- Decay: 0.9999 typical (higher = slower update) + +### Multi-Scale Training + +Train with varying input sizes. + +```python +# Random size each batch +sizes = [480, 512, 544, 576, 608, 640, 672, 704, 736, 768] +input_size = random.choice(sizes) + +# Resize batch to selected size +images = F.interpolate(images, size=input_size, mode='bilinear') +``` + +**Benefits:** +- Better scale invariance +- +1-2% mAP improvement +- Slower training (variable batch size) + +### Gradient Accumulation + +Simulate larger batch sizes. + +```python +accumulation_steps = 4 +optimizer.zero_grad() + +for i, (images, targets) in enumerate(dataloader): + loss = model(images, targets) / accumulation_steps + loss.backward() + + if (i + 1) % accumulation_steps == 0: + optimizer.step() + optimizer.zero_grad() +``` + +### Mixed Precision Training + +Use FP16 for speed and memory. 
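+
+FP16 gradients can underflow to zero, so `GradScaler` multiplies the loss by a large factor before `backward()`, unscales gradients before the optimizer step, and skips any step where inf/NaN gradients appear, adjusting the scale factor automatically: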
+ +```python +from torch.cuda.amp import autocast, GradScaler + +scaler = GradScaler() + +for images, targets in dataloader: + optimizer.zero_grad() + + with autocast(): + loss = model(images, targets) + + scaler.scale(loss).backward() + scaler.step(optimizer) + scaler.update() +``` + +**Benefits:** +- 2-3x faster training +- 50% memory reduction +- Minimal accuracy loss + +--- + +## Data Augmentation + +### Geometric Augmentations + +```python +import albumentations as A + +geometric = A.Compose([ + A.HorizontalFlip(p=0.5), + A.Rotate(limit=15, p=0.3), + A.RandomScale(scale_limit=0.2, p=0.5), + A.Affine(translate_percent={'x': (-0.1, 0.1), 'y': (-0.1, 0.1)}, p=0.3), +], bbox_params=A.BboxParams(format='coco', label_fields=['class_labels'])) +``` + +### Color Augmentations + +```python +color = A.Compose([ + A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5), + A.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=0.5), + A.CLAHE(clip_limit=2.0, p=0.1), + A.GaussianBlur(blur_limit=3, p=0.1), + A.GaussNoise(var_limit=(10, 50), p=0.1), +]) +``` + +### Mosaic Augmentation + +Combines 4 images into one (YOLO-style). + +```python +def mosaic_augmentation(images, labels, input_size=640): + """ + images: list of 4 images + labels: list of 4 label arrays + """ + result_image = np.zeros((input_size, input_size, 3), dtype=np.uint8) + result_labels = [] + + # Random center point + cx = int(random.uniform(input_size * 0.25, input_size * 0.75)) + cy = int(random.uniform(input_size * 0.25, input_size * 0.75)) + + positions = [ + (0, 0, cx, cy), # top-left + (cx, 0, input_size, cy), # top-right + (0, cy, cx, input_size), # bottom-left + (cx, cy, input_size, input_size), # bottom-right + ] + + for i, (x1, y1, x2, y2) in enumerate(positions): + img = images[i] + h, w = y2 - y1, x2 - x1 + + # Resize and place + img_resized = cv2.resize(img, (w, h)) + result_image[y1:y2, x1:x2] = img_resized + + # Transform labels + for label in labels[i]: + # Scale and shift bounding boxes + new_label = transform_bbox(label, img.shape, (h, w), (x1, y1)) + result_labels.append(new_label) + + return result_image, result_labels +``` + +### MixUp + +Blends two images and labels. + +```python +def mixup(image1, labels1, image2, labels2, alpha=0.5): + """ + alpha: mixing ratio (0.5 = equal blend) + """ + # Blend images + mixed_image = (alpha * image1 + (1 - alpha) * image2).astype(np.uint8) + + # Blend labels with soft weights + labels1_weighted = [(box, cls, alpha) for box, cls in labels1] + labels2_weighted = [(box, cls, 1-alpha) for box, cls in labels2] + + mixed_labels = labels1_weighted + labels2_weighted + return mixed_image, mixed_labels +``` + +### Copy-Paste Augmentation + +Paste objects from one image to another. + +```python +def copy_paste(background, bg_labels, source, src_labels, src_masks): + """ + Paste segmented objects onto background + """ + result = background.copy() + + for mask, label in zip(src_masks, src_labels): + # Random position + x_offset = random.randint(0, background.shape[1] - mask.shape[1]) + y_offset = random.randint(0, background.shape[0] - mask.shape[0]) + + # Paste with mask + region = result[y_offset:y_offset+mask.shape[0], + x_offset:x_offset+mask.shape[1]] + region[mask > 0] = source[mask > 0] + + # Add new label + new_box = transform_bbox(label, x_offset, y_offset) + bg_labels.append(new_box) + + return result, bg_labels +``` + +### Cutout / Random Erasing + +Randomly erase patches. 
+

```python
def cutout(image, num_holes=8, max_h_size=32, max_w_size=32):
    h, w = image.shape[:2]
    result = image.copy()

    for _ in range(num_holes):
        y = random.randint(0, h)
        x = random.randint(0, w)
        h_size = random.randint(1, max_h_size)
        w_size = random.randint(1, max_w_size)

        y1, y2 = max(0, y - h_size // 2), min(h, y + h_size // 2)
        x1, x2 = max(0, x - w_size // 2), min(w, x + w_size // 2)

        result[y1:y2, x1:x2] = 0  # or a random color

    return result
```

---

## Model Optimization Techniques

### Pruning

Remove unimportant weights.

**Magnitude Pruning:**
```python
import torch.nn.utils.prune as prune

# Prune 30% of weights with smallest magnitude
for name, module in model.named_modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name='weight', amount=0.3)
```

**Structured Pruning (channels):**
```python
# Prune entire channels (L2 norm over output channels, dim=0)
prune.ln_structured(module, name='weight', amount=0.3, n=2, dim=0)
```

### Knowledge Distillation

Train a smaller student model against a larger teacher.

```python
def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """
    Combine soft targets from the teacher with hard labels.
    """
    # Soft targets
    soft_student = F.log_softmax(student_logits / temperature, dim=1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    soft_loss = F.kl_div(soft_student, soft_teacher, reduction='batchmean')
    soft_loss *= temperature ** 2  # Scale by T^2 to keep gradient magnitudes comparable

    # Hard targets
    hard_loss = F.cross_entropy(student_logits, labels)

    # Combined loss
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

### Quantization

Reduce precision for faster inference.

**Post-Training Quantization:**
```python
import torch
import torch.quantization

# Prepare model (must be in eval mode for post-training quantization)
model.eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
torch.quantization.prepare(model, inplace=True)

# Calibrate with representative data
with torch.no_grad():
    for images in calibration_loader:
        model(images)

# Convert to quantized model
torch.quantization.convert(model, inplace=True)
```

**Quantization-Aware Training:**
```python
# Insert fake quantization during training
model.train()
model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
model_prepared = torch.quantization.prepare_qat(model)

# Train with fake quantization
for epoch in range(num_epochs):
    train(model_prepared)

# Convert to quantized
model_quantized = torch.quantization.convert(model_prepared)
```

---

## Hyperparameter Tuning

### Key Hyperparameters

| Parameter | Range | Default | Impact |
|-----------|-------|---------|--------|
| Learning rate | 1e-4 to 1e-1 | 0.01 | Critical |
| Batch size | 4 to 64 | 16 | Memory/speed |
| Weight decay | 1e-5 to 1e-3 | 5e-4 | Regularization |
| Momentum | 0.9 to 0.99 | 0.937 | Optimization |
| Warmup epochs | 1 to 10 | 3 | Stability |
| IoU threshold (NMS) | 0.4 to 0.7 | 0.5 | Recall/precision |
| Confidence threshold | 0.1 to 0.5 | 0.25 | Detection count |
| Image size | 320 to 1280 | 640 | Accuracy/speed |

### Tuning Strategy

1. **Baseline**: Use default hyperparameters
2. **Learning rate**: Grid search [1e-3, 5e-3, 1e-2, 5e-2]
3. **Batch size**: Maximum that fits in memory
4. **Augmentation**: Start minimal, add progressively
5. **Epochs**: Train until validation loss plateaus
6. **NMS threshold**: Tune on validation set

### Automated Hyperparameter Optimization

```python
import optuna

def objective(trial):
    # suggest_float(..., log=True) replaces the deprecated suggest_loguniform
    lr = trial.suggest_float('lr', 1e-4, 1e-1, log=True)
    weight_decay = trial.suggest_float('weight_decay', 1e-5, 1e-3, log=True)
    mosaic_prob = trial.suggest_float('mosaic_prob', 0.0, 1.0)

    model = create_model()
    train_model(model, lr=lr, weight_decay=weight_decay, mosaic_prob=mosaic_prob)
    mAP = test_model(model)

    return mAP

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

print(f"Best params: {study.best_params}")
print(f"Best mAP: {study.best_value}")
```

---

## Detection-Specific Tips

### Small Object Detection

1. **Higher resolution**: 1280px input instead of 640px
2. **SAHI (slicing)**: Run inference on overlapping tiles
3. **More FPN levels**: Add the P2 level (1/4 scale)
4. **Anchor adjustment**: Smaller anchors for small objects
5. **Copy-paste augmentation**: Increase small object frequency

### Handling Class Imbalance

1. **Focal loss**: gamma=2.0, alpha=0.25
2. **Over-sampling**: Repeat rare class images
3. **Class weights**: Inverse frequency weighting
4. **Copy-paste**: Augment rare classes

### Improving Localization

1. **CIoU loss**: Adds an aspect ratio term
2. **Cascade detection**: Progressive refinement
3. **Higher IoU threshold**: 0.6-0.7 for positive samples
4. **Deformable convolutions**: Learn spatial offsets

### Reducing False Positives

1. **Higher confidence threshold**: 0.4-0.5
2. **More negative samples**: Hard negative mining
3. **Background class weight**: Increase the penalty
4. **Ensemble**: Multiple model voting

---

## Resources

- [MMDetection training configs](https://github.com/open-mmlab/mmdetection/tree/main/configs)
- [Ultralytics training tips](https://docs.ultralytics.com/guides/hyperparameter-tuning/)
- [Albumentations detection](https://albumentations.ai/docs/getting_started/bounding_boxes_augmentation/)
- [Focal Loss paper](https://arxiv.org/abs/1708.02002)
- [CIoU paper](https://arxiv.org/abs/2005.03572)
diff --git a/engineering-team/senior-computer-vision/references/production_vision_systems.md b/engineering-team/senior-computer-vision/references/production_vision_systems.md
index e1c2e4b..7242ebf 100644
--- a/engineering-team/senior-computer-vision/references/production_vision_systems.md
+++ b/engineering-team/senior-computer-vision/references/production_vision_systems.md
@@ -1,80 +1,1226 @@
 # Production Vision Systems
 
-## Overview
+Comprehensive guide to deploying computer vision models in production environments.
 
-World-class production vision systems for senior computer vision engineer. 
+## Table of Contents

-## Core Principles

+- [Model Export and Optimization](#model-export-and-optimization)
+- [TensorRT Deployment](#tensorrt-deployment)
+- [ONNX Runtime Deployment](#onnx-runtime-deployment)
+- [Edge Device Deployment](#edge-device-deployment)
+- [Model Serving](#model-serving)
+- [Video Processing Pipelines](#video-processing-pipelines)
+- [Monitoring and Observability](#monitoring-and-observability)
+- [Scaling and Performance](#scaling-and-performance)

-### Production-First Design

+---

-Always design with production in mind:
-- Scalability: Handle 10x current load
-- Reliability: 99.9% uptime target
-- Maintainability: Clear, documented code
-- Observability: Monitor everything

+## Model Export and Optimization

-### Performance by Design

+### PyTorch to ONNX Export

-Optimize from the start:
-- Efficient algorithms
-- Resource awareness
-- Strategic caching
-- Batch processing

+Basic export:
+```python
+import torch
+import torch.onnx

-### Security & Privacy

+def export_to_onnx(model, input_shape, output_path, dynamic_batch=True):
+    """
+    Export PyTorch model to ONNX format.

-Build security in:
-- Input validation
-- Data encryption
-- Access control
-- Audit logging

+    Args:
+        model: PyTorch model
+        input_shape: (C, H, W) input dimensions
+        output_path: Path to save .onnx file
+        dynamic_batch: Allow variable batch sizes
+    """
+    model.eval()

-## Advanced Patterns

+    # Create dummy input
+    dummy_input = torch.randn(1, *input_shape)

-### Pattern 1: Distributed Processing

+    # Dynamic axes for variable batch size
+    dynamic_axes = None
+    if dynamic_batch:
+        dynamic_axes = {
+            'input': {0: 'batch_size'},
+            'output': {0: 'batch_size'}
+        }

-Enterprise-scale data processing with fault tolerance.

+    # Export
+    torch.onnx.export(
+        model,
+        dummy_input,
+        output_path,
+        export_params=True,
+        opset_version=17,
+        do_constant_folding=True,
+        input_names=['input'],
+        output_names=['output'],
+        dynamic_axes=dynamic_axes
+    )

-### Pattern 2: Real-Time Systems

+    print(f"Exported to {output_path}")
+    return output_path
+```

-Low-latency, high-throughput systems.

+### ONNX Model Optimization

-### Pattern 3: ML at Scale

+Simplify and optimize ONNX graph:
+```python
+import onnx
+from onnxsim import simplify

-Production ML with monitoring and automation.

+def optimize_onnx(input_path, output_path):
+    """
+    Simplify ONNX model for faster inference. 
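+
+    A usage sketch (hypothetical filenames):
+        optimize_onnx('yolov8m.onnx', 'yolov8m_sim.onnx')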
+ """ + # Load model + model = onnx.load(input_path) -## Best Practices + # Check validity + onnx.checker.check_model(model) -### Code Quality -- Comprehensive testing -- Clear documentation -- Code reviews -- Type hints + # Simplify + model_simplified, check = simplify(model) -### Performance -- Profile before optimizing -- Monitor continuously -- Cache strategically -- Batch operations + if check: + onnx.save(model_simplified, output_path) + print(f"Simplified model saved to {output_path}") -### Reliability -- Design for failure -- Implement retries -- Use circuit breakers -- Monitor health + # Print size reduction + import os + original_size = os.path.getsize(input_path) / 1024 / 1024 + simplified_size = os.path.getsize(output_path) / 1024 / 1024 + print(f"Size: {original_size:.2f}MB -> {simplified_size:.2f}MB") + else: + print("Simplification failed, saving original") + onnx.save(model, output_path) -## Tools & Technologies + return output_path +``` -Essential tools for this domain: -- Development frameworks -- Testing libraries -- Deployment platforms -- Monitoring solutions +### Model Size Analysis -## Further Reading +```python +def analyze_model(model_path): + """ + Analyze ONNX model structure and size. + """ + model = onnx.load(model_path) -- Research papers -- Industry blogs -- Conference talks -- Open source projects + # Count parameters + total_params = 0 + param_sizes = {} + + for initializer in model.graph.initializer: + param_count = 1 + for dim in initializer.dims: + param_count *= dim + total_params += param_count + param_sizes[initializer.name] = param_count + + # Print summary + print(f"Total parameters: {total_params:,}") + print(f"Model size: {total_params * 4 / 1024 / 1024:.2f} MB (FP32)") + print(f"Model size: {total_params * 2 / 1024 / 1024:.2f} MB (FP16)") + print(f"Model size: {total_params / 1024 / 1024:.2f} MB (INT8)") + + # Top 10 largest layers + print("\nLargest layers:") + sorted_params = sorted(param_sizes.items(), key=lambda x: x[1], reverse=True) + for name, size in sorted_params[:10]: + print(f" {name}: {size:,} params") + + return total_params +``` + +--- + +## TensorRT Deployment + +### TensorRT Engine Build + +```python +import tensorrt as trt + +def build_tensorrt_engine(onnx_path, engine_path, precision='fp16', + max_batch_size=8, workspace_gb=4): + """ + Build TensorRT engine from ONNX model. 
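
    Note: this sketch targets the TensorRT 8.x Python API
    (set_memory_pool_limit, build_serialized_network); newer major
    releases rework parts of this interface.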
+ + Args: + onnx_path: Path to ONNX model + engine_path: Path to save TensorRT engine + precision: 'fp32', 'fp16', or 'int8' + max_batch_size: Maximum batch size + workspace_gb: GPU memory workspace in GB + """ + logger = trt.Logger(trt.Logger.WARNING) + builder = trt.Builder(logger) + network = builder.create_network( + 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH) + ) + parser = trt.OnnxParser(network, logger) + + # Parse ONNX + with open(onnx_path, 'rb') as f: + if not parser.parse(f.read()): + for error in range(parser.num_errors): + print(parser.get_error(error)) + raise RuntimeError("ONNX parsing failed") + + # Configure builder + config = builder.create_builder_config() + config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, + workspace_gb * 1024 * 1024 * 1024) + + # Set precision + if precision == 'fp16': + config.set_flag(trt.BuilderFlag.FP16) + elif precision == 'int8': + config.set_flag(trt.BuilderFlag.INT8) + # Requires calibrator for INT8 + + # Set optimization profile for dynamic shapes + profile = builder.create_optimization_profile() + input_name = network.get_input(0).name + input_shape = network.get_input(0).shape + + # Min, optimal, max batch sizes + min_shape = (1,) + tuple(input_shape[1:]) + opt_shape = (max_batch_size // 2,) + tuple(input_shape[1:]) + max_shape = (max_batch_size,) + tuple(input_shape[1:]) + + profile.set_shape(input_name, min_shape, opt_shape, max_shape) + config.add_optimization_profile(profile) + + # Build engine + serialized_engine = builder.build_serialized_network(network, config) + + # Save engine + with open(engine_path, 'wb') as f: + f.write(serialized_engine) + + print(f"TensorRT engine saved to {engine_path}") + return engine_path +``` + +### TensorRT Inference + +```python +import numpy as np +import pycuda.driver as cuda +import pycuda.autoinit + +class TensorRTInference: + def __init__(self, engine_path): + """ + Load TensorRT engine and prepare for inference. + """ + self.logger = trt.Logger(trt.Logger.WARNING) + + # Load engine + with open(engine_path, 'rb') as f: + engine_data = f.read() + + runtime = trt.Runtime(self.logger) + self.engine = runtime.deserialize_cuda_engine(engine_data) + self.context = self.engine.create_execution_context() + + # Allocate buffers + self.inputs = [] + self.outputs = [] + self.bindings = [] + self.stream = cuda.Stream() + + for i in range(self.engine.num_io_tensors): + name = self.engine.get_tensor_name(i) + dtype = trt.nptype(self.engine.get_tensor_dtype(name)) + shape = self.engine.get_tensor_shape(name) + size = trt.volume(shape) + + # Allocate host and device buffers + host_mem = cuda.pagelocked_empty(size, dtype) + device_mem = cuda.mem_alloc(host_mem.nbytes) + + self.bindings.append(int(device_mem)) + + if self.engine.get_tensor_mode(name) == trt.TensorIOMode.INPUT: + self.inputs.append({'host': host_mem, 'device': device_mem, + 'shape': shape, 'name': name}) + else: + self.outputs.append({'host': host_mem, 'device': device_mem, + 'shape': shape, 'name': name}) + + def infer(self, input_data): + """ + Run inference on input data. 
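        (Buffers were preallocated from the engine's I/O shapes, so
        input_data must match the engine's expected batch and resolution.)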
+ + Args: + input_data: numpy array (batch, C, H, W) + + Returns: + Output numpy array + """ + # Copy input to host buffer + np.copyto(self.inputs[0]['host'], input_data.ravel()) + + # Transfer input to device + cuda.memcpy_htod_async( + self.inputs[0]['device'], + self.inputs[0]['host'], + self.stream + ) + + # Run inference + self.context.execute_async_v2( + bindings=self.bindings, + stream_handle=self.stream.handle + ) + + # Transfer output from device + cuda.memcpy_dtoh_async( + self.outputs[0]['host'], + self.outputs[0]['device'], + self.stream + ) + + # Synchronize + self.stream.synchronize() + + # Reshape output + output = self.outputs[0]['host'].reshape(self.outputs[0]['shape']) + return output +``` + +### INT8 Calibration + +```python +class Int8Calibrator(trt.IInt8EntropyCalibrator2): + def __init__(self, calibration_data, cache_file, batch_size=8): + """ + INT8 calibrator for TensorRT. + + Args: + calibration_data: List of numpy arrays + cache_file: Path to save calibration cache + batch_size: Calibration batch size + """ + super().__init__() + self.calibration_data = calibration_data + self.cache_file = cache_file + self.batch_size = batch_size + self.current_index = 0 + + # Allocate device buffer + self.device_input = cuda.mem_alloc( + calibration_data[0].nbytes * batch_size + ) + + def get_batch_size(self): + return self.batch_size + + def get_batch(self, names): + if self.current_index + self.batch_size > len(self.calibration_data): + return None + + # Get batch + batch = self.calibration_data[ + self.current_index:self.current_index + self.batch_size + ] + batch = np.stack(batch, axis=0) + + # Copy to device + cuda.memcpy_htod(self.device_input, batch) + self.current_index += self.batch_size + + return [int(self.device_input)] + + def read_calibration_cache(self): + if os.path.exists(self.cache_file): + with open(self.cache_file, 'rb') as f: + return f.read() + return None + + def write_calibration_cache(self, cache): + with open(self.cache_file, 'wb') as f: + f.write(cache) +``` + +--- + +## ONNX Runtime Deployment + +### Basic ONNX Runtime Inference + +```python +import onnxruntime as ort + +class ONNXInference: + def __init__(self, model_path, device='cuda'): + """ + Initialize ONNX Runtime session. + + Args: + model_path: Path to ONNX model + device: 'cuda' or 'cpu' + """ + # Set execution providers + if device == 'cuda': + providers = [ + ('CUDAExecutionProvider', { + 'device_id': 0, + 'arena_extend_strategy': 'kNextPowerOfTwo', + 'gpu_mem_limit': 4 * 1024 * 1024 * 1024, # 4GB + 'cudnn_conv_algo_search': 'EXHAUSTIVE', + }), + 'CPUExecutionProvider' + ] + else: + providers = ['CPUExecutionProvider'] + + # Session options + sess_options = ort.SessionOptions() + sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL + sess_options.intra_op_num_threads = 4 + + # Create session + self.session = ort.InferenceSession( + model_path, + sess_options=sess_options, + providers=providers + ) + + # Get input/output info + self.input_name = self.session.get_inputs()[0].name + self.input_shape = self.session.get_inputs()[0].shape + self.output_name = self.session.get_outputs()[0].name + + print(f"Loaded model: {model_path}") + print(f"Input: {self.input_name} {self.input_shape}") + print(f"Provider: {self.session.get_providers()[0]}") + + def infer(self, input_data): + """ + Run inference. 
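        (Returns the raw network output; confidence filtering and NMS are
        applied downstream.)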
+ + Args: + input_data: numpy array (batch, C, H, W) + + Returns: + Model output + """ + outputs = self.session.run( + [self.output_name], + {self.input_name: input_data.astype(np.float32)} + ) + return outputs[0] + + def benchmark(self, input_shape, num_iterations=100, warmup=10): + """ + Benchmark inference speed. + """ + import time + + dummy_input = np.random.randn(*input_shape).astype(np.float32) + + # Warmup + for _ in range(warmup): + self.infer(dummy_input) + + # Benchmark + start = time.perf_counter() + for _ in range(num_iterations): + self.infer(dummy_input) + end = time.perf_counter() + + avg_time = (end - start) / num_iterations * 1000 + fps = 1000 / avg_time * input_shape[0] + + print(f"Average latency: {avg_time:.2f}ms") + print(f"Throughput: {fps:.1f} images/sec") + + return avg_time, fps +``` + +--- + +## Edge Device Deployment + +### NVIDIA Jetson Optimization + +```python +def optimize_for_jetson(model_path, output_path, jetson_model='orin'): + """ + Optimize model for NVIDIA Jetson deployment. + + Args: + model_path: Path to ONNX model + output_path: Path to save optimized engine + jetson_model: 'nano', 'xavier', 'orin' + """ + # Jetson-specific configurations + configs = { + 'nano': {'precision': 'fp16', 'workspace': 1, 'dla': False}, + 'xavier': {'precision': 'fp16', 'workspace': 2, 'dla': True}, + 'orin': {'precision': 'int8', 'workspace': 4, 'dla': True}, + } + + config = configs[jetson_model] + + # Build engine with Jetson-optimized settings + logger = trt.Logger(trt.Logger.WARNING) + builder = trt.Builder(logger) + network = builder.create_network( + 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH) + ) + parser = trt.OnnxParser(network, logger) + + with open(model_path, 'rb') as f: + parser.parse(f.read()) + + builder_config = builder.create_builder_config() + builder_config.set_memory_pool_limit( + trt.MemoryPoolType.WORKSPACE, + config['workspace'] * 1024 * 1024 * 1024 + ) + + if config['precision'] == 'fp16': + builder_config.set_flag(trt.BuilderFlag.FP16) + elif config['precision'] == 'int8': + builder_config.set_flag(trt.BuilderFlag.INT8) + + # Enable DLA if supported + if config['dla'] and builder.num_DLA_cores > 0: + builder_config.default_device_type = trt.DeviceType.DLA + builder_config.DLA_core = 0 + builder_config.set_flag(trt.BuilderFlag.GPU_FALLBACK) + + # Build and save + serialized = builder.build_serialized_network(network, builder_config) + with open(output_path, 'wb') as f: + f.write(serialized) + + print(f"Jetson-optimized engine saved to {output_path}") +``` + +### OpenVINO for Intel Devices + +```python +from openvino.runtime import Core + +class OpenVINOInference: + def __init__(self, model_path, device='CPU'): + """ + Initialize OpenVINO inference. + + Args: + model_path: Path to ONNX or OpenVINO IR model + device: 'CPU', 'GPU', 'MYRIAD' (Intel NCS) + """ + self.core = Core() + + # Load and compile model + self.model = self.core.read_model(model_path) + self.compiled = self.core.compile_model(self.model, device) + + # Get input/output info + self.input_layer = self.compiled.input(0) + self.output_layer = self.compiled.output(0) + + print(f"Loaded model on {device}") + print(f"Input shape: {self.input_layer.shape}") + + def infer(self, input_data): + """ + Run inference. + """ + result = self.compiled([input_data]) + return result[self.output_layer] + + def benchmark(self, input_shape, num_iterations=100): + """ + Benchmark inference speed. 
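        (Synchronous timing loop; OpenVINO's AsyncInferQueue can raise
        throughput beyond what this measures.)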
+
        """
        import time

        dummy = np.random.randn(*input_shape).astype(np.float32)

        # Warmup
        for _ in range(10):
            self.infer(dummy)

        # Benchmark
        start = time.perf_counter()
        for _ in range(num_iterations):
            self.infer(dummy)
        elapsed = time.perf_counter() - start

        latency = elapsed / num_iterations * 1000
        print(f"Latency: {latency:.2f}ms")
        return latency


def convert_to_openvino(onnx_path, output_dir, precision='FP16'):
    """
    Convert ONNX to OpenVINO IR format.
    """
    from openvino.tools import mo
    from openvino.runtime import serialize

    # mo.convert_model returns an ov.Model; serialize writes the IR (.xml/.bin)
    ov_model = mo.convert_model(onnx_path, compress_to_fp16=(precision == 'FP16'))
    serialize(ov_model, f"{output_dir}/model.xml")
    print(f"Converted to OpenVINO IR at {output_dir}")
```

### CoreML for Apple Silicon

```python
import torch
import coremltools as ct

def convert_to_coreml(model, output_path, compute_units='ALL'):
    """
    Convert a PyTorch model to CoreML for Apple devices.

    Args:
        model: PyTorch model (coremltools >= 5 converts TorchScript or
            TensorFlow models; ONNX files need the legacy onnx-coreml tool)
        output_path: Path to save .mlpackage
        compute_units: 'ALL', 'CPU_AND_GPU', 'CPU_AND_NE'
    """
    # Map compute units
    units_map = {
        'ALL': ct.ComputeUnit.ALL,
        'CPU_AND_GPU': ct.ComputeUnit.CPU_AND_GPU,
        'CPU_AND_NE': ct.ComputeUnit.CPU_AND_NE,  # Neural Engine
    }

    # Convert from PyTorch via TorchScript tracing
    traced = torch.jit.trace(model, torch.randn(1, 3, 640, 640))
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(shape=(1, 3, 640, 640))],
        compute_units=units_map[compute_units],
        minimum_deployment_target=ct.target.macOS13,  # or iOS16
    )

    mlmodel.save(output_path)
    print(f"CoreML model saved to {output_path}")
```

---

## Model Serving

### Triton Inference Server

Configuration file (`config.pbtxt`):
```protobuf
name: "yolov8"
platform: "onnxruntime_onnx"
max_batch_size: 8

input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]

output [
  {
    name: "output0"
    data_type: TYPE_FP32
    dims: [ 84, 8400 ]
  }
]

instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]

dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
```

Triton client:
```python
import tritonclient.http as httpclient

class TritonClient:
    def __init__(self, url='localhost:8000', model_name='yolov8'):
        self.client = httpclient.InferenceServerClient(url=url)
        self.model_name = model_name

        # Check model is ready
        if not self.client.is_model_ready(model_name):
            raise RuntimeError(f"Model {model_name} is not ready")

    def infer(self, images):
        """
        Send inference request to Triton. 
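        (HTTP client shown; tritonclient.grpc exposes a near-identical
        interface with lower per-request overhead.)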
+ + Args: + images: numpy array (batch, C, H, W) + """ + # Create input + inputs = [ + httpclient.InferInput("images", images.shape, "FP32") + ] + inputs[0].set_data_from_numpy(images) + + # Create output request + outputs = [ + httpclient.InferRequestedOutput("output0") + ] + + # Send request + response = self.client.infer( + model_name=self.model_name, + inputs=inputs, + outputs=outputs + ) + + return response.as_numpy("output0") +``` + +### TorchServe Deployment + +Model handler (`handler.py`): +```python +from ts.torch_handler.base_handler import BaseHandler +import torch +import cv2 +import numpy as np + +class YOLOHandler(BaseHandler): + def __init__(self): + super().__init__() + self.input_size = 640 + self.conf_threshold = 0.25 + self.iou_threshold = 0.45 + + def preprocess(self, data): + """Preprocess input images.""" + images = [] + for row in data: + image = row.get("data") or row.get("body") + + if isinstance(image, (bytes, bytearray)): + image = np.frombuffer(image, dtype=np.uint8) + image = cv2.imdecode(image, cv2.IMREAD_COLOR) + + # Resize and normalize + image = cv2.resize(image, (self.input_size, self.input_size)) + image = image.astype(np.float32) / 255.0 + image = np.transpose(image, (2, 0, 1)) + images.append(image) + + return torch.tensor(np.stack(images)) + + def inference(self, data): + """Run model inference.""" + with torch.no_grad(): + outputs = self.model(data) + return outputs + + def postprocess(self, outputs): + """Postprocess model outputs.""" + results = [] + for output in outputs: + # Apply NMS and format results + detections = self._nms(output, self.conf_threshold, self.iou_threshold) + results.append(detections.tolist()) + return results +``` + +TorchServe configuration (`config.properties`): +```properties +inference_address=http://0.0.0.0:8080 +management_address=http://0.0.0.0:8081 +metrics_address=http://0.0.0.0:8082 +number_of_netty_threads=4 +job_queue_size=100 +model_store=/opt/ml/model +load_models=yolov8.mar +``` + +### FastAPI Serving + +```python +from fastapi import FastAPI, File, UploadFile +from fastapi.responses import JSONResponse +import uvicorn +import numpy as np +import cv2 + +app = FastAPI(title="YOLO Detection API") + +# Global model +model = None + +@app.on_event("startup") +async def load_model(): + global model + model = ONNXInference("models/yolov8m.onnx", device='cuda') + +@app.post("/detect") +async def detect(file: UploadFile = File(...), conf: float = 0.25): + """ + Detect objects in uploaded image. 
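    conf is the confidence threshold forwarded to postprocessing (default 0.25).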
+ """ + # Read image + contents = await file.read() + nparr = np.frombuffer(contents, np.uint8) + image = cv2.imdecode(nparr, cv2.IMREAD_COLOR) + + # Preprocess + input_image = preprocess_image(image, 640) + + # Inference + outputs = model.infer(input_image) + + # Postprocess + detections = postprocess_detections(outputs, conf, 0.45) + + return JSONResponse({ + "detections": detections, + "image_size": list(image.shape[:2]) + }) + +@app.get("/health") +async def health(): + return {"status": "healthy", "model_loaded": model is not None} + +if __name__ == "__main__": + uvicorn.run(app, host="0.0.0.0", port=8000) +``` + +--- + +## Video Processing Pipelines + +### Real-Time Video Detection + +```python +import cv2 +import time +from collections import deque + +class VideoDetector: + def __init__(self, model, conf_threshold=0.25, track=True): + self.model = model + self.conf_threshold = conf_threshold + self.track = track + self.tracker = ByteTrack() if track else None + self.fps_buffer = deque(maxlen=30) + + def process_video(self, source, output_path=None, show=True): + """ + Process video stream with detection. + + Args: + source: Video file path, camera index, or RTSP URL + output_path: Path to save output video + show: Display results in window + """ + cap = cv2.VideoCapture(source) + + if output_path: + fourcc = cv2.VideoWriter_fourcc(*'mp4v') + fps = cap.get(cv2.CAP_PROP_FPS) + width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) + height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) + writer = cv2.VideoWriter(output_path, fourcc, fps, (width, height)) + + frame_count = 0 + start_time = time.time() + + while cap.isOpened(): + ret, frame = cap.read() + if not ret: + break + + # Inference + t0 = time.perf_counter() + detections = self._detect(frame) + + # Tracking + if self.track and len(detections) > 0: + detections = self.tracker.update(detections) + + # Calculate FPS + inference_time = time.perf_counter() - t0 + self.fps_buffer.append(1 / inference_time) + avg_fps = sum(self.fps_buffer) / len(self.fps_buffer) + + # Draw results + frame = self._draw_detections(frame, detections, avg_fps) + + # Output + if output_path: + writer.write(frame) + + if show: + cv2.imshow('Detection', frame) + if cv2.waitKey(1) == ord('q'): + break + + frame_count += 1 + + # Cleanup + cap.release() + if output_path: + writer.release() + cv2.destroyAllWindows() + + # Print statistics + total_time = time.time() - start_time + print(f"Processed {frame_count} frames in {total_time:.1f}s") + print(f"Average FPS: {frame_count / total_time:.1f}") + + def _detect(self, frame): + """Run detection on single frame.""" + # Preprocess + input_tensor = self._preprocess(frame) + + # Inference + outputs = self.model.infer(input_tensor) + + # Postprocess + detections = self._postprocess(outputs, frame.shape[:2]) + return detections + + def _preprocess(self, frame): + """Preprocess frame for model input.""" + # Resize + input_size = 640 + image = cv2.resize(frame, (input_size, input_size)) + + # Normalize and transpose + image = image.astype(np.float32) / 255.0 + image = np.transpose(image, (2, 0, 1)) + image = np.expand_dims(image, axis=0) + + return image + + def _draw_detections(self, frame, detections, fps): + """Draw detections on frame.""" + for det in detections: + x1, y1, x2, y2 = det['bbox'] + cls = det['class'] + conf = det['confidence'] + track_id = det.get('track_id', None) + + # Draw box + color = self._get_color(cls) + cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), color, 2) + + # Draw label + label = 
f"{cls}: {conf:.2f}" + if track_id: + label = f"ID:{track_id} {label}" + + cv2.putText(frame, label, (int(x1), int(y1) - 10), + cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2) + + # Draw FPS + cv2.putText(frame, f"FPS: {fps:.1f}", (10, 30), + cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2) + + return frame +``` + +### Batch Video Processing + +```python +import concurrent.futures +from pathlib import Path + +def process_videos_batch(video_paths, model, output_dir, max_workers=4): + """ + Process multiple videos in parallel. + """ + output_dir = Path(output_dir) + output_dir.mkdir(parents=True, exist_ok=True) + + def process_single(video_path): + detector = VideoDetector(model) + output_path = output_dir / f"{Path(video_path).stem}_detected.mp4" + detector.process_video(video_path, str(output_path), show=False) + return output_path + + with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor: + futures = {executor.submit(process_single, vp): vp for vp in video_paths} + + for future in concurrent.futures.as_completed(futures): + video_path = futures[future] + try: + output_path = future.result() + print(f"Completed: {video_path} -> {output_path}") + except Exception as e: + print(f"Failed: {video_path} - {e}") +``` + +--- + +## Monitoring and Observability + +### Prometheus Metrics + +```python +from prometheus_client import Counter, Histogram, Gauge, start_http_server + +# Define metrics +INFERENCE_COUNT = Counter( + 'model_inference_total', + 'Total number of inferences', + ['model_name', 'status'] +) + +INFERENCE_LATENCY = Histogram( + 'model_inference_latency_seconds', + 'Inference latency in seconds', + ['model_name'], + buckets=[0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0] +) + +GPU_MEMORY = Gauge( + 'gpu_memory_used_bytes', + 'GPU memory usage in bytes', + ['device'] +) + +DETECTIONS_COUNT = Counter( + 'detections_total', + 'Total detections by class', + ['model_name', 'class_name'] +) + +class MetricsWrapper: + def __init__(self, model, model_name='yolov8'): + self.model = model + self.model_name = model_name + + def infer(self, input_data): + """Inference with metrics.""" + start_time = time.perf_counter() + + try: + result = self.model.infer(input_data) + INFERENCE_COUNT.labels(self.model_name, 'success').inc() + + # Count detections by class + for det in result: + DETECTIONS_COUNT.labels(self.model_name, det['class']).inc() + + return result + + except Exception as e: + INFERENCE_COUNT.labels(self.model_name, 'error').inc() + raise + + finally: + latency = time.perf_counter() - start_time + INFERENCE_LATENCY.labels(self.model_name).observe(latency) + + # Update GPU memory + if torch.cuda.is_available(): + memory = torch.cuda.memory_allocated() + GPU_MEMORY.labels('cuda:0').set(memory) + +# Start metrics server +start_http_server(9090) +``` + +### Logging Configuration + +```python +import logging +import json +from datetime import datetime + +class StructuredLogger: + def __init__(self, name, level=logging.INFO): + self.logger = logging.getLogger(name) + self.logger.setLevel(level) + + # JSON formatter + handler = logging.StreamHandler() + handler.setFormatter(JsonFormatter()) + self.logger.addHandler(handler) + + def log_inference(self, model_name, latency, num_detections, input_shape): + self.logger.info(json.dumps({ + 'event': 'inference', + 'timestamp': datetime.utcnow().isoformat(), + 'model_name': model_name, + 'latency_ms': latency * 1000, + 'num_detections': num_detections, + 'input_shape': list(input_shape) + })) + + def log_error(self, model_name, error, 
input_shape):
        self.logger.error(json.dumps({
            'event': 'inference_error',
            'timestamp': datetime.utcnow().isoformat(),
            'model_name': model_name,
            'error': str(error),
            'error_type': type(error).__name__,
            'input_shape': list(input_shape)
        }))

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # Messages are already JSON strings; emit them unchanged
        return record.getMessage()
```

---

## Scaling and Performance

### Batch Processing Optimization

```python
import asyncio
import threading

import numpy as np

class BatchProcessor:
    def __init__(self, model, max_batch_size=8, max_wait_ms=100):
        self.model = model
        self.max_batch_size = max_batch_size
        # max_wait_ms is reserved for a timer-based flush of partial batches (not shown)
        self.max_wait_ms = max_wait_ms
        self.queue = []
        self.lock = threading.Lock()
        self.results = {}

    async def process(self, image, request_id):
        """Add image to batch and wait for result."""
        # Assumes all callers share one event loop, so set_result is safe here
        future = asyncio.Future()

        with self.lock:
            self.queue.append((request_id, image, future))

            if len(self.queue) >= self.max_batch_size:
                self._process_batch()

        # Wait for result with timeout
        result = await asyncio.wait_for(future, timeout=5.0)
        return result

    def _process_batch(self):
        """Process accumulated batch."""
        batch_items = self.queue[:self.max_batch_size]
        self.queue = self.queue[self.max_batch_size:]

        # Stack images
        images = np.stack([item[1] for item in batch_items])

        # Inference
        outputs = self.model.infer(images)

        # Return results
        for i, (request_id, image, future) in enumerate(batch_items):
            future.set_result(outputs[i])
```

### Multi-GPU Inference

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DataParallel

class MultiGPUInference:
    def __init__(self, model, device_ids=None):
        """
        Wrap model for multi-GPU inference.

        Args:
            model: PyTorch model
            device_ids: List of GPU IDs, e.g., [0, 1, 2, 3]
        """
        if device_ids is None:
            device_ids = list(range(torch.cuda.device_count()))

        self.device = torch.device('cuda:0')
        self.model = DataParallel(model, device_ids=device_ids)
        self.model.to(self.device)
        self.model.eval()

    def infer(self, images):
        """
        Run inference across GPUs.
        """
        with torch.no_grad():
            images = torch.from_numpy(images).to(self.device)
            outputs = self.model(images)
        return outputs.cpu().numpy()
```

### Performance Benchmarking

```python
def comprehensive_benchmark(model, input_sizes, batch_sizes, num_iterations=100):
    """
    Benchmark model across different configurations. 
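
    Usage sketch (any wrapper exposing .infer(array) works, e.g. ONNXInference):
        comprehensive_benchmark(model, input_sizes=[640, 1280],
                                batch_sizes=[1, 8])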
+ """ + results = [] + + for input_size in input_sizes: + for batch_size in batch_sizes: + # Create input + dummy = np.random.randn(batch_size, 3, input_size, input_size).astype(np.float32) + + # Warmup + for _ in range(10): + model.infer(dummy) + + # Benchmark + latencies = [] + for _ in range(num_iterations): + start = time.perf_counter() + model.infer(dummy) + latencies.append(time.perf_counter() - start) + + # Calculate statistics + latencies = np.array(latencies) * 1000 # Convert to ms + result = { + 'input_size': input_size, + 'batch_size': batch_size, + 'mean_latency_ms': np.mean(latencies), + 'std_latency_ms': np.std(latencies), + 'p50_latency_ms': np.percentile(latencies, 50), + 'p95_latency_ms': np.percentile(latencies, 95), + 'p99_latency_ms': np.percentile(latencies, 99), + 'throughput_fps': batch_size * 1000 / np.mean(latencies) + } + results.append(result) + + print(f"Size: {input_size}, Batch: {batch_size}") + print(f" Latency: {result['mean_latency_ms']:.2f}ms (p99: {result['p99_latency_ms']:.2f}ms)") + print(f" Throughput: {result['throughput_fps']:.1f} FPS") + + return results +``` + +--- + +## Resources + +- [TensorRT Documentation](https://docs.nvidia.com/deeplearning/tensorrt/) +- [ONNX Runtime Documentation](https://onnxruntime.ai/docs/) +- [Triton Inference Server](https://github.com/triton-inference-server/server) +- [OpenVINO Documentation](https://docs.openvino.ai/) +- [CoreML Tools](https://coremltools.readme.io/) diff --git a/engineering-team/senior-computer-vision/scripts/dataset_pipeline_builder.py b/engineering-team/senior-computer-vision/scripts/dataset_pipeline_builder.py index 490cfe4..8ae18a6 100755 --- a/engineering-team/senior-computer-vision/scripts/dataset_pipeline_builder.py +++ b/engineering-team/senior-computer-vision/scripts/dataset_pipeline_builder.py @@ -1,17 +1,37 @@ #!/usr/bin/env python3 """ -Dataset Pipeline Builder -Production-grade tool for senior computer vision engineer +Dataset Pipeline Builder for Computer Vision + +Production-grade tool for building and managing CV dataset pipelines. +Supports format conversion, splitting, augmentation config, and validation. 
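+Relies only on the Python standard library (no OpenCV or PIL required).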
+ +Supported formats: +- COCO (JSON annotations) +- YOLO (txt per image) +- Pascal VOC (XML annotations) +- CVAT (XML export) + +Usage: + python dataset_pipeline_builder.py analyze --input /path/to/dataset + python dataset_pipeline_builder.py convert --input /path/to/coco --output /path/to/yolo --format yolo + python dataset_pipeline_builder.py split --input /path/to/dataset --train 0.8 --val 0.1 --test 0.1 + python dataset_pipeline_builder.py augment-config --task detection --output augmentations.yaml + python dataset_pipeline_builder.py validate --input /path/to/dataset --format coco """ import os import sys import json +import random +import shutil import logging import argparse +import hashlib from pathlib import Path -from typing import Dict, List, Optional +from typing import Dict, List, Optional, Tuple, Set, Any from datetime import datetime +from collections import defaultdict +import xml.etree.ElementTree as ET logging.basicConfig( level=logging.INFO, @@ -19,82 +39,1661 @@ logging.basicConfig( ) logger = logging.getLogger(__name__) -class DatasetPipelineBuilder: - """Production-grade dataset pipeline builder""" - - def __init__(self, config: Dict): - self.config = config - self.results = { - 'status': 'initialized', - 'start_time': datetime.now().isoformat(), - 'processed_items': 0 + +# ============================================================================ +# Dataset Format Definitions +# ============================================================================ + +SUPPORTED_IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.webp'} + +COCO_CATEGORIES_TEMPLATE = { + "info": { + "description": "Custom Dataset", + "version": "1.0", + "year": datetime.now().year, + "contributor": "Dataset Pipeline Builder", + "date_created": datetime.now().isoformat() + }, + "licenses": [{"id": 1, "name": "Unknown", "url": ""}], + "images": [], + "annotations": [], + "categories": [] +} + +YOLO_DATA_YAML_TEMPLATE = """# YOLO Dataset Configuration +# Generated by Dataset Pipeline Builder + +path: {dataset_path} +train: {train_path} +val: {val_path} +test: {test_path} + +# Classes +nc: {num_classes} +names: {class_names} + +# Optional: Download script +# download: +""" + +AUGMENTATION_PRESETS = { + 'detection': { + 'light': { + 'horizontal_flip': 0.5, + 'vertical_flip': 0.0, + 'rotate': {'limit': 10, 'p': 0.3}, + 'brightness_contrast': {'brightness_limit': 0.1, 'contrast_limit': 0.1, 'p': 0.3}, + 'blur': {'blur_limit': 3, 'p': 0.1} + }, + 'medium': { + 'horizontal_flip': 0.5, + 'vertical_flip': 0.1, + 'rotate': {'limit': 15, 'p': 0.5}, + 'scale': {'scale_limit': 0.2, 'p': 0.5}, + 'brightness_contrast': {'brightness_limit': 0.2, 'contrast_limit': 0.2, 'p': 0.5}, + 'hue_saturation': {'hue_shift_limit': 10, 'sat_shift_limit': 20, 'p': 0.3}, + 'blur': {'blur_limit': 5, 'p': 0.2}, + 'noise': {'var_limit': (10, 50), 'p': 0.2} + }, + 'heavy': { + 'horizontal_flip': 0.5, + 'vertical_flip': 0.2, + 'rotate': {'limit': 30, 'p': 0.7}, + 'scale': {'scale_limit': 0.3, 'p': 0.6}, + 'brightness_contrast': {'brightness_limit': 0.3, 'contrast_limit': 0.3, 'p': 0.6}, + 'hue_saturation': {'hue_shift_limit': 20, 'sat_shift_limit': 30, 'p': 0.5}, + 'blur': {'blur_limit': 7, 'p': 0.3}, + 'noise': {'var_limit': (10, 80), 'p': 0.3}, + 'mosaic': {'p': 0.5}, + 'mixup': {'p': 0.3}, + 'cutout': {'num_holes': 8, 'max_h_size': 32, 'max_w_size': 32, 'p': 0.3} } - logger.info(f"Initialized {self.__class__.__name__}") - - def validate_config(self) -> bool: - """Validate configuration""" - logger.info("Validating 
configuration...") - # Add validation logic - logger.info("Configuration validated") - return True - - def process(self) -> Dict: - """Main processing logic""" - logger.info("Starting processing...") - + }, + 'segmentation': { + 'light': { + 'horizontal_flip': 0.5, + 'rotate': {'limit': 10, 'p': 0.3}, + 'elastic_transform': {'alpha': 50, 'sigma': 5, 'p': 0.1} + }, + 'medium': { + 'horizontal_flip': 0.5, + 'vertical_flip': 0.2, + 'rotate': {'limit': 20, 'p': 0.5}, + 'scale': {'scale_limit': 0.2, 'p': 0.4}, + 'elastic_transform': {'alpha': 100, 'sigma': 10, 'p': 0.3}, + 'grid_distortion': {'num_steps': 5, 'distort_limit': 0.3, 'p': 0.3} + }, + 'heavy': { + 'horizontal_flip': 0.5, + 'vertical_flip': 0.3, + 'rotate': {'limit': 45, 'p': 0.7}, + 'scale': {'scale_limit': 0.4, 'p': 0.6}, + 'elastic_transform': {'alpha': 200, 'sigma': 20, 'p': 0.5}, + 'grid_distortion': {'num_steps': 7, 'distort_limit': 0.5, 'p': 0.4}, + 'optical_distortion': {'distort_limit': 0.5, 'shift_limit': 0.5, 'p': 0.3} + } + }, + 'classification': { + 'light': { + 'horizontal_flip': 0.5, + 'rotate': {'limit': 15, 'p': 0.3}, + 'brightness_contrast': {'p': 0.3} + }, + 'medium': { + 'horizontal_flip': 0.5, + 'rotate': {'limit': 30, 'p': 0.5}, + 'color_jitter': {'brightness': 0.2, 'contrast': 0.2, 'saturation': 0.2, 'hue': 0.1, 'p': 0.5}, + 'random_crop': {'height': 224, 'width': 224, 'p': 0.5}, + 'cutout': {'num_holes': 1, 'max_h_size': 40, 'max_w_size': 40, 'p': 0.3} + }, + 'heavy': { + 'horizontal_flip': 0.5, + 'vertical_flip': 0.2, + 'rotate': {'limit': 45, 'p': 0.7}, + 'color_jitter': {'brightness': 0.4, 'contrast': 0.4, 'saturation': 0.4, 'hue': 0.2, 'p': 0.7}, + 'random_resized_crop': {'height': 224, 'width': 224, 'scale': (0.5, 1.0), 'p': 0.6}, + 'cutout': {'num_holes': 4, 'max_h_size': 60, 'max_w_size': 60, 'p': 0.5}, + 'auto_augment': {'policy': 'imagenet', 'p': 0.5}, + 'rand_augment': {'num_ops': 2, 'magnitude': 9, 'p': 0.5} + } + } +} + + +# ============================================================================ +# Dataset Analysis +# ============================================================================ + +class DatasetAnalyzer: + """Analyze dataset structure and statistics.""" + + def __init__(self, dataset_path: str): + self.dataset_path = Path(dataset_path) + self.stats = {} + + def analyze(self) -> Dict[str, Any]: + """Run full dataset analysis.""" + logger.info(f"Analyzing dataset at: {self.dataset_path}") + + # Detect format + detected_format = self._detect_format() + self.stats['format'] = detected_format + + # Count images + images = self._find_images() + self.stats['total_images'] = len(images) + + # Analyze images + self.stats['image_stats'] = self._analyze_images(images) + + # Analyze annotations based on format + if detected_format == 'coco': + self.stats['annotations'] = self._analyze_coco() + elif detected_format == 'yolo': + self.stats['annotations'] = self._analyze_yolo() + elif detected_format == 'voc': + self.stats['annotations'] = self._analyze_voc() + else: + self.stats['annotations'] = {'error': 'Unknown format'} + + # Dataset quality checks + self.stats['quality'] = self._quality_checks() + + return self.stats + + def _detect_format(self) -> str: + """Auto-detect dataset format.""" + # Check for COCO JSON + for json_file in self.dataset_path.rglob('*.json'): + try: + with open(json_file) as f: + data = json.load(f) + if 'annotations' in data and 'images' in data: + return 'coco' + except: + pass + + # Check for YOLO txt files + txt_files = list(self.dataset_path.rglob('*.txt')) + 
if txt_files: + # Check if txt contains YOLO format (class x_center y_center width height) + for txt_file in txt_files[:5]: + if txt_file.name == 'classes.txt': + continue + try: + with open(txt_file) as f: + line = f.readline().strip() + if line: + parts = line.split() + if len(parts) == 5 and all(self._is_float(p) for p in parts): + return 'yolo' + except: + pass + + # Check for VOC XML + xml_files = list(self.dataset_path.rglob('*.xml')) + for xml_file in xml_files[:5]: + try: + tree = ET.parse(xml_file) + root = tree.getroot() + if root.tag == 'annotation' and root.find('object') is not None: + return 'voc' + except: + pass + + return 'unknown' + + def _is_float(self, s: str) -> bool: + """Check if string is a float.""" try: - self.validate_config() - - # Main processing - result = self._execute() - - self.results['status'] = 'completed' - self.results['end_time'] = datetime.now().isoformat() - - logger.info("Processing completed successfully") - return self.results - - except Exception as e: - self.results['status'] = 'failed' - self.results['error'] = str(e) - logger.error(f"Processing failed: {e}") - raise - - def _execute(self) -> Dict: - """Execute main logic""" - # Implementation here - return {'success': True} + float(s) + return True + except ValueError: + return False + + def _find_images(self) -> List[Path]: + """Find all images in dataset.""" + images = [] + for ext in SUPPORTED_IMAGE_EXTENSIONS: + images.extend(self.dataset_path.rglob(f'*{ext}')) + images.extend(self.dataset_path.rglob(f'*{ext.upper()}')) + return images + + def _analyze_images(self, images: List[Path]) -> Dict: + """Analyze image files without loading them.""" + stats = { + 'count': len(images), + 'extensions': defaultdict(int), + 'sizes': [], + 'locations': defaultdict(int) + } + + for img in images: + stats['extensions'][img.suffix.lower()] += 1 + stats['sizes'].append(img.stat().st_size) + # Track which subdirectory + rel_path = img.relative_to(self.dataset_path) + if len(rel_path.parts) > 1: + stats['locations'][rel_path.parts[0]] += 1 + else: + stats['locations']['root'] += 1 + + if stats['sizes']: + stats['total_size_mb'] = sum(stats['sizes']) / (1024 * 1024) + stats['avg_size_kb'] = (sum(stats['sizes']) / len(stats['sizes'])) / 1024 + stats['min_size_kb'] = min(stats['sizes']) / 1024 + stats['max_size_kb'] = max(stats['sizes']) / 1024 + + stats['extensions'] = dict(stats['extensions']) + stats['locations'] = dict(stats['locations']) + del stats['sizes'] # Don't include raw sizes + + return stats + + def _analyze_coco(self) -> Dict: + """Analyze COCO format annotations.""" + stats = { + 'total_annotations': 0, + 'classes': {}, + 'images_with_annotations': 0, + 'annotations_per_image': {}, + 'bbox_stats': {} + } + + # Find COCO JSON files + for json_file in self.dataset_path.rglob('*.json'): + try: + with open(json_file) as f: + data = json.load(f) + + if 'annotations' not in data: + continue + + # Build category mapping + cat_map = {} + if 'categories' in data: + for cat in data['categories']: + cat_map[cat['id']] = cat['name'] + + # Count annotations per class + img_annotations = defaultdict(int) + bbox_widths = [] + bbox_heights = [] + bbox_areas = [] + + for ann in data['annotations']: + stats['total_annotations'] += 1 + cat_id = ann.get('category_id') + cat_name = cat_map.get(cat_id, f'class_{cat_id}') + stats['classes'][cat_name] = stats['classes'].get(cat_name, 0) + 1 + img_annotations[ann.get('image_id')] += 1 + + # Bbox stats + if 'bbox' in ann: + bbox = ann['bbox'] # [x, y, width, height] + 
if len(bbox) == 4: + bbox_widths.append(bbox[2]) + bbox_heights.append(bbox[3]) + bbox_areas.append(bbox[2] * bbox[3]) + + stats['images_with_annotations'] = len(img_annotations) + if img_annotations: + counts = list(img_annotations.values()) + stats['annotations_per_image'] = { + 'min': min(counts), + 'max': max(counts), + 'avg': sum(counts) / len(counts) + } + + if bbox_areas: + stats['bbox_stats'] = { + 'avg_width': sum(bbox_widths) / len(bbox_widths), + 'avg_height': sum(bbox_heights) / len(bbox_heights), + 'avg_area': sum(bbox_areas) / len(bbox_areas), + 'min_area': min(bbox_areas), + 'max_area': max(bbox_areas) + } + + except Exception as e: + logger.warning(f"Error parsing {json_file}: {e}") + + return stats + + def _analyze_yolo(self) -> Dict: + """Analyze YOLO format annotations.""" + stats = { + 'total_annotations': 0, + 'classes': defaultdict(int), + 'images_with_annotations': 0, + 'bbox_stats': {} + } + + # Find classes.txt if exists + class_names = {} + classes_file = self.dataset_path / 'classes.txt' + if classes_file.exists(): + with open(classes_file) as f: + for i, line in enumerate(f): + class_names[i] = line.strip() + + bbox_widths = [] + bbox_heights = [] + + for txt_file in self.dataset_path.rglob('*.txt'): + if txt_file.name == 'classes.txt': + continue + + try: + with open(txt_file) as f: + lines = f.readlines() + + if lines: + stats['images_with_annotations'] += 1 + + for line in lines: + parts = line.strip().split() + if len(parts) >= 5: + stats['total_annotations'] += 1 + class_id = int(parts[0]) + class_name = class_names.get(class_id, f'class_{class_id}') + stats['classes'][class_name] += 1 + + # Bbox stats (normalized coords) + w = float(parts[3]) + h = float(parts[4]) + bbox_widths.append(w) + bbox_heights.append(h) + + except Exception as e: + logger.warning(f"Error parsing {txt_file}: {e}") + + stats['classes'] = dict(stats['classes']) + + if bbox_widths: + stats['bbox_stats'] = { + 'avg_width_normalized': sum(bbox_widths) / len(bbox_widths), + 'avg_height_normalized': sum(bbox_heights) / len(bbox_heights), + 'min_width_normalized': min(bbox_widths), + 'max_width_normalized': max(bbox_widths) + } + + return stats + + def _analyze_voc(self) -> Dict: + """Analyze Pascal VOC format annotations.""" + stats = { + 'total_annotations': 0, + 'classes': defaultdict(int), + 'images_with_annotations': 0, + 'difficulties': {'easy': 0, 'difficult': 0} + } + + for xml_file in self.dataset_path.rglob('*.xml'): + try: + tree = ET.parse(xml_file) + root = tree.getroot() + + if root.tag != 'annotation': + continue + + objects = root.findall('object') + if objects: + stats['images_with_annotations'] += 1 + + for obj in objects: + stats['total_annotations'] += 1 + name = obj.find('name') + if name is not None: + stats['classes'][name.text] += 1 + + difficult = obj.find('difficult') + if difficult is not None and difficult.text == '1': + stats['difficulties']['difficult'] += 1 + else: + stats['difficulties']['easy'] += 1 + + except Exception as e: + logger.warning(f"Error parsing {xml_file}: {e}") + + stats['classes'] = dict(stats['classes']) + return stats + + def _quality_checks(self) -> Dict: + """Run quality checks on dataset.""" + checks = { + 'issues': [], + 'warnings': [], + 'recommendations': [] + } + + # Check class imbalance + if 'annotations' in self.stats and 'classes' in self.stats['annotations']: + classes = self.stats['annotations']['classes'] + if classes: + counts = list(classes.values()) + max_count = max(counts) + min_count = min(counts) + + if max_count > 0 
and min_count / max_count < 0.1: + checks['warnings'].append( + f"Severe class imbalance detected: ratio {min_count/max_count:.2%}" + ) + checks['recommendations'].append( + "Consider oversampling minority classes or using focal loss" + ) + elif max_count > 0 and min_count / max_count < 0.3: + checks['warnings'].append( + f"Moderate class imbalance: ratio {min_count/max_count:.2%}" + ) + + # Check image count + if self.stats.get('total_images', 0) < 100: + checks['warnings'].append( + f"Small dataset: only {self.stats.get('total_images', 0)} images" + ) + checks['recommendations'].append( + "Consider data augmentation or transfer learning" + ) + + # Check for missing annotations + if 'annotations' in self.stats: + ann_stats = self.stats['annotations'] + total_images = self.stats.get('total_images', 0) + images_with_ann = ann_stats.get('images_with_annotations', 0) + + if total_images > 0 and images_with_ann < total_images: + missing = total_images - images_with_ann + checks['warnings'].append( + f"{missing} images have no annotations" + ) + + return checks + + +# ============================================================================ +# Format Conversion +# ============================================================================ + +class FormatConverter: + """Convert between dataset formats.""" + + def __init__(self, input_path: str, output_path: str): + self.input_path = Path(input_path) + self.output_path = Path(output_path) + + def convert(self, target_format: str, source_format: str = None) -> Dict: + """Convert dataset to target format.""" + # Auto-detect source format if not specified + if source_format is None: + analyzer = DatasetAnalyzer(str(self.input_path)) + analyzer.analyze() + source_format = analyzer.stats.get('format', 'unknown') + + logger.info(f"Converting from {source_format} to {target_format}") + + conversion_key = f"{source_format}_to_{target_format}" + + converters = { + 'coco_to_yolo': self._coco_to_yolo, + 'yolo_to_coco': self._yolo_to_coco, + 'voc_to_coco': self._voc_to_coco, + 'voc_to_yolo': self._voc_to_yolo, + 'coco_to_voc': self._coco_to_voc, + } + + if conversion_key not in converters: + return {'error': f"Unsupported conversion: {source_format} -> {target_format}"} + + return converters[conversion_key]() + + def _coco_to_yolo(self) -> Dict: + """Convert COCO format to YOLO format.""" + results = {'converted_images': 0, 'converted_annotations': 0} + + # Find COCO JSON + coco_files = list(self.input_path.rglob('*.json')) + + for coco_file in coco_files: + try: + with open(coco_file) as f: + coco_data = json.load(f) + + if 'annotations' not in coco_data: + continue + + # Create output directories + self.output_path.mkdir(parents=True, exist_ok=True) + labels_dir = self.output_path / 'labels' + labels_dir.mkdir(exist_ok=True) + + # Build category and image mappings + cat_map = {} + for i, cat in enumerate(coco_data.get('categories', [])): + cat_map[cat['id']] = i + + img_map = {} + for img in coco_data.get('images', []): + img_map[img['id']] = { + 'file_name': img['file_name'], + 'width': img['width'], + 'height': img['height'] + } + + # Group annotations by image + annotations_by_image = defaultdict(list) + for ann in coco_data['annotations']: + annotations_by_image[ann['image_id']].append(ann) + + # Write YOLO format labels + for img_id, annotations in annotations_by_image.items(): + if img_id not in img_map: + continue + + img_info = img_map[img_id] + label_name = Path(img_info['file_name']).stem + '.txt' + label_path = labels_dir / label_name + + 
with open(label_path, 'w') as f: + for ann in annotations: + if 'bbox' not in ann: + continue + + bbox = ann['bbox'] # [x, y, width, height] + cat_id = cat_map.get(ann['category_id'], 0) + + # Convert to YOLO format (normalized x_center, y_center, width, height) + x_center = (bbox[0] + bbox[2] / 2) / img_info['width'] + y_center = (bbox[1] + bbox[3] / 2) / img_info['height'] + w = bbox[2] / img_info['width'] + h = bbox[3] / img_info['height'] + + f.write(f"{cat_id} {x_center:.6f} {y_center:.6f} {w:.6f} {h:.6f}\n") + results['converted_annotations'] += 1 + + results['converted_images'] += 1 + + # Write classes.txt + classes = [None] * len(cat_map) + for cat in coco_data.get('categories', []): + idx = cat_map[cat['id']] + classes[idx] = cat['name'] + + with open(self.output_path / 'classes.txt', 'w') as f: + for class_name in classes: + f.write(f"{class_name}\n") + + # Write data.yaml for YOLO training + yaml_content = YOLO_DATA_YAML_TEMPLATE.format( + dataset_path=str(self.output_path.absolute()), + train_path='images/train', + val_path='images/val', + test_path='images/test', + num_classes=len(classes), + class_names=classes + ) + with open(self.output_path / 'data.yaml', 'w') as f: + f.write(yaml_content) + + except Exception as e: + logger.error(f"Error converting {coco_file}: {e}") + + return results + + def _yolo_to_coco(self) -> Dict: + """Convert YOLO format to COCO format.""" + results = {'converted_images': 0, 'converted_annotations': 0} + + coco_data = COCO_CATEGORIES_TEMPLATE.copy() + coco_data['images'] = [] + coco_data['annotations'] = [] + coco_data['categories'] = [] + + # Read classes + classes_file = self.input_path / 'classes.txt' + class_names = [] + if classes_file.exists(): + with open(classes_file) as f: + class_names = [line.strip() for line in f.readlines()] + + for i, name in enumerate(class_names): + coco_data['categories'].append({ + 'id': i, + 'name': name, + 'supercategory': 'object' + }) + + # Find images and labels + images = [] + for ext in SUPPORTED_IMAGE_EXTENSIONS: + images.extend(self.input_path.rglob(f'*{ext}')) + + annotation_id = 1 + for img_id, img_path in enumerate(images, 1): + # Try to get image dimensions (without PIL) + # Assume 640x640 if can't determine + width, height = 640, 640 + + coco_data['images'].append({ + 'id': img_id, + 'file_name': img_path.name, + 'width': width, + 'height': height + }) + results['converted_images'] += 1 + + # Find corresponding label + label_path = img_path.with_suffix('.txt') + if not label_path.exists(): + # Try labels subdirectory + label_path = img_path.parent.parent / 'labels' / (img_path.stem + '.txt') + + if label_path.exists(): + with open(label_path) as f: + for line in f: + parts = line.strip().split() + if len(parts) >= 5: + class_id = int(parts[0]) + x_center = float(parts[1]) * width + y_center = float(parts[2]) * height + w = float(parts[3]) * width + h = float(parts[4]) * height + + # Convert to COCO format [x, y, width, height] + x = x_center - w / 2 + y = y_center - h / 2 + + coco_data['annotations'].append({ + 'id': annotation_id, + 'image_id': img_id, + 'category_id': class_id, + 'bbox': [x, y, w, h], + 'area': w * h, + 'iscrowd': 0 + }) + annotation_id += 1 + results['converted_annotations'] += 1 + + # Write COCO JSON + self.output_path.mkdir(parents=True, exist_ok=True) + with open(self.output_path / 'annotations.json', 'w') as f: + json.dump(coco_data, f, indent=2) + + return results + + def _voc_to_coco(self) -> Dict: + """Convert Pascal VOC format to COCO format.""" + results = 
{'converted_images': 0, 'converted_annotations': 0} + + coco_data = COCO_CATEGORIES_TEMPLATE.copy() + coco_data['images'] = [] + coco_data['annotations'] = [] + coco_data['categories'] = [] + + class_to_id = {} + annotation_id = 1 + + for img_id, xml_file in enumerate(self.input_path.rglob('*.xml'), 1): + try: + tree = ET.parse(xml_file) + root = tree.getroot() + + if root.tag != 'annotation': + continue + + # Get image info + filename = root.find('filename') + size = root.find('size') + + if filename is None or size is None: + continue + + width = int(size.find('width').text) + height = int(size.find('height').text) + + coco_data['images'].append({ + 'id': img_id, + 'file_name': filename.text, + 'width': width, + 'height': height + }) + results['converted_images'] += 1 + + # Convert objects + for obj in root.findall('object'): + name = obj.find('name').text + + if name not in class_to_id: + class_to_id[name] = len(class_to_id) + coco_data['categories'].append({ + 'id': class_to_id[name], + 'name': name, + 'supercategory': 'object' + }) + + bndbox = obj.find('bndbox') + xmin = float(bndbox.find('xmin').text) + ymin = float(bndbox.find('ymin').text) + xmax = float(bndbox.find('xmax').text) + ymax = float(bndbox.find('ymax').text) + + coco_data['annotations'].append({ + 'id': annotation_id, + 'image_id': img_id, + 'category_id': class_to_id[name], + 'bbox': [xmin, ymin, xmax - xmin, ymax - ymin], + 'area': (xmax - xmin) * (ymax - ymin), + 'iscrowd': 0 + }) + annotation_id += 1 + results['converted_annotations'] += 1 + + except Exception as e: + logger.warning(f"Error parsing {xml_file}: {e}") + + # Write output + self.output_path.mkdir(parents=True, exist_ok=True) + with open(self.output_path / 'annotations.json', 'w') as f: + json.dump(coco_data, f, indent=2) + + return results + + def _voc_to_yolo(self) -> Dict: + """Convert Pascal VOC format to YOLO format.""" + # First convert to COCO, then to YOLO + temp_coco = self.output_path / '_temp_coco' + + converter1 = FormatConverter(str(self.input_path), str(temp_coco)) + converter1._voc_to_coco() + + converter2 = FormatConverter(str(temp_coco), str(self.output_path)) + results = converter2._coco_to_yolo() + + # Clean up temp + shutil.rmtree(temp_coco, ignore_errors=True) + + return results + + def _coco_to_voc(self) -> Dict: + """Convert COCO format to Pascal VOC format.""" + results = {'converted_images': 0, 'converted_annotations': 0} + + self.output_path.mkdir(parents=True, exist_ok=True) + annotations_dir = self.output_path / 'Annotations' + annotations_dir.mkdir(exist_ok=True) + + for coco_file in self.input_path.rglob('*.json'): + try: + with open(coco_file) as f: + coco_data = json.load(f) + + if 'annotations' not in coco_data: + continue + + # Build mappings + cat_map = {cat['id']: cat['name'] for cat in coco_data.get('categories', [])} + img_map = {img['id']: img for img in coco_data.get('images', [])} + + # Group by image + ann_by_image = defaultdict(list) + for ann in coco_data['annotations']: + ann_by_image[ann['image_id']].append(ann) + + for img_id, annotations in ann_by_image.items(): + if img_id not in img_map: + continue + + img_info = img_map[img_id] + + # Create VOC XML + annotation = ET.Element('annotation') + + ET.SubElement(annotation, 'folder').text = 'images' + ET.SubElement(annotation, 'filename').text = img_info['file_name'] + + size = ET.SubElement(annotation, 'size') + ET.SubElement(size, 'width').text = str(img_info['width']) + ET.SubElement(size, 'height').text = str(img_info['height']) + ET.SubElement(size, 
'depth').text = '3' + + for ann in annotations: + obj = ET.SubElement(annotation, 'object') + ET.SubElement(obj, 'name').text = cat_map.get(ann['category_id'], 'unknown') + ET.SubElement(obj, 'difficult').text = '0' + + bbox = ann['bbox'] + bndbox = ET.SubElement(obj, 'bndbox') + ET.SubElement(bndbox, 'xmin').text = str(int(bbox[0])) + ET.SubElement(bndbox, 'ymin').text = str(int(bbox[1])) + ET.SubElement(bndbox, 'xmax').text = str(int(bbox[0] + bbox[2])) + ET.SubElement(bndbox, 'ymax').text = str(int(bbox[1] + bbox[3])) + + results['converted_annotations'] += 1 + + # Write XML + xml_name = Path(img_info['file_name']).stem + '.xml' + tree = ET.ElementTree(annotation) + tree.write(annotations_dir / xml_name) + results['converted_images'] += 1 + + except Exception as e: + logger.error(f"Error converting {coco_file}: {e}") + + return results + + +# ============================================================================ +# Dataset Splitting +# ============================================================================ + +class DatasetSplitter: + """Split dataset into train/val/test sets.""" + + def __init__(self, dataset_path: str, output_path: str = None): + self.dataset_path = Path(dataset_path) + self.output_path = Path(output_path) if output_path else self.dataset_path + + def split(self, train: float = 0.8, val: float = 0.1, test: float = 0.1, + stratify: bool = True, seed: int = 42) -> Dict: + """Split dataset with optional stratification.""" + + if abs(train + val + test - 1.0) > 0.001: + raise ValueError(f"Split ratios must sum to 1.0, got {train + val + test}") + + random.seed(seed) + logger.info(f"Splitting dataset: train={train}, val={val}, test={test}") + + # Detect format and find images + analyzer = DatasetAnalyzer(str(self.dataset_path)) + analyzer.analyze() + detected_format = analyzer.stats.get('format', 'unknown') + + images = [] + for ext in SUPPORTED_IMAGE_EXTENSIONS: + images.extend(self.dataset_path.rglob(f'*{ext}')) + + if not images: + return {'error': 'No images found'} + + # Stratify if requested and we have class info + if stratify and detected_format in ['coco', 'yolo']: + splits = self._stratified_split(images, detected_format, train, val, test) + else: + splits = self._random_split(images, train, val, test) + + # Create output directories and copy/link files + results = self._create_split_directories(splits, detected_format) + + return results + + def _random_split(self, images: List[Path], train: float, val: float, test: float) -> Dict: + """Perform random split.""" + images = list(images) + random.shuffle(images) + + n = len(images) + train_end = int(n * train) + val_end = train_end + int(n * val) + + return { + 'train': images[:train_end], + 'val': images[train_end:val_end], + 'test': images[val_end:] + } + + def _stratified_split(self, images: List[Path], format: str, + train: float, val: float, test: float) -> Dict: + """Perform stratified split based on class distribution.""" + + # Group images by their primary class + image_classes = {} + + for img in images: + if format == 'yolo': + label_path = img.with_suffix('.txt') + if not label_path.exists(): + label_path = img.parent.parent / 'labels' / (img.stem + '.txt') + + if label_path.exists(): + with open(label_path) as f: + line = f.readline() + if line: + class_id = int(line.split()[0]) + image_classes[img] = class_id + else: + image_classes[img] = -1 # No annotation + else: + image_classes[img] = -1 # Default for other formats + + # Group by class + class_images = defaultdict(list) + for img, 
class_id in image_classes.items(): + class_images[class_id].append(img) + + # Split each class proportionally + splits = {'train': [], 'val': [], 'test': []} + + for class_id, class_imgs in class_images.items(): + random.shuffle(class_imgs) + n = len(class_imgs) + train_end = int(n * train) + val_end = train_end + int(n * val) + + splits['train'].extend(class_imgs[:train_end]) + splits['val'].extend(class_imgs[train_end:val_end]) + splits['test'].extend(class_imgs[val_end:]) + + # Shuffle final splits + for key in splits: + random.shuffle(splits[key]) + + return splits + + def _create_split_directories(self, splits: Dict, format: str) -> Dict: + """Create split directories and organize files.""" + results = { + 'train_count': len(splits['train']), + 'val_count': len(splits['val']), + 'test_count': len(splits['test']), + 'output_path': str(self.output_path) + } + + # Create directory structure + for split_name in ['train', 'val', 'test']: + images_dir = self.output_path / 'images' / split_name + labels_dir = self.output_path / 'labels' / split_name + images_dir.mkdir(parents=True, exist_ok=True) + labels_dir.mkdir(parents=True, exist_ok=True) + + for img_path in splits[split_name]: + # Create symlink for image + dst_img = images_dir / img_path.name + if not dst_img.exists(): + try: + dst_img.symlink_to(img_path.absolute()) + except OSError: + # Fall back to copy if symlink fails + shutil.copy2(img_path, dst_img) + + # Handle label file + if format == 'yolo': + label_path = img_path.with_suffix('.txt') + if not label_path.exists(): + label_path = img_path.parent.parent / 'labels' / (img_path.stem + '.txt') + + if label_path.exists(): + dst_label = labels_dir / (img_path.stem + '.txt') + if not dst_label.exists(): + try: + dst_label.symlink_to(label_path.absolute()) + except OSError: + shutil.copy2(label_path, dst_label) + + # Generate data.yaml for YOLO + if format == 'yolo': + # Read classes + classes_file = self.dataset_path / 'classes.txt' + class_names = [] + if classes_file.exists(): + with open(classes_file) as f: + class_names = [line.strip() for line in f.readlines()] + + yaml_content = YOLO_DATA_YAML_TEMPLATE.format( + dataset_path=str(self.output_path.absolute()), + train_path='images/train', + val_path='images/val', + test_path='images/test', + num_classes=len(class_names), + class_names=class_names + ) + with open(self.output_path / 'data.yaml', 'w') as f: + f.write(yaml_content) + + return results + + +# ============================================================================ +# Augmentation Configuration +# ============================================================================ + +class AugmentationConfigGenerator: + """Generate augmentation configurations for different CV tasks.""" + + @staticmethod + def generate(task: str, intensity: str = 'medium', + framework: str = 'albumentations') -> Dict: + """Generate augmentation config for task and intensity.""" + + if task not in AUGMENTATION_PRESETS: + return {'error': f"Unknown task: {task}. Use: detection, segmentation, classification"} + + if intensity not in AUGMENTATION_PRESETS[task]: + return {'error': f"Unknown intensity: {intensity}. 
Use: light, medium, heavy"} + + base_config = AUGMENTATION_PRESETS[task][intensity] + + if framework == 'albumentations': + return AugmentationConfigGenerator._to_albumentations(base_config, task) + elif framework == 'torchvision': + return AugmentationConfigGenerator._to_torchvision(base_config, task) + elif framework == 'ultralytics': + return AugmentationConfigGenerator._to_ultralytics(base_config, task) + else: + return base_config + + @staticmethod + def _to_albumentations(config: Dict, task: str) -> Dict: + """Convert to Albumentations format.""" + transforms = [] + + for aug_name, params in config.items(): + if aug_name == 'horizontal_flip': + transforms.append({ + 'type': 'HorizontalFlip', + 'p': params + }) + elif aug_name == 'vertical_flip': + transforms.append({ + 'type': 'VerticalFlip', + 'p': params + }) + elif aug_name == 'rotate': + transforms.append({ + 'type': 'Rotate', + 'limit': params.get('limit', 15), + 'p': params.get('p', 0.5) + }) + elif aug_name == 'scale': + transforms.append({ + 'type': 'RandomScale', + 'scale_limit': params.get('scale_limit', 0.2), + 'p': params.get('p', 0.5) + }) + elif aug_name == 'brightness_contrast': + transforms.append({ + 'type': 'RandomBrightnessContrast', + 'brightness_limit': params.get('brightness_limit', 0.2), + 'contrast_limit': params.get('contrast_limit', 0.2), + 'p': params.get('p', 0.5) + }) + elif aug_name == 'hue_saturation': + transforms.append({ + 'type': 'HueSaturationValue', + 'hue_shift_limit': params.get('hue_shift_limit', 20), + 'sat_shift_limit': params.get('sat_shift_limit', 30), + 'p': params.get('p', 0.5) + }) + elif aug_name == 'blur': + transforms.append({ + 'type': 'Blur', + 'blur_limit': params.get('blur_limit', 5), + 'p': params.get('p', 0.3) + }) + elif aug_name == 'noise': + transforms.append({ + 'type': 'GaussNoise', + 'var_limit': params.get('var_limit', (10, 50)), + 'p': params.get('p', 0.3) + }) + elif aug_name == 'elastic_transform': + transforms.append({ + 'type': 'ElasticTransform', + 'alpha': params.get('alpha', 100), + 'sigma': params.get('sigma', 10), + 'p': params.get('p', 0.3) + }) + elif aug_name == 'cutout': + transforms.append({ + 'type': 'CoarseDropout', + 'max_holes': params.get('num_holes', 8), + 'max_height': params.get('max_h_size', 32), + 'max_width': params.get('max_w_size', 32), + 'p': params.get('p', 0.3) + }) + + # Add bbox format for detection + bbox_params = None + if task == 'detection': + bbox_params = { + 'format': 'pascal_voc', + 'label_fields': ['class_labels'], + 'min_visibility': 0.3 + } + + return { + 'framework': 'albumentations', + 'task': task, + 'transforms': transforms, + 'bbox_params': bbox_params, + 'code_example': AugmentationConfigGenerator._albumentations_code(transforms, task) + } + + @staticmethod + def _albumentations_code(transforms: List, task: str) -> str: + """Generate Albumentations code example.""" + code = """import albumentations as A +from albumentations.pytorch import ToTensorV2 + +transform = A.Compose([ +""" + for t in transforms: + params = ', '.join(f"{k}={v}" for k, v in t.items() if k != 'type') + code += f" A.{t['type']}({params}),\n" + + code += " A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),\n" + code += " ToTensorV2(),\n" + code += "]" + + if task == 'detection': + code += ", bbox_params=A.BboxParams(format='pascal_voc', label_fields=['class_labels']))" + else: + code += ")" + + return code + + @staticmethod + def _to_torchvision(config: Dict, task: str) -> Dict: + """Convert to torchvision transforms format.""" + 
transforms = [] + + for aug_name, params in config.items(): + if aug_name == 'horizontal_flip': + transforms.append({ + 'type': 'RandomHorizontalFlip', + 'p': params + }) + elif aug_name == 'vertical_flip': + transforms.append({ + 'type': 'RandomVerticalFlip', + 'p': params + }) + elif aug_name == 'rotate': + transforms.append({ + 'type': 'RandomRotation', + 'degrees': params.get('limit', 15) + }) + elif aug_name == 'color_jitter': + transforms.append({ + 'type': 'ColorJitter', + 'brightness': params.get('brightness', 0.2), + 'contrast': params.get('contrast', 0.2), + 'saturation': params.get('saturation', 0.2), + 'hue': params.get('hue', 0.1) + }) + + return { + 'framework': 'torchvision', + 'task': task, + 'transforms': transforms + } + + @staticmethod + def _to_ultralytics(config: Dict, task: str) -> Dict: + """Convert to Ultralytics YOLO format.""" + yolo_config = { + 'hsv_h': 0.015, + 'hsv_s': 0.7, + 'hsv_v': 0.4, + 'degrees': config.get('rotate', {}).get('limit', 0.0), + 'translate': 0.1, + 'scale': config.get('scale', {}).get('scale_limit', 0.5), + 'shear': 0.0, + 'perspective': 0.0, + 'flipud': config.get('vertical_flip', 0.0), + 'fliplr': config.get('horizontal_flip', 0.5), + 'mosaic': config.get('mosaic', {}).get('p', 1.0) if 'mosaic' in config else 0.0, + 'mixup': config.get('mixup', {}).get('p', 0.0) if 'mixup' in config else 0.0, + 'copy_paste': 0.0 + } + + return { + 'framework': 'ultralytics', + 'task': task, + 'config': yolo_config, + 'usage': "# Add to data.yaml or pass to Trainer\nmodel.train(data='data.yaml', augment=True, **aug_config)" + } + + +# ============================================================================ +# Dataset Validation +# ============================================================================ + +class DatasetValidator: + """Validate dataset integrity and quality.""" + + def __init__(self, dataset_path: str, format: str = None): + self.dataset_path = Path(dataset_path) + self.format = format + + def validate(self) -> Dict: + """Run all validation checks.""" + results = { + 'valid': True, + 'errors': [], + 'warnings': [], + 'stats': {} + } + + # Auto-detect format if not specified + if self.format is None: + analyzer = DatasetAnalyzer(str(self.dataset_path)) + analyzer.analyze() + self.format = analyzer.stats.get('format', 'unknown') + + results['format'] = self.format + + # Run format-specific validation + if self.format == 'coco': + self._validate_coco(results) + elif self.format == 'yolo': + self._validate_yolo(results) + elif self.format == 'voc': + self._validate_voc(results) + else: + results['warnings'].append(f"Unknown format: {self.format}") + + # General checks + self._validate_images(results) + self._check_duplicates(results) + + # Set overall validity + results['valid'] = len(results['errors']) == 0 + + return results + + def _validate_coco(self, results: Dict): + """Validate COCO format dataset.""" + for json_file in self.dataset_path.rglob('*.json'): + try: + with open(json_file) as f: + data = json.load(f) + + if 'annotations' not in data: + continue + + # Check required fields + if 'images' not in data: + results['errors'].append(f"{json_file}: Missing 'images' field") + if 'categories' not in data: + results['warnings'].append(f"{json_file}: Missing 'categories' field") + + # Validate annotations + image_ids = {img['id'] for img in data.get('images', [])} + category_ids = {cat['id'] for cat in data.get('categories', [])} + + for ann in data['annotations']: + if ann.get('image_id') not in image_ids: + 
results['errors'].append( + f"Annotation {ann.get('id')} references non-existent image {ann.get('image_id')}" + ) + if ann.get('category_id') not in category_ids: + results['warnings'].append( + f"Annotation {ann.get('id')} references unknown category {ann.get('category_id')}" + ) + + # Validate bbox + if 'bbox' in ann: + bbox = ann['bbox'] + if len(bbox) != 4: + results['errors'].append( + f"Annotation {ann.get('id')}: Invalid bbox format" + ) + elif any(v < 0 for v in bbox[:2]) or any(v <= 0 for v in bbox[2:]): + results['warnings'].append( + f"Annotation {ann.get('id')}: Suspicious bbox values {bbox}" + ) + + results['stats']['coco_images'] = len(data.get('images', [])) + results['stats']['coco_annotations'] = len(data['annotations']) + results['stats']['coco_categories'] = len(data.get('categories', [])) + + except json.JSONDecodeError as e: + results['errors'].append(f"{json_file}: Invalid JSON - {e}") + except Exception as e: + results['errors'].append(f"{json_file}: Error - {e}") + + def _validate_yolo(self, results: Dict): + """Validate YOLO format dataset.""" + label_files = list(self.dataset_path.rglob('*.txt')) + valid_labels = 0 + invalid_labels = 0 + + for txt_file in label_files: + if txt_file.name == 'classes.txt': + continue + + try: + with open(txt_file) as f: + lines = f.readlines() + + for line_num, line in enumerate(lines, 1): + parts = line.strip().split() + if not parts: + continue + + if len(parts) < 5: + results['errors'].append( + f"{txt_file}:{line_num}: Expected 5 values, got {len(parts)}" + ) + invalid_labels += 1 + continue + + try: + class_id = int(parts[0]) + x, y, w, h = map(float, parts[1:5]) + + # Check normalized coordinates + if not (0 <= x <= 1 and 0 <= y <= 1): + results['warnings'].append( + f"{txt_file}:{line_num}: Center coords outside [0,1]: ({x}, {y})" + ) + if not (0 < w <= 1 and 0 < h <= 1): + results['warnings'].append( + f"{txt_file}:{line_num}: Size outside (0,1]: ({w}, {h})" + ) + + valid_labels += 1 + + except ValueError as e: + results['errors'].append( + f"{txt_file}:{line_num}: Invalid values - {e}" + ) + invalid_labels += 1 + + except Exception as e: + results['errors'].append(f"{txt_file}: Error - {e}") + + results['stats']['yolo_valid_labels'] = valid_labels + results['stats']['yolo_invalid_labels'] = invalid_labels + + def _validate_voc(self, results: Dict): + """Validate Pascal VOC format dataset.""" + xml_files = list(self.dataset_path.rglob('*.xml')) + valid_annotations = 0 + + for xml_file in xml_files: + try: + tree = ET.parse(xml_file) + root = tree.getroot() + + if root.tag != 'annotation': + continue + + # Check required fields + filename = root.find('filename') + if filename is None: + results['warnings'].append(f"{xml_file}: Missing filename") + + size = root.find('size') + if size is None: + results['warnings'].append(f"{xml_file}: Missing size") + else: + for dim in ['width', 'height']: + if size.find(dim) is None: + results['errors'].append(f"{xml_file}: Missing {dim}") + + # Validate objects + for obj in root.findall('object'): + name = obj.find('name') + if name is None or not name.text: + results['errors'].append(f"{xml_file}: Object missing name") + + bndbox = obj.find('bndbox') + if bndbox is None: + results['errors'].append(f"{xml_file}: Object missing bndbox") + else: + for coord in ['xmin', 'ymin', 'xmax', 'ymax']: + elem = bndbox.find(coord) + if elem is None: + results['errors'].append(f"{xml_file}: Missing {coord}") + + valid_annotations += 1 + + except ET.ParseError as e: + 
results['errors'].append(f"{xml_file}: XML parse error - {e}") + except Exception as e: + results['errors'].append(f"{xml_file}: Error - {e}") + + results['stats']['voc_annotations'] = valid_annotations + + def _validate_images(self, results: Dict): + """Check for image file issues.""" + images = [] + for ext in SUPPORTED_IMAGE_EXTENSIONS: + images.extend(self.dataset_path.rglob(f'*{ext}')) + + results['stats']['total_images'] = len(images) + + # Check for empty images + empty_images = [img for img in images if img.stat().st_size == 0] + if empty_images: + results['errors'].append(f"Found {len(empty_images)} empty image files") + + # Check for very small images + small_images = [img for img in images if img.stat().st_size < 1000] + if small_images: + results['warnings'].append(f"Found {len(small_images)} very small images (<1KB)") + + def _check_duplicates(self, results: Dict): + """Check for duplicate images by hash.""" + images = [] + for ext in SUPPORTED_IMAGE_EXTENSIONS: + images.extend(self.dataset_path.rglob(f'*{ext}')) + + hashes = {} + duplicates = [] + + for img in images: + try: + with open(img, 'rb') as f: + file_hash = hashlib.md5(f.read()).hexdigest() + + if file_hash in hashes: + duplicates.append((img, hashes[file_hash])) + else: + hashes[file_hash] = img + except: + pass + + if duplicates: + results['warnings'].append(f"Found {len(duplicates)} duplicate images") + results['stats']['duplicate_images'] = len(duplicates) + + +# ============================================================================ +# Main CLI +# ============================================================================ def main(): - """Main entry point""" parser = argparse.ArgumentParser( - description="Dataset Pipeline Builder" + description="Dataset Pipeline Builder for Computer Vision", + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +Examples: + Analyze dataset: + python dataset_pipeline_builder.py analyze --input /path/to/dataset + + Convert COCO to YOLO: + python dataset_pipeline_builder.py convert --input /path/to/coco --output /path/to/yolo --format yolo + + Split dataset: + python dataset_pipeline_builder.py split --input /path/to/dataset --train 0.8 --val 0.1 --test 0.1 + + Generate augmentation config: + python dataset_pipeline_builder.py augment-config --task detection --intensity heavy + + Validate dataset: + python dataset_pipeline_builder.py validate --input /path/to/dataset --format coco + """ ) - parser.add_argument('--input', '-i', required=True, help='Input path') - parser.add_argument('--output', '-o', required=True, help='Output path') - parser.add_argument('--config', '-c', help='Configuration file') - parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output') - + + subparsers = parser.add_subparsers(dest='command', help='Command to run') + + # Analyze command + analyze_parser = subparsers.add_parser('analyze', help='Analyze dataset structure and statistics') + analyze_parser.add_argument('--input', '-i', required=True, help='Path to dataset') + analyze_parser.add_argument('--json', action='store_true', help='Output as JSON') + + # Convert command + convert_parser = subparsers.add_parser('convert', help='Convert between annotation formats') + convert_parser.add_argument('--input', '-i', required=True, help='Input dataset path') + convert_parser.add_argument('--output', '-o', required=True, help='Output dataset path') + convert_parser.add_argument('--format', '-f', required=True, + choices=['yolo', 'coco', 'voc'], + help='Target format') 
+ convert_parser.add_argument('--source-format', '-s', + choices=['yolo', 'coco', 'voc'], + help='Source format (auto-detected if not specified)') + + # Split command + split_parser = subparsers.add_parser('split', help='Split dataset into train/val/test') + split_parser.add_argument('--input', '-i', required=True, help='Input dataset path') + split_parser.add_argument('--output', '-o', help='Output path (default: same as input)') + split_parser.add_argument('--train', type=float, default=0.8, help='Train split ratio') + split_parser.add_argument('--val', type=float, default=0.1, help='Validation split ratio') + split_parser.add_argument('--test', type=float, default=0.1, help='Test split ratio') + split_parser.add_argument('--stratify', action='store_true', help='Stratify by class') + split_parser.add_argument('--seed', type=int, default=42, help='Random seed') + + # Augmentation config command + aug_parser = subparsers.add_parser('augment-config', help='Generate augmentation configuration') + aug_parser.add_argument('--task', '-t', required=True, + choices=['detection', 'segmentation', 'classification'], + help='CV task type') + aug_parser.add_argument('--intensity', '-n', default='medium', + choices=['light', 'medium', 'heavy'], + help='Augmentation intensity') + aug_parser.add_argument('--framework', '-f', default='albumentations', + choices=['albumentations', 'torchvision', 'ultralytics'], + help='Target framework') + aug_parser.add_argument('--output', '-o', help='Output file path') + + # Validate command + validate_parser = subparsers.add_parser('validate', help='Validate dataset integrity') + validate_parser.add_argument('--input', '-i', required=True, help='Path to dataset') + validate_parser.add_argument('--format', '-f', + choices=['yolo', 'coco', 'voc'], + help='Dataset format (auto-detected if not specified)') + validate_parser.add_argument('--json', action='store_true', help='Output as JSON') + args = parser.parse_args() - - if args.verbose: - logging.getLogger().setLevel(logging.DEBUG) - - try: - config = { - 'input': args.input, - 'output': args.output - } - - processor = DatasetPipelineBuilder(config) - results = processor.process() - - print(json.dumps(results, indent=2)) - sys.exit(0) - - except Exception as e: - logger.error(f"Fatal error: {e}") + + if args.command is None: + parser.print_help() sys.exit(1) + try: + if args.command == 'analyze': + analyzer = DatasetAnalyzer(args.input) + results = analyzer.analyze() + + if args.json: + print(json.dumps(results, indent=2, default=str)) + else: + print("\n" + "="*60) + print("DATASET ANALYSIS REPORT") + print("="*60) + print(f"\nFormat: {results.get('format', 'unknown')}") + print(f"Total Images: {results.get('total_images', 0)}") + + if 'image_stats' in results: + stats = results['image_stats'] + print(f"\nImage Statistics:") + print(f" Total Size: {stats.get('total_size_mb', 0):.2f} MB") + print(f" Extensions: {stats.get('extensions', {})}") + print(f" Locations: {stats.get('locations', {})}") + + if 'annotations' in results: + ann = results['annotations'] + print(f"\nAnnotations:") + print(f" Total: {ann.get('total_annotations', 0)}") + print(f" Images with annotations: {ann.get('images_with_annotations', 0)}") + if 'classes' in ann: + print(f" Classes: {len(ann['classes'])}") + for cls, count in sorted(ann['classes'].items(), key=lambda x: -x[1])[:10]: + print(f" - {cls}: {count}") + + if 'quality' in results: + q = results['quality'] + if q.get('warnings'): + print(f"\nWarnings:") + for w in q['warnings']: + print(f" 
⚠ {w}") + if q.get('recommendations'): + print(f"\nRecommendations:") + for r in q['recommendations']: + print(f" → {r}") + + elif args.command == 'convert': + converter = FormatConverter(args.input, args.output) + results = converter.convert(args.format, args.source_format) + print(json.dumps(results, indent=2)) + + elif args.command == 'split': + output = args.output if args.output else args.input + splitter = DatasetSplitter(args.input, output) + results = splitter.split( + train=args.train, + val=args.val, + test=args.test, + stratify=args.stratify, + seed=args.seed + ) + print(json.dumps(results, indent=2)) + + elif args.command == 'augment-config': + config = AugmentationConfigGenerator.generate( + args.task, + args.intensity, + args.framework + ) + + output = json.dumps(config, indent=2) + + if args.output: + with open(args.output, 'w') as f: + f.write(output) + print(f"Configuration saved to {args.output}") + else: + print(output) + + elif args.command == 'validate': + validator = DatasetValidator(args.input, args.format) + results = validator.validate() + + if args.json: + print(json.dumps(results, indent=2)) + else: + print("\n" + "="*60) + print("DATASET VALIDATION REPORT") + print("="*60) + print(f"\nFormat: {results.get('format', 'unknown')}") + print(f"Valid: {'✓' if results['valid'] else '✗'}") + + if results.get('errors'): + print(f"\nErrors ({len(results['errors'])}):") + for err in results['errors'][:10]: + print(f" ✗ {err}") + if len(results['errors']) > 10: + print(f" ... and {len(results['errors']) - 10} more") + + if results.get('warnings'): + print(f"\nWarnings ({len(results['warnings'])}):") + for warn in results['warnings'][:10]: + print(f" ⚠ {warn}") + if len(results['warnings']) > 10: + print(f" ... and {len(results['warnings']) - 10} more") + + if results.get('stats'): + print(f"\nStatistics:") + for key, value in results['stats'].items(): + print(f" {key}: {value}") + + sys.exit(0) + + except Exception as e: + logger.error(f"Error: {e}") + sys.exit(1) + + if __name__ == '__main__': main() diff --git a/engineering-team/senior-computer-vision/scripts/inference_optimizer.py b/engineering-team/senior-computer-vision/scripts/inference_optimizer.py index 97f5c8d..333e1ec 100755 --- a/engineering-team/senior-computer-vision/scripts/inference_optimizer.py +++ b/engineering-team/senior-computer-vision/scripts/inference_optimizer.py @@ -1,17 +1,26 @@ #!/usr/bin/env python3 """ Inference Optimizer -Production-grade tool for senior computer vision engineer + +Analyzes and benchmarks vision models, and provides optimization recommendations. +Supports PyTorch, ONNX, and TensorRT models. 
+ +Usage: + python inference_optimizer.py model.pt --benchmark + python inference_optimizer.py model.pt --export onnx --output model.onnx + python inference_optimizer.py model.onnx --analyze """ import os import sys import json -import logging import argparse +import logging +import time from pathlib import Path -from typing import Dict, List, Optional +from typing import Dict, List, Optional, Any, Tuple from datetime import datetime +import statistics logging.basicConfig( level=logging.INFO, @@ -19,82 +28,530 @@ logging.basicConfig( ) logger = logging.getLogger(__name__) + +# Model format signatures +MODEL_FORMATS = { + '.pt': 'pytorch', + '.pth': 'pytorch', + '.onnx': 'onnx', + '.engine': 'tensorrt', + '.trt': 'tensorrt', + '.xml': 'openvino', + '.mlpackage': 'coreml', + '.mlmodel': 'coreml', +} + +# Optimization recommendations +OPTIMIZATION_PATHS = { + ('pytorch', 'gpu'): ['onnx', 'tensorrt_fp16'], + ('pytorch', 'cpu'): ['onnx', 'onnxruntime'], + ('pytorch', 'edge'): ['onnx', 'tensorrt_int8'], + ('pytorch', 'mobile'): ['onnx', 'tflite'], + ('pytorch', 'apple'): ['coreml'], + ('pytorch', 'intel'): ['onnx', 'openvino'], + ('onnx', 'gpu'): ['tensorrt_fp16'], + ('onnx', 'cpu'): ['onnxruntime'], +} + + class InferenceOptimizer: - """Production-grade inference optimizer""" - - def __init__(self, config: Dict): - self.config = config - self.results = { - 'status': 'initialized', - 'start_time': datetime.now().isoformat(), - 'processed_items': 0 + """Analyzes and optimizes vision model inference.""" + + def __init__(self, model_path: str): + self.model_path = Path(model_path) + self.model_format = self._detect_format() + self.model_info = {} + self.benchmark_results = {} + + def _detect_format(self) -> str: + """Detect model format from file extension.""" + suffix = self.model_path.suffix.lower() + if suffix in MODEL_FORMATS: + return MODEL_FORMATS[suffix] + raise ValueError(f"Unknown model format: {suffix}") + + def analyze_model(self) -> Dict[str, Any]: + """Analyze model structure and size.""" + logger.info(f"Analyzing model: {self.model_path}") + + analysis = { + 'path': str(self.model_path), + 'format': self.model_format, + 'file_size_mb': self.model_path.stat().st_size / 1024 / 1024, + 'parameters': None, + 'layers': [], + 'input_shape': None, + 'output_shape': None, + 'ops_count': None, } - logger.info(f"Initialized {self.__class__.__name__}") - - def validate_config(self) -> bool: - """Validate configuration""" - logger.info("Validating configuration...") - # Add validation logic - logger.info("Configuration validated") - return True - - def process(self) -> Dict: - """Main processing logic""" - logger.info("Starting processing...") - + + if self.model_format == 'onnx': + analysis.update(self._analyze_onnx()) + elif self.model_format == 'pytorch': + analysis.update(self._analyze_pytorch()) + + self.model_info = analysis + return analysis + + def _analyze_onnx(self) -> Dict[str, Any]: + """Analyze ONNX model.""" try: - self.validate_config() - - # Main processing - result = self._execute() - - self.results['status'] = 'completed' - self.results['end_time'] = datetime.now().isoformat() - - logger.info("Processing completed successfully") - return self.results - + import onnx + model = onnx.load(str(self.model_path)) + onnx.checker.check_model(model) + + # Count parameters + total_params = 0 + for initializer in model.graph.initializer: + param_count = 1 + for dim in initializer.dims: + param_count *= dim + total_params += param_count + + # Get input/output shapes + inputs = [] + for inp 
in model.graph.input: + shape = [d.dim_value if d.dim_value else -1 + for d in inp.type.tensor_type.shape.dim] + inputs.append({'name': inp.name, 'shape': shape}) + + outputs = [] + for out in model.graph.output: + shape = [d.dim_value if d.dim_value else -1 + for d in out.type.tensor_type.shape.dim] + outputs.append({'name': out.name, 'shape': shape}) + + # Count operators + op_counts = {} + for node in model.graph.node: + op_type = node.op_type + op_counts[op_type] = op_counts.get(op_type, 0) + 1 + + return { + 'parameters': total_params, + 'inputs': inputs, + 'outputs': outputs, + 'operator_counts': op_counts, + 'num_nodes': len(model.graph.node), + 'opset_version': model.opset_import[0].version if model.opset_import else None, + } + + except ImportError: + logger.warning("onnx package not installed, skipping detailed analysis") + return {} except Exception as e: - self.results['status'] = 'failed' - self.results['error'] = str(e) - logger.error(f"Processing failed: {e}") - raise - - def _execute(self) -> Dict: - """Execute main logic""" - # Implementation here - return {'success': True} + logger.error(f"Error analyzing ONNX model: {e}") + return {'error': str(e)} + + def _analyze_pytorch(self) -> Dict[str, Any]: + """Analyze PyTorch model.""" + try: + import torch + + # Try to load as checkpoint + checkpoint = torch.load(str(self.model_path), map_location='cpu') + + # Handle different checkpoint formats + if isinstance(checkpoint, dict): + if 'model' in checkpoint: + state_dict = checkpoint['model'] + elif 'state_dict' in checkpoint: + state_dict = checkpoint['state_dict'] + else: + state_dict = checkpoint + else: + # Assume it's the model itself + if hasattr(checkpoint, 'state_dict'): + state_dict = checkpoint.state_dict() + else: + return {'error': 'Could not extract state dict'} + + # Count parameters + total_params = 0 + layer_info = [] + for name, param in state_dict.items(): + if hasattr(param, 'numel'): + param_count = param.numel() + total_params += param_count + layer_info.append({ + 'name': name, + 'shape': list(param.shape), + 'params': param_count, + 'dtype': str(param.dtype) + }) + + return { + 'parameters': total_params, + 'layers': layer_info[:20], # First 20 layers + 'num_layers': len(layer_info), + } + + except ImportError: + logger.warning("torch package not installed, skipping detailed analysis") + return {} + except Exception as e: + logger.error(f"Error analyzing PyTorch model: {e}") + return {'error': str(e)} + + def benchmark(self, input_size: Tuple[int, int] = (640, 640), + batch_sizes: List[int] = None, + num_iterations: int = 100, + warmup: int = 10) -> Dict[str, Any]: + """Benchmark model inference speed.""" + if batch_sizes is None: + batch_sizes = [1, 4, 8, 16] + + logger.info(f"Benchmarking model with input size {input_size}") + + results = { + 'input_size': input_size, + 'num_iterations': num_iterations, + 'warmup_iterations': warmup, + 'batch_results': [], + 'device': 'cpu', + } + + try: + if self.model_format == 'onnx': + results.update(self._benchmark_onnx(input_size, batch_sizes, + num_iterations, warmup)) + elif self.model_format == 'pytorch': + results.update(self._benchmark_pytorch(input_size, batch_sizes, + num_iterations, warmup)) + else: + results['error'] = f"Benchmarking not supported for {self.model_format}" + + except Exception as e: + results['error'] = str(e) + logger.error(f"Benchmark failed: {e}") + + self.benchmark_results = results + return results + + def _benchmark_onnx(self, input_size: Tuple[int, int], + batch_sizes: List[int], + 
num_iterations: int, warmup: int) -> Dict[str, Any]: + """Benchmark ONNX model.""" + import numpy as np + + try: + import onnxruntime as ort + + # Try GPU first, fall back to CPU + providers = ['CPUExecutionProvider'] + try: + if 'CUDAExecutionProvider' in ort.get_available_providers(): + providers = ['CUDAExecutionProvider'] + providers + except: + pass + + session = ort.InferenceSession(str(self.model_path), providers=providers) + input_name = session.get_inputs()[0].name + device = 'cuda' if 'CUDA' in session.get_providers()[0] else 'cpu' + + results = {'device': device, 'provider': session.get_providers()[0]} + batch_results = [] + + for batch_size in batch_sizes: + # Create dummy input + dummy = np.random.randn(batch_size, 3, *input_size).astype(np.float32) + + # Warmup + for _ in range(warmup): + session.run(None, {input_name: dummy}) + + # Benchmark + latencies = [] + for _ in range(num_iterations): + start = time.perf_counter() + session.run(None, {input_name: dummy}) + latencies.append((time.perf_counter() - start) * 1000) + + batch_result = { + 'batch_size': batch_size, + 'mean_latency_ms': statistics.mean(latencies), + 'std_latency_ms': statistics.stdev(latencies) if len(latencies) > 1 else 0, + 'min_latency_ms': min(latencies), + 'max_latency_ms': max(latencies), + 'p50_latency_ms': sorted(latencies)[len(latencies) // 2], + 'p95_latency_ms': sorted(latencies)[int(len(latencies) * 0.95)], + 'p99_latency_ms': sorted(latencies)[int(len(latencies) * 0.99)], + 'throughput_fps': batch_size * 1000 / statistics.mean(latencies), + } + batch_results.append(batch_result) + + logger.info(f"Batch {batch_size}: {batch_result['mean_latency_ms']:.2f}ms, " + f"{batch_result['throughput_fps']:.1f} FPS") + + results['batch_results'] = batch_results + return results + + except ImportError: + return {'error': 'onnxruntime not installed'} + + def _benchmark_pytorch(self, input_size: Tuple[int, int], + batch_sizes: List[int], + num_iterations: int, warmup: int) -> Dict[str, Any]: + """Benchmark PyTorch model.""" + try: + import torch + import numpy as np + + # Load model + device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') + checkpoint = torch.load(str(self.model_path), map_location=device) + + # Handle different checkpoint formats + if isinstance(checkpoint, dict) and 'model' in checkpoint: + model = checkpoint['model'] + elif hasattr(checkpoint, 'forward'): + model = checkpoint + else: + return {'error': 'Could not load model for benchmarking'} + + model.to(device) + model.train(False) + + results = {'device': str(device)} + batch_results = [] + + with torch.no_grad(): + for batch_size in batch_sizes: + dummy = torch.randn(batch_size, 3, *input_size, device=device) + + # Warmup + for _ in range(warmup): + _ = model(dummy) + if device.type == 'cuda': + torch.cuda.synchronize() + + # Benchmark + latencies = [] + for _ in range(num_iterations): + if device.type == 'cuda': + torch.cuda.synchronize() + start = time.perf_counter() + _ = model(dummy) + if device.type == 'cuda': + torch.cuda.synchronize() + latencies.append((time.perf_counter() - start) * 1000) + + batch_result = { + 'batch_size': batch_size, + 'mean_latency_ms': statistics.mean(latencies), + 'std_latency_ms': statistics.stdev(latencies) if len(latencies) > 1 else 0, + 'min_latency_ms': min(latencies), + 'max_latency_ms': max(latencies), + 'throughput_fps': batch_size * 1000 / statistics.mean(latencies), + } + batch_results.append(batch_result) + + logger.info(f"Batch {batch_size}: 
{batch_result['mean_latency_ms']:.2f}ms, " + f"{batch_result['throughput_fps']:.1f} FPS") + + results['batch_results'] = batch_results + return results + + except ImportError: + return {'error': 'torch not installed'} + except Exception as e: + return {'error': str(e)} + + def get_optimization_recommendations(self, target: str = 'gpu') -> List[Dict[str, Any]]: + """Get optimization recommendations for target platform.""" + recommendations = [] + + key = (self.model_format, target) + if key in OPTIMIZATION_PATHS: + path = OPTIMIZATION_PATHS[key] + for step in path: + rec = { + 'step': step, + 'description': self._get_step_description(step), + 'expected_speedup': self._get_expected_speedup(step), + 'command': self._get_step_command(step), + } + recommendations.append(rec) + + # Add general recommendations + if self.model_info: + params = self.model_info.get('parameters', 0) + if params and params > 50_000_000: + recommendations.append({ + 'step': 'pruning', + 'description': f'Model has {params/1e6:.1f}M parameters. ' + 'Consider structured pruning to reduce size.', + 'expected_speedup': '1.5-2x', + }) + + file_size = self.model_info.get('file_size_mb', 0) + if file_size > 100: + recommendations.append({ + 'step': 'quantization', + 'description': f'Model size is {file_size:.1f}MB. ' + 'INT8 quantization can reduce by 75%.', + 'expected_speedup': '2-4x', + }) + + return recommendations + + def _get_step_description(self, step: str) -> str: + """Get description for optimization step.""" + descriptions = { + 'onnx': 'Export to ONNX format for framework-agnostic deployment', + 'tensorrt_fp16': 'Convert to TensorRT with FP16 precision for NVIDIA GPUs', + 'tensorrt_int8': 'Convert to TensorRT with INT8 quantization for edge devices', + 'onnxruntime': 'Use ONNX Runtime for optimized CPU/GPU inference', + 'openvino': 'Convert to OpenVINO for Intel CPU/GPU optimization', + 'coreml': 'Convert to CoreML for Apple Silicon acceleration', + 'tflite': 'Convert to TensorFlow Lite for mobile deployment', + } + return descriptions.get(step, step) + + def _get_expected_speedup(self, step: str) -> str: + """Get expected speedup for optimization step.""" + speedups = { + 'onnx': '1-1.5x', + 'tensorrt_fp16': '2-4x', + 'tensorrt_int8': '3-6x', + 'onnxruntime': '1.2-2x', + 'openvino': '1.5-3x', + 'coreml': '2-5x (on Apple Silicon)', + 'tflite': '1-2x', + } + return speedups.get(step, 'varies') + + def _get_step_command(self, step: str) -> str: + """Get command for optimization step.""" + model_name = self.model_path.stem + commands = { + 'onnx': f'yolo export model={model_name}.pt format=onnx', + 'tensorrt_fp16': f'trtexec --onnx={model_name}.onnx --saveEngine={model_name}.engine --fp16', + 'tensorrt_int8': f'trtexec --onnx={model_name}.onnx --saveEngine={model_name}.engine --int8', + 'onnxruntime': f'pip install onnxruntime-gpu', + 'openvino': f'mo --input_model {model_name}.onnx --output_dir openvino/', + 'coreml': f'yolo export model={model_name}.pt format=coreml', + } + return commands.get(step, '') + + def print_summary(self): + """Print analysis and benchmark summary.""" + print("\n" + "=" * 70) + print("MODEL ANALYSIS SUMMARY") + print("=" * 70) + + if self.model_info: + print(f"Path: {self.model_info.get('path', 'N/A')}") + print(f"Format: {self.model_info.get('format', 'N/A')}") + print(f"File Size: {self.model_info.get('file_size_mb', 0):.2f} MB") + + params = self.model_info.get('parameters') + if params: + print(f"Parameters: {params:,} ({params/1e6:.2f}M)") + + if 'num_nodes' in self.model_info: + 
print(f"Nodes: {self.model_info['num_nodes']}") + + if self.benchmark_results and 'batch_results' in self.benchmark_results: + print("\n" + "-" * 70) + print("BENCHMARK RESULTS") + print("-" * 70) + print(f"Device: {self.benchmark_results.get('device', 'N/A')}") + print(f"Input Size: {self.benchmark_results.get('input_size', 'N/A')}") + print() + print(f"{'Batch':<8} {'Latency (ms)':<15} {'Throughput (FPS)':<18} {'P99 (ms)':<12}") + print("-" * 55) + + for result in self.benchmark_results['batch_results']: + print(f"{result['batch_size']:<8} " + f"{result['mean_latency_ms']:<15.2f} " + f"{result['throughput_fps']:<18.1f} " + f"{result.get('p99_latency_ms', 0):<12.2f}") + + print("=" * 70 + "\n") + def main(): - """Main entry point""" parser = argparse.ArgumentParser( - description="Inference Optimizer" + description="Analyze and optimize vision model inference" ) - parser.add_argument('--input', '-i', required=True, help='Input path') - parser.add_argument('--output', '-o', required=True, help='Output path') - parser.add_argument('--config', '-c', help='Configuration file') - parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output') - + parser.add_argument('model_path', help='Path to model file') + parser.add_argument('--analyze', action='store_true', + help='Analyze model structure') + parser.add_argument('--benchmark', action='store_true', + help='Benchmark inference speed') + parser.add_argument('--input-size', type=int, nargs=2, default=[640, 640], + metavar=('H', 'W'), help='Input image size') + parser.add_argument('--batch-sizes', type=int, nargs='+', default=[1, 4, 8], + help='Batch sizes to benchmark') + parser.add_argument('--iterations', type=int, default=100, + help='Number of benchmark iterations') + parser.add_argument('--warmup', type=int, default=10, + help='Number of warmup iterations') + parser.add_argument('--target', choices=['gpu', 'cpu', 'edge', 'mobile', 'apple', 'intel'], + default='gpu', help='Target deployment platform') + parser.add_argument('--recommend', action='store_true', + help='Show optimization recommendations') + parser.add_argument('--json', action='store_true', + help='Output as JSON') + parser.add_argument('--output', '-o', help='Output file path') + args = parser.parse_args() - - if args.verbose: - logging.getLogger().setLevel(logging.DEBUG) - - try: - config = { - 'input': args.input, - 'output': args.output - } - - processor = InferenceOptimizer(config) - results = processor.process() - - print(json.dumps(results, indent=2)) - sys.exit(0) - - except Exception as e: - logger.error(f"Fatal error: {e}") + + if not Path(args.model_path).exists(): + logger.error(f"Model not found: {args.model_path}") sys.exit(1) + try: + optimizer = InferenceOptimizer(args.model_path) + except ValueError as e: + logger.error(str(e)) + sys.exit(1) + + results = {} + + # Analyze model + if args.analyze or not (args.benchmark or args.recommend): + results['analysis'] = optimizer.analyze_model() + + # Benchmark + if args.benchmark: + results['benchmark'] = optimizer.benchmark( + input_size=tuple(args.input_size), + batch_sizes=args.batch_sizes, + num_iterations=args.iterations, + warmup=args.warmup + ) + + # Recommendations + if args.recommend: + if not optimizer.model_info: + optimizer.analyze_model() + results['recommendations'] = optimizer.get_optimization_recommendations(args.target) + + # Output + if args.json: + print(json.dumps(results, indent=2, default=str)) + else: + optimizer.print_summary() + + if args.recommend and 'recommendations' in 
results: + print("OPTIMIZATION RECOMMENDATIONS") + print("-" * 70) + for i, rec in enumerate(results['recommendations'], 1): + print(f"\n{i}. {rec['step'].upper()}") + print(f" {rec['description']}") + print(f" Expected speedup: {rec['expected_speedup']}") + if rec.get('command'): + print(f" Command: {rec['command']}") + print() + + # Save to file + if args.output: + with open(args.output, 'w') as f: + json.dump(results, f, indent=2, default=str) + logger.info(f"Results saved to {args.output}") + + if __name__ == '__main__': main() diff --git a/engineering-team/senior-computer-vision/scripts/vision_model_trainer.py b/engineering-team/senior-computer-vision/scripts/vision_model_trainer.py index 84edf9a..c1a36fb 100755 --- a/engineering-team/senior-computer-vision/scripts/vision_model_trainer.py +++ b/engineering-team/senior-computer-vision/scripts/vision_model_trainer.py @@ -1,16 +1,22 @@ #!/usr/bin/env python3 """ -Vision Model Trainer -Production-grade tool for senior computer vision engineer +Vision Model Trainer Configuration Generator + +Generates training configuration files for object detection and segmentation models. +Supports Ultralytics YOLO, Detectron2, and MMDetection frameworks. + +Usage: + python vision_model_trainer.py --task detection --arch yolov8m + python vision_model_trainer.py --framework detectron2 --arch faster_rcnn_R_50_FPN """ import os import sys import json -import logging import argparse +import logging from pathlib import Path -from typing import Dict, List, Optional +from typing import Dict, List, Optional, Any from datetime import datetime logging.basicConfig( @@ -19,82 +25,552 @@ logging.basicConfig( ) logger = logging.getLogger(__name__) + +# Architecture configurations +YOLO_ARCHITECTURES = { + 'yolov8n': {'params': '3.2M', 'gflops': 8.7, 'map': 37.3}, + 'yolov8s': {'params': '11.2M', 'gflops': 28.6, 'map': 44.9}, + 'yolov8m': {'params': '25.9M', 'gflops': 78.9, 'map': 50.2}, + 'yolov8l': {'params': '43.7M', 'gflops': 165.2, 'map': 52.9}, + 'yolov8x': {'params': '68.2M', 'gflops': 257.8, 'map': 53.9}, + 'yolov5n': {'params': '1.9M', 'gflops': 4.5, 'map': 28.0}, + 'yolov5s': {'params': '7.2M', 'gflops': 16.5, 'map': 37.4}, + 'yolov5m': {'params': '21.2M', 'gflops': 49.0, 'map': 45.4}, + 'yolov5l': {'params': '46.5M', 'gflops': 109.1, 'map': 49.0}, + 'yolov5x': {'params': '86.7M', 'gflops': 205.7, 'map': 50.7}, +} + +DETECTRON2_ARCHITECTURES = { + 'faster_rcnn_R_50_FPN': {'backbone': 'R-50-FPN', 'map': 37.9}, + 'faster_rcnn_R_101_FPN': {'backbone': 'R-101-FPN', 'map': 39.4}, + 'faster_rcnn_X_101_FPN': {'backbone': 'X-101-FPN', 'map': 41.0}, + 'mask_rcnn_R_50_FPN': {'backbone': 'R-50-FPN', 'map': 38.6}, + 'mask_rcnn_R_101_FPN': {'backbone': 'R-101-FPN', 'map': 40.0}, + 'retinanet_R_50_FPN': {'backbone': 'R-50-FPN', 'map': 36.4}, + 'retinanet_R_101_FPN': {'backbone': 'R-101-FPN', 'map': 37.7}, +} + +MMDETECTION_ARCHITECTURES = { + 'faster_rcnn_r50_fpn': {'backbone': 'ResNet50', 'map': 37.4}, + 'faster_rcnn_r101_fpn': {'backbone': 'ResNet101', 'map': 39.4}, + 'mask_rcnn_r50_fpn': {'backbone': 'ResNet50', 'map': 38.2}, + 'yolox_s': {'backbone': 'CSPDarknet', 'map': 40.5}, + 'yolox_m': {'backbone': 'CSPDarknet', 'map': 46.9}, + 'yolox_l': {'backbone': 'CSPDarknet', 'map': 49.7}, + 'detr_r50': {'backbone': 'ResNet50', 'map': 42.0}, + 'dino_r50': {'backbone': 'ResNet50', 'map': 49.0}, +} + + class VisionModelTrainer: - """Production-grade vision model trainer""" - - def __init__(self, config: Dict): - self.config = config - self.results = { - 'status': 
'initialized', - 'start_time': datetime.now().isoformat(), - 'processed_items': 0 + """Generates training configurations for vision models.""" + + def __init__(self, data_dir: str, task: str = 'detection', + framework: str = 'ultralytics'): + self.data_dir = Path(data_dir) + self.task = task + self.framework = framework + self.config = {} + + def analyze_dataset(self) -> Dict[str, Any]: + """Analyze dataset structure and statistics.""" + logger.info(f"Analyzing dataset at {self.data_dir}") + + analysis = { + 'path': str(self.data_dir), + 'exists': self.data_dir.exists(), + 'images': {'train': 0, 'val': 0, 'test': 0}, + 'annotations': {'format': None, 'classes': []}, + 'recommendations': [] } - logger.info(f"Initialized {self.__class__.__name__}") - - def validate_config(self) -> bool: - """Validate configuration""" - logger.info("Validating configuration...") - # Add validation logic - logger.info("Configuration validated") - return True - - def process(self) -> Dict: - """Main processing logic""" - logger.info("Starting processing...") - - try: - self.validate_config() - - # Main processing - result = self._execute() - - self.results['status'] = 'completed' - self.results['end_time'] = datetime.now().isoformat() - - logger.info("Processing completed successfully") - return self.results - - except Exception as e: - self.results['status'] = 'failed' - self.results['error'] = str(e) - logger.error(f"Processing failed: {e}") - raise - - def _execute(self) -> Dict: - """Execute main logic""" - # Implementation here - return {'success': True} + + if not self.data_dir.exists(): + analysis['recommendations'].append( + f"Directory {self.data_dir} does not exist" + ) + return analysis + + # Check for common dataset structures + # COCO format + if (self.data_dir / 'annotations').exists(): + analysis['annotations']['format'] = 'coco' + for split in ['train', 'val', 'test']: + ann_file = self.data_dir / 'annotations' / f'{split}.json' + if ann_file.exists(): + with open(ann_file, 'r') as f: + data = json.load(f) + analysis['images'][split] = len(data.get('images', [])) + if not analysis['annotations']['classes']: + analysis['annotations']['classes'] = [ + c['name'] for c in data.get('categories', []) + ] + + # YOLO format + elif (self.data_dir / 'labels').exists(): + analysis['annotations']['format'] = 'yolo' + for split in ['train', 'val', 'test']: + img_dir = self.data_dir / 'images' / split + if img_dir.exists(): + analysis['images'][split] = len(list(img_dir.glob('*.*'))) + + # Try to read classes from data.yaml + data_yaml = self.data_dir / 'data.yaml' + if data_yaml.exists(): + import yaml + with open(data_yaml, 'r') as f: + data = yaml.safe_load(f) + analysis['annotations']['classes'] = data.get('names', []) + + # Generate recommendations + total_images = sum(analysis['images'].values()) + if total_images < 100: + analysis['recommendations'].append( + f"Dataset has only {total_images} images. " + "Consider collecting more data or using transfer learning." + ) + if total_images < 1000: + analysis['recommendations'].append( + "Use aggressive data augmentation (mosaic, mixup) for small datasets." + ) + + num_classes = len(analysis['annotations']['classes']) + if num_classes > 80: + analysis['recommendations'].append( + f"Large number of classes ({num_classes}). " + "Consider using larger model (yolov8l/x) or longer training." 
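+                # Heuristic threshold: COCO itself has 80 classes; beyond that,
+                # extra capacity or a longer schedule usually helps.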
+ ) + + logger.info(f"Found {total_images} images, {num_classes} classes") + return analysis + + def generate_yolo_config(self, arch: str, epochs: int = 100, + batch: int = 16, imgsz: int = 640, + **kwargs) -> Dict[str, Any]: + """Generate Ultralytics YOLO training configuration.""" + if arch not in YOLO_ARCHITECTURES: + available = ', '.join(YOLO_ARCHITECTURES.keys()) + raise ValueError(f"Unknown architecture: {arch}. Available: {available}") + + arch_info = YOLO_ARCHITECTURES[arch] + + config = { + 'model': f'{arch}.pt', + 'data': str(self.data_dir / 'data.yaml'), + 'epochs': epochs, + 'batch': batch, + 'imgsz': imgsz, + 'patience': 50, + 'save': True, + 'save_period': -1, + 'cache': False, + 'device': '0', + 'workers': 8, + 'project': 'runs/detect', + 'name': f'{arch}_{datetime.now().strftime("%Y%m%d_%H%M%S")}', + 'exist_ok': False, + 'pretrained': True, + 'optimizer': 'auto', + 'verbose': True, + 'seed': 0, + 'deterministic': True, + 'single_cls': False, + 'rect': False, + 'cos_lr': False, + 'close_mosaic': 10, + 'resume': False, + 'amp': True, + 'fraction': 1.0, + 'profile': False, + 'freeze': None, + 'lr0': 0.01, + 'lrf': 0.01, + 'momentum': 0.937, + 'weight_decay': 0.0005, + 'warmup_epochs': 3.0, + 'warmup_momentum': 0.8, + 'warmup_bias_lr': 0.1, + 'box': 7.5, + 'cls': 0.5, + 'dfl': 1.5, + 'pose': 12.0, + 'kobj': 1.0, + 'label_smoothing': 0.0, + 'nbs': 64, + 'hsv_h': 0.015, + 'hsv_s': 0.7, + 'hsv_v': 0.4, + 'degrees': 0.0, + 'translate': 0.1, + 'scale': 0.5, + 'shear': 0.0, + 'perspective': 0.0, + 'flipud': 0.0, + 'fliplr': 0.5, + 'bgr': 0.0, + 'mosaic': 1.0, + 'mixup': 0.0, + 'copy_paste': 0.0, + 'auto_augment': 'randaugment', + 'erasing': 0.4, + 'crop_fraction': 1.0, + } + + # Update with user overrides + config.update(kwargs) + + # Task-specific settings + if self.task == 'segmentation': + config['model'] = f'{arch}-seg.pt' + config['overlap_mask'] = True + config['mask_ratio'] = 4 + + # Metadata + config['_metadata'] = { + 'architecture': arch, + 'arch_info': arch_info, + 'task': self.task, + 'framework': 'ultralytics', + 'generated_at': datetime.now().isoformat() + } + + self.config = config + return config + + def generate_detectron2_config(self, arch: str, epochs: int = 12, + batch: int = 16, **kwargs) -> Dict[str, Any]: + """Generate Detectron2 training configuration.""" + if arch not in DETECTRON2_ARCHITECTURES: + available = ', '.join(DETECTRON2_ARCHITECTURES.keys()) + raise ValueError(f"Unknown architecture: {arch}. 
+    def generate_detectron2_config(self, arch: str, epochs: int = 12,
+                                   batch: int = 16, **kwargs) -> Dict[str, Any]:
+        """Generate Detectron2 training configuration."""
+        if arch not in DETECTRON2_ARCHITECTURES:
+            available = ', '.join(DETECTRON2_ARCHITECTURES.keys())
+            raise ValueError(f"Unknown architecture: {arch}. Available: {available}")
+
+        arch_info = DETECTRON2_ARCHITECTURES[arch]
+        iterations = epochs * 1000  # Approximate
+
+        config = {
+            'MODEL': {
+                # NOTE: this checkpoint id belongs to faster_rcnn_R_50_FPN_3x;
+                # other architectures need their own model-zoo ids.
+                'WEIGHTS': f'detectron2://COCO-Detection/{arch}_3x/137849458/model_final_280758.pkl',
+                'ROI_HEADS': {
+                    'NUM_CLASSES': len(self._get_classes()),
+                    'BATCH_SIZE_PER_IMAGE': 512,
+                    'POSITIVE_FRACTION': 0.25,
+                    'SCORE_THRESH_TEST': 0.05,
+                    'NMS_THRESH_TEST': 0.5,
+                },
+                'BACKBONE': {
+                    'FREEZE_AT': 2
+                },
+                'FPN': {
+                    'IN_FEATURES': ['res2', 'res3', 'res4', 'res5']
+                },
+                'ANCHOR_GENERATOR': {
+                    'SIZES': [[32], [64], [128], [256], [512]],
+                    'ASPECT_RATIOS': [[0.5, 1.0, 2.0]]
+                },
+                'RPN': {
+                    'PRE_NMS_TOPK_TRAIN': 2000,
+                    'PRE_NMS_TOPK_TEST': 1000,
+                    'POST_NMS_TOPK_TRAIN': 1000,
+                    'POST_NMS_TOPK_TEST': 1000,
+                }
+            },
+            'DATASETS': {
+                'TRAIN': ('custom_train',),
+                'TEST': ('custom_val',),
+            },
+            'DATALOADER': {
+                'NUM_WORKERS': 4,
+                'SAMPLER_TRAIN': 'TrainingSampler',
+                'FILTER_EMPTY_ANNOTATIONS': True,
+            },
+            'SOLVER': {
+                'IMS_PER_BATCH': batch,
+                'BASE_LR': 0.001,
+                'STEPS': (int(iterations * 0.7), int(iterations * 0.9)),
+                'MAX_ITER': iterations,
+                'WARMUP_FACTOR': 1.0 / 1000,
+                'WARMUP_ITERS': 1000,
+                'WARMUP_METHOD': 'linear',
+                'GAMMA': 0.1,
+                'MOMENTUM': 0.9,
+                'WEIGHT_DECAY': 0.0001,
+                'WEIGHT_DECAY_NORM': 0.0,
+                'CHECKPOINT_PERIOD': 5000,
+                'AMP': {
+                    'ENABLED': True
+                }
+            },
+            'INPUT': {
+                'MIN_SIZE_TRAIN': (640, 672, 704, 736, 768, 800),
+                'MAX_SIZE_TRAIN': 1333,
+                'MIN_SIZE_TEST': 800,
+                'MAX_SIZE_TEST': 1333,
+                'FORMAT': 'BGR',
+            },
+            'TEST': {
+                'EVAL_PERIOD': 5000,
+                'DETECTIONS_PER_IMAGE': 100,
+            },
+            'OUTPUT_DIR': f'./output/{arch}_{datetime.now().strftime("%Y%m%d_%H%M%S")}',
+        }
+
+        # Add mask head for instance segmentation
+        if 'mask' in arch.lower():
+            config['MODEL']['MASK_ON'] = True
+            config['MODEL']['ROI_MASK_HEAD'] = {
+                'POOLER_RESOLUTION': 14,
+                'POOLER_SAMPLING_RATIO': 0,
+                'POOLER_TYPE': 'ROIAlignV2'
+            }
+
+        config.update(kwargs)
+        config['_metadata'] = {
+            'architecture': arch,
+            'arch_info': arch_info,
+            'task': self.task,
+            'framework': 'detectron2',
+            'generated_at': datetime.now().isoformat()
+        }
+
+        self.config = config
+        return config
+
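+    # Sketch of the assumed Detectron2 workflow (arch name from the table in
+    # the skill doc; output path is hypothetical):
+    #   trainer = VisionModelTrainer('data/coco', framework='detectron2')
+    #   cfg = trainer.generate_detectron2_config('faster_rcnn_R_50_FPN', epochs=24)
+    #   trainer.save_config('configs/faster_rcnn.py')
+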
+    def generate_mmdetection_config(self, arch: str, epochs: int = 12,
+                                    batch: int = 16, **kwargs) -> Dict[str, Any]:
+        """Generate MMDetection training configuration."""
+        if arch not in MMDETECTION_ARCHITECTURES:
+            available = ', '.join(MMDETECTION_ARCHITECTURES.keys())
+            raise ValueError(f"Unknown architecture: {arch}. Available: {available}")
+
+        arch_info = MMDETECTION_ARCHITECTURES[arch]
+
+        config = {
+            '_base_': [
+                f'../_base_/models/{arch}.py',
+                '../_base_/datasets/coco_detection.py',
+                '../_base_/schedules/schedule_1x.py',
+                '../_base_/default_runtime.py'
+            ],
+            'model': {
+                'roi_head': {
+                    'bbox_head': {
+                        'num_classes': len(self._get_classes())
+                    }
+                }
+            },
+            'data': {
+                'samples_per_gpu': batch // 2,
+                'workers_per_gpu': 4,
+                'train': {
+                    'type': 'CocoDataset',
+                    'ann_file': str(self.data_dir / 'annotations' / 'train.json'),
+                    'img_prefix': str(self.data_dir / 'images' / 'train'),
+                },
+                'val': {
+                    'type': 'CocoDataset',
+                    'ann_file': str(self.data_dir / 'annotations' / 'val.json'),
+                    'img_prefix': str(self.data_dir / 'images' / 'val'),
+                },
+                'test': {
+                    'type': 'CocoDataset',
+                    'ann_file': str(self.data_dir / 'annotations' / 'val.json'),
+                    'img_prefix': str(self.data_dir / 'images' / 'val'),
+                }
+            },
+            'optimizer': {
+                'type': 'SGD',
+                'lr': 0.02,
+                'momentum': 0.9,
+                'weight_decay': 0.0001
+            },
+            'optimizer_config': {
+                'grad_clip': {'max_norm': 35, 'norm_type': 2}
+            },
+            'lr_config': {
+                'policy': 'step',
+                'warmup': 'linear',
+                'warmup_iters': 500,
+                'warmup_ratio': 0.001,
+                'step': [int(epochs * 0.7), int(epochs * 0.9)]
+            },
+            'runner': {
+                'type': 'EpochBasedRunner',
+                'max_epochs': epochs
+            },
+            'checkpoint_config': {
+                'interval': 1
+            },
+            'log_config': {
+                'interval': 50,
+                'hooks': [
+                    {'type': 'TextLoggerHook'},
+                    {'type': 'TensorboardLoggerHook'}
+                ]
+            },
+            'work_dir': f'./work_dirs/{arch}_{datetime.now().strftime("%Y%m%d_%H%M%S")}',
+            'load_from': None,
+            'resume_from': None,
+            'fp16': {'loss_scale': 512.0}
+        }
+
+        config.update(kwargs)
+        config['_metadata'] = {
+            'architecture': arch,
+            'arch_info': arch_info,
+            'task': self.task,
+            'framework': 'mmdetection',
+            'generated_at': datetime.now().isoformat()
+        }
+
+        self.config = config
+        return config
+
+    def _get_classes(self) -> List[str]:
+        """Get class names from dataset."""
+        analysis = self.analyze_dataset()
+        classes = analysis['annotations']['classes']
+        if not classes:
+            classes = ['object']  # Default fallback
+        return classes
+
+    def save_config(self, output_path: str) -> str:
+        """Save configuration to file."""
+        output_path = Path(output_path)
+        output_path.parent.mkdir(parents=True, exist_ok=True)
+
+        if self.framework == 'ultralytics':
+            # YOLO uses YAML
+            import yaml
+            with open(output_path, 'w') as f:
+                yaml.dump(self.config, f, default_flow_style=False, sort_keys=False)
+        else:
+            # Detectron2 and MMDetection use Python configs
+            with open(output_path, 'w') as f:
+                f.write("# Auto-generated configuration\n")
+                f.write(f"# Generated at: {datetime.now().isoformat()}\n\n")
+                f.write(f"config = {json.dumps(self.config, indent=2)}\n")
+
+        logger.info(f"Configuration saved to {output_path}")
+        return str(output_path)
+
+    def generate_training_command(self) -> str:
+        """Generate the training command for the framework."""
+        if self.framework == 'ultralytics':
+            return f"yolo detect train data={self.config.get('data', 'data.yaml')} " \
+                   f"model={self.config.get('model', 'yolov8m.pt')} " \
+                   f"epochs={self.config.get('epochs', 100)} " \
+                   f"imgsz={self.config.get('imgsz', 640)}"
+        elif self.framework == 'detectron2':
+            return "python train_net.py --config-file config.yaml --num-gpus 1"
+        elif self.framework == 'mmdetection':
+            return "python tools/train.py config.py"
+        return ""
+
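+    # Illustrative output of generate_training_command() after generating a
+    # default Ultralytics config (dataset path is hypothetical):
+    #   yolo detect train data=data/coco/data.yaml model=yolov8m.pt epochs=100 imgsz=640
+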
print("TRAINING CONFIGURATION SUMMARY") + print("=" * 60) + print(f"Framework: {meta.get('framework', 'unknown')}") + print(f"Architecture: {meta.get('architecture', 'unknown')}") + print(f"Task: {meta.get('task', 'detection')}") + + if 'arch_info' in meta: + info = meta['arch_info'] + if 'params' in info: + print(f"Parameters: {info['params']}") + if 'map' in info: + print(f"COCO mAP: {info['map']}") + + print("-" * 60) + print("Training Command:") + print(f" {self.generate_training_command()}") + print("=" * 60 + "\n") + def main(): - """Main entry point""" parser = argparse.ArgumentParser( - description="Vision Model Trainer" + description="Generate vision model training configurations" ) - parser.add_argument('--input', '-i', required=True, help='Input path') - parser.add_argument('--output', '-o', required=True, help='Output path') - parser.add_argument('--config', '-c', help='Configuration file') - parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output') - + parser.add_argument('data_dir', help='Path to dataset directory') + parser.add_argument('--task', choices=['detection', 'segmentation'], + default='detection', help='Task type') + parser.add_argument('--framework', choices=['ultralytics', 'detectron2', 'mmdetection'], + default='ultralytics', help='Training framework') + parser.add_argument('--arch', default='yolov8m', + help='Model architecture') + parser.add_argument('--epochs', type=int, default=100, help='Training epochs') + parser.add_argument('--batch', type=int, default=16, help='Batch size') + parser.add_argument('--imgsz', type=int, default=640, help='Image size') + parser.add_argument('--output', '-o', help='Output config file path') + parser.add_argument('--analyze-only', action='store_true', + help='Only analyze dataset, do not generate config') + parser.add_argument('--json', action='store_true', + help='Output as JSON') + args = parser.parse_args() - - if args.verbose: - logging.getLogger().setLevel(logging.DEBUG) - + + trainer = VisionModelTrainer( + data_dir=args.data_dir, + task=args.task, + framework=args.framework + ) + + # Analyze dataset + analysis = trainer.analyze_dataset() + + if args.analyze_only: + if args.json: + print(json.dumps(analysis, indent=2)) + else: + print("\nDataset Analysis:") + print(f" Path: {analysis['path']}") + print(f" Format: {analysis['annotations']['format']}") + print(f" Classes: {len(analysis['annotations']['classes'])}") + print(f" Images - Train: {analysis['images']['train']}, " + f"Val: {analysis['images']['val']}, " + f"Test: {analysis['images']['test']}") + if analysis['recommendations']: + print("\nRecommendations:") + for rec in analysis['recommendations']: + print(f" - {rec}") + return + + # Generate configuration try: - config = { - 'input': args.input, - 'output': args.output - } - - processor = VisionModelTrainer(config) - results = processor.process() - - print(json.dumps(results, indent=2)) - sys.exit(0) - - except Exception as e: - logger.error(f"Fatal error: {e}") + if args.framework == 'ultralytics': + config = trainer.generate_yolo_config( + arch=args.arch, + epochs=args.epochs, + batch=args.batch, + imgsz=args.imgsz + ) + elif args.framework == 'detectron2': + config = trainer.generate_detectron2_config( + arch=args.arch, + epochs=args.epochs, + batch=args.batch + ) + elif args.framework == 'mmdetection': + config = trainer.generate_mmdetection_config( + arch=args.arch, + epochs=args.epochs, + batch=args.batch + ) + except ValueError as e: + logger.error(str(e)) sys.exit(1) + # Output + 
+    # Output
+    if args.json:
+        print(json.dumps(config, indent=2))
+    else:
+        trainer.print_summary()
+
+    if args.output:
+        trainer.save_config(args.output)
+
+
 if __name__ == '__main__':
     main()
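+
+# Usage sketches built from the argparse flags above (paths are hypothetical):
+#   python scripts/vision_model_trainer.py data/coco/ --analyze-only --json
+#   python scripts/vision_model_trainer.py data/coco/ --framework ultralytics \
+#       --arch yolov8m --epochs 100 --batch 16 --imgsz 640 \
+#       --output configs/yolov8m.yaml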