Files
antigravity-skills-reference/plugins/antigravity-awesome-skills-claude/skills/007/references/incident-playbooks.md

395 lines
15 KiB
Markdown

# Incident Response Playbooks
> Extended playbooks for common security incidents.
> Each follows 5 phases: Contain, Assess, Remediate, Prevent, Document.
> Use with `007 incident` or when responding to any security event.
---
## Playbook 1: Data Breach
**Severity:** CRITICAL
**Response Time:** Immediate (< 15 minutes to begin containment)
### Phase 1: Contain
- [ ] Identify the source of the breach (compromised credential, vulnerability, insider)
- [ ] Revoke compromised credentials immediately (API keys, tokens, passwords)
- [ ] Isolate affected systems from the network
- [ ] Block the attacker's IP/access path if identifiable
- [ ] Preserve forensic evidence (do NOT wipe or restart affected systems yet)
### Phase 2: Assess
- [ ] Determine what data was exposed (PII, financial, credentials, business data)
- [ ] Determine scope: how many users/records affected
- [ ] Identify the attack timeline (when it started, when it was detected)
- [ ] Review access logs to trace the attacker's actions
- [ ] Assess if data was exfiltrated or only accessed
### Phase 3: Remediate
- [ ] Patch the vulnerability that was exploited
- [ ] Force password reset for all affected users
- [ ] Rotate all potentially compromised secrets (API keys, DB passwords, certificates)
- [ ] Clean malware/backdoors if installed
- [ ] Restore from clean backups if data was tampered with
### Phase 4: Prevent
- [ ] Implement missing access controls identified during the breach
- [ ] Add monitoring for the attack pattern used
- [ ] Enable encryption at rest for exposed data stores
- [ ] Implement DLP (Data Loss Prevention) rules
- [ ] Review and restrict access permissions (least privilege)
### Phase 5: Document
- [ ] Complete incident timeline with timestamps
- [ ] Root cause analysis (RCA)
- [ ] List of all affected systems and data
- [ ] Actions taken and by whom
- [ ] Regulatory notifications (LGPD: 72 hours, GDPR: 72 hours)
- [ ] User notification if PII was exposed
- [ ] Lessons learned and process improvements
### Communication Template
```
SUBJECT: [CRITICAL] Security Incident - Data Breach Detected
STATUS: Active incident as of {timestamp}
SEVERITY: CRITICAL
INCIDENT ID: INC-{YYYY}-{NNN}
SUMMARY:
A data breach affecting {scope} has been detected. The breach involves
{type of data} from {source system}.
CURRENT ACTIONS:
- Compromised access has been revoked
- Affected systems are isolated
- Investigation is in progress
AFFECTED DATA:
- Type: {PII / financial / credentials / business}
- Records: {approximate count}
- Users: {approximate count}
NEXT STEPS:
- Complete forensic analysis by {ETA}
- Regulatory notification by {deadline}
- User communication by {deadline}
CONTACT: {incident commander} at {contact info}
```
---
## Playbook 2: DDoS / DoS
**Severity:** HIGH
**Response Time:** < 5 minutes to begin mitigation
### Phase 1: Contain
- [ ] Confirm it is an attack (not a legitimate traffic spike)
- [ ] Activate CDN/WAF DDoS protection (Cloudflare Under Attack Mode, AWS Shield, etc.)
- [ ] Enable rate limiting emergency mode (aggressive thresholds)
- [ ] Block obvious attack source IPs/ranges at the edge
- [ ] Scale infrastructure if possible (auto-scaling groups)
- [ ] Enable geo-blocking if attack originates from specific regions
### Phase 2: Assess
- [ ] Identify attack type (volumetric, protocol, application layer)
- [ ] Identify attack source patterns (IP ranges, user agents, request patterns)
- [ ] Measure impact on service availability and user experience
- [ ] Check if DDoS is a distraction for another attack (data breach, etc.)
- [ ] Review resource utilization (CPU, memory, bandwidth, connections)
### Phase 3: Remediate
- [ ] Implement targeted blocking rules based on attack patterns
- [ ] Optimize application to handle increased load (caching, static responses)
- [ ] Contact ISP/hosting provider for upstream filtering if needed
- [ ] Move critical services behind additional protection layers
- [ ] Gradually relax emergency protections as attack subsides
### Phase 4: Prevent
- [ ] Implement permanent rate limiting with appropriate thresholds
- [ ] Deploy CDN with DDoS protection for all public endpoints
- [ ] Set up auto-scaling with cost limits
- [ ] Create runbooks for common DDoS patterns
- [ ] Implement challenge-based protection (CAPTCHA) for sensitive endpoints
### Phase 5: Document
- [ ] Attack timeline, peak traffic volume, duration
- [ ] Attack type and source characteristics
- [ ] Service impact (downtime, degraded performance, affected users)
- [ ] Mitigation actions and effectiveness
- [ ] Cost impact (infrastructure, lost revenue)
- [ ] Recommendations for improved resilience
---
## Playbook 3: Ransomware
**Severity:** CRITICAL
**Response Time:** Immediate (< 10 minutes to isolate)
### Phase 1: Contain
- [ ] IMMEDIATELY disconnect affected systems from network (pull cable, disable WiFi)
- [ ] Do NOT power off systems (preserves forensic evidence in memory)
- [ ] Identify patient zero (first infected system)
- [ ] Block lateral movement (disable SMB, RDP between segments)
- [ ] Isolate backup systems to prevent encryption
- [ ] Alert all employees to disconnect suspicious systems
### Phase 2: Assess
- [ ] Identify the ransomware variant (check ransom note, file extensions)
- [ ] Determine scope: which systems and data are encrypted
- [ ] Check if backups are intact and uncompromised
- [ ] Assess if data was exfiltrated before encryption (double extortion)
- [ ] Check for decryption tools (NoMoreRansom.org)
- [ ] Determine entry point (phishing email, RDP brute force, vulnerable software)
### Phase 3: Remediate
- [ ] If clean backups exist: wipe and restore from backup
- [ ] If no backups: evaluate decryption options (public tools, negotiation as last resort)
- [ ] Patch the vulnerability that was exploited
- [ ] Remove all persistence mechanisms (scheduled tasks, registry keys, services)
- [ ] Scan all systems for remaining malware before reconnecting
- [ ] Change ALL passwords (domain admin first, then all users)
### Phase 4: Prevent
- [ ] Implement network segmentation
- [ ] Deploy EDR (Endpoint Detection and Response) on all systems
- [ ] Disable SMB v1, restrict RDP access
- [ ] Implement 3-2-1 backup strategy (3 copies, 2 media types, 1 offsite)
- [ ] Air-gapped or immutable backup storage
- [ ] Regular backup restoration tests
- [ ] Employee phishing awareness training
### Phase 5: Document
- [ ] Complete attack timeline
- [ ] Entry point and propagation method
- [ ] Data impact (encrypted, exfiltrated, lost)
- [ ] Recovery method and time to recovery
- [ ] Financial impact (ransom demand, downtime cost, recovery cost)
- [ ] Law enforcement report (recommended)
---
## Playbook 4: Supply Chain Compromise
**Severity:** CRITICAL
**Response Time:** < 30 minutes to assess, < 2 hours to contain
### Phase 1: Contain
- [ ] Identify the compromised dependency/package/vendor
- [ ] Pin to last known good version immediately
- [ ] Block outbound connections from affected systems to unknown IPs
- [ ] Audit all systems using the compromised component
- [ ] Halt all deployments until assessment is complete
- [ ] Check if compromised code was executed in production
### Phase 2: Assess
- [ ] Determine what the malicious code does (data exfiltration, backdoor, crypto-miner)
- [ ] Identify affected versions and timeline of compromise
- [ ] Check package manager advisories (npm, PyPI, Maven security advisories)
- [ ] Review build logs for when compromised version was first introduced
- [ ] Scan all artifacts built with the compromised dependency
- [ ] Check if secrets/credentials were exposed to the malicious code
### Phase 3: Remediate
- [ ] Update to patched version or remove dependency
- [ ] Rotate all secrets that could have been accessed
- [ ] Rebuild and redeploy all affected services from clean sources
- [ ] Scan all systems for backdoors or persistence mechanisms
- [ ] Audit build pipeline for additional compromises
### Phase 4: Prevent
- [ ] Implement dependency pinning with lock files
- [ ] Enable integrity checking (checksums, signatures)
- [ ] Set up automated vulnerability scanning (Dependabot, Snyk, pip-audit)
- [ ] Use private package registries with approved packages
- [ ] Implement SBOM (Software Bill of Materials)
- [ ] Code review for dependency updates
- [ ] Monitor for typosquatting attacks on your dependencies
### Phase 5: Document
- [ ] Compromised component, versions, and timeline
- [ ] Impact assessment (systems affected, data exposed)
- [ ] Detection method (how was it discovered)
- [ ] Remediation actions and verification
- [ ] Supply chain security improvements implemented
---
## Playbook 5: Insider Threat
**Severity:** HIGH to CRITICAL
**Response Time:** < 1 hour (balance speed with discretion)
### Phase 1: Contain
- [ ] Do NOT alert the suspected insider yet
- [ ] Engage HR and legal before technical actions
- [ ] Increase monitoring on the suspected account (audit logging)
- [ ] Restrict access to most sensitive systems without raising suspicion
- [ ] Preserve all evidence (logs, emails, file access records)
- [ ] Secure backup copies of evidence
### Phase 2: Assess
- [ ] Review access logs for unusual patterns (off-hours access, bulk downloads)
- [ ] Check for unauthorized data transfers (USB, email, cloud storage)
- [ ] Review code changes for backdoors or unauthorized modifications
- [ ] Assess what data/systems the insider has access to
- [ ] Determine if the threat is malicious or negligent
- [ ] Involve digital forensics if warranted
### Phase 3: Remediate
- [ ] Coordinate with HR/legal for appropriate action
- [ ] Revoke all access immediately when action is taken
- [ ] Change shared credentials the insider had access to
- [ ] Review and revoke any API keys/tokens created by the insider
- [ ] Audit code changes made by the insider in the last N months
- [ ] Check for scheduled tasks, cron jobs, or time bombs
### Phase 4: Prevent
- [ ] Implement Data Loss Prevention (DLP) tools
- [ ] Enforce least-privilege access across the organization
- [ ] Regular access reviews (quarterly minimum)
- [ ] Implement user behavior analytics (UBA)
- [ ] Offboarding checklist with comprehensive access revocation
- [ ] Background checks for roles with sensitive access
### Phase 5: Document
- [ ] Complete timeline of insider actions
- [ ] Data/systems accessed or compromised
- [ ] Evidence collected and chain of custody
- [ ] HR/legal actions taken
- [ ] Access control improvements implemented
---
## Playbook 6: Credential Stuffing
**Severity:** HIGH
**Response Time:** < 30 minutes to begin mitigation
### Phase 1: Contain
- [ ] Detect the attack (spike in failed logins, multiple accounts from same IPs)
- [ ] Enable aggressive rate limiting on login endpoints
- [ ] Block attacking IP ranges at WAF/CDN level
- [ ] Enable CAPTCHA on login forms
- [ ] Temporarily lock accounts with multiple failed attempts
### Phase 2: Assess
- [ ] Determine how many accounts were successfully compromised
- [ ] Identify the source of credential lists (check haveibeenpwned.com)
- [ ] Review compromised accounts for unauthorized actions
- [ ] Check if attackers accessed sensitive data or made changes
- [ ] Assess financial impact (fraudulent transactions, data access)
### Phase 3: Remediate
- [ ] Force password reset on all compromised accounts
- [ ] Notify affected users with guidance to use unique passwords
- [ ] Reverse any unauthorized actions (transactions, settings changes)
- [ ] Block known compromised credential pairs
- [ ] Invalidate all active sessions for affected accounts
### Phase 4: Prevent
- [ ] Implement MFA (multi-factor authentication), push to all users
- [ ] Deploy credential stuffing detection (rate + pattern analysis)
- [ ] Check passwords against breach databases on registration/change
- [ ] Implement progressive delays on failed login attempts
- [ ] Device fingerprinting and anomaly detection
- [ ] Bot detection on authentication endpoints
### Phase 5: Document
- [ ] Attack timeline, volume, and success rate
- [ ] Number of compromised accounts and impact
- [ ] Source IP analysis
- [ ] Detection method and time to detection
- [ ] User communications sent
- [ ] Authentication security improvements
---
## Playbook 7: API Abuse
**Severity:** MEDIUM to HIGH
**Response Time:** < 1 hour
### Phase 1: Contain
- [ ] Identify the abusive client (API key, IP, user account)
- [ ] Rate limit or throttle the abusive client specifically
- [ ] If data scraping: block the client and return generic errors
- [ ] If financial abuse: freeze the account pending review
- [ ] Preserve request logs for analysis
### Phase 2: Assess
- [ ] Determine the type of abuse (scraping, brute force, fraud, free tier abuse)
- [ ] Quantify the impact (cost, data exposed, service degradation)
- [ ] Review if the abuse exploited a legitimate API or a vulnerability
- [ ] Check ToS violations
- [ ] Determine if automated (bot) or manual
### Phase 3: Remediate
- [ ] Revoke the abusive client's API keys
- [ ] Block abusive patterns (specific endpoints, request signatures)
- [ ] If vulnerability-based: patch the vulnerability
- [ ] If scraping: implement anti-bot measures
- [ ] If fraud: reverse fraudulent transactions, report to legal
### Phase 4: Prevent
- [ ] Implement per-client rate limiting with appropriate tiers
- [ ] Add request cost tracking (weighted rate limiting for expensive endpoints)
- [ ] Deploy bot detection (fingerprinting, behavior analysis)
- [ ] Implement API usage quotas and billing
- [ ] Add anomaly detection on API usage patterns
- [ ] Review API design for abuse vectors (pagination, filtering, bulk endpoints)
### Phase 5: Document
- [ ] Abuse type, method, and timeline
- [ ] Impact (financial, data, service)
- [ ] Client identification and evidence
- [ ] Actions taken (blocking, revocation)
- [ ] API security improvements implemented
---
## General Communication Template
Use this template for any incident type:
```
SUBJECT: [{SEVERITY}] Security Incident - {Brief Description}
STATUS: {Active / Contained / Resolved}
SEVERITY: {CRITICAL / HIGH / MEDIUM / LOW}
INCIDENT ID: INC-{YYYY}-{NNN}
DETECTED: {timestamp}
INCIDENT COMMANDER: {name}
SUMMARY:
{2-3 sentences describing what happened, what is affected, and current status.}
IMPACT:
- Systems: {affected systems}
- Data: {type of data affected, approximate scope}
- Users: {number of affected users}
- Business: {business impact description}
CURRENT STATUS:
- Phase: {Contain / Assess / Remediate / Prevent / Document}
- Actions completed: {list}
- Actions in progress: {list}
NEXT UPDATE: {timestamp for next status update}
CONTACT: {incident commander contact}
```
---
## Severity Classification Reference
| Severity | Examples | Response Time | Escalation |
|----------|---------|---------------|------------|
| **CRITICAL** | Data breach, ransomware, active exploitation | < 15 min | Immediate: CEO, CTO, Legal |
| **HIGH** | DDoS, credential stuffing, supply chain compromise | < 30 min | Within 1 hour: CTO, Engineering Lead |
| **MEDIUM** | API abuse, single account compromise, non-critical vuln exploited | < 2 hours | Within 4 hours: Engineering Lead |
| **LOW** | Failed attack attempt, minor misconfiguration found | < 24 hours | Next business day: Team Lead |