feat: add three-layer PII defense system (pre-commit + gitleaks + CLAUDE.md)
Prevents sensitive data (user paths, phone numbers, personal IDs) from entering git history. Born from redacting 6 historical commits.

- .gitleaks.toml: custom rules for absolute paths, phone numbers, usernames
- .githooks/pre-commit: dual-layer scan (gitleaks + regex fallback)
- CLAUDE.md: updated Privacy section documenting the defense system

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
53
.githooks/pre-commit
Executable file
@@ -0,0 +1,53 @@
#!/bin/bash
# Pre-commit hook: scan staged changes for sensitive data
# Install: git config core.hooksPath .githooks

set -euo pipefail

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

echo "🔍 Scanning staged changes for sensitive data..."

FAILED=0

# Layer 1: gitleaks (if available)
if command -v gitleaks &>/dev/null; then
    if ! gitleaks protect --staged --config .gitleaks.toml --no-banner 2>/dev/null; then
        echo -e "${RED}❌ gitleaks found secrets in staged changes${NC}"
        FAILED=1
    fi
else
    echo -e "${YELLOW}⚠ gitleaks not installed (brew install gitleaks), falling back to pattern scan${NC}"
fi

# Layer 2: fast regex scan (always runs, catches what gitleaks config might miss)
STAGED_DIFF=$(git diff --cached --diff-filter=ACDMR)

PATTERNS=(
    '/Users/[a-zA-Z][a-zA-Z0-9_-]+/'
    '/home/[a-zA-Z][a-zA-Z0-9_-]+/'
    'C:\\Users\\[a-zA-Z]'
    'songtiansheng'
    'tiansheng'
    '15366[0-9]+'
)

for pattern in "${PATTERNS[@]}"; do
    MATCHES=$(echo "$STAGED_DIFF" | grep -nE "^\+" | grep -E "$pattern" | grep -v "^+++\|\.gitleaks\.toml\|\.githooks/\|\.gitignore\|placeholder\|example\|CLAUDE\.md" || true)
    if [ -n "$MATCHES" ]; then
        echo -e "${RED}❌ Found sensitive pattern '${pattern}':${NC}"
        echo "$MATCHES" | head -5
        FAILED=1
    fi
done

if [ $FAILED -eq 1 ]; then
    echo ""
    echo -e "${RED}Commit blocked. Fix the issues above, or use --no-verify to bypass (not recommended).${NC}"
    exit 1
fi

echo -e "${GREEN}✅ No sensitive data found in staged changes.${NC}"
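
The Layer 2 loop applies its exclusions with `grep -v` on individual diff lines, so it skips only lines that themselves contain a string such as `.githooks/` or `example`, not whole files. A minimal sketch of one alternative, assuming file-level exclusion is what is wanted: let git drop the allowlisted files through `:(exclude)` pathspecs, then scan only the added lines. The helper names `scan_added_lines` and `staged_diff_excluding_allowlist` are hypothetical and not part of this commit.

```shell
#!/bin/bash
# Sketch only: helper names below are hypothetical, not part of the commit.

# Read a unified diff on stdin and print the added lines that match an
# extended regex. File headers ("+++ b/path") are dropped first.
scan_added_lines() {
    local pattern=$1
    grep -E '^\+' | grep -vE '^\+\+\+ ' | grep -E -- "$pattern" || true
}

# Staged diff with the allowlisted files removed by git itself, using
# ':(exclude)' pathspecs instead of filtering individual lines.
staged_diff_excluding_allowlist() {
    git diff --cached --diff-filter=ACDMR -- . \
        ':(exclude).gitleaks.toml' \
        ':(exclude).githooks' \
        ':(exclude)CLAUDE.md'
}

# Intended use inside the hook (run from the repository root):
#   staged_diff_excluding_allowlist | scan_added_lines '/Users/[a-zA-Z][a-zA-Z0-9_-]+/'
```

Either version of the hook runs only in clones where `git config core.hooksPath .githooks` has been set, as the install comment at the top of the script notes.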
53
.gitleaks.toml
Normal file
@@ -0,0 +1,53 @@
# Gitleaks custom rules for claude-code-skills repo
# Catches personal info that shouldn't be in an open source repo

title = "claude-code-skills sensitive data rules"

[extend]
useDefault = true

# Global allowlist: files that are allowed to contain patterns
# (the config file itself, hooks, and contribution guides)
[allowlist]
paths = [
    '''\.gitleaks\.toml$''',
    '''\.githooks/''',
    '''CONTRIBUTING\.md$''',
    '''CLAUDE\.md$''',
]

[[rules]]
id = "absolute-user-path-macos"
description = "Hardcoded macOS user home directory path"
regex = '''/Users/[a-zA-Z][a-zA-Z0-9_-]+/'''
tags = ["pii", "path"]

[[rules]]
id = "absolute-user-path-linux"
description = "Hardcoded Linux home directory path"
regex = '''/home/[a-zA-Z][a-zA-Z0-9_-]+/'''
tags = ["pii", "path"]

[[rules]]
id = "windows-user-path"
description = "Hardcoded Windows user profile path"
regex = '''C:\\Users\\[a-zA-Z][a-zA-Z0-9_-]+\\'''
tags = ["pii", "path"]

[[rules]]
id = "phone-number-cn"
description = "Chinese mobile phone number"
regex = '''1[3-9]\d{9}'''
tags = ["pii", "phone"]

[[rules]]
id = "douban-user-id-literal"
description = "Hardcoded Douban user ID"
regex = '''songtiansheng'''
tags = ["pii", "username"]

[[rules]]
id = "email-personal"
description = "Personal email address"
regex = '''[a-zA-Z0-9._%+-]+@(gmail|qq|163|126|outlook|hotmail|yahoo|icloud|foxmail)\.[a-zA-Z]{2,}'''
tags = ["pii", "email"]
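
The `phone-number-cn` rule has no digit boundaries, so `1[3-9]\d{9}` also matches the first 11 digits of a longer number such as a millisecond timestamp. A minimal sketch for checking this outside gitleaks, assuming POSIX extended regex (`[0-9]` in place of `\d`, explicit non-digit guards on both sides). The function names are hypothetical, and whether a stricter pattern belongs in `.gitleaks.toml` is a judgment call for the repo.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Same shape as the rule in .gitleaks.toml: any 11-digit run starting 13-19.
matches_loose() {
    printf '%s\n' "$1" | grep -Eq '1[3-9][0-9]{9}'
}

# Stricter variant: the 11 digits must not touch another digit on either side.
matches_bounded() {
    printf '%s\n' "$1" | grep -Eq '(^|[^0-9])1[3-9][0-9]{9}([^0-9]|$)'
}

# Example checks (placeholder values, not real numbers):
#   matches_loose   'ts=1700000000000'   succeeds (false positive on a timestamp)
#   matches_bounded 'ts=1700000000000'   fails
#   matches_bounded 'tel: 13800000000'   succeeds
```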
13
CLAUDE.md
@@ -115,13 +115,22 @@ description: Clear description with activation triggers. This skill should be us
 ---
 ```

-### Privacy and Path Guidelines
+### Privacy and Path Guidelines (Enforced by Pre-commit Hook)

 Skills for public distribution must NOT contain:
 - Absolute paths to user directories (`/home/username/`, `/Users/username/`)
 - Personal usernames, company names, product names
 - Phone numbers, personal email addresses
 - OneDrive paths or environment-specific absolute paths
-- Use relative paths within skill bundle or standard placeholders
+- Use relative paths within skill bundle or standard placeholders (`~/workspace/`, `<user_id>`)

+**Three-layer defense system:**
+1. **CLAUDE.md rules** (this section) — Claude avoids generating sensitive content
+2. **Pre-commit hook** (`.githooks/pre-commit`) — blocks commits with sensitive patterns
+3. **gitleaks** (`.gitleaks.toml`) — deep scan with custom rules for this repo

+The pre-commit hook is auto-activated via `git config core.hooksPath .githooks`.
+If it fires, fix the issue — do NOT use `--no-verify` to bypass.

 ### Content Organization