feat(pdf-creator): add theme system + Chrome backend; add terraform-skill draft

- pdf-creator v1.2.0: theme system (default/warm-terra), dual backend
  (weasyprint/chrome auto-detect), argparse CLI, extracted CSS to themes/
- terraform-skill: operational traps from real deployments (provisioner
  timing, DNS duplication, multi-env isolation, pre-deploy validation)
- asr-transcribe-to-text: add security scan marker

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: daymade
Date: 2026-04-02 23:33:03 +08:00
Commit: 87221d94d5 (parent: b9facf3516)
10 changed files with 1091 additions and 207 deletions


@@ -414,15 +414,17 @@
},
{
    "name": "pdf-creator",
    "description": "Create PDF documents from markdown with Chinese font support. Supports theme system (default for formal docs, warm-terra for training materials) and dual backend (weasyprint or Chrome). Triggers include convert to PDF, generate PDF, markdown to PDF, or printable documents",
    "source": "./",
    "strict": false,
    "version": "1.2.0",
    "category": "document-conversion",
    "keywords": [
        "pdf",
        "markdown",
        "weasyprint",
        "chrome",
        "themes",
        "chinese-fonts",
        "document-generation",
        "legal",


@@ -0,0 +1,4 @@
Security scan passed
Scanned at: 2026-03-22T23:58:59.508059
Tool: gitleaks + pattern-based validation
Content hash: 5dbbc175b8bfd6c6c8ab97f4112bf1f48eb489ef2a55375f88266deade644cd4


@@ -1,55 +1,60 @@
---
name: pdf-creator
description: Create PDF documents from markdown with proper Chinese font support. Supports theme system (default for formal docs, warm-terra for training materials) and dual backend (weasyprint or Chrome). Triggers include "convert to PDF", "generate PDF", "markdown to PDF", or any request for creating printable documents.
---

# PDF Creator

Create professional PDF documents from markdown with Chinese font support and theme system.

## Quick Start

```bash
# Default theme (formal: Songti SC + black/grey)
uv run --with weasyprint scripts/md_to_pdf.py input.md output.pdf

# Warm theme (training: PingFang SC + terra cotta)
uv run --with weasyprint scripts/md_to_pdf.py input.md --theme warm-terra

# No weasyprint? Use Chrome backend (auto-detected if weasyprint unavailable)
python scripts/md_to_pdf.py input.md --theme warm-terra --backend chrome

# List available themes
python scripts/md_to_pdf.py --list-themes dummy.md
```
## Themes

Stored in `themes/*.css`. Each theme is a standalone CSS file.

| Theme | Font | Color | Best for |
|-------|------|-------|----------|
| `default` | Songti SC + Heiti SC | Black/grey | Legal docs, contracts, formal reports |
| `warm-terra` | PingFang SC | Terra cotta (#d97756) + warm neutrals | Course outlines, training materials, workshops |

To create a new theme: copy `themes/default.css`, modify it, and save it as `themes/your-theme.css`.
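As a concrete sketch of that workflow (the `navy` theme name and the stub CSS are invented for illustration; in the real skill the directory is `themes/` next to `scripts/`):

```shell
# Work in /tmp so the sketch is self-contained
mkdir -p /tmp/pdf-themes
printf 'body { font-family: sans-serif; }\n' > /tmp/pdf-themes/default.css  # stand-in for the real default.css
cp /tmp/pdf-themes/default.css /tmp/pdf-themes/navy.css  # the filename stem becomes the theme name
ls /tmp/pdf-themes  # md_to_pdf.py discovers *.css stems, so --theme navy would now work
```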
## Backends

The script auto-detects the best available backend:

| Backend | Install | Pros | Cons |
|---------|---------|------|------|
| `weasyprint` | `pip install weasyprint` | Precise CSS rendering, no browser needed | Requires system libs (cairo, pango) |
| `chrome` | Google Chrome installed | Zero Python deps, great CJK support | Larger binary, slightly less CSS control |

Override with `--backend chrome` or `--backend weasyprint`.

## Batch Convert

```bash
uv run --with weasyprint scripts/batch_convert.py *.md --output-dir ./pdfs
```
On macOS ARM (Homebrew), `DYLD_LIBRARY_PATH` is detected and configured automatically; no manual setup is needed.
## Font Configuration
The scripts use these Chinese fonts (with fallbacks):
| Font Type | Primary | Fallbacks |
|-----------|---------|-----------|
| Body text | Songti SC | SimSun, STSong, Noto Serif CJK SC |
| Headings | Heiti SC | SimHei, STHeiti, Noto Sans CJK SC |
## Output Specifications
- **Page size**: A4
- **Margins**: 2.5cm top/bottom, 2cm left/right
- **Body font**: 12pt, 1.8 line height
- **Max file size**: Designed to stay under 2MB for form submissions
## Common Use Cases
1. **Legal documents**: Trademark filings, contracts, evidence lists
2. **Reports**: Business reports, technical documentation
3. **Formal letters**: Official correspondence requiring print format
## Troubleshooting

**Chinese characters display as boxes**: Ensure Chinese fonts are installed (Songti SC, PingFang SC, etc.).

**weasyprint import error**: Run with `uv run --with weasyprint` or use `--backend chrome` instead.

**Chrome header/footer appearing**: The script passes `--no-pdf-header-footer`. If it still appears, your Chrome version may not support this flag; update Chrome.


@@ -1,150 +1,110 @@
#!/usr/bin/env python3
"""
Markdown to PDF converter with Chinese font support and theme system.

Converts markdown files to PDF using:
- pandoc (markdown → HTML)
- weasyprint or headless Chrome (HTML → PDF), auto-detected

Usage:
    python md_to_pdf.py input.md output.pdf
    python md_to_pdf.py input.md --theme warm-terra
    python md_to_pdf.py input.md --theme default --backend chrome
    python md_to_pdf.py input.md  # outputs input.pdf, default theme, auto backend

Themes:
    Stored in ../themes/*.css. Built-in themes:
    - default: Songti SC + black/grey, formal documents
    - warm-terra: PingFang SC + terra cotta, training/workshop materials

Requirements:
    pandoc (system install, e.g. brew install pandoc)
    weasyprint (pip install weasyprint) OR Google Chrome (for --backend chrome)
"""
from __future__ import annotations

import argparse
import os
import platform
import re
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

SCRIPT_DIR = Path(__file__).resolve().parent
THEMES_DIR = SCRIPT_DIR.parent / "themes"

# macOS ARM: auto-configure library path for weasyprint
if platform.system() == "Darwin":
    _homebrew_lib = "/opt/homebrew/lib"
    if Path(_homebrew_lib).is_dir():
        _cur = os.environ.get("DYLD_LIBRARY_PATH", "")
        if _homebrew_lib not in _cur:
            os.environ["DYLD_LIBRARY_PATH"] = (
                f"{_homebrew_lib}:{_cur}" if _cur else _homebrew_lib
            )


def _find_chrome() -> str | None:
    """Find Chrome/Chromium binary path."""
    candidates = [
        "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
        "/Applications/Chromium.app/Contents/MacOS/Chromium",
        shutil.which("google-chrome"),
        shutil.which("chromium"),
        shutil.which("chrome"),
    ]
    for c in candidates:
        if c and Path(c).exists():
            return str(c)
    return None


def _has_weasyprint() -> bool:
    """Check if weasyprint is importable."""
    try:
        import weasyprint  # noqa: F401
        return True
    except ImportError:
        return False


def _detect_backend() -> str:
    """Auto-detect best available backend: weasyprint > chrome."""
    if _has_weasyprint():
        return "weasyprint"
    if _find_chrome():
        return "chrome"
    print(
        "Error: No PDF backend found. Install weasyprint (pip install weasyprint) "
        "or Google Chrome.",
        file=sys.stderr,
    )
    sys.exit(1)


def _load_theme(theme_name: str) -> str:
    """Load CSS from themes directory."""
    theme_file = THEMES_DIR / f"{theme_name}.css"
    if not theme_file.exists():
        available = [f.stem for f in THEMES_DIR.glob("*.css")]
        print(
            f"Error: Theme '{theme_name}' not found. Available: {available}",
            file=sys.stderr,
        )
        sys.exit(1)
    return theme_file.read_text(encoding="utf-8")


def _list_themes() -> list[str]:
    """List available theme names."""
    if not THEMES_DIR.exists():
        return []
    return sorted(f.stem for f in THEMES_DIR.glob("*.css"))


def _ensure_list_spacing(text: str) -> str:
@@ -152,39 +112,36 @@ def _ensure_list_spacing(text: str) -> str:
    Both Python markdown library and pandoc require a blank line before a list
    when it follows a paragraph. Without it, list items render as plain text.
    """
    lines = text.split("\n")
    result = []
    list_re = re.compile(r"^(\s*)([-*+]|\d+\.)\s")
    for i, line in enumerate(lines):
        if i > 0 and list_re.match(line):
            prev = lines[i - 1]
            if prev.strip() and not list_re.match(prev):
                result.append("")
        result.append(line)
    return "\n".join(result)


def _md_to_html(md_file: str) -> str:
    """Convert markdown to HTML using pandoc with list spacing preprocessing."""
    if not shutil.which("pandoc"):
        print(
            "Error: pandoc not found. Install with: brew install pandoc",
            file=sys.stderr,
        )
        sys.exit(1)
    md_content = Path(md_file).read_text(encoding="utf-8")
    md_content = _ensure_list_spacing(md_content)
    result = subprocess.run(
        ["pandoc", "-f", "markdown", "-t", "html"],
        input=md_content,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        print(f"Error: pandoc failed: {result.stderr}", file=sys.stderr)
@@ -193,58 +150,152 @@ def _md_to_html(md_file: str) -> str:
    return result.stdout


def _build_full_html(html_content: str, css: str, title: str) -> str:
    """Wrap HTML content in a full document with CSS."""
    return f"""<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<title>{title}</title>
<style>{css}</style>
</head>
<body>
{html_content}
</body>
</html>"""


def _render_weasyprint(full_html: str, pdf_file: str, css: str) -> None:
    """Render PDF using weasyprint."""
    from weasyprint import CSS, HTML
    HTML(string=full_html).write_pdf(pdf_file, stylesheets=[CSS(string=css)])


def _render_chrome(full_html: str, pdf_file: str) -> None:
    """Render PDF using headless Chrome."""
    chrome = _find_chrome()
    if not chrome:
        print("Error: Chrome not found.", file=sys.stderr)
        sys.exit(1)
    with tempfile.NamedTemporaryFile(
        suffix=".html", mode="w", encoding="utf-8", delete=False
    ) as f:
        f.write(full_html)
        html_path = f.name
    try:
        result = subprocess.run(
            [
                chrome,
                "--headless",
                "--disable-gpu",
                "--no-pdf-header-footer",
                f"--print-to-pdf={pdf_file}",
                html_path,
            ],
            capture_output=True,
            text=True,
        )
        if not Path(pdf_file).exists():
            print(
                f"Error: Chrome failed to generate PDF. stderr: {result.stderr}",
                file=sys.stderr,
            )
            sys.exit(1)
    finally:
        Path(html_path).unlink(missing_ok=True)


def markdown_to_pdf(
    md_file: str,
    pdf_file: str | None = None,
    theme: str = "default",
    backend: str | None = None,
) -> str:
    """
    Convert markdown file to PDF.

    Args:
        md_file: Path to input markdown file
        pdf_file: Path to output PDF (optional, defaults to same name as input)
        theme: Theme name (from themes/ directory)
        backend: 'weasyprint', 'chrome', or None (auto-detect)

    Returns:
        Path to generated PDF file
    """
    md_path = Path(md_file)
    if pdf_file is None:
        pdf_file = str(md_path.with_suffix(".pdf"))
    if backend is None:
        backend = _detect_backend()
    css = _load_theme(theme)
    html_content = _md_to_html(md_file)
    full_html = _build_full_html(html_content, css, md_path.stem)
    if backend == "weasyprint":
        _render_weasyprint(full_html, pdf_file, css)
    elif backend == "chrome":
        _render_chrome(full_html, pdf_file)
    else:
        print(f"Error: Unknown backend '{backend}'", file=sys.stderr)
        sys.exit(1)
    size_kb = Path(pdf_file).stat().st_size / 1024
    print(f"Generated: {pdf_file} ({size_kb:.0f}KB, theme={theme}, backend={backend})")
    return pdf_file


def main():
    available_themes = _list_themes()

    parser = argparse.ArgumentParser(
        description="Markdown to PDF with Chinese font support and themes."
    )
    parser.add_argument("input", help="Input markdown file")
    parser.add_argument("output", nargs="?", help="Output PDF file (optional)")
    parser.add_argument(
        "--theme",
        default="default",
        choices=available_themes or ["default"],
        help=f"CSS theme (available: {', '.join(available_themes) or 'default'})",
    )
    parser.add_argument(
        "--backend",
        choices=["weasyprint", "chrome"],
        default=None,
        help="PDF rendering backend (default: auto-detect)",
    )
    parser.add_argument(
        "--list-themes",
        action="store_true",
        help="List available themes and exit",
    )
    args = parser.parse_args()

    if args.list_themes:
        for t in available_themes:
            marker = " (default)" if t == "default" else ""
            css_file = THEMES_DIR / f"{t}.css"
            first_line = ""
            for line in css_file.read_text().splitlines():
                line = line.strip()
                if line.startswith("*") and "—" in line:
                    first_line = line.lstrip("* ").strip()
                    break
            print(f"  {t}{marker}: {first_line}")
        sys.exit(0)

    if not Path(args.input).exists():
        print(f"Error: File not found: {args.input}", file=sys.stderr)
        sys.exit(1)

    markdown_to_pdf(args.input, args.output, args.theme, args.backend)


if __name__ == "__main__":


@@ -0,0 +1,88 @@
/*
* Default — PDF theme for formal documents
*
* Color palette: black/grey, no accent color
* Font: Songti SC (body) + Heiti SC (headings)
* Best for: legal documents, trademark filings, contracts, formal reports
*
* This is the original built-in theme from md_to_pdf.py, extracted for reference.
*/
@page {
size: A4;
margin: 2.5cm 2cm;
}
body {
font-family: 'Songti SC', 'SimSun', 'STSong', 'Noto Serif CJK SC', serif;
font-size: 12pt;
line-height: 1.8;
color: #000;
width: 100%;
}
h1 {
font-family: 'Heiti SC', 'SimHei', 'STHeiti', 'Noto Sans CJK SC', sans-serif;
font-size: 18pt;
font-weight: bold;
text-align: center;
margin-top: 0;
margin-bottom: 1.5em;
}
h2 {
font-family: 'Heiti SC', 'SimHei', 'STHeiti', 'Noto Sans CJK SC', sans-serif;
font-size: 14pt;
font-weight: bold;
margin-top: 1.5em;
margin-bottom: 0.8em;
}
h3 {
font-family: 'Heiti SC', 'SimHei', 'STHeiti', 'Noto Sans CJK SC', sans-serif;
font-size: 12pt;
font-weight: bold;
margin-top: 1em;
margin-bottom: 0.5em;
}
p {
margin: 0.8em 0;
text-align: justify;
}
ul, ol {
margin: 0.8em 0;
padding-left: 2em;
}
li {
margin: 0.4em 0;
}
table {
border-collapse: collapse;
width: 100%;
margin: 1em 0;
font-size: 10pt;
table-layout: fixed;
}
th, td {
border: 1px solid #666;
padding: 8px 6px;
text-align: left;
overflow-wrap: break-word;
word-break: normal;
}
th {
background-color: #f0f0f0;
font-weight: bold;
}
hr {
border: none;
border-top: 1px solid #ccc;
margin: 1.5em 0;
}


@@ -0,0 +1,121 @@
/*
* Warm Terra — PDF theme for workshop/training documents
*
* Color palette: terra cotta (#d97756) + warm neutrals
* Font: PingFang SC (macOS) / Microsoft YaHei (Windows)
* Best for: course outlines, training materials, workshop agendas
*
* Usage with md_to_pdf.py:
* python md_to_pdf.py input.md output.pdf --theme warm-terra
*
* Usage with pandoc + Chrome (fallback):
* pandoc input.md -o /tmp/out.html --standalone -H <(cat this-file.css wrapped in <style>)
* chrome --headless --no-pdf-header-footer --print-to-pdf=out.pdf /tmp/out.html
*/
@page {
size: A4;
margin: 12mm;
}
body {
font-family: 'PingFang SC', 'Microsoft YaHei', 'Noto Sans CJK SC', sans-serif;
max-width: 100%;
margin: 0 auto;
padding: 0 10px;
font-size: 13px;
line-height: 1.7;
color: #1f1b17;
}
h1 {
font-size: 22px;
font-weight: 800;
border-bottom: 2px solid #d97756;
padding-bottom: 8px;
margin-top: 0;
margin-bottom: 1em;
}
h2 {
font-size: 17px;
font-weight: 700;
color: #d97756;
margin-top: 24px;
margin-bottom: 0.6em;
}
h3 {
font-size: 14px;
font-weight: 700;
margin-top: 18px;
margin-bottom: 0.5em;
}
p {
margin: 0.6em 0;
}
ul, ol {
padding-left: 20px;
margin: 0.6em 0;
}
li {
margin-bottom: 3px;
word-break: break-word;
}
table {
border-collapse: collapse;
width: 100%;
margin: 10px 0;
font-size: 12px;
}
th, td {
border: 1px solid #e2d6c8;
padding: 5px 8px;
text-align: left;
white-space: nowrap;
}
/* Last column wraps (usually the description/content column) */
td:last-child {
white-space: normal;
}
th {
background: #faf5f0;
font-weight: 700;
}
blockquote {
border-left: 3px solid #d97756;
padding-left: 12px;
color: #6c6158;
margin: 10px 0;
font-size: 13px;
}
hr {
border: none;
border-top: 1px solid #e2d6c8;
margin: 16px 0;
}
/* Hide pandoc-generated header/date */
header, .date {
display: none !important;
}
code {
background: #faf5f0;
padding: 1px 4px;
border-radius: 3px;
font-size: 12px;
}
strong {
color: #1f1b17;
}

terraform-skill/SKILL.md (new file, 233 lines)

@@ -0,0 +1,233 @@
---
name: terraform-skill
description: Operational traps for Terraform provisioners, multi-environment isolation, and zero-to-deployment reliability. Covers provisioner timing races, SSH connection conflicts, DNS record duplication, volume permissions, database bootstrap gaps, snapshot cross-contamination, Cloudflare credential format errors, hardcoded domains in Caddyfiles/compose, and init-data-only-on-first-boot pitfalls. Activate when writing null_resource provisioners, creating multi-environment Terraform setups, debugging containers that are Restarting/unhealthy after terraform apply, setting up fresh instances with cloud-init, or any IaC code that SSHs into remote hosts. Also activate when the user mentions terraform plan/apply errors, provisioner failures, infrastructure drift, TLS certificate errors, or Caddy/gateway configuration.
---
# Terraform Operational Traps
Failure patterns from real deployments. Every item caused an incident. Organized as: **exact error → root cause → copy-paste fix**.
## Provisioner traps (symptom → fix)
### `docker: not found` in remote-exec
cloud-init is still installing Docker when the provisioner SSHs in.
```hcl
provisioner "remote-exec" {
inline = [
"cloud-init status --wait || true",
"which docker || { echo 'FATAL: Docker not ready'; exit 1; }",
]
}
```
### `rsync: connection unexpectedly closed` in local-exec
Terraform holds its SSH connection open; local-exec rsync opens a second one that gets rejected. Never use local-exec for file transfer to remote. Use tarball + file provisioner:
```hcl
provisioner "local-exec" {
command = "tar czf /tmp/src.tar.gz --exclude=node_modules --exclude=.git -C ${path.module}/../../.. myproject"
}
provisioner "file" {
source = "/tmp/src.tar.gz"
destination = "/tmp/src.tar.gz"
}
provisioner "remote-exec" {
inline = ["tar xzf /tmp/src.tar.gz -C /data/ && rm -f /tmp/src.tar.gz"]
}
```
macOS BSD tar: `--exclude` must come BEFORE the source argument.
### `cloud-init status` shows "running" forever
`apt-get -y` does not suppress debconf dialogs. Packages like `iptables-persistent` block on TTY prompts.
```yaml
- |
echo iptables-persistent iptables-persistent/autosave_v4 boolean true | debconf-set-selections
echo iptables-persistent iptables-persistent/autosave_v6 boolean true | debconf-set-selections
DEBIAN_FRONTEND=noninteractive apt-get install -y iptables-persistent
```
Known offenders: `iptables-persistent`, `postfix`, `mysql-server`, `wireshark-common`.
### `EACCES: permission denied` in container logs, container Restarting
Host volume dirs are root-owned; container runs as non-root (uid 1001). Fix before `docker compose up`:
```bash
mkdir -p /data/myapp/data /data/myapp/logs
chown -R 1001:1001 /data/myapp/data /data/myapp/logs
```
Find the UID: grep for `adduser.*-u` or `USER` in the Dockerfile.
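One way to pull that UID out mechanically; the Dockerfile content below is a made-up stand-in:

```shell
# Stub Dockerfile for demonstration
cat > /tmp/Dockerfile.demo <<'EOF'
FROM node:20-alpine
RUN adduser -D -u 1001 app
USER app
EOF
# Show the lines that reveal the runtime user and UID
grep -E 'adduser.*-u|^USER' /tmp/Dockerfile.demo
```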
### Provisioner fails but no diagnostic output
`set -e` exits on first error, hiding subsequent `docker logs` output. Use `set -u` without `-e`, put one verification gate at the end:
```hcl
provisioner "remote-exec" {
inline = [
"set -u",
"docker compose up -d",
"sleep 15",
"docker logs myapp --tail 20 2>&1 || true",
"docker ps --format 'table {{.Names}}\\t{{.Status}}' || true",
"docker ps --filter name=myapp --format '{{.Status}}' | grep -q healthy || exit 1",
]
}
```
### Container `Restarting` — database tables missing
DB migrations not in provisioner. PostgreSQL `docker-entrypoint-initdb.d` only runs on empty data dir. Explicitly create DB + run migrations:
```bash
# After postgres healthy:
docker exec pg psql -U postgres -tc "SELECT 1 FROM pg_database WHERE datname='mydb'" | grep -q 1 \
|| docker exec pg psql -U postgres -c "CREATE DATABASE mydb;"
# Idempotent migrations:
# $PSQL is assumed to point at the target DB, e.g. PSQL="docker exec -i pg psql -U postgres -d mydb"
for f in migrations/*.sql; do
  VER=$(basename "$f")
  APPLIED=$($PSQL -tAc "SELECT 1 FROM schema_migrations WHERE version='$VER'" | tr -d ' ')
  [ "$APPLIED" = "1" ] && continue
  { echo 'BEGIN;'; cat "$f"; echo 'COMMIT;'; } | $PSQL
  $PSQL -tAc "INSERT INTO schema_migrations(version) VALUES ('$VER') ON CONFLICT DO NOTHING"
done
```
### `docker compose build` ignores env var override
Compose reads build args from `.env` file, not shell env. `VAR=x docker compose build` does NOT work.
```bash
# WRONG
DOCKER_WITH_PROXY_MODE=disabled docker compose build
# RIGHT
grep -q DOCKER_WITH_PROXY_MODE .env || echo 'DOCKER_WITH_PROXY_MODE=disabled' >> .env
docker compose build
```
### TLS handshake fails: `Invalid format for Authorization header`
Caddy DNS-01 ACME needs a Cloudflare **API Token** (`cfut_` prefix, 40+ chars, Bearer auth). A **Global API Key** (37 hex chars, X-Auth-Key auth) causes `HTTP 400 Code:6003`. Production may appear to work because it has cached certificates; fresh environments fail on first cert request.
```bash
# Verify token format before deploy:
TOKEN=$(grep CLOUDFLARE_API_TOKEN .env | cut -d= -f2)
echo "$TOKEN" | grep -q "^cfut_" || echo "FATAL: needs API Token, not Global Key"
```
Create scoped token via API:
```bash
curl -s "https://api.cloudflare.com/client/v4/user/tokens" -X POST \
-H "X-Auth-Email: $CF_EMAIL" -H "X-Auth-Key: $CF_GLOBAL_KEY" \
-d '{"name":"caddy-dns-acme","policies":[{"effect":"allow",
"resources":{"com.cloudflare.api.account.zone.<ZONE_ID>":"*"},
"permission_groups":[
{"id":"4755a26eedb94da69e1066d98aa820be","name":"DNS Write"},
{"id":"c8fed203ed3043cba015a93ad1616f1f","name":"Zone Read"}]}]}'
```
### TLS fails on staging but works on production — hardcoded domains
Caddyfile or compose has literal domain names. Staging Caddy loads production config, tries to get certs for domains it doesn't own → ACME fails.
**Caddyfile**: Use `{$VAR}` — Caddy evaluates env vars at startup.
```caddy
# WRONG
gpt-6.pro { tls { dns cloudflare {env.CLOUDFLARE_API_TOKEN} } }
# RIGHT
{$LOBEHUB_DOMAIN} { tls { dns cloudflare {env.CLOUDFLARE_API_TOKEN} } }
```
**Compose**: Use `${VAR:?required}` — fail-fast if unset.
```yaml
# WRONG
- APP_URL=https://gpt-6.pro
# RIGHT
- APP_URL=${APP_URL:?APP_URL is required}
```
Pass the env var to the gateway container so Caddy can read it:
```yaml
environment:
- LOBEHUB_DOMAIN=${LOBEHUB_DOMAIN:?LOBEHUB_DOMAIN is required}
- CLOUDFLARE_API_TOKEN=${CLOUDFLARE_API_TOKEN:?required for DNS-01 TLS}
```
### OAuth login fails: `Social sign in failed`
Casdoor `init_data.json` contains hardcoded redirect URIs. `--createDatabase=true` only applies init_data on first-ever DB creation — not on restarts. Fix via SQL in provisioner:
```bash
# Replace production domain with staging in existing Casdoor DB
$PSQL -c "UPDATE application SET redirect_uris = REPLACE(redirect_uris,
'gpt-6.pro', 'staging.gpt-6.pro')
WHERE name='lobechat'
AND redirect_uris LIKE '%gpt-6.pro%'
AND redirect_uris NOT LIKE '%staging.gpt-6.pro%';"
```
Also check `AUTH_CASDOOR_ISSUER` — it must match the Casdoor subdomain (`auth.staging.example.com`), not the app root domain.
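A grep gate for that issuer check can be sketched like this (the stub `.env` content is illustrative):

```shell
# Stub env file; in practice this is the app's .env
printf 'AUTH_CASDOOR_ISSUER=https://auth.staging.example.com\n' > /tmp/app.env
# The issuer must live under the auth.* subdomain, not the app root domain
grep -Eq '^AUTH_CASDOOR_ISSUER=https://auth\.' /tmp/app.env && echo "issuer OK"
```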
## Multi-environment isolation
Before creating a second environment, grep `.tf` files for hardcoded names. See [references/multi-env-isolation.md](references/multi-env-isolation.md) for the complete matrix.
**Will fail on apply** (globally unique):
| Resource | Scope | Fix |
|---|---|---|
| SSH key pair | Region | `"${env}-deploy"` |
| SLS log project | Account | `"${env}-logs"` |
| CloudMonitor contact | Account | `"${env}-ops"` |
**DNS duplication trap**: Two environments creating A records for the same name in the same Cloudflare zone → two independent record IDs → DNS round-robin → ~50% traffic to wrong instance. Fix: use subdomain isolation (`staging.example.com`) or separate zones. Remember to create DNS records for ALL subdomains Caddy serves (e.g., `auth.staging`, `minio.staging`).
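The subdomain-isolation fix can be sketched with the Cloudflare provider; the `env` variable, resource names, and the `value` attribute (renamed `content` in provider v5) are assumptions, not code from this repo:

```hcl
resource "cloudflare_record" "app" {
  zone_id = var.cloudflare_zone_id
  # Production owns the apex; every other environment gets its own subdomain,
  # so no two states ever manage an A record with the same name.
  name    = var.env == "production" ? "@" : var.env
  type    = "A"
  value   = var.instance_public_ip
  proxied = true
}
```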
**Snapshot cross-contamination**: Unfiltered `data "alicloud_ecs_snapshots"` returns ALL account snapshots. New env inherits old 100GB snapshot, fails creating 40GB disk. Gate with variable:
```hcl
locals {
  latest_snapshot_id = (
    var.enable_snapshot_recovery && length(local.available_snapshots) > 0
    ? local.available_snapshots[0].snapshot_id
    : null
  )
}
```
Do NOT add `count` to the data source — changes its state address, causes drift.
## Pre-deploy validation
Run a validation script **before** `terraform apply` to catch configuration errors locally. This eliminates the deploy→discover→fix→redeploy cycle.
Key checks (see [references/pre-deploy-validation.md](references/pre-deploy-validation.md)):
1. `terraform validate` — syntax
2. No hardcoded domains in Caddyfiles or compose files
3. Required env vars present (`LOBEHUB_DOMAIN`, `CLAUDE4DEV_DOMAIN`, `CLOUDFLARE_API_TOKEN`, `APP_URL`, etc.)
4. Cloudflare API Token format (not Global API Key)
5. DNS records exist for all Caddy-served domains
6. Casdoor issuer URL matches `auth.*` subdomain
7. SSH private key exists
Integrate into Makefile: `make pre-deploy ENV=staging` before `make apply`.
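Checks 3 and 4 can be sketched as a standalone shell gate; the paths and stub `.env` below are assumptions made to keep the snippet self-contained:

```shell
ENV_FILE=/tmp/staging.env
# Stub; in practice this is environments/$ENV/.env
printf 'LOBEHUB_DOMAIN=staging.example.com\nAPP_URL=https://staging.example.com\nCLOUDFLARE_API_TOKEN=cfut_0123456789abcdef0123456789abcdef01234567\n' > "$ENV_FILE"

fail=0
for var in LOBEHUB_DOMAIN APP_URL CLOUDFLARE_API_TOKEN; do
  grep -q "^${var}=" "$ENV_FILE" || { echo "MISSING: $var"; fail=1; }
done
# Global API Keys have no prefix; only cfut_ API Tokens work for DNS-01
grep -q '^CLOUDFLARE_API_TOKEN=cfut_' "$ENV_FILE" || { echo "FATAL: not an API Token"; fail=1; }
[ "$fail" -eq 0 ] && echo "pre-deploy checks passed"
```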
## Zero-to-deployment
Fresh disks expose every implicit dependency. See [references/zero-to-deploy-checklist.md](references/zero-to-deploy-checklist.md).
Key items that break provisioners on fresh instances:
1. **Directories**: `mkdir -p /data/{svc1,svc2}` in cloud-init — `file` provisioner fails if target dir missing
2. **Databases**: Explicit `CREATE DATABASE` — PG init scripts only run on empty data dir
3. **Migrations**: Tracked in `schema_migrations` table, applied idempotently
4. **Provisioner ordering**: `depends_on` between resources sharing Docker networks
5. **Memory**: Stop non-critical containers during Docker build on small instances (≤8GB)
6. **Domain parameterization**: Every domain in Caddyfile/compose must be `{$VAR}` / `${VAR:?required}`
7. **Credential format**: Caddy needs API Token (`cfut_`), not Global API Key
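Item 1 above can be sketched as cloud-init (service names and the UID are placeholders):

```yaml
# cloud-init: create volume dirs before any `file` provisioner targets them
runcmd:
  - mkdir -p /data/svc1/data /data/svc1/logs /data/svc2
  - chown -R 1001:1001 /data/svc1
```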


@@ -0,0 +1,130 @@
# Multi-Environment Isolation Checklist
When creating a second Terraform environment (`staging`, `lab`, etc.) in the same cloud account alongside production, every item below must be verified. Skip one and you get silent name collisions or cross-contamination.
## Terraform state isolation
Two environments MUST use different state paths. Same OSS/S3 bucket is fine — different prefix isolates completely:
```hcl
# production
backend "oss" {
bucket = "myproject-terraform-state"
prefix = "environments/production"
}
# staging
backend "oss" {
bucket = "myproject-terraform-state" # same bucket OK
prefix = "environments/staging" # different prefix = isolated state
}
```
**Verification**: `terraform state list` in one environment must show ZERO resources from the other.
## Resource naming collision matrix
Grep every `.tf` file for hardcoded names: any resource whose name must be unique within its account or region scope will collide.
### Must rename (apply will fail)
| Resource | Uniqueness scope | Fix pattern |
|---|---|---|
| SSH key pair (`key_pair_name`) | Region | `"${env}-deploy"` |
| SLS log project (`project_name`) | Account | `"${env}-logs"` |
| CloudMonitor contact (`alarm_contact_name`) | Account | `"${env}-ops"` |
| CloudMonitor contact group | Account | `"${env}-ops"` |
### Should rename (won't fail but causes confusion)
| Resource | Issue if same name |
|---|---|
| Security group name | Two SGs with same name in same VPC, can't tell apart in console |
| ECS instance name/hostname | Two instances named `myapp-spot` in console |
| Data disk name | Same in disk list |
| Auto snapshot policy name | Same in policy list |
| SLS machine group name | Logs from both instances land in same group |
### Pattern: Use a module name variable
```hcl
# production main.tf
module "app" {
source = "../../modules/spot-with-data-disk"
name = "production-spot" # flows to instance_name, disk_name, snapshot_policy_name
}
# staging main.tf
module "app" {
source = "../../modules/spot-with-data-disk"
name = "staging-spot" # all child resource names auto-isolated
}
```
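Inside the module, the single `name` variable fans out to every child resource name. A sketch follows; the attribute names resemble the Alicloud provider's but should be treated as illustrative:

```hcl
# modules/spot-with-data-disk/variables.tf (sketch)
variable "name" {
  type        = string
  description = "Environment-scoped prefix for all child resource names"
}

# modules/spot-with-data-disk/main.tf (sketch)
resource "alicloud_instance" "this" {
  instance_name = var.name           # e.g. "staging-spot"
  host_name     = var.name
  # ...
}

resource "alicloud_disk" "data" {
  disk_name = "${var.name}-data"     # e.g. "staging-spot-data"
  # ...
}
```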
## DNS record isolation
### The duplication trap
Two Terraform environments creating A records for `@` (root) in the same Cloudflare zone:
- Each gets its own Cloudflare record ID (independent)
- Cloudflare now has TWO A records for the same domain
- DNS round-robins between the two IPs
- ~50% of traffic goes to the wrong instance
### Correct patterns
**Pattern A: Subdomain isolation** (recommended for staging/lab):
```hcl
# Production: root domain records
resource "cloudflare_dns_record" "prod" {
name = "@" # gpt-6.pro
}
# Staging: subdomain records only
resource "cloudflare_dns_record" "staging" {
name = "staging" # staging.gpt-6.pro
}
```
**Pattern B: Separate zones** (for fully independent deployments):
Each environment gets its own domain/zone. No shared Cloudflare zone IDs.
**Pattern C: One environment owns DNS** (production):
Only production has DNS resources. Other environments access via IP only.
### Destroy safety
When one environment is destroyed:
- Its DNS records are deleted (by their specific Cloudflare record IDs)
- Other environments' DNS records are NOT affected
- **Verify before destroy**: Compare DNS record IDs between environments:
```bash
terraform state show 'cloudflare_dns_record.app["root"]' | grep -E '^ *id '
```
IDs must be different.
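The comparison can be scripted. In this sketch the two `terraform state show` outputs are replaced with here-docs so the extraction logic can be exercised offline; the record IDs are made up:

```shell
# Extract the `id` attribute from `terraform state show` output.
# Attribute lines are indented, so match on the first field, not ^id.
extract_id() { awk '$1 == "id" { gsub(/"/, "", $3); print $3; exit }'; }

# Simulated per-environment output
prod_id=$(extract_id <<'EOF'
    id      = "a1b2c3d4"
    name    = "@"
EOF
)
staging_id=$(extract_id <<'EOF'
    id      = "e5f6a7b8"
    name    = "staging"
EOF
)

if [ "$prod_id" = "$staging_id" ]; then
  echo "DANGER: both environments manage the same Cloudflare record"
  exit 1
fi
echo "ok: independent records ($prod_id vs $staging_id)"
```

In real use, replace each here-doc with the `terraform state show` command run from that environment's directory.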
## Shared resources (safe to share)
These are referenced but NOT managed by the second environment:
| Resource | Why safe |
|---|---|
| VPC / VSwitch | Referenced by ID, not created |
| Cloudflare zone ID | Referenced, records are independent |
| OSS state bucket | Different prefix = different state |
| SSH public key content | Same key, different key pair resource |
| Cloud provider credentials | Same account, different resources |
## Makefile pattern for multi-environment
```makefile
ENV ?= production
ENV_DIR := environments/$(ENV)
init: ; cd $(ENV_DIR) && terraform init
plan: ; cd $(ENV_DIR) && terraform plan -out=tfplan
apply: ; cd $(ENV_DIR) && terraform apply tfplan
drift: ; cd $(ENV_DIR) && terraform plan -detailed-exitcode
```
Usage: `make plan ENV=staging`

# Pre-Deploy Validation Pattern
Run before `terraform apply` to catch configuration errors locally. Eliminates the deploy→discover→fix→redeploy cycle that wastes hours.
## Why this matters
Every hardcoded value becomes a bug when creating a second environment. Production accumulates implicit state over time (cached TLS certs, manually created databases, hand-edited configs). Fresh instances expose all of these as failures. A pre-deploy script catches them before they reach the remote.
## Validation categories
### 1. Terraform syntax
```bash
terraform validate
```
### 2. Hardcoded domains
```bash
# Caddyfiles: should use {$VAR} not literal domains
grep -v "^#" gateway/conf.d/*.caddy | grep -c "example\.com" # should be 0
# Compose: should use ${VAR:?required} not literal domains
grep -v "^#" docker-compose.production.yml | grep -c "example\.com" # should be 0
```
### 3. Required env vars
Check that every `${VAR:?required}` in compose has a matching entry in `.env`:
```bash
for VAR in LOBEHUB_DOMAIN CLAUDE4DEV_DOMAIN CLOUDFLARE_API_TOKEN APP_URL AUTH_URL; do
grep -q "^$VAR=" .env || echo "FAIL: $VAR missing"
done
```
### 4. Cloudflare credential format
Caddy's Cloudflare plugin uses Bearer auth. Global API Keys (37 hex chars) fail with `Invalid format for Authorization header`.
```bash
TOKEN=$(grep CLOUDFLARE_API_TOKEN .env | cut -d= -f2)
echo "$TOKEN" | grep -qE "^cfut_|^[A-Za-z0-9_-]{40,}$" || echo "FAIL: looks like Global API Key, not API Token"
```
### 5. DNS ↔ Caddy consistency
Every domain Caddy serves needs a DNS record. Check live resolution:
```bash
for DOMAIN in staging.example.com auth.staging.example.com; do
curl -sf "https://dns.google/resolve?name=$DOMAIN&type=A" | python3 -c \
"import sys,json; d=json.load(sys.stdin); exit(0 if d.get('Answer') else 1)" \
|| echo "FAIL: $DOMAIN not resolving"
done
```
### 6. Casdoor issuer consistency
`AUTH_CASDOOR_ISSUER` must point to `auth.<domain>`, not the app's root domain:
```bash
ISSUER=$(grep AUTH_CASDOOR_ISSUER .env | cut -d= -f2)
DOMAIN=$(grep LOBEHUB_DOMAIN .env | cut -d= -f2)
[ "$ISSUER" = "https://auth.$DOMAIN" ] || echo "FAIL: issuer should be https://auth.$DOMAIN"
```
### 7. SSH key exists
```bash
[ -f ~/.ssh/id_ed25519 ] || echo "FAIL: SSH key not found"
```
## Makefile integration
```makefile
pre-deploy:
@./scripts/validate-env.sh $(ENV)
# Enforce: plan requires pre-deploy to pass
plan: pre-deploy
cd $(ENV_DIR) && terraform plan -out=tfplan
```
## Anti-pattern: deploy-and-pray
The opposite of pre-deploy validation is the "deploy and see what breaks" cycle:
1. `terraform apply` → fails
2. SSH in to debug → discover error
3. Fix locally → commit → re-apply → fails differently
4. Repeat 5-10 times
Each cycle takes 3-5 minutes (plan + apply + provisioner). Pre-deploy catches 80% of issues in <5 seconds locally.

# Zero-to-Deployment Checklist
A fresh instance with an empty data disk exposes every implicit dependency that production silently relies on. This checklist covers everything that must be explicitly created before services will start.
## Pre-flight: cloud-init must handle
These run at OS boot, before Terraform provisioners:
- [ ] **Mount data disk**: Format if new (`blkid` check), mount to `/data`, add to fstab
- [ ] **Create service directories**: `mkdir -p /data/{service1,service2,...}` — file provisioners fail if target dir doesn't exist
- [ ] **Install Docker + Compose**: Curl installer, enable systemd service
- [ ] **Configure swap**: `fallocate` on data disk (NOT system disk)
- [ ] **SSH hardening**: key-only auth, no password root login
- [ ] **Firewall**: UFW + DOCKER-USER iptables chain
- [ ] **Debconf preseed**: For any package with interactive prompts (iptables-persistent, etc.)
- [ ] **Signal readiness**: Write timestamp to `/data/cloud-init.log`
## Provisioner ordering
Terraform provisioners execute in declaration order within a resource, but resources execute in parallel unless `depends_on` is set.
```
lobehub_deploy ──────────────────→ channel_sync (depends_on lobehub)
→ casdoor_sync (depends_on lobehub)
→ minio_sync (depends_on lobehub)
claude4dev_deploy (depends_on lobehub_deploy)
├─ wait for cloud-init
├─ upload source (tarball via file provisioner)
├─ upload .env (staging variant)
├─ start stateful (postgres, redis) --no-recreate
├─ run DB migrations
├─ build stateless images
├─ fix volume permissions
├─ start stateless (relay, api, frontend, gateway)
└─ verify health
```
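In HCL, the ordering above comes from explicit `depends_on` between the provisioner-carrying resources; a sketch with placeholder resource names:

```hcl
resource "null_resource" "lobehub_deploy" {
  # provisioners: upload source, start stateful, migrate, verify ...
}

resource "null_resource" "claude4dev_deploy" {
  # Without this, both deploys run in parallel and race for the
  # shared Docker network and for build memory on the instance.
  depends_on = [null_resource.lobehub_deploy]
}
```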
## Database bootstrap
### PostgreSQL databases
PostgreSQL `docker-entrypoint-initdb.d` scripts only run when the data directory is empty (first-ever start). On subsequent starts — even if a database doesn't exist — init scripts are skipped.
**Fix**: Explicitly create databases in provisioner:
```bash
# Wait for postgres healthy
sleep 10
# Create database if missing (idempotent)
docker exec my-postgres psql -U postgres -tc \
"SELECT 1 FROM pg_database WHERE datname='mydb'" | grep -q 1 \
|| docker exec my-postgres psql -U postgres -c "CREATE DATABASE mydb;"
```
### Schema migrations
Migrations must be idempotent. Track applied versions:
```bash
PSQL='docker compose exec -T postgres psql -v ON_ERROR_STOP=1 -U myuser -d mydb'
# Create tracking table
$PSQL -tAc "CREATE TABLE IF NOT EXISTS schema_migrations (
version TEXT PRIMARY KEY,
applied_at TIMESTAMPTZ DEFAULT now()
)"
# Apply each migration file in order
for f in migrations/*.sql; do
  VER=$(basename "$f")
APPLIED=$($PSQL -tAc "SELECT 1 FROM schema_migrations WHERE version='$VER'" | tr -d ' ')
if [ "$APPLIED" = "1" ]; then
echo "Skip: $VER"
else
echo "Apply: $VER"
    { echo 'BEGIN;'; cat "$f"; echo 'COMMIT;'; } | $PSQL
$PSQL -tAc "INSERT INTO schema_migrations(version) VALUES ('$VER') ON CONFLICT DO NOTHING"
fi
done
```
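The tracking logic above can be exercised offline by substituting a plain file for the `schema_migrations` table; the migration file names below are illustrative:

```shell
WORK=$(mktemp -d)
mkdir -p "$WORK/migrations"
echo 'CREATE TABLE a (id int);' > "$WORK/migrations/001_init.sql"
echo 'CREATE TABLE b (id int);' > "$WORK/migrations/002_more.sql"
APPLIED_LOG="$WORK/applied.txt"
: > "$APPLIED_LOG"

run_migrations() {
  for f in "$WORK"/migrations/*.sql; do
    VER=$(basename "$f")
    if grep -qx "$VER" "$APPLIED_LOG"; then
      echo "Skip: $VER"
    else
      echo "Apply: $VER"     # the real script pipes $f into $PSQL here
      echo "$VER" >> "$APPLIED_LOG"
    fi
  done
}

run_migrations   # first pass: applies 001 and 002
run_migrations   # second pass: skips both — safe to re-run on every deploy
```

The second invocation produces only `Skip:` lines, which is exactly the property that makes re-running the provisioner harmless.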
## Docker build on remote
### Proxy mode
Docker Compose reads build args from `.env` via `${VAR:-default}`. Command-line env vars do NOT override `.env` values for compose interpolation.
```bash
# WRONG: compose still reads DOCKER_WITH_PROXY_MODE from .env
DOCKER_WITH_PROXY_MODE=disabled docker compose build myapp
# RIGHT: make .env itself carry the value compose should read
if grep -q '^DOCKER_WITH_PROXY_MODE=' .env; then
  sed -i 's/^DOCKER_WITH_PROXY_MODE=.*/DOCKER_WITH_PROXY_MODE=disabled/' .env
else
  echo 'DOCKER_WITH_PROXY_MODE=disabled' >> .env
fi
docker compose build myapp
```
### Memory management
Building Docker images while 10+ containers run can OOM on small instances (8GB). Strategy:
```bash
# Stop non-critical containers to free RAM
cd /data/other-project && docker compose stop search-engine analytics-db || true
# Build (memory-intensive)
cd /data/myproject && docker compose build myapp
# Restart stopped containers
cd /data/other-project && docker compose up -d search-engine analytics-db || true
```
## Volume permissions
Containers running as non-root need writable volume directories:
```bash
# Before docker compose up:
mkdir -p data-dir logs-dir
chown -R 1001:1001 data-dir logs-dir # match container UID
```
Find the UID from the Dockerfile:
```dockerfile
RUN adduser -S myuser -u 1001 -G mygroup
USER myuser # runs as uid 1001
```
## Environment-specific .env files
Production `.env` contains production URLs. Staging needs its own `.env` with:
| Variable | Production | Staging |
|---|---|---|
| `FRONTEND_URL` | `https://myapp.com` | `https://staging.myapp.com` |
| `CORS_ORIGIN` | `https://myapp.com` | `https://staging.myapp.com` |
| `NEW_API_URL` | `http://api-container:3000` | Same (internal Docker network) |
| `DOCKER_WITH_PROXY_MODE` | `required` (if behind proxy) | `disabled` (direct internet) |
**Pattern**: Create `.env.staging` alongside `.env`. In Terraform:
```hcl
locals {
env_src = "${local.repo}/.env.staging" # staging-specific
}
provisioner "file" {
source = local.env_src
destination = "${local.deploy_dir}/.env"
}
```
Rsync must exclude `.env` files, otherwise the production `.env` overwrites the staging one:
```bash
# Host and destination path are placeholders
rsync -az --delete --exclude=.env --exclude='.env.*' ./ root@$HOST:/data/myproject/
```
## Verification template
After all services start, verify in the provisioner (not ad-hoc SSH):
```bash
sleep 20
echo '=== Service logs ==='
docker logs my-critical-service --tail 20 2>&1 || true
echo '=== All containers ==='
docker ps --format 'table {{.Names}}\t{{.Status}}' 2>&1 || true
# Final gate (only line that can fail)
docker ps --filter name=my-critical-service --format '{{.Status}}' | grep -q healthy \
|| { echo 'FATAL: service unhealthy'; exit 1; }
```