feat(commands): add /seo-auditor — 7-phase SEO audit pipeline for documentation

- 7 phases: discovery → meta tags → content quality → keywords → links → sitemap → report
- Integrates 8 marketing-skill scripts: seo_checker, content_scorer, humanizer_scorer, headline_scorer, seo_optimizer, sitemap_analyzer, schema_validator, topic_cluster_mapper
- References 6 SEO knowledge bases for audit framework, AI search, content optimization, URL design, internal linking, AI detection
- Auto-fixes: generic titles, missing descriptions, broken links, orphan pages
- Preserves high-ranking pages; only fixes critical issues on those
- Registered in both commands/ (distributable) and .claude/commands/ (local)

Also: sync all doc counts (28 plugins, 26 eng-core skills, 21 commands)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 10:28:17 +01:00
parent 4709662631
commit 90cef3b3ac
19 changed files with 2303 additions and 26 deletions
--- a/commands/seo-auditor.md
+++ b/commands/seo-auditor.md
@@ -0,0 +1,340 @@
+---
+name: seo-auditor
+description: |
+  Scan and optimize documentation files for SEO. Audits README.md files and docs/ pages for
+  meta tags, headings, keywords, readability, duplicate content, and broken links. Applies
+  fixes, updates sitemap.xml, and generates a report. Usage: /seo-auditor [path] [--report-only]
+---
+
+# /seo-auditor
+
+Systematically scan, audit, and optimize documentation files for SEO. Targets README.md files and docs/ pages — fixes issues in place, preserves rankings on high-performing pages, and generates a final report.
+
+## Usage
+
+```bash
+/seo-auditor                    # Audit all docs/ and root README.md
+/seo-auditor docs/skills/       # Audit a specific docs subdirectory
+/seo-auditor --report-only      # Scan without making changes
+```
+
+## What It Does
+
+Execute all 7 phases sequentially. Auto-fix non-destructive issues. Preserve existing high-ranking content. Report everything at the end.
+
+---
+
+## Phase 1: Discovery & Baseline
+
+### 1a. Identify target files
+
+Scan for documentation files that need SEO audit:
+
+```bash
+# Find all markdown files in docs/ and root README files
+find docs/ -name '*.md' -type f | sort
+find . -maxdepth 2 -name 'README.md' -not -path './.codex/*' -not -path './.gemini/*' | sort
+```
+
+Classify each file:
+- **New/recently modified** — files changed in the last 2 commits (check via `git log`)
+- **Index pages** — `index.md` files (high authority, handle with care)
+- **Skill pages** — `docs/skills/**/*.md` (generated by `generate-docs.py`)
+- **Static pages** — `docs/index.md`, `docs/getting-started.md`, `docs/integrations.md`, etc.
+- **README files** — root and domain-level README.md
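The classification above can be sketched as a small path-based function. This is only an illustration: the label names and the `classify` helper are hypothetical, and because the categories overlap (for example `docs/index.md` is both an index page and a static page) the sketch returns every label that applies. The set of recently changed paths is assumed to be collected separately, for instance from `git log -2 --name-only --pretty=format:`.

```python
from pathlib import PurePosixPath


def classify(path: str, recently_changed: frozenset = frozenset()) -> list:
    """Return every Phase 1a label that applies to a repo-relative path."""
    p = PurePosixPath(path)
    labels = []
    if path in recently_changed:
        labels.append("recent")          # changed in the last 2 commits
    if p.name == "index.md":
        labels.append("index")           # high authority, handle with care
    if p.parts[:2] == ("docs", "skills"):
        labels.append("skill")           # generated by generate-docs.py
    elif p.parts[:1] == ("docs",):
        labels.append("static")          # hand-written docs page
    if p.name == "README.md":
        labels.append("readme")          # root or domain-level README
    return labels
```

A file that gets no label at all falls outside the audit scope described here.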
+
+### 1b. Capture baseline
+
+For each target file, extract current SEO state:
+- `title:` frontmatter field → becomes `<title>` tag
+- `description:` frontmatter field → becomes `<meta name="description">`
+- First `# H1` heading
+- All `## H2` and `### H3` subheadings
+- Word count
+- Internal link count
+- External link count
+
+Store baseline in memory for the report.
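A minimal sketch of the baseline capture, using only the standard library. It assumes simple single-line `key: value` frontmatter (block scalars such as `description: |` are not handled), and `capture_baseline` and the keys of its result are hypothetical names chosen for this example.

```python
import re

FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
FENCE_RE = re.compile(r"^```.*?^```[ \t]*$", re.DOTALL | re.MULTILINE)
HEADING_RE = re.compile(r"^(#{1,3})[ \t]+(.+?)[ \t]*$", re.MULTILINE)
# Markdown links, excluding images (which start with '!').
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def capture_baseline(markdown: str) -> dict:
    """Extract the Phase 1b baseline fields from one Markdown document."""
    meta = {}
    body = markdown
    match = FRONTMATTER_RE.match(markdown)
    if match:
        body = markdown[match.end():]
        for line in match.group(1).splitlines():
            key, sep, value = line.partition(":")
            # Skip indented continuation lines; keep top-level keys only.
            if sep and not key.startswith((" ", "\t")):
                meta[key.strip()] = value.strip().strip("\"'")

    prose = FENCE_RE.sub("", body)  # '#' comments in code fences are not headings
    headings = HEADING_RE.findall(prose)
    links = LINK_RE.findall(prose)
    external = [url for url in links if re.match(r"https?://", url)]

    return {
        "title": meta.get("title"),
        "description": meta.get("description"),
        "h1": next((text for marks, text in headings if marks == "#"), None),
        "subheadings": [text for marks, text in headings if marks != "#"],
        "word_count": len(re.findall(r"\b\w+\b", prose)),  # rough count
        "internal_links": len(links) - len(external),
        "external_links": len(external),
    }
```

Keeping one such dictionary per file makes the before/after numbers in the final report a simple comparison.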
+
+---
+
+## Phase 2: Meta Tag Audit
+
+For every file with YAML frontmatter, check and fix:
+
+### Title Tag (`title:`)
+
+**Rules:**
+- Must exist and be non-empty
+- Length: 50-60 characters ideal (Google truncates at ~60)
+- Must contain a primary keyword
+- Must NOT duplicate another page's title
+- For skill pages: should follow the pattern `{Skill Name} — {Differentiator} - {site_name}`
+- `site_name` from `mkdocs.yml` is appended automatically, so the frontmatter `title:` should hold only the first two parts of the pattern; don't repeat the site name there
+
+**Auto-fix:** If title is generic (e.g., just the skill name), enrich it with domain context using the DOMAIN_SEO_SUFFIX pattern from `scripts/generate-docs.py`.
+
+### Meta Description (`description:`)
+
+**Rules:**
+- Must exist and be non-empty
+- Length: 120-160 characters (Google truncates at ~160)
+- Must contain the primary keyword naturally
+- Must be unique across all pages — no two pages share the same description
+- Should include a call-to-action or value proposition
+- Must NOT start with "This page..." or "This document..."
+
+**Auto-fix:** If description is missing or generic, generate one from the SKILL.md frontmatter description (if available) or from the first paragraph of content. Use the `extract_description_from_frontmatter()` function from `generate-docs.py` as reference.
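The length, uniqueness, and opener rules for titles and descriptions are mechanical enough to check in a few lines. The sketch below assumes the frontmatter has already been parsed into a `{path: {"title": ..., "description": ...}}` mapping; `audit_meta` and the issue strings are hypothetical, and the keyword and call-to-action rules are left out because they need human or model judgment.

```python
from collections import Counter

TITLE_RANGE = (50, 60)           # ideal title length from the rules above
DESCRIPTION_RANGE = (120, 160)   # description length from the rules above
BANNED_OPENERS = ("this page", "this document")


def audit_meta(pages: dict) -> dict:
    """Return a list of rule violations for each page path."""
    cleaned = {
        path: ((meta.get("title") or "").strip(),
               (meta.get("description") or "").strip())
        for path, meta in pages.items()
    }
    title_counts = Counter(t for t, _ in cleaned.values() if t)
    desc_counts = Counter(d for _, d in cleaned.values() if d)

    issues = {}
    for path, (title, desc) in cleaned.items():
        found = []
        if not title:
            found.append("title missing")
        else:
            if not TITLE_RANGE[0] <= len(title) <= TITLE_RANGE[1]:
                found.append(f"title length {len(title)} outside 50-60")
            if title_counts[title] > 1:
                found.append("title duplicated")
        if not desc:
            found.append("description missing")
        else:
            if not DESCRIPTION_RANGE[0] <= len(desc) <= DESCRIPTION_RANGE[1]:
                found.append(f"description length {len(desc)} outside 120-160")
            if desc_counts[desc] > 1:
                found.append("description duplicated")
            if desc.lower().startswith(BANNED_OPENERS):
                found.append("description starts with a banned opener")
        issues[path] = found
    return issues
```

Pages that come back with an empty list pass the mechanical checks and only need the keyword review.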
+
+### Validation Script
+
+Run on each file that has HTML output in `site/` (run `mkdocs build` first if `site/` is missing or stale):
+
+```bash
+python3 marketing-skill/seo-audit/scripts/seo_checker.py --file site/{path}/index.html
+```
+
+Parse the score. Flag any page scoring below 60.
+
+---
+
+## Phase 3: Content Quality & Readability
+
+For each target file, analyze and improve:
+
+### Heading Structure
+
+**Rules:**
+- Exactly one `# H1` per page
+- H2s follow H1, H3s follow H2 — no skipping levels
+- Headings should contain keywords naturally (not stuffed)
+- No duplicate headings on the same page
+
+**Auto-fix:** If heading levels skip (H1 → H3), adjust to proper hierarchy.
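As an illustration of the heading rules, the following sketch reports a wrong H1 count, skipped levels, and duplicate headings. It only detects problems and does not apply the auto-fix; `audit_headings` and its messages are hypothetical. Lines inside fenced code blocks are skipped so that shell comments are not mistaken for headings.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$")


def audit_headings(markdown: str) -> list:
    """Return human-readable heading issues for one Markdown document."""
    headings = []  # (level, text) in document order
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))

    issues = []
    h1_count = sum(1 for level, _ in headings if level == 1)
    if h1_count != 1:
        issues.append(f"expected exactly one H1, found {h1_count}")

    previous = 0
    seen = set()
    for level, text in headings:
        if previous == 0 and level != 1:
            issues.append(f"first heading '{text}' is H{level}, not H1")
        elif level > previous + 1:
            issues.append(f"H{level} '{text}' skips a level after H{previous}")
        key = text.lower()
        if key in seen:
            issues.append(f"duplicate heading '{text}'")
        seen.add(key)
        previous = level
    return issues
```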
+
+### Readability
+
+Run the content scorer on each file:
+
+```bash
+python3 marketing-skill/content-production/scripts/content_scorer.py {file_path}
+```
+
+Check scores for:
+- **Readability** — aim for score ≥ 70
+- **Structure** — aim for score ≥ 60
+- **Engagement** — aim for score ≥ 50
+
+### Content Quality Rules
+
+- **Paragraphs:** No single paragraph longer than 5 sentences
+- **Sentences:** Average sentence length 15-20 words
+- **Passive voice:** Less than 15% of sentences
+- **Transition words:** At least 30% of sentences use transitions
+- **Bullet lists:** Use lists for 3+ items instead of comma-separated inline lists
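The paragraph and sentence-length rules can be approximated with a naive sentence splitter. This is a rough sketch, not how `content_scorer.py` works: it splits on `.`, `!`, and `?` followed by whitespace (so abbreviations will be miscounted), ignores fenced code, and skips blocks that start like headings, lists, tables, or quotes. `audit_prose` and its result keys are hypothetical; passive voice and transition words are not covered.

```python
import re

FENCE_RE = re.compile(r"^```.*?^```[ \t]*$", re.DOTALL | re.MULTILINE)
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")
NON_PROSE_PREFIXES = ("#", "-", "*", "|", ">")


def split_sentences(paragraph: str) -> list:
    """Naive split on sentence-ending punctuation followed by whitespace."""
    return [s for s in SENTENCE_SPLIT_RE.split(paragraph.strip()) if s]


def audit_prose(markdown: str, max_sentences: int = 5, avg_range=(15, 20)) -> dict:
    """Check paragraph length and average sentence length."""
    text = FENCE_RE.sub("", markdown)
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    paragraphs = [b for b in blocks if not b.startswith(NON_PROSE_PREFIXES)]

    sentences = [s for p in paragraphs for s in split_sentences(p)]
    total_words = sum(len(s.split()) for s in sentences)
    average = total_words / len(sentences) if sentences else 0.0

    return {
        "avg_sentence_length": average,
        "avg_in_range": avg_range[0] <= average <= avg_range[1],
        "long_paragraphs": [
            p for p in paragraphs if len(split_sentences(p)) > max_sentences
        ],
    }
```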
+
+### AI Content Detection
+
+Run the humanizer scorer on non-generated content (README.md files, static pages):
+
+```bash
+python3 marketing-skill/content-humanizer/scripts/humanizer_scorer.py {file_path}
+```
+
+Flag pages scoring below 50 (too AI-sounding). For these pages, apply voice techniques from `marketing-skill/content-humanizer/references/voice-techniques.md`:
+- Replace AI clichés ("delve into", "leverage", "it's important to note")
+- Vary sentence length
+- Add specific examples instead of generic statements
+- Use active voice
+
+**Important:** Only modify content that was recently created or updated. Do NOT rewrite pages that are ranking well — preserve their content.
+
+---
+
+## Phase 4: Keyword Optimization
+
+### 4a. Identify target keywords per page
+
+Based on the page's purpose and domain:
+
+| Page Type | Primary Keywords | Secondary Keywords |
+|-----------|-----------------|-------------------|
+| Homepage (docs/index.md) | "Claude Code Skills", "agent plugins" | "Codex skills", "Gemini CLI", "OpenClaw" |
+| Skill pages | Skill name + "Claude Code" | "agent skill", "Codex plugin", domain terms |
+| Agent pages | Agent name + "AI coding agent" | "Claude Code", "orchestrator" |
+| Command pages | Command name + "slash command" | "Claude Code", "AI coding" |
+| Getting started | "install Claude Code skills" | platform names |
+| Domain index | Domain + "skills" + "plugins" | "Claude Code", platform names |
+
+### 4b. Keyword placement checks
+
+For each page, verify the primary keyword appears in:
+- [ ] Title tag (frontmatter `title:`)
+- [ ] Meta description (frontmatter `description:`)
+- [ ] H1 heading
+- [ ] First paragraph (within first 100 words)
+- [ ] At least one H2 subheading
+- [ ] Image alt text (if images present)
+- [ ] URL slug (for new pages only — never change existing URLs)
+
+### 4c. Keyword density
+
+- Primary keyword: 1-2% of total word count
+- Secondary keywords: 0.5-1% each
+- No keyword stuffing — if density exceeds 3%, reduce it
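To make the thresholds concrete, here is a small density calculation. It assumes density means phrase occurrences divided by total word count, times 100; other tools weight multi-word phrases differently, so treat the numbers as approximate. `keyword_density` and `classify_density` are hypothetical helpers.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Percentage of words accounted for by occurrences of the keyword phrase."""
    lowered = text.lower()
    words = re.findall(r"[\w'-]+", lowered)
    terms = keyword.lower().split()
    if not words or not terms:
        return 0.0
    # Match the phrase as whole words, allowing any whitespace between terms.
    pattern = r"\b" + r"\s+".join(re.escape(t) for t in terms) + r"\b"
    hits = len(re.findall(pattern, lowered))
    return 100.0 * hits / len(words)


def classify_density(density: float, role: str = "primary") -> str:
    """Map a density percentage onto the Phase 4c thresholds."""
    if density > 3.0:
        return "stuffed"
    low, high = (1.0, 2.0) if role == "primary" else (0.5, 1.0)
    if density < low:
        return "low"
    if density > high:
        return "high"
    return "ok"
```

For example, one occurrence of "Claude Code" in a 100-word page gives a density of 1.0, which lands at the bottom of the primary-keyword range.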
+
+**Important:** Never change URLs of existing pages. URL changes break incoming links and destroy rankings. Only optimize content and meta tags.
+
+---
+
+## Phase 5: Link Audit
+
+### 5a. Internal links
+
+For each target file, check all markdown links `[text](url)`:
+
+- Verify the target exists (file path resolves)
+- Check for broken relative links (`../`, `./`)
+- Verify anchor links (`#section-name`) point to existing headings
+
+**Auto-fix:** Use the `rewrite_skill_internal_links()` and `rewrite_relative_links()` functions from `generate-docs.py` as reference. Rewrite broken skill-internal links to GitHub source URLs.
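A minimal sketch of the relative-link check, independent of the `generate-docs.py` helpers named above. It only tests whether the file a relative link points to exists on disk; anchors are not verified, and external, site-absolute, and same-page links are skipped. `find_broken_links` is a hypothetical name.

```python
import re
from pathlib import Path

# Markdown links, excluding images (which start with '!').
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)[^)]*\)")
SCHEME_RE = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def find_broken_links(md_file: Path) -> list:
    """Return relative link targets in md_file that do not resolve to a file."""
    broken = []
    text = md_file.read_text(encoding="utf-8")
    for url in LINK_RE.findall(text):
        # Out of scope here: http(s)/mailto links, same-page anchors,
        # and site-absolute paths.
        if SCHEME_RE.match(url) or url.startswith(("#", "/")):
            continue
        target = url.split("#", 1)[0].split("?", 1)[0]
        if target and not (md_file.parent / target).exists():
            broken.append(url)
    return broken
```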
+
+### 5b. Duplicate content detection
+
+Compare meta descriptions across all pages:
+
+```bash
+grep -rh --include='*.md' '^description:' docs/ | sort | uniq -d
+```
+
+If duplicates found, make each description unique by adding page-specific context.
+
+Compare H1 headings across all pages — no two pages should have the same H1.
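The same comparison works for descriptions and H1 headings once they are collected per page. The sketch below groups pages by a normalized value (lowercased, whitespace collapsed), so it also catches near-identical entries that an exact `uniq -d` match would miss. `find_duplicates` is a hypothetical helper, and the input mapping is assumed to come from an earlier extraction step.

```python
from collections import defaultdict


def find_duplicates(values_by_page: dict) -> dict:
    """Group pages that share the same normalized description or H1."""
    groups = defaultdict(list)
    for page, value in values_by_page.items():
        key = " ".join((value or "").lower().split())
        if key:  # pages with no value are a separate "missing" issue
            groups[key].append(page)
    return {key: sorted(pages) for key, pages in groups.items() if len(pages) > 1}
```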
+
+### 5c. Orphan page detection
+
+Check if every page in `docs/` is referenced in `mkdocs.yml` nav. Pages not in nav are orphans — they won't appear in navigation and may not be indexed.
+
+```bash
+# Find doc pages not in mkdocs nav
+find docs -name '*.md' -not -name 'index.md' | while IFS= read -r f; do
+  slug=${f#docs/}   # path relative to docs/, as it appears in nav
+  grep -qF -- "$slug" mkdocs.yml || echo "ORPHAN: $f"
+done
+```
+
+**Auto-fix:** Add orphan pages to the correct nav section in `mkdocs.yml`.
+
+---
+
+## Phase 6: Sitemap & Build
+
+### 6a. Rebuild the site
+
+```bash
+mkdocs build
+```
+
+This regenerates `site/sitemap.xml` automatically; MkDocs writes it as part of every build.
+
+### 6b. Verify sitemap
+
+Check the generated sitemap:
+
+```bash
+python3 marketing-skill/site-architecture/scripts/sitemap_analyzer.py site/sitemap.xml
+```
+
+Verify:
+- All documentation pages appear in the sitemap
+- No broken/404 URLs
+- URL count matches expected page count
+- Depth distribution is reasonable (no pages deeper than 4 levels)
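Some of these checks can be reproduced directly with the standard library, which is handy for cross-checking the analyzer's output. The sketch below is not `sitemap_analyzer.py`; it assumes a standard `urlset` sitemap and defines depth as the number of path segments in each URL, so a site served under a sub-path will report one extra level. `summarize_sitemap` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def summarize_sitemap(xml_text: str, max_depth: int = 4) -> dict:
    """Count URLs and flag duplicates and overly deep pages in a sitemap."""
    root = ET.fromstring(xml_text)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]
    counts = Counter(urls)

    def depth(url: str) -> int:
        # Number of non-empty path segments, e.g. /a/b/ has depth 2.
        return len([part for part in urlparse(url).path.split("/") if part])

    return {
        "url_count": len(urls),
        "duplicates": sorted(u for u, n in counts.items() if n > 1),
        "too_deep": sorted(u for u in counts if depth(u) > max_depth),
    }
```

Comparing `url_count` with the number of Markdown files found in Phase 1 gives a quick signal that pages are missing from the sitemap.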
+
+### 6c. Check for sitemap issues
+
+- **Missing pages:** Pages in `mkdocs.yml` nav that don't appear in sitemap
+- **Extra pages:** Pages in sitemap that aren't in nav (orphans)
+- **Duplicate URLs:** Same page accessible via multiple URLs
+
+---
+
+## Phase 7: Report
+
+Generate a concise report for the user:
+
+```
+╔══════════════════════════════════════════════════════════════╗
+║  SEO AUDITOR REPORT                                         ║
+╠══════════════════════════════════════════════════════════════╣
+║                                                              ║
+║  Pages scanned:        {n}                                   ║
+║  Issues found:         {n}                                   ║
+║  Auto-fixed:           {n}                                   ║
+║  Manual review needed: {n}                                   ║
+║                                                              ║
+║  META TAGS                                                   ║
+║    Titles optimized:     {n}                                 ║
+║    Descriptions fixed:   {n}                                 ║
+║    Duplicate titles:     {n} → {n} (fixed)                   ║
+║    Duplicate descs:      {n} → {n} (fixed)                   ║
+║                                                              ║
+║  CONTENT                                                     ║
+║    Readability improved: {n} pages                           ║
+║    Heading fixes:        {n}                                 ║
+║    AI score improved:    {n} pages                           ║
+║                                                              ║
+║  KEYWORDS                                                    ║
+║    Pages missing primary keyword in title: {n}               ║
+║    Pages missing keyword in description:   {n}               ║
+║    Pages with keyword stuffing:            {n}               ║
+║                                                              ║
+║  LINKS                                                       ║
+║    Broken links found:   {n} → {n} (fixed)                   ║
+║    Orphan pages:         {n} → {n} (added to nav)            ║
+║    Duplicate content:    {n} → {n} (deduplicated)            ║
+║                                                              ║
+║  SITEMAP                                                     ║
+║    Total URLs:           {n}                                 ║
+║    Sitemap regenerated:  ✅                                  ║
+║                                                              ║
+║  PRESERVED (no changes — ranking well)                       ║
+║    {list of pages left untouched}                            ║
+║                                                              ║
+╚══════════════════════════════════════════════════════════════╝
+```
+
+### Pages to preserve (critical fixes only)
+
+These pages rank well for their target keywords. Only fix critical issues (broken links, missing meta). Do NOT rewrite content:
+
+- `docs/index.md` — homepage, ranks for "Claude Code Skills"
+- `docs/getting-started.md` — installation guide
+- `docs/integrations.md` — multi-tool support
+- Any page the user explicitly marks as "preserve"
+
+---
+
+## Skill References
+
+| Tool | Path | Use |
+|------|------|-----|
+| SEO Checker | `marketing-skill/seo-audit/scripts/seo_checker.py` | Score HTML pages 0-100 |
+| Content Scorer | `marketing-skill/content-production/scripts/content_scorer.py` | Score content readability/structure/engagement |
+| Humanizer Scorer | `marketing-skill/content-humanizer/scripts/humanizer_scorer.py` | Detect AI-sounding content |
+| Headline Scorer | `marketing-skill/copywriting/scripts/headline_scorer.py` | Score title quality |
+| SEO Optimizer | `marketing-skill/content-production/scripts/seo_optimizer.py` | Optimize content for target keyword |
+| Sitemap Analyzer | `marketing-skill/site-architecture/scripts/sitemap_analyzer.py` | Analyze sitemap structure |
+| Schema Validator | `marketing-skill/schema-markup/scripts/schema_validator.py` | Validate structured data |
+| Topic Cluster Mapper | `marketing-skill/content-strategy/scripts/topic_cluster_mapper.py` | Group pages into content clusters |
+
+### Reference Docs
+
+| Reference | Path | Use |
+|-----------|------|-----|
+| SEO Audit Framework | `marketing-skill/seo-audit/references/seo-audit-reference.md` | Priority order for SEO fixes |
+| AI Search Optimization | `marketing-skill/ai-seo/references/content-patterns.md` | Make content citable by AI |
+| Content Optimization | `marketing-skill/content-production/references/optimization-checklist.md` | Pre-publish checklist |
+| URL Design Guide | `marketing-skill/site-architecture/references/url-design-guide.md` | URL structure best practices |
+| Internal Linking | `marketing-skill/site-architecture/references/internal-linking-playbook.md` | Internal linking strategy |
+| AI Writing Detection | `marketing-skill/content-humanizer/references/ai-tells-checklist.md` | AI cliché removal |