Update skill docs and resources

This commit is contained in:
daymade
2026-02-23 16:16:58 +08:00
parent c1cfacaf76
commit 72d879e609
15 changed files with 1430 additions and 89 deletions

product-analysis/SKILL.md

@@ -0,0 +1,231 @@
---
name: product-analysis
description: Multi-path parallel product analysis with cross-model test-time compute scaling. Spawns parallel agents (Claude Code agent teams + Codex CLI) to explore product from multiple perspectives, then synthesizes findings into actionable optimization plans. Can invoke competitors-analysis for competitive benchmarking. Use when "product audit", "self-review", "发布前审查", "产品分析", "analyze our product", "UX audit", or "信息架构审计".
argument-hint: [scope: full|ux|api|arch|compare]
---
# Product Analysis
Multi-path parallel product analysis that combines **Claude Code agent teams** and **Codex CLI** for cross-model test-time compute scaling.
**Core principle**: Same analysis task, multiple AI perspectives, deep synthesis.
## How It Works
```
/product-analysis full
          │
Step 0: Auto-detect available tools (codex? competitors?)
          │
     ┌────┴───────────────────────┐
     │                            │
Claude Code                Codex CLI (auto-detected)
Task Agents                (background Bash)
(Explore ×3-5)             (×2-3 parallel)
     │                            │
     └─────────────┬──────────────┘
                   │
       Synthesis (main context)
                   │
         Structured Report
```
## Step 0: Auto-Detect Available Tools
Before launching any agents, detect what tools are available:
```bash
# Check if Codex CLI is installed
which codex 2>/dev/null && codex --version
```
**Decision logic**:
- If `codex` is found: Inform the user — "Codex CLI detected (version X). Will run cross-model analysis for richer perspectives."
- If `codex` is not found: Silently proceed with Claude Code agents only. Do NOT ask the user to install anything.
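The detection step above can be sketched as a small shell helper (a sketch only; the exact message wording is illustrative, not part of the skill):

```shell
#!/usr/bin/env sh
# Detect Codex CLI without ever prompting the user to install anything.
detect_codex() {
  if command -v codex >/dev/null 2>&1; then
    # The version string is informational only; a failure here is non-fatal.
    ver=$(codex --version 2>/dev/null || echo "unknown")
    echo "codex: available ($ver)"
  else
    echo "codex: not found"
  fi
}

detect_codex
```

Either branch is a valid outcome: the "not found" case simply means the run proceeds with Claude Code agents alone.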
Also detect the project type to tailor agent prompts:
```bash
# Detect project type
ls package.json 2>/dev/null # Node.js/React
ls pyproject.toml 2>/dev/null # Python
ls Cargo.toml 2>/dev/null # Rust
ls go.mod 2>/dev/null # Go
```
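The marker-file checks above can be folded into one helper, first match winning (a sketch; the `unknown` fallback label is an assumption):

```shell
#!/usr/bin/env sh
# Map well-known marker files to a project type; first match wins.
detect_project_type() {
  dir=${1:-.}
  if   [ -f "$dir/package.json" ];   then echo "node"
  elif [ -f "$dir/pyproject.toml" ]; then echo "python"
  elif [ -f "$dir/Cargo.toml" ];     then echo "rust"
  elif [ -f "$dir/go.mod" ];         then echo "go"
  else echo "unknown"
  fi
}
```

The result feeds into agent prompts (e.g. a `node` project makes "App.tsx" a sensible exploration target; a `rust` project does not).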
## Scope Modes
Parse `$ARGUMENTS` to determine analysis scope:
| Scope | What it covers | Typical agents |
|-------|---------------|----------------|
| `full` | UX + API + Architecture + Docs (default) | 5 Claude + Codex (if available) |
| `ux` | Frontend navigation, information density, user journey, empty state, onboarding | 3 Claude + Codex (if available) |
| `api` | Backend API coverage, endpoint health, error handling, consistency | 2 Claude + Codex (if available) |
| `arch` | Module structure, dependency graph, code duplication, separation of concerns | 2 Claude + Codex (if available) |
| `compare X Y` | Self-audit + competitive benchmarking (invokes `/competitors-analysis`) | 3 Claude + competitors-analysis |
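Parsing `$ARGUMENTS` into a scope can be sketched as follows (a sketch; falling back to `full` on an unrecognized scope is an assumption, not specified above):

```shell
#!/usr/bin/env sh
# Parse the first word of $ARGUMENTS into a scope, defaulting to `full`.
parse_scope() {
  case "${1:-full}" in
    full|ux|api|arch|compare) echo "${1:-full}" ;;
    *) echo "full" ;;  # unknown scope: fall back to the default
  esac
}
```

For `compare X Y`, the remaining words (`X Y`) are the competitor names passed on to Phase 2.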
## Phase 1: Parallel Exploration
Launch all exploration agents simultaneously using the Task tool in background mode.
### Claude Code Agents (always)
For each dimension, spawn a Task agent with `subagent_type: Explore` and `run_in_background: true`:
**Agent A — Frontend Navigation & Information Density**
```
Explore the frontend navigation structure and entry points:
1. App.tsx: How many top-level components are mounted simultaneously?
2. Left sidebar: How many buttons/entries? What does each link to?
3. Right sidebar: How many tabs? How many sections per tab?
4. Floating panels: How many drawers/modals? Which overlap in functionality?
5. Count total first-screen interactive elements for a new user.
6. Identify duplicate entry points (same feature accessible from 2+ places).
Give specific file paths, line numbers, and element counts.
```
**Agent B — User Journey & Empty State**
```
Explore the new user experience:
1. Empty state page: What does a user with no sessions see? Count clickable elements.
2. Onboarding flow: How many steps? What information is presented?
3. Prompt input area: How many buttons/controls surround the input box? Which are high-frequency vs low-frequency?
4. Mobile adaptation: How many nav items? How does it differ from desktop?
5. Estimate: Can a new user complete their first conversation in 3 minutes?
Give specific file paths, line numbers, and UX assessment.
```
**Agent C — Backend API & Health**
```
Explore the backend API surface:
1. List ALL API endpoints (method + path + purpose).
2. Identify endpoints that are unused or have no frontend consumer.
3. Check error handling consistency (do all endpoints return structured errors?).
4. Check authentication/authorization patterns (which endpoints require auth?).
5. Identify any endpoints that duplicate functionality.
Give specific file paths and line numbers.
```
**Agent D — Architecture & Module Structure** (full/arch scope only)
```
Explore the module structure and dependencies:
1. Map the module dependency graph (which modules import which).
2. Identify circular dependencies or tight coupling.
3. Find code duplication across modules (same pattern in 3+ places).
4. Check separation of concerns (does each module have a single responsibility?).
5. Identify dead code or unused exports.
Give specific file paths and line numbers.
```
**Agent E — Documentation & Config Consistency** (full scope only)
```
Explore documentation and configuration:
1. Compare README claims vs actual implemented features.
2. Check config file consistency (base.yaml vs .env.example vs code defaults).
3. Find outdated documentation (references to removed features/files).
4. Check test coverage gaps (which modules have no tests?).
Give specific file paths and line numbers.
```
### Codex CLI Agents (auto-detected)
If Codex CLI was detected in Step 0, launch parallel Codex analyses via background Bash.
Each Codex invocation gets the same dimensional prompt but from a different model's perspective:
```bash
codex -m o4-mini \
-c model_reasoning_effort="high" \
--full-auto \
"Analyze the frontend navigation structure of this project. Count all interactive elements visible to a new user on first screen. Identify duplicate entry points where the same feature is accessible from 2+ places. Give specific file paths and counts."
```
Run 2-3 Codex commands in parallel (background Bash), one per major dimension.
**Important**: Codex runs in the project's working directory with filesystem access. `--full-auto` enables autonomous execution inside Codex's sandbox; `--dangerously-bypass-approvals-and-sandbox` additionally disables the sandbox and should only be used in trusted environments.
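The launch-and-wait pattern can be sketched as a helper that starts one background job per dimension and captures each job's stdout to a file (a sketch; `CODEX_CMD` is a hypothetical override useful for dry runs, defaulting to the real `codex` binary):

```shell
#!/usr/bin/env sh
# Launch one Codex analysis per dimension as a background job,
# capturing stdout and stderr to a per-dimension file.
# CODEX_CMD is a hypothetical override for testing; defaults to codex.
run_dimension() {
  outfile=$1; prompt=$2
  ${CODEX_CMD:-codex} -m o4-mini \
    -c model_reasoning_effort="high" \
    --full-auto \
    "$prompt" > "$outfile" 2>&1 &
}

# Usage sketch:
#   run_dimension /tmp/frontend.txt "Analyze frontend navigation ..."
#   run_dimension /tmp/api.txt      "List all API endpoints ..."
#   wait   # block until every background analysis has finished
```

Writing to per-dimension files keeps each model's raw output separate for the synthesis phase.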
## Phase 2: Competitive Benchmarking (compare scope only)
When scope is `compare`, invoke the competitors-analysis skill for each competitor:
```
Use the Skill tool to invoke: /competitors-analysis {competitor-name} {competitor-url}
```
This delegates to the orthogonal `competitors-analysis` skill which handles:
- Repository cloning and validation
- Evidence-based code analysis (file:line citations)
- Competitor profile generation
## Phase 3: Synthesis
After all agents complete, synthesize findings in the main conversation context.
### Cross-Validation
Compare findings across agents (Claude vs Claude, Claude vs Codex):
- **Agreement** = high confidence finding
- **Disagreement** = investigate deeper (one agent may have missed context)
- **Codex-only finding** = different model perspective, validate manually
### Quantification
Extract hard numbers from agent reports:
| Metric | What to measure |
|--------|----------------|
| First-screen interactive elements | Total count of buttons/links/inputs visible to new user |
| Feature entry point duplication | Number of features with 2+ entry points |
| API endpoints without frontend consumer | Count of unused backend routes |
| Onboarding steps to first value | Steps from launch to first successful action |
| Module coupling score | Number of circular or bi-directional dependencies |
### Structured Output
Produce a layered optimization report:
```markdown
## Product Analysis Report
### Executive Summary
[1-2 sentences: key finding]
### Quantified Findings
| Metric | Value | Assessment |
|--------|-------|------------|
| ... | ... | ... |
### P0: Critical (block launch)
[Issues that prevent basic usability]
### P1: High Priority (launch week)
[Issues that significantly degrade experience]
### P2: Medium Priority (next sprint)
[Issues worth addressing but not blocking]
### Cross-Model Insights
[Findings that only one model identified — worth investigating]
### Competitive Position (if compare scope)
[How we compare on key dimensions]
```
## Workflow Checklist
- [ ] Parse `$ARGUMENTS` for scope
- [ ] Auto-detect Codex CLI availability (`which codex`)
- [ ] Auto-detect project type (package.json / pyproject.toml / etc.)
- [ ] Launch Claude Code Explore agents (3-5 parallel, background)
- [ ] Launch Codex CLI commands (2-3 parallel, background) if detected
- [ ] Invoke `/competitors-analysis` if `compare` scope
- [ ] Collect all agent results
- [ ] Cross-validate findings
- [ ] Quantify metrics
- [ ] Generate structured report with P0/P1/P2 priorities
## References
- [references/analysis_dimensions.md](references/analysis_dimensions.md) — Detailed audit dimension definitions and prompts
- [references/synthesis_methodology.md](references/synthesis_methodology.md) — How to weight and merge multi-agent findings
- [references/codex_patterns.md](references/codex_patterns.md) — Codex CLI invocation patterns and flag reference

references/analysis_dimensions.md

@@ -0,0 +1,109 @@
# Analysis Dimensions
Detailed definitions for each audit dimension. Agents should use these as exploration guides.
## Dimension 1: Frontend Navigation & Information Density
**Goal**: Quantify cognitive load for a new user.
**Key questions**:
1. How many top-level components does App.tsx mount simultaneously?
2. How many tabs/sections exist in each sidebar panel?
3. Which features have multiple entry points (duplicate navigation)?
4. What is the total count of interactive elements on first screen?
5. Are there panels/drawers that overlap in functionality?
**Exploration targets**:
- Main app entry (App.tsx or equivalent)
- Left sidebar / navigation components
- Right sidebar / inspector panels
- Floating panels, drawers, modals
- Settings / configuration panels
- Control center / dashboard panels
**Output format**:
```
| Component | Location | Interactive Elements | Overlaps With |
|-----------|----------|---------------------|----------------|
```
## Dimension 2: User Journey & Empty State
**Goal**: Evaluate time-to-first-value for a new user.
**Key questions**:
1. What does a user see when they have no data/sessions/projects?
2. How many steps from launch to first successful action?
3. Is there an onboarding flow? How many steps?
4. How many clickable elements compete for attention in the empty state?
5. Are high-frequency actions visually prioritized over low-frequency ones?
**Exploration targets**:
- Empty state components
- Onboarding dialogs/wizards
- Prompt input area and surrounding controls
- Quick start templates / suggested actions
- Mobile-specific navigation and input
**Output format**:
```
Step N: [Action] → [What user sees] → [Next possible actions: count]
```
## Dimension 3: Backend API Surface
**Goal**: Identify API bloat, inconsistency, and unused endpoints.
**Key questions**:
1. How many total API endpoints exist?
2. Which endpoints have no corresponding frontend call?
3. Are error responses consistent across all endpoints?
4. Is authentication applied consistently?
5. Are there duplicate endpoints serving similar purposes?
**Exploration targets**:
- Router files (API route definitions)
- Frontend API client / fetch calls
- Error handling middleware
- Authentication middleware
- API documentation / OpenAPI spec
**Output format**:
```
| Method | Path | Purpose | Has Frontend Consumer | Auth Required |
|--------|------|---------|----------------------|---------------|
```
## Dimension 4: Architecture & Module Structure
**Goal**: Identify coupling, duplication, and dead code.
**Key questions**:
1. Which modules have circular dependencies?
2. Where is the same pattern duplicated across 3+ files?
3. Which modules have unclear single responsibility?
4. Are there unused exports or dead code paths?
5. How deep is the import chain for core operations?
**Exploration targets**:
- Module `__init__.py` / `index.ts` files
- Import graphs (who imports whom)
- Utility files and shared helpers
- Configuration and factory patterns
## Dimension 5: Documentation & Config Consistency
**Goal**: Find gaps between claims and reality.
**Key questions**:
1. Does README list features that don't exist in code?
2. Are config file defaults consistent with code defaults?
3. Is there documentation for removed/renamed features?
4. Which modules have zero test coverage?
5. Are there TODO/FIXME/HACK comments in production code?
**Exploration targets**:
- README.md, CLAUDE.md, CONTRIBUTING.md
- Config files (YAML, JSON, .env)
- Test directories (coverage gaps)
- Source code comments (TODO/FIXME/HACK)
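The TODO/FIXME/HACK scan is mechanical enough to sketch directly (the helper name and the choice to suppress zero-count files are assumptions):

```shell
#!/usr/bin/env sh
# Count TODO/FIXME/HACK marker lines per file under a source directory,
# omitting files with no markers at all.
marker_counts() {
  grep -rcE 'TODO|FIXME|HACK' "$1" 2>/dev/null | grep -v ':0$'
}
```

The `file:count` output drops straight into the dimension's evidence table.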

references/codex_patterns.md

@@ -0,0 +1,82 @@
# Codex CLI Integration Patterns
How to use OpenAI Codex CLI for cross-model parallel analysis.
## Basic Invocation
```bash
codex -m o4-mini \
-c model_reasoning_effort="high" \
--full-auto \
"Your analysis prompt here"
```
## Flag Reference
| Flag | Purpose | Values |
|------|---------|--------|
| `-m` | Model selection | `o4-mini` (fast), `gpt-5.3-codex-spark` (deep) |
| `-c model_reasoning_effort` | Reasoning depth | `low`, `medium`, `high`, `xhigh` |
| `-c model_reasoning_summary_format` | Summary format | `experimental` (structured output) |
| `--full-auto` | Autonomous execution inside the sandbox (auto-approve) | (no value) |
| `--dangerously-bypass-approvals-and-sandbox` | Skip approvals AND disable the sandbox (use with care) | (no value) |
## Recommended Configurations
### Fast Scan (quick validation)
```bash
codex -m o4-mini \
-c model_reasoning_effort="medium" \
--full-auto \
"prompt"
```
### Deep Analysis (thorough investigation)
```bash
codex -m o4-mini \
-c model_reasoning_effort="xhigh" \
-c model_reasoning_summary_format="experimental" \
--full-auto \
"prompt"
```
## Parallel Execution Pattern
Launch multiple Codex analyses in the background using the Bash tool with `run_in_background: true`:
```bash
# Dimension 1: Frontend
codex -m o4-mini -c model_reasoning_effort="high" --full-auto \
"Analyze frontend navigation: count interactive elements, find duplicate entry points, assess cognitive load for new users. Give file paths and counts."
# Dimension 2: User Journey
codex -m o4-mini -c model_reasoning_effort="high" --full-auto \
"Analyze new user experience: what does empty state show? How many steps to first action? Count clickable elements competing for attention. Give file paths."
# Dimension 3: Backend API
codex -m o4-mini -c model_reasoning_effort="high" --full-auto \
"List all API endpoints. Identify unused endpoints with no frontend consumer. Check error handling consistency. Give router file paths."
```
## Output Handling
Codex outputs to stdout. When run in background:
1. Use Bash `run_in_background: true` to launch
2. Use `BashOutput` to retrieve results when done
3. Parse the text output for findings
## Cross-Model Value
The primary value of Codex in this workflow is **independent perspective**:
- Different training data may surface different patterns
- Different reasoning approach may catch what Claude misses
- Agreement across models = high confidence
- Disagreement = worth investigating manually
## Limitations
- Codex CLI must be installed and configured (`codex` command available)
- Requires OpenAI API key configured
- No MCP server access (only filesystem tools)
- Output is unstructured text (needs parsing)
- Rate limits apply per OpenAI account

references/synthesis_methodology.md

@@ -0,0 +1,68 @@
# Synthesis Methodology
How to weight, merge, and validate findings from multiple parallel agents.
## Multi-Agent Synthesis Framework
### Step 1: Collect Raw Findings
Wait for all agents to complete. For each agent, extract:
- **Quantitative data**: counts, measurements, lists
- **Qualitative assessments**: good/bad/unclear judgments
- **Evidence**: file paths, line numbers, code snippets
### Step 2: Cross-Validation Matrix
Create a matrix comparing findings across agents:
```
| Finding | Agent A | Agent B | Codex | Confidence |
|---------|---------|---------|-------|------------|
| "57 interactive elements on first screen" | 57 | 54 | 61 | HIGH (3/3 agree on magnitude) |
| "Skills has 3 entry points" | 3 | 3 | 2 | HIGH (2/3 exact match) |
| "Risk pages should be removed" | Yes | - | No | LOW (disagreement, investigate) |
```
**Confidence levels**:
- **HIGH**: 2+ agents agree (exact or same magnitude)
- **MEDIUM**: 1 agent found, others didn't look
- **LOW**: Agents disagree — requires manual investigation
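The three confidence rules above can be sketched as a classifier over three agent answers, with `-` standing for "agent did not look" (a sketch; it counts exact matches only, so "same magnitude" agreement still needs human judgment):

```shell
#!/usr/bin/env sh
# Classify confidence from three agent answers ("-" = did not look).
# HIGH: at least one exact-match pair among agents that looked.
# MEDIUM: exactly one agent looked. LOW: otherwise (disagreement).
confidence() {
  a=$1; b=$2; c=$3
  agree=0
  [ "$a" != "-" ] && [ "$a" = "$b" ] && agree=$((agree + 1))
  [ "$b" != "-" ] && [ "$b" = "$c" ] && agree=$((agree + 1))
  [ "$a" != "-" ] && [ "$a" = "$c" ] && agree=$((agree + 1))
  looked=0
  for v in "$a" "$b" "$c"; do
    [ "$v" != "-" ] && looked=$((looked + 1))
  done
  if [ "$agree" -ge 1 ] && [ "$looked" -ge 2 ]; then echo HIGH
  elif [ "$looked" -eq 1 ]; then echo MEDIUM
  else echo LOW
  fi
}
```

Applied to the matrix above: `confidence 3 3 2` is HIGH (two of three exact-match), `confidence Yes - No` is LOW (disagreement), and a single-agent finding like `confidence 42 - -` is MEDIUM.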
### Step 3: Disagreement Resolution
When agents disagree:
1. Check if they analyzed different files/scopes
2. Check if one agent missed context (e.g., conditional rendering)
3. If genuine disagreement, note both perspectives in report
4. Codex-only findings are "different model perspective" — valuable but need validation
### Step 4: Priority Assignment
**P0 (Critical)**: Issues that prevent a new user from completing basic tasks
- Examples: broken onboarding, missing error messages, dead navigation links
**P1 (High)**: Issues that significantly increase cognitive load or confusion
- Examples: duplicate entry points, information overload, unclear primary action
**P2 (Medium)**: Issues worth addressing but not blocking launch
- Examples: unused API endpoints, minor inconsistencies, missing edge case handling
### Step 5: Report Generation
Structure the report for actionability:
1. **Executive Summary** (2-3 sentences, the "so what")
2. **Quantified Metrics** (hard numbers, no adjectives)
3. **P0 Issues** (with specific file:line references)
4. **P1 Issues** (with suggested fixes)
5. **P2 Issues** (backlog items)
6. **Cross-Model Insights** (findings unique to one model)
7. **Competitive Position** (if compare scope was used)
## Weighting Rules
- Quantitative findings (counts, measurements) > Qualitative judgments
- Code-evidenced findings > Assumption-based findings
- Multi-agent agreement > Single-agent finding
- User-facing issues > Internal code quality issues
- Findings with clear fix path > Vague "should improve" suggestions