Synthesis Methodology
How to weight, merge, and validate findings from multiple parallel agents.
Multi-Agent Synthesis Framework
Step 1: Collect Raw Findings
Wait for all agents to complete. For each agent, extract:
- Quantitative data: counts, measurements, lists
- Qualitative assessments: good/bad/unclear judgments
- Evidence: file paths, line numbers, code snippets
Step 2: Cross-Validation Matrix
Create a matrix comparing findings across agents:
| Finding | Agent A | Agent B | Codex | Confidence |
|---------|---------|---------|-------|------------|
| "57 interactive elements on first screen" | 57 | 54 | 61 | HIGH (3/3 agree on magnitude) |
| "Skills has 3 entry points" | 3 | 3 | 2 | HIGH (2/3 exact match) |
| "Risk pages should be removed" | Yes | - | No | LOW (disagreement, investigate) |
Confidence levels:
- HIGH: 2+ agents agree (exact or same magnitude)
- MEDIUM: only 1 agent reported it; the others did not examine that area
- LOW: Agents disagree — requires manual investigation
Step 3: Disagreement Resolution
When agents disagree:
- Check if they analyzed different files/scopes
- Check if one agent missed context (e.g., conditional rendering)
- If genuine disagreement, note both perspectives in report
- Codex-only findings reflect a different model's perspective — valuable, but validate before acting on them
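The resolution checklist can be expressed as a triage function. The scope strings and parameters here are hypothetical, purely to show the decision order:

```python
def triage_disagreement(
    a_scope: str,
    b_scope: str,
    a_value: object,
    b_value: object,
    codex_only: bool = False,
) -> str:
    """Walk the disagreement checklist in order and return the resolution."""
    if codex_only:
        return "different-model perspective: validate before acting"
    if a_scope != b_scope:
        return "scope mismatch: agents analyzed different files"
    if a_value == b_value:
        return "no disagreement"
    # Same scope, different answers: check for missed context, then record both.
    return "genuine disagreement: note both perspectives in the report"
```

The ordering matters: a scope mismatch explains most apparent disagreements cheaply, so check it before escalating to manual investigation.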
Step 4: Priority Assignment
P0 (Critical): Issues that prevent a new user from completing basic tasks
- Examples: broken onboarding, missing error messages, dead navigation links
P1 (High): Issues that significantly increase cognitive load or confusion
- Examples: duplicate entry points, information overload, unclear primary action
P2 (Medium): Issues worth addressing but not blocking launch
- Examples: unused API endpoints, minor inconsistencies, missing edge case handling
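A minimal sketch of the priority ladder, using a keyword heuristic as a first pass. The keywords are illustrative assumptions — real triage is a judgment call against the P0/P1/P2 definitions above:

```python
from enum import Enum

class Priority(Enum):
    P0 = "prevents a new user from completing basic tasks"
    P1 = "significantly increases cognitive load or confusion"
    P2 = "worth addressing but not blocking launch"

def assign_priority(issue: str) -> Priority:
    """First-pass triage: match issue text against example categories."""
    blocking = ("broken", "missing error", "dead navigation", "dead link")
    confusing = ("duplicate", "overload", "unclear")
    text = issue.lower()
    if any(k in text for k in blocking):
        return Priority.P0
    if any(k in text for k in confusing):
        return Priority.P1
    return Priority.P2
```

Defaulting unmatched issues to P2 keeps the backlog honest: anything not demonstrably blocking or confusing waits.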
Step 5: Report Generation
Structure the report for actionability:
- Executive Summary (2-3 sentences, the "so what")
- Quantified Metrics (hard numbers, no adjectives)
- P0 Issues (with specific file:line references)
- P1 Issues (with suggested fixes)
- P2 Issues (backlog items)
- Cross-Model Insights (findings unique to one model)
- Competitive Position (if compare scope was used)
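The report structure above can be emitted as a skeleton so every synthesis run starts from the same outline. Section names come straight from the list; the `include_competitive` flag and placeholder text are assumptions:

```python
SECTIONS = [
    "Executive Summary",
    "Quantified Metrics",
    "P0 Issues",
    "P1 Issues",
    "P2 Issues",
    "Cross-Model Insights",
    "Competitive Position",  # only when the compare scope was used
]

def report_skeleton(include_competitive: bool = False) -> str:
    """Return a markdown outline with one heading per report section."""
    sections = SECTIONS if include_competitive else SECTIONS[:-1]
    return "\n\n".join(f"## {name}\n\n(TODO)" for name in sections)
```

Generating the outline up front keeps agents from reordering sections or burying the executive summary.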
Weighting Rules
- Quantitative findings (counts, measurements) > Qualitative judgments
- Code-evidenced findings > Assumption-based findings
- Multi-agent agreement > Single-agent finding
- User-facing issues > Internal code quality issues
- Findings with clear fix path > Vague "should improve" suggestions
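The weighting rules can be turned into an ordering score for sorting findings. The individual weights below are illustrative assumptions, not calibrated values; only their relative ordering reflects the rules:

```python
def finding_score(
    quantitative: bool,
    code_evidenced: bool,
    agents_agreeing: int,
    user_facing: bool,
    has_fix_path: bool,
) -> int:
    """Higher score = report the finding earlier."""
    score = 0
    score += 2 if quantitative else 0    # counts beat adjectives
    score += 2 if code_evidenced else 0  # evidence beats assumption
    score += min(agents_agreeing, 3)     # multi-agent agreement, capped
    score += 2 if user_facing else 0     # user impact beats internal quality
    score += 1 if has_fix_path else 0    # actionable beats "should improve"
    return score
```

For example, a code-evidenced count that three agents agree on and that affects users outranks a single-agent, assumption-based style suggestion by a wide margin.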