- Correct source accessibility: distinguish circular verification (forbidden) from exclusive information advantage (encouraged) - Add Counter-Review Team with 5 specialized agents (claim-validator, source-diversity-checker, recency-validator, contradiction-finder, counter-review-coordinator) - Add Enterprise Research Mode: 6-dimension data collection framework with SWOT, competitive barrier, and risk matrix analysis - Update version to 2.4.0 - Add comprehensive reference docs: - source_accessibility_policy.md - V6_1_improvements.md - counter_review_team_guide.md - enterprise_analysis_frameworks.md - enterprise_quality_checklist.md - enterprise_research_methodology.md - quality_gates.md - report_template_v6.md - research_notes_format.md - subagent_prompt.md Based on "深度推理" case study methodology lessons learned. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3.7 KiB
3.7 KiB
Quality Gates V6
Gate 1: Task Notes Quality (after P2)
| Check | Standard | Lightweight | Fix |
|---|---|---|---|
| All tasks completed | 100% | 100% | Re-dispatch failed tasks |
| Sources per task | >= 2 | >= 1 | Run additional searches |
| Findings per task | >= 3 | >= 2 | Deepen search or fetch more |
| DEEP tasks have Deep Read Notes | 100% | 100% | Fetch and read top source |
| All source URLs from actual search | 100% | 100% | Remove any invented URL |
Gate 2: Citation Registry (after P3)
| Check | Standard | Lightweight | Fix |
|---|---|---|---|
| Total approved sources | >= 12 | >= 6 | Flag thin areas for P6 |
| Unique domains | >= 5 | >= 3 | Diversify in re-search |
| Max single-source share | <= 25% | <= 30% | Find alternatives |
| Official source coverage | >= 30% for standard | >= 20% for lightweight | Add official sources |
| Source-type balance | official + academic + secondary at least 2 types | same | Fill missing type |
| Dropped sources listed | All | All | Must be explicit |
| No duplicate URLs | 0 duplicates | 0 | Merge during P3 |
Gate 3: Draft Quality (after P5)
| Check | Standard | Lightweight | Fix |
|---|---|---|---|
| Every [n] in registry | 100% | 100% | Remove or fix |
| No dropped source cited | 0 violations | 0 | Remove immediately |
| Citation density | >= 1 per 200 words | >= 1 per 300 words | Add citations |
| Every section has confidence marker | 100% | 100% | Add missing |
| High-confidence claims backed by official source | 100% | 100% | Downgrade or re-source |
| Counter-claim recorded for major sections | 100% | 70% | Add opposing interpretation |
| Total word count | 3000-8000 | 2000-4000 | Adjust scope |
Gate 4: Notes Traceability (after P6)
| Check | Threshold | Fix |
|---|---|---|
| Every specific claim traceable to a task note finding | 100% | 100% |
| Every statistic/number appears in some task note | 100% | 100% |
| No claim contradicts a task note | 0 contradictions | 0 |
| Claims with recency sensitivity include source date and AS_OF | 100% | 100% |
| P6 found >= 3 issues | Must | Re-examine harder if 0 found |
Gate 5: Verification (after P7)
| Check | Threshold | Fix |
|---|---|---|
| Registry cross-check: all [n] valid | 100% | 100% |
| Spot-check: 5+ claims traced to notes | >= 4/5 pass | Fix failing claims |
| No dropped source resurrected | 0 | Remove immediately |
| Source concentration check for key claims | None > 25% | diversify |
Anti-Hallucination Patterns
| Pattern | Where to detect | Fix |
|---|---|---|
| URL not from any subagent search | P7 registry check | Remove citation |
| Claim not in any task note | P6 traceability check | Remove or mark [unverified] |
| Number more precise than source | P6 ("73.2%" when note says "about 70%") | Use note's precision |
| Source authority inflated | P3 registry building | Re-score from notes |
| Source type mismatched to claim | P3 + P6 | Reclassify or replace source |
| "Studies show..." without naming study | P6 | Name specific source or remove |
| Dropped source reappears | P7 cross-check | Remove immediately |
| Subagent invented a URL | Gate 1 (lead verifies subagent notes) | Remove from notes before P3 |
Chinese-Specific Patterns
| Pattern | Fix |
|---|---|
| Fake CNKI URL format | Remove, note gap |
| "某专家表示" without name/institution | Name or remove |
| "据统计" without data source | Add source or qualitative language |
| Fabricated institution report | Verify existence or remove |
| 旧模型信息未标注 AS_OF | 降级置信度并重搜 |