Replace \"深度推理(上海)科技有限公司\" with \"字节跳动子公司\" as the case study example to avoid exposing user's own company info. Also update .gitignore to exclude: - deep-research-output/ (contains sensitive research data) - recovered_deep_research/ - .opencli/ - douban-skill/ (work-in-progress) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
5.7 KiB
Source Accessibility Policy
Version: V6.1
Purpose: Distinguish between legitimate exclusive information advantages and circular verification traps
The Problem
In the "字节跳动" case study, we made a methodology error:
What happened:
- User asked to research their own company: "字节跳动某子公司"
- We accessed user's own Spaceship account (their private registrar)
- Found 25 domains the user already owned
- Reported back: "The company owns these 25 domains"
Why this is wrong:
- This is circular reasoning, not research
- User asked us to discover information about their company
- We instead queried their private data and presented it as findings
- It's like looking in someone's wallet to tell them how much money they have
The real question: Can an external investigator confirm this company exists? Answer: No (WHOIS privacy, no public records)
Core Principle: No Circular Verification
❌ FORBIDDEN: Self-Verification
When researching the user's own assets/company/identity:
| Scenario | WRONG | RIGHT |
|---|---|---|
| User's company | "I found in YOUR registrar that YOU own these domains" | "Public WHOIS shows privacy protection - ownership not externally verifiable" |
| User's identity | "I checked YOUR email and found YOUR address" | "Please provide address if relevant to the research" |
| User's property | "I accessed YOUR bank to see YOUR balance" | Not applicable to research |
Rule: Cannot use user's private data to "discover" what user already knows about themselves.
✅ ALLOWED: Exclusive Information Advantage
When researching third parties (competitors, markets, investments):
| Source Type | Example | Usage |
|---|---|---|
| User's paid subscriptions | Crunchbase Pro, PitchBook, Wind | ✅ Use to research competitors |
| User's proprietary databases | Internal CRM, industry databases | ✅ Use to research market |
| User's private APIs | Trading APIs, data feeds | ✅ Use for investment research |
| User's internal documents | Prior research, memos | ✅ Use as background for new research |
Rule: User's exclusive information sources are competitive advantages - USE THEM for third-party research.
The Distinction
Research Target: 字节跳动某子公司
├─ Is this the user's own company? → YES
├─ Can we use user's private data about it? → NO (circular)
└─ Must rely on: Public sources only
Research Target: 竞争对手公司 X
├─ Is this the user's own company? → NO
├─ Can we use user's Crunchbase Pro? → YES (competitive advantage)
└─ Can use: Public + User's exclusive sources
Corrected Methodology
When Researching User's Own Company
Approach: External investigator perspective
User: "Research my company 字节跳动子公司"
CORRECT RESPONSE:
1. Search public sources (WHOIS, web, news)
2. Find: Website placeholder, privacy-protected WHOIS, no news
3. Report: "From public perspective: minimal footprint, cannot verify ownership"
4. Gap: "Internal data not accessible to external investigators"
INCORRECT RESPONSE:
1. Access user's Spaceship account
2. Find: 25 domains user already knows they own
3. Report: "The company owns 25 domains" (user already knows this!)
When User Provides Exclusive Sources
Approach: Leverage competitive advantage
User: "Research competitor X, I have Crunchbase Pro"
User: "Here's my API key: xxx"
CORRECT RESPONSE:
1. Use provided Crunchbase Pro API
2. Find: Funding history, team info not in public sources
3. Report: "Per Crunchbase Pro [exclusive source], X raised $Y in Series Z"
4. Cite: Accessibility: exclusive (user-provided)
Source Classification
public ✅
- Available to any external researcher
- Examples: Public websites, news, SEC filings
exclusive-user-provided ✅ (FOR THIRD-PARTY RESEARCH)
- User's paid subscriptions, private APIs, internal databases
- USE for: Researching competitors, markets, investments
- DO NOT USE for: Verifying user's own assets/identity
private-user-owned ❌ (FOR SELF-RESEARCH)
- User's own accounts, emails, personal data
- DO NOT USE: Creates circular verification
Information Black Box Protocol
When an entity (including user's own company) has no public footprint:
-
Document what external researcher would find:
- WHOIS: Privacy protected
- Web search: No results
- News: No coverage
-
Report honestly:
Public sources found: 0 External visibility: None Verdict: Cannot verify from public perspective Note: User may have private information not available to external investigators -
Do NOT:
- Use user's private data to "fill gaps"
- Present user's private knowledge as "discovered evidence"
Checklist
When starting research, determine:
-
Who is the research target?
- User's own company/asset? → Public sources ONLY
- Third party? → Can use user's exclusive sources
-
Am I discovering or querying?
- Discovering new info? → Research
- Querying user's own data? → Circular, not allowed
-
Would this finding surprise the user?
- Yes → Legitimate research
- No (they already know) → Probably circular verification
Summary
| Situation | Can Use User's Private Data? | Why? |
|---|---|---|
| Research user's own company | ❌ NO | Circular verification |
| Research competitor using user's Crunchbase | ✅ YES | Competitive advantage |
| Research market using user's database | ✅ YES | Exclusive information |
| "Discover" user's own domain ownership | ❌ NO | User already knows this |