feat(history-finder): Add claude-code-history-files-finder skill
Add new skill for finding and recovering content from Claude Code session history files (.claude/projects/).

Features:
- Search sessions by keywords across project history
- Recover deleted files from Write tool calls
- Analyze session statistics and tool usage
- Track file evolution across multiple sessions

Best practice improvements applied:
- Third-person description in frontmatter
- Imperative writing style throughout
- Progressive disclosure (workflows in references/)
- No content duplication between SKILL.md and references
- Proper exception handling in scripts
- Documented magic numbers

Marketplace integration:
- Updated marketplace.json (v1.13.0, 20 plugins)
- Updated README.md badges, skill section, use cases
- Updated README.zh-CN.md with Chinese translations
- Updated CLAUDE.md skill count and available skills list
- Updated CHANGELOG.md with v1.13.0 entry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -0,0 +1,285 @@
# Claude Code Session File Format

## Overview

Claude Code stores conversation history in JSONL (JSON Lines) format, where each line is a complete JSON object representing a message or event in the conversation.

## File Locations

### Session Files

```
~/.claude/projects/<normalized-project-path>/<session-id>.jsonl
```

**Path normalization**: Project paths are converted by replacing `/` with `-`.

Example:
- Project: `/Users/username/Workspace/js/myproject`
- Directory: `~/.claude/projects/-Users-username-Workspace-js-myproject/`

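The normalization rule can be sketched as a small helper. The names `session_dir_for` and `claude_home` are hypothetical, and only the `/` to `-` replacement described above is applied; any other character handling is not covered here.

```python
from pathlib import Path


def session_dir_for(project_path, claude_home=None):
    """Return the directory expected to hold a project's session files."""
    # claude_home is a hypothetical override; the default is ~/.claude.
    base = Path(claude_home) if claude_home else Path.home() / ".claude"
    # Normalization rule from this document: every "/" becomes "-".
    normalized = str(project_path).replace("/", "-")
    return base / "projects" / normalized
```

Calling `session_dir_for("/Users/username/Workspace/js/myproject")` yields a directory whose last component is `-Users-username-Workspace-js-myproject`, matching the example above.
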
### File Types

| Pattern | Type | Description |
|---------|------|-------------|
| `<uuid>.jsonl` | Main session | User conversation sessions |
| `agent-<id>.jsonl` | Agent session | Sub-agent execution logs |

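A minimal sketch for telling the two file types apart by name. The function name `classify_session_file` is hypothetical, and the check assumes main session names parse as standard UUIDs.

```python
import uuid
from pathlib import Path


def classify_session_file(path):
    """Return "agent", "main", or None based on the file name pattern."""
    p = Path(path)
    if p.suffix != ".jsonl":
        return None
    if p.stem.startswith("agent-"):
        return "agent"
    try:
        uuid.UUID(p.stem)  # assumption: main sessions are named <uuid>.jsonl
    except ValueError:
        return None
    return "main"
```
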
## JSON Structure

### Message Object

Each line in a session file generally follows this structure (`|` marks alternative values, so the block is a schema sketch rather than literal JSON):

```json
{
  "role": "user" | "assistant",
  "message": {
    "role": "user" | "assistant",
    "content": [...]
  },
  "timestamp": "2025-11-26T00:00:00.000Z",
  "uuid": "message-uuid",
  "parentUuid": "parent-message-uuid",
  "sessionId": "session-uuid"
}
```

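The `uuid` and `parentUuid` fields suggest that messages form a chain. The sketch below walks that chain, assuming `parentUuid` holds the `uuid` of the preceding message (this document does not state that explicitly). Both function names are hypothetical.

```python
import json


def index_by_uuid(lines):
    """Map each message's uuid to its parsed record."""
    index = {}
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if isinstance(record, dict) and "uuid" in record:
            index[record["uuid"]] = record
    return index


def thread_to_root(index, uuid):
    """Follow parentUuid links from one message back to the first message."""
    chain = []
    seen = set()  # guards against accidental cycles
    while uuid in index and uuid not in seen:
        seen.add(uuid)
        chain.append(index[uuid])
        uuid = index[uuid].get("parentUuid")
    return list(reversed(chain))  # oldest message first
```
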
### Content Types

The `content` array contains different types of content blocks:

#### Text Content

```json
{
  "type": "text",
  "text": "Message text content"
}
```

#### Tool Use (Write)

```json
{
  "type": "tool_use",
  "name": "Write",
  "input": {
    "file_path": "/absolute/path/to/file.js",
    "content": "File content here..."
  }
}
```

#### Tool Use (Edit)

```json
{
  "type": "tool_use",
  "name": "Edit",
  "input": {
    "file_path": "/absolute/path/to/file.js",
    "old_string": "Original text",
    "new_string": "Replacement text",
    "replace_all": false
  }
}
```

#### Tool Use (Read)

```json
{
  "type": "tool_use",
  "name": "Read",
  "input": {
    "file_path": "/absolute/path/to/file.js",
    "offset": 0,
    "limit": 100
  }
}
```

#### Tool Use (Bash)

```json
{
  "type": "tool_use",
  "name": "Bash",
  "input": {
    "command": "ls -la",
    "description": "List files"
  }
}
```

### Tool Result

```json
{
  "type": "tool_result",
  "tool_use_id": "tool-use-uuid",
  "content": "Result content",
  "is_error": false
}
```

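To connect a result to the call that produced it, match `tool_use_id` against the originating block. The sketch below assumes each `tool_use` block carries an `id` field that the result echoes; that field is not shown in the samples above, so treat it as an assumption. The name `pair_tool_results` is hypothetical.

```python
def pair_tool_results(blocks):
    """Match tool_result blocks to the tool_use blocks that triggered them."""
    uses = {}
    pairs = []
    for block in blocks:
        if block.get("type") == "tool_use":
            # Assumption: tool_use blocks expose an "id" field.
            uses[block.get("id")] = block
        elif block.get("type") == "tool_result":
            use = uses.get(block.get("tool_use_id"))
            if use is not None:
                pairs.append((use, block))
    return pairs
```

Pairing makes it possible to skip calls whose result has `is_error` set to `true`.
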
## Common Extraction Patterns

### Finding Write Operations

Look for content blocks in assistant messages with `type: "tool_use"` and `name: "Write"`:

```python
if item.get("type") == "tool_use" and item.get("name") == "Write":
    file_path = item["input"]["file_path"]
    content = item["input"]["content"]
```

### Finding Edit Operations

```python
if item.get("type") == "tool_use" and item.get("name") == "Edit":
    file_path = item["input"]["file_path"]
    old_string = item["input"]["old_string"]
    new_string = item["input"]["new_string"]
```

### Extracting Text Content

```python
for item in message_content:
    if item.get("type") == "text":
        text = item.get("text", "")
```

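The fragments above can be combined into one generator that reads raw JSONL lines and yields the `input` of every matching tool call. This is a sketch, not the bundled scripts' implementation; the name `iter_tool_inputs` is hypothetical, and the handling of plain-string content is a defensive assumption.

```python
import json


def iter_tool_inputs(lines, tool_name):
    """Yield the input dict of each tool_use block named tool_name."""
    for line in lines:
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if not isinstance(data, dict):
            continue
        message = data.get("message")
        nested = message.get("content") if isinstance(message, dict) else None
        content = data.get("content") or nested or []
        if not isinstance(content, list):
            continue  # assumption: content may also be a plain string
        for item in content:
            if (isinstance(item, dict)
                    and item.get("type") == "tool_use"
                    and item.get("name") == tool_name):
                yield item.get("input", {})
```

Pass an open file object as `lines` to stream a session without loading it into memory.
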
## Field Locations

Due to schema variations, some fields may appear in different locations:

### Role Field

```python
role = data.get("role") or data.get("message", {}).get("role")
```

### Content Field

```python
content = data.get("content") or data.get("message", {}).get("content", [])
```

### Timestamp Field

```python
timestamp = data.get("timestamp", "")
```

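When timestamps need to be compared or formatted, the raw string can be converted to a `datetime`. The sketch assumes the ISO 8601 form with a trailing `Z` shown in the message object sample; `parse_timestamp` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_timestamp(data):
    """Return the record's timestamp as an aware datetime, or None."""
    raw = data.get("timestamp", "")
    if not isinstance(raw, str) or not raw:
        return None
    try:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        return datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return None
```
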
## Common Use Cases

### Recover Deleted Files

1. Search for `Write` tool calls with matching file path
2. Extract `input.content` from latest occurrence
3. Save to disk with original filename

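The three steps above can be sketched as follows. The helpers `latest_write` and `save_recovered` are hypothetical, and `write_inputs` is assumed to be the `input` dicts of `Write` calls in file order (oldest first).

```python
from pathlib import Path


def latest_write(write_inputs, target_path):
    """Return the content of the last Write call for target_path, or None."""
    latest = None
    for tool_input in write_inputs:
        if tool_input.get("file_path") == target_path:
            latest = tool_input.get("content", "")  # later calls overwrite
    return latest


def save_recovered(content, target_path, out_dir):
    """Write recovered content to out_dir under the original file name."""
    out_path = Path(out_dir) / Path(target_path).name
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(content, encoding="utf-8")
    return out_path
```

Saving under the bare file name means two recovered files with the same name in different directories would collide; keep part of the original path if that matters.
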
### Track File Changes

1. Find all `Edit` and `Write` operations for a file
2. Build chronological list of changes
3. Reconstruct file history

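A sketch of these steps, working on already-parsed records. Both function names are hypothetical. The replay assumes an `Edit` replaces the first occurrence of `old_string` unless `replace_all` is set, and it does not filter out calls that failed.

```python
def file_history(records, target_path):
    """Return (timestamp, tool, input) tuples for one file, oldest first."""
    history = []
    for data in records:
        message = data.get("message")
        nested = message.get("content") if isinstance(message, dict) else None
        content = data.get("content") or nested or []
        if not isinstance(content, list):
            continue
        for item in content:
            if not isinstance(item, dict) or item.get("type") != "tool_use":
                continue
            if item.get("name") not in ("Write", "Edit"):
                continue
            tool_input = item.get("input", {})
            if tool_input.get("file_path") == target_path:
                history.append((data.get("timestamp", ""), item["name"], tool_input))
    # Same-format ISO 8601 UTC strings sort chronologically as plain text.
    history.sort(key=lambda entry: entry[0])
    return history


def replay(history):
    """Rebuild the file text by applying each operation in order."""
    text = ""
    for _, tool, tool_input in history:
        if tool == "Write":
            text = tool_input.get("content", "")
        else:  # Edit
            old = tool_input.get("old_string", "")
            new = tool_input.get("new_string", "")
            count = -1 if tool_input.get("replace_all") else 1
            text = text.replace(old, new, count)
    return text
```
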
### Search Conversations

1. Extract all `text` content from messages
2. Search for keywords or patterns
3. Return matching sessions

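One way to sketch the search, assuming the text blocks have already been collected per session. The name `rank_sessions` and the `texts_by_session` mapping are hypothetical, and ranking by raw occurrence count is an illustrative choice rather than the bundled script's scoring.

```python
def rank_sessions(texts_by_session, keywords):
    """Return [(session_id, hits)] for sessions with any match, best first."""
    lowered = [k.lower() for k in keywords]
    ranked = []
    for session_id, texts in texts_by_session.items():
        haystack = "\n".join(texts).lower()
        hits = sum(haystack.count(k) for k in lowered)
        if hits:
            ranked.append((session_id, hits))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```
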
### Analyze Tool Usage

1. Count occurrences of each tool type
2. Track which files were accessed
3. Generate usage statistics

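These steps map naturally onto `collections.Counter`. The sketch takes an iterable of content blocks; `tool_stats` is a hypothetical name, and only tools whose input has a `file_path` contribute to the file set.

```python
from collections import Counter


def tool_stats(blocks):
    """Return (Counter of tool names, set of file paths touched)."""
    tool_counts = Counter()
    files = set()
    for block in blocks:
        if block.get("type") != "tool_use":
            continue
        tool_counts[block.get("name", "unknown")] += 1
        file_path = block.get("input", {}).get("file_path")
        if file_path:
            files.add(file_path)
    return tool_counts, files
```

`tool_counts.most_common()` then gives the tools ordered by frequency.
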
## Edge Cases

### Empty Content

Some messages may have empty content arrays:

```python
content = data.get("content", [])
if not content:
    continue
```

### Missing Fields

Always use `.get()` with defaults:

```python
file_path = item.get("input", {}).get("file_path", "")
```

### JSON Decode Errors

Session files may contain malformed lines:

```python
try:
    data = json.loads(line)
except json.JSONDecodeError:
    continue  # Skip malformed lines
```

### Large Files

Session files can be very large (>100MB). Process line-by-line:

```python
with open(session_file, 'r', encoding='utf-8') as f:
    for line in f:  # Streaming, not f.read()
        process_line(line)
```

## Performance Tips

### Memory Efficiency

- Process files line-by-line (streaming)
- Don't load entire file into memory
- Use generators for large result sets

### Search Optimization

- Exit early once the keyword match threshold is met
- Case-insensitive search: normalize once
- Use `in` operator for substring matching

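A sketch combining the three search tips. The name `meets_threshold` and the way hits are counted (one per keyword per text) are assumptions for illustration.

```python
def meets_threshold(texts, keywords, threshold):
    """Return True as soon as `threshold` keyword hits have been seen."""
    lowered = [k.lower() for k in keywords]  # normalize keywords once
    hits = 0
    for text in texts:
        haystack = text.lower()  # normalize each text once
        for keyword in lowered:
            if keyword in haystack:  # plain substring test
                hits += 1
                if hits >= threshold:
                    return True  # early exit: remaining texts are not read
    return False
```

Because `texts` is only iterated, passing a generator means later messages are never parsed once the threshold is reached.
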
### Deduplication

When recovering files, keep latest version only:

```python
files_by_path = {}
for call in write_calls:  # write_calls must be in chronological order
    file_path = call.get("input", {}).get("file_path", "")
    files_by_path[file_path] = call  # Overwrites earlier versions
```

## Security Considerations

### Personal Information

Session files may contain:
- Absolute file paths with usernames
- API keys or credentials in code
- Company-specific information
- Private conversations

### Safe Sharing

Before sharing extracted content:
1. Remove absolute paths
2. Redact sensitive information
3. Use placeholders for usernames
4. Verify no credentials present

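Username placeholders can be applied with a regular expression. This sketch only covers `/Users/<name>` and `/home/<name>` style paths, which is an assumption about where usernames appear; it does not detect credentials, so a manual review is still needed. The name `redact_usernames` is hypothetical.

```python
import re

# Matches the home-directory prefix and the user name that follows it.
_HOME_PATTERN = re.compile(r"(/(?:Users|home)/)[^/\s]+")


def redact_usernames(text, placeholder="<user>"):
    """Replace the user name in home-directory paths with a placeholder."""
    return _HOME_PATTERN.sub(lambda m: m.group(1) + placeholder, text)
```
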
@@ -0,0 +1,88 @@
# Workflow Examples

Detailed workflow examples for common session history recovery scenarios.

## Recover Files Deleted in Cleanup

**Scenario**: Files were deleted during code review, and specific components need to be recovered.

```bash
# 1. Find sessions mentioning the deleted files
python3 scripts/analyze_sessions.py search /path/to/project \
  DeletedComponent ModelScreen RemovedFeature

# 2. Recover content from most relevant session
python3 scripts/recover_content.py ~/.claude/projects/.../session-id.jsonl \
  -k DeletedComponent ModelScreen \
  -o ./recovered/

# 3. Review recovered files
ls -lh ./recovered/
```

## Track File Evolution Across Sessions

**Scenario**: Understand how a file changed over multiple sessions.

```bash
# 1. Find sessions that modified the file
python3 scripts/analyze_sessions.py search /path/to/project \
  "componentName.jsx"

# 2. Analyze each session's file operations
for session in session1.jsonl session2.jsonl session3.jsonl; do
  python3 scripts/analyze_sessions.py stats "$session" --show-files | \
    grep "componentName.jsx"
done

# 3. Recover all versions
python3 scripts/recover_content.py session1.jsonl -k componentName -o ./v1/
python3 scripts/recover_content.py session2.jsonl -k componentName -o ./v2/
python3 scripts/recover_content.py session3.jsonl -k componentName -o ./v3/

# 4. Compare versions
diff ./v1/componentName.jsx ./v2/componentName.jsx
```

## Find Session with Specific Implementation

**Scenario**: You remember implementing a feature but cannot find the session that contains it.

```bash
# Search for distinctive keywords from that implementation
python3 scripts/analyze_sessions.py search /path/to/project \
  "useModelStatus" "downloadProgress" "ModelScope"

# Review top match
python3 scripts/analyze_sessions.py stats <top-result-session.jsonl>
```

## Batch Recovery Across Multiple Sessions

**Scenario**: Recover files containing a keyword from all matching sessions.

```bash
# Find relevant sessions
sessions=$(python3 scripts/analyze_sessions.py search /path/to/project \
  keyword --limit 999 | grep "Path:" | awk '{print $2}')

# Recover from each session (relies on word splitting; paths must not contain spaces)
for session in $sessions; do
  output_dir="./recovery_$(basename "$session" .jsonl)"
  python3 scripts/recover_content.py "$session" -k keyword -o "$output_dir"
done
```

## Custom Extraction from Raw JSONL

For extraction needs not covered by bundled scripts:

```python
import json

with open('session.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip malformed lines
        # Custom extraction logic
        # See references/session_file_format.md for structure
```