claude-code-skills-reference/claude-code-history-files-finder/references/session_file_format.md
daymade 20cc442ec4 feat(history-finder): Add claude-code-history-files-finder skill
Add new skill for finding and recovering content from Claude Code
session history files (.claude/projects/).

Features:
- Search sessions by keywords across project history
- Recover deleted files from Write tool calls
- Analyze session statistics and tool usage
- Track file evolution across multiple sessions

Best practice improvements applied:
- Third-person description in frontmatter
- Imperative writing style throughout
- Progressive disclosure (workflows in references/)
- No content duplication between SKILL.md and references
- Proper exception handling in scripts
- Documented magic numbers

Marketplace integration:
- Updated marketplace.json (v1.13.0, 20 plugins)
- Updated README.md badges, skill section, use cases
- Updated README.zh-CN.md with Chinese translations
- Updated CLAUDE.md skill count and available skills list
- Updated CHANGELOG.md with v1.13.0 entry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-09 16:21:19 +08:00


Claude Code Session File Format

Overview

Claude Code stores conversation history in JSONL (JSON Lines) format, where each line is a complete JSON object representing a message or event in the conversation.

File Locations

Session Files

~/.claude/projects/<normalized-project-path>/<session-id>.jsonl

Path normalization: the project's absolute path is turned into a directory name by replacing each / with -.

Example:

  • Project: /Users/username/Workspace/js/myproject
  • Directory: ~/.claude/projects/-Users-username-Workspace-js-myproject/
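As a minimal sketch of the normalization rule above, the hypothetical helper below maps a project path to its expected session directory. It applies only the / to - replacement described here; the function name and the optional home parameter are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def session_dir_for(project_path: str, home: Optional[Path] = None) -> Path:
    """Return the directory where sessions for project_path are expected.

    Assumption: only "/" is replaced with "-", as described above.
    """
    base = home if home is not None else Path.home()
    normalized = project_path.replace("/", "-")
    return base / ".claude" / "projects" / normalized


print(session_dir_for("/Users/username/Workspace/js/myproject"))
```

Passing home explicitly makes the helper easy to try against a copied or archived .claude directory.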

File Types

Pattern           | Type          | Description
<uuid>.jsonl      | Main session  | User conversation sessions
agent-<id>.jsonl  | Agent session | Sub-agent execution logs
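The two file-name patterns can be told apart with a small classifier. This is a sketch only: classify_session_file is a hypothetical name, and the UUID pattern assumes the standard 8-4-4-4-12 hexadecimal layout.

```python
import re

# Assumption: main sessions are named with a standard 8-4-4-4-12 hex UUID.
UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}\.jsonl$", re.I)


def classify_session_file(filename: str) -> str:
    """Return "agent", "main", or "unknown" for a session file name."""
    if filename.startswith("agent-") and filename.endswith(".jsonl"):
        return "agent"
    if UUID_RE.match(filename):
        return "main"
    return "unknown"
```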

JSON Structure

Message Object

Each message line in a session file generally follows this structure (the | between values marks alternatives and is not literal JSON):

{
  "role": "user" | "assistant",
  "message": {
    "role": "user" | "assistant",
    "content": [...]
  },
  "timestamp": "2025-11-26T00:00:00.000Z",
  "uuid": "message-uuid",
  "parentUuid": "parent-message-uuid",
  "sessionId": "session-uuid"
}

Content Types

The content array contains different types of content blocks:

Text Content

{
  "type": "text",
  "text": "Message text content"
}

Tool Use (Write)

{
  "type": "tool_use",
  "name": "Write",
  "input": {
    "file_path": "/absolute/path/to/file.js",
    "content": "File content here..."
  }
}

Tool Use (Edit)

{
  "type": "tool_use",
  "name": "Edit",
  "input": {
    "file_path": "/absolute/path/to/file.js",
    "old_string": "Original text",
    "new_string": "Replacement text",
    "replace_all": false
  }
}

Tool Use (Read)

{
  "type": "tool_use",
  "name": "Read",
  "input": {
    "file_path": "/absolute/path/to/file.js",
    "offset": 0,
    "limit": 100
  }
}

Tool Use (Bash)

{
  "type": "tool_use",
  "name": "Bash",
  "input": {
    "command": "ls -la",
    "description": "List files"
  }
}

Tool Result

{
  "type": "tool_result",
  "tool_use_id": "tool-use-uuid",
  "content": "Result content",
  "is_error": false
}
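To connect a tool call with its outcome, the blocks can be matched by ID. The sketch below assumes that each tool_use block carries an "id" field which the corresponding tool_result repeats in tool_use_id; the "id" field does not appear in the examples above, so treat it as an assumption. pair_tool_calls is a hypothetical helper name.

```python
def pair_tool_calls(blocks):
    """Match tool_use blocks to tool_result blocks by ID.

    blocks is a flat list of content blocks gathered from a whole session,
    since a result usually arrives in a later message than its call.
    Assumption: tool_use blocks have an "id" echoed back as "tool_use_id".
    Returns a list of (tool_use, tool_result_or_None) tuples.
    """
    results = {
        b["tool_use_id"]: b
        for b in blocks
        if b.get("type") == "tool_result" and b.get("tool_use_id")
    }
    return [
        (b, results.get(b.get("id")))
        for b in blocks
        if b.get("type") == "tool_use"
    ]
```

A call paired with None either never completed or had its result recorded elsewhere.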

Common Extraction Patterns

Finding Write Operations

Look for content blocks in assistant messages whose type is tool_use and whose name is "Write":

for item in message_content:  # content blocks of one assistant message
    if item.get("type") == "tool_use" and item.get("name") == "Write":
        file_path = item["input"]["file_path"]
        content = item["input"]["content"]

Finding Edit Operations

if item.get("type") == "tool_use" and item.get("name") == "Edit":
    file_path = item["input"]["file_path"]
    old_string = item["input"]["old_string"]
    new_string = item["input"]["new_string"]
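Once an Edit input has been extracted, it can be replayed against a known version of the file. The following is a sketch under the assumption that, without replace_all, only the first occurrence of old_string is replaced; apply_edit is a hypothetical helper.

```python
def apply_edit(text: str, edit_input: dict) -> str:
    """Apply one Edit tool input to text and return the result.

    Assumption: without replace_all, only the first occurrence is replaced.
    Raises ValueError if old_string is empty or not present in text.
    """
    old = edit_input.get("old_string", "")
    new = edit_input.get("new_string", "")
    if not old or old not in text:
        raise ValueError("old_string not found; edit cannot be replayed")
    # str.replace treats a count of -1 as "replace every occurrence".
    count = -1 if edit_input.get("replace_all", False) else 1
    return text.replace(old, new, count)
```

Raising on a missing old_string makes it obvious when the starting text does not match the version the edit was made against.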

Extracting Text Content

for item in message_content:
    if item.get("type") == "text":
        text = item.get("text", "")
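The loop above can be wrapped into a reusable function. This sketch additionally assumes that content is sometimes a plain string rather than a list of blocks; that case is not described in this document, so it is handled defensively. extract_text is a hypothetical name.

```python
def extract_text(message_content) -> str:
    """Join the text of all text blocks in one message's content.

    Assumption: content may also be a plain string instead of a list
    of blocks, in which case it is returned unchanged.
    """
    if isinstance(message_content, str):
        return message_content
    parts = []
    for item in message_content or []:
        if isinstance(item, dict) and item.get("type") == "text":
            parts.append(item.get("text", ""))
    return "\n".join(parts)
```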

Field Locations

Due to schema variations, some fields may appear in different locations:

Role Field

role = data.get("role") or data.get("message", {}).get("role")

Content Field

content = data.get("content") or data.get("message", {}).get("content", [])

Timestamp Field

timestamp = data.get("timestamp", "")
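The three lookups above can be combined into one normalization step. The sketch below is illustrative: normalize_record is a hypothetical helper, and it assumes timestamps are ISO 8601 strings with a trailing Z, as in the example message object.

```python
from datetime import datetime


def normalize_record(data: dict) -> dict:
    """Return role, content, and a parsed timestamp for one decoded line."""
    message = data.get("message")
    if not isinstance(message, dict):
        message = {}
    role = data.get("role") or message.get("role")
    content = data.get("content") or message.get("content", [])
    parsed = None
    raw_ts = data.get("timestamp", "")
    if raw_ts:
        try:
            # Older Python versions do not accept a trailing "Z".
            parsed = datetime.fromisoformat(raw_ts.replace("Z", "+00:00"))
        except ValueError:
            parsed = None  # leave unparseable timestamps empty
    return {"role": role, "content": content, "timestamp": parsed}
```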

Common Use Cases

Recover Deleted Files

  1. Search for Write tool calls with matching file path
  2. Extract input.content from latest occurrence
  3. Save to disk with original filename
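The three steps above might be sketched as follows. recover_file is a hypothetical helper; it assumes that later lines in a session are more recent than earlier ones, so the last matching Write call wins.

```python
import json
from pathlib import Path


def recover_file(lines, target_path, output_dir):
    """Save the latest Write content for target_path under output_dir.

    lines is any iterable of JSONL strings, such as an open session file.
    Returns the recovered file's path, or None if no Write call matched.
    Assumption: later lines are more recent than earlier ones.
    """
    latest = None
    for line in lines:
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if not isinstance(data, dict):
            continue
        message = data.get("message")
        if not isinstance(message, dict):
            message = {}
        content = data.get("content") or message.get("content", [])
        if not isinstance(content, list):
            continue
        for item in content:
            if not isinstance(item, dict):
                continue
            if item.get("type") == "tool_use" and item.get("name") == "Write":
                tool_input = item.get("input", {})
                if tool_input.get("file_path") == target_path:
                    latest = tool_input.get("content", "")
    if latest is None:
        return None
    out = Path(output_dir) / Path(target_path).name
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(latest, encoding="utf-8")
    return out
```

Writing into a separate output directory rather than the original location avoids overwriting a file that still exists.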

Track File Changes

  1. Find all Edit and Write operations for a file
  2. Build chronological list of changes
  3. Reconstruct file history
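A possible sketch of these steps collects every Write and Edit for one path and orders them by timestamp. file_history is a hypothetical name, and the sort relies on the assumption that all timestamps share one ISO 8601 format, so they compare correctly as strings.

```python
import json


def file_history(lines, target_path):
    """Return Write/Edit operations on target_path, oldest first.

    Each entry is a (timestamp, tool_name, tool_input) tuple.
    Assumption: timestamps share one ISO 8601 format and sort as strings.
    """
    history = []
    for line in lines:
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if not isinstance(data, dict):
            continue
        message = data.get("message")
        if not isinstance(message, dict):
            message = {}
        content = data.get("content") or message.get("content", [])
        if not isinstance(content, list):
            continue
        for item in content:
            if not isinstance(item, dict) or item.get("type") != "tool_use":
                continue
            if item.get("name") not in ("Write", "Edit"):
                continue
            tool_input = item.get("input", {})
            if tool_input.get("file_path") == target_path:
                history.append((data.get("timestamp", ""), item["name"], tool_input))
    # The sort is stable, so entries with equal timestamps keep file order.
    history.sort(key=lambda entry: entry[0])
    return history
```

To cover several sessions, chain their lines together before calling the function, for example with itertools.chain.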

Search Conversations

  1. Extract all text content from messages
  2. Search for keywords or patterns
  3. Return matching sessions
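The search steps could look like the sketch below. iter_texts and search_sessions are hypothetical helpers; the match is a case-insensitive substring test, and a bare-string content value is handled as an assumption.

```python
import json
from pathlib import Path


def iter_texts(lines):
    """Yield the text of every text block in an iterable of JSONL lines."""
    for line in lines:
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if not isinstance(data, dict):
            continue
        message = data.get("message")
        if not isinstance(message, dict):
            message = {}
        content = data.get("content") or message.get("content", [])
        if isinstance(content, str):  # assumption: content may be a bare string
            yield content
            continue
        if not isinstance(content, list):
            continue
        for item in content:
            if isinstance(item, dict) and item.get("type") == "text":
                yield item.get("text", "")


def search_sessions(project_dir, keyword):
    """Return sorted session file paths whose text blocks mention keyword."""
    needle = keyword.lower()
    matches = []
    for path in sorted(Path(project_dir).glob("*.jsonl")):
        with open(path, "r", encoding="utf-8") as f:
            # any() stops reading the file at the first matching block.
            if any(needle in text.lower() for text in iter_texts(f)):
                matches.append(path)
    return matches
```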

Analyze Tool Usage

  1. Count occurrences of each tool type
  2. Track which files were accessed
  3. Generate usage statistics
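These steps map naturally onto collections.Counter. The sketch below uses a hypothetical tool_usage function and counts a file as "accessed" whenever a tool input carries a file_path, which is an assumption about what access should mean here.

```python
import json
from collections import Counter


def tool_usage(lines):
    """Return (tool_counts, file_counts) for an iterable of JSONL lines."""
    tools, files = Counter(), Counter()
    for line in lines:
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        if not isinstance(data, dict):
            continue
        message = data.get("message")
        if not isinstance(message, dict):
            message = {}
        content = data.get("content") or message.get("content", [])
        if not isinstance(content, list):
            continue
        for item in content:
            if not isinstance(item, dict) or item.get("type") != "tool_use":
                continue
            tools[item.get("name", "unknown")] += 1
            # Assumption: any tool input with a file_path counts as an access.
            file_path = item.get("input", {}).get("file_path")
            if file_path:
                files[file_path] += 1
    return tools, files
```

Calling tools.most_common(5) on the result gives a quick ranking of the most-used tools.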

Edge Cases

Empty Content

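Some messages have an empty content array, or none at all. Check both possible locations and skip such lines inside the per-line loop:

content = data.get("content") or data.get("message", {}).get("content", [])
if not content:
    continue  # nothing to extract from this line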

Missing Fields

Always use .get() with defaults:

file_path = item.get("input", {}).get("file_path", "")

JSON Decode Errors

Session files may contain malformed lines:

try:
    data = json.loads(line)
except json.JSONDecodeError:
    continue  # Skip malformed lines

Large Files

Session files can be very large (over 100 MB). Process them line by line:

with open(session_file, "r", encoding="utf-8") as f:
    for line in f:  # Streams one line at a time, unlike f.read()
        process_line(line)
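The edge cases in this section can be handled in one place with a generator, so callers only ever see well-formed records. This is a minimal sketch; iter_records is a hypothetical name, and replacing undecodable bytes instead of failing is a design choice rather than a requirement of the format.

```python
import json


def iter_records(session_file):
    """Yield one dict per well-formed, non-blank line of session_file."""
    # errors="replace" keeps going if a line contains invalid UTF-8.
    with open(session_file, "r", encoding="utf-8", errors="replace") as f:
        for line in f:  # streaming: only one line is held in memory
            line = line.strip()
            if not line:
                continue
            try:
                data = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            if isinstance(data, dict):
                yield data
```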

Performance Tips

Memory Efficiency

  • Process files line-by-line (streaming)
  • Don't load entire file into memory
  • Use generators for large result sets

Search Optimization

  • Exit early once the keyword count threshold is met
  • For case-insensitive search, lowercase the keyword once, not on every comparison
  • Use Python's in operator for substring matching
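The early-exit and normalize-once tips can be sketched together. reaches_threshold is a hypothetical helper; it uses str.count rather than the in operator because it needs the number of occurrences, not just their presence.

```python
def reaches_threshold(texts, keyword, threshold):
    """Return True as soon as keyword has been seen threshold times.

    texts may be a generator; it is not consumed past the deciding item.
    """
    needle = keyword.lower()  # normalize the keyword once, outside the loop
    if not needle:
        return False
    seen = 0
    for text in texts:
        seen += text.lower().count(needle)
        if seen >= threshold:
            return True  # early exit: the remaining texts are never read
    return False
```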

Deduplication

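When recovering files, keep only the latest version of each path. This relies on write_calls being in chronological order:

files_by_path = {}
for call in write_calls:  # Assumed: Write tool_use blocks, oldest first
    file_path = call["input"]["file_path"]
    files_by_path[file_path] = call  # Overwrites earlier versions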

Security Considerations

Personal Information

Session files may contain:

  • Absolute file paths with usernames
  • API keys or credentials in code
  • Company-specific information
  • Private conversations

Safe Sharing

Before sharing extracted content:

  1. Remove absolute paths
  2. Redact sensitive information
  3. Use placeholders for usernames
  4. Verify no credentials present
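Part of this checklist can be automated, though a manual review is still needed. The sketch below is illustrative only: redact is a hypothetical helper, the home-directory pattern assumes macOS or Linux style paths, and the secret pattern covers just one token shape (an sk- prefix followed by a long run of characters) chosen as an example, so real credentials in other formats will pass through.

```python
import re

# Assumption: home directories look like /Users/<name> or /home/<name>.
HOME_RE = re.compile(r"(/(?:Users|home)/)[\w.-]+")
# Illustrative pattern only; extend it for the credential formats you use.
SECRET_RE = re.compile(r"(?<![A-Za-z0-9])sk-[A-Za-z0-9_-]{16,}")


def redact(text: str) -> str:
    """Replace usernames in home paths and sk- style tokens with placeholders."""
    text = HOME_RE.sub(r"\1<user>", text)
    return SECRET_RE.sub("<redacted>", text)
```

Run it over extracted content before sharing, then search the output by hand for anything the patterns missed, such as company names or private conversation text.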