claude-code-skills-reference/markdown-tools/SKILL.md
daymade 8233430cf2 Release v1.18.1: Enhance markdown-tools with PDF image extraction
- Add extract_pdf_images.py script using PyMuPDF
- Refactor SKILL.md for clearer workflow documentation
- Update installation to use markitdown[pdf] extra
- Update marketplace version to 1.18.1
- Update markdown-tools version to 1.1.0
- Update README/README.zh-CN with new features
- Update QUICKSTART docs with in-app install instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 18:46:15 +08:00

name: markdown-tools
description: Converts documents to markdown (PDFs, Word docs, PowerPoint, Confluence exports) with Windows/WSL path handling. Activates when converting .doc/.docx/PDF/PPTX files to markdown, processing Confluence exports, handling Windows/WSL path conversions, extracting images from PDFs, or working with the markitdown utility.

Markdown Tools

Convert documents to markdown with image extraction and Windows/WSL path handling.

Quick Start

Install markitdown with PDF Support

# IMPORTANT: Use [pdf] extra for PDF support
uv tool install "markitdown[pdf]"

# Or via pip
pip install "markitdown[pdf]"

Basic Conversion

markitdown "document.pdf" -o output.md
# Or redirect: markitdown "document.pdf" > output.md
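For more than one file, the same command can be driven from a short script. The following is a minimal sketch, assuming `markitdown` is on the PATH; the helper names `markitdown_command` and `convert_all` are hypothetical and not part of this skill.

```python
import subprocess
from pathlib import Path


def markitdown_command(source: Path, output_dir: Path) -> list[str]:
    """Build the CLI call shown above: markitdown <source> -o <output>.md"""
    output = output_dir / (source.stem + ".md")
    return ["markitdown", str(source), "-o", str(output)]


def convert_all(sources: list[Path], output_dir: Path) -> None:
    """Run markitdown once per source file; raises if a conversion fails."""
    output_dir.mkdir(parents=True, exist_ok=True)
    for source in sources:
        subprocess.run(markitdown_command(source, output_dir), check=True)
```

Calling `convert_all([Path("a.pdf"), Path("b.docx")], Path("out"))` would write `out/a.md` and `out/b.md`. Two sources with the same stem would overwrite each other, so a real batch script may need a different naming rule.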

PDF Conversion with Images

markitdown extracts text only. For PDFs with images, use this workflow:

Step 1: Convert Text

markitdown "document.pdf" -o output.md

Step 2: Extract Images

# Create assets directory alongside the markdown
mkdir -p assets

# Extract images using PyMuPDF
uv run --with pymupdf python scripts/extract_pdf_images.py "document.pdf" ./assets
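The contents of `scripts/extract_pdf_images.py` are not shown here. As a rough illustration of how image extraction with PyMuPDF can work, the sketch below saves each embedded image as a PNG. The function names are hypothetical, and the `img_page<page>_<index>.png` naming is an assumption based on the reference shown in Step 3; the actual script may behave differently.

```python
from pathlib import Path


def image_filename(page_number: int, index: int) -> str:
    """Assumed naming scheme, e.g. img_page1_1.png (1-based page and index)."""
    return f"img_page{page_number}_{index}.png"


def extract_images(pdf_path: str, output_dir: str) -> list[Path]:
    """Save every embedded image in the PDF as a PNG and return the paths."""
    import fitz  # PyMuPDF; imported lazily so the helper above works without it

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with fitz.open(pdf_path) as doc:
        for page_number, page in enumerate(doc, start=1):
            for index, info in enumerate(page.get_images(full=True), start=1):
                xref = info[0]  # first tuple item is the image's xref number
                pix = fitz.Pixmap(doc, xref)
                if pix.n - pix.alpha >= 4:  # CMYK: convert to RGB before PNG
                    pix = fitz.Pixmap(fitz.csRGB, pix)
                target = out / image_filename(page_number, index)
                pix.save(str(target))
                saved.append(target)
    return saved
```

An image reused on several pages is written once per page in this sketch; deduplicating by xref is a possible refinement.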

Step 3: Add Image References

Insert image references in the markdown where needed:

![Description](assets/img_page1_1.png)
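When a PDF yields many images, writing the references by hand gets tedious. The sketch below generates one reference per extracted file, ordered by page and then by position on the page. It assumes the `img_page<page>_<index>.png` naming seen in the example above; `image_references` is a hypothetical helper, and the generated alt text is only a placeholder to replace with a real description.

```python
import re
from pathlib import Path

# Assumed filename pattern, taken from the example reference above
IMAGE_NAME = re.compile(r"img_page(\d+)_(\d+)\.png$")


def image_references(filenames: list[str], assets_dir: str = "assets") -> list[str]:
    """Return one markdown image reference per file, ordered by page then index."""
    keyed = []
    for name in filenames:
        match = IMAGE_NAME.search(name)
        if match:  # skip files that do not follow the assumed pattern
            page, index = int(match.group(1)), int(match.group(2))
            keyed.append((page, index, Path(name).name))
    return [
        f"![Image {index} from page {page}]({assets_dir}/{name})"
        for page, index, name in sorted(keyed)
    ]
```

Sorting on the numeric page and index avoids the lexical ordering problem where `img_page10_1.png` would come before `img_page2_1.png`.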

Step 4: Format Cleanup

markitdown output often needs manual fixes:

  • Add proper heading levels (#, ##, ###)
  • Reconstruct tables in markdown format
  • Fix broken line breaks
  • Restore indentation structure
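Heading levels and tables usually need human judgment, but some of the line-break damage can be handled mechanically. The following is a minimal sketch of that part only; `tidy_markdown` is a hypothetical helper, not something shipped with this skill or with markitdown.

```python
import re


def tidy_markdown(text: str) -> str:
    """Strip trailing spaces, rejoin hyphen-split words, collapse blank lines."""
    # Remove trailing spaces and tabs on every line
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # "conver-\nsion" -> "conversion" (only between lowercase letters).
    # Caveat: this also joins genuine compounds split at the hyphen,
    # such as "well-\nknown", so review the result.
    text = re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)
    # Three or more consecutive newlines become a single blank line
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text
```

Running it over the converted file before the manual pass leaves fewer trivial fixes to make by hand.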

Path Conversion (Windows/WSL)

# Windows → WSL conversion
# C:\Users\name\file.pdf → /mnt/c/Users/name/file.pdf

# Use helper script
python scripts/convert_path.py "C:\Users\name\Documents\file.pdf"
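The mapping itself is simple: the drive letter is lowercased and placed under `/mnt/`, and backslashes become forward slashes. The sketch below illustrates that rule. It is not the code of `scripts/convert_path.py`, whose contents are not shown here, and `windows_to_wsl` is a hypothetical name.

```python
import re

# Drive letter, colon, then a backslash or forward slash
DRIVE_PATH = re.compile(r"^([A-Za-z]):[\\/](.*)$")


def windows_to_wsl(path: str) -> str:
    """Convert a drive-letter Windows path to its /mnt/<drive>/ WSL form."""
    match = DRIVE_PATH.match(path)
    if not match:
        raise ValueError(f"not a drive-letter Windows path: {path!r}")
    drive, rest = match.group(1).lower(), match.group(2)
    return f"/mnt/{drive}/" + rest.replace("\\", "/")
```

For example, `windows_to_wsl(r"C:\Users\name\file.pdf")` returns `/mnt/c/Users/name/file.pdf`. UNC paths such as `\\server\share` are rejected in this sketch. Inside WSL, the built-in `wslpath` utility performs the same conversion.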

Common Issues

"dependencies needed to read .pdf files"

# Install with PDF support
uv tool install "markitdown[pdf]" --force

FontBBox warnings during PDF conversion

  • These are harmless font-parsing warnings; the output is still correct

Images missing from output

  • Use scripts/extract_pdf_images.py to extract images separately

Resources

  • scripts/extract_pdf_images.py - Extract images from PDF using PyMuPDF
  • scripts/convert_path.py - Windows to WSL path converter
  • references/conversion-examples.md - Detailed examples for batch operations