Release v1.18.1: Enhance markdown-tools with PDF image extraction

- Add extract_pdf_images.py script using PyMuPDF
- Refactor SKILL.md for clearer workflow documentation
- Update installation to use markitdown[pdf] extra
- Update marketplace version to 1.18.1
- Update markdown-tools version to 1.1.0
- Update README/README.zh-CN with new features
- Update QUICKSTART docs with in-app install instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
daymade
2025-12-28 18:46:15 +08:00
parent 515514b058
commit 8233430cf2
9 changed files with 264 additions and 127 deletions


@@ -6,7 +6,7 @@
}, },
"metadata": { "metadata": {
"description": "Professional Claude Code skills for GitHub operations, document conversion, diagram generation, statusline customization, Teams communication, repomix utilities, skill creation, CLI demo generation, LLM icon access, Cloudflare troubleshooting, UI design system extraction, professional presentation creation, YouTube video downloading, secure repomix packaging, ASR transcription correction, video comparison quality analysis, comprehensive QA testing infrastructure, prompt optimization with EARS methodology, session history recovery, documentation cleanup, PDF generation with Chinese font support, CLAUDE.md progressive disclosure optimization, CCPM skill registry search and management, Promptfoo LLM evaluation framework, and iOS app development with XcodeGen and SwiftUI", "description": "Professional Claude Code skills for GitHub operations, document conversion, diagram generation, statusline customization, Teams communication, repomix utilities, skill creation, CLI demo generation, LLM icon access, Cloudflare troubleshooting, UI design system extraction, professional presentation creation, YouTube video downloading, secure repomix packaging, ASR transcription correction, video comparison quality analysis, comprehensive QA testing infrastructure, prompt optimization with EARS methodology, session history recovery, documentation cleanup, PDF generation with Chinese font support, CLAUDE.md progressive disclosure optimization, CCPM skill registry search and management, Promptfoo LLM evaluation framework, and iOS app development with XcodeGen and SwiftUI",
"version": "1.18.0", "version": "1.18.1",
"homepage": "https://github.com/daymade/claude-code-skills" "homepage": "https://github.com/daymade/claude-code-skills"
}, },
"plugins": [ "plugins": [
@@ -32,10 +32,10 @@
}, },
{ {
"name": "markdown-tools", "name": "markdown-tools",
"description": "Convert documents (PDFs, Word, PowerPoint, Confluence exports) to markdown with Windows/WSL path handling support", "description": "Convert documents (PDFs, Word, PowerPoint, Confluence exports) to markdown with Windows/WSL path handling and PDF image extraction support",
"source": "./", "source": "./",
"strict": false, "strict": false,
"version": "1.0.0", "version": "1.1.0",
"category": "document-conversion", "category": "document-conversion",
"keywords": ["markdown", "pdf", "docx", "confluence", "markitdown", "wsl"], "keywords": ["markdown", "pdf", "docx", "confluence", "markitdown", "wsl"],
"skills": ["./markdown-tools"] "skills": ["./markdown-tools"]


@@ -25,6 +25,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Security ### Security
- None - None
## [1.18.1] - 2025-12-28
### Changed
- **markdown-tools**: Enhanced with PDF image extraction capability
- Added `extract_pdf_images.py` script using PyMuPDF
- Refactored SKILL.md for clearer workflow documentation
- Updated installation instructions to use `markitdown[pdf]` extra
- Updated marketplace version from 1.18.0 to 1.18.1
## [1.18.0] - 2025-12-20 ## [1.18.0] - 2025-12-20
### Added ### Added


@@ -32,6 +32,18 @@ Skills use a three-level loading system:
### Installation Scripts ### Installation Scripts
**In Claude Code (in-app):**
```text
/plugin marketplace add daymade/claude-code-skills
```
Then:
1. Select **Browse and install plugins**
2. Select **daymade/claude-code-skills**
3. Select **skill-creator**
4. Select **Install now**
**From your terminal (CLI):**
```bash ```bash
# Automated installation (macOS/Linux) # Automated installation (macOS/Linux)
curl -fsSL https://raw.githubusercontent.com/daymade/claude-code-skills/main/scripts/install.sh | bash curl -fsSL https://raw.githubusercontent.com/daymade/claude-code-skills/main/scripts/install.sh | bash
@@ -73,6 +85,8 @@ cp -r skill-name ~/.claude/skills/
# Then restart Claude Code # Then restart Claude Code
``` ```
In Claude Code, use `/plugin ...` slash commands. In your terminal, use `claude plugin ...`.
### Git Operations ### Git Operations
This repository uses standard git workflow: This repository uses standard git workflow:


@@ -8,6 +8,18 @@ Get started with Claude Code Skills Marketplace in less than 2 minutes!
### Step 1: Install skill-creator ### Step 1: Install skill-creator
**In Claude Code (in-app):**
```text
/plugin marketplace add daymade/claude-code-skills
```
Then:
1. Select **Browse and install plugins**
2. Select **daymade/claude-code-skills**
3. Select **skill-creator**
4. Select **Install now**
**From your terminal (CLI):**
```bash ```bash
# Add the marketplace # Add the marketplace
claude plugin marketplace add https://github.com/daymade/claude-code-skills claude plugin marketplace add https://github.com/daymade/claude-code-skills
@@ -107,7 +119,7 @@ claude plugin marketplace add https://github.com/daymade/claude-code-skills
# Marketplace name: daymade-skills (from marketplace.json) # Marketplace name: daymade-skills (from marketplace.json)
# Use @daymade-skills in install commands (e.g., skill-name@daymade-skills) # Use @daymade-skills in install commands (e.g., skill-name@daymade-skills)
# Do not use /plugin; all commands are `claude plugin ...` # In Claude Code use `/plugin ...`; in your terminal use `claude plugin ...`
# Step 2: Install skills you need # Step 2: Install skills you need
claude plugin install github-ops@daymade-skills claude plugin install github-ops@daymade-skills
claude plugin install markdown-tools@daymade-skills claude plugin install markdown-tools@daymade-skills


@@ -8,6 +8,18 @@
### Step 1: Install skill-creator ### Step 1: Install skill-creator
**In Claude Code (in-app):**
```text
/plugin marketplace add daymade/claude-code-skills
```
Then:
1. Select **Browse and install plugins**
2. Select **daymade/claude-code-skills**
3. Select **skill-creator**
4. Select **Install now**
**From your terminal (CLI):**
```bash ```bash
# Add the marketplace # Add the marketplace
claude plugin marketplace add https://github.com/daymade/claude-code-skills claude plugin marketplace add https://github.com/daymade/claude-code-skills
@@ -107,7 +119,7 @@ claude plugin marketplace add https://github.com/daymade/claude-code-skills
# Marketplace name: daymade-skills (from marketplace.json) # Marketplace name: daymade-skills (from marketplace.json)
# Use @daymade-skills in install commands (e.g., skill-name@daymade-skills) # Use @daymade-skills in install commands (e.g., skill-name@daymade-skills)
# All commands should use `claude plugin ...` (there is no `/plugin` command) # In Claude Code use `/plugin ...`; in your terminal use `claude plugin ...`
# Step 2: Install skills you need # Step 2: Install skills you need
claude plugin install github-ops@daymade-skills claude plugin install github-ops@daymade-skills
claude plugin install markdown-tools@daymade-skills claude plugin install markdown-tools@daymade-skills


@@ -48,6 +48,18 @@ The `skill-creator` is the **meta-skill** that enables you to build, validate, a
### Quick Install ### Quick Install
**In Claude Code (in-app):**
```text
/plugin marketplace add daymade/claude-code-skills
```
Then:
1. Select **Browse and install plugins**
2. Select **daymade/claude-code-skills**
3. Select **skill-creator**
4. Select **Install now**
**From your terminal (CLI):**
```bash ```bash
claude plugin marketplace add https://github.com/daymade/claude-code-skills claude plugin marketplace add https://github.com/daymade/claude-code-skills
# Marketplace name: daymade-skills (from marketplace.json) # Marketplace name: daymade-skills (from marketplace.json)
@@ -88,6 +100,18 @@ Claude Code, with skill-creator loaded, will guide you through the entire skill
## 🚀 Quick Installation ## 🚀 Quick Installation
### Install Inside Claude Code (In-App)
```text
/plugin marketplace add daymade/claude-code-skills
```
Then:
1. Select **Browse and install plugins**
2. Select **daymade/claude-code-skills**
3. Select the plugin you want
4. Select **Install now**
### Automated Installation (Recommended) ### Automated Installation (Recommended)
**macOS/Linux:** **macOS/Linux:**
@@ -109,7 +133,7 @@ claude plugin marketplace add https://github.com/daymade/claude-code-skills
Marketplace name is `daymade-skills` (from marketplace.json). Use `@daymade-skills` when installing plugins. Marketplace name is `daymade-skills` (from marketplace.json). Use `@daymade-skills` when installing plugins.
Do not use the repo path as a marketplace name (e.g. `@daymade/claude-code-skills` will fail). Do not use the repo path as a marketplace name (e.g. `@daymade/claude-code-skills` will fail).
All plugin commands should use `claude plugin ...` (there is no `/plugin` command). In Claude Code, use `/plugin ...` slash commands. In your terminal, use `claude plugin ...`.
**Essential Skill** (recommended first install): **Essential Skill** (recommended first install):
```bash ```bash
@@ -242,20 +266,20 @@ Comprehensive GitHub operations using gh CLI and GitHub API.
### 2. **markdown-tools** - Document Conversion Suite ### 2. **markdown-tools** - Document Conversion Suite
Converts documents to markdown with Windows/WSL path handling and Obsidian integration. Converts documents to markdown with Windows/WSL path handling and PDF image extraction.
**When to use:** **When to use:**
- Converting .doc/.docx/PDF/PPTX to markdown - Converting .doc/.docx/PDF/PPTX to markdown
- Extracting images from PDF files
- Processing Confluence exports - Processing Confluence exports
- Handling Windows/WSL path conversions - Handling Windows/WSL path conversions
- Working with markitdown utility
**Key features:** **Key features:**
- Multi-format document conversion - Multi-format document conversion
- Confluence export processing - PDF image extraction using PyMuPDF
- Windows/WSL path automation - Windows/WSL path automation
- Obsidian vault integration - Confluence export processing
- Helper scripts for path conversion - Helper scripts for path conversion and image extraction
**🎬 Live Demo** **🎬 Live Demo**


@@ -48,6 +48,18 @@
### Quick Install ### Quick Install
**In Claude Code (in-app):**
```text
/plugin marketplace add daymade/claude-code-skills
```
Then:
1. Select **Browse and install plugins**
2. Select **daymade/claude-code-skills**
3. Select **skill-creator**
4. Select **Install now**
**From your terminal (CLI):**
```bash ```bash
claude plugin marketplace add https://github.com/daymade/claude-code-skills claude plugin marketplace add https://github.com/daymade/claude-code-skills
# Marketplace name: daymade-skills (from marketplace.json) # Marketplace name: daymade-skills (from marketplace.json)
@@ -88,6 +100,18 @@ claude plugin install skill-creator@daymade-skills
## 🚀 Quick Installation ## 🚀 Quick Installation
### Install Inside Claude Code (In-App)
```text
/plugin marketplace add daymade/claude-code-skills
```
Then:
1. Select **Browse and install plugins**
2. Select **daymade/claude-code-skills**
3. Select the plugin you want
4. Select **Install now**
### Automated Installation (Recommended) ### Automated Installation (Recommended)
**macOS/Linux:** **macOS/Linux:**
@@ -109,7 +133,7 @@ claude plugin marketplace add https://github.com/daymade/claude-code-skills
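Marketplace name is `daymade-skills` (from marketplace.json). Use `@daymade-skills` when installing plugins. Marketplace name is `daymade-skills` (from marketplace.json). Use `@daymade-skills` when installing plugins.
Do not use the repo path as a marketplace name (e.g. `@daymade/claude-code-skills` will fail). Do not use the repo path as a marketplace name (e.g. `@daymade/claude-code-skills` will fail).
All plugin commands should use `claude plugin ...` (there is no `/plugin` command). In Claude Code, use `/plugin ...` slash commands. In your terminal, use `claude plugin ...`.
**Essential Skill** (recommended first install): **Essential Skill** (recommended first install):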
```bash ```bash
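@@ -264,20 +288,20 @@ CC-Switch supports the following Chinese AI service providers:
### 2. **markdown-tools** - Document Conversion Suite ### 2. **markdown-tools** - Document Conversion Suite
Converts documents to markdown with Windows/WSL path handling and Obsidian integration. Converts documents to markdown with Windows/WSL path handling and PDF image extraction.
**When to use:** **When to use:**
- Converting .doc/.docx/PDF/PPTX to markdown - Converting .doc/.docx/PDF/PPTX to markdown
- Extracting images from PDF files
- Processing Confluence exports - Processing Confluence exports
- Handling Windows/WSL path conversions - Handling Windows/WSL path conversions
- Working with markitdown utility
**Key features:** **Key features:**
- Multi-format document conversion - Multi-format document conversion
- Confluence export processing - PDF image extraction using PyMuPDF
- Windows/WSL path automation - Windows/WSL path automation
- Obsidian vault integration - Confluence export processing
- Helper scripts for path conversion - Helper scripts for path conversion and image extraction
**🎬 Live Demo** **🎬 Live Demo**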


@@ -1,146 +1,93 @@
--- ---
name: markdown-tools name: markdown-tools
description: Converts documents to markdown (PDFs, Word docs, PowerPoint, Confluence exports) with Windows/WSL path handling. Activates when converting .doc/.docx/PDF/PPTX files to markdown, processing Confluence exports, handling Windows/WSL path conversions, or working with markitdown utility. description: Converts documents to markdown (PDFs, Word docs, PowerPoint, Confluence exports) with Windows/WSL path handling. Activates when converting .doc/.docx/PDF/PPTX files to markdown, processing Confluence exports, handling Windows/WSL path conversions, extracting images from PDFs, or working with markitdown utility.
--- ---
# Markdown Tools # Markdown Tools
## Overview Convert documents to markdown with image extraction and Windows/WSL path handling.
This skill provides document conversion to markdown with Windows/WSL path handling support. It helps convert various document formats to markdown and handles path conversions between Windows and WSL environments.
## Core Capabilities
### 1. Markdown Conversion
Convert documents to markdown format with automatic Windows/WSL path handling.
### 2. Confluence Export Processing
Handle Confluence .doc exports with special characters for knowledge base integration.
## Quick Start ## Quick Start
### Convert Any Document to Markdown ### Install markitdown with PDF Support
```bash ```bash
# Basic conversion # IMPORTANT: Use [pdf] extra for PDF support
markitdown "path/to/document.pdf" > output.md uv tool install "markitdown[pdf]"
# WSL path example # Or via pip
markitdown "/mnt/c/Users/username/Documents/file.docx" > output.md pip install "markitdown[pdf]"
``` ```
See `references/conversion-examples.md` for detailed examples of various conversion scenarios. ### Basic Conversion
### Convert Confluence Export
```bash ```bash
# Direct conversion for simple exports markitdown "document.pdf" -o output.md
markitdown "confluence-export.doc" > output.md # Or redirect: markitdown "document.pdf" > output.md
# For exports with special characters, see references/
``` ```
## Path Conversion ## PDF Conversion with Images
### Windows to WSL Path Format markitdown extracts text only. For PDFs with images, use this workflow:
Windows paths must be converted to WSL format before use in bash commands. ### Step 1: Convert Text
**Conversion rules:**
- Replace `C:\` with `/mnt/c/`
- Replace `\` with `/`
- Preserve spaces and special characters
- Use quotes for paths with spaces
**Example conversions:**
```bash
# Windows path
C:\Users\username\Documents\file.doc
# WSL path
/mnt/c/Users/username/Documents/file.doc
```
**Helper script:** Use `scripts/convert_path.py` to automate conversion:
```bash ```bash
python scripts/convert_path.py "C:\Users\username\Downloads\document.doc" markitdown "document.pdf" -o output.md
``` ```
See `references/conversion-examples.md` for detailed path conversion examples. ### Step 2: Extract Images
## Document Conversion Workflows
### Workflow 1: Simple Markdown Conversion
For straightforward document conversions (PDF, .docx without special characters):
1. Convert Windows path to WSL format (if needed)
2. Run markitdown
3. Redirect output to .md file
See `references/conversion-examples.md` for detailed examples.
### Workflow 2: Confluence Export with Special Characters
For Confluence .doc exports that contain special characters or complex formatting:
1. Save .doc file to accessible location
2. Use appropriate conversion method (see references)
3. Verify output formatting
See `references/conversion-examples.md` for step-by-step command examples.
## Error Handling
### Common Issues and Solutions
**markitdown not found:**
```bash ```bash
# Install markitdown via pip # Create assets directory alongside the markdown
pip install markitdown mkdir -p assets
# Or via uv tools # Extract images using PyMuPDF
uv tool install markitdown uv run --with pymupdf python scripts/extract_pdf_images.py "document.pdf" ./assets
``` ```
**Path not found:** ### Step 3: Add Image References
Insert image references in the markdown where needed:
```markdown
![Description](assets/img_page1_1.png)
```
### Step 4: Format Cleanup
markitdown output often needs manual fixes:
- Add proper heading levels (`#`, `##`, `###`)
- Reconstruct tables in markdown format
- Fix broken line breaks
- Restore indentation structure
## Path Conversion (Windows/WSL)
```bash ```bash
# Verify path exists # Windows → WSL conversion
ls -la "/mnt/c/Users/username/Documents/file.doc" C:\Users\name\file.pdf → /mnt/c/Users/name/file.pdf
# Use convert_path.py helper # Use helper script
python scripts/convert_path.py "C:\Users\username\Documents\file.doc" python scripts/convert_path.py "C:\Users\name\Documents\file.pdf"
``` ```
**Encoding issues:** ## Common Issues
- Ensure files are UTF-8 encoded
- Check for special characters in filenames **"dependencies needed to read .pdf files"**
- Use quotes around paths with spaces ```bash
# Install with PDF support
uv tool install "markitdown[pdf]" --force
```
**FontBBox warnings during PDF conversion**
- These are harmless font parsing warnings, output is still correct
**Images missing from output**
- Use `scripts/extract_pdf_images.py` to extract images separately
## Resources ## Resources
### references/conversion-examples.md - `scripts/extract_pdf_images.py` - Extract images from PDF using PyMuPDF
Comprehensive examples for all conversion scenarios including: - `scripts/convert_path.py` - Windows to WSL path converter
- Simple document conversions (PDF, Word, PowerPoint) - `references/conversion-examples.md` - Detailed examples for batch operations
- Confluence export handling
- Path conversion examples for Windows/WSL
- Batch conversion operations
- Error recovery and troubleshooting examples
Load this reference when users need specific command examples or encounter conversion issues.
### scripts/convert_path.py
Python script to automate Windows to WSL path conversion. Handles:
- Drive letter conversion (C:\ → /mnt/c/)
- Backslash to forward slash
- Special characters and spaces
## Best Practices
1. **Convert Windows paths to WSL format** before bash operations
2. **Verify paths exist** before operations using ls or test commands
3. **Check output quality** after conversion
4. **Use markitdown directly** for simple conversions
5. **Test incrementally** - Verify each conversion step before proceeding
6. **Preserve directory structure** when doing batch conversions
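
The Windows → WSL rule shown in the Path Conversion section of the SKILL.md diff above (`C:\Users\name\file.pdf` → `/mnt/c/Users/name/file.pdf`) can be sketched in a few lines of Python. This is only an illustration of the rule, not the contents of `scripts/convert_path.py`, which this diff does not show. The function name `windows_to_wsl` is hypothetical, and the sketch assumes drives are mounted at the default `/mnt/<drive>` location.

```python
import re


def windows_to_wsl(path: str) -> str:
    """Convert a drive-letter Windows path to its default WSL mount path.

    Hypothetical helper for illustration; assumes drives are mounted at
    /mnt/<lowercase drive letter>, which is the WSL default.
    """
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", path)
    if match is None:
        # Not a drive-letter path: only normalize the separators.
        return path.replace("\\", "/")
    drive, rest = match.groups()
    rest = rest.replace("\\", "/")
    return f"/mnt/{drive.lower()}/{rest}"


if __name__ == "__main__":
    print(windows_to_wsl(r"C:\Users\name\Documents\file.pdf"))
```

Spaces in the path are preserved as-is, so the result still needs quoting when passed to `markitdown` in a shell.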


@@ -0,0 +1,95 @@
#!/usr/bin/env python3
"""
Extract images from PDF files using PyMuPDF.

Usage:
    uv run --with pymupdf python extract_pdf_images.py <pdf_path> [output_dir]

Examples:
    uv run --with pymupdf python extract_pdf_images.py document.pdf
    uv run --with pymupdf python extract_pdf_images.py document.pdf ./assets

Output:
    Images are saved to output_dir (default: ./assets) with names like:
    - img_page1_1.png
    - img_page2_1.png
"""

import sys
import os


def extract_images(pdf_path: str, output_dir: str = "assets") -> list[str]:
    """
    Extract all images from a PDF file.

    Args:
        pdf_path: Path to the PDF file
        output_dir: Directory to save extracted images

    Returns:
        List of extracted image file paths
    """
    try:
        import fitz  # PyMuPDF
    except ImportError:
        print("Error: PyMuPDF not installed. Run with:")
        print(' uv run --with pymupdf python extract_pdf_images.py <pdf_path>')
        sys.exit(1)

    os.makedirs(output_dir, exist_ok=True)

    doc = fitz.open(pdf_path)
    extracted_files = []

    for page_num in range(len(doc)):
        page = doc[page_num]
        image_list = page.get_images()

        for img_index, img in enumerate(image_list):
            xref = img[0]
            base_image = doc.extract_image(xref)
            image_bytes = base_image["image"]
            image_ext = base_image["ext"]

            # Create descriptive filename
            img_filename = f"img_page{page_num + 1}_{img_index + 1}.{image_ext}"
            img_path = os.path.join(output_dir, img_filename)

            with open(img_path, "wb") as f:
                f.write(image_bytes)

            extracted_files.append(img_path)
            print(f"Extracted: {img_filename} ({len(image_bytes):,} bytes)")

    doc.close()

    print(f"\nTotal: {len(extracted_files)} images extracted to {output_dir}/")
    return extracted_files


def main():
    if len(sys.argv) < 2 or sys.argv[1] in ("-h", "--help"):
        print("Extract images from PDF files using PyMuPDF.")
        print()
        print("Usage: python extract_pdf_images.py <pdf_path> [output_dir]")
        print()
        print("Arguments:")
        print(" pdf_path Path to the PDF file")
        print(" output_dir Directory to save images (default: ./assets)")
        print()
        print("Example:")
        print(" uv run --with pymupdf python extract_pdf_images.py document.pdf ./assets")
        sys.exit(0 if "--help" in sys.argv or "-h" in sys.argv else 1)

    pdf_path = sys.argv[1]
    output_dir = sys.argv[2] if len(sys.argv) > 2 else "assets"

    if not os.path.exists(pdf_path):
        print(f"Error: File not found: {pdf_path}")
        sys.exit(1)

    extract_images(pdf_path, output_dir)


if __name__ == "__main__":
    main()
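
The script above writes files named `img_page<N>_<M>.<ext>`, and the SKILL.md workflow then has image references added to the markdown by hand. As a rough sketch of how that step might be assisted, the snippet below turns a list of extracted paths into markdown image lines ordered by page and image index. The helper name `image_references`, the `md_dir` parameter, and the alt-text wording are assumptions for illustration and are not part of this commit.

```python
import os
import re

# Matches the naming scheme used by extract_pdf_images.py: img_page<N>_<M>.<ext>
_NAME_RE = re.compile(r"^img_page(\d+)_(\d+)\.\w+$")


def image_references(paths: list[str], md_dir: str = ".") -> list[str]:
    """Return markdown image lines, ordered by page then image index.

    Hypothetical helper: files that do not follow the naming scheme are skipped.
    """
    keyed = []
    for path in paths:
        m = _NAME_RE.match(os.path.basename(path))
        if m is None:
            continue
        page, index = int(m.group(1)), int(m.group(2))
        # Use forward slashes so the links work in markdown on any OS.
        rel = os.path.relpath(path, md_dir).replace(os.sep, "/")
        keyed.append((page, index, rel))
    return [
        f"![Page {page}, image {index}]({rel})"
        for page, index, rel in sorted(keyed)
    ]


if __name__ == "__main__":
    for line in image_references(["assets/img_page2_1.png", "assets/img_page1_1.jpeg"]):
        print(line)
```

Sorting on the parsed integers rather than on the file names keeps page 10 after page 2. The generated lines still have to be placed at the right spots in the converted markdown manually.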