# Claude Code Local Setup Guide
**Offline AI Coding Assistant for Trinity**

**Document ID:** FFG-TOOL-001
**Created:** March 30, 2026
**Created By:** The Versionist (Chronicler #49)
**For:** Trinity (Michael, Meg, Holly)
**Location:** TX1 Dallas (38.68.14.26)

---

## 🎯 What This Is

An offline version of Claude Code running on TX1 Dallas, using local LLMs via Ollama. It gives Trinity a free AI coding assistant for internal development work with no API costs or internet dependency.

**Think of it as:** "Claude in your terminal, running on your hardware, costing nothing."

---

## 💡 Why This Matters

### Benefits
- ✅ **Zero API costs** - No charges for usage
- ✅ **Always available** - Works even when Anthropic is down
- ✅ **Privacy** - Code never leaves TX1
- ✅ **No rate limits** - Use as much as you want
- ✅ **Fast** - Local inference, no internet latency

### Use Cases
- **Internal development** - Firefrost projects, extensions, tools
- **When rate limited** - Hit Anthropic API limits? 
Switch to local +- **Sensitive code** - Financial systems, internal tools, Trinity-only features +- **Learning/experimentation** - Try things without burning credits +- **Backup** - When cloud Claude is unavailable + +--- + +## 🏗️ Architecture + +**What we're using:** +- **Hardware:** TX1 Dallas (24 cores, 251GB RAM, 3.4TB NVMe) +- **LLM Engine:** Ollama (already installed for Firefrost Codex) +- **Model:** Qwen3-Coder-Next (trained for agentic coding workflows) +- **Interface:** Claude Code CLI (Anthropic's terminal coding tool) +- **API:** Anthropic Messages API (Ollama v0.14+ compatible) + +**How it works:** +``` +You type in terminal + ↓ +Claude Code CLI + ↓ +Anthropic Messages API format + ↓ +Ollama (local on TX1) + ↓ +Qwen3-Coder-Next model + ↓ +Response back to terminal +``` + +**No internet needed once set up.** + +--- + +## 📋 Prerequisites + +**Required:** +- SSH access to TX1 Dallas (architect user) +- Ollama v0.14.0+ installed (supports Anthropic Messages API) +- Node.js installed (for Claude Code CLI) +- Minimum 16GB RAM (we have 251GB ✅) +- 64k+ token context model (Qwen3-Coder-Next has 170k ✅) + +**Optional:** +- GPU (we have CPU-only, works fine) + +--- + +## 🚀 Installation Steps + +### Step 1: SSH to TX1 + +```bash +ssh architect@38.68.14.26 +``` + +### Step 2: Verify Ollama Version + +```bash +ollama --version +``` + +**Required:** v0.14.0 or higher (for Anthropic Messages API compatibility) + +**If older version, update:** +```bash +curl -fsSL https://ollama.com/install.sh | sh +``` + +### Step 3: Pull Coding Model + +**Recommended: Qwen3-Coder-Next** +```bash +ollama pull qwen3-coder-next +``` + +**Why this model:** +- Trained specifically for agentic coding workflows +- Understands tool calling and multi-step planning +- 170,000 token context window +- Strong at reading, explaining, and writing code +- Optimized for Claude Code harness + +**Alternative models (if needed):** +```bash +ollama pull qwen2.5-coder:7b # Lighter, faster (7B params) 
ollama pull glm-5:cloud # Cloud model option
ollama pull deepseek-coder-v2:16b # Alternative coding model
```

**Verify model downloaded:**
```bash
ollama list
```

Should show `qwen3-coder-next` in the list.

### Step 4: Install Claude Code CLI

**Installation:**
```bash
curl -fsSL https://claude.ai/install.sh | bash
```

**Verify installation:**
```bash
claude --version
```

Should show the Claude Code version number.

### Step 5: Create Launch Script

**Create script:**
```bash
nano ~/launch-claude-local.sh
```

**Script contents:**
```bash
#!/bin/bash
# Firefrost Gaming - Local Claude Code Launcher
# Launches Claude Code with local Ollama backend
# Zero API costs, full privacy

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export ANTHROPIC_BASE_URL=http://localhost:11434

echo "🔥❄️ Launching Claude Code (Local via Ollama)..."
echo "Model: qwen3-coder-next"
echo "No API costs - running on TX1 Dallas"
echo ""

ollama launch claude --model qwen3-coder-next
```

**Make executable:**
```bash
chmod +x ~/launch-claude-local.sh
```

### Step 6: Test Installation

**Launch Claude Code:**
```bash
~/launch-claude-local.sh
```

**Test with simple task:**
```
❯ create a python file that prints "Firefrost Gaming" in Fire colors
```

**Expected behavior:**
- Claude Code creates the file
- Uses ANSI color codes for orange/red text
- File is created in the current directory

**If working:** You're done! ✅

---

## 🎮 How to Use

### Starting Claude Code

**From anywhere:**
```bash
~/launch-claude-local.sh
```

**Or with full path:**
```bash
/home/architect/launch-claude-local.sh
```

### Basic Commands

**Inside Claude Code:**

```bash
❯ /help # Show all commands
❯ /permissions # View/change file access permissions
❯ /reset # Start fresh conversation
❯ /exit # Quit Claude Code
```

### Common Tasks

**1. 
Create new file:**
```
❯ create a python script that checks if a server is online
```

**2. Explain existing code:**
```
❯ explain what this file does: webhook.js
```

**3. Debug code:**
```
❯ this function is throwing an error, can you fix it?
```

**4. Refactor code:**
```
❯ make this function more efficient
```

**5. Write tests:**
```
❯ create unit tests for database.js
```

### Project-Based Workflow

**1. Navigate to project:**
```bash
cd /home/architect/firefrost-projects/modpack-checker
```

**2. Launch Claude Code:**
```bash
~/launch-claude-local.sh
```

**3. Work on project:**
```
❯ add error handling to all API calls in src/providers/
```

Claude Code can see and edit all files in the current directory and its subdirectories.

---

## 🔐 Permissions

**By default, Claude Code has restricted permissions.**

**View current permissions:**
```
❯ /permissions
```

**Grant permissions as needed:**
```
Allow: bash # Execute bash commands
Allow: read_directory # Browse directories
Allow: write_file # Create/edit files
```

**Security note:** Only grant permissions you need for the current task. 
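When something misbehaves, it helps to take the CLI out of the loop and hand-build a request in the Anthropic Messages API format the Architecture section describes. The sketch below writes a minimal payload to a temp file; field names follow the public Messages API docs, and the exact payload Claude Code constructs may differ — treat this as a debugging aid, not a spec.

```shell
# Sketch: hand-build a minimal request in the Anthropic Messages API format.
# Field names follow the public Messages API; the exact payload Claude Code
# sends to the local Ollama endpoint may differ.
cat > /tmp/local-claude-test.json <<'EOF'
{
  "model": "qwen3-coder-next",
  "max_tokens": 512,
  "messages": [
    {"role": "user", "content": "explain what this file does: webhook.js"}
  ]
}
EOF
echo "Wrote /tmp/local-claude-test.json"
```

POSTing this payload to the `ANTHROPIC_BASE_URL` from the launch script (the messages endpoint path is documented in the Anthropic API docs) is a quick way to confirm Ollama itself is answering before blaming the CLI.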
+ +--- + +## 📊 Performance Expectations + +### What Local Claude Code Is Good At + +✅ **Code generation:** Boilerplate, helpers, utilities, templates +✅ **Code explanation:** Understanding unfamiliar code +✅ **Refactoring:** Improving existing code +✅ **Debugging:** Finding and fixing bugs +✅ **Testing:** Writing unit tests +✅ **Documentation:** Creating comments, README files + +### What It Struggles With + +❌ **Complex architecture** - Better to use cloud Claude (Opus/Sonnet) +❌ **Novel algorithms** - Needs deeper reasoning +❌ **Large refactors** - May lose context +❌ **Cutting-edge frameworks** - Training data cutoff + +### Speed + +- **Local inference:** Very fast on TX1 (24 cores, 251GB RAM) +- **No network latency:** Immediate responses +- **Context processing:** Handles large codebases well (170k tokens) + +--- + +## 🆚 Local vs Cloud Decision Matrix + +### Use Local Claude Code When: + +- ✅ Working on internal Firefrost code +- ✅ Simple/medium coding tasks +- ✅ You've hit Anthropic API rate limits +- ✅ Anthropic is down or slow +- ✅ Privacy is critical (financial, Trinity-only) +- ✅ Experimenting or learning +- ✅ Don't want to pay API costs + +### Use Cloud Claude When: + +- ✅ Complex architecture decisions +- ✅ Need latest models (Opus 4.6, Sonnet 4.6) +- ✅ Multi-hour consultations (like Arbiter 2.0) +- ✅ Production-critical code +- ✅ Need web search integration +- ✅ Quality more important than cost + +**General rule:** Start with local, escalate to cloud if needed. + +--- + +## 🔧 Troubleshooting + +### Issue: "ollama: command not found" + +**Solution:** +```bash +# Install Ollama +curl -fsSL https://ollama.com/install.sh | sh +``` + +### Issue: "Error: unknown command 'launch' for 'ollama'" + +**Solution:** Ollama version too old. 
Update to v0.14+ +```bash +curl -fsSL https://ollama.com/install.sh | sh +ollama --version +``` + +### Issue: "claude: command not found" + +**Solution:** Install Claude Code +```bash +curl -fsSL https://claude.ai/install.sh | bash +``` + +### Issue: Model runs out of memory + +**Check available RAM:** +```bash +free -h +``` + +**TX1 has 251GB - should never happen. If it does:** +- Check for other processes using RAM +- Switch to lighter model: `ollama pull qwen2.5-coder:7b` + +### Issue: Claude Code can't see files + +**Solution:** Check permissions +``` +❯ /permissions +``` + +Add necessary permissions (read_directory, write_file, etc.) + +### Issue: Responses are slow + +**Possible causes:** +- TX1 under heavy load (check with `top`) +- Large context (try `/reset` to start fresh) +- Complex task (may need cloud Claude) + +### Issue: Quality lower than expected + +**Strategies:** +- Be more specific in prompts +- Break large tasks into smaller steps +- Provide more context/examples +- Switch to cloud Claude for complex tasks + +--- + +## 📈 Model Comparison + +### Qwen3-Coder-Next (Recommended) + +- **Size:** ~14B parameters +- **Context:** 170k tokens +- **Strengths:** Agentic workflows, tool calling, multi-step planning +- **Speed:** Fast on TX1 +- **Quality:** 80-85% of cloud Claude for coding + +### Qwen2.5-Coder:7b (Lightweight) + +- **Size:** 7B parameters +- **Context:** 64k tokens +- **Strengths:** Faster, lighter resource usage +- **Speed:** Very fast on TX1 +- **Quality:** 70-75% of cloud Claude for coding + +### GLM-5:cloud (Cloud Model) + +- **Size:** Cloud-hosted +- **Context:** Variable +- **Strengths:** Better quality than local +- **Speed:** Depends on internet +- **Cost:** Free tier available, then paid + +**Recommendation:** Stick with Qwen3-Coder-Next unless you need speed (use 7b) or quality (use cloud). 
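If you end up switching between these models often, one hypothetical tweak to the Step 5 launch script is to accept the model as an argument instead of hardcoding it. The function name and structure below are illustrative; the commented-out launch line mirrors the original script and would be uncommented on TX1:

```shell
#!/bin/bash
# Hypothetical variant of ~/launch-claude-local.sh: pass the model per session
# instead of editing the script each time. Defaults to qwen3-coder-next.
launch_local_claude() {
  local model="${1:-qwen3-coder-next}"
  export ANTHROPIC_AUTH_TOKEN=ollama
  export ANTHROPIC_API_KEY=""
  export ANTHROPIC_BASE_URL=http://localhost:11434
  echo "Launching Claude Code with model: ${model}"
  # ollama launch claude --model "${model}"   # uncomment on TX1
}
```

Usage: `launch_local_claude qwen2.5-coder:7b` picks the lighter model for a quick session; calling it with no argument falls back to the recommended default.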
+ +--- + +## 💰 Cost Analysis + +### Local Claude Code (This Setup) + +- **Setup time:** 30-45 minutes (one-time) +- **Hardware cost:** $0 (already own TX1) +- **Ongoing cost:** $0 +- **Usage limits:** None +- **Privacy:** 100% local + +### Cloud Claude Code (Anthropic) + +- **Setup time:** 5 minutes +- **Hardware cost:** $0 +- **Ongoing cost:** $0.015-0.075 per request +- **Usage limits:** Pro plan limits apply +- **Privacy:** Cloud-based + +**Savings example:** +- 100 local sessions/month = $0 +- 100 cloud sessions/month = ~$20-50 +- **Annual savings: $240-600** + +--- + +## 🔄 Switching Between Models + +**List available models:** +```bash +ollama list +``` + +**Pull new model:** +```bash +ollama pull deepseek-coder-v2:16b +``` + +**Update launch script to use different model:** +```bash +nano ~/launch-claude-local.sh +``` + +Change: +```bash +ollama launch claude --model qwen3-coder-next +``` + +To: +```bash +ollama launch claude --model deepseek-coder-v2:16b +``` + +--- + +## 📚 Additional Resources + +**Official Documentation:** +- Ollama: https://ollama.com/ +- Claude Code: https://docs.claude.com/claude-code +- Anthropic Messages API: https://docs.anthropic.com/ + +**Model Info:** +- Qwen3-Coder-Next: https://ollama.com/library/qwen3-coder-next +- Qwen2.5-Coder: https://ollama.com/library/qwen2.5-coder + +**Community:** +- Ollama GitHub: https://github.com/ollama/ollama +- Claude Code Issues: https://github.com/anthropics/claude-code/issues + +--- + +## 🎯 Best Practices + +### 1. Start Fresh for Each Project +``` +❯ /reset +``` +Clears context, prevents confusion between projects. + +### 2. Be Specific +**Bad:** "Fix this" +**Good:** "This function throws a TypeError when user_id is None. Add null check." + +### 3. Provide Context +Include relevant file names, error messages, expected behavior. + +### 4. Break Large Tasks Down +Instead of: "Build entire authentication system" +Do: "Create user model", then "Create login route", then "Add JWT tokens" + +### 5. 
Review Generated Code +Local models can make mistakes. Always review before committing. + +### 6. Use Permissions Wisely +Only grant permissions needed for current task. + +### 7. Know When to Escalate +If stuck after 3-4 attempts, switch to cloud Claude. + +--- + +## 🔐 Security Considerations + +### What's Safe + +- ✅ Internal Firefrost code +- ✅ Open source projects +- ✅ Personal scripts +- ✅ Learning/experimentation + +### What to Avoid + +- ❌ Subscriber payment info +- ❌ API keys/secrets (use env vars instead) +- ❌ Personal data from database dumps +- ❌ Proprietary third-party code under NDA + +**General rule:** If you wouldn't commit it to public GitHub, be cautious. + +--- + +## 📝 Future Enhancements + +**Potential additions (not implemented yet):** + +1. **RAG Integration** - Connect to Firefrost Codex Qdrant database +2. **MCP Tools** - Add custom Model Context Protocol tools +3. **Multi-model switching** - Quick switch between models +4. **Usage tracking** - Monitor what works well +5. **Custom prompts** - Firefrost-specific coding patterns + +--- + +## ✅ Success Checklist + +**Verify everything works:** + +- [ ] SSH to TX1 successful +- [ ] Ollama v0.14+ installed +- [ ] Qwen3-Coder-Next model downloaded +- [ ] Claude Code CLI installed +- [ ] Launch script created and executable +- [ ] Successfully created test file +- [ ] Permissions understood +- [ ] Know when to use local vs cloud + +**If all checked:** You're ready to use local Claude Code! 🎉 + +--- + +## 🤝 Who Can Use This + +**Trinity only:** +- Michael (frostystyle) +- Meg (Gingerfury66) +- Holly (unicorn20089) + +**Not for subscribers** - this is an internal development tool. 
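The "use env vars instead" rule from the security section above can be wrapped in a small helper so a missing secret fails loudly instead of silently passing an empty string into a script. `FIREFROST_API_KEY` below is a hypothetical variable name, purely for illustration:

```shell
# Hypothetical helper for the "API keys via env vars" rule: read a named
# secret from the environment and fail loudly if it is unset or empty.
get_secret() {
  local name="$1"
  local value="${!name}"   # bash indirect expansion: value of the named var
  if [ -z "${value}" ]; then
    echo "ERROR: ${name} is not set" >&2
    return 1
  fi
  echo "${value}"
}
```

For example, `export FIREFROST_API_KEY=...` in your shell profile, then `TOKEN="$(get_secret FIREFROST_API_KEY)"` in scripts — the key never lands in a file that gets committed or pasted into a session.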
+ +--- + +## 📞 Support + +**Issues with setup:** +- Check troubleshooting section above +- Review Ollama docs: https://ollama.com/ +- Ask in Trinity Discord channel + +**Model selection questions:** +- Start with Qwen3-Coder-Next +- Try others if needed +- Document what works best + +**General questions:** +- Refer to this guide +- Check official Claude Code docs +- Experiment and learn + +--- + +**🔥❄️ Built for Trinity. Built with love. Zero API costs. Full control.** 💙 + +**For children not yet born - code built by AI, owned by us.** + +--- + +**Last Updated:** March 30, 2026 +**Maintained By:** Trinity +**Location:** docs/tools/claude-code-local-setup.md