# Claude Code Local Setup Guide

**Offline AI Coding Assistant for Trinity**

**Document ID:** FFG-TOOL-001
**Created:** March 30, 2026
**Created By:** The Versionist (Chronicler #49)
**For:** Trinity (Michael, Meg, Holly)
**Location:** TX1 Dallas (38.68.14.26)

---

## 🎯 What This Is

An offline version of Claude Code running on TX1 Dallas, using local LLMs served by Ollama. It gives Trinity a free AI coding assistant for internal development work, with no API costs and no internet dependency.

**Think of it as:** "Claude in your terminal, running on your hardware, costing nothing."

---

## 💡 Why This Matters

### Benefits
- ✅ **Zero API costs** - No charges for usage
- ✅ **Always available** - Works even when Anthropic is down
- ✅ **Privacy** - Code never leaves TX1
- ✅ **No rate limits** - Use as much as you want
- ✅ **Fast** - Local inference, no internet latency

### Use Cases
- **Internal development** - Firefrost projects, extensions, tools
- **When rate limited** - Hit Anthropic API limits? Switch to local
- **Sensitive code** - Financial systems, internal tools, Trinity-only features
- **Learning/experimentation** - Try things without burning credits
- **Backup** - When cloud Claude is unavailable

---

## 🏗️ Architecture

**What we're using:**
- **Hardware:** TX1 Dallas (24 cores, 251GB RAM, 3.4TB NVMe)
- **LLM Engine:** Ollama (already installed for Firefrost Codex)
- **Model:** Qwen3-Coder-Next (trained for agentic coding workflows)
- **Interface:** Claude Code CLI (Anthropic's terminal coding tool)
- **API:** Anthropic Messages API (Ollama v0.14+ compatible)

**How it works:**
```
You type in terminal
  ↓
Claude Code CLI
  ↓
Anthropic Messages API format
  ↓
Ollama (local on TX1)
  ↓
Qwen3-Coder-Next model
  ↓
Response back to terminal
```

**No internet needed once set up.**
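Under the hood, Claude Code speaks the Anthropic Messages API, and Ollama accepts the same request shape. The payload below is a sketch of what travels down that pipeline; the exact endpoint path (`/v1/messages`) and how Ollama handles auth headers are assumptions here, not verified details:

```bash
# Sketch of the request shape Claude Code sends (Anthropic Messages API format);
# the /v1/messages path on Ollama is an assumption - check the Ollama docs.
cat > /tmp/messages-payload.json <<'EOF'
{
  "model": "qwen3-coder-next",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Write a hello-world script"}
  ]
}
EOF

# With Ollama running, this would be posted to the local endpoint:
#   curl -s http://localhost:11434/v1/messages \
#     -H 'content-type: application/json' -d @/tmp/messages-payload.json

python3 -m json.tool < /tmp/messages-payload.json  # sanity-check the shape
```

Nothing in that round trip touches the internet, which is why the setup keeps working offline.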

---

## 📋 Prerequisites

**Required:**
- SSH access to TX1 Dallas (architect user)
- Ollama v0.14.0+ installed (supports Anthropic Messages API)
- Node.js installed (for Claude Code CLI)
- Minimum 16GB RAM (we have 251GB ✅)
- 64k+ token context model (Qwen3-Coder-Next has 170k ✅)

**Optional:**
- GPU (we have CPU-only, which works fine)

---

## 🚀 Installation Steps

### Step 1: SSH to TX1

```bash
ssh architect@38.68.14.26
```

### Step 2: Verify Ollama Version

```bash
ollama --version
```

**Required:** v0.14.0 or higher (for Anthropic Messages API compatibility)

**If you have an older version, update:**
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
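If you script the setup, the version gate can be automated. This is a sketch using `sort -V` for the comparison; the exact output format of `ollama --version` may vary, so the parsing line shown in the comment is an assumption:

```bash
# Returns success if version $1 is at least $2 (GNU sort -V does the comparison).
version_ok() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# In practice you would parse the real output, e.g. (format assumed):
#   installed=$(ollama --version | grep -oE '[0-9]+(\.[0-9]+)+' | head -n1)
installed="0.14.2"   # hard-coded sample for illustration

if version_ok "$installed" "0.14.0"; then
  echo "Ollama $installed is new enough"
else
  echo "Ollama $installed is too old - update required"
fi
```

`sort -V` orders version strings numerically per component, so `0.14.0` sorts before `0.14.2` but after `0.9.5`, which is exactly the ordering the gate needs.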

### Step 3: Pull Coding Model

**Recommended: Qwen3-Coder-Next**
```bash
ollama pull qwen3-coder-next
```

**Why this model:**
- Trained specifically for agentic coding workflows
- Understands tool calling and multi-step planning
- 170,000 token context window
- Strong at reading, explaining, and writing code
- Optimized for the Claude Code harness

**Alternative models (if needed):**
```bash
ollama pull qwen2.5-coder:7b         # Lighter, faster (7B params)
ollama pull glm-5:cloud              # Cloud model option
ollama pull deepseek-coder-v2:16b    # Alternative coding model
```

**Verify the model downloaded:**
```bash
ollama list
```

The output should include `qwen3-coder-next`.

### Step 4: Install Claude Code CLI

**Installation:**
```bash
curl -fsSL https://claude.ai/install.sh | bash
```

**Verify the installation:**
```bash
claude --version
```

This should print the Claude Code version number.

### Step 5: Create Launch Script

**Create the script:**
```bash
nano ~/launch-claude-local.sh
```

**Script contents:**
```bash
#!/bin/bash
# Firefrost Gaming - Local Claude Code Launcher
# Launches Claude Code with local Ollama backend
# Zero API costs, full privacy

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export ANTHROPIC_BASE_URL=http://localhost:11434

echo "🔥❄️ Launching Claude Code (Local via Ollama)..."
echo "Model: qwen3-coder-next"
echo "No API costs - running on TX1 Dallas"
echo ""

ollama launch claude --model qwen3-coder-next
```

**Make it executable:**
```bash
chmod +x ~/launch-claude-local.sh
```

### Step 6: Test Installation

**Launch Claude Code:**
```bash
~/launch-claude-local.sh
```

**Test with a simple task:**
```
❯ create a python file that prints "Firefrost Gaming" in fire colors
```

**Expected behavior:**
- Claude Code creates the file
- Uses ANSI color codes for orange/red text
- The file is created in the current directory

**If it works, you're done!** ✅
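For reference, the ANSI escapes the test asks for look like this when printed straight from the shell; the specific 256-color codes for orange and red are just one reasonable choice, not what the model will necessarily emit:

```bash
# 256-color ANSI escapes: 208 is an orange, 196 a bright red; \033[0m resets.
printf '\033[38;5;208mFirefrost \033[38;5;196mGaming\033[0m\n'
```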

---

## 🎮 How to Use

### Starting Claude Code

**From anywhere:**
```bash
~/launch-claude-local.sh
```

**Or with the full path:**
```bash
/home/architect/launch-claude-local.sh
```

### Basic Commands

**Inside Claude Code:**

```bash
❯ /help           # Show all commands
❯ /permissions    # View/change file access permissions
❯ /reset          # Start a fresh conversation
❯ /exit           # Quit Claude Code
```

### Common Tasks

**1. Create a new file:**
```
❯ create a python script that checks if a server is online
```

**2. Explain existing code:**
```
❯ explain what this file does: webhook.js
```

**3. Debug code:**
```
❯ this function is throwing an error, can you fix it?
```

**4. Refactor code:**
```
❯ make this function more efficient
```

**5. Write tests:**
```
❯ create unit tests for database.js
```

### Project-Based Workflow

**1. Navigate to the project:**
```bash
cd /home/architect/firefrost-projects/modpack-checker
```

**2. Launch Claude Code:**
```bash
~/launch-claude-local.sh
```

**3. Work on the project:**
```
❯ add error handling to all API calls in src/providers/
```

Claude Code can see and edit all files in the current directory and its subdirectories.

---

## 🔐 Permissions

**By default, Claude Code runs with restricted permissions.**

**View current permissions:**
```
❯ /permissions
```

**Grant permissions as needed:**
```
Allow: bash              # Execute bash commands
Allow: read_directory    # Browse directories
Allow: write_file        # Create/edit files
```

**Security note:** Only grant the permissions you need for the current task.

---

## 📊 Performance Expectations

### What Local Claude Code Is Good At

✅ **Code generation:** Boilerplate, helpers, utilities, templates
✅ **Code explanation:** Understanding unfamiliar code
✅ **Refactoring:** Improving existing code
✅ **Debugging:** Finding and fixing bugs
✅ **Testing:** Writing unit tests
✅ **Documentation:** Creating comments, README files

### What It Struggles With

❌ **Complex architecture** - Better handled by cloud Claude (Opus/Sonnet)
❌ **Novel algorithms** - Needs deeper reasoning
❌ **Large refactors** - May lose context
❌ **Cutting-edge frameworks** - Limited by the model's training cutoff

### Speed

- **Local inference:** Very fast on TX1 (24 cores, 251GB RAM)
- **No network latency:** Immediate responses
- **Context processing:** Handles large codebases well (170k tokens)

---

## 🆚 Local vs Cloud Decision Matrix

### Use Local Claude Code When:

- ✅ Working on internal Firefrost code
- ✅ Simple/medium coding tasks
- ✅ You've hit Anthropic API rate limits
- ✅ Anthropic is down or slow
- ✅ Privacy is critical (financial, Trinity-only)
- ✅ Experimenting or learning
- ✅ You don't want to pay API costs

### Use Cloud Claude When:

- ✅ Complex architecture decisions
- ✅ You need the latest models (Opus 4.6, Sonnet 4.6)
- ✅ Multi-hour consultations (like Arbiter 2.0)
- ✅ Production-critical code
- ✅ You need web search integration
- ✅ Quality matters more than cost

**General rule:** Start with local, escalate to cloud if needed.

---

## 🔧 Troubleshooting

### Issue: "ollama: command not found"

**Solution:**
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
```

### Issue: "Error: unknown command 'launch' for 'ollama'"

**Solution:** Your Ollama version is too old. Update to v0.14+:
```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
```

### Issue: "claude: command not found"

**Solution:** Install Claude Code:
```bash
curl -fsSL https://claude.ai/install.sh | bash
```

### Issue: Model runs out of memory

**Check available RAM:**
```bash
free -h
```

**TX1 has 251GB, so this should never happen. If it does:**
- Check for other processes using RAM
- Switch to a lighter model: `ollama pull qwen2.5-coder:7b`
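A quick pre-flight check can catch this before a model load fails. This sketch reads `MemAvailable` from `/proc/meminfo`; the 24 GiB threshold is an assumed comfortable margin for a mid-size model, not a measured requirement:

```bash
# Warn if available RAM is below an assumed 24 GiB threshold before loading a model.
avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
threshold_kb=$((24 * 1024 * 1024))

if [ "$avail_kb" -lt "$threshold_kb" ]; then
  echo "Low memory ($((avail_kb / 1024 / 1024)) GiB free): consider qwen2.5-coder:7b"
else
  echo "Plenty of RAM ($((avail_kb / 1024 / 1024)) GiB free)"
fi
```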

### Issue: Claude Code can't see files

**Solution:** Check permissions:
```
❯ /permissions
```

Add the necessary permissions (read_directory, write_file, etc.).

### Issue: Responses are slow

**Possible causes:**
- TX1 under heavy load (check with `top`)
- Large context (try `/reset` to start fresh)
- Complex task (may need cloud Claude)
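To check the load cause from a script rather than eyeballing `top`, comparing the 1-minute load average against the core count is a reasonable heuristic ("busy if load exceeds cores" is a rough convention, not a hard threshold):

```bash
# Rough load check: a 1-minute load average above the core count suggests contention.
cores=$(nproc)
load=$(awk '{print $1}' /proc/loadavg)
echo "Load ${load} on ${cores} cores"

if awk -v l="$load" -v c="$cores" 'BEGIN { exit !(l > c) }'; then
  echo "Box is busy - expect slower inference"
else
  echo "Headroom available"
fi
```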

### Issue: Quality lower than expected

**Strategies:**
- Be more specific in prompts
- Break large tasks into smaller steps
- Provide more context/examples
- Switch to cloud Claude for complex tasks

---

## 📈 Model Comparison

### Qwen3-Coder-Next (Recommended)

- **Size:** ~14B parameters
- **Context:** 170k tokens
- **Strengths:** Agentic workflows, tool calling, multi-step planning
- **Speed:** Fast on TX1
- **Quality:** 80-85% of cloud Claude for coding

### Qwen2.5-Coder:7b (Lightweight)

- **Size:** 7B parameters
- **Context:** 64k tokens
- **Strengths:** Faster, lighter resource usage
- **Speed:** Very fast on TX1
- **Quality:** 70-75% of cloud Claude for coding

### GLM-5:cloud (Cloud Model)

- **Size:** Cloud-hosted
- **Context:** Variable
- **Strengths:** Better quality than local models
- **Speed:** Depends on internet connection
- **Cost:** Free tier available, then paid

**Recommendation:** Stick with Qwen3-Coder-Next unless you need more speed (use the 7b) or more quality (use cloud).

---

## 💰 Cost Analysis

### Local Claude Code (This Setup)

- **Setup time:** 30-45 minutes (one-time)
- **Hardware cost:** $0 (we already own TX1)
- **Ongoing cost:** $0
- **Usage limits:** None
- **Privacy:** 100% local

### Cloud Claude Code (Anthropic)

- **Setup time:** 5 minutes
- **Hardware cost:** $0
- **Ongoing cost:** $0.015-0.075 per request
- **Usage limits:** Pro plan limits apply
- **Privacy:** Cloud-based

**Savings example:**
- 100 local sessions/month = $0
- 100 cloud sessions/month = ~$20-50
- **Annual savings: $240-600**
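The arithmetic behind those figures, assuming roughly $0.20-0.50 of API usage per session (an estimate, not a billed rate):

```bash
# Back-of-envelope savings check; the per-session cost range is an assumption.
sessions=100
low_cents=20   # ~$0.20/session
high_cents=50  # ~$0.50/session

echo "Monthly: \$$((sessions * low_cents / 100))-\$$((sessions * high_cents / 100))"
echo "Annual:  \$$((sessions * low_cents * 12 / 100))-\$$((sessions * high_cents * 12 / 100))"
```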

---

## 🔄 Switching Between Models

**List available models:**
```bash
ollama list
```

**Pull a new model:**
```bash
ollama pull deepseek-coder-v2:16b
```

**Update the launch script to use the new model:**
```bash
nano ~/launch-claude-local.sh
```

Change:
```bash
ollama launch claude --model qwen3-coder-next
```

To:
```bash
ollama launch claude --model deepseek-coder-v2:16b
```
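The same edit can be done non-interactively with `sed`. The sketch below works on a temporary copy so it is safe to paste and try; point it at `~/launch-claude-local.sh` for the real thing:

```bash
# Swap the model name in a launch script with sed (demonstrated on a temp copy).
script=$(mktemp)
printf 'ollama launch claude --model qwen3-coder-next\n' > "$script"

sed -i 's/--model qwen3-coder-next/--model deepseek-coder-v2:16b/' "$script"

grep -- '--model' "$script"   # confirm the swap took effect
rm -f "$script"
```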

---

## 📚 Additional Resources

**Official Documentation:**
- Ollama: https://ollama.com/
- Claude Code: https://docs.claude.com/claude-code
- Anthropic Messages API: https://docs.anthropic.com/

**Model Info:**
- Qwen3-Coder-Next: https://ollama.com/library/qwen3-coder-next
- Qwen2.5-Coder: https://ollama.com/library/qwen2.5-coder

**Community:**
- Ollama GitHub: https://github.com/ollama/ollama
- Claude Code Issues: https://github.com/anthropics/claude-code/issues

---

## 🎯 Best Practices

### 1. Start Fresh for Each Project
```
❯ /reset
```
Clears the context and prevents confusion between projects.

### 2. Be Specific
**Bad:** "Fix this"
**Good:** "This function throws a TypeError when user_id is None. Add a null check."

### 3. Provide Context
Include relevant file names, error messages, and expected behavior.

### 4. Break Large Tasks Down
Instead of: "Build entire authentication system"
Do: "Create user model", then "Create login route", then "Add JWT tokens"

### 5. Review Generated Code
Local models can make mistakes. Always review before committing.

### 6. Use Permissions Wisely
Only grant the permissions needed for the current task.

### 7. Know When to Escalate
If you're still stuck after 3-4 attempts, switch to cloud Claude.

---

## 🔐 Security Considerations

### What's Safe

- ✅ Internal Firefrost code
- ✅ Open source projects
- ✅ Personal scripts
- ✅ Learning/experimentation

### What to Avoid

- ❌ Subscriber payment info
- ❌ API keys/secrets (use env vars instead)
- ❌ Personal data from database dumps
- ❌ Proprietary third-party code under NDA

**General rule:** If you wouldn't commit it to a public GitHub repo, be cautious.

---

## 📝 Future Enhancements

**Potential additions (not implemented yet):**

1. **RAG Integration** - Connect to the Firefrost Codex Qdrant database
2. **MCP Tools** - Add custom Model Context Protocol tools
3. **Multi-model switching** - Quickly switch between models
4. **Usage tracking** - Monitor what works well
5. **Custom prompts** - Firefrost-specific coding patterns

---

## ✅ Success Checklist

**Verify everything works:**

- [ ] SSH to TX1 successful
- [ ] Ollama v0.14+ installed
- [ ] Qwen3-Coder-Next model downloaded
- [ ] Claude Code CLI installed
- [ ] Launch script created and executable
- [ ] Successfully created a test file
- [ ] Permissions understood
- [ ] Know when to use local vs cloud

**If everything is checked:** You're ready to use local Claude Code! 🎉

---

## 🤝 Who Can Use This

**Trinity only:**
- Michael (frostystyle)
- Meg (Gingerfury66)
- Holly (unicorn20089)

**Not for subscribers** - this is an internal development tool.

---

## 📞 Support

**Issues with setup:**
- Check the troubleshooting section above
- Review the Ollama docs: https://ollama.com/
- Ask in the Trinity Discord channel

**Model selection questions:**
- Start with Qwen3-Coder-Next
- Try others if needed
- Document what works best

**General questions:**
- Refer to this guide
- Check the official Claude Code docs
- Experiment and learn

---

**🔥❄️ Built for Trinity. Built with love. Zero API costs. Full control.** 💙

**For children not yet born - code built by AI, owned by us.**

---

**Last Updated:** March 30, 2026
**Maintained By:** Trinity
**Location:** docs/tools/claude-code-local-setup.md