
Firefrost Arbiter - Troubleshooting Guide

Last Updated: March 30, 2026
Prepared by: Claude (Chronicler #49) + Gemini AI


🔍 Quick Diagnostics

Check Service Status

sudo systemctl status arbiter

View Recent Logs

sudo journalctl -u arbiter -n 50

Follow Live Logs

sudo journalctl -u arbiter -f

Check Health Endpoint

curl https://discord-bot.firefrostgaming.com/health

🚨 Common Issues & Solutions

1. "Invalid redirect URI" in Discord OAuth

Symptom: When clicking the linking URL or logging in as admin, Discord shows an "Invalid Redirect URI" error.

Cause: The redirect URI in your .env file doesn't exactly match what's registered in the Discord Developer Portal.

Solution:

  1. Check .env file:
cat .env | grep APP_URL

Should show: APP_URL=https://discord-bot.firefrostgaming.com (no trailing slash)

  2. Go to Discord Developer Portal → OAuth2 → General

  3. Verify exact URIs are registered:

    • https://discord-bot.firefrostgaming.com/auth/callback
    • https://discord-bot.firefrostgaming.com/admin/callback
  4. Important: Check for:

    • Trailing slashes (don't include them)
    • http vs https mismatch
    • www vs non-www
    • Typos in domain
  5. If you changed the URI, wait 5-10 minutes for Discord to propagate

  6. Restart the application:

sudo systemctl restart arbiter
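
The checks above can be automated with a small script. This is a minimal sketch, assuming Arbiter builds its redirect URIs by appending /auth/callback and /admin/callback to APP_URL; checkAppUrl is a hypothetical helper, not part of the Arbiter codebase.

```javascript
// Hypothetical helper: flags the common APP_URL mistakes listed above and
// returns the redirect URIs that should be registered in the Discord portal.
function checkAppUrl(appUrl) {
    const problems = [];
    if (appUrl.endsWith('/')) problems.push('trailing slash');

    let parsed;
    try {
        parsed = new URL(appUrl);
    } catch {
        return { problems: ['not a valid URL'], redirectUris: [] };
    }
    if (parsed.protocol !== 'https:') problems.push('not https');
    if (parsed.hostname.startsWith('www.')) problems.push('www prefix');

    // Assumption: callbacks are APP_URL (without trailing slash) plus a fixed path.
    const base = appUrl.replace(/\/+$/, '');
    return {
        problems,
        redirectUris: [`${base}/auth/callback`, `${base}/admin/callback`],
    };
}

console.log(checkAppUrl(process.env.APP_URL || 'https://discord-bot.firefrostgaming.com'));
```

Compare the printed redirectUris character by character with the entries in the Discord Developer Portal.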

2. "Bot missing permissions" when assigning roles

Symptom: Logs show "Failed to assign role" or "Missing Permissions" error when trying to assign Discord roles.

Cause: Either the bot wasn't invited with the correct permissions, or the bot's role is positioned below the roles it's trying to assign.

Solution:

Check 1: Bot Has "Manage Roles" Permission

  1. Go to Discord Server → Settings → Roles
  2. Find the bot's role (usually named after the bot)
  3. Verify "Manage Roles" permission is enabled
  4. If not, enable it

Check 2: Role Hierarchy (Most Common Issue)

  1. Go to Discord Server → Settings → Roles
  2. Find the bot's role in the list
  3. Drag it ABOVE all subscription tier roles
  4. The bot can only assign roles that are below its own role

Example correct hierarchy:

1. Owner (you)
2. Admin
3. [Bot Role] ← MUST BE HERE
4. Sovereign
5. Fire Legend
6. Frost Legend
... (all other subscriber roles)
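
The hierarchy rule can be sketched in a few lines. Discord gives every role a numeric position, where a higher number means higher in the list, and a bot can only manage roles positioned below its own highest role. The role names and positions below are illustrative values, not read from the real server.

```javascript
// Illustrative role list; positions are made-up examples (higher = higher in the list).
const roles = [
    { name: 'Admin', position: 5 },
    { name: 'Bot Role', position: 4 },
    { name: 'Sovereign', position: 3 },
    { name: 'Fire Legend', position: 2 },
    { name: 'Frost Legend', position: 1 },
];

// Returns the names of the roles a bot could assign, given the position of
// the bot's highest role.
function assignableRoles(botHighestPosition, allRoles) {
    return allRoles
        .filter(role => role.position < botHighestPosition)
        .map(role => role.name);
}

console.log(assignableRoles(4, roles)); // bot above all subscriber roles
console.log(assignableRoles(2, roles)); // bot dragged too low
```

With the bot at position 4 every subscriber role is assignable; at position 2 only Frost Legend is, which matches the "Missing Permissions" symptom for the higher tiers.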

Check 3: Re-invite Bot with Correct Permissions

If role hierarchy is correct but still failing:

  1. Go to Discord Developer Portal → OAuth2 → URL Generator
  2. Select scopes: bot
  3. Select permissions: Manage Roles (minimum)
  4. Copy generated URL
  5. Visit URL and re-authorize bot (this updates permissions)

Test:

# Check if bot can see roles
sudo journalctl -u arbiter -n 100 | grep "Role ID"

3. "Session not persisting" across requests

Symptom: Admin panel logs you out immediately after login, or every page reload requires re-authentication.

Cause: Session cookies not being saved properly, usually due to reverse proxy configuration.

Solution:

Check 1: Express Trust Proxy Setting

Verify in src/index.js:

app.set('trust proxy', 1);

This line MUST be present before session middleware.

Check 2: Nginx Proxy Headers

Edit Nginx config:

sudo nano /etc/nginx/sites-available/arbiter

Verify these headers exist in the location / block:

proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
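
Checks 1 and 2 work together: a session cookie marked secure is only set when Express considers the request to be HTTPS, and behind Nginx that depends on both the trust proxy setting and the X-Forwarded-Proto header. The sketch below is a simplified model of that decision, not Express source code; the function and field names are assumptions made for illustration.

```javascript
// Simplified model of how a proxied request's protocol is resolved.
function requestProtocol({ trustProxy, forwardedProto, socketEncrypted }) {
    const socketProto = socketEncrypted ? 'https' : 'http';
    if (!trustProxy) return socketProto; // forwarded header is ignored
    const header = forwardedProto || socketProto;
    return header.split(',')[0].trim(); // first hop wins
}

// A cookie with secure: true is only sent over a request seen as https.
function secureCookieIsSet(request) {
    return requestProtocol(request) === 'https';
}

// Nginx terminates TLS and talks plain HTTP to Node on port 3500:
console.log(secureCookieIsSet({ trustProxy: false, forwardedProto: 'https', socketEncrypted: false }));
console.log(secureCookieIsSet({ trustProxy: true, forwardedProto: 'https', socketEncrypted: false }));
console.log(secureCookieIsSet({ trustProxy: true, forwardedProto: undefined, socketEncrypted: false }));
```

Only the second case sets the cookie, which is why a missing trust proxy line or a missing X-Forwarded-Proto header both produce the "logged out immediately" symptom.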

Check 3: Cookie Settings for Development

If testing on http://localhost, update src/index.js:

cookie: {
    secure: process.env.NODE_ENV === 'production', // false for localhost
    httpOnly: true,
    maxAge: 1000 * 60 * 60 * 24 * 7 // 7 days, in milliseconds
}

Check 4: SESSION_SECRET is Set

grep SESSION_SECRET .env

Should show a 64-character hex string.
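
If the value is missing or too short, a new one can be generated with Node's built-in crypto module. A minimal sketch; generateSessionSecret is a hypothetical helper name.

```javascript
const crypto = require('node:crypto');

// 32 random bytes encode to a 64-character hex string.
function generateSessionSecret() {
    return crypto.randomBytes(32).toString('hex');
}

console.log(`SESSION_SECRET=${generateSessionSecret()}`);
```

Paste the printed line into .env. Changing the secret invalidates existing sessions, so admins will need to log in again.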

Restart after changes:

sudo systemctl restart arbiter
sudo systemctl reload nginx

4. "Ghost API 401 error"

Symptom: Logs show "Ghost API 401 Unauthorized" when trying to search users or update members.

Cause: Invalid or incorrectly formatted Admin API key.

Solution:

Check 1: API Key Format

cat .env | grep CMS_ADMIN_KEY

Should be in the format key_id:secret (with the colon). Both halves are hexadecimal strings.

Example:

CMS_ADMIN_KEY=65f8a1b2c3d4e5f6a7b8c9d0:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2

Check 2: Integration Still Exists

  1. Go to Ghost Admin → Settings → Integrations
  2. Find "Firefrost Arbiter" integration
  3. Verify it's not deleted or disabled
  4. If missing, create new integration and update .env

Check 3: Ghost URL is Correct

cat .env | grep CMS_URL

Should match your Ghost installation URL exactly (no trailing slash).

Check 4: Test API Key Manually

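curl -H "Authorization: Ghost <jwt_token>" \
  "https://firefrostgaming.com/ghost/api/admin/members/"

Note that <jwt_token> is not the raw key from .env. The Ghost Admin API expects a short-lived JWT (HS256, audience /admin/) signed with the secret half of the key, with the key ID in the kid header. Sending the raw key always returns 401.

Should return JSON with the member list. If a correctly generated token still returns 401, the key is invalid.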
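
The Ghost Admin API authenticates with a short-lived JWT derived from the id:secret key rather than with the key itself. The following is a minimal sketch of generating such a token using only Node's crypto module; ghostAdminToken is a hypothetical helper, and the /admin/ audience assumes a current Ghost version.

```javascript
const crypto = require('node:crypto');

const base64url = (input) => Buffer.from(input).toString('base64url');

// Builds an HS256 JWT from a Ghost Admin API key in id:secret format.
function ghostAdminToken(adminKey, nowSeconds = Math.floor(Date.now() / 1000)) {
    const [id, secret] = adminKey.split(':');
    if (!id || !secret) throw new Error('Expected key in id:secret format');

    const header = { alg: 'HS256', typ: 'JWT', kid: id };
    // Token is valid for 5 minutes.
    const payload = { iat: nowSeconds, exp: nowSeconds + 5 * 60, aud: '/admin/' };

    const signingInput = `${base64url(JSON.stringify(header))}.${base64url(JSON.stringify(payload))}`;
    // The secret is hex-encoded and must be decoded to bytes before signing.
    const signature = crypto
        .createHmac('sha256', Buffer.from(secret, 'hex'))
        .update(signingInput)
        .digest('base64url');

    return `${signingInput}.${signature}`;
}

if (process.env.CMS_ADMIN_KEY) {
    console.log(ghostAdminToken(process.env.CMS_ADMIN_KEY));
}
```

Run it with CMS_ADMIN_KEY exported in the shell and use the printed value after "Ghost " in the Authorization header within five minutes.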

After fixing:

sudo systemctl restart arbiter

5. "Database locked" errors

Symptom: Logs show "SQLITE_BUSY: database is locked" when multiple webhooks arrive simultaneously.

Cause: SQLite locks the database during writes. If multiple webhooks arrive at exactly the same time, one may fail.

Solution:

Option 1: Increase Timeout (Recommended)

Edit src/database.js:

const Database = require('better-sqlite3');
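const db = new Database('linking.db', { timeout: 10000 });

This gives SQLite 10 seconds to wait for locks to clear. better-sqlite3 already waits 5 seconds (timeout: 5000) by default, so only a larger value changes behavior.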

Option 2: Add WAL Mode (Write-Ahead Logging)

Edit src/database.js, add after database creation:

db.pragma('journal_mode = WAL');

WAL mode lets reads proceed while a write is in progress. Writes are still serialized, so keep the busy timeout in place as well.

Option 3: Retry Logic (For Critical Operations)

In src/routes/webhook.js, wrap database operations:

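// Must run inside an async handler because of the await below.
let retries = 3; // total attempts
while (retries > 0) {
    try {
        stmt.run(token, customer_email, tier, subscription_id);
        break; // write succeeded
    } catch (error) {
        if (error.code === 'SQLITE_BUSY' && retries > 1) {
            retries--;
            // Wait 100 ms before trying again
            await new Promise(resolve => setTimeout(resolve, 100));
        } else {
            throw error; // out of attempts, or not a lock error
        }
    }
}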
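
If several routes need the same protection, the loop can be factored into a reusable helper. This is a sketch under stated assumptions: withBusyRetry is a hypothetical name, and the exponential backoff is a swapped-in variation on the fixed 100 ms delay above.

```javascript
// Retries an operation when it fails with SQLITE_BUSY, waiting longer
// after each failed attempt. Any other error is rethrown immediately.
async function withBusyRetry(operation, { attempts = 3, baseDelayMs = 100 } = {}) {
    for (let attempt = 1; ; attempt++) {
        try {
            return await operation();
        } catch (error) {
            if (error.code !== 'SQLITE_BUSY' || attempt >= attempts) throw error;
            // Delays: baseDelayMs, 2 * baseDelayMs, 4 * baseDelayMs, ...
            const delayMs = baseDelayMs * 2 ** (attempt - 1);
            await new Promise(resolve => setTimeout(resolve, delayMs));
        }
    }
}
```

Inside an async webhook handler it could be called as `await withBusyRetry(() => stmt.run(token, customer_email, tier, subscription_id));`.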

After changes:

sudo systemctl restart arbiter

6. "Email not sending"

Symptom: Webhook processes successfully but subscriber never receives linking email.

Cause: SMTP connection issue, firewall blocking port 587, or incorrect credentials.

Solution:

Check 1: SMTP Credentials

cat .env | grep SMTP

Verify:

  • SMTP_HOST=38.68.14.188
  • SMTP_USER=noreply@firefrostgaming.com
  • SMTP_PASS=<correct password>

Check 2: Port 587 is Open

From Command Center:

telnet 38.68.14.188 587

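Should connect. If the connection is refused or times out, open the port in the firewall on the mail server (Billing VPS), not on Command Center:

sudo ufw allow 587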

Check 3: Test SMTP Manually

node -e "
const nodemailer = require('nodemailer');
const t = nodemailer.createTransport({
  host: '38.68.14.188',
  port: 587,
  secure: false,
  auth: { user: 'noreply@firefrostgaming.com', pass: 'YOUR_PASSWORD' }
});
t.sendMail({
  from: 'noreply@firefrostgaming.com',
  to: 'your_email@example.com',
  subject: 'Test',
  text: 'Testing SMTP'
}).then(() => console.log('Sent!')).catch(console.error);
"

Check 4: Mailcow Logs

SSH to Billing VPS:

ssh root@38.68.14.188
docker logs -f mailcowdockerized_postfix-mailcow_1 | grep noreply

Look for errors or rejections.

Check 5: Spam Folder

Check if email landed in spam/junk folder.

Check 6: DKIM/SPF Records

Verify DNS records are set up correctly (should be done already, but worth checking if delivery is failing).


7. "Webhook signature verification failed"

Symptom: Paymenter sends webhook but application logs "Invalid webhook signature" and returns 401.

Cause: WEBHOOK_SECRET in .env doesn't match the secret configured in Paymenter.

Solution:

Check 1: Secrets Match

cat .env | grep WEBHOOK_SECRET

Compare to Paymenter webhook configuration:

  1. Paymenter Admin → System → Webhooks
  2. Find Arbiter webhook
  3. Check secret field

They must match exactly.

Check 2: Header Name

Verify that Paymenter sends the signature in the x-signature header.

Edit src/middleware/verifyWebhook.js if needed:

const signature = req.headers['x-signature']; // or 'x-paymenter-signature' or whatever Paymenter uses

Check 3: Signature Algorithm

Verify Paymenter uses HMAC SHA256. If different, update src/middleware/verifyWebhook.js:

const expectedSignature = crypto
    .createHmac('sha256', secret) // or 'sha1', 'md5', etc.
    .update(payload)
    .digest('hex');

Check 4: Payload Format

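Paymenter might stringify the JSON differently than the application does, and the HMAC only matches when it is computed over the exact bytes Paymenter sent. Add temporary debug logging (remove it once the cause is found, since it writes payloads and valid signatures to the journal):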

console.log('Received signature:', signature);
console.log('Payload:', payload);
console.log('Expected signature:', expectedSignature);
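
Checks 2 to 4 can be combined into one verification routine. The sketch below assumes a hex-encoded HMAC-SHA256 over the raw request body, which is what the checks above ask you to confirm for Paymenter; isValidSignature is a hypothetical helper and not necessarily how src/middleware/verifyWebhook.js is written.

```javascript
const crypto = require('node:crypto');

// rawBody must be the exact bytes received, not JSON.stringify(req.body).
function isValidSignature(rawBody, receivedSignature, secret) {
    const expected = crypto
        .createHmac('sha256', secret)
        .update(rawBody)
        .digest('hex');

    const expectedBuf = Buffer.from(expected, 'utf8');
    const receivedBuf = Buffer.from(String(receivedSignature || ''), 'utf8');

    // timingSafeEqual throws on length mismatch, so compare lengths first.
    return expectedBuf.length === receivedBuf.length
        && crypto.timingSafeEqual(expectedBuf, receivedBuf);
}
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct. A payload that has been parsed and re-serialized (different spacing or key order) will fail this check even with the right secret.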

Temporary Bypass (Testing Only):

To test without signature verification (NOT for production):

// In src/routes/webhook.js, temporarily comment out:
// router.post('/billing', verifyBillingWebhook, validateBillingPayload, async (req, res) => {
router.post('/billing', validateBillingPayload, async (req, res) => {

After fixing:

sudo systemctl restart arbiter

🔥 Emergency Procedures

Application Won't Start

Symptom: systemctl status arbiter shows "failed" status.

Diagnosis:

sudo journalctl -u arbiter -n 100

Look for:

  • Missing .env file
  • Syntax errors in code
  • Missing dependencies
  • Port 3500 already in use

Solutions:

Port in use:

sudo lsof -i :3500
sudo kill -9 <PID>
sudo systemctl start arbiter

Missing dependencies:

cd /home/architect/arbiter
npm install
sudo systemctl restart arbiter

Syntax errors: Fix the reported file and line number, then:

sudo systemctl restart arbiter

Database Corruption

Symptom: Application crashes with "database disk image is malformed" error.

Solution:

# Stop application
sudo systemctl stop arbiter

# Check database integrity
sqlite3 linking.db "PRAGMA integrity_check;"

If corrupted:

# Restore from backup (see DEPLOYMENT.md Phase 5)
mv linking.db linking.db.corrupt
cp /home/architect/backups/arbiter/linking_YYYYMMDD_HHMMSS.db linking.db

# Restart application
sudo systemctl start arbiter

All Webhooks Suddenly Failing

Symptom: Every webhook returns 500 error, but application is running.

Check 1: Disk Space

df -h

If / is at 100%, clear space:

# Clean old logs
sudo journalctl --vacuum-time=7d

# Clean old backups
find /home/architect/backups/arbiter -type f -mtime +7 -delete

Check 2: Memory Usage

free -h

If out of memory:

sudo systemctl restart arbiter

Check 3: Discord Bot Disconnected

curl http://localhost:3500/health

If discord: "down":

sudo systemctl restart arbiter

📊 Performance Issues

Slow Response Times

Check 1: Database Size

ls -lh linking.db sessions.db

If >100MB, consider cleanup:

sqlite3 linking.db "DELETE FROM link_tokens WHERE used = 1 AND created_at < datetime('now', '-30 days');"
sqlite3 linking.db "VACUUM;"

Check 2: High CPU Usage

top

If node process is using >80% CPU consistently, check for:

  • Infinite loops in code
  • Too many concurrent webhooks
  • Discord API rate limiting (bot trying to reconnect repeatedly)

Check 3: Rate Limiting Too Strict

If users report frequent "Too many requests" errors:

Edit src/index.js:

const apiLimiter = rateLimit({
    windowMs: 15 * 60 * 1000,
    max: 200, // Increase from 100
    // ...
});

🔐 Security Concerns

Suspicious Database Entries

Check for unusual tokens:

sqlite3 linking.db "SELECT email, tier, created_at FROM link_tokens WHERE used = 0 ORDER BY created_at DESC LIMIT 20;"

Check audit log for unauthorized actions:

sqlite3 linking.db "SELECT * FROM audit_logs ORDER BY timestamp DESC LIMIT 20;"

If compromised:

  1. Change all secrets in .env
  2. Rotate Discord bot token
  3. Regenerate Ghost Admin API key
  4. Clear all unused tokens:
sqlite3 linking.db "DELETE FROM link_tokens WHERE used = 0;"
  5. Force all admin re-authentication:
rm sessions.db
  6. Restart application

📞 Getting Help

Before asking for help, collect:

  1. Service status:
sudo systemctl status arbiter > /tmp/arbiter-status.txt
  2. Recent logs:
sudo journalctl -u arbiter -n 200 > /tmp/arbiter-logs.txt
  3. Configuration (sanitized):
cat .env | sed 's/=.*/=REDACTED/' > /tmp/arbiter-config.txt
  4. Health check output:
curl https://discord-bot.firefrostgaming.com/health > /tmp/arbiter-health.txt
  5. Database stats:
sqlite3 linking.db "SELECT COUNT(*) FROM link_tokens;" > /tmp/arbiter-db-stats.txt
sqlite3 linking.db "SELECT COUNT(*) FROM audit_logs;" >> /tmp/arbiter-db-stats.txt

Share these files (remove any actual secrets first) when requesting support.
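
The sed redaction above hides every value, which also hides whether a variable was left empty. As an optional alternative, the sketch below reports each variable as EMPTY or REDACTED without revealing its value. summarizeEnv is a hypothetical helper and assumes simple NAME=value lines.

```javascript
const fs = require('node:fs');

// Maps each variable name in a .env file to 'EMPTY' or 'REDACTED'.
// Blank lines and comment lines are skipped.
function summarizeEnv(text) {
    const summary = {};
    for (const line of text.split(/\r?\n/)) {
        const match = line.match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=(.*)$/);
        if (!match) continue;
        summary[match[1]] = match[2].trim() === '' ? 'EMPTY' : 'REDACTED';
    }
    return summary;
}

if (fs.existsSync('.env')) {
    console.log(summarizeEnv(fs.readFileSync('.env', 'utf8')));
}
```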


🛠️ Tools & Commands Reference

Restart Everything

sudo systemctl restart arbiter
sudo systemctl reload nginx

View All Environment Variables

cat .env

Check Which Process is Using Port 3500

sudo lsof -i :3500

Test Database Connection

sqlite3 linking.db "SELECT 1;"

Force Regenerate Sessions Database

sudo systemctl stop arbiter
rm sessions.db
sudo systemctl start arbiter

Manually Cleanup Old Tokens

sqlite3 linking.db "DELETE FROM link_tokens WHERE created_at < datetime('now', '-1 day');"

Export Audit Logs to CSV

sqlite3 -header -csv linking.db "SELECT * FROM audit_logs ORDER BY timestamp DESC;" > audit_export.csv

🔥❄️ When in doubt, check the logs first. Most issues reveal themselves there. 💙