# Firefrost Arbiter - Troubleshooting Guide

**Last Updated:** March 30, 2026
**Prepared by:** Claude (Chronicler #49) + Gemini AI

---
## 🔍 Quick Diagnostics

### Check Service Status
```bash
sudo systemctl status arbiter
```

### View Recent Logs
```bash
sudo journalctl -u arbiter -n 50
```

### Follow Live Logs
```bash
sudo journalctl -u arbiter -f
```

### Check Health Endpoint
```bash
curl https://discord-bot.firefrostgaming.com/health
```

---
## 🚨 Common Issues & Solutions

### 1. "Invalid redirect URI" in Discord OAuth

**Symptom:** When clicking the linking URL or the admin login, Discord shows an "Invalid Redirect URI" error.

**Cause:** The redirect URI built from your `.env` file doesn't exactly match what's registered in the Discord Developer Portal.

**Solution:**

1. Check the `.env` file:
   ```bash
   cat .env | grep APP_URL
   ```

   Should show: `APP_URL=https://discord-bot.firefrostgaming.com` (no trailing slash)

2. Go to Discord Developer Portal → OAuth2 → General
3. Verify that these exact URIs are registered:
   - `https://discord-bot.firefrostgaming.com/auth/callback`
   - `https://discord-bot.firefrostgaming.com/admin/callback`

4. **Important:** Check for:
   - Trailing slashes (don't include them)
   - `http` vs `https` mismatch
   - `www` vs non-www
   - Typos in the domain

5. If you changed the URI, wait 5-10 minutes for Discord to propagate the change

6. Restart the application:
   ```bash
   sudo systemctl restart arbiter
   ```
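
The checklist above can also be run as a quick script. This is a minimal sketch, not part of the Arbiter codebase: `diagnoseRedirectUri` is a hypothetical helper, and it assumes the application builds its redirect URI by appending the callback path to `APP_URL`. It compares that result against the registered URIs and reports the trailing-slash, scheme, and `www` mismatches listed in step 4.

```javascript
// Hypothetical helper (not from the Arbiter source): explains why a
// redirect URI built from APP_URL might not match a registered one.
function diagnoseRedirectUri(appUrl, callbackPath, registeredUris) {
  // Assumption: the app concatenates APP_URL and the callback path.
  const candidate = appUrl + callbackPath;
  if (registeredUris.includes(candidate)) {
    return { ok: true, candidate, hints: [] };
  }

  const hints = [];
  if (appUrl.endsWith('/')) {
    hints.push('APP_URL ends with a trailing slash');
  }

  const sent = new URL(candidate);
  const stripWww = (host) => host.replace(/^www\./, '');
  for (const uri of registeredUris) {
    const registered = new URL(uri);
    // Only compare against entries for the same site.
    if (stripWww(registered.hostname) !== stripWww(sent.hostname)) continue;
    if (registered.protocol !== sent.protocol) {
      hints.push(`scheme mismatch with ${uri}`);
    }
    if (registered.hostname !== sent.hostname) {
      hints.push(`www mismatch with ${uri}`);
    }
  }

  if (hints.length === 0) {
    hints.push('no close match; check for typos in the domain or path');
  }
  return { ok: false, candidate, hints };
}

// Example: a trailing slash in APP_URL produces a double slash in the URI.
const registered = [
  'https://discord-bot.firefrostgaming.com/auth/callback',
  'https://discord-bot.firefrostgaming.com/admin/callback',
];
console.log(
  diagnoseRedirectUri('https://discord-bot.firefrostgaming.com/', '/auth/callback', registered)
);
```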

---

### 2. "Bot missing permissions" when assigning roles

**Symptom:** Logs show "Failed to assign role" or "Missing Permissions" error when trying to assign Discord roles.

**Cause:** Either the bot wasn't invited with the correct permissions, or the bot's role is positioned below the roles it's trying to assign.

**Solution:**

**Check 1: Bot Has "Manage Roles" Permission**
1. Go to Discord Server → Settings → Roles
2. Find the bot's role (usually named after the bot)
3. Verify "Manage Roles" permission is enabled
4. If not, enable it

**Check 2: Role Hierarchy (Most Common Issue)**
1. Go to Discord Server → Settings → Roles
2. Find the bot's role in the list
3. **Drag it ABOVE all subscription tier roles**
4. The bot can only assign roles that are below its own role

Example correct hierarchy:
```
1. Owner (you)
2. Admin
3. [Bot Role] ← MUST BE HERE
4. Sovereign
5. Fire Legend
6. Frost Legend
... (all other subscriber roles)
```

**Check 3: Re-invite Bot with Correct Permissions**

If role hierarchy is correct but still failing:

1. Go to Discord Developer Portal → OAuth2 → URL Generator
2. Select scopes: `bot`
3. Select permissions: `Manage Roles` (minimum)
4. Copy generated URL
5. Visit URL and re-authorize bot (this updates permissions)

**Test:**
```bash
# Check if bot can see roles
sudo journalctl -u arbiter -n 100 | grep "Role ID"
```
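
The hierarchy rule from Check 2 can be expressed as a small comparison. The sketch below is illustrative only: `findUnassignableRoles` and the sample data are hypothetical, and it assumes role positions are numbers where a larger value sits higher in the server's role list (the reverse of the 1-to-6 numbering in the example hierarchy above). Any tier role at or above the bot's highest role is reported as one the bot cannot assign.

```javascript
// Hypothetical helper: lists tier roles the bot cannot assign.
// Assumption: a larger `position` means the role is higher in the list.
function findUnassignableRoles(botHighestPosition, tierRoles) {
  return tierRoles
    .filter((role) => role.position >= botHighestPosition)
    .map((role) => role.name);
}

// Sample data mirroring the example hierarchy (positions are made up).
const tierRoles = [
  { name: 'Sovereign', position: 3 },
  { name: 'Fire Legend', position: 2 },
  { name: 'Frost Legend', position: 1 },
];

// Bot role above every tier role: nothing is blocked.
console.log(findUnassignableRoles(4, tierRoles));

// Bot role dragged too low: the tiers at or above it are blocked.
console.log(findUnassignableRoles(2, tierRoles));
```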

---

### 3. "Session not persisting" across requests

**Symptom:** The admin panel logs you out immediately after login, or every page reload requires re-authentication.

**Cause:** Session cookies are not being saved properly, usually due to reverse proxy configuration.

**Solution:**

**Check 1: Express Trust Proxy Setting**

Verify in `src/index.js`:
```javascript
app.set('trust proxy', 1);
```

This line MUST be present before the session middleware. Without it, Express treats requests forwarded by Nginx as plain HTTP, and a cookie marked `secure` is never set.

**Check 2: Nginx Proxy Headers**

Edit Nginx config:
```bash
sudo nano /etc/nginx/sites-available/arbiter
```

Verify these headers exist in the `location /` block:
```nginx
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
```

**Check 3: Cookie Settings for Development**

If testing on `http://localhost`, update `src/index.js`:
```javascript
cookie: {
  secure: process.env.NODE_ENV === 'production', // false for localhost
  httpOnly: true,
  maxAge: 1000 * 60 * 60 * 24 * 7 // 7 days, in milliseconds
}
```

**Check 4: SESSION_SECRET is Set**
```bash
grep SESSION_SECRET .env
```

Should show a 64-character hex string.
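
If `SESSION_SECRET` is missing or has the wrong shape, a new value can be generated with Node's built-in `crypto` module. This is a minimal sketch: the function names are illustrative, and the only assumption taken from this guide is that the secret should be a 64-character hex string, which is what 32 random bytes produce.

```javascript
const { randomBytes } = require('node:crypto');

// 32 random bytes encode to 64 hexadecimal characters.
function generateSessionSecret() {
  return randomBytes(32).toString('hex');
}

// Checks the shape described in Check 4 (64 hex characters).
function looksLikeSessionSecret(value) {
  return typeof value === 'string' && /^[0-9a-f]{64}$/i.test(value);
}

// Prints a line that can be pasted into .env.
console.log(`SESSION_SECRET=${generateSessionSecret()}`);
```

Changing the secret invalidates existing sessions, so admins will need to log in again after the restart.
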

**Restart after changes:**
```bash
sudo systemctl restart arbiter
sudo systemctl reload nginx
```

---
### 4. "Ghost API 401 error"

**Symptom:** Logs show "Ghost API 401 Unauthorized" when trying to search users or update members.

**Cause:** Invalid or incorrectly formatted Admin API key.

**Solution:**

**Check 1: API Key Format**
```bash
cat .env | grep CMS_ADMIN_KEY
```

Should be in the format `key_id:secret` (with the colon). Both parts are hexadecimal strings.

Example (illustrative value, not a real key):
```
CMS_ADMIN_KEY=65f8a1b2c3d4e5f6a7b8c9d0:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2
```

**Check 2: Integration Still Exists**

1. Go to Ghost Admin → Settings → Integrations
2. Find "Firefrost Arbiter" integration
3. Verify it's not deleted or disabled
4. If missing, create new integration and update `.env`

**Check 3: Ghost URL is Correct**
```bash
cat .env | grep CMS_URL
```

Should match your Ghost installation URL exactly (no trailing slash).

**Check 4: Test API Key Manually**

The Admin API does not accept the raw `key_id:secret` value in the `Authorization` header. It expects a short-lived JWT signed with the key's secret, so generate a token from the key first and then send it:

```bash
curl -H "Authorization: Ghost <your_jwt>" \
  "https://firefrostgaming.com/ghost/api/admin/members/"
```

Should return JSON with the member list. If the response is a 401, the key (or the token built from it) is invalid.
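
A token for the manual test in Check 4 can be built with Node's standard library alone. The sketch below follows the token layout described in Ghost's Admin API documentation as I understand it (HS256, the key ID in the `kid` header, a five-minute expiry, audience `/admin/`, and the secret decoded from hex); treat those details as assumptions and confirm them against the documentation for your Ghost version. `createGhostAdminToken` is a hypothetical helper name, not something from the Arbiter source.

```javascript
const { createHmac } = require('node:crypto');

// Encodes a string as unpadded, URL-safe base64 (the JWT segment format).
const base64url = (input) => Buffer.from(input).toString('base64url');

// Hypothetical helper: builds a short-lived JWT from a key_id:secret pair.
function createGhostAdminToken(adminKey, nowSeconds = Math.floor(Date.now() / 1000)) {
  const [id, secret] = String(adminKey).split(':');
  if (!id || !secret) {
    throw new Error('Admin key must be in the format key_id:secret');
  }

  const header = base64url(JSON.stringify({ alg: 'HS256', typ: 'JWT', kid: id }));
  const payload = base64url(JSON.stringify({
    iat: nowSeconds,
    exp: nowSeconds + 5 * 60, // assumed maximum lifetime: five minutes
    aud: '/admin/',
  }));

  // The secret half of the key is hex, so decode it to raw bytes first.
  const signature = createHmac('sha256', Buffer.from(secret, 'hex'))
    .update(`${header}.${payload}`)
    .digest('base64url');

  return `${header}.${payload}.${signature}`;
}

// Prints a token when CMS_ADMIN_KEY is available in the environment.
if (process.env.CMS_ADMIN_KEY) {
  console.log(createGhostAdminToken(process.env.CMS_ADMIN_KEY));
}
```

The printed value is what replaces the token placeholder in the `curl` command. Because the token expires quickly, generate it immediately before testing.
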

**After fixing:**
```bash
sudo systemctl restart arbiter
```

---
### 5. "Database locked" errors

**Symptom:** Logs show "SQLITE_BUSY: database is locked" when multiple webhooks arrive simultaneously.

**Cause:** SQLite locks the database during writes. If multiple webhooks arrive at exactly the same time, one may fail.

**Solution:**

**Option 1: Increase Timeout (Recommended)**

Edit `src/database.js`:
```javascript
const Database = require('better-sqlite3');
const db = new Database('linking.db', { timeout: 5000 });
```

This makes SQLite wait up to 5 seconds for a lock to clear before raising `SQLITE_BUSY`. Note that 5000 ms is also the default documented for better-sqlite3, so raise the value if the errors persist.

**Option 2: Add WAL Mode (Write-Ahead Logging)**

Edit `src/database.js`, add after database creation:
```javascript
db.pragma('journal_mode = WAL');
```

In WAL mode, readers no longer block the writer and the writer no longer blocks readers. Only one write can proceed at a time, but lock contention drops considerably.

**Option 3: Retry Logic (For Critical Operations)**

In `src/routes/webhook.js`, wrap database operations:
```javascript
// Try the insert up to 3 times, pausing 100 ms after each SQLITE_BUSY.
// Must run inside an async function because of the await below.
let retries = 3;
while (retries > 0) {
  try {
    stmt.run(token, customer_email, tier, subscription_id);
    break;
  } catch (error) {
    if (error.code === 'SQLITE_BUSY' && retries > 1) {
      retries--;
      await new Promise(resolve => setTimeout(resolve, 100));
    } else {
      // Not a lock error, or out of attempts: surface the failure.
      throw error;
    }
  }
}
```
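
If several routes need the same protection, the retry loop from Option 3 could be factored into a reusable wrapper. This is a sketch under assumptions rather than code from the Arbiter repository: `withBusyRetry` is a hypothetical name, and unlike the fixed 100 ms pause above it doubles the delay after each failed attempt. It only retries errors whose `code` is `SQLITE_BUSY`; anything else is rethrown immediately.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: retries `operation` while it throws SQLITE_BUSY.
async function withBusyRetry(operation, { attempts = 3, baseDelayMs = 100 } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      // `await` lets the wrapper handle sync and async operations alike.
      return await operation();
    } catch (error) {
      const outOfAttempts = attempt >= attempts;
      if (error.code !== 'SQLITE_BUSY' || outOfAttempts) {
        throw error;
      }
      // Delay doubles each time: 100 ms, 200 ms, 400 ms, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

// Demonstration with a fake operation that is busy twice, then succeeds.
let calls = 0;
withBusyRetry(() => {
  calls++;
  if (calls < 3) {
    throw Object.assign(new Error('database is locked'), { code: 'SQLITE_BUSY' });
  }
  return 'inserted';
}, { baseDelayMs: 10 }).then((result) => console.log(result, 'after', calls, 'calls'));
```

In the webhook route, the call would look like `await withBusyRetry(() => stmt.run(token, customer_email, tier, subscription_id));`.
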

**After changes:**
```bash
sudo systemctl restart arbiter
```

---
### 6. "Email not sending"

**Symptom:** The webhook processes successfully, but the subscriber never receives the linking email.

**Cause:** SMTP connection issue, firewall blocking port 587, or incorrect credentials.

**Solution:**

**Check 1: SMTP Credentials**
```bash
cat .env | grep SMTP
```

Verify:
- `SMTP_HOST=38.68.14.188`
- `SMTP_USER=noreply@firefrostgaming.com`
- `SMTP_PASS=<correct password>`

**Check 2: Port 587 is Open**

From Command Center:
```bash
telnet 38.68.14.188 587
```

Should connect. If "Connection refused", allow the port in the firewall on the mail server (38.68.14.188):
```bash
sudo ufw allow 587
```

**Check 3: Test SMTP Manually**

Run this from the application directory so that `nodemailer` resolves:

```bash
node -e "
const nodemailer = require('nodemailer');
const t = nodemailer.createTransport({
  host: '38.68.14.188',
  port: 587,
  secure: false,
  auth: { user: 'noreply@firefrostgaming.com', pass: 'YOUR_PASSWORD' }
});
t.sendMail({
  from: 'noreply@firefrostgaming.com',
  to: 'your_email@example.com',
  subject: 'Test',
  text: 'Testing SMTP'
}).then(() => console.log('Sent!')).catch(console.error);
"
```

**Check 4: Mailcow Logs**

SSH to Billing VPS:
```bash
ssh root@38.68.14.188
docker logs -f mailcowdockerized_postfix-mailcow_1 | grep noreply
```

Look for errors or rejections.

**Check 5: Spam Folder**

Check whether the email landed in the spam/junk folder.

**Check 6: DKIM/SPF Records**

Verify that the DKIM and SPF DNS records are set up correctly (this should already be done, but it is worth checking if delivery is failing).

---
### 7. "Webhook signature verification failed"

**Symptom:** Paymenter sends the webhook, but the application logs "Invalid webhook signature" and returns 401.

**Cause:** `WEBHOOK_SECRET` in `.env` doesn't match the secret configured in Paymenter.

**Solution:**

**Check 1: Secrets Match**
```bash
cat .env | grep WEBHOOK_SECRET
```

Compare to Paymenter webhook configuration:
1. Paymenter Admin → System → Webhooks
2. Find Arbiter webhook
3. Check secret field

They must match exactly.

**Check 2: Header Name**

Verify Paymenter sends the signature in the `x-signature` header.

Edit `src/middleware/verifyWebhook.js` if needed:
```javascript
const signature = req.headers['x-signature']; // or 'x-paymenter-signature' or whatever Paymenter uses
```

**Check 3: Signature Algorithm**

Verify Paymenter uses HMAC SHA256. If different, update `src/middleware/verifyWebhook.js`:
```javascript
const expectedSignature = crypto
  .createHmac('sha256', secret) // or 'sha1', 'md5', etc.
  .update(payload)
  .digest('hex');
```

**Check 4: Payload Format**

Paymenter might stringify the JSON differently than the application does after parsing it. The HMAC has to be computed over the exact bytes that were sent, so a re-serialized body can produce a different signature. Add debug logging to compare the values:
```javascript
console.log('Received signature:', signature);
console.log('Payload:', payload);
console.log('Expected signature:', expectedSignature);
```
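
The checks above can be tied together in one self-contained comparison. This is a sketch, not the contents of `src/middleware/verifyWebhook.js`: `isValidSignature` is a hypothetical function, and it assumes the signature arrives hex-encoded and is an HMAC over the raw request body. It uses `crypto.timingSafeEqual` for the comparison, and the example at the bottom shows how a re-serialized payload fails even with the correct secret.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Hypothetical check: true when signatureHex is the HMAC of rawBody.
// Assumptions: hex-encoded signature, HMAC computed over the raw body.
function isValidSignature(rawBody, signatureHex, secret, algorithm = 'sha256') {
  if (typeof signatureHex !== 'string') return false;

  const expected = createHmac(algorithm, secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, 'hex');

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Example: the sender signs the exact bytes it transmits.
const secret = 'example-secret';
const sentBody = '{"event":"invoice.paid","id":42}';
const signature = createHmac('sha256', secret).update(sentBody).digest('hex');

console.log(isValidSignature(sentBody, signature, secret));

// Re-serializing with different spacing changes the bytes and the HMAC.
const reserialized = JSON.stringify(JSON.parse(sentBody), null, 2);
console.log(isValidSignature(reserialized, signature, secret));
```
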

**Temporary Bypass (Testing Only):**

To test without signature verification (NOT for production):
```javascript
// In src/routes/webhook.js, temporarily comment out:
// router.post('/billing', verifyBillingWebhook, validateBillingPayload, async (req, res) => {
router.post('/billing', validateBillingPayload, async (req, res) => {
```

**After fixing:**
```bash
sudo systemctl restart arbiter
```

---
## 🔥 Emergency Procedures

### Application Won't Start

**Symptom:** `systemctl status arbiter` shows "failed" status.

**Diagnosis:**
```bash
sudo journalctl -u arbiter -n 100
```

Look for:
- Missing `.env` file
- Syntax errors in code
- Missing dependencies
- Port 3500 already in use

**Solutions:**

**Port in use:**
```bash
sudo lsof -i :3500
sudo kill <PID>   # use kill -9 only if the process ignores the default signal
sudo systemctl start arbiter
```

**Missing dependencies:**
```bash
cd /home/architect/arbiter
npm install
sudo systemctl restart arbiter
```

**Syntax errors:**
Fix the reported file and line number, then:
```bash
sudo systemctl restart arbiter
```

---
### Database Corruption

**Symptom:** Application crashes with "database disk image is malformed" error.

**Solution:**

```bash
# Stop application
sudo systemctl stop arbiter

# Check database integrity (a healthy database prints "ok")
sqlite3 linking.db "PRAGMA integrity_check;"
```

**If corrupted:**
```bash
# Restore from backup (see DEPLOYMENT.md Phase 5)
mv linking.db linking.db.corrupt
cp /home/architect/backups/arbiter/linking_YYYYMMDD_HHMMSS.db linking.db

# Restart application
sudo systemctl start arbiter
```

---
### All Webhooks Suddenly Failing

**Symptom:** Every webhook returns a 500 error, but the application is running.

**Check 1: Disk Space**
```bash
df -h
```

If `/` is at 100%, clear space:
```bash
# Clean old logs
sudo journalctl --vacuum-time=7d

# Clean old backups
find /home/architect/backups/arbiter -type f -mtime +7 -delete
```

**Check 2: Memory Usage**
```bash
free -h
```

If out of memory:
```bash
sudo systemctl restart arbiter
```

**Check 3: Discord Bot Disconnected**
```bash
curl http://localhost:3500/health
```

If the response shows `discord: "down"`:
```bash
sudo systemctl restart arbiter
```

---
## 📊 Performance Issues

### Slow Response Times

**Check 1: Database Size**
```bash
ls -lh linking.db sessions.db
```

If either file is larger than 100 MB, consider a cleanup:
```bash
sqlite3 linking.db "DELETE FROM link_tokens WHERE used = 1 AND created_at < datetime('now', '-30 days');"
sqlite3 linking.db "VACUUM;"
```

**Check 2: High CPU Usage**
```bash
top
```

If the `node` process is consistently using more than 80% CPU, check for:
- Infinite loops in code
- Too many concurrent webhooks
- Discord API rate limiting (bot trying to reconnect repeatedly)

**Check 3: Rate Limiting Too Strict**

If users report frequent "Too many requests" errors:

Edit `src/index.js`:
```javascript
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15-minute window
  max: 200, // Increase from 100
  // ...
});
```

---
## 🔐 Security Concerns

### Suspicious Database Entries

**Check for unusual tokens:**
```bash
sqlite3 linking.db "SELECT email, tier, created_at FROM link_tokens WHERE used = 0 ORDER BY created_at DESC LIMIT 20;"
```

**Check audit log for unauthorized actions:**
```bash
sqlite3 linking.db "SELECT * FROM audit_logs ORDER BY timestamp DESC LIMIT 20;"
```

**If compromised:**
1. Change all secrets in `.env`
2. Rotate Discord bot token
3. Regenerate Ghost Admin API key
4. Clear all unused tokens:
   ```bash
   sqlite3 linking.db "DELETE FROM link_tokens WHERE used = 0;"
   ```
5. Force all admin re-authentication:
   ```bash
   rm sessions.db
   ```
6. Restart application

---
## 📞 Getting Help

**Before asking for help, collect:**

1. Service status:
   ```bash
   sudo systemctl status arbiter > /tmp/arbiter-status.txt
   ```

2. Recent logs:
   ```bash
   sudo journalctl -u arbiter -n 200 > /tmp/arbiter-logs.txt
   ```

3. Configuration (sanitized):
   ```bash
   cat .env | sed 's/=.*/=REDACTED/' > /tmp/arbiter-config.txt
   ```

4. Health check output:
   ```bash
   curl https://discord-bot.firefrostgaming.com/health > /tmp/arbiter-health.txt
   ```

5. Database stats:
   ```bash
   sqlite3 linking.db "SELECT COUNT(*) FROM link_tokens;" > /tmp/arbiter-db-stats.txt
   sqlite3 linking.db "SELECT COUNT(*) FROM audit_logs;" >> /tmp/arbiter-db-stats.txt
   ```

**Share these files (remove any actual secrets first) when requesting support.**

---
## 🛠️ Tools & Commands Reference

### Restart Everything
```bash
sudo systemctl restart arbiter
sudo systemctl reload nginx
```

### View All Environment Variables
```bash
cat .env
```

### Check Which Process is Using Port 3500
```bash
sudo lsof -i :3500
```

### Test Database Connection
```bash
sqlite3 linking.db "SELECT 1;"
```

### Force Regenerate Sessions Database
```bash
sudo systemctl stop arbiter
rm sessions.db
sudo systemctl start arbiter
```

### Manually Clean Up Old Tokens
```bash
sqlite3 linking.db "DELETE FROM link_tokens WHERE created_at < datetime('now', '-1 day');"
```

### Export Audit Logs to CSV
```bash
sqlite3 -header -csv linking.db "SELECT * FROM audit_logs ORDER BY timestamp DESC;" > audit_export.csv
```

---

**🔥❄️ When in doubt, check the logs first. Most issues reveal themselves there. 💙**