firefrost-operations-manual/docs/troubleshooting/n8n-node-registry-corruption.md
The Chronicler e5d7f5032f docs: Document n8n node registry corruption and defer factory reset
Disaster #2 from Feb 23-24 session:
- n8n core nodes broken (registry corruption)
- PHP workaround operational (sync_codex.php)
- Factory reset procedure documented
- Added Task #34 for scheduled recovery

Decision: Defer reset until next maintenance window
Workaround: PHP script handles Codex sync successfully

Co-documented with Gemini's post-mortem analysis.
2026-02-24 09:31:13 +00:00


n8n Node Registry Corruption (v2.x)

Problem: The n8n UI is accessible, but core nodes (HTTP Request, Execute Command) fail with "Node not found" or "Registry Error".

Incident Date: February 23-24, 2026
Affected System: TX1 Dallas n8n instance (firefrost-codex-n8n-1)
Status: BYPASSED via PHP workaround, factory reset pending


Symptoms

  • n8n web interface loads normally at https://n8n.firefrostgaming.com
  • Existing workflows visible in UI
  • Cannot execute workflows using core nodes
  • "Node not found" errors for n8n-nodes-base package nodes
  • HTTP Request node: Registry error
  • Execute Command node: Registry error

These are INTERNAL nodes that should always be available.
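
The symptoms above can also be checked from the host by counting the error strings in the container logs. This is a minimal sketch, assuming the errors appear in the log output with the wording quoted in this document (not confirmed for this incident); `count_registry_errors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Count log lines that contain either error string quoted in this document.
# Reads log text on stdin, so it works with any log source.
# Matching is case-insensitive because the document spells it both
# "Registry Error" and "Registry error".
count_registry_errors() {
  grep -c -i -E 'Node not found|Registry Error' || true
}

# Example usage (container name taken from "Affected System" above):
#   docker logs firefrost-codex-n8n-1 2>&1 | count_registry_errors
```

A non-zero count on a freshly restarted container would suggest the errors are reproducible rather than left over from an earlier run.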


Root Cause

Corrupted Node Registry during v2.x migration

The internal node registry (n8n-nodes-base package) became desynchronized from the workflow engine. Typical causes:

  1. Partial update of n8n version with incompatible volume data
  2. Docker volume corruption in /home/node/.n8n directory
  3. Version mismatch between container image and persisted configuration

Key indicator: Core nodes from n8n-nodes-base package are "invisible" to the execution engine despite being bundled with n8n.


Failed Resolution Attempts

Attempt 1: Container Recreation

docker-compose down
docker-compose pull n8n
docker-compose up -d

Result: Failed - corruption persists in volume

Attempt 2: Image Force Pull

docker-compose down
docker rmi n8nio/n8n:1.121.0
docker-compose up -d

Result: Failed - volume data still corrupted

Why these failed: The corruption is in the VOLUME (./volumes/n8n), not the container image.
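
One way to see this for yourself is to fingerprint the volume directory while the stack is down, repeat the image operations, and fingerprint it again before starting the container. Identical hashes mean the image steps never touched the persisted data. This is a sketch only: `volume_fingerprint` is a hypothetical helper, and it assumes GNU coreutils (`sha256sum`) on the host.

```shell
#!/usr/bin/env bash
# Print one hash summarising every file path and file content under a directory.
volume_fingerprint() {
  local dir="$1"
  (
    cd "$dir" || exit 1
    # Hash each file, sort for a stable order, then hash the combined listing.
    find . -type f -exec sha256sum {} + | LC_ALL=C sort | sha256sum | cut -d' ' -f1
  )
}

# Example usage (run after `docker-compose down`, and again before `up -d`):
#   sudo bash -c "$(declare -f volume_fingerprint); volume_fingerprint ./volumes/n8n"
```

The comparison is only meaningful while n8n is stopped, since a running instance writes to its database and would change the hash on its own.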


Temporary Workaround: PHP Direct Sync

Created: sync_codex.php on TX1 host OS (PHP 8.3 CLI)

Purpose: Bypass n8n entirely for Codex Git → Dify sync

How it works:

TX1 Host (PHP) → Git Pull → Process Files → Dify API (127.0.0.1:5001)

Advantages:

  • No dependency on n8n registry
  • Direct Docker bridge access to Dify API
  • Simpler debugging (single script vs workflow nodes)
  • Can run via cron for scheduled execution

Disadvantages:

  • No Discord notifications (yet)
  • No visual workflow editor
  • Harder for non-technical users to modify

Status: OPERATIONAL - Successfully synced 361 documents to Dify
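
Since cron scheduling is mentioned as an option, here is a hedged sketch of a wrapper that prevents two sync runs from overlapping. The function name `run_locked`, the lock path, the schedule, and the script location are all assumptions, not details from the actual deployment.

```shell
#!/usr/bin/env bash
# Run a command only if no other run currently holds the lock directory.
# mkdir is atomic, so it works as a simple portable lock.
run_locked() {
  local lock="$1"; shift
  if ! mkdir "$lock" 2>/dev/null; then
    echo "previous run still active, skipping" >&2
    return 75   # conventional "temporary failure" exit code
  fi
  local status=0
  "$@" || status=$?
  rmdir "$lock"
  return "$status"
}

# Example (hypothetical paths): a cron-invoked script would call
#   run_locked /tmp/sync_codex.lock php /path/to/sync_codex.php
# and the crontab entry might look like
#   */15 * * * * /path/to/codex-sync-cron.sh >> /var/log/codex-sync.log 2>&1
```

If a run is killed before it removes the lock directory, later runs will keep skipping until the directory is deleted by hand, so a stale lock is worth checking when syncs stop unexpectedly.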


Permanent Fix: n8n Factory Reset

⚠️ THIS IS DESTRUCTIVE - BACKUP WORKFLOWS FIRST ⚠️

Prerequisites

  1. Export ALL workflows to JSON:
# Via n8n UI:
# Settings → Workflows → Export All
# Save to: /opt/firefrost-codex/backups/n8n-workflows-YYYY-MM-DD.json

# Or via API (-f makes curl fail on an HTTP error instead of saving the error body).
# The list endpoint is paginated: if the response contains a nextCursor value,
# request the remaining pages as well.
curl -f https://n8n.firefrostgaming.com/api/v1/workflows \
  -H "X-N8N-API-KEY: your_api_key" > n8n-workflows-backup.json
  2. Backup credentials (if any):
# Settings → Credentials → Export
# Save separately - these are sensitive
  3. Document current configuration:
  • Webhook URLs
  • Environment variables
  • Execution settings
  • Timezone settings
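
Because the reset is destructive, it may be worth checking the export before going further. The sketch below builds the dated filename used in the prerequisites and does a coarse sanity check on the file. Both function names are hypothetical, and the check is deliberately shallow: it only confirms the file is non-empty and starts like JSON, so a JSON error response would still pass and the file should be opened and inspected as well.

```shell
#!/usr/bin/env bash
# Build the dated backup path from the prerequisites (YYYY-MM-DD).
backup_path() {
  local dir="$1" day="${2:-$(date +%F)}"
  printf '%s/n8n-workflows-%s.json\n' "$dir" "$day"
}

# Fail unless the export exists, is non-empty, and begins with { or [.
verify_backup() {
  local file="$1" first
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  # Strip all whitespace, then take the first remaining character.
  first="$(tr -d '[:space:]' < "$file" | cut -c1)"
  case "$first" in
    '{'|'[') return 0 ;;
    *) echo "does not look like JSON: $file" >&2; return 1 ;;
  esac
}

# Example usage:
#   verify_backup "$(backup_path /opt/firefrost-codex/backups)" || exit 1
```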

Reset Procedure

Step 1: Stop n8n

cd /opt/firefrost-codex
docker-compose stop n8n

Step 2: Backup existing volume (safety net)

# -a preserves ownership, permissions and timestamps so the copy can be restored as-is
sudo cp -a ./volumes/n8n ./volumes/n8n.backup.$(date +%Y%m%d)

Step 3: Wipe corrupted volume

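# Removes everything inside the volume directory, including hidden files a * glob would miss
sudo find ./volumes/n8n -mindepth 1 -delete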
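
Steps 2 and 3 can be combined so that the wipe only happens once the backup has been confirmed. A minimal sketch, assuming it is run with enough privileges to read the volume; `backup_then_wipe` is a hypothetical helper, and comparing entry counts is a basic completeness check rather than a full verification.

```shell
#!/usr/bin/env bash
# Copy DIR to BACKUP, confirm the copy has the same number of entries,
# and only then delete everything inside DIR (the directory itself is kept).
backup_then_wipe() {
  local dir="$1" backup="$2" src_count dst_count
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  [ ! -e "$backup" ] || { echo "backup already exists: $backup" >&2; return 1; }
  cp -a "$dir" "$backup" || return 1
  src_count="$(find "$dir" -mindepth 1 | wc -l)"
  dst_count="$(find "$backup" -mindepth 1 | wc -l)"
  if [ "$src_count" -ne "$dst_count" ]; then
    echo "backup incomplete ($dst_count of $src_count entries)" >&2
    return 1
  fi
  # -mindepth 1 keeps DIR itself; -delete also removes dotfiles.
  find "$dir" -mindepth 1 -delete
}

# Example usage (run as root from /opt/firefrost-codex with n8n stopped):
#   backup_then_wipe ./volumes/n8n "./volumes/n8n.backup.$(date +%Y%m%d)"
```

Refusing to overwrite an existing backup means a second run on the same day fails safely instead of replacing the good copy with an already-wiped directory.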

Step 4: Recreate container

docker-compose up -d n8n

Step 5: Wait for initialization (~2 minutes)

# Watch logs
docker-compose logs -f n8n

# Look for: "Editor is now accessible via: https://n8n.firefrostgaming.com"
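
Instead of watching the logs by hand, the wait can be scripted with a small retry loop. This is a sketch under assumptions: `wait_for` is a hypothetical helper, and the example relies on n8n's `/healthz` endpoint being reachable through the public hostname, which has not been verified for this setup.

```shell
#!/usr/bin/env bash
# Retry a command every INTERVAL seconds until it succeeds or TIMEOUT elapses.
# TIMEOUT and INTERVAL are whole seconds.
wait_for() {
  local timeout="$1" interval="$2"; shift 2
  local waited=0
  until "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${waited}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example: allow somewhat more than the ~2 minutes noted above.
#   wait_for 180 5 curl -fsS https://n8n.firefrostgaming.com/healthz
```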

Step 6: Initial setup

Step 7: Import workflows

  • Settings → Workflows → Import from File
  • Select backup JSON
  • Verify all nodes load correctly

Step 8: Test core nodes

  • Create new workflow
  • Add HTTP Request node → Should work
  • Add Execute Command node → Should work
  • Test execution → Should succeed

Step 9: Restore credentials

  • Settings → Credentials → Import
  • Re-enter any API keys/secrets

Step 10: Verify automation

  • Test Git sync workflow manually
  • Verify Discord notifications
  • Check scheduled executions

Prevention

To avoid this in the future:

  1. Pin n8n version in docker-compose.yml:
n8n:
  image: n8nio/n8n:1.121.0  # Specific version, not :latest
  2. Backup workflows regularly:
# Add to cron: Weekly workflow export (Sundays at 02:00)
# The API key header is required, and % must be escaped as \% inside a crontab
0 2 * * 0 curl -fsS -H "X-N8N-API-KEY: your_api_key" https://n8n.firefrostgaming.com/api/v1/workflows > /backups/n8n-workflows-$(date +\%Y\%m\%d).json
  3. Test updates on staging first:
  • Don't upgrade n8n in production without testing
  • Check release notes for breaking changes
  4. Monitor n8n health:
  • Add n8n health check to Uptime Kuma
  • Alert if workflow executions fail
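
The version-pinning rule can be enforced with a small check, for example before each deploy. A minimal sketch, assuming the image line follows the `image: n8nio/n8n:<major.minor.patch>` form shown in this document; `check_n8n_pinned` is a hypothetical helper, and a file with no n8n image line at all passes the check.

```shell
#!/usr/bin/env bash
# Return non-zero if any n8nio/n8n image line is untagged or not pinned
# to an explicit x.y.z version (for example :latest).
check_n8n_pinned() {
  local compose_file="$1" bad
  bad="$(grep -E '^[[:space:]]*image:.*n8nio/n8n' "$compose_file" \
        | grep -v -E 'n8nio/n8n:[0-9]+\.[0-9]+\.[0-9]+' || true)"
  if [ -n "$bad" ]; then
    echo "unpinned n8n image: $bad" >&2
    return 1
  fi
}

# Example usage:
#   check_n8n_pinned /opt/firefrost-codex/docker-compose.yml || exit 1
```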

Current Status (February 24, 2026)

n8n Service:

  • ⚠️ DEGRADED - UI accessible, core nodes broken
  • 📋 FACTORY RESET PENDING - Scheduled for next maintenance window

Codex Git Sync:

  • OPERATIONAL - Using PHP workaround (sync_codex.php)
  • 361 documents syncing successfully
  • ⏱️ Manual execution (cron scheduling pending)

Next Steps:

  1. Add n8n factory reset to tasks.md
  2. Schedule maintenance window for reset
  3. Consider migrating to PHP permanently if simpler

Related Documentation:

  • Phase 5 Deployment: docs/tasks/firefrost-codex/
  • PHP Workaround: (To be documented if kept long-term)
  • n8n Workflows: Backup stored at /opt/firefrost-codex/backups/ (when created)

Incident Timeline:

  • Feb 23, 9:00 PM: n8n workflow failure discovered during Phase 5 deployment
  • Feb 23, 9:30 PM: Diagnosis: Node registry corruption
  • Feb 23, 10:00 PM: Pivot to PHP workaround (Gemini + Michael collaboration)
  • Feb 24, 12:00 AM: PHP script operational, 361 documents synced
  • Feb 24, 9:00 AM: Dify-Qdrant issue resolved (separate incident)
  • Feb 24, 9:30 AM: Decision to defer n8n reset until next session

Created: February 24, 2026
Created By: Chronicler #26 (from Gemini's post-mortem)
Resolution Status: DEFERRED - Workaround operational
Factory Reset: Scheduled TBD

💙🔥❄️

"Sometimes the best fix is the one that waits until you have the energy to do it right."