antigravity-skills-reference/docs/maintainers/smart-auto-categorization.md
sck_0 45844de534 refactor: reorganize repo docs and tooling layout
Consolidate the repository into clearer apps, tools, and layered docs areas so contributors can navigate and maintain it more reliably. Align validation, metadata sync, and CI around the same canonical workflow to reduce drift across local checks and GitHub Actions.
2026-03-06 15:01:38 +01:00

Smart Auto-Categorization Guide

Overview

The skill collection now uses keyword-based auto-categorization to shrink the "uncategorized" bucket and organize skills into meaningful categories based on their content.

Current Status

The current repository is indexed through the generated catalog. At present:

  • Most skills are in meaningful categories
  • A smaller tail still needs manual review or better keyword coverage
  • 13 primary categories (listed in the distribution table below)
  • Categories sorted by skill count (most first)

Category Distribution

| Category        | Count | Examples                         |
|-----------------|-------|----------------------------------|
| Backend         | 164   | Node.js, Django, Express, FastAPI |
| Web Development | 107   | React, Vue, Tailwind, CSS        |
| Automation      | 103   | Workflow, Scripting, RPA         |
| DevOps          | 83    | Docker, Kubernetes, CI/CD, Git   |
| AI/ML           | 79    | TensorFlow, PyTorch, NLP, LLM    |
| Content         | 47    | Documentation, SEO, Writing      |
| Database        | 44    | SQL, MongoDB, PostgreSQL         |
| Testing         | 38    | Jest, Cypress, Unit Testing      |
| Security        | 36    | Encryption, Authentication       |
| Cloud           | 33    | AWS, Azure, GCP                  |
| Mobile          | 21    | React Native, Flutter, iOS       |
| Game Dev        | 15    | Unity, WebGL, 3D                 |
| Data Science    | 14    | Pandas, NumPy, Analytics         |

How It Works

1. Keyword-Based Analysis

The system analyzes skill names and descriptions for keywords:

  • Backend: nodejs, express, fastapi, django, server, api, database
  • Web Dev: react, vue, angular, frontend, css, html, tailwind
  • AI/ML: ai, machine learning, tensorflow, nlp, gpt
  • DevOps: docker, kubernetes, ci/cd, deploy
  • And more...

2. Priority System

Frontmatter category > Detected Keywords > Fallback (uncategorized)

If a skill already has a category in frontmatter, that's preserved.
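
The priority order can be sketched as a small function. This is an illustration only, not the code in auto_categorize_skills.py: the name resolve_category is hypothetical, and treating a frontmatter value of "uncategorized" as unset is an assumption based on the sample report, where such skills are reassigned.

```python
from typing import Optional


def resolve_category(frontmatter_category: Optional[str],
                     detected_category: Optional[str]) -> str:
    """Pick a category: frontmatter > detected keywords > fallback."""
    # An explicit frontmatter category wins and is preserved as-is.
    # Assumption: "uncategorized" counts as "not set yet".
    if frontmatter_category and frontmatter_category != "uncategorized":
        return frontmatter_category
    # Otherwise use the keyword-detected category, if there is one.
    if detected_category:
        return detected_category
    # Nothing matched: fall back to the catch-all bucket.
    return "uncategorized"
```

With this shape, a skill that declares category: security stays in security even if its description mentions backend keywords.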

3. Scope-Based Matching

  • Exact phrase matches weighted 2x higher than partial matches
  • Uses word boundaries to avoid false positives
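
A minimal sketch of how such scoring might look, under stated assumptions: the keyword table is a small illustrative subset, the function names are hypothetical, and "partial match" is interpreted here as a keyword that starts a longer word (for example "test" in "testing"). The real script may define these differently.

```python
import re

# Illustrative subset only; the real table lives in auto_categorize_skills.py.
CATEGORY_KEYWORDS = {
    "backend": ["fastapi", "django", "server", "api"],
    "web-development": ["react", "vue", "frontend", "css"],
    "ai-ml": ["ai", "machine learning", "nlp"],
}


def score_text(text, keywords):
    """Return 2 points per whole-word keyword hit, 1 per word-prefix hit."""
    text = text.lower()
    score = 0
    for kw in keywords:
        escaped = re.escape(kw)
        # Whole-word (or whole-phrase) match: no word character on either side.
        if re.search(r"(?<!\w)" + escaped + r"(?!\w)", text):
            score += 2
        # Partial match (assumption): keyword begins a longer word.
        elif re.search(r"(?<!\w)" + escaped, text):
            score += 1
    return score


def detect_category(name, description):
    """Return the best-scoring category, or None when nothing matches."""
    text = f"{name.replace('-', ' ')} {description}"
    scores = {cat: score_text(text, kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Because both patterns require a word boundary before the keyword, "ai" does not match inside "maintain", which is the kind of false positive the word-boundary rule is meant to avoid.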

Using the Auto-Categorization

Run on Uncategorized Skills

python tools/scripts/auto_categorize_skills.py

Preview Changes First (Dry Run)

python tools/scripts/auto_categorize_skills.py --dry-run

Output

======================================================================
AUTO-CATEGORIZATION REPORT
======================================================================

Summary:
   ✅ Categorized: 776
   ⏭️  Already categorized: 46
   ❌ Failed to categorize: 124
   📈 Total processed: full repository

Sample changes:
   • 3d-web-experience
     uncategorized → web-development
   • ab-test-setup
     uncategorized → testing
   • agent-framework-azure-ai-py
     uncategorized → backend

Web App Improvements

Category Filter

Before:

  • Unordered list including "uncategorized"
  • No indication of category size

After:

  • Categories sorted by skill count (most first, "uncategorized" last)
  • Shows count: "Backend (164)" "Web Development (107)"
  • Much easier to browse

Example Dropdowns

Sorted Order:

  1. All Categories
  2. Backend (164)
  3. Web Development (107)
  4. Automation (103)
  5. DevOps (83)
  6. AI/ML (79)
  7. ... more categories ...
  8. Uncategorized (126) ← at the end
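
The ordering rule (largest category first, "Uncategorized" always last) can be expressed with a single sort key. The sketch below is in Python for illustration and is not the web app's actual code; dropdown_labels is a hypothetical name.

```python
def dropdown_labels(counts):
    """Build dropdown labels sorted by count, with 'Uncategorized' last."""
    ordered = sorted(
        counts.items(),
        # False sorts before True, so "Uncategorized" sinks to the end;
        # the negated count puts larger categories first; the name breaks ties.
        key=lambda item: (item[0].lower() == "uncategorized", -item[1], item[0]),
    )
    return ["All Categories"] + [f"{name} ({count})" for name, count in ordered]


labels = dropdown_labels(
    {"DevOps": 83, "Uncategorized": 126, "Backend": 164, "Web Development": 107}
)
```

Here "Uncategorized (126)" ends up last even though its count is larger than DevOps and Web Development.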

For Skill Creators

When Adding a New Skill

Include category in frontmatter:

---
name: my-skill
description: "..."
category: web-development
date_added: "2026-03-06"
---
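
For reference, a rough sketch of how the category field could be read back from a SKILL.md file. It assumes flat key: value frontmatter like the example above and uses a hypothetical helper name; the repository's tooling may well use a full YAML parser instead.

```python
def read_frontmatter(text):
    """Parse flat 'key: value' frontmatter between leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # file does not start with frontmatter
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding whitespace and optional double quotes.
            fields[key.strip()] = value.strip().strip('"')
    return {}  # no closing delimiter: treat as missing frontmatter


sample = (
    "---\n"
    "name: my-skill\n"
    'description: "..."\n'
    "category: web-development\n"
    'date_added: "2026-03-06"\n'
    "---\n"
    "Skill body goes here.\n"
)
print(read_frontmatter(sample)["category"])  # prints: web-development
```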

If You're Not Sure

The system will categorize the skill automatically on the next index regeneration:

python tools/scripts/generate_index.py

Keyword Reference

Available auto-categorization keywords by category:

Backend: nodejs, node.js, express, fastapi, django, flask, spring, java, python, golang, rust, server, api, rest, graphql, database, sql, mongodb

Web Development: react, vue, angular, html, css, javascript, typescript, frontend, tailwind, bootstrap, webpack, vite, pwa, responsive, seo

Database: database, sql, postgres, mysql, mongodb, firestore, redis, orm, schema

AI/ML: ai, machine learning, ml, tensorflow, pytorch, nlp, llm, gpt, transformer, embedding, training

DevOps: docker, kubernetes, ci/cd, git, jenkins, terraform, ansible, deploy, container, monitoring

Cloud: aws, azure, gcp, serverless, lambda, storage, cdn

Security: encryption, cryptography, jwt, oauth, authentication, authorization, vulnerability

Testing: test, jest, mocha, pytest, cypress, selenium, unit test, e2e

Mobile: mobile, react native, flutter, ios, android, swift, kotlin

Automation: automation, workflow, scripting, robot, trigger, integration

Game Development: game, unity, unreal, godot, threejs, 2d, 3d, physics

Data Science: data, analytics, pandas, numpy, statistics, visualization
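
Some keywords in this reference appear under more than one category (for example database, sql, and mongodb are listed for both Backend and Database), so a skill mentioning them scores in both. A small check like the following sketch can list such overlaps before you add new keywords. Only two categories are copied here, and shared_keywords is a hypothetical helper, not part of the script.

```python
from collections import defaultdict

# Two entries copied from the reference above; other categories omitted.
CATEGORY_KEYWORDS = {
    "backend": [
        "nodejs", "node.js", "express", "fastapi", "django", "flask",
        "spring", "java", "python", "golang", "rust", "server", "api",
        "rest", "graphql", "database", "sql", "mongodb",
    ],
    "database": [
        "database", "sql", "postgres", "mysql", "mongodb", "firestore",
        "redis", "orm", "schema",
    ],
}


def shared_keywords(table):
    """Map each keyword that appears in several categories to those categories."""
    owners = defaultdict(list)
    for category, keywords in table.items():
        for kw in keywords:
            owners[kw].append(category)
    return {kw: cats for kw, cats in owners.items() if len(cats) > 1}


overlaps = shared_keywords(CATEGORY_KEYWORDS)
print(sorted(overlaps))  # prints: ['database', 'mongodb', 'sql']
```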

Customization

Add Custom Keywords

Edit tools/scripts/auto_categorize_skills.py:

CATEGORY_KEYWORDS = {
    'your-category': [
        'keyword1', 'keyword2', 'exact phrase', 'another-keyword'
    ],
    # ... other categories
}

Then re-run:

python tools/scripts/auto_categorize_skills.py
python tools/scripts/generate_index.py

Troubleshooting

"Failed to categorize" Skills

Some skills may be too generic or too unusual for keyword matching. You can:

  1. Manually set category in the skill's frontmatter:

category: your-chosen-category

  2. Add keywords to CATEGORY_KEYWORDS config

  3. Move to folder if it fits a broader category:

skills/backend/my-new-skill/SKILL.md

Regenerating Index

After making changes to SKILL.md files:

python tools/scripts/generate_index.py

This will:

  • Parse frontmatter categories
  • Fall back to folder structure when no frontmatter category is set
  • Generate new skills_index.json
  • Copy to apps/web-app/public/skills.json
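
The frontmatter-then-folder lookup might look roughly like the sketch below. It assumes the skills/<category>/<skill-name>/SKILL.md layout shown in the troubleshooting example and a hypothetical function name; generate_index.py may handle more layouts than this.

```python
from pathlib import PurePosixPath


def category_for(skill_path, frontmatter):
    """Frontmatter category first, then the folder under skills/, else 'uncategorized'."""
    if frontmatter.get("category"):
        return frontmatter["category"]
    parts = PurePosixPath(skill_path).parts
    # Assumed layout for the folder fallback: skills/<category>/<skill-name>/SKILL.md
    if len(parts) == 4 and parts[0] == "skills":
        return parts[1]
    return "uncategorized"
```

A skill stored directly under skills/ without a category folder and without a frontmatter category would land in "uncategorized" under this sketch.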

Next Steps

  1. Test in web app: Try the improved category filter
  2. Add missing keywords: If certain skills are still uncategorized
  3. Organize remaining uncategorized skills: Either auto-assign or manually review
  4. Monitor growth: Use reports to track new vs categorized skills

Result: Much cleaner category filter with smart, meaningful organization! 🎉