antigravity-skills-reference/docs/SMART_AUTO_CATEGORIZATION.md
Zied 8de886a2ff feat: Implement intelligent auto-categorization for skills
- Added `scripts/auto_categorize_skills.py` to analyze skill names and descriptions, auto-assigning categories based on keyword matching.
- Updated category distribution to show counts and sort categories by skill count in the Home page dropdown.
- Created documentation in `docs/CATEGORIZATION_IMPLEMENTATION.md` and `docs/SMART_AUTO_CATEGORIZATION.md` detailing the new categorization process and usage.
- Introduced `scripts/fix_year_2025_to_2026.py` to update all skill dates from 2025 to 2026.
- Enhanced user experience by moving "uncategorized" to the bottom of the category list and displaying skill counts in the dropdown.
2026-02-26 12:52:03 +01:00

# Smart Auto-Categorization Guide
## Overview
The skill collection now uses keyword-based auto-categorization to reduce the number of "uncategorized" skills and organize the rest into meaningful categories based on their names and descriptions.
## Current Status
**946 total skills**
- 820 skills in meaningful categories (87%)
- 126 skills still uncategorized (13%)
- 11 primary categories
- Categories sorted by skill count (most first)
## Category Distribution
| Category | Count | Examples |
|----------|-------|----------|
| Backend | 164 | Node.js, Django, Express, FastAPI |
| Web Development | 107 | React, Vue, Tailwind, CSS |
| Automation | 103 | Workflow, Scripting, RPA |
| DevOps | 83 | Docker, Kubernetes, CI/CD, Git |
| AI/ML | 79 | TensorFlow, PyTorch, NLP, LLM |
| Content | 47 | Documentation, SEO, Writing |
| Database | 44 | SQL, MongoDB, PostgreSQL |
| Testing | 38 | Jest, Cypress, Unit Testing |
| Security | 36 | Encryption, Authentication |
| Cloud | 33 | AWS, Azure, GCP |
| Mobile | 21 | React Native, Flutter, iOS |
| Game Dev | 15 | Unity, WebGL, 3D |
| Data Science | 14 | Pandas, NumPy, Analytics |
## How It Works
### 1. **Keyword-Based Analysis**
The system analyzes skill names and descriptions for keywords:
- **Backend**: nodejs, express, fastapi, django, server, api, database
- **Web Development**: react, vue, angular, frontend, css, html, tailwind
- **AI/ML**: ai, machine learning, tensorflow, nlp, gpt
- **DevOps**: docker, kubernetes, ci/cd, deploy
- And more...
### 2. **Priority System**
Frontmatter category > Detected Keywords > Fallback (uncategorized)
If a skill already has a category in its frontmatter, that category is preserved.
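
The priority order above can be sketched as a small function. This is a minimal illustration, not the code in `scripts/auto_categorize_skills.py`: the name `resolve_category` is hypothetical, and treating a frontmatter value of `uncategorized` as "no category" is an assumption.

```python
from typing import Optional

FALLBACK_CATEGORY = "uncategorized"


def resolve_category(frontmatter_category: Optional[str],
                     detected_category: Optional[str]) -> str:
    """Pick a category using the documented priority order (hypothetical helper)."""
    # 1. An explicit frontmatter category wins. Assumption: the placeholder
    #    value "uncategorized" does not count as an explicit category.
    if frontmatter_category and frontmatter_category != FALLBACK_CATEGORY:
        return frontmatter_category
    # 2. Otherwise use whatever keyword detection produced, if anything.
    if detected_category:
        return detected_category
    # 3. Nothing matched, so fall back to "uncategorized".
    return FALLBACK_CATEGORY
```

For example, `resolve_category("security", "backend")` keeps `security`, while `resolve_category(None, "backend")` returns `backend`.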
### 3. **Score-Based Matching**
- Exact phrase matches are weighted 2x higher than partial matches
- Matching uses word boundaries to avoid false positives (for example, `ai` should not match inside `email`)
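
One possible reading of these two rules, as a sketch. The function names, the abbreviated keyword lists, and the exact point values (2 for a whole-phrase match, 1 when only some words of a multi-word keyword appear) are assumptions made for illustration; the real script may score differently.

```python
import re
from typing import Dict, List, Optional

# Abbreviated keyword lists, for illustration only.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "backend": ["express", "fastapi", "django", "server", "api"],
    "web-development": ["react", "vue", "css", "frontend", "tailwind"],
    "ai-ml": ["ai", "machine learning", "nlp", "gpt"],
}


def has_word(text: str, phrase: str) -> bool:
    """True if `phrase` occurs in `text` without touching a letter or digit."""
    pattern = rf"(?<![a-z0-9]){re.escape(phrase)}(?![a-z0-9])"
    return re.search(pattern, text) is not None


def score_category(text: str, keywords: List[str]) -> int:
    """Score one category: 2 per exact phrase, 1 per partial phrase (assumed weights)."""
    text = text.lower()
    score = 0
    for keyword in keywords:
        if has_word(text, keyword):
            score += 2  # the whole keyword or phrase matched
        elif " " in keyword and any(has_word(text, w) for w in keyword.split()):
            score += 1  # only part of a multi-word keyword matched
    return score


def detect_category(name: str, description: str) -> Optional[str]:
    """Return the highest-scoring category, or None if nothing matched."""
    text = f"{name} {description}"
    scores = {cat: score_category(text, kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

With these lists, a skill named `react-dashboard` described as "Frontend components styled with Tailwind CSS" scores highest for `web-development`, while `email-templates` matches nothing because `ai` is never tested inside another word.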
## Using the Auto-Categorization
### Run on Uncategorized Skills
```bash
python scripts/auto_categorize_skills.py
```
### Preview Changes First (Dry Run)
```bash
python scripts/auto_categorize_skills.py --dry-run
```
### Output
```
======================================================================
AUTO-CATEGORIZATION REPORT
======================================================================
Summary:
  ✅ Categorized: 776
  ⏭️ Already categorized: 46
  ❌ Failed to categorize: 124
  📈 Total processed: 946

Sample changes:
  • 3d-web-experience
    uncategorized → web-development
  • ab-test-setup
    uncategorized → testing
  • agent-framework-azure-ai-py
    uncategorized → backend
```
## Web App Improvements
### Category Filter
**Before:**
- Unordered list including "uncategorized"
- No indication of category size
**After:**
- Categories sorted by skill count (most first, "uncategorized" last)
- Each entry shows its skill count, e.g. "Backend (164)", "Web Development (107)"
- The largest categories appear first, so they are easier to find
### Example Dropdowns
**Sorted Order:**
1. All Categories
2. Backend (164)
3. Web Development (107)
4. Automation (103)
5. DevOps (83)
6. AI/ML (79)
7. ... more categories ...
8. Uncategorized (126) ← at the end
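
The ordering rule can be expressed compactly. The sketch below is in Python for consistency with the scripts in this guide; it is not the web app's actual code, and `build_dropdown_labels` is a hypothetical name. The counts are taken from the table above.

```python
from typing import Dict, List


def build_dropdown_labels(counts: Dict[str, int]) -> List[str]:
    """Sort categories by skill count (descending) with "Uncategorized" last."""
    ordered = sorted(
        counts.items(),
        # False sorts before True, so "Uncategorized" always ends up at the end;
        # the remaining categories are ordered by count, then by name for ties.
        key=lambda item: (item[0].lower() == "uncategorized", -item[1], item[0]),
    )
    return ["All Categories"] + [f"{name} ({count})" for name, count in ordered]


counts = {
    "Uncategorized": 126,
    "DevOps": 83,
    "Backend": 164,
    "AI/ML": 79,
    "Web Development": 107,
    "Automation": 103,
}
labels = build_dropdown_labels(counts)
```

Note that "Uncategorized" stays last even though its count (126) is larger than that of most categories.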
## For Skill Creators
### When Adding a New Skill
Include category in frontmatter:
```yaml
---
name: my-skill
description: "..."
category: web-development
date_added: "2026-02-26"
---
```
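
To check that a skill file declares a category, the frontmatter can be read with a few lines of standard-library Python. This is a simplified sketch that handles only flat `key: value` pairs; `read_frontmatter` is a hypothetical helper, and the project's scripts may well use a full YAML parser instead.

```python
from typing import Dict


def read_frontmatter(markdown: str) -> Dict[str, str]:
    """Parse flat `key: value` pairs between the leading `---` delimiters."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding whitespace and optional quotes around the value.
            fields[key.strip()] = value.strip().strip("\"'")
    return {}  # closing delimiter missing: treat the block as absent


sample = '---\nname: my-skill\ncategory: web-development\n---\n# My Skill\n'
fields = read_frontmatter(sample)
```

Here `fields.get("category")` yields `web-development`; a file without frontmatter yields an empty dictionary.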
### If You're Not Sure
If you omit `category`, the system assigns one automatically the next time the index is regenerated:
```bash
python scripts/generate_index.py
```
## Keyword Reference
Available auto-categorization keywords by category:
**Backend**: nodejs, node.js, express, fastapi, django, flask, spring, java, python, golang, rust, server, api, rest, graphql, database, sql, mongodb
**Web Development**: react, vue, angular, html, css, javascript, typescript, frontend, tailwind, bootstrap, webpack, vite, pwa, responsive, seo
**Database**: database, sql, postgres, mysql, mongodb, firestore, redis, orm, schema
**AI/ML**: ai, machine learning, ml, tensorflow, pytorch, nlp, llm, gpt, transformer, embedding, training
**DevOps**: docker, kubernetes, ci/cd, git, jenkins, terraform, ansible, deploy, container, monitoring
**Cloud**: aws, azure, gcp, serverless, lambda, storage, cdn
**Security**: encryption, cryptography, jwt, oauth, authentication, authorization, vulnerability
**Testing**: test, jest, mocha, pytest, cypress, selenium, unit test, e2e
**Mobile**: mobile, react native, flutter, ios, android, swift, kotlin
**Automation**: automation, workflow, scripting, robot, trigger, integration
**Game Development**: game, unity, unreal, godot, threejs, 2d, 3d, physics
**Data Science**: data, analytics, pandas, numpy, statistics, visualization
## Customization
### Add Custom Keywords
Edit `CATEGORY_KEYWORDS` in [scripts/auto_categorize_skills.py](../scripts/auto_categorize_skills.py):
```python
CATEGORY_KEYWORDS = {
    'your-category': [
        'keyword1', 'keyword2', 'exact phrase', 'another-keyword'
    ],
    # ... other categories
}
```
Then re-run:
```bash
python scripts/auto_categorize_skills.py
python scripts/generate_index.py
```
## Troubleshooting
### "Failed to categorize" Skills
Some skills are too generic or too unusual to match any keyword. You can:
1. **Manually set category** in the skill's frontmatter:
```yaml
category: your-chosen-category
```
2. **Add keywords** to `CATEGORY_KEYWORDS` in `scripts/auto_categorize_skills.py`
3. **Move the skill into a category folder** if it fits a broader category:
```
skills/backend/my-new-skill/SKILL.md
```
### Regenerating Index
After making changes to SKILL.md files:
```bash
python scripts/generate_index.py
```
This will:
- Parse the `category` field from each skill's frontmatter
- Fall back to the folder structure when no category is set
- Generate a new `skills_index.json`
- Copy it to `web-app/public/skills.json`
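
The first two steps (frontmatter first, folder structure second) might look like the sketch below. The function names are hypothetical, and the assumed layouts are `skills/<category>/<skill>/SKILL.md` for nested skills and `skills/<skill>/SKILL.md` for flat ones; the real `generate_index.py` may resolve paths differently.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def category_from_path(skill_path: str, root: str = "skills") -> Optional[str]:
    """Return the folder directly under `root` if the skill sits inside one."""
    parts = PurePosixPath(skill_path).parts
    if root not in parts:
        return None
    relative = parts[parts.index(root) + 1:]
    # Nested layout: (category, skill, SKILL.md). Flat layout: (skill, SKILL.md).
    return relative[0] if len(relative) >= 3 else None


def index_category(frontmatter: Dict[str, str], skill_path: str) -> str:
    """Frontmatter category first, then folder name, then "uncategorized"."""
    return (
        frontmatter.get("category")
        or category_from_path(skill_path)
        or "uncategorized"
    )
```

Under these assumptions, `skills/backend/my-new-skill/SKILL.md` with no frontmatter category resolves to `backend`, and a flat `skills/my-new-skill/SKILL.md` resolves to `uncategorized`.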
## Next Steps
1. **Test in the web app**: Try the improved category filter
2. **Add missing keywords**: Extend `CATEGORY_KEYWORDS` if certain skills are still uncategorized
3. **Organize the remaining 126 skills**: Either extend auto-assignment or review them manually
4. **Monitor growth**: Use the categorization report to track how many new skills are categorized
---
**Result**: A cleaner category filter, with 87% of skills organized into meaningful categories! 🎉