* fix: stabilize validation and tests on Windows * test: add Windows smoke coverage for skill activation * refactor: make setup_web script CommonJS * fix: repair aegisops-ai frontmatter * docs: add when-to-use guidance to core skills * docs: add when-to-use guidance to Apify skills * docs: add when-to-use guidance to Google and Expo skills * docs: add when-to-use guidance to Makepad skills * docs: add when-to-use guidance to git workflow skills * docs: add when-to-use guidance to fp-ts skills * docs: add when-to-use guidance to Three.js skills * docs: add when-to-use guidance to n8n skills * docs: add when-to-use guidance to health analysis skills * docs: add when-to-use guidance to writing and review skills * meta: sync generated catalog metadata * docs: add when-to-use guidance to Robius skills * docs: add when-to-use guidance to review and workflow skills * docs: add when-to-use guidance to science and data skills * docs: add when-to-use guidance to tooling and automation skills * docs: add when-to-use guidance to remaining skills * fix: gate bundle helper execution in Windows activation * chore: drop generated artifacts from contributor PR * docs(maintenance): Record PR 457 sweep Document the open issue triage, PR supersedence decision, local verification, and source-only cleanup that prepared PR #457 for re-running CI. --------- Co-authored-by: sickn33 <sickn33@users.noreply.github.com>
405 lines
12 KiB
Markdown
405 lines
12 KiB
Markdown
---
|
|
name: django-perf-review
|
|
description: Django performance code review. Use when asked to "review Django performance", "find N+1 queries", "optimize Django", "check queryset performance", "database performance", "Django ORM issues", or audit Django code for performance problems.
|
|
allowed-tools: Read, Grep, Glob, Bash, Task
|
|
license: LICENSE
|
|
risk: unknown
|
|
source: community
|
|
---
|
|
|
|
# Django Performance Review
|
|
|
|
Review Django code for **validated** performance issues. Research the codebase to confirm issues before reporting. Report only what you can prove.
|
|
|
|
## When to Use
|
|
|
|
- You need a Django performance review focused on verified ORM and query issues.
|
|
- The code likely has N+1 queries, unbounded querysets, missing indexes, or other database-driven bottlenecks.
|
|
- You want only provable performance findings, not speculative optimization advice.
|
|
|
|
## Review Approach
|
|
|
|
1. **Research first** - Trace data flow, check for existing optimizations, verify data volume
|
|
2. **Validate before reporting** - Pattern matching is not validation
|
|
3. **Zero findings is acceptable** - Don't manufacture issues to appear thorough
|
|
4. **Severity must match impact** - If you catch yourself writing "minor" in a CRITICAL finding, it's not critical. Downgrade or skip it.
|
|
|
|
## Impact Categories
|
|
|
|
Issues are organized by impact. Focus on CRITICAL and HIGH - these cause real problems at scale.
|
|
|
|
| Priority | Category | Impact |
|
|
|----------|----------|--------|
|
|
| 1 | N+1 Queries | **CRITICAL** - Multiplies with data, causes timeouts |
|
|
| 2 | Unbounded Querysets | **CRITICAL** - Memory exhaustion, OOM kills |
|
|
| 3 | Missing Indexes | **HIGH** - Full table scans on large tables |
|
|
| 4 | Write Loops | **HIGH** - Lock contention, slow requests |
|
|
| 5 | Inefficient Patterns | **LOW** - Rarely worth reporting |
|
|
|
|
---
|
|
|
|
## Priority 1: N+1 Queries (CRITICAL)
|
|
|
|
**Impact:** Each N+1 adds `O(n)` database round trips. 100 rows = 100 extra queries. 10,000 rows = timeout.
|
|
|
|
### Rule: Prefetch related data accessed in loops
|
|
|
|
Validate by tracing: View → Queryset → Template/Serializer → Loop access
|
|
|
|
```python
|
|
# PROBLEM: N+1 - each iteration queries profile
|
|
def user_list(request):
|
|
users = User.objects.all()
|
|
return render(request, 'users.html', {'users': users})
|
|
|
|
# Template:
|
|
# {% for user in users %}
|
|
# {{ user.profile.bio }} ← triggers query per user
|
|
# {% endfor %}
|
|
|
|
# SOLUTION: Prefetch in view
|
|
def user_list(request):
|
|
users = User.objects.select_related('profile')
|
|
return render(request, 'users.html', {'users': users})
|
|
```
|
|
|
|
### Rule: Prefetch in serializers, not just views
|
|
|
|
DRF serializers accessing related fields cause N+1 if queryset isn't optimized.
|
|
|
|
```python
|
|
# PROBLEM: SerializerMethodField queries per object
|
|
class UserSerializer(serializers.ModelSerializer):
|
|
order_count = serializers.SerializerMethodField()
|
|
|
|
def get_order_count(self, obj):
|
|
return obj.orders.count() # ← query per user
|
|
|
|
# SOLUTION: Annotate in viewset, access in serializer
|
|
class UserViewSet(viewsets.ModelViewSet):
|
|
def get_queryset(self):
|
|
return User.objects.annotate(order_count=Count('orders'))
|
|
|
|
class UserSerializer(serializers.ModelSerializer):
|
|
order_count = serializers.IntegerField(read_only=True)
|
|
```
|
|
|
|
### Rule: Model properties that query are dangerous in loops
|
|
|
|
```python
|
|
# PROBLEM: Property triggers query when accessed
|
|
class User(models.Model):
|
|
@property
|
|
def recent_orders(self):
|
|
return self.orders.filter(created__gte=last_week)[:5]
|
|
|
|
# Used in template loop = N+1
|
|
|
|
# SOLUTION: Use Prefetch with custom queryset, or annotate
|
|
```
|
|
|
|
### Validation Checklist for N+1
|
|
- [ ] Traced data flow from view to template/serializer
|
|
- [ ] Confirmed related field is accessed inside a loop
|
|
- [ ] Searched codebase for existing select_related/prefetch_related
|
|
- [ ] Verified table has significant row count (1000+)
|
|
- [ ] Confirmed this is a hot path (not admin, not rare action)
|
|
|
|
---
|
|
|
|
## Priority 2: Unbounded Querysets (CRITICAL)
|
|
|
|
**Impact:** Loading entire tables exhausts memory. Large tables cause OOM kills and worker restarts.
|
|
|
|
### Rule: Always paginate list endpoints
|
|
|
|
```python
|
|
# PROBLEM: No pagination - loads all rows
|
|
class UserListView(ListView):
|
|
model = User
|
|
template_name = 'users.html'
|
|
|
|
# SOLUTION: Add pagination
|
|
class UserListView(ListView):
|
|
model = User
|
|
template_name = 'users.html'
|
|
paginate_by = 25
|
|
```
|
|
|
|
### Rule: Use iterator() for large batch processing
|
|
|
|
```python
|
|
# PROBLEM: Loads all objects into memory at once
|
|
for user in User.objects.all():
|
|
process(user)
|
|
|
|
# SOLUTION: Stream with iterator()
|
|
for user in User.objects.iterator(chunk_size=1000):
|
|
process(user)
|
|
```
|
|
|
|
### Rule: Never call list() on unbounded querysets
|
|
|
|
```python
|
|
# PROBLEM: Forces full evaluation into memory
|
|
all_users = list(User.objects.all())
|
|
|
|
# SOLUTION: Keep as queryset, slice if needed
|
|
users = User.objects.all()[:100]
|
|
```
|
|
|
|
### Validation Checklist for Unbounded Querysets
|
|
- [ ] Table is large (10k+ rows) or will grow unbounded
|
|
- [ ] No pagination class, paginate_by, or slicing
|
|
- [ ] This runs on user-facing request (not background job with chunking)
|
|
|
|
---
|
|
|
|
## Priority 3: Missing Indexes (HIGH)
|
|
|
|
**Impact:** Full table scans. Negligible on small tables, catastrophic on large ones.
|
|
|
|
### Rule: Index fields used in WHERE clauses on large tables
|
|
|
|
```python
|
|
# PROBLEM: Filtering on unindexed field
|
|
# User.objects.filter(email=email) # full scan if no index
|
|
|
|
class User(models.Model):
|
|
email = models.EmailField() # ← no db_index
|
|
|
|
# SOLUTION: Add index
|
|
class User(models.Model):
|
|
email = models.EmailField(db_index=True)
|
|
```
|
|
|
|
### Rule: Index fields used in ORDER BY on large tables
|
|
|
|
```python
|
|
# PROBLEM: Sorting requires full scan without index
|
|
Order.objects.order_by('-created')
|
|
|
|
# SOLUTION: Index the sort field
|
|
class Order(models.Model):
|
|
created = models.DateTimeField(db_index=True)
|
|
```
|
|
|
|
### Rule: Use composite indexes for common query patterns
|
|
|
|
```python
|
|
class Order(models.Model):
|
|
user = models.ForeignKey(User)
|
|
status = models.CharField(max_length=20)
|
|
created = models.DateTimeField()
|
|
|
|
class Meta:
|
|
indexes = [
|
|
models.Index(fields=['user', 'status']), # for filter(user=x, status=y)
|
|
models.Index(fields=['status', '-created']), # for filter(status=x).order_by('-created')
|
|
]
|
|
```
|
|
|
|
### Validation Checklist for Missing Indexes
|
|
- [ ] Table has 10k+ rows
|
|
- [ ] Field is used in filter() or order_by() on hot path
|
|
- [ ] Checked model - no db_index=True or Meta.indexes entry
|
|
- [ ] Not a foreign key (already indexed automatically)
|
|
|
|
---
|
|
|
|
## Priority 4: Write Loops (HIGH)
|
|
|
|
**Impact:** N database writes instead of 1. Lock contention. Slow requests.
|
|
|
|
### Rule: Use bulk_create instead of create() in loops
|
|
|
|
```python
|
|
# PROBLEM: N inserts, N round trips
|
|
for item in items:
|
|
Model.objects.create(name=item['name'])
|
|
|
|
# SOLUTION: Single bulk insert
|
|
Model.objects.bulk_create([
|
|
Model(name=item['name']) for item in items
|
|
])
|
|
```
|
|
|
|
### Rule: Use update() or bulk_update instead of save() in loops
|
|
|
|
```python
|
|
# PROBLEM: N updates
|
|
for obj in queryset:
|
|
obj.status = 'done'
|
|
obj.save()
|
|
|
|
# SOLUTION A: Single UPDATE statement (same value for all)
|
|
queryset.update(status='done')
|
|
|
|
# SOLUTION B: bulk_update (different values)
|
|
for obj in objects:
|
|
obj.status = compute_status(obj)
|
|
Model.objects.bulk_update(objects, ['status'], batch_size=500)
|
|
```
|
|
|
|
### Rule: Use delete() on queryset, not in loops
|
|
|
|
```python
|
|
# PROBLEM: N deletes
|
|
for obj in queryset:
|
|
obj.delete()
|
|
|
|
# SOLUTION: Single DELETE
|
|
queryset.delete()
|
|
```
|
|
|
|
### Validation Checklist for Write Loops
|
|
- [ ] Loop iterates over 100+ items (or unbounded)
|
|
- [ ] Each iteration calls create(), save(), or delete()
|
|
- [ ] This runs on user-facing request (not one-time migration script)
|
|
|
|
---
|
|
|
|
## Priority 5: Inefficient Patterns (LOW)
|
|
|
|
**Rarely worth reporting.** Include only as minor notes if you're already reporting real issues.
|
|
|
|
### Pattern: count() vs exists()
|
|
|
|
```python
|
|
# Slightly suboptimal
|
|
if queryset.count() > 0:
|
|
do_thing()
|
|
|
|
# Marginally better
|
|
if queryset.exists():
|
|
do_thing()
|
|
```
|
|
|
|
**Usually skip** - difference is <1ms in most cases.
|
|
|
|
### Pattern: len(queryset) vs count()
|
|
|
|
```python
|
|
# Fetches all rows to count
|
|
if len(queryset) > 0: # bad if queryset not yet evaluated
|
|
|
|
# Single COUNT query
|
|
if queryset.count() > 0:
|
|
```
|
|
|
|
**Only flag** if queryset is large and not already evaluated.
|
|
|
|
### Pattern: get() in small loops
|
|
|
|
```python
|
|
# N queries, but if N is small (< 20), often fine
|
|
for id in ids:
|
|
obj = Model.objects.get(id=id)
|
|
```
|
|
|
|
**Only flag** if loop is large or this is in a very hot path.
|
|
|
|
---
|
|
|
|
## Validation Requirements
|
|
|
|
Before reporting ANY issue:
|
|
|
|
1. **Trace the data flow** - Follow queryset from creation to consumption
|
|
2. **Search for existing optimizations** - Grep for select_related, prefetch_related, pagination
|
|
3. **Verify data volume** - Check if table is actually large
|
|
4. **Confirm hot path** - Trace call sites, verify this runs frequently
|
|
5. **Rule out mitigations** - Check for caching, rate limiting
|
|
|
|
**If you cannot validate all steps, do not report.**
|
|
|
|
---
|
|
|
|
## Output Format
|
|
|
|
```markdown
|
|
## Django Performance Review: [File/Component Name]
|
|
|
|
### Summary
|
|
Validated issues: X (Y Critical, Z High)
|
|
|
|
### Findings
|
|
|
|
#### [PERF-001] N+1 Query in UserListView (CRITICAL)
|
|
**Location:** `views.py:45`
|
|
|
|
**Issue:** Related field `profile` accessed in template loop without prefetch.
|
|
|
|
**Validation:**
|
|
- Traced: UserListView → users queryset → user_list.html → `{{ user.profile.bio }}` in loop
|
|
- Searched codebase: no select_related('profile') found
|
|
- User table: 50k+ rows (verified in admin)
|
|
- Hot path: linked from homepage navigation
|
|
|
|
**Evidence:**
|
|
```python
|
|
def get_queryset(self):
|
|
return User.objects.filter(active=True) # no select_related
|
|
```
|
|
|
|
**Fix:**
|
|
```python
|
|
def get_queryset(self):
|
|
return User.objects.filter(active=True).select_related('profile')
|
|
```
|
|
```
|
|
|
|
If no issues found: "No performance issues identified after reviewing [files] and validating [what you checked]."
|
|
|
|
**Before submitting, sanity check each finding:**
|
|
- Does the severity match the actual impact? ("Minor inefficiency" ≠ CRITICAL)
|
|
- Is this a real performance issue or just a style preference?
|
|
- Would fixing this measurably improve performance?
|
|
|
|
If the answer to any is "no" - remove the finding.
|
|
|
|
---
|
|
|
|
## What NOT to Report
|
|
|
|
- Test files
|
|
- Admin-only views
|
|
- Management commands
|
|
- Migration files
|
|
- One-time scripts
|
|
- Code behind disabled feature flags
|
|
- Tables with <1000 rows that won't grow
|
|
- Patterns in cold paths (rarely executed code)
|
|
- Micro-optimizations (exists vs count, only/defer without evidence)
|
|
|
|
### False Positives to Avoid
|
|
|
|
**Queryset variable assignment is not an issue:**
|
|
```python
|
|
# This is FINE - no performance difference
|
|
projects_qs = Project.objects.filter(org=org)
|
|
projects = list(projects_qs)
|
|
|
|
# vs this - identical performance
|
|
projects = list(Project.objects.filter(org=org))
|
|
```
|
|
Querysets are lazy. Assigning to a variable doesn't execute anything.
|
|
|
|
**Single query patterns are not N+1:**
|
|
```python
|
|
# This is ONE query, not N+1
|
|
projects = list(Project.objects.filter(org=org))
|
|
```
|
|
N+1 requires a loop that triggers additional queries. A single `list()` call is fine.
|
|
|
|
**Missing select_related on single object fetch is not N+1:**
|
|
```python
|
|
# This is 2 queries, not N+1 - report as LOW at most
|
|
state = AutofixState.objects.filter(pr_id=pr_id).first()
|
|
project_id = state.request.project_id # second query
|
|
```
|
|
N+1 requires a loop. A single object doing 2 queries instead of 1 can be reported as LOW if relevant, but never as CRITICAL/HIGH.
|
|
|
|
**Style preferences are not performance issues:**
|
|
If your only suggestion is "combine these two lines" or "rename this variable" - that's style, not performance. Don't report it.
|