feat(engineering,ra-qm): add secrets-vault-manager, sql-database-assistant, gcp-cloud-architect, soc2-compliance

secrets-vault-manager (403-line SKILL.md, 3 scripts, 3 references):
- HashiCorp Vault, AWS SM, Azure KV, GCP SM integration
- Secret rotation, dynamic secrets, audit logging, emergency procedures

sql-database-assistant (457-line SKILL.md, 3 scripts, 3 references):
- Query optimization, migration generation, schema exploration
- Multi-DB support (PostgreSQL, MySQL, SQLite, SQL Server)
- ORM patterns (Prisma, Drizzle, TypeORM, SQLAlchemy)

gcp-cloud-architect (418-line SKILL.md, 3 scripts, 3 references):
- 6-step workflow mirroring aws-solution-architect for GCP
- Cloud Run, GKE, BigQuery, Cloud Functions, cost optimization
- Completes cloud trifecta (AWS + Azure + GCP)

soc2-compliance (417-line SKILL.md, 3 scripts, 3 references):
- SOC 2 Type I & II preparation, Trust Service Criteria mapping
- Control matrix generation, evidence tracking, gap analysis
- First SOC 2 skill in ra-qm-team (joins GDPR, ISO 27001, ISO 13485)

All 12 scripts pass --help. Docs generated, mkdocs.yml nav updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Query Optimization Guide

How to read EXPLAIN plans, choose the right index types, understand query plan operators, and configure connection pooling.

---

## Reading EXPLAIN Plans

### PostgreSQL — EXPLAIN ANALYZE

```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT * FROM orders WHERE status = 'paid' ORDER BY created_at DESC LIMIT 20;
```

**Sample output:**
```
Limit  (cost=0.43..12.87 rows=20 width=128) (actual time=0.052..0.089 rows=20 loops=1)
  ->  Index Scan Backward using idx_orders_status_created on orders  (cost=0.43..4521.33 rows=7284 width=128) (actual time=0.051..0.085 rows=20 loops=1)
        Index Cond: (status = 'paid')
        Buffers: shared hit=4
Planning Time: 0.156 ms
Execution Time: 0.112 ms
```

**Key fields to check:**

| Field | What it tells you |
|-------|-------------------|
| `cost` | Estimated startup..total cost (arbitrary units) |
| `rows` | Estimated row count at that node |
| `actual time` | Real wall-clock time in milliseconds |
| `actual rows` | Real row count — compare against estimate |
| `Buffers: shared hit` | Pages read from cache (good) |
| `Buffers: shared read` | Pages read from disk (slow) |
| `loops` | How many times the node executed |

**Red flags:**
- `Seq Scan` on a large table with a WHERE clause — missing index
- `actual rows` >> `rows` (estimated) — stale statistics; run `ANALYZE`
- `Nested Loop` with a high loop count — consider a hash join or add an index
- `Sort` with `external merge` — not enough `work_mem`, spilling to disk
- `Buffers: shared read` much higher than `shared hit` — cold cache or table too large for memory
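The before/after effect of a missing index is easy to see even without a PostgreSQL server. A minimal sketch using Python's built-in sqlite3 — SQLite's `EXPLAIN QUERY PLAN` is a lighter analogue of PostgreSQL's `EXPLAIN`, and the `orders` table here is a hypothetical stand-in:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT, total REAL)")
conn.executemany("INSERT INTO orders (status, created_at, total) VALUES (?, ?, ?)",
                 [("paid" if i % 2 else "open", f"2025-01-{i % 28 + 1:02d}", 9.99) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN returns rows whose last column describes each plan node
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM orders WHERE status = 'paid' ORDER BY created_at DESC LIMIT 20"
before = plan(query)   # full table scan — the red flag from the list above

conn.execute("CREATE INDEX idx_orders_status_created ON orders (status, created_at)")
after = plan(query)    # index search; the composite index also satisfies the ORDER BY

print(before)
print(after)
```

Running this prints a `SCAN` node before the index exists and a `SEARCH ... USING INDEX idx_orders_status_created` node afterwards.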

### MySQL — EXPLAIN FORMAT=JSON

```sql
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE status = 'paid' ORDER BY created_at DESC LIMIT 20;
```

**Key fields:**
- `query_block.select_id` — identifies subqueries
- `table.access_type` — `ALL` (full scan), `ref` (index lookup), `range`, `index`, `const`
- `table.rows_examined_per_scan` — how many rows the engine reads
- `table.using_index` — covering index (no table lookup needed)
- `table.attached_condition` — the WHERE filter applied

**Access types ranked (best to worst):**
`system` > `const` > `eq_ref` > `ref` > `range` > `index` > `ALL`

---

## Index Types

### B-tree (default)

The workhorse index. Supports equality, range, prefix, and ORDER BY operations.

**Best for:** `=`, `<`, `>`, `<=`, `>=`, `BETWEEN`, `LIKE 'prefix%'`, `ORDER BY`, `MIN()`, `MAX()`

```sql
CREATE INDEX idx_orders_created ON orders (created_at);
```

**Composite B-tree:** Column order matters. The index is useful for queries that filter on a leftmost prefix of the indexed columns.

```sql
-- This index serves: WHERE status = ... AND created_at > ...
-- Also serves: WHERE status = ...
-- Does NOT serve: WHERE created_at > ... (without status)
CREATE INDEX idx_orders_status_created ON orders (status, created_at);
```
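The leftmost-prefix rule can be verified directly. A small sketch with Python's sqlite3 (the table is hypothetical; SQLite's planner follows the same leftmost-prefix rule for composite indexes):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `total` is deliberately outside the index so SELECT * cannot be index-covered
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT, total REAL)")
conn.execute("CREATE INDEX idx_orders_status_created ON orders (status, created_at)")

def plan(sql):
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Leftmost prefix (status) present: the composite index is usable
with_prefix = plan("SELECT * FROM orders WHERE status = 'paid' AND created_at > '2025-01-01'")

# Leftmost column missing: the planner falls back to a full scan
without_prefix = plan("SELECT * FROM orders WHERE created_at > '2025-01-01'")

print(with_prefix)
print(without_prefix)
```

The first plan names `idx_orders_status_created`; the second is a plain table scan.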

### Hash

Equality-only lookups. Faster than B-tree for exact matches but no range support.

**Best for:** `=` lookups on high-cardinality columns

```sql
-- PostgreSQL
CREATE INDEX idx_sessions_token ON sessions USING hash (token);
```

**Limitations:** No range queries, no ORDER BY, not WAL-logged before PostgreSQL 10.

### GIN (Generalized Inverted Index)

For multi-valued data: arrays, JSONB, full-text search vectors.

```sql
-- JSONB containment
CREATE INDEX idx_products_tags ON products USING gin (tags);
-- Query: SELECT * FROM products WHERE tags @> '["sale"]';

-- Full-text search
CREATE INDEX idx_articles_search ON articles USING gin (to_tsvector('english', title || ' ' || body));
```

### GiST (Generalized Search Tree)

For geometric, range, and proximity data.

```sql
-- Range type (e.g., date ranges)
CREATE INDEX idx_bookings_period ON bookings USING gist (during);
-- Query: SELECT * FROM bookings WHERE during && '[2025-01-01, 2025-01-31]';

-- PostGIS geometry
CREATE INDEX idx_locations_geom ON locations USING gist (geom);
```

### BRIN (Block Range INdex)

Tiny index for naturally ordered data (e.g., time-series append-only tables).

```sql
CREATE INDEX idx_events_created ON events USING brin (created_at);
```

**Best for:** Large tables where the indexed column correlates with physical row order. Much smaller than a B-tree but less precise.

### Partial Index

Index only rows matching a condition. Smaller and faster for targeted queries.

```sql
-- Only index active users (skip millions of inactive)
CREATE INDEX idx_users_active_email ON users (email) WHERE status = 'active';
```

### Covering Index (INCLUDE)

Store extra columns in the index to avoid table lookups (index-only scans).

```sql
-- PostgreSQL 11+
CREATE INDEX idx_orders_status ON orders (status) INCLUDE (total, created_at);
-- Query can be answered entirely from the index:
-- SELECT total, created_at FROM orders WHERE status = 'paid';
```

### Expression Index

Index the result of a function or expression.

```sql
CREATE INDEX idx_users_lower_email ON users (LOWER(email));
-- Query: SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
```
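The catch with expression indexes is that the query must use the indexed expression verbatim. A sketch with Python's sqlite3, which also supports expression indexes (table and index names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_users_lower_email ON users (LOWER(email))")

def plan(sql):
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# The predicate matches the indexed expression exactly, so the index is used
matched = plan("SELECT * FROM users WHERE LOWER(email) = 'user@example.com'")

# A bare column predicate cannot use the expression index
unmatched = plan("SELECT * FROM users WHERE email = 'user@example.com'")

print(matched)
print(unmatched)
```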

---

## Query Plan Operators

### Scan operators

| Operator | Description | Performance |
|----------|-------------|-------------|
| **Seq Scan** | Full table scan, reads every row | Slow on large tables |
| **Index Scan** | B-tree lookup + table fetch | Fast for selective queries |
| **Index Only Scan** | Reads only the index (covering) | Fastest for covered queries |
| **Bitmap Index Scan** | Builds a bitmap of matching pages | Good for medium selectivity |
| **Bitmap Heap Scan** | Fetches pages identified by bitmap | Pairs with bitmap index scan |

### Join operators

| Operator | Description | Best when |
|----------|-------------|-----------|
| **Nested Loop** | For each outer row, scan inner | Small outer set, indexed inner |
| **Hash Join** | Build hash table on inner, probe with outer | Medium-large sets, no index |
| **Merge Join** | Merge two sorted inputs | Both inputs already sorted |

### Other operators

| Operator | Description |
|----------|-------------|
| **Sort** | Sorts rows (may spill to disk if `work_mem` is exceeded) |
| **Hash Aggregate** | GROUP BY using a hash table |
| **Group Aggregate** | GROUP BY on pre-sorted input |
| **Limit** | Stops after N rows |
| **Materialize** | Caches subquery results in memory |
| **Gather / Gather Merge** | Collects results from parallel workers |

---

## Connection Pooling

### Why pool connections?

Each database connection consumes memory (5-10 MB in PostgreSQL). Without pooling:
- The application creates a new connection per request (slow: TCP + TLS + auth)
- Under load, the connection count spikes past `max_connections`
- The database hits OOM or returns connection-refused errors

### PgBouncer (PostgreSQL)

The standard external connection pooler for PostgreSQL.

**Modes:**
- **Session** — connection assigned for the entire client session (safest, least efficient)
- **Transaction** — connection returned to the pool after each transaction (recommended)
- **Statement** — connection returned after each statement (cannot use transactions)

```ini
# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
server_idle_timeout = 300
```

**Sizing formula:**
```
default_pool_size = num_cpu_cores * 2 + effective_spindle_count
```
For SSDs, start with `num_cpu_cores * 2` (typically 4-16 connections is optimal).

### ProxySQL (MySQL)

```ini
mysql_servers = ({ address="127.0.0.1", port=3306, hostgroup=0, max_connections=100 })
mysql_query_rules = ({ rule_id=1, match_pattern="^SELECT.*FOR UPDATE", destination_hostgroup=0 })
```

### Application-Level Pooling

Most ORMs and drivers include built-in pooling:

| Platform | Pool Configuration |
|----------|--------------------|
| **node-postgres** | `new Pool({ max: 20, idleTimeoutMillis: 30000 })` |
| **SQLAlchemy** | `create_engine(url, pool_size=20, max_overflow=5)` |
| **HikariCP (Java)** | `maximumPoolSize=20, minimumIdle=5, idleTimeout=300000` |
| **Prisma** | `connection_limit=20` in the connection string |

### Pool Sizing Guidelines

| Metric | Guideline |
|--------|-----------|
| **Minimum** | Number of always-active background workers |
| **Maximum** | 2-4x CPU cores for OLTP; lower for OLAP |
| **Idle timeout** | 30-300 seconds (reclaim unused connections) |
| **Connection timeout** | 3-10 seconds (fail fast under pressure) |
| **Queue size** | 2-5x pool max (buffer bursts before rejecting) |

**Warning:** More connections does not mean better performance. Beyond the optimal point (usually 20-50), contention on locks, CPU, and I/O causes throughput to decrease.
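The mechanics behind these guidelines — a bounded set of pre-opened connections, checkout with a fail-fast timeout, and check-in on release — fit in a few lines. A toy sketch using Python's stdlib (illustration only; real applications should use their driver's built-in pool):

```python
import queue
import sqlite3

class MiniPool:
    """Toy fixed-size connection pool."""

    def __init__(self, size: int, connect_timeout: float = 5.0):
        self._timeout = connect_timeout
        self._conns = queue.Queue(maxsize=size)
        for _ in range(size):  # pre-open `size` connections up front
            self._conns.put(sqlite3.connect(":memory:", check_same_thread=False))

    def acquire(self):
        # Fail fast instead of queueing forever under pressure
        return self._conns.get(timeout=self._timeout)

    def release(self, conn):
        self._conns.put(conn)

pool = MiniPool(size=2, connect_timeout=0.1)
c1, c2 = pool.acquire(), pool.acquire()
try:
    pool.acquire()           # pool exhausted: times out rather than opening a third
except queue.Empty:
    print("timed out waiting for a connection")
pool.release(c1)
c3 = pool.acquire()          # succeeds once a connection is returned
```

Note that the pool recycles `c1` for the next caller rather than opening a new connection — the whole point of pooling.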

---

## Statistics and Maintenance

### PostgreSQL
```sql
-- Update statistics for the query planner
ANALYZE orders;
ANALYZE; -- All tables

-- Check table bloat and dead tuples
SELECT relname, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;

-- Identify unused indexes
SELECT indexrelname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%'
ORDER BY pg_relation_size(indexrelid) DESC;
```

### MySQL
```sql
-- Update statistics
ANALYZE TABLE orders;

-- Check index usage
SELECT * FROM sys.schema_unused_indexes;
SELECT * FROM sys.schema_redundant_indexes;

-- Identify long-running queries
SELECT * FROM information_schema.processlist WHERE time > 10;
```

---

## Performance Checklist

Before deploying any query to production:

1. Run `EXPLAIN ANALYZE` and verify no unexpected sequential scans
2. Check that estimated rows are within 10x of actual rows
3. Verify index usage on all WHERE, JOIN, and ORDER BY columns
4. Ensure LIMIT is present for user-facing list queries
5. Confirm parameterized queries (no string concatenation)
6. Test with production-like data volume (not just 10 rows)
7. Monitor query time in application metrics after deployment
8. Set up slow query log alerting (> 100 ms for OLTP, > 5 s for reports)

---

## Quick Reference: When to Use Which Index

| Query Pattern | Index Type |
|--------------|-----------|
| `WHERE col = value` | B-tree or Hash |
| `WHERE col > value` | B-tree |
| `WHERE col LIKE 'prefix%'` | B-tree |
| `WHERE col LIKE '%substring%'` | GIN (full-text) or trigram |
| `WHERE jsonb_col @> '{...}'` | GIN |
| `WHERE array_col && ARRAY[...]` | GIN |
| `WHERE range_col && '[a,b]'` | GiST |
| `WHERE col = value` (append-only) | BRIN |
| `WHERE col = value AND status = 'active'` | Partial B-tree |
| `SELECT a, b WHERE c = value` | Covering (INCLUDE) |

# ORM Patterns Reference

Side-by-side comparison of Prisma, Drizzle, TypeORM, and SQLAlchemy patterns for common database operations.

---

## Schema Definition

### Prisma (schema.prisma)
```prisma
model User {
  id        Int      @id @default(autoincrement())
  email     String   @unique
  name      String?
  role      Role     @default(USER)
  posts     Post[]
  profile   Profile?
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt

  @@index([email])
  @@map("users")
}

model Post {
  id        Int      @id @default(autoincrement())
  title     String
  body      String?
  published Boolean  @default(false)
  author    User     @relation(fields: [authorId], references: [id], onDelete: Cascade)
  authorId  Int
  tags      Tag[]
  createdAt DateTime @default(now())

  @@index([authorId])
  @@index([published, createdAt])
  @@map("posts")
}

enum Role {
  USER
  ADMIN
  MODERATOR
}
```

### Drizzle (schema.ts)
```typescript
import { pgTable, serial, varchar, text, boolean, timestamp, integer, pgEnum, index } from 'drizzle-orm/pg-core';

export const roleEnum = pgEnum('role', ['USER', 'ADMIN', 'MODERATOR']);

export const users = pgTable('users', {
  id: serial('id').primaryKey(),
  email: varchar('email', { length: 255 }).notNull().unique(),
  name: varchar('name', { length: 255 }),
  role: roleEnum('role').default('USER').notNull(),
  createdAt: timestamp('created_at').defaultNow().notNull(),
  updatedAt: timestamp('updated_at').defaultNow().notNull(),
});

export const posts = pgTable('posts', {
  id: serial('id').primaryKey(),
  title: varchar('title', { length: 255 }).notNull(),
  body: text('body'),
  published: boolean('published').default(false).notNull(),
  authorId: integer('author_id').notNull().references(() => users.id, { onDelete: 'cascade' }),
  createdAt: timestamp('created_at').defaultNow().notNull(),
}, (table) => ({
  authorIdx: index('idx_posts_author').on(table.authorId),
  publishedIdx: index('idx_posts_published').on(table.published, table.createdAt),
}));
```

### TypeORM (entities)
```typescript
import { Entity, PrimaryGeneratedColumn, Column, ManyToOne, OneToMany, CreateDateColumn, UpdateDateColumn, Index } from 'typeorm';

export enum Role { USER = 'USER', ADMIN = 'ADMIN', MODERATOR = 'MODERATOR' }

@Entity('users')
export class User {
  @PrimaryGeneratedColumn()
  id: number;

  @Column({ unique: true })
  @Index()
  email: string;

  @Column({ nullable: true })
  name: string;

  @Column({ type: 'enum', enum: Role, default: Role.USER })
  role: Role;

  @OneToMany(() => Post, post => post.author)
  posts: Post[];

  @CreateDateColumn()
  createdAt: Date;

  @UpdateDateColumn()
  updatedAt: Date;
}

@Entity('posts')
@Index(['published', 'createdAt'])
export class Post {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  title: string;

  @Column({ nullable: true, type: 'text' })
  body: string;

  @Column({ default: false })
  published: boolean;

  @ManyToOne(() => User, user => user.posts, { onDelete: 'CASCADE' })
  author: User;

  @Column()
  authorId: number;

  @CreateDateColumn()
  createdAt: Date;
}
```

### SQLAlchemy (models.py)
```python
import enum
from datetime import datetime

from sqlalchemy import Column, Integer, String, Text, Boolean, DateTime, Enum, ForeignKey, Index
from sqlalchemy.orm import relationship, DeclarativeBase


class Base(DeclarativeBase):
    pass


class Role(enum.Enum):
    USER = "USER"
    ADMIN = "ADMIN"
    MODERATOR = "MODERATOR"


class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True, autoincrement=True)
    email = Column(String(255), unique=True, nullable=False, index=True)
    name = Column(String(255), nullable=True)
    role = Column(Enum(Role), default=Role.USER, nullable=False)
    posts = relationship('Post', back_populates='author', cascade='all, delete-orphan')
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow, nullable=False)


class Post(Base):
    __tablename__ = 'posts'
    __table_args__ = (
        Index('idx_posts_published', 'published', 'created_at'),
    )

    id = Column(Integer, primary_key=True, autoincrement=True)
    title = Column(String(255), nullable=False)
    body = Column(Text, nullable=True)
    published = Column(Boolean, default=False, nullable=False)
    author_id = Column(Integer, ForeignKey('users.id', ondelete='CASCADE'), nullable=False, index=True)
    author = relationship('User', back_populates='posts')
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
```

---

## CRUD Operations

### Create

| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.create({ data: { email, name } })` |
| **Drizzle** | `await db.insert(users).values({ email, name }).returning()` |
| **TypeORM** | `await userRepo.save(userRepo.create({ email, name }))` |
| **SQLAlchemy** | `session.add(User(email=email, name=name)); session.commit()` |

### Read (with filter)

| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.findMany({ where: { role: 'ADMIN' }, orderBy: { createdAt: 'desc' } })` |
| **Drizzle** | `await db.select().from(users).where(eq(users.role, 'ADMIN')).orderBy(desc(users.createdAt))` |
| **TypeORM** | `await userRepo.find({ where: { role: Role.ADMIN }, order: { createdAt: 'DESC' } })` |
| **SQLAlchemy** | `session.query(User).filter(User.role == Role.ADMIN).order_by(User.created_at.desc()).all()` |

### Update

| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.update({ where: { id }, data: { name } })` |
| **Drizzle** | `await db.update(users).set({ name }).where(eq(users.id, id))` |
| **TypeORM** | `await userRepo.update(id, { name })` |
| **SQLAlchemy** | `session.query(User).filter(User.id == id).update({User.name: name}); session.commit()` |

### Delete

| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.delete({ where: { id } })` |
| **Drizzle** | `await db.delete(users).where(eq(users.id, id))` |
| **TypeORM** | `await userRepo.delete(id)` |
| **SQLAlchemy** | `session.query(User).filter(User.id == id).delete(); session.commit()` |

---

## Relations and Eager Loading

### Prisma — include / select
```typescript
// Eager load posts with user
const user = await prisma.user.findUnique({
  where: { id: 1 },
  include: { posts: { where: { published: true }, orderBy: { createdAt: 'desc' } } },
});

// Nested create
await prisma.user.create({
  data: {
    email: 'new@example.com',
    posts: { create: [{ title: 'First post' }] },
  },
});
```

### Drizzle — relational queries
```typescript
const result = await db.query.users.findFirst({
  where: eq(users.id, 1),
  with: { posts: { where: eq(posts.published, true), orderBy: [desc(posts.createdAt)] } },
});
```

### TypeORM — relations / query builder
```typescript
// FindOptions
const user = await userRepo.findOne({ where: { id: 1 }, relations: ['posts'] });

// QueryBuilder for complex joins
const result = await userRepo.createQueryBuilder('u')
  .leftJoinAndSelect('u.posts', 'p', 'p.published = :pub', { pub: true })
  .where('u.id = :id', { id: 1 })
  .getOne();
```

### SQLAlchemy — joinedload / selectinload
```python
from sqlalchemy.orm import joinedload, selectinload

# Eager load in one JOIN query
user = session.query(User).options(joinedload(User.posts)).filter(User.id == 1).first()

# Eager load in a separate IN query (better for collections)
users = session.query(User).options(selectinload(User.posts)).all()
```

---

## Raw SQL Escape Hatches

Every ORM should provide a way to execute raw SQL for complex queries:

| ORM | Pattern |
|-----|---------|
| **Prisma** | `` prisma.$queryRaw`SELECT * FROM users WHERE id = ${id}` `` |
| **Drizzle** | ``db.execute(sql`SELECT * FROM users WHERE id = ${id}`)`` |
| **TypeORM** | `dataSource.query('SELECT * FROM users WHERE id = $1', [id])` |
| **SQLAlchemy** | `session.execute(text('SELECT * FROM users WHERE id = :id'), {'id': id})` |

Always use parameterized queries in raw SQL to prevent injection.
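Why this matters can be demonstrated in a few lines. A sketch with Python's sqlite3 showing how string concatenation lets a classic payload rewrite the WHERE clause, while a placeholder treats the same payload as a literal value (table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('alice@example.com'), ('bob@example.com')")

user_input = "' OR '1'='1"  # classic injection payload

# Unsafe: concatenation turns the payload into SQL, matching every row
unsafe_sql = "SELECT COUNT(*) FROM users WHERE email = '" + user_input + "'"
leaked = conn.execute(unsafe_sql).fetchone()[0]

# Safe: the placeholder binds the payload as plain data, matching nothing
safe = conn.execute("SELECT COUNT(*) FROM users WHERE email = ?", (user_input,)).fetchone()[0]

print(leaked, safe)  # 2 0
```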

---

## Transaction Patterns

### Prisma
```typescript
await prisma.$transaction(async (tx) => {
  const user = await tx.user.create({ data: { email } });
  await tx.post.create({ data: { title: 'Welcome', authorId: user.id } });
});
```

### Drizzle
```typescript
await db.transaction(async (tx) => {
  const [user] = await tx.insert(users).values({ email }).returning();
  await tx.insert(posts).values({ title: 'Welcome', authorId: user.id });
});
```

### TypeORM
```typescript
await dataSource.transaction(async (manager) => {
  const user = await manager.save(User, { email });
  await manager.save(Post, { title: 'Welcome', authorId: user.id });
});
```

### SQLAlchemy
```python
with Session() as session:
    try:
        user = User(email=email)
        session.add(user)
        session.flush()  # Get user.id without committing
        session.add(Post(title='Welcome', author_id=user.id))
        session.commit()
    except Exception:
        session.rollback()
        raise
```
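The guarantee all four patterns buy is the same: if the second write fails, the first is rolled back too. A self-contained sketch of that behavior with Python's sqlite3 (the schema and `signup` helper are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.execute("""CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    title TEXT CHECK (length(title) > 0),
    author_id INTEGER)""")

def signup(email: str, title: str = "Welcome") -> None:
    # The connection as a context manager wraps both inserts in one transaction
    with conn:
        cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.execute("INSERT INTO posts (title, author_id) VALUES (?, ?)",
                     (title, cur.lastrowid))

signup("alice@example.com")
try:
    # The second insert violates the CHECK constraint, so the whole
    # transaction — including the already-executed users insert — rolls back
    signup("bob@example.com", title="")
except sqlite3.IntegrityError:
    pass

users_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
posts_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(users_count, posts_count)  # 1 1 — the failed signup left nothing behind
```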

---

## Migration Workflows

### Prisma
```bash
# Generate migration from schema changes
npx prisma migrate dev --name add_posts_table

# Apply in production
npx prisma migrate deploy

# Reset database (dev only)
npx prisma migrate reset

# Generate client after schema change
npx prisma generate
```

**Files:** `prisma/migrations/<timestamp>_<name>/migration.sql`

### Drizzle
```bash
# Generate migration SQL from schema diff
npx drizzle-kit generate:pg

# Push schema directly (dev only, no migration files)
npx drizzle-kit push:pg

# Apply migrations
npx drizzle-kit migrate
```

**Files:** `drizzle/<timestamp>_<name>.sql`

### TypeORM
```bash
# Auto-generate migration from entity changes
npx typeorm migration:generate -d data-source.ts -n AddPostsTable

# Create empty migration
npx typeorm migration:create -n CustomMigration

# Run pending migrations
npx typeorm migration:run -d data-source.ts

# Revert last migration
npx typeorm migration:revert -d data-source.ts
```

**Files:** `src/migrations/<timestamp>-<Name>.ts`

### SQLAlchemy (Alembic)
```bash
# Initialize Alembic
alembic init alembic

# Auto-generate migration from model changes
alembic revision --autogenerate -m "add posts table"

# Apply all pending
alembic upgrade head

# Revert one step
alembic downgrade -1

# Show current state
alembic current
```

**Files:** `alembic/versions/<hash>_<slug>.py`

---

## N+1 Prevention Cheat Sheet

| ORM | Lazy (N+1 risk) | Eager (fixed) |
|-----|-----------------|---------------|
| **Prisma** | Separate queries per parent (no `include`) | `include: { posts: true }` |
| **Drizzle** | Separate queries | `with: { posts: true }` |
| **TypeORM** | `@ManyToOne(() => ..., { lazy: true })` | `relations: ['posts']` or `leftJoinAndSelect` |
| **SQLAlchemy** | Default `lazy='select'` | `joinedload()` or `selectinload()` |

**Rule of thumb:** If you access a relation inside a loop, you have an N+1 problem. Always load relations before the loop.
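The rule of thumb is easy to see in query counts. A sketch with Python's sqlite3 that counts queries issued by the lazy pattern versus a single batched `IN` query, which is essentially what the eager-loading options above generate (schema and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO posts (author_id, title) VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

query_count = 0
def run(sql, args=()):
    global query_count
    query_count += 1
    return conn.execute(sql, args).fetchall()

# N+1: one query for the parents, then one per parent inside the loop
users = run("SELECT id, name FROM users")
for uid, _ in users:
    run("SELECT title FROM posts WHERE author_id = ?", (uid,))
n_plus_1 = query_count          # 1 + 3 = 4 queries; grows with the user count

# Fixed: load all relations up front with a single IN query
query_count = 0
users = run("SELECT id, name FROM users")
ids = [u[0] for u in users]
placeholders = ",".join("?" * len(ids))
run(f"SELECT author_id, title FROM posts WHERE author_id IN ({placeholders})", ids)
batched = query_count           # always 2 queries, regardless of user count

print(n_plus_1, batched)        # 4 2
```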

---

## Connection Pooling

### Prisma
```
# In .env or connection string
DATABASE_URL="postgresql://user:pass@host/db?connection_limit=20&pool_timeout=10"
```

### Drizzle (with node-postgres)
```typescript
import { drizzle } from 'drizzle-orm/node-postgres';
import { Pool } from 'pg';

const pool = new Pool({ max: 20, idleTimeoutMillis: 30000, connectionTimeoutMillis: 5000 });
const db = drizzle(pool);
```

### TypeORM
```typescript
const dataSource = new DataSource({
  type: 'postgres',
  extra: { max: 20, idleTimeoutMillis: 30000 },
});
```

### SQLAlchemy
```python
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:pass@host/db', pool_size=20, max_overflow=5, pool_timeout=30)
```

---

## Best Practices Summary

1. **Always use migrations** — never modify production schemas by hand
2. **Eager load relations** — prevent N+1 in every list/collection query
3. **Use transactions** — group related writes to maintain consistency
4. **Parameterize raw SQL** — never concatenate user input into queries
5. **Configure connection pooling** — size the pool to match your workload
6. **Index foreign keys** — ORMs often skip this; add indexes manually if needed
7. **Review generated SQL** — enable query logging in development to catch inefficiencies
8. **Prefer type-safe queries** — leverage TypeScript/Python typing for compile-time checks
9. **Separate read/write models** — use views or read replicas for heavy reporting queries
10. **Test migrations both ways** — always verify that down migrations actually reverse up migrations

# SQL Query Patterns Reference

Common query patterns for everyday database operations. All examples use PostgreSQL syntax, with dialect notes where they differ.

---

## JOIN Patterns

### INNER JOIN — matching rows in both tables
```sql
SELECT u.name, o.id AS order_id, o.total
FROM users u
INNER JOIN orders o ON o.user_id = u.id
WHERE o.status = 'paid';
```

### LEFT JOIN — all rows from left, matching from right
```sql
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name;
```
Returns users even if they have zero orders.

### Self JOIN — comparing rows within the same table
```sql
-- Find employees who earn more than their manager
SELECT e.name AS employee, m.name AS manager, e.salary, m.salary AS manager_salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
```

### CROSS JOIN — every combination (cartesian product)
```sql
-- Generate a calendar grid
SELECT d.date, s.shift_name
FROM dates d
CROSS JOIN shifts s;
```
Use intentionally. Accidental cartesian joins are a performance killer.

### LATERAL JOIN (PostgreSQL) — correlated subquery as a table
```sql
-- Top 3 orders per user
SELECT u.name, top_orders.*
FROM users u
CROSS JOIN LATERAL (
  SELECT id, total FROM orders
  WHERE user_id = u.id
  ORDER BY total DESC LIMIT 3
) top_orders;
```
MySQL 8.0.14+ also supports LATERAL; on older versions, use a subquery with `ROW_NUMBER()`.

---

## Common Table Expressions (CTEs)

### Basic CTE — readable subquery
```sql
WITH active_users AS (
  SELECT id, name, email
  FROM users
  WHERE last_login > CURRENT_DATE - INTERVAL '30 days'
)
SELECT au.name, COUNT(o.id) AS recent_orders
FROM active_users au
JOIN orders o ON o.user_id = au.id
GROUP BY au.name;
```

### Multiple CTEs — chaining transformations
```sql
WITH monthly_revenue AS (
  SELECT DATE_TRUNC('month', created_at) AS month, SUM(total) AS revenue
  FROM orders WHERE status = 'paid'
  GROUP BY 1
),
growth AS (
  SELECT month, revenue,
    LAG(revenue) OVER (ORDER BY month) AS prev_revenue,
    ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) / LAG(revenue) OVER (ORDER BY month) * 100, 1) AS growth_pct
  FROM monthly_revenue
)
SELECT * FROM growth ORDER BY month;
```

### Recursive CTE — hierarchical data

```sql
-- Organization tree
WITH RECURSIVE org_tree AS (
  -- Base case: top-level managers
  SELECT id, name, manager_id, 0 AS depth
  FROM employees WHERE manager_id IS NULL

  UNION ALL

  -- Recursive case: subordinates
  SELECT e.id, e.name, e.manager_id, ot.depth + 1
  FROM employees e
  JOIN org_tree ot ON e.manager_id = ot.id
)
SELECT * FROM org_tree ORDER BY depth, name;
```
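
If the `manager_id` data can ever contain a cycle, this recursion never terminates. One defensive variant (the depth cap of 20 is an arbitrary assumption, not part of the original example):

```sql
WITH RECURSIVE org_tree AS (
  SELECT id, name, manager_id, 0 AS depth
  FROM employees WHERE manager_id IS NULL
  UNION ALL
  SELECT e.id, e.name, e.manager_id, ot.depth + 1
  FROM employees e
  JOIN org_tree ot ON e.manager_id = ot.id
  WHERE ot.depth < 20  -- guard against cycles / runaway depth
)
SELECT * FROM org_tree ORDER BY depth, name;
```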

### Recursive CTE — path traversal

```sql
-- Category breadcrumb
WITH RECURSIVE breadcrumb AS (
  SELECT id, name, parent_id, name::TEXT AS path
  FROM categories WHERE id = 42

  UNION ALL

  SELECT c.id, c.name, c.parent_id, c.name || ' > ' || b.path
  FROM categories c
  JOIN breadcrumb b ON c.id = b.parent_id
)
SELECT path FROM breadcrumb WHERE parent_id IS NULL;
```

---

## Window Functions

### ROW_NUMBER — assign unique rank per partition

```sql
SELECT *, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees;
```

### RANK and DENSE_RANK — handle ties

```sql
-- RANK: 1, 2, 2, 4 (skips after tie)
-- DENSE_RANK: 1, 2, 2, 3 (no skip)
SELECT name, salary,
  RANK() OVER (ORDER BY salary DESC) AS rank,
  DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
```

### Running total and moving average

```sql
SELECT date, amount,
  SUM(amount) OVER (ORDER BY date) AS running_total,
  AVG(amount) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_avg_7d
FROM daily_revenue;
```

### LAG / LEAD — access adjacent rows

```sql
SELECT date, revenue,
  LAG(revenue, 1) OVER (ORDER BY date) AS prev_day,
  revenue - LAG(revenue, 1) OVER (ORDER BY date) AS day_over_day_change
FROM daily_revenue;
```

### NTILE — divide into buckets

```sql
-- Split customers into quartiles by total spend
SELECT customer_id, total_spend,
  NTILE(4) OVER (ORDER BY total_spend DESC) AS spend_quartile
FROM customer_summary;
```

### FIRST_VALUE / LAST_VALUE

```sql
SELECT department_id, name, salary,
  FIRST_VALUE(name) OVER (PARTITION BY department_id ORDER BY salary DESC) AS highest_paid,
  LAST_VALUE(name) OVER (PARTITION BY department_id ORDER BY salary DESC
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lowest_paid
FROM employees;
```

`LAST_VALUE` needs the explicit frame clause: the default frame ends at the current row, which would just return each row's own value.

---

## Subquery Patterns

### EXISTS — correlated existence check

```sql
-- Users who have placed at least one order
SELECT u.* FROM users u
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
```

### NOT EXISTS — safer than NOT IN for NULLs

```sql
-- Users who have never ordered
SELECT u.* FROM users u
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
```
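
The NULL trap that makes `NOT IN` risky, sketched against the same tables: if the subquery returns any NULL, the predicate is UNKNOWN for every row and the query silently returns nothing.

```sql
-- Returns zero rows whenever orders.user_id contains a NULL,
-- because `id NOT IN (..., NULL)` never evaluates to TRUE
SELECT u.* FROM users u
WHERE u.id NOT IN (SELECT o.user_id FROM orders o);
```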

### Scalar subquery — single value

```sql
SELECT name, salary,
  salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;
```

### Derived table — subquery in FROM

```sql
SELECT dept, avg_salary
FROM (
  SELECT department_id AS dept, AVG(salary) AS avg_salary
  FROM employees GROUP BY department_id
) dept_avg
WHERE avg_salary > 100000;
```

---

## Aggregation Patterns

### GROUP BY with HAVING

```sql
-- Departments with more than 10 employees
SELECT department_id, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 10;
```

### GROUPING SETS — multiple grouping levels

```sql
SELECT region, product_category, SUM(revenue)
FROM sales
GROUP BY GROUPING SETS (
  (region, product_category),
  (region),
  (product_category),
  ()
);
```

### ROLLUP — hierarchical subtotals

```sql
SELECT region, city, SUM(revenue)
FROM sales
GROUP BY ROLLUP (region, city);
-- Produces: (region, city), (region), ()
```

### CUBE — all combinations

```sql
SELECT region, product, SUM(revenue)
FROM sales
GROUP BY CUBE (region, product);
```

### FILTER clause (PostgreSQL) — conditional aggregation

```sql
SELECT
  COUNT(*) AS total,
  COUNT(*) FILTER (WHERE status = 'paid') AS paid,
  COUNT(*) FILTER (WHERE status = 'cancelled') AS cancelled,
  SUM(total) FILTER (WHERE status = 'paid') AS paid_revenue
FROM orders;
```

MySQL/SQL Server equivalent: `SUM(CASE WHEN status = 'paid' THEN 1 ELSE 0 END)`.
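
A portable sketch of that `CASE` rewrite, assuming the same `orders` table:

```sql
SELECT
  COUNT(*) AS total_orders,
  SUM(CASE WHEN status = 'paid' THEN 1 ELSE 0 END) AS paid,
  SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END) AS cancelled,
  SUM(CASE WHEN status = 'paid' THEN total ELSE 0 END) AS paid_revenue
FROM orders;
```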

---

## UPSERT Patterns

### PostgreSQL — ON CONFLICT

```sql
INSERT INTO user_settings (user_id, key, value, updated_at)
VALUES (1, 'theme', 'dark', NOW())
ON CONFLICT (user_id, key)
DO UPDATE SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at;
```

### MySQL — ON DUPLICATE KEY

```sql
INSERT INTO user_settings (user_id, key_name, value, updated_at)
VALUES (1, 'theme', 'dark', NOW())
ON DUPLICATE KEY UPDATE value = VALUES(value), updated_at = VALUES(updated_at);
```

`VALUES()` in this position is deprecated since MySQL 8.0.20; the replacement is a row alias, e.g. `VALUES (...) AS new ON DUPLICATE KEY UPDATE value = new.value`.

### SQL Server — MERGE

```sql
MERGE INTO user_settings AS target
USING (VALUES (1, 'theme', 'dark')) AS source (user_id, key_name, value)
ON target.user_id = source.user_id AND target.key_name = source.key_name
WHEN MATCHED THEN UPDATE SET value = source.value, updated_at = GETDATE()
WHEN NOT MATCHED THEN INSERT (user_id, key_name, value, updated_at)
  VALUES (source.user_id, source.key_name, source.value, GETDATE());
```

---

## JSON Operations

### PostgreSQL JSONB

```sql
-- Extract field
SELECT data->>'name' AS name FROM products WHERE data->>'category' = 'electronics';

-- Array contains
SELECT * FROM products WHERE data->'tags' ? 'sale';

-- Update nested field
UPDATE products SET data = jsonb_set(data, '{price}', '29.99') WHERE id = 1;

-- Aggregate into JSON array
SELECT jsonb_agg(jsonb_build_object('id', id, 'name', name)) FROM users;
```

### MySQL JSON

```sql
-- Extract field
SELECT JSON_EXTRACT(data, '$.name') AS name FROM products;
-- Shorthand: SELECT data->>"$.name"

-- Search in array
SELECT * FROM products WHERE JSON_CONTAINS(data->"$.tags", '"sale"');

-- Update
UPDATE products SET data = JSON_SET(data, '$.price', 29.99) WHERE id = 1;
```

---

## Pagination Patterns

### Offset pagination (simple but slow for deep pages)

```sql
SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 40;
```

### Keyset pagination (fast, requires ordered unique column)

```sql
-- Page after the last seen id
SELECT * FROM products WHERE id > :last_seen_id ORDER BY id LIMIT 20;
```

### Keyset with composite sort

```sql
SELECT * FROM products
WHERE (created_at, id) < (:last_created_at, :last_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;
```

---

## Bulk Operations

### Batch INSERT

```sql
INSERT INTO events (type, payload, created_at) VALUES
  ('click', '{"page": "/home"}', NOW()),
  ('view', '{"page": "/pricing"}', NOW()),
  ('click', '{"page": "/signup"}', NOW());
```

### Batch UPDATE with VALUES (PostgreSQL)

```sql
UPDATE products AS p SET price = v.price
FROM (VALUES (1, 29.99), (2, 49.99), (3, 9.99)) AS v(id, price)
WHERE p.id = v.id;
```

### DELETE with subquery

```sql
DELETE FROM sessions
WHERE user_id IN (SELECT id FROM users WHERE deleted_at IS NOT NULL);
```

### COPY (PostgreSQL bulk load)

```sql
COPY products (name, price, category) FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true);
```

COPY reads the file on the database server; from a client machine, use psql's `\copy` variant instead.

---

## Utility Patterns

### Generate series (PostgreSQL)

```sql
-- Fill date gaps
SELECT d::date FROM generate_series('2025-01-01'::date, '2025-12-31', '1 day') d;
```

### Deduplicate rows (PostgreSQL)

```sql
DELETE FROM events a USING events b
WHERE a.id > b.id AND a.user_id = b.user_id AND a.event_type = b.event_type
  AND a.created_at = b.created_at;
```

Keeps the row with the lowest `id` in each duplicate group.

### Pivot (manual)

```sql
SELECT user_id,
  SUM(CASE WHEN month = 1 THEN revenue END) AS jan,
  SUM(CASE WHEN month = 2 THEN revenue END) AS feb,
  SUM(CASE WHEN month = 3 THEN revenue END) AS mar
FROM monthly_revenue
GROUP BY user_id;
```

### Conditional INSERT (skip if exists)

```sql
INSERT INTO tags (name)
SELECT 'new-tag'
WHERE NOT EXISTS (SELECT 1 FROM tags WHERE name = 'new-tag');
```

Under concurrent writers this check-then-insert can still race; on PostgreSQL, prefer `INSERT ... ON CONFLICT (name) DO NOTHING` when a unique constraint exists.