* docs: restructure README.md — 2,539 → 209 lines (#247) - Cut from 2,539 lines / 73 sections to 209 lines / 18 sections - Consolidated 4 install methods into one unified section - Moved all skill details to domain-level READMEs (linked from table) - Front-loaded value prop and keywords for SEO - Added POWERFUL tier highlight section - Added skill-security-auditor showcase section - Removed stale Q4 2025 roadmap, outdated ROI claims, duplicate content - Fixed all internal links - Clean heading hierarchy (H2 for main sections only) Closes #233 Co-authored-by: Leo <leo@openclaw.ai> * fix: enhance 5 skills with scripts, references, and Anthropic best practices (#248) * fix(skill): enhance git-worktree-manager with scripts, references, and Anthropic best practices * fix(skill): enhance mcp-server-builder with scripts, references, and Anthropic best practices * fix(skill): enhance changelog-generator with scripts, references, and Anthropic best practices * fix(skill): enhance ci-cd-pipeline-builder with scripts, references, and Anthropic best practices * fix(skill): enhance prompt-engineer-toolkit with scripts, references, and Anthropic best practices * docs: update README, CHANGELOG, and plugin metadata * fix: correct marketing plugin count, expand thin references --------- Co-authored-by: Leo <leo@openclaw.ai> * ci: Add VirusTotal security scan for skills (#252) * Dev (#231) * Improve senior-fullstack skill description and workflow validation - Expand frontmatter description with concrete actions and trigger clauses - Add validation steps to scaffolding workflow (verify scaffold succeeded) - Add re-run verification step to audit workflow (confirm P0 fixes) * chore: sync codex skills symlinks [automated] * fix(skill): normalize senior-fullstack frontmatter to inline format Normalize YAML description from block scalar (>) to inline single-line format matching all other 50+ skills. 
Align frontmatter trigger phrases with the body's Trigger Phrases section to eliminate duplication. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): add GITHUB_TOKEN to checkout + restore corrupted skill descriptions - Add token: ${{ secrets.GITHUB_TOKEN }} to actions/checkout@v4 in sync-codex-skills.yml so git-auto-commit-action can push back to branch (fixes: fatal: could not read Username, exit 128) - Restore correct description for incident-commander (was: 'Skill from engineering-team') - Restore correct description for senior-fullstack (was: '>') * fix(ci): pass PROJECTS_TOKEN to fix automated commits + remove duplicate checkout Fixes PROJECTS_TOKEN passthrough for git-auto-commit-action and removes duplicate checkout step in pr-issue-auto-close workflow. * fix(ci): remove stray merge conflict marker in sync-codex-skills.yml (#221) Co-authored-by: Leo <leo@leo-agent-server> * fix(ci): fix workflow errors + add OpenClaw support (#222) * feat: add 20 new practical skills for professional Claude Code users New skills across 5 categories: Engineering (12): - git-worktree-manager: Parallel dev with port isolation & env sync - ci-cd-pipeline-builder: Generate GitHub Actions/GitLab CI from stack analysis - mcp-server-builder: Build MCP servers from OpenAPI specs - changelog-generator: Conventional commits to structured changelogs - pr-review-expert: Blast radius analysis & security scan for PRs - api-test-suite-builder: Auto-generate test suites from API routes - env-secrets-manager: .env management, leak detection, rotation workflows - database-schema-designer: Requirements to migrations & types - codebase-onboarding: Auto-generate onboarding docs from codebase - performance-profiler: Node/Python/Go profiling & optimization - runbook-generator: Operational runbooks from codebase analysis - monorepo-navigator: Turborepo/Nx/pnpm workspace management Engineering Team (2): - stripe-integration-expert: Subscriptions, webhooks, billing patterns - 
email-template-builder: React Email/MJML transactional email systems Product Team (3): - saas-scaffolder: Full SaaS project generation from product brief - landing-page-generator: High-converting landing pages with copy frameworks - competitive-teardown: Structured competitive product analysis Business Growth (1): - contract-and-proposal-writer: Contracts, SOWs, NDAs per jurisdiction Marketing (1): - prompt-engineer-toolkit: Systematic prompt development & A/B testing Designed for daily professional use and commercial distribution. * chore: sync codex skills symlinks [automated] * docs: update README with 20 new skills, counts 65→86, new skills section * docs: add commercial distribution plan (Stan Store + Gumroad) * docs: rewrite CHANGELOG.md with v2.0.0 release (65 skills, 9 domains) (#226) * docs: rewrite CHANGELOG.md with v2.0.0 release (65 skills, 9 domains) - Consolidate 191 commits since v1.0.2 into proper v2.0.0 entry - Document 12 POWERFUL-tier skills, 37 refactored skills - Add new domains: business-growth, finance - Document Codex support and marketplace integration - Update version history summary table - Clean up [Unreleased] to only planned work * docs: add 24 POWERFUL-tier skills to plugin, fix counts to 85 across all docs - Add engineering-advanced-skills plugin (24 POWERFUL-tier skills) to marketplace.json - Add 13 missing skills to CHANGELOG v2.0.0 (agent-workflow-designer, api-test-suite-builder, changelog-generator, ci-cd-pipeline-builder, codebase-onboarding, database-schema-designer, env-secrets-manager, git-worktree-manager, mcp-server-builder, monorepo-navigator, performance-profiler, pr-review-expert, runbook-generator) - Fix skill count: 86→85 (excl sample-skill) across README, CHANGELOG, marketplace.json - Fix stale 53→85 references in README - Add engineering-advanced-skills install command to README - Update marketplace.json version to 2.0.0 --------- Co-authored-by: Leo <leo@openclaw.ai> * feat: add skill-security-auditor POWERFUL-tier 
skill (#230) Security audit and vulnerability scanner for AI agent skills before installation. Scans for: - Code execution risks (eval, exec, os.system, subprocess shell injection) - Data exfiltration (outbound HTTP, credential harvesting, env var extraction) - Prompt injection in SKILL.md (system override, role hijack, safety bypass) - Dependency supply chain (typosquatting, unpinned versions, runtime installs) - File system abuse (boundary violations, binaries, symlinks, hidden files) - Privilege escalation (sudo, SUID, cron manipulation, shell config writes) - Obfuscation (base64, hex encoding, chr chains, codecs) Produces clear PASS/WARN/FAIL verdict with per-finding remediation guidance. Supports local dirs, git repo URLs, JSON output, strict mode, and CI/CD integration. Includes: - scripts/skill_security_auditor.py (1049 lines, zero dependencies) - references/threat-model.md (complete attack vector documentation) - SKILL.md with usage guide and report format Tested against: rag-architect (PASS), agent-designer (PASS), senior-secops (FAIL - correctly flagged eval/exec patterns). 
Co-authored-by: Leo <leo@openclaw.ai> * docs: add skill-security-auditor to marketplace, README, and CHANGELOG - Add standalone plugin entry for skill-security-auditor in marketplace.json - Update engineering-advanced-skills plugin description to include it - Update skill counts: 85→86 across README, CHANGELOG, marketplace - Add install command to README Quick Install section - Add to CHANGELOG [Unreleased] section --------- Co-authored-by: Baptiste Fernandez <fernandez.baptiste1@gmail.com> Co-authored-by: alirezarezvani <5697919+alirezarezvani@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Leo <leo@leo-agent-server> Co-authored-by: Leo <leo@openclaw.ai> * Dev (#249) * docs: restructure README.md — 2,539 → 209 lines (#247) - Cut from 2,539 lines / 73 sections to 209 lines / 18 sections - Consolidated 4 install methods into one unified section - Moved all skill details to domain-level READMEs (linked from table) - Front-loaded value prop and keywords for SEO - Added POWERFUL tier highlight section - Added skill-security-auditor showcase section - Removed stale Q4 2025 roadmap, outdated ROI claims, duplicate content - Fixed all internal links - Clean heading hierarchy (H2 for main sections only) Closes #233 Co-authored-by: Leo <leo@openclaw.ai> * fix: enhance 5 skills with scripts, references, and Anthropic best practices (#248) * fix(skill): enhance git-worktree-manager with scripts, references, and Anthropic best practices * fix(skill): enhance mcp-server-builder with scripts, references, and Anthropic best practices * fix(skill): enhance changelog-generator with scripts, references, and Anthropic best practices * fix(skill): enhance ci-cd-pipeline-builder with scripts, references, and Anthropic best practices * fix(skill): enhance prompt-engineer-toolkit with scripts, references, and Anthropic best practices * docs: update README, CHANGELOG, and plugin metadata * fix: correct marketing plugin count, expand thin 
references --------- Co-authored-by: Leo <leo@openclaw.ai> --------- Co-authored-by: Leo <leo@openclaw.ai> * Dev (#250) * docs: restructure README.md — 2,539 → 209 lines (#247) - Cut from 2,539 lines / 73 sections to 209 lines / 18 sections - Consolidated 4 install methods into one unified section - Moved all skill details to domain-level READMEs (linked from table) - Front-loaded value prop and keywords for SEO - Added POWERFUL tier highlight section - Added skill-security-auditor showcase section - Removed stale Q4 2025 roadmap, outdated ROI claims, duplicate content - Fixed all internal links - Clean heading hierarchy (H2 for main sections only) Closes #233 Co-authored-by: Leo <leo@openclaw.ai> * fix: enhance 5 skills with scripts, references, and Anthropic best practices (#248) * fix(skill): enhance git-worktree-manager with scripts, references, and Anthropic best practices * fix(skill): enhance mcp-server-builder with scripts, references, and Anthropic best practices * fix(skill): enhance changelog-generator with scripts, references, and Anthropic best practices * fix(skill): enhance ci-cd-pipeline-builder with scripts, references, and Anthropic best practices * fix(skill): enhance prompt-engineer-toolkit with scripts, references, and Anthropic best practices * docs: update README, CHANGELOG, and plugin metadata * fix: correct marketing plugin count, expand thin references --------- Co-authored-by: Leo <leo@openclaw.ai> --------- Co-authored-by: Leo <leo@openclaw.ai> * ci: add VirusTotal security scan for skills - Scans changed skill directories on PRs to dev/main - Scans all skills on release publish - Posts scan results as PR comment with analysis links - Rate-limited to 4 req/min (free tier compatible) - Appends VirusTotal links to release body on publish * fix: resolve YAML lint errors in virustotal workflow - Add document start marker (---) - Quote 'on' key for truthy lint rule - Remove trailing spaces - Break long lines under 160 char limit --------- 
Co-authored-by: Baptiste Fernandez <fernandez.baptiste1@gmail.com> Co-authored-by: alirezarezvani <5697919+alirezarezvani@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Leo <leo@leo-agent-server> Co-authored-by: Leo <leo@openclaw.ai> * feat: add playwright-pro plugin — production-grade Playwright testing toolkit (#254) Complete Claude Code plugin with: - 9 skills (/pw:init, generate, review, fix, migrate, coverage, testrail, browserstack, report) - 3 specialized agents (test-architect, test-debugger, migration-planner) - 55 test case templates across 11 categories (auth, CRUD, checkout, search, forms, dashboard, settings, onboarding, notifications, API, accessibility) - TestRail MCP server (TypeScript) — 8 tools for bidirectional sync - BrowserStack MCP server (TypeScript) — 7 tools for cross-browser testing - Smart hooks (auto-validate tests, auto-detect Playwright projects) - 6 curated reference docs (golden rules, locators, assertions, fixtures, pitfalls, flaky tests) - Leverages Claude Code built-ins (/batch, /debug, Explore subagent) - Zero-config for core features; TestRail/BrowserStack via env vars - Both TypeScript and JavaScript support throughout Co-authored-by: Leo <leo@openclaw.ai> * feat: add playwright-pro to marketplace registry (#256) - New plugin: playwright-pro (9 skills, 3 agents, 55 templates, 2 MCP servers) - Install: /plugin install playwright-pro@claude-code-skills - Total marketplace plugins: 17 Co-authored-by: Leo <leo@openclaw.ai> * fix: integrate playwright-pro across all platforms (#258) - Add root SKILL.md for OpenClaw and ClawHub compatibility - Add to README: Skills Overview table, install section, badge count - Regenerate .codex/skills-index.json with playwright-pro entry - Add .codex/skills/playwright-pro symlink for Codex CLI - Fix YAML frontmatter (single-line description for index parsing) Platforms verified: - Claude Code: marketplace.json ✅ (merged in PR #256) - Codex CLI: 
symlink + skills-index.json ✅ - OpenClaw: SKILL.md auto-discovered by install script ✅ - ClawHub: published as playwright-pro@1.1.0 ✅ Co-authored-by: Leo <leo@openclaw.ai> * docs: update CLAUDE.md — reflect 87 skills across 9 domains Sync CLAUDE.md with actual repository state: add Engineering POWERFUL tier (25 skills), update all skill counts, add plugin registry references, and replace stale sprint section with v2.0.0 version info. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: mention Claude Code in project description Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add self-improving-agent plugin — auto-memory curation for Claude Code (#260) New plugin: engineering-team/self-improving-agent/ - 5 skills: /si:review, /si:promote, /si:extract, /si:status, /si:remember - 2 agents: memory-analyst, skill-extractor - 1 hook: PostToolUse error capture (zero overhead on success) - 3 reference docs: memory architecture, promotion rules, rules directory patterns - 2 templates: rule template, skill template - 20 files, 1,829 lines Integrates natively with Claude Code's auto-memory (v2.1.32+). Reads from ~/.claude/projects/<path>/memory/ — no duplicate storage. Promotes proven patterns from MEMORY.md to CLAUDE.md or .claude/rules/. Also: - Added to marketplace.json (18 plugins total) - Added to README (Skills Overview + install section) - Updated badge count to 88+ - Regenerated .codex/skills-index.json + symlink Co-authored-by: Leo <leo@openclaw.ai> * feat: C-Suite expansion — 8 new executive advisory roles (2→10) (#264) * feat: C-Suite expansion — 8 new executive advisory roles Add COO, CPO, CMO, CFO, CRO, CISO, CHRO advisors and Executive Mentor. Expands C-level advisory from 2 to 10 roles with 74 total files. 
Each role includes: - SKILL.md (lean, <5KB, ~1200 tokens for context efficiency) - Reference docs (loaded on demand, not at startup) - Python analysis scripts (stdlib only, runnable CLI) Executive Mentor features /em: slash commands (challenge, board-prep, hard-call, stress-test, postmortem) with devil's advocate agent. 21 Python tools, 24 reference frameworks, 28,379 total lines. All SKILL.md files combined: ~17K tokens (8.5% of 200K context window). Badge: 88 → 116 skills * feat: C-Suite orchestration layer + 18 complementary skills ORCHESTRATION (new): - cs-onboard: Founder interview → company-context.md - chief-of-staff: Routing, synthesis, inter-agent orchestration - board-meeting: 6-phase multi-agent deliberation protocol - decision-logger: Two-layer memory (raw transcripts + approved decisions) - agent-protocol: Inter-agent invocation with loop prevention - context-engine: Company context loading + anonymization CROSS-CUTTING CAPABILITIES (new): - board-deck-builder: Board/investor update assembly - scenario-war-room: Cascading multi-variable what-if modeling - competitive-intel: Systematic competitor tracking + battlecards - org-health-diagnostic: Cross-functional health scoring (8 dimensions) - ma-playbook: M&A strategy (acquiring + being acquired) - intl-expansion: International market entry frameworks CULTURE & COLLABORATION (new): - culture-architect: Values → behaviors, culture code, health assessment - company-os: EOS/Scaling Up operating system selection + implementation - founder-coach: Founder development, delegation, blind spots - strategic-alignment: Strategy cascade, silo detection, alignment scoring - change-management: ADKAR-based change rollout framework - internal-narrative: One story across employees/investors/customers UPGRADES TO EXISTING ROLES: - All 10 roles get reasoning technique directives - All 10 roles get company-context.md integration - All 10 roles get board meeting isolation rules - CEO gets stage-adaptive temporal horizons 
(seed→C) Key design decisions: - Two-layer memory prevents hallucinated consensus from rejected ideas - Phase 2 isolation: agents think independently before cross-examination - Executive Mentor (The Critic) sees all perspectives, others don't - 25 Python tools total (stdlib only, no dependencies) 52 new files, 10 modified, 10,862 new lines. Total C-suite ecosystem: 134 files, 39,131 lines. * fix: connect all dots — Chief of Staff routes to all 28 skills - Added complementary skills registry to routing-matrix.md - Chief of Staff SKILL.md now lists all 28 skills in ecosystem - Added integration tables to scenario-war-room and competitive-intel - Badge: 116 → 134 skills - README: C-Level Advisory count 10 → 28 Quality audit passed: ✅ All 10 roles: company-context, reasoning, isolation, invocation ✅ All 6 phases in board meeting ✅ Two-layer memory with DO_NOT_RESURFACE ✅ Loop prevention (no self-invoke, max depth 2, no circular) ✅ All /em: commands present ✅ All complementary skills cross-reference roles ✅ Chief of Staff routes to every skill in ecosystem * refactor: CEO + CTO advisors upgraded to C-suite parity Both roles now match the structural standard of all new roles: - CEO: 11.7KB → 6.8KB SKILL.md (heavy content stays in references) - CTO: 10KB → 7.2KB SKILL.md (heavy content stays in references) Added to both: - Integration table (who they work with and when) - Key diagnostic questions - Structured metrics dashboard table - Consistent section ordering (Keywords → Quick Start → Responsibilities → Questions → Metrics → Red Flags → Integration → Reasoning → Context) CEO additions: - Stage-adaptive temporal horizons (seed=3m/6m/12m → B+=1y/3y/5y) - Cross-references to culture-architect and board-deck-builder CTO additions: - Key Questions section (7 diagnostic questions) - Structured metrics table (DORA + debt + team + architecture + cost) - Cross-references to all peer roles All 10 roles now pass structural parity: ✅ Keywords ✅ QuickStart ✅ Questions ✅ Metrics ✅ 
RedFlags ✅ Integration * feat: add proactive triggers + output artifacts to all 10 roles Every C-suite role now specifies: - Proactive Triggers: 'surface these without being asked' — context-driven early warnings that make advisors proactive, not reactive - Output Artifacts: concrete deliverables per request type (what you ask → what you get) CEO: runway alerts, board prep triggers, strategy review nudges CTO: deploy frequency monitoring, tech debt thresholds, bus factor flags COO: blocker detection, scaling threshold warnings, cadence gaps CPO: retention curve monitoring, portfolio dog detection, research gaps CMO: CAC trend monitoring, positioning gaps, budget staleness CFO: runway forecasting, burn multiple alerts, scenario planning gaps CRO: NRR monitoring, pipeline coverage, pricing review triggers CISO: audit overdue alerts, compliance gaps, vendor risk CHRO: retention risk, comp band gaps, org scaling thresholds Executive Mentor: board prep triggers, groupthink detection, hard call surfacing This transforms the C-suite from reactive advisors into proactive partners. * feat: User Communication Standard — structured output for all roles Defines 3 output formats in agent-protocol/SKILL.md: 1. Standard Output: Bottom Line → What → Why → How to Act → Risks → Your Decision 2. Proactive Alert: What I Noticed → Why It Matters → Action → Urgency (🔴🟡⚪) 3. Board Meeting: Decision Required → Perspectives → Agree/Disagree → Critic → Action Items 10 non-negotiable rules: - Bottom line first, always - Results and decisions only (no process narration) - What + Why + How for every finding - Actions have owners and deadlines ('we should consider' is banned) - Decisions framed as options with trade-offs - Founder is the highest authority — roles recommend, founder decides - Risks are concrete (if X → Y, costs $Z) - Max 5 bullets per section - No jargon without explanation - Silence over fabricated updates All 10 roles reference this standard. 
Chief of Staff enforces it as a quality gate. Board meeting Phase 4 uses the Board Meeting Output format. * feat: Internal Quality Loop — verification before delivery No role presents to the founder without passing verification: Step 1: Self-Verification (every role, every time) - Source attribution: where did each data point come from? - Assumption audit: [VERIFIED] vs [ASSUMED] tags on every finding - Confidence scoring: 🟢 high / 🟡 medium / 🔴 low per finding - Contradiction check against company-context + decision log - 'So what?' test: every finding needs a business consequence Step 2: Peer Verification (cross-functional) - Financial claims → CFO validates math - Revenue projections → CRO validates pipeline backing - Technical feasibility → CTO validates - People/hiring impact → CHRO validates - Skip for single-domain, low-stakes questions Step 3: Critic Pre-Screen (high-stakes only) - Irreversible decisions, >20% runway impact, strategy changes - Executive Mentor finds weakest point before founder sees it - Suspicious consensus triggers mandatory pre-screen Step 4: Course Correction (after founder feedback) - Approve → log + assign actions - Modify → re-verify changed parts - Reject → DO_NOT_RESURFACE + learn why - 30/60/90 day post-decision review Board meeting contributions now require self-verified format with confidence tags and source attribution on every finding. 
* fix: resolve PR review issues 1, 4, and minor observation Issue 1: c-level-advisor/CLAUDE.md — completely rewritten - Was: 2 skills (CEO, CTO only), dated Nov 2025 - Now: full 28-skill ecosystem map with architecture diagram, all roles/orchestration/cross-cutting/culture skills listed, design decisions, integration with other domains Issue 4: Root CLAUDE.md — updated all stale counts - 87 → 134 skills across all 3 references - C-Level: 2 → 33 (10 roles + 5 mentor commands + 18 complementary) - Tool count: 160+ → 185+ - Reference count: 200+ → 250+ Minor observation: Documented plugin.json convention - Explained in c-level-advisor/CLAUDE.md that only executive-mentor has plugin.json because only it has slash commands (/em: namespace) - Other skills are invoked by name through Chief of Staff or directly Also fixed: README.md 88+ → 134 in two places (first line + skills section) * fix: update all plugin/index registrations for 28-skill C-suite 1. c-level-advisor/.claude-plugin/plugin.json — v2.0.0 - Was: 2 skills, generic description - Now: all 28 skills listed with descriptions, all 25 scripts, namespace 'cs', full ecosystem description 2. .codex/skills-index.json — added 18 complementary skills - Was: 10 roles only - Now: 28 total c-level entries (10 roles + 6 orchestration + 6 cross-cutting + 6 culture) - Each with full description for skill discovery 3. 
.claude-plugin/marketplace.json — updated c-level-skills entry - Was: generic 2-skill description - Now: v2.0.0, full 28-skill ecosystem description, skills_count: 28, scripts_count: 25 * feat: add root SKILL.md for c-level-advisor ClawHub package --------- Co-authored-by: Leo <leo@openclaw.ai> * chore: sync codex skills symlinks [automated] --------- Co-authored-by: Leo <leo@openclaw.ai> Co-authored-by: Baptiste Fernandez <fernandez.baptiste1@gmail.com> Co-authored-by: alirezarezvani <5697919+alirezarezvani@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Leo <leo@leo-agent-server>
743 lines
29 KiB
Python
#!/usr/bin/env python3
"""
Churn & Retention Analyzer
===========================
Customer-level churn and Net Revenue Retention (NRR) analysis for B2B SaaS.

Calculates:
- Gross Revenue Retention (GRR) and Net Revenue Retention (NRR)
- Monthly and annual churn rates (logo + revenue)
- Cohort-based retention curves
- At-risk account identification
- Expansion revenue segmentation
- ARR waterfall (new / expansion / contraction / churn)

Usage:
    python churn_analyzer.py
    python churn_analyzer.py --csv customers.csv
    python churn_analyzer.py --period 2026-Q1 --output summary

Input format (CSV):
    customer_id, name, segment, arr, start_date, [churn_date], [expansion_arr], [contraction_arr]

Stdlib only. No dependencies.
"""

import csv
import sys
import json
import argparse
import statistics
from datetime import date, datetime, timedelta
from collections import defaultdict
from io import StringIO
from itertools import groupby


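The input format named in the docstring can be tried out with a small in-memory file. The sketch below is illustrative only: the sample rows and the `read_rows` helper are hypothetical and not part of the script, and it assumes the bracketed columns are optional and may be left empty.

```python
import csv
from io import StringIO

# Hypothetical sample data; column names follow the docstring's input format.
SAMPLE_CSV = """customer_id,name,segment,arr,start_date,churn_date,expansion_arr,contraction_arr
C001,Acme Corp,Enterprise,120000,2024-01-15,,15000,0
C002,Globex,Mid-Market,48000,2024-03-01,2025-02-28,0,0
C003,Initech,SMB,12000,2024-06-10,,0,2000
"""


def read_rows(text):
    """Parse CSV text into dicts, turning empty optional cells into None."""
    reader = csv.DictReader(StringIO(text), skipinitialspace=True)
    rows = []
    for raw in reader:
        # Empty strings (e.g. a missing churn_date) become None.
        row = {key: ((value or "").strip() or None) for key, value in raw.items()}
        row["arr"] = float(row["arr"])
        rows.append(row)
    return rows


rows = read_rows(SAMPLE_CSV)
for row in rows:
    status = "churned" if row["churn_date"] else "active"
    print(f"{row['customer_id']}: {row['arr']:,.0f} ARR, {status}")
```

An empty `churn_date` cell is what marks an account as still active, so it is worth normalising blanks to `None` before building customer objects.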
# ---------------------------------------------------------------------------
# Data model
# ---------------------------------------------------------------------------

class Customer:
    def __init__(self, customer_id, name, segment, arr, start_date,
                 churn_date=None, expansion_arr=0.0, contraction_arr=0.0,
                 health_score=None):
        self.customer_id = customer_id
        self.name = name
        self.segment = segment
        self.arr = float(arr)
        self.start_date = self._parse_date(start_date)
        self.churn_date = self._parse_date(churn_date) if churn_date else None
        self.expansion_arr = float(expansion_arr or 0)
        self.contraction_arr = float(contraction_arr or 0)
        self.health_score = float(health_score) if health_score else None

    @staticmethod
    def _parse_date(value):
        if not value or str(value).strip() in ("", "None", "null"):
            return None
        for fmt in ("%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y", "%Y/%m/%d"):
            try:
                return datetime.strptime(str(value).strip(), fmt).date()
            except ValueError:
                continue
        raise ValueError(f"Cannot parse date: {value!r}")

    def is_churned(self):
        return self.churn_date is not None

    def is_active(self, as_of=None):
        as_of = as_of or date.today()
        if self.churn_date and self.churn_date <= as_of:
            return False
        return self.start_date <= as_of

    def tenure_days(self, as_of=None):
        as_of = as_of or date.today()
        end = self.churn_date if self.churn_date else as_of
        return (end - self.start_date).days

    def tenure_months(self, as_of=None):
        return self.tenure_days(as_of) / 30.44

    def cohort_month(self):
        """Acquisition cohort: YYYY-MM of start_date."""
        return self.start_date.strftime("%Y-%m")

    def cohort_quarter(self):
        q = (self.start_date.month - 1) // 3 + 1
        return f"Q{q} {self.start_date.year}"

    def net_arr(self):
        """Current ARR + expansion - contraction."""
        return self.arr + self.expansion_arr - self.contraction_arr

    def days_since_acquisition(self, as_of=None):
        as_of = as_of or date.today()
        return (as_of - self.start_date).days


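The date handling in `Customer._parse_date` tries a fixed list of formats in order, which has a consequence worth seeing: a slash-separated date that is valid as both month/day and day/month resolves to month/day, because that format is tried first. The standalone sketch below mirrors that loop; `parse_flexible_date` and `quarter_label` are hypothetical names used only for illustration.

```python
from datetime import date, datetime

# Same format order as Customer._parse_date: the first match wins.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y", "%Y/%m/%d")


def parse_flexible_date(value):
    """Try each known format in order and return the first successful parse."""
    text = str(value).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Cannot parse date: {value!r}")


def quarter_label(d):
    """Calendar quarter label in the same style as Customer.cohort_quarter."""
    return f"Q{(d.month - 1) // 3 + 1} {d.year}"


ambiguous = parse_flexible_date("03/04/2025")    # read as March 4, not April 3
unambiguous = parse_flexible_date("25/12/2025")  # month 25 is invalid, falls through to day/month
print(ambiguous.isoformat(), unambiguous.isoformat(), quarter_label(unambiguous))
```

If the source data uses day/month ordering, exporting dates as `YYYY-MM-DD` avoids the ambiguity altogether.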
# ---------------------------------------------------------------------------
# Core metrics
# ---------------------------------------------------------------------------

class RetentionAnalyzer:
    def __init__(self, customers, as_of=None):
        self.customers = customers
        self.as_of = as_of or date.today()

    def active_customers(self, as_of=None):
        as_of = as_of or self.as_of
        return [c for c in self.customers if c.is_active(as_of)]

    def churned_customers(self, start=None, end=None):
        """Customers who churned in [start, end]."""
        result = []
        for c in self.customers:
            if not c.churn_date:
                continue
            if start and c.churn_date < start:
                continue
            if end and c.churn_date > end:
                continue
            result.append(c)
        return result

    def arr_waterfall(self, period_start, period_end):
        """
        Calculate ARR waterfall for a given period.
        Returns dict with opening_arr, new_arr, expansion_arr, contraction_arr,
        churned_arr, closing_arr, nrr, grr.
        """
        # Opening: active at period start
        opening_customers = [c for c in self.customers if c.is_active(period_start)]
        opening_arr = sum(c.arr for c in opening_customers)
        opening_ids = {c.customer_id for c in opening_customers}

        # New: started during the period
        new_customers = [
            c for c in self.customers
            if period_start < c.start_date <= period_end
        ]
        new_arr = sum(c.arr for c in new_customers)

        # Churned: were active at start, churn_date within period
        churned = [
            c for c in opening_customers
            if c.churn_date and period_start < c.churn_date <= period_end
        ]
        churned_arr = sum(c.arr for c in churned)

        # Expansion and contraction: from customers active at opening
        expansion = sum(
            c.expansion_arr for c in opening_customers
            if not c.is_churned() or (c.churn_date and c.churn_date > period_end)
        )
        contraction = sum(
            c.contraction_arr for c in opening_customers
            if not c.is_churned() or (c.churn_date and c.churn_date > period_end)
        )

        closing_arr = opening_arr + new_arr + expansion - contraction - churned_arr

        grr = (opening_arr - contraction - churned_arr) / opening_arr if opening_arr else 0
        nrr = (opening_arr + expansion - contraction - churned_arr) / opening_arr if opening_arr else 0

        return {
            "period_start": period_start.isoformat(),
            "period_end": period_end.isoformat(),
            "opening_arr": opening_arr,
            "new_arr": new_arr,
            "expansion_arr": expansion,
            "contraction_arr": contraction,
            "churned_arr": churned_arr,
            "closing_arr": closing_arr,
            "net_new_arr": new_arr + expansion - contraction - churned_arr,
            "grr": max(0.0, grr),
            "nrr": max(0.0, nrr),
        }

    def logo_churn_rate(self, period_start, period_end):
        """Logo churn rate for a period."""
        opening = [c for c in self.customers if c.is_active(period_start)]
        churned = [
            c for c in opening
            if c.churn_date and period_start < c.churn_date <= period_end
        ]
        return len(churned) / len(opening) if opening else 0.0

    def revenue_churn_rate(self, period_start, period_end):
        """Gross revenue churn rate for a period."""
        opening = [c for c in self.customers if c.is_active(period_start)]
        opening_arr = sum(c.arr for c in opening)
        churned = [
            c for c in opening
            if c.churn_date and period_start < c.churn_date <= period_end
        ]
        churned_arr = sum(c.arr for c in churned)
        # Count contraction only for accounts retained through the period:
        # a churned account's full ARR is already included in churned_arr.
        # This keeps the rate consistent with the GRR from arr_waterfall().
        churned_ids = {c.customer_id for c in churned}
        contraction = sum(
            c.contraction_arr for c in opening
            if c.customer_id not in churned_ids
        )
        return (churned_arr + contraction) / opening_arr if opening_arr else 0.0


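The GRR and NRR arithmetic used by `arr_waterfall` is easier to check with round numbers. The sketch below restates the two formulas as a standalone function; `retention_ratios` is a hypothetical helper and the dollar figures are made up for illustration.

```python
def retention_ratios(opening_arr, expansion, contraction, churned_arr):
    """Return (GRR, NRR) using the same formulas as arr_waterfall."""
    if not opening_arr:
        return 0.0, 0.0
    # GRR ignores expansion: it can never exceed 100%.
    grr = (opening_arr - contraction - churned_arr) / opening_arr
    # NRR adds expansion back in, so it can exceed 100%.
    nrr = (opening_arr + expansion - contraction - churned_arr) / opening_arr
    return max(0.0, grr), max(0.0, nrr)


# Illustrative period: $1.0M opening ARR, $150K expansion,
# $30K contraction, $70K churned, $200K from new customers.
opening, expansion, contraction, churned, new = 1_000_000, 150_000, 30_000, 70_000, 200_000

grr, nrr = retention_ratios(opening, expansion, contraction, churned)
closing = opening + new + expansion - contraction - churned

print(f"GRR {grr:.1%}, NRR {nrr:.1%}, closing ARR {closing:,}")
```

New-customer ARR affects the closing balance but not GRR or NRR, since both ratios only track revenue from accounts that were already active at the start of the period.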
# ---------------------------------------------------------------------------
# Cohort analysis
# ---------------------------------------------------------------------------

class CohortAnalyzer:
    def __init__(self, customers):
        self.customers = customers

    def build_cohorts(self):
        """Group customers by acquisition cohort (month)."""
        cohorts = defaultdict(list)
        for c in self.customers:
            cohorts[c.cohort_month()].append(c)
        return dict(sorted(cohorts.items()))

    def retention_at_month(self, cohort_customers, months_after):
        """
        What fraction of cohort ARR remains `months_after` months after acquisition?
        """
        if not cohort_customers:
            return None

        opening_arr = sum(c.arr for c in cohort_customers)
        if opening_arr == 0:
            return None

        earliest_start = min(c.start_date for c in cohort_customers)
        check_date = earliest_start + timedelta(days=int(months_after * 30.44))

        if check_date > date.today():
            return None  # Future: no data yet

        retained_arr = sum(
            c.arr for c in cohort_customers
            if c.is_active(check_date)
        )
        return retained_arr / opening_arr

def retention_curve(self, cohort_customers, max_months=24):
|
|
"""Return retention at months 0, 3, 6, 9, 12, 18, 24."""
|
|
checkpoints = [0, 3, 6, 9, 12, 18, 24]
|
|
checkpoints = [m for m in checkpoints if m <= max_months]
|
|
curve = {}
|
|
for m in checkpoints:
|
|
rate = self.retention_at_month(cohort_customers, m)
|
|
if rate is not None:
|
|
curve[m] = rate
|
|
return curve
|
|
|
|
def cohort_report(self):
|
|
"""Returns dict: cohort → {size, opening_arr, retention_curve}."""
|
|
cohorts = self.build_cohorts()
|
|
report = {}
|
|
for cohort_month, customers in cohorts.items():
|
|
curve = self.retention_curve(customers)
|
|
report[cohort_month] = {
|
|
"customer_count": len(customers),
|
|
"opening_arr": sum(c.arr for c in customers),
|
|
"churned_count": sum(1 for c in customers if c.is_churned()),
|
|
"current_retention": curve.get(12, curve.get(max(curve.keys()) if curve else 0)),
|
|
"retention_curve": curve,
|
|
}
|
|
return report
|
|
|
|
    def identify_at_risk(self, tenure_months_max=6, health_threshold=60):
        """
        Identify at-risk customers based on:
        - Low health score (if available)
        - Short tenure (haven't proved long-term value)
        - High contraction signals
        - No expansion after 12+ months
        """
        at_risk = []
        for c in self.customers:
            if c.is_churned():
                continue
            reasons = []
            score = 0

            # Health score signal
            if c.health_score is not None and c.health_score < health_threshold:
                reasons.append(f"Health score {c.health_score:.0f} < {health_threshold}")
                score += 40

            # Early tenure risk
            tenure = c.tenure_months()
            if tenure < tenure_months_max:
                reasons.append(f"Tenure {tenure:.1f} months (< {tenure_months_max})")
                score += 20

            # Contraction signal
            if c.contraction_arr > 0:
                # Guard against zero-ARR accounts to avoid ZeroDivisionError
                contraction_pct = c.contraction_arr / c.arr if c.arr else 0.0
                reasons.append(f"Contraction {contraction_pct:.0%} of ARR")
                score += 30

            # No expansion in mature account
            if tenure > 12 and c.expansion_arr == 0:
                reasons.append("No expansion after 12+ months (stagnant)")
                score += 10

            if score > 0:
                at_risk.append({
                    "customer_id": c.customer_id,
                    "name": c.name,
                    "segment": c.segment,
                    "arr": c.arr,
                    "tenure_months": round(tenure, 1),
                    "health_score": c.health_score,
                    "risk_score": score,
                    "risk_reasons": reasons,
                })

        # Highest risk score first
        return sorted(at_risk, key=lambda x: -x["risk_score"])


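# Illustrative aside (not part of the analyzer): a minimal worked example of
# the cohort ARR retention calculation, using the same 30.44 days-per-month
# approximation as retention_at_month. The tuple layout, the `arr_retention`
# helper, and the activity rule are assumptions made for this sketch; `today`
# is passed in explicitly so the result is reproducible.

```python
from datetime import date, timedelta

AVG_DAYS_PER_MONTH = 30.44  # approximation, as in retention_at_month


def arr_retention(cohort, months_after, today):
    """cohort: list of (arr, start_date, churn_date_or_None) tuples.

    Returns retained ARR / opening ARR at the checkpoint, or None when the
    checkpoint lies in the future or the cohort has no ARR.
    """
    opening_arr = sum(arr for arr, _, _ in cohort)
    if opening_arr == 0:
        return None
    earliest = min(start for _, start, _ in cohort)
    check_date = earliest + timedelta(days=int(months_after * AVG_DAYS_PER_MONTH))
    if check_date > today:
        return None
    retained = sum(
        arr for arr, start, churn in cohort
        # Assumed activity rule: started by the checkpoint and not yet churned
        if start <= check_date and (churn is None or churn > check_date)
    )
    return retained / opening_arr


cohort = [
    (60_000, date(2023, 1, 5), None),
    (30_000, date(2023, 1, 10), date(2023, 5, 1)),
    (10_000, date(2023, 1, 20), date(2023, 11, 1)),
]
today = date(2024, 6, 1)  # fixed reference date for the example
curve = {m: arr_retention(cohort, m, today) for m in (3, 6, 12, 24)}
print(curve)
```

# Here the month-3 checkpoint retains all 100K, month 6 retains 70K, month 12
# retains 60K, and month 24 is still in the future relative to `today`.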
# ---------------------------------------------------------------------------
# Expansion analysis
# ---------------------------------------------------------------------------

class ExpansionAnalyzer:
    def __init__(self, customers):
        self.customers = customers

    def expansion_summary(self):
        active = [c for c in self.customers if not c.is_churned()]
        expanding = [c for c in active if c.expansion_arr > 0]
        contracting = [c for c in active if c.contraction_arr > 0]

        total_arr = sum(c.arr for c in active)
        total_expansion = sum(c.expansion_arr for c in active)
        total_contraction = sum(c.contraction_arr for c in active)

        return {
            "active_customers": len(active),
            "total_arr": total_arr,
            "expanding_count": len(expanding),
            "contracting_count": len(contracting),
            "expansion_arr": total_expansion,
            "contraction_arr": total_contraction,
            "expansion_rate": total_expansion / total_arr if total_arr else 0,
            "contraction_rate": total_contraction / total_arr if total_arr else 0,
            "net_expansion_rate": (total_expansion - total_contraction) / total_arr if total_arr else 0,
        }

    def expansion_by_segment(self):
        active = [c for c in self.customers if not c.is_churned()]
        by_segment = defaultdict(lambda: {"arr": 0.0, "expansion": 0.0,
                                          "contraction": 0.0, "count": 0})
        for c in active:
            seg = c.segment or "Unspecified"
            by_segment[seg]["arr"] += c.arr
            by_segment[seg]["expansion"] += c.expansion_arr
            by_segment[seg]["contraction"] += c.contraction_arr
            by_segment[seg]["count"] += 1

        result = {}
        for seg, data in by_segment.items():
            arr = data["arr"]
            result[seg] = {
                "customer_count": data["count"],
                "arr": arr,
                "expansion_arr": data["expansion"],
                "contraction_arr": data["contraction"],
                "expansion_rate": data["expansion"] / arr if arr else 0,
                "net_nrr_contribution": (arr + data["expansion"] - data["contraction"]) / arr if arr else 0,
            }
        return result

    def top_expansion_candidates(self, min_tenure_months=6, min_arr=5000):
        """
        Customers who are active, healthy tenure, but have zero expansion.
        These are upsell/expansion targets.
        """
        active = [c for c in self.customers if not c.is_churned()]
        candidates = []
        for c in active:
            tenure = c.tenure_months()
            if (tenure >= min_tenure_months
                    and c.arr >= min_arr
                    and c.expansion_arr == 0
                    and (c.health_score is None or c.health_score >= 60)):
                candidates.append({
                    "customer_id": c.customer_id,
                    "name": c.name,
                    "segment": c.segment,
                    "arr": c.arr,
                    "tenure_months": round(tenure, 1),
                    "health_score": c.health_score,
                })
        return sorted(candidates, key=lambda x: -x["arr"])


# ---------------------------------------------------------------------------
# Reporting
# ---------------------------------------------------------------------------

def fmt_currency(value):
    """Format a dollar amount compactly, e.g. $1.25M, $45.0K, $950."""
    # Scale on the magnitude so negative amounts (e.g. negative net new ARR)
    # are abbreviated too instead of falling through to the raw format.
    sign = "-" if value < 0 else ""
    magnitude = abs(value)
    if magnitude >= 1_000_000:
        return f"{sign}${magnitude / 1_000_000:.2f}M"
    if magnitude >= 1_000:
        return f"{sign}${magnitude / 1_000:.1f}K"
    return f"{sign}${magnitude:.0f}"


def fmt_pct(value):
    return f"{value * 100:.1f}%"


def nrr_status(nrr):
    if nrr >= 1.20:
        return "✅ World-class"
    if nrr >= 1.10:
        return "✅ Healthy"
    if nrr >= 1.00:
        return "⚠️ Acceptable"
    if nrr >= 0.90:
        return "🔴 Concerning"
    return "🔴 Crisis"


def grr_status(grr):
    if grr >= 0.90:
        return "✅ Strong"
    if grr >= 0.85:
        return "⚠️ Acceptable"
    return "🔴 Below threshold"


def print_header(title):
    width = 70
    print()
    print("=" * width)
    print(f" {title}")
    print("=" * width)


def print_section(title):
    print(f"\n--- {title} ---")


def print_full_report(customers, period_start, period_end):
    analyzer = RetentionAnalyzer(customers, as_of=period_end)
    cohort_analyzer = CohortAnalyzer(customers)
    expansion_analyzer = ExpansionAnalyzer(customers)

    print_header("CHURN & RETENTION ANALYZER")
    print(f" Analysis period: {period_start.isoformat()} → {period_end.isoformat()}")
    print(f" Total customers in dataset: {len(customers)}")
    active = analyzer.active_customers(period_end)
    churned_in_period = analyzer.churned_customers(period_start, period_end)
    print(f" Active at period end: {len(active)}")
    print(f" Churned in period: {len(churned_in_period)}")

    # ── ARR Waterfall
    print_section("ARR WATERFALL")
    wf = analyzer.arr_waterfall(period_start, period_end)
    print(f" Opening ARR: {fmt_currency(wf['opening_arr'])}")
    print(f" + New Logo ARR: +{fmt_currency(wf['new_arr'])}")
    print(f" + Expansion ARR: +{fmt_currency(wf['expansion_arr'])}")
    print(f" - Contraction ARR: -{fmt_currency(wf['contraction_arr'])}")
    print(f" - Churned ARR: -{fmt_currency(wf['churned_arr'])}")
    print(f" {'─'*42}")
    print(f" Closing ARR: {fmt_currency(wf['closing_arr'])}")
    print(f" Net New ARR: {'+' if wf['net_new_arr'] >= 0 else ''}{fmt_currency(wf['net_new_arr'])}")

    # ── NRR / GRR
    print_section("RETENTION METRICS")
    nrr = wf["nrr"]
    grr = wf["grr"]
    logo_churn = analyzer.logo_churn_rate(period_start, period_end)
    rev_churn = analyzer.revenue_churn_rate(period_start, period_end)

    print(f" NRR (Net Revenue Retention): {fmt_pct(nrr)} {nrr_status(nrr)}")
    print(f" GRR (Gross Revenue Retention): {fmt_pct(grr)} {grr_status(grr)}")
    print(f" Logo Churn Rate (period): {fmt_pct(logo_churn)}")
    print(f" Revenue Churn Rate (period): {fmt_pct(rev_churn)}")
    if wf["opening_arr"] > 0:
        expansion_rate = wf["expansion_arr"] / wf["opening_arr"]
        print(f" Expansion Rate (period): {fmt_pct(expansion_rate)}")
    print()
    print(" NRR Benchmark: >120% world-class | 100-120% healthy | <100% fix immediately")

    # ── Expansion summary
    print_section("EXPANSION REVENUE")
    exp = expansion_analyzer.expansion_summary()
    if exp["active_customers"]:
        expanding_share = fmt_pct(exp["expanding_count"] / exp["active_customers"])
    else:
        expanding_share = "—"
    print(f" Expanding customers: {exp['expanding_count']} / {exp['active_customers']} ({expanding_share})")
    print(f" Contracting: {exp['contracting_count']} / {exp['active_customers']}")
    print(f" Expansion ARR: {fmt_currency(exp['expansion_arr'])} ({fmt_pct(exp['expansion_rate'])} of base)")
    print(f" Contraction ARR: {fmt_currency(exp['contraction_arr'])}")
    print(f" Net Expansion Rate: {fmt_pct(exp['net_expansion_rate'])}")

    # ── Segment breakdown
    print_section("SEGMENT BREAKDOWN (NRR Components)")
    seg_data = expansion_analyzer.expansion_by_segment()
    col_w = [18, 8, 12, 10, 10, 10]
    h = (f" {'Segment':<{col_w[0]}} {'Custs':>{col_w[1]}} {'ARR':>{col_w[2]}} "
         f"{'Expansion':>{col_w[3]}} {'Contraction':>{col_w[4]}} {'NRR':>{col_w[5]}}")
    print(h)
    print(" " + "-" * (sum(col_w) + 5))
    for seg, data in sorted(seg_data.items(), key=lambda x: -x[1]["arr"]):
        print(f" {seg:<{col_w[0]}} {data['customer_count']:>{col_w[1]}} "
              f"{fmt_currency(data['arr']):>{col_w[2]}} "
              f"{fmt_currency(data['expansion_arr']):>{col_w[3]}} "
              f"{fmt_currency(data['contraction_arr']):>{col_w[4]}} "
              f"{fmt_pct(data['net_nrr_contribution']):>{col_w[5]}}")

    # ── Cohort retention
    print_section("COHORT RETENTION CURVES")
    cohort_report = cohort_analyzer.cohort_report()
    print(f" {'Cohort':<10} {'Custs':>6} {'Opening ARR':>13} {'Mo.3':>8} {'Mo.6':>8} {'Mo.12':>8}")
    print(" " + "-" * 57)
    for cohort, data in cohort_report.items():
        curve = data["retention_curve"]
        m3 = fmt_pct(curve[3]) if 3 in curve else " —"
        m6 = fmt_pct(curve[6]) if 6 in curve else " —"
        m12 = fmt_pct(curve[12]) if 12 in curve else " —"
        print(f" {cohort:<10} {data['customer_count']:>6} "
              f"{fmt_currency(data['opening_arr']):>13} "
              f"{m3:>8} {m6:>8} {m12:>8}")

    # ── At-risk accounts
    print_section("AT-RISK ACCOUNTS")
    at_risk = cohort_analyzer.identify_at_risk()
    if at_risk:
        print(f" {'Customer':<22} {'Segment':<14} {'ARR':>10} {'Tenure':>8} {'Risk':>6} Reason")
        print(" " + "-" * 80)
        for acct in at_risk[:10]:  # Top 10
            reason_short = acct["risk_reasons"][0] if acct["risk_reasons"] else ""
            tenure_str = f"{acct['tenure_months']}mo"
            print(f" {acct['name']:<22} {acct['segment']:<14} "
                  f"{fmt_currency(acct['arr']):>10} {tenure_str:>8} "
                  f"{acct['risk_score']:>5} {reason_short}")
        if len(at_risk) > 10:
            print(f" ... and {len(at_risk) - 10} more at-risk accounts")
    else:
        print(" ✅ No at-risk accounts identified")

    # ── Expansion candidates
    print_section("EXPANSION CANDIDATES (no expansion yet, healthy tenure)")
    candidates = expansion_analyzer.top_expansion_candidates()
    if candidates:
        print(f" {'Customer':<22} {'Segment':<14} {'ARR':>10} {'Tenure':>8} Action")
        print(" " + "-" * 70)
        for c in candidates[:8]:
            action = "Upsell review" if c["arr"] > 20000 else "Seat expansion call"
            tenure_str = f"{c['tenure_months']}mo"
            print(f" {c['name']:<22} {c['segment']:<14} "
                  f"{fmt_currency(c['arr']):>10} {tenure_str:>8} {action}")
    else:
        print(" ✅ All eligible accounts have expansion in motion")

    # ── Red flags
    print_section("HEALTH FLAGS")
    flags = []
    if nrr < 1.0:
        flags.append("🔴 NRR below 100% — revenue base is shrinking. Fix before scaling sales.")
    if grr < 0.85:
        flags.append(f"🔴 GRR {fmt_pct(grr)} — gross retention below 85% threshold. Churn is a product/CS problem.")
    if logo_churn > 0.05:
        flags.append(f"⚠️ Logo churn {fmt_pct(logo_churn)} this period — run cohort analysis to find the pattern.")
    if exp["expansion_rate"] < 0.10 and exp["active_customers"] > 10:
        flags.append("⚠️ Expansion rate below 10% — upsell motion is weak or non-existent.")
    churned_arr_pct = wf["churned_arr"] / wf["opening_arr"] if wf["opening_arr"] else 0
    if churned_arr_pct > 0.10:
        flags.append(f"🔴 Revenue churn at {fmt_pct(churned_arr_pct)} of opening ARR this period — high urgency.")
    if len(at_risk) > len(active) * 0.20:
        flags.append(f"⚠️ {len(at_risk)} of {len(active)} active accounts flagged at-risk ({fmt_pct(len(at_risk)/len(active) if active else 0)})")

    if flags:
        for flag in flags:
            print(f" {flag}")
    else:
        print(" ✅ No critical health flags")

    print()


# ---------------------------------------------------------------------------
# Sample data
# ---------------------------------------------------------------------------

SAMPLE_CSV = """customer_id,name,segment,arr,start_date,churn_date,expansion_arr,contraction_arr,health_score
C001,Acme Manufacturing,Enterprise,120000,2023-01-15,,45000,0,82
C002,TechStart Inc,Mid-Market,28000,2023-02-01,,8000,0,74
C003,Global Retail Co,Enterprise,250000,2023-01-05,,0,25000,45
C004,MedTech Solutions,Mid-Market,45000,2023-03-10,,15000,0,88
C005,FinServ Holdings,Enterprise,185000,2023-01-20,2023-09-15,0,0,
C006,StartupHub Network,SMB,12000,2023-04-01,,0,3000,55
C007,EduPlatform Inc,Mid-Market,32000,2023-02-15,,10000,0,91
C008,BioLab Analytics,Enterprise,95000,2023-01-10,,20000,0,78
C009,RegionalBank Corp,Enterprise,310000,2023-03-01,,75000,0,85
C010,CloudOps Systems,Mid-Market,38000,2023-05-01,2024-01-10,0,0,
C011,InsurTech Platform,Mid-Market,55000,2023-06-15,,0,0,62
C012,LegalAI Corp,SMB,18000,2023-07-01,,5000,0,79
C013,RetailChain Ltd,Enterprise,140000,2023-04-20,,0,20000,41
C014,DataPipeline Co,Mid-Market,42000,2023-08-01,,12000,0,83
C015,NanoTech Startup,SMB,9500,2023-09-15,2024-02-28,0,0,
C016,MedDevice Corp,Enterprise,220000,2023-02-28,,60000,0,92
C017,ConsultingFirm XYZ,SMB,15000,2023-10-01,,0,5000,38
C018,GovTech Solutions,Enterprise,175000,2023-11-15,,0,0,71
C019,AgriData Systems,Mid-Market,31000,2024-01-10,,8000,0,77
C020,HealthcarePlus,Mid-Market,62000,2024-02-01,,0,0,65
"""


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def load_customers_from_csv(csv_text):
    reader = csv.DictReader(StringIO(csv_text))
    customers = []
    errors = []
    # start=2: row 1 of the file is the CSV header
    for i, row in enumerate(reader, start=2):
        try:
            c = Customer(
                customer_id=row.get("customer_id", f"row_{i}"),
                name=row.get("name", f"Customer {i}"),
                segment=row.get("segment", ""),
                arr=row.get("arr", 0),
                start_date=row.get("start_date", ""),
                churn_date=row.get("churn_date", None) or None,
                expansion_arr=row.get("expansion_arr", 0) or 0,
                contraction_arr=row.get("contraction_arr", 0) or 0,
                health_score=row.get("health_score", None) or None,
            )
            customers.append(c)
        except (ValueError, KeyError) as e:
            errors.append(f" Row {i}: {e}")
    if errors:
        print("⚠️ Skipped rows with errors:")
        for err in errors:
            print(err)
    return customers


def parse_period(period_str):
    """Parse 'YYYY-QN' or 'YYYY-MM' into (start_date, end_date).

    An empty or missing period defaults to the current calendar quarter.
    """
    import calendar  # imported once; every branch below needs it

    if not period_str:
        today = date.today()
        q = (today.month - 1) // 3
        start = date(today.year, q * 3 + 1, 1)
        # End of current quarter. A quarter starts in month 1, 4, 7, or 10,
        # so its last month never rolls over into the next year.
        end_month = start.month + 2
        end_day = calendar.monthrange(start.year, end_month)[1]
        return start, date(start.year, end_month, end_day)

    if "-Q" in period_str:
        year, qpart = period_str.split("-Q")
        year = int(year)
        q = int(qpart)
        if not 1 <= q <= 4:
            raise ValueError(f"Quarter must be 1-4, got {q} in {period_str!r}")
        start_month = (q - 1) * 3 + 1
        end_month = start_month + 2
        start = date(year, start_month, 1)
        end = date(year, end_month, calendar.monthrange(year, end_month)[1])
        return start, end

    # YYYY-MM
    year, month = period_str.split("-")
    year, month = int(year), int(month)
    start = date(year, month, 1)
    end = date(year, month, calendar.monthrange(year, month)[1])
    return start, end


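# Illustrative aside (not part of the CLI): a minimal sketch of how period
# boundaries fall out of calendar.monthrange, whose second return value is the
# number of days in the month. The helper names `quarter_bounds` and
# `month_bounds` are hypothetical and exist only for this example.

```python
import calendar
from datetime import date


def quarter_bounds(year, quarter):
    """Return (first_day, last_day) of calendar quarter 1-4."""
    if not 1 <= quarter <= 4:
        raise ValueError(f"quarter must be 1-4, got {quarter}")
    start_month = (quarter - 1) * 3 + 1  # Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10
    end_month = start_month + 2
    last_day = calendar.monthrange(year, end_month)[1]
    return date(year, start_month, 1), date(year, end_month, last_day)


def month_bounds(year, month):
    """Return (first_day, last_day) of a calendar month."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


q1 = quarter_bounds(2024, 1)
feb = month_bounds(2024, 2)
print(q1, feb)
```

# Using monthrange rather than a hard-coded day count keeps leap years
# correct: February 2024 ends on the 29th.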
def main():
    parser = argparse.ArgumentParser(
        description="Churn & Retention Analyzer — NRR, cohort analysis, at-risk detection"
    )
    parser.add_argument(
        "--csv", metavar="FILE",
        help="CSV file with customer data (uses sample data if not provided)"
    )
    parser.add_argument(
        "--period", metavar="PERIOD",
        help='Analysis period: "2026-Q1" or "2026-03" (defaults to current quarter)'
    )
    parser.add_argument(
        "--output", choices=["summary", "full", "json"],
        default="full",
        help="Output format (default: full)"
    )
    args = parser.parse_args()

    # Load data
    if args.csv:
        try:
            with open(args.csv, "r", encoding="utf-8") as f:
                csv_text = f.read()
        except FileNotFoundError:
            print(f"Error: File not found: {args.csv}", file=sys.stderr)
            sys.exit(1)
    else:
        print("No --csv provided. Using sample customer data.\n")
        csv_text = SAMPLE_CSV

    customers = load_customers_from_csv(csv_text)
    if not customers:
        print("No customers loaded. Exiting.", file=sys.stderr)
        sys.exit(1)

    period_start, period_end = parse_period(args.period)

    if args.output == "json":
        analyzer = RetentionAnalyzer(customers, as_of=period_end)
        cohort_analyzer = CohortAnalyzer(customers)
        expansion_analyzer = ExpansionAnalyzer(customers)
        wf = analyzer.arr_waterfall(period_start, period_end)
        output = {
            "period": {"start": period_start.isoformat(), "end": period_end.isoformat()},
            "arr_waterfall": wf,
            "logo_churn_rate": analyzer.logo_churn_rate(period_start, period_end),
            "revenue_churn_rate": analyzer.revenue_churn_rate(period_start, period_end),
            # JSON object keys must be strings, so month checkpoints are stringified
            "cohort_report": {k: {**v, "retention_curve": {str(m): r for m, r in v["retention_curve"].items()}}
                              for k, v in cohort_analyzer.cohort_report().items()},
            "at_risk_accounts": cohort_analyzer.identify_at_risk(),
            "expansion_summary": expansion_analyzer.expansion_summary(),
            "expansion_by_segment": expansion_analyzer.expansion_by_segment(),
            "expansion_candidates": expansion_analyzer.top_expansion_candidates(),
        }
        print(json.dumps(output, indent=2))
    elif args.output == "summary":
        analyzer = RetentionAnalyzer(customers, as_of=period_end)
        wf = analyzer.arr_waterfall(period_start, period_end)
        print_header("NRR SUMMARY")
        print(f" Period: {period_start.isoformat()} → {period_end.isoformat()}")
        print(f" NRR: {fmt_pct(wf['nrr'])} {nrr_status(wf['nrr'])}")
        print(f" GRR: {fmt_pct(wf['grr'])} {grr_status(wf['grr'])}")
        print(f" Opening: {fmt_currency(wf['opening_arr'])}")
        print(f" Closing: {fmt_currency(wf['closing_arr'])}")
        print(f" Net New: {fmt_currency(wf['net_new_arr'])}")
        print()
    else:
        print_full_report(customers, period_start, period_end)


if __name__ == "__main__":
    main()