# Velocity Forecasting Guide: Monte Carlo Methods & Probabilistic Estimation

## Table of Contents
- [Overview](#overview)
- [Monte Carlo Simulation Fundamentals](#monte-carlo-simulation-fundamentals)
- [Velocity-Based Forecasting](#velocity-based-forecasting)
- [Implementation Approaches](#implementation-approaches)
- [Confidence Intervals & Risk Assessment](#confidence-intervals--risk-assessment)
- [Practical Applications](#practical-applications)
- [Advanced Techniques](#advanced-techniques)
- [Common Pitfalls](#common-pitfalls)
- [Case Studies](#case-studies)
- [Tools and Implementation](#tools-and-implementation)
- [Conclusion](#conclusion)

---

## Overview

Velocity forecasting using Monte Carlo simulation provides probabilistic estimates for sprint and project completion, moving beyond single-point estimates to give stakeholders a range of likely outcomes with associated confidence levels.

### Why Probabilistic Forecasting?
- **Uncertainty Acknowledgment**: Software development is inherently uncertain
- **Risk Quantification**: Provides probability distributions rather than false precision
- **Stakeholder Communication**: Better expectation management through confidence intervals
- **Decision Support**: Enables data-driven planning and resource allocation

### Core Principles
1. **Historical Velocity Patterns**: Use actual team performance data
2. **Statistical Modeling**: Apply appropriate probability distributions
3. **Confidence Intervals**: Provide ranges, not single points
4. **Continuous Calibration**: Update forecasts with new data

---

## Monte Carlo Simulation Fundamentals

### What is Monte Carlo Simulation?
Monte Carlo simulation uses repeated random sampling to estimate the probability of different outcomes in systems whose behavior depends on random variables and is therefore hard to predict analytically.

### Application to Velocity Forecasting
```
Repeat for thousands of simulation iterations:
1. Sample a velocity for each future sprint from the historical distribution
2. Calculate the projected completion time (or total points delivered)

Then analyze the distribution of the simulated results
```

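
The loop above can be sketched for the "completion time" case, where the question is how many sprints a backlog of a given size will take. This is a minimal sketch under stated assumptions: the function name `simulate_sprints_to_complete`, the `backlog_points` and `max_sprints` parameters, and the sample history are all illustrative and not part of this guide's scripts.

```python
import random

def simulate_sprints_to_complete(velocities, backlog_points, iterations=10_000,
                                 max_sprints=200, seed=None):
    """Return a sorted list of the sprints each simulated run needed."""
    if not velocities or max(velocities) <= 0:
        raise ValueError("need at least one positive historical velocity")
    rng = random.Random(seed)
    outcomes = []
    for _ in range(iterations):
        remaining, sprints = backlog_points, 0
        # Keep drawing historical velocities until the backlog is burned down.
        # max_sprints is a safety cap for histories that contain zero values.
        while remaining > 0 and sprints < max_sprints:
            remaining -= rng.choice(velocities)
            sprints += 1
        outcomes.append(sprints)
    # The sorted outcomes form the empirical distribution of completion time.
    return sorted(outcomes)

history = [18, 22, 20, 25, 19, 23]  # illustrative velocities, not real data
outcomes = simulate_sprints_to_complete(history, backlog_points=120, seed=7)
p50 = outcomes[len(outcomes) // 2]
p85 = outcomes[int(0.85 * len(outcomes))]
print(f"50% of runs finished within {p50} sprints, 85% within {p85}")
```

For completion time, a higher percentile is the more cautious figure, since it is the sprint count that a larger share of simulated runs stayed within.
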
### Key Statistical Concepts

#### Normal Distribution
Once a team has stabilized, its velocity is often approximately normally distributed:
- **Mean (μ)**: Average historical velocity
- **Standard Deviation (σ)**: Measure of velocity variability
- **68-95-99.7 Rule**: About 68%, 95%, and 99.7% of sprints fall within 1, 2, and 3 standard deviations of the mean

#### Distribution Characteristics
- **Symmetry**: Velocities balanced around the mean (typical of stable teams)
- **Skewness**: Teams with frequent disruptions tend to show negative skew (a long tail of low-velocity sprints)
- **Kurtosis**: Measure of "tail heaviness", i.e. how often extreme outcomes occur

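
These quantities can be computed from a velocity history with the standard library alone. The sketch below is illustrative: `describe_velocity` is a hypothetical helper name, the sample data is made up, and the skewness and kurtosis use the simple population-moment definitions rather than any bias-corrected estimator.

```python
import statistics

def describe_velocity(velocities):
    """Summarise a velocity history: mean, spread, sigma bands, and shape."""
    n = len(velocities)
    mu = statistics.mean(velocities)
    sigma = statistics.stdev(velocities)       # sample standard deviation
    pop_sigma = statistics.pstdev(velocities)  # population form, used for shape
    if pop_sigma == 0:
        raise ValueError("velocity history has no variation")
    # Standardised moments: skewness (3rd) and excess kurtosis (4th minus 3).
    z = [(v - mu) / pop_sigma for v in velocities]
    skewness = sum(x ** 3 for x in z) / n
    excess_kurtosis = sum(x ** 4 for x in z) / n - 3
    return {
        "mean": mu,
        "stdev": sigma,
        "band_68": (mu - sigma, mu + sigma),          # about 68% if normal
        "band_95": (mu - 2 * sigma, mu + 2 * sigma),  # about 95% if normal
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }

# Illustrative history with one disrupted sprint (12 points).
summary = describe_velocity([18, 22, 20, 25, 19, 23, 12, 21])
print(summary)
```
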
---

## Velocity-Based Forecasting

### Basic Velocity Forecasting Formula

**Single Sprint Forecast:**
```
Confidence Interval = μ ± (Z-score × σ)

Where:
- μ = historical mean velocity
- σ = standard deviation of velocity
- Z-score = confidence level multiplier
```

**Multi-Sprint Forecast:**
```
Total Points = Σ(sampled_velocity_i) for i = 1 to n sprints
Where each sampled_velocity_i is randomly sampled from the historical distribution
```

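
As a rough analytic cross-check on the simulated totals, assume each sprint's velocity $v_i$ is an independent draw from the same distribution with mean $\mu$ and standard deviation $\sigma$. That independence is an assumption made here for illustration, and the interval additionally assumes the total is approximately normal. Under those assumptions the total over $n$ sprints satisfies:

```latex
\mathbb{E}\left[\sum_{i=1}^{n} v_i\right] = n\mu,
\qquad
\operatorname{SD}\left[\sum_{i=1}^{n} v_i\right] = \sigma\sqrt{n},
\qquad
\text{approximate interval: } n\mu \pm z\,\sigma\sqrt{n}
```

Here $z$ is the Z-score for the chosen confidence level. The spread grows with $\sqrt{n}$ rather than $n$, so a multi-sprint total is relatively more predictable than any single sprint.
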
### Confidence Level Z-Scores
| Confidence Level (two-sided) | Z-Score | Interpretation |
|------------------------------|---------|----------------|
| 50% | 0.67 | Middle half of outcomes (interquartile range) |
| 70% | 1.04 | Moderate confidence |
| 85% | 1.44 | High confidence |
| 95% | 1.96 | Very high confidence |
| 99% | 2.58 | Extremely high confidence |

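
The Z-scores in the table are the two-sided values for a standard normal distribution, and they can be reproduced with `statistics.NormalDist` (Python 3.8+). The helper names `z_score` and `sprint_interval` and the sample data below are illustrative assumptions, not part of this guide's scripts.

```python
from statistics import NormalDist, mean, stdev

def z_score(confidence):
    """Two-sided z-score covering the central `confidence` share of a normal curve."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return NormalDist().inv_cdf((1 + confidence) / 2)

def sprint_interval(velocities, confidence):
    """Return (low, high) = mean -/+ z * stdev for a single sprint."""
    mu, sigma = mean(velocities), stdev(velocities)
    z = z_score(confidence)
    return mu - z * sigma, mu + z * sigma

for level in (0.50, 0.70, 0.85, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.2f}")

low, high = sprint_interval([18, 22, 20, 25, 19, 23], 0.85)  # illustrative data
print(f"85% interval for one sprint: {low:.1f} to {high:.1f} points")
```
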
---

## Implementation Approaches

### 1. Simple Historical Distribution Method
```python
import random

def simple_monte_carlo_forecast(velocities, sprints_ahead, iterations=10000):
    """Resample historical velocities to simulate total points over future sprints."""
    results = []
    for _ in range(iterations):
        # Draw one historical velocity per future sprint and add them up.
        total_points = sum(random.choice(velocities) for _ in range(sprints_ahead))
        results.append(total_points)
    # analyze_results is a summary helper (e.g. percentiles) not defined in this guide.
    return analyze_results(results)
```

- **Pros:** Simple, uses actual data points
- **Cons:** Ignores trends, assumes stationary distribution

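
The forecasting functions in this section hand their simulated totals to `analyze_results`, which the guide does not define. One possible shape for it is sketched below as a hypothetical stand-in. The choice of percentiles and the returned dictionary keys are assumptions made for illustration.

```python
import statistics

def analyze_results(results, percentiles=(15, 50, 85)):
    """Summarise simulated totals; a hypothetical stand-in for the helper above."""
    if not results:
        raise ValueError("results must not be empty")
    ordered = sorted(results)
    last = len(ordered) - 1
    summary = {
        "mean": statistics.mean(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }
    for p in percentiles:
        # Nearest-rank style lookup, clamped to the last valid index.
        summary[f"p{p}"] = ordered[min(last, int(p / 100 * len(ordered)))]
    return summary

example = analyze_results([100, 110, 120, 130, 140])  # illustrative totals
print(example)
```
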
### 2. Normal Distribution Method
```python
import random
import statistics

def normal_distribution_forecast(velocities, sprints_ahead, iterations=10000):
    """Simulate total points by sampling each sprint from a fitted normal distribution."""
    mean_velocity = statistics.mean(velocities)
    std_velocity = statistics.stdev(velocities)  # needs at least two data points

    results = []
    for _ in range(iterations):
        # Clamp at zero so a sprint never contributes negative points.
        total_points = sum(
            max(0, random.normalvariate(mean_velocity, std_velocity))
            for _ in range(sprints_ahead)
        )
        results.append(total_points)
    # analyze_results is a summary helper not defined in this guide.
    return analyze_results(results)
```

- **Pros:** Mathematically clean, handles interpolation
- **Cons:** Assumes normal distribution; can sample negative velocities, which the code clamps to zero

### 3. Bootstrap Sampling Method
```python
import random
import statistics

def bootstrap_forecast(velocities, sprints_ahead, iterations=10000):
    """Re-estimate mean and spread from a bootstrap resample on every iteration."""
    n = len(velocities)  # needs at least two data points for stdev
    results = []
    for _ in range(iterations):
        # Sample with replacement
        bootstrap_sample = random.choices(velocities, k=n)
        # Calculate statistics from bootstrap sample
        mean_vel = statistics.mean(bootstrap_sample)
        std_vel = statistics.stdev(bootstrap_sample)

        # Clamp at zero so a sprint never contributes negative points
        total_points = sum(
            max(0, random.normalvariate(mean_vel, std_vel))
            for _ in range(sprints_ahead)
        )
        results.append(total_points)
    # analyze_results is a summary helper not defined in this guide
    return analyze_results(results)
```

- **Pros:** Accounts for sampling uncertainty in the estimated mean and standard deviation
- **Cons:** More complex, requires sufficient historical data, and as written still draws sprint velocities from a normal distribution

---

## Confidence Intervals & Risk Assessment

### Interpreting Forecast Results

#### Percentile-Based Confidence Intervals
```python
def calculate_confidence_intervals(results, confidence_levels=(0.5, 0.7, 0.85, 0.95)):
    """Return the simulated value at each requested percentile."""
    if not results:
        raise ValueError("results must not be empty")
    sorted_results = sorted(results)
    last_index = len(sorted_results) - 1
    intervals = {}

    for confidence in confidence_levels:
        # Clamp so that confidence = 1.0 maps to the largest value, not past the end.
        percentile_index = min(last_index, int(confidence * len(sorted_results)))
        intervals[f"{round(confidence * 100)}%"] = sorted_results[percentile_index]

    return intervals
```

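#### Example Interpretation
For a 6-sprint forecast, the percentiles of simulated total points might be:
- **50%:** 120 points (median outcome; half of the simulations delivered this or less)
- **70%:** 135 points
- **85%:** 150 points
- **95%:** 170 points (only 5% of simulations delivered more)

Because these values are points delivered, the higher percentiles are the optimistic end of the range. For a conservative commitment, read the lower tail instead: the 15th percentile is the total that 85% of simulations met or exceeded.
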
### Risk Assessment Framework

#### Delivery Probability
```
P(completion ≤ target) = (# simulations with completion ≤ target) / total_simulations
```

Here completion is the simulated completion time (in sprints or as a date) and target is the deadline. When the simulation instead produces total points for a fixed number of sprints, count the runs that reach or exceed the target scope.

#### Risk Categories
| Probability Range | Risk Level | Recommendation |
|-------------------|------------|----------------|
| > 85% | Low Risk | Proceed with confidence |
| 70-85% | Moderate Risk | Add buffer, monitor closely |
| 50-70% | High Risk | Reduce scope or extend timeline |
| < 50% | Very High Risk | Significant replanning required |

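
The probability and the risk categories can be combined in a few lines. The sketch below works on simulated total points for a fixed number of sprints, so it counts runs that reach at least the target scope. The function names, the handling of values that sit exactly on a boundary, and the sample totals are illustrative assumptions.

```python
def delivery_probability(simulated_totals, target_points):
    """Share of simulations that delivered at least `target_points`."""
    if not simulated_totals:
        raise ValueError("simulated_totals must not be empty")
    hits = sum(1 for total in simulated_totals if total >= target_points)
    return hits / len(simulated_totals)

def risk_level(probability):
    """Map a delivery probability onto the risk categories in the table above."""
    if probability > 0.85:
        return "Low Risk"
    if probability >= 0.70:
        return "Moderate Risk"
    if probability >= 0.50:
        return "High Risk"
    return "Very High Risk"

totals = [95, 105, 110, 118, 120, 124, 130, 135, 142, 150]  # illustrative results
p = delivery_probability(totals, target_points=110)
print(f"P(deliver >= 110 points) = {p:.0%} -> {risk_level(p)}")
```
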
---

## Practical Applications

### Sprint Planning
Use velocity forecasting to:
- Set realistic sprint goals
- Communicate uncertainty to Product Owner
- Plan capacity buffers for unknowns
- Identify when to adjust scope

### Release Planning
Apply Monte Carlo methods to:
- Estimate feature completion dates
- Plan release milestones
- Assess project schedule risk
- Make go/no-go decisions

### Stakeholder Communication
Present forecasts as:
- Range estimates, not single points
- Probability statements ("70% confident we'll deliver X by date Y")
- Risk scenarios with mitigation options
- Visual distributions showing uncertainty

---

## Advanced Techniques

### 1. Trend-Adjusted Forecasting
Account for improving or declining velocity trends:
```python
import statistics

def trend_adjusted_forecast(velocities, sprints_ahead):
    # Fit a linear trend (statistics.linear_regression needs Python 3.10+)
    x = list(range(len(velocities)))
    slope, intercept = statistics.linear_regression(x, velocities)

    # Extrapolate the trend line to each future sprint
    adjusted_velocities = []
    for i in range(sprints_ahead):
        future_sprint = len(velocities) + i
        predicted_velocity = slope * future_sprint + intercept
        adjusted_velocities.append(predicted_velocity)

    # monte_carlo_with_adjusted_velocities is a placeholder for a simulation
    # step that adds variability around these predictions.
    return monte_carlo_with_adjusted_velocities(adjusted_velocities)
```

### 2. Seasonality Adjustments
For teams with seasonal patterns (holidays, budget cycles):
```python
def seasonal_adjustment(velocities, sprint_dates, forecast_dates):
    # calculate_seasonal_factors and apply_seasonal_factors are placeholders
    # for team-specific logic; they are not defined in this guide.

    # Identify seasonal patterns
    seasonal_factors = calculate_seasonal_factors(velocities, sprint_dates)

    # Apply factors to forecast
    adjusted_forecast = apply_seasonal_factors(forecast_dates, seasonal_factors)
    return adjusted_forecast
```

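
One simple way to express seasonal factors is the ratio of each calendar month's mean velocity to the overall mean. The sketch below takes that approach as an assumption. It also adds a `baseline_velocity` parameter to `apply_seasonal_factors` that the snippet above does not have, and the dates and velocities are made up for illustration.

```python
from collections import defaultdict
from datetime import date
import statistics

def calculate_seasonal_factors(velocities, sprint_dates):
    """Ratio of each month's mean velocity to the overall mean velocity."""
    overall = statistics.mean(velocities)
    by_month = defaultdict(list)
    for velocity, sprint_date in zip(velocities, sprint_dates):
        by_month[sprint_date.month].append(velocity)
    return {month: statistics.mean(vals) / overall for month, vals in by_month.items()}

def apply_seasonal_factors(forecast_dates, seasonal_factors, baseline_velocity):
    """Scale a baseline velocity per forecast date; unseen months use factor 1.0."""
    return [baseline_velocity * seasonal_factors.get(d.month, 1.0) for d in forecast_dates]

velocities = [20, 22, 14, 21]  # illustrative data
dates = [date(2023, 10, 2), date(2023, 11, 6), date(2023, 12, 18), date(2024, 1, 15)]
factors = calculate_seasonal_factors(velocities, dates)
forecast = apply_seasonal_factors([date(2024, 12, 16)], factors, baseline_velocity=20)
print(factors, forecast)
```

With only one sprint per month the factors are noisy, so in practice several years of history per season would be needed before trusting them.
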
### 3. Capacity-Based Modeling
Incorporate team capacity changes:
```python
import statistics

def capacity_adjusted_forecast(velocities, historical_capacity, future_capacity):
    # Calculate velocity per capacity unit, skipping sprints with no capacity
    velocity_per_capacity = [
        v / c for v, c in zip(velocities, historical_capacity) if c > 0
    ]
    baseline_efficiency = statistics.mean(velocity_per_capacity)

    # Forecast based on future capacity
    future_velocities = [capacity * baseline_efficiency for capacity in future_capacity]
    # monte_carlo_forecast is a placeholder for a simulation step not defined here
    return monte_carlo_forecast(future_velocities)
```

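
The snippet above reduces history to a single average efficiency, which drops the variability the simulation needs. A possible alternative, sketched here under the assumption that points-per-capacity ratios are interchangeable across sprints, resamples the historical ratios directly. The function name `capacity_monte_carlo`, the capacity unit, and the numbers are illustrative.

```python
import random

def capacity_monte_carlo(velocities, historical_capacity, future_capacity,
                         iterations=10_000, seed=None):
    """Simulate total points by resampling historical points-per-capacity ratios."""
    ratios = [v / c for v, c in zip(velocities, historical_capacity) if c > 0]
    if not ratios:
        raise ValueError("need at least one sprint with positive capacity")
    rng = random.Random(seed)
    totals = [
        # One resampled efficiency ratio per future sprint, scaled by its capacity.
        sum(capacity * rng.choice(ratios) for capacity in future_capacity)
        for _ in range(iterations)
    ]
    return sorted(totals)

totals = capacity_monte_carlo(
    velocities=[20, 24, 18, 22],           # illustrative points per sprint
    historical_capacity=[40, 45, 36, 44],  # e.g. person-days per sprint
    future_capacity=[30, 44],
    seed=3,
)
print(f"simulated range: {totals[0]:.1f} to {totals[-1]:.1f} points")
```
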
### 4. Multi-Team Forecasting
For dependencies across teams:
```python
def multi_team_forecast(team_forecasts, dependencies):
    # Outline only; the steps a full implementation would need:
    # - Account for critical path and dependencies
    # - Use min/max operations for dependent deliveries
    # - Model coordination overhead
    raise NotImplementedError("multi-team forecasting is team-specific")
```

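
The "max operation" idea can be illustrated for the simplest case: teams that work in parallel on a shared release, which is done only when the slowest team finishes. The sketch below is a deliberately narrow illustration with a different signature from the stub above. It ignores ordering dependencies and coordination overhead, and the team names, velocities, and backlog sizes are made up.

```python
import random

def sprints_needed(velocities, backlog_points, rng, max_sprints=200):
    """One simulated run: sprints a team needs to burn down its backlog."""
    remaining, sprints = backlog_points, 0
    while remaining > 0 and sprints < max_sprints:
        remaining -= rng.choice(velocities)
        sprints += 1
    return sprints

def joint_completion(teams, iterations=10_000, seed=None):
    """Release is done when the slowest team finishes, so take the max per run."""
    rng = random.Random(seed)
    return sorted(
        max(sprints_needed(v, backlog, rng) for v, backlog in teams.values())
        for _ in range(iterations)
    )

teams = {  # illustrative velocities and backlog sizes
    "api": ([20, 22, 18, 25], 100),
    "web": ([15, 12, 18, 14], 80),
}
outcomes = joint_completion(teams, seed=5)
p85 = outcomes[int(0.85 * len(outcomes))]
print(f"85% of runs finished the joint release within {p85} sprints")
```
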
---

## Common Pitfalls

### 1. Insufficient Historical Data
- **Problem:** Using too few sprint data points
- **Solution:** Collect at least 6-8 sprints of data before relying on the forecast
- **Mitigation:** Use industry benchmarks or similar team data

### 2. Non-Stationary Data
- **Problem:** Including data from different team compositions or processes
- **Solution:** Use only recent, relevant historical data
- **Identification:** Look for structural breaks in velocity time series

### 3. False Precision
- **Problem:** Reporting over-precise estimates (e.g., "23.7 points")
- **Solution:** Round to reasonable precision, emphasize ranges
- **Communication:** Use language like "approximately" and "around"

### 4. Ignoring External Factors
- **Problem:** Not accounting for holidays, team changes, external dependencies
- **Solution:** Adjust historical data or forecasts for known factors
- **Documentation:** Maintain context for each sprint's circumstances

### 5. Overconfidence in Models
- **Problem:** Treating forecasts as guarantees
- **Solution:** Regular calibration against actual outcomes
- **Improvement:** Update models based on forecast accuracy

---

## Case Studies

### Case Study 1: Stabilizing Team
**Situation:** New team, first 10 sprints, velocity ranging 15-25 points

**Approach:**
- Used bootstrap sampling due to small sample size
- Applied 30% buffer for team learning curve
- Updated forecast every 2 sprints

**Results:**
- Initial forecast: 20 ± 8 points per sprint
- Final 3 sprints: 22 ± 3 points per sprint
- Accuracy improved from 60% to 85% confidence bands

### Case Study 2: Seasonal Product Team
**Situation:** E-commerce team with holiday impacts

**Data:** 24 sprints showing clear seasonal patterns

**Approach:**
- Identified seasonal multipliers (0.7x during holidays)
- Used 2-year historical data for seasonal adjustment
- Applied capacity-based modeling for temporary staff

**Results:**
- Standard model: 40% forecast accuracy during Q4
- Seasonal-adjusted model: 80% forecast accuracy
- Better resource planning and stakeholder communication

### Case Study 3: Platform Team with Dependencies
**Situation:** Infrastructure team supporting multiple product teams

**Challenge:** High variability due to urgent requests and dependencies

**Approach:**
- Separated planned vs. unplanned work velocity
- Used wider confidence intervals (90% vs 70%)
- Implemented buffer management strategy

**Results:**
- Planned work predictability: 85%
- Total work predictability: 65% (acceptable for context)
- Improved capacity allocation decisions

---

## Tools and Implementation

### Recommended Tools
1. **Python/R:** For custom implementation and complex models
2. **Excel/Google Sheets:** For simple implementations and visualization
3. **Jira/Azure DevOps:** For automated data collection
4. **Specialized Tools:** ActionableAgile, Monte Carlo simulation software

### Key Metrics to Track
- **Forecast Accuracy:** How often do actual results fall within predicted ranges?
- **Calibration:** Do 70% confidence intervals contain 70% of actual results?
- **Bias:** Are forecasts consistently optimistic or pessimistic?
- **Resolution:** How precise are the forecasts for decision-making?

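
Calibration and bias can be checked with a small helper once a few sprints of forecasts and actuals are on record. The sketch below is illustrative: `calibration_report`, the interval format of `(low, high)` tuples, and the sample numbers are assumptions, and bias is measured against the interval midpoint for simplicity.

```python
def calibration_report(forecasts, actuals):
    """Compare forecast intervals with actual velocities.

    `forecasts` is a list of (low, high) tuples, one per sprint.
    """
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("need one forecast interval per actual value")
    pairs = list(zip(forecasts, actuals))
    # Coverage: share of sprints whose actual velocity fell inside the interval.
    inside = sum(1 for (low, high), actual in pairs if low <= actual <= high)
    # Bias: mean of (actual - interval midpoint); negative means optimistic forecasts.
    errors = [actual - (low + high) / 2 for (low, high), actual in pairs]
    return {
        "coverage": inside / len(actuals),
        "bias": sum(errors) / len(errors),
    }

report = calibration_report(
    forecasts=[(16, 24), (17, 25), (18, 26), (18, 26)],  # illustrative 70% intervals
    actuals=[22, 15, 21, 25],
)
print(report)
```

If the intervals were built at 70% confidence, a coverage close to 0.70 over many sprints suggests the model is well calibrated. Four sprints, as in this example, is far too few to judge.
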
### Implementation Checklist
- [ ] Historical velocity data collection (minimum 6 sprints)
- [ ] Data quality validation (outliers, context)
- [ ] Distribution analysis (normal, skewed, multi-modal)
- [ ] Model selection and parameter estimation
- [ ] Validation against held-out data
- [ ] Visualization and communication materials
- [ ] Regular calibration and model updates

---

## Conclusion

Monte Carlo velocity forecasting transforms uncertain estimates into probabilistic statements that enable better decision-making. Success requires:

1. **Quality Data:** Clean, relevant historical velocity data
2. **Appropriate Models:** Choose methods suited to your team's patterns
3. **Clear Communication:** Present uncertainty honestly to stakeholders
4. **Continuous Improvement:** Calibrate and refine models over time
5. **Contextual Awareness:** Account for team changes, external factors, and business context

The goal is not perfect prediction, but better understanding of uncertainty to make more informed planning decisions.

---

*This guide provides a comprehensive foundation for implementing probabilistic velocity forecasting. Adapt the techniques to your team's specific context and constraints.*