New skill that collects real financial data for any US publicly traded company via yfinance. Outputs structured JSON with market data, historical financials, WACC inputs, and analyst estimates. Includes 9-check validation script and reference docs for yfinance pitfalls (NaN years, field aliases, FCF mismatch). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
yfinance Pitfalls & Field Mapping
NaN Year Patterns
yfinance frequently returns NaN for older fiscal years. Observed patterns:
| Ticker | NaN Years | Notes |
|---|---|---|
| META | 2020, 2021 | All fields NaN; must supplement from 10-K |
| General | Varies | Older years (>3 years back) are less reliable |
Workaround: Check every field with pd.notna(). Report NaN years to user. Never fill with estimates.
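A minimal sketch of the "report, never fill" rule, assuming the values have already been extracted into a plain dict keyed by fiscal year, with `None` standing in for a NaN cell (which is what `safe_get` in this document returns). The `find_missing` name and the record layout are hypothetical, and the numbers are illustrative only.

```python
from typing import Dict, List, Optional

# Hypothetical layout: {fiscal_year: {field_name: value or None}}
Records = Dict[int, Dict[str, Optional[float]]]


def find_missing(records: Records) -> Dict[int, List[str]]:
    """Return {year: [fields with no data]} for every year that has gaps."""
    gaps: Dict[int, List[str]] = {}
    for year in sorted(records):
        missing = [name for name, value in records[year].items() if value is None]
        if missing:
            gaps[year] = missing
    return gaps


# Illustrative values only, not real filings
records = {
    2021: {"revenue": None, "ebit": None},
    2023: {"revenue": 100.0, "ebit": None},
    2024: {"revenue": 120.0, "ebit": 30.0},
}
gaps = find_missing(records)
for year, fields in gaps.items():
    # Gaps are reported to the user and left empty, never estimated
    print(f"{year}: missing {', '.join(fields)}")
```

Returning the gaps as data rather than only printing them lets the caller attach them to the output JSON alongside the values that were found.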
Field Name Variants
yfinance row index names are not fully stable across versions. Use fallback chains:
import pandas as pd

# Canonical field name -> yfinance row labels to try, in order
FIELD_ALIASES = {
    "revenue": ["Total Revenue", "Revenue", "Operating Revenue"],
    "ebit": ["Operating Income", "EBIT"],
    "ebitda": ["EBITDA", "Normalized EBITDA"],
    "tax": ["Tax Provision", "Income Tax Expense", "Tax Effect Of Unusual Items"],
    "net_income": ["Net Income", "Net Income Common Stockholders"],
    "capex": ["Capital Expenditure", "Capital Expenditures"],
    "ocf": ["Operating Cash Flow", "Cash Flow From Continuing Operating Activities"],
    "da": ["Depreciation And Amortization", "Depreciation Amortization Depletion"],
    "fcf": ["Free Cash Flow"],
    "nwc": ["Change In Working Capital", "Changes In Working Capital"],
    "total_debt": ["Total Debt"],
    "cash": ["Cash And Cash Equivalents"],
    "short_investments": ["Other Short Term Investments", "Short Term Investments"],
    "sbc": ["Stock Based Compensation"],
}

def safe_get(df, aliases, col):
    """Return the value at (first alias present in df.index, col) as a float, or None."""
    for alias in aliases:
        if alias in df.index:
            val = df.loc[alias, col]
            # The first alias present wins; a NaN there yields None
            return float(val) if pd.notna(val) else None
    # No alias matched any row label
    return None
Datetime Column Index
yfinance returns DataFrame columns as pandas.Timestamp, not integer years:
# ❌ WRONG
financials[2024] # KeyError
# ✅ RIGHT
year_col = [c for c in financials.columns if c.year == 2024][0]
financials.loc["Total Revenue", year_col]
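The `[...][0]` lookup above raises `IndexError` when the requested year is absent. A small sketch of a lookup that returns `None` instead; `column_for_year` is a hypothetical helper, and plain `datetime` objects stand in for the `pandas.Timestamp` labels (which subclass `datetime`) so the example runs offline. Note that matching on `.year` uses the calendar year of the period-end date, so a fiscal year ending in January 2025 matches 2025.

```python
from datetime import datetime
from typing import Iterable, Optional


def column_for_year(columns: Iterable, year: int) -> Optional[object]:
    """Return the first column label whose .year matches, or None if absent."""
    return next((c for c in columns if getattr(c, "year", None) == year), None)


# Stand-in labels; with yfinance these would come from financials.columns
cols = [datetime(2024, 12, 31), datetime(2023, 12, 31)]
col_2024 = column_for_year(cols, 2024)
col_2019 = column_for_year(cols, 2019)  # year not present, so None
```

The caller can then treat `None` the same way as a NaN year: report it and move on.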
Shares Outstanding Variants
# Diluted shares are preferred, but this field is the basic count
shares = info.get("sharesOutstanding")  # Basic shares
# Alternative
shares = info.get("impliedSharesOutstanding")  # May be more accurate
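One possible way to choose between the two fields and record which one was used. This is a sketch under assumptions: `pick_shares` is a hypothetical helper, the default key order (implied first, then basic) is a judgment call rather than a yfinance recommendation, and a plain dict stands in for `ticker.info`.

```python
from typing import Optional, Sequence, Tuple


def pick_shares(
    info: dict,
    keys: Sequence[str] = ("impliedSharesOutstanding", "sharesOutstanding"),
) -> Tuple[Optional[str], Optional[float]]:
    """Return (key_used, share_count) for the first key with a positive value."""
    for key in keys:
        value = info.get(key)
        if isinstance(value, (int, float)) and value > 0:
            return key, float(value)
    return None, None


# Stand-in for ticker.info; the count is illustrative only
info = {"sharesOutstanding": 1_000_000, "impliedSharesOutstanding": None}
source, shares = pick_shares(info)
```

Keeping the key name next to the value makes it easy to note in the output metadata which share count fed any per-share figure.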
Risk-Free Rate via ^TNX
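import yfinance as yf

tnx = yf.Ticker("^TNX")
# A multi-day window guards against an empty frame; iloc[-1] is still the latest close
hist = tnx.history(period="5d")
risk_free_rate = hist["Close"].iloc[-1] / 100  # Convert from percentage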
Pitfall: ^TNX returns yield as percentage (e.g., 4.3), not decimal (0.043). Divide by 100.
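A minimal sketch of the conversion with a sanity check, so that a missed or doubled division shows up immediately. `tnx_to_decimal` is a hypothetical helper and the 0% to 20% plausibility band is an assumption, not something yfinance enforces.

```python
def tnx_to_decimal(close: float) -> float:
    """Convert a ^TNX close quoted in percent (4.3) to a decimal rate (0.043)."""
    rate = close / 100.0
    # Assumed plausibility band; NaN also fails this comparison and is rejected
    if not 0.0 <= rate <= 0.20:
        raise ValueError(f"Implausible risk-free rate {rate!r} from close {close!r}")
    return rate


risk_free_rate = tnx_to_decimal(4.3)
```

A close of 430 (already multiplied somewhere upstream) or a NaN close would raise rather than flow silently into the WACC inputs.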
Analyst Estimates
import yfinance as yf

ticker = yf.Ticker("META")
# Revenue estimates
rev_est = ticker.revenue_estimate  # DataFrame with columns: avg, low, high, ...
# Rows: "0q" (current quarter), "+1q", "0y" (current year), "+1y"
# EPS estimates
eps_est = ticker.eps_trend  # Same period rows; columns may differ from revenue_estimate
Pitfall: These APIs change between yfinance versions. Always wrap in try/except.
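The try/except advice can be sketched as a small wrapper. `safe_attr` is a hypothetical helper, and `FakeTicker` is a stand-in for `yf.Ticker` so the example runs offline; it simulates one attribute that works, one that raises, and one that no longer exists.

```python
from typing import Any, Optional


def safe_attr(obj: Any, name: str) -> Optional[Any]:
    """Read obj.<name>, returning None if the attribute is missing or raises."""
    try:
        return getattr(obj, name)
    except Exception:  # covers removed attributes and fetch or parse errors
        return None


class FakeTicker:
    """Offline stand-in for yf.Ticker (illustration only)."""

    revenue_estimate = {"avg": 1.0}

    @property
    def eps_trend(self):
        raise KeyError("endpoint changed")


t = FakeTicker()
rev_est = safe_attr(t, "revenue_estimate")
eps_est = safe_attr(t, "eps_trend")          # property raises, so None
missing = safe_attr(t, "no_such_attribute")  # attribute absent, so None
```

With a real ticker the call would look like `safe_attr(ticker, "revenue_estimate")`, and a `None` result would be reported as unavailable rather than treated as an error.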
FCF Definition Mismatch
| Source | FCF Definition | META 2024 |
|---|---|---|
| yfinance | Operating CF + CapEx (CapEx is reported as a negative number) | ~$54.1B |
| Morgan Stanley DCF | EBITDA - Taxes - CapEx - NWC - SBC | ~$37.9B |
| Difference | SBC (~$22B) + other adjustments | ~$16.2B (~30% gap) |
Always flag this in output metadata. Downstream DCF skills need to decide whether to use yfinance FCF or reconstruct from components.
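A sketch of what such a metadata flag could look like, using the two META 2024 figures from the table above. The function name, the dict keys, and the 5% tolerance are all assumptions for illustration, not an established schema.

```python
def fcf_gap_metadata(
    yfinance_fcf: float,
    reconstructed_fcf: float,
    tolerance: float = 0.05,  # assumed threshold for raising the flag
) -> dict:
    """Describe the gap between yfinance FCF and a component-built FCF."""
    gap = yfinance_fcf - reconstructed_fcf
    gap_pct = gap / yfinance_fcf
    return {
        "fcf_yfinance_definition": "Operating CF + CapEx",
        "fcf_yfinance": yfinance_fcf,
        "fcf_reconstructed": reconstructed_fcf,
        "fcf_gap": gap,
        "fcf_gap_pct": gap_pct,
        "fcf_definition_mismatch": abs(gap_pct) > tolerance,
    }


# Figures in $B, taken from the table above
meta = fcf_gap_metadata(54.1, 37.9)
print(f"gap: {meta['fcf_gap']:.1f}B ({meta['fcf_gap_pct']:.0%})")
```

Carrying both figures and the definition string lets a downstream DCF step choose either number explicitly instead of inheriting the yfinance definition by default.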