# /token-cost

Know what you'll spend before you spend it.

## Usage

    /token-cost src/lib/agents/researcher.ts        # analyzes a code path
    /token-cost --prompt @prompt.txt --runs 1000    # one prompt × N runs
    /token-cost --conversation @chat.jsonl          # multi-turn estimate
## What it estimates

### Per call

- Input tokens (system + history + user message)
- Output tokens (estimated from `max_tokens` and historical averages)
- Cache hit rate (if `cache_control` is used in the prompt)
- Total cost in USD
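The per-call arithmetic can be sketched in a few lines. The prices and helper name below are illustrative placeholders, not the command's actual implementation; the real command pulls live prices (see "Pricing source").

```python
# Illustrative per-million-token prices (placeholders, not real rates).
PRICE_PER_MTOK = {"input": 1.00, "output": 5.00}

def per_call_cost(input_tokens: int, output_tokens: int,
                  cached_fraction: float = 0.0) -> float:
    """Estimated USD cost of one API call."""
    # Cached input reads are billed at 10% of the base input rate.
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    usd = (uncached * PRICE_PER_MTOK["input"]
           + cached * PRICE_PER_MTOK["input"] * 0.10
           + output_tokens * PRICE_PER_MTOK["output"]) / 1_000_000
    return round(usd, 6)

print(per_call_cost(3200, 1800))        # no cache hits
print(per_call_cost(3200, 1800, 0.5))   # half the input served from cache
```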
### For a workflow

- Cost per user invocation
- Cost at 1k / 10k / 100k users/month
- Cost breakdown by model (Opus vs Sonnet vs Haiku)
- "What if we switched to Haiku?" comparison
## Pricing source

- Anthropic: live from anthropic.com/pricing (or a hardcoded fallback)
- OpenAI: openai.com/pricing
- Includes prompt caching discounts (cache reads billed at 10% of the base input rate; cache writes at a 25% surcharge, per Anthropic)
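The caching discounts reduce to simple multipliers on the base input rate. A sketch under those terms; the helper name and the $3/MTok example price are hypothetical.

```python
# Billing multipliers on the base input price, per the caching terms
# above: writes cost 1.25x, reads 0.10x, uncached input 1.0x.
CACHE_MULTIPLIER = {"write": 1.25, "read": 0.10, "uncached": 1.00}

def input_cost(tokens: int, base_price_per_mtok: float,
               kind: str = "uncached") -> float:
    """USD cost for `tokens` input tokens billed as `kind`."""
    return tokens * base_price_per_mtok * CACHE_MULTIPLIER[kind] / 1_000_000

# A 10,000-token cached system prompt: pay the write surcharge once,
# then every subsequent request reads it at a tenth of the base rate.
write_once = input_cost(10_000, 3.00, "write")
per_read = input_cost(10_000, 3.00, "read")
```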
## Output

    Workflow: stack planner (one user request)
    Model: claude-haiku-4-5-20251001
    Input: ~3,200 tokens
    Output: ~1,800 tokens
    Cost: $0.0034 per request

    At scale:
      1,000 requests/day: $3.40/day · $102/month
      10,000 requests/day: $34/day · $1,020/month

    Switching to Sonnet 4.6: ~5x cost, marginal quality gain
    Switching to Opus 4.7: ~30x cost, recommended for complex reasoning only
## Rules

- Use real tokenizers when possible (`tiktoken` for OpenAI, Anthropic's tokenizer SDK)
- If estimating, round UP conservatively (underbudgeting is worse than overbudgeting)
- Surface caching opportunities, e.g. long system prompts that aren't cached
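When no real tokenizer is available, the round-UP rule can be applied to the common ~4-characters-per-token heuristic for English text. A fallback sketch; the ratio and safety margin are assumptions, not part of the command:

```python
import math

def rough_token_estimate(text: str, chars_per_token: float = 4.0,
                         safety_margin: float = 0.10) -> int:
    """Conservative fallback token count: heuristic ratio, padded by a
    safety margin and rounded UP so the budget errs on the high side."""
    raw = len(text) / chars_per_token
    return math.ceil(raw * (1 + safety_margin))
```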