Prompt Cache ROI Calculator
Anthropic caching — calculate real cost savings
Cache write: 1.25× input price (25% premium) · Cache read: 0.1× input price (90% discount)
📚 Learn more — how it works, FAQ & guide
Anthropic prompt cache ROI calculator
Calculate exact savings from Claude prompt caching: break-even point, plus monthly and annual savings.
How to use this tool
1. Pick model + cache TTL: Claude Opus/Sonnet/Haiku, with a 5-minute or 1-hour cache.
2. Enter cached vs dynamic tokens: system prompt / examples (cached) vs changing user input.
3. Enter requests per hour: higher traffic → better cache amortization.
4. See monthly savings: break-even, annual savings, optimal cache strategy.
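The four steps above boil down to simple arithmetic. Here is a minimal sketch of the calculation; the per-million-token prices and the one-write-per-hour refresh assumption are illustrative only — check Anthropic's current pricing page before relying on the numbers.

```python
# Sketch of the ROI calculation behind the four steps above.
# Prices (USD per million input tokens) are assumptions for illustration.
PRICE_PER_MTOK = {"opus": 15.00, "sonnet": 3.00, "haiku": 0.80}

def monthly_savings(model: str, cached_tokens: int, dynamic_tokens: int,
                    requests_per_hour: int, writes_per_hour: int = 1) -> dict:
    """Compare cached vs uncached input cost over a 30-day month."""
    price = PRICE_PER_MTOK[model] / 1_000_000          # $ per input token
    hours = 24 * 30
    reqs = requests_per_hour * hours
    writes = writes_per_hour * hours                   # cache refreshes keeping the TTL warm
    reads = reqs - writes

    uncached = reqs * (cached_tokens + dynamic_tokens) * price
    cached = (writes * cached_tokens * 1.25 * price    # 25% write premium
              + reads * cached_tokens * 0.10 * price   # 90% read discount
              + reqs * dynamic_tokens * price)         # dynamic part always full price
    return {"uncached": uncached, "cached": cached,
            "savings": uncached - cached,
            "savings_pct": 100 * (1 - cached / uncached)}
```

For example, a Sonnet app with a 5,000-token cached prefix, 500 dynamic tokens, and 100 requests/hour saves roughly 80% of input cost under these assumptions, since most requests hit the 0.1× read price.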
Frequently Asked Questions
How does Anthropic prompt caching work?
Anthropic caches portions of your prompt (system prompt, examples, long context) marked with an ephemeral cache breakpoint. Cache writes cost 1.25× the input price (a 25% premium); cache reads cost 0.1× the input price (a 90% discount). The default TTL is 5 minutes; the extended 1-hour TTL costs 2× the standard write price (2.5× base input).
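In the Messages API, the breakpoint is set with a `cache_control` field on a content block. Below is a sketch of the request body as a plain dict (no API call is made, and the model id is an assumption — substitute a current one):

```python
# Sketch of a Messages API request body with a cache breakpoint on the
# system prompt. Plain dict only; no network call is made here.
LONG_SYSTEM_PROMPT = "You are a support assistant. " * 200  # must exceed the model's minimum cacheable size

request_body = {
    "model": "claude-sonnet-4-20250514",  # assumed model id for illustration
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            # 5-minute cache (default). For the extended 1-hour cache, use
            # {"type": "ephemeral", "ttl": "1h"} at 2x the write premium.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "Where is my order?"}],
}
```

Everything up to and including the block carrying `cache_control` is cached; the `messages` content after it is billed at the normal input price.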
When does caching save money?
Break-even comes with the very first cache read: a write plus one read costs 1.25 + 0.1 = 1.35 units vs 2.0 units for two uncached requests (the inequality 1.25 + 0.1×N ≤ N + 1 holds for every N ≥ 1). Meaningful savings accrue from ~10 reads per write, and compound with traffic: at 100 reads per write you pay 1.25 + 10 = 11.25 units vs 101 uncached ≈ 89% savings.
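The break-even arithmetic above can be checked directly, taking one uncached send of the cacheable prefix as the unit of cost:

```python
# Unit cost model: 1.0 = one uncached send of the cacheable prefix.
WRITE, READ = 1.25, 0.10

def cached_cost(reads: int) -> float:
    """One cache write followed by `reads` cache reads."""
    return WRITE + READ * reads

def uncached_cost(reads: int) -> float:
    """The same 1 + reads requests with no caching."""
    return 1.0 + reads

# Break-even on the very first read: 1.35 vs 2.0 units.
# At 100 reads per write: 11.25 vs 101 units, i.e. ~89% saved.
saved_at_100 = 1 - cached_cost(100) / uncached_cost(100)
```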
What should I cache?
Large static content: system prompts, instruction templates, few-shot examples, long documents, tool definitions. Don't cache: short prompts (<1024 tokens on Sonnet/Opus, <2048 on Haiku), rapidly changing content, PII that varies per user.
Minimum cacheable tokens?
Claude Sonnet 4 / Opus: 1024 tokens minimum. Claude Haiku: 2048 tokens minimum. Below this, caching is not applied.
🔒 100% Privacy. This tool runs entirely in your browser; your data is never uploaded to any server.