This site is not affiliated with, endorsed by, or connected to Anthropic, PBC. Claude and the Claude logo are trademarks of Anthropic. All pricing shown is sourced from public Anthropic documentation. Verify current pricing at claude.com/pricing.
Updated April 2026

Claude Prompt Caching: How It Works and How Much It Saves (2026)

Prompt caching is one of the most effective ways to cut Claude API costs. Here is exactly how it works, what it costs to set up, and worked examples showing real savings.

What Is Prompt Caching?

Prompt caching is an API feature that stores repeated context (like a large system prompt or shared document) in fast-retrieval storage. Instead of paying the full input token rate every time you include that content, cached reads cost 90% less. The trade-off: writing content to the cache costs 25% more than the standard input rate. After that, every read is dramatically cheaper.
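In the Messages API, a block is opted into caching by attaching a cache_control marker to it. Here is a minimal sketch of the request shape as a plain dict (the model identifier is an assumption; field names follow Anthropic's prompt caching documentation):

```python
# Sketch of a cacheable request body for the Messages API.
# The system prompt block is marked with cache_control so it can be
# stored and reused cheaply on subsequent requests.
request = {
    "model": "claude-sonnet-4-6",  # assumed model identifier
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "…large static system prompt (must exceed the minimum cacheable size)…",
            "cache_control": {"type": "ephemeral"},  # marks this block as cacheable
        }
    ],
    # the dynamic user turn below is billed normally and never cached
    "messages": [{"role": "user", "content": "How do I reset my password?"}],
}
```

The first request that sends this payload pays the cache write rate on the system block; identical system blocks sent within the TTL are billed at the cache read rate.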

Prompt Caching Pricing (Sonnet 4.6)

  • Standard input: $3.00 per 1M tokens
  • Cache write (25% premium): $3.75 per 1M tokens (first cache write)
  • Cache read (90% off): $0.30 per 1M tokens (every subsequent read)

Model                 Standard Input   Cache Write   Cache Read
Claude Opus 4.6       $5.00            $6.25         $0.50
Claude Sonnet 4.6     $3.00            $3.75         $0.30
Claude Haiku 3.5      $0.80            $1.00         $0.08

All prices are per 1M tokens.
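Since write and read rates are fixed multiples of the standard rate (1.25x and 0.10x respectively), the whole table can be derived from the standard prices. A small sketch (the model keys here are our own labels, not official API identifiers):

```python
# USD per 1M input tokens, standard rate (from the table above).
STANDARD_RATES = {
    "opus-4.6": 5.00,
    "sonnet-4.6": 3.00,
    "haiku-3.5": 0.80,
}

def cache_rates(model: str) -> dict:
    """Derive cache write (25% premium) and cache read (90% off) rates."""
    std = STANDARD_RATES[model]
    return {
        "standard": std,
        "cache_write": round(std * 1.25, 2),
        "cache_read": round(std * 0.10, 2),
    }
```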

Worked Savings Examples

Customer support app with 10,000-token system prompt

Without caching

1,000 API calls x 10,000 tokens x $3/1M = $30.00

With caching

1 cache write: 10,000 x $3.75/1M = $0.038
999 cache reads x 10,000 x $0.30/1M = $2.997
Total: $3.03

90% saving - from $30.00 to $3.03
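The arithmetic above is easy to reproduce (Sonnet 4.6 rates assumed: $3.00 standard, $3.75 write, $0.30 read, all per 1M tokens):

```python
TOKENS = 10_000   # system prompt size
CALLS = 1_000     # API calls in the billing window

cost_without = CALLS * TOKENS * 3.00 / 1e6      # every call at the full rate
cost_write = TOKENS * 3.75 / 1e6                # first call writes the cache
cost_reads = (CALLS - 1) * TOKENS * 0.30 / 1e6  # 999 calls read the cache
cost_with = cost_write + cost_reads             # ≈ $3.03 vs $30.00 without
```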

Code review app with 50,000-token codebase as context

Without caching

200 reviews x 50,000 tokens x $3/1M = $30.00

With caching

1 cache write: 50,000 x $3.75/1M = $0.19
199 cache reads x 50,000 x $0.30/1M = $2.99
Total: $3.17

89% saving - from $30.00 to $3.17
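Both examples follow the same formula, which generalizes to any prompt size and call count. A sketch (the function name and default Sonnet rates are our own choices):

```python
def caching_savings(tokens: int, calls: int,
                    std: float = 3.00, write: float = 3.75, read: float = 0.30):
    """Return (cost without caching, cost with caching) in USD.

    `tokens` is the size of the cached context; rates are per 1M tokens.
    Assumes the cache stays warm for all `calls - 1` reads.
    """
    cost_without = calls * tokens * std / 1e6
    cost_with = (tokens * write + (calls - 1) * tokens * read) / 1e6
    return cost_without, cost_with
```

Note that for Sonnet the write premium is only $0.75 per 1M tokens while each read saves $2.70 per 1M, so caching pays for itself after a single reuse within the TTL.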

Cache TTL: The 5-Minute Rule

Cache entries have a 5-minute Time To Live (TTL). Each request that uses the cache refreshes the TTL by another 5 minutes. This means:

  • Active apps: As long as requests come in at least once every 5 minutes, the cache stays warm indefinitely.
  • Bursty apps: If you have a gap of more than 5 minutes between requests, the cache expires and the next request pays cache write rate again.
  • Overnight jobs: the Batch API is a better fit than caching - there is no warm cache to maintain across long idle periods.
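The refresh-on-use behavior can be modeled with simple bookkeeping — a toy sketch (class and method names are ours; it makes no API calls):

```python
CACHE_TTL_SECONDS = 5 * 60  # 5-minute TTL, refreshed on every use

class CacheClock:
    """Predicts whether the next request will find a warm cache."""

    def __init__(self):
        self.last_hit = None  # timestamp of the last request that used the cache

    def will_be_warm(self, now: float) -> bool:
        return self.last_hit is not None and (now - self.last_hit) <= CACHE_TTL_SECONDS

    def record(self, now: float):
        self.last_hit = now  # each use refreshes the TTL

clock = CacheClock()
clock.record(0.0)
clock.will_be_warm(240.0)  # 4-minute gap: warm, billed at cache read rate
clock.will_be_warm(360.0)  # 6-minute gap: expired, next request pays write rate
```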

Prompt Caching vs Batch API: Which to Use?

Use prompt caching when...

  • You need real-time responses (<1 second)
  • You have a large static system prompt repeated every request
  • Your app has consistent traffic with short gaps between requests
  • You want to cache shared documents, codebases, or few-shot examples

Use Batch API when...

  • Results within 24 hours are acceptable
  • You are processing large volumes at the lowest possible cost (50% off)
  • You run overnight jobs, bulk analysis, or evaluation jobs
  • Your traffic is bursty with long idle periods (the cache would expire)
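These rules of thumb reduce to a tiny decision function — a toy encoding of the lists above (names are ours, not an Anthropic API):

```python
def choose_strategy(needs_realtime: bool, gap_minutes: float) -> str:
    """Pick a cost strategy from the guidance above."""
    if not needs_realtime:
        return "batch"           # 50% off when a 24-hour turnaround is acceptable
    if gap_minutes <= 5:
        return "prompt_caching"  # traffic keeps the 5-minute cache warm
    return "standard"            # cache would expire between requests
```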

Frequently Asked Questions

Does prompt caching work with all Claude models?
Prompt caching is available for Claude Sonnet 4.6, Claude Opus 4.6, and Claude Haiku 3.5 via the API. The pricing discount is consistent across models (90% off cached reads, 25% premium on cache writes). Caching is an API-only feature; it does not apply to subscription plans, which are not billed per token.
What can be cached in Claude prompt caching?
You can cache system prompts, large context documents, few-shot examples, and any other static or slowly-changing content that repeats across requests. Mark the content with the cache_control parameter in your API request. Images and tool definitions can also be cached. Dynamic content (the user's actual message) is not cached - only the static context around it.
Does the cache persist across different user sessions?
No. The cache is per-API-key and not shared across different users or sessions. Each cache entry has a 5-minute TTL that resets on every request that uses that cache. The cache does not persist if you go 5 minutes without any request hitting it. This means caching is most valuable for applications with continuous active usage rather than infrequent requests.
How do I know if caching is working?
Anthropic's API response includes usage metadata that shows how many tokens were served from cache (cache_read_input_tokens) versus charged at standard rates. Monitor this field in your API responses to verify caching is active and measure your actual savings rate.
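For example, the cache-hit rate can be computed from the usage fields (field names follow Anthropic's documentation; the response dict here is mocked):

```python
# Mocked `usage` block from a Messages API response.
usage = {
    "input_tokens": 42,                 # billed at the standard rate
    "cache_creation_input_tokens": 0,   # billed at the cache write rate
    "cache_read_input_tokens": 10_000,  # billed at the 90%-off cache read rate
}

total_input = (usage["input_tokens"]
               + usage["cache_creation_input_tokens"]
               + usage["cache_read_input_tokens"])
cached_fraction = usage["cache_read_input_tokens"] / total_input  # ≈ 0.996
```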
Is there a minimum size for cacheable content?
Yes. Content must reach a minimum block size to be eligible for caching - 1,024 tokens on most models (Haiku models require 2,048). This means very short system prompts (under about 750-800 words) may not benefit from caching. For most production applications with meaningful system prompts or shared context documents, this threshold is easily exceeded.

Related Pages