HN Flags an Anthropic Cache TTL Regression That Could Raise Claude Code Costs

Original: Anthropic downgraded cache TTL on March 6th

LLM Apr 12, 2026 By Insights AI (HN)

A Hacker News discussion on April 12 centered on GitHub issue #46829, where a Claude Code user argues that Anthropic's prompt-cache behavior changed materially in early March 2026. The key claim is not that the Claude Code client changed locally, but that the service-side default for cache time-to-live (TTL) appears to have shifted from 1 hour to 5 minutes. A shorter TTL means a cached prompt expires sooner between requests and has to be written again, which would make repeated long-context sessions much more expensive.

The issue cites 119,866 API calls collected from Jan. 11 to Apr. 11, 2026 across two machines and two usage patterns. Because Claude Code logs separate counters for ephemeral_5m_input_tokens and ephemeral_1h_input_tokens, the author says the TTL tier is visible in the raw JSONL files. In the data they shared, cache creation from Feb. 1 through Mar. 5 was almost entirely in the 1-hour tier, Mar. 6 was mixed, and by Mar. 8 five-minute cache writes had become dominant.
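The kind of per-day tally described in the issue can be approximated with a short script. This is a minimal sketch, not the issue author's method. The two counter names come from the report. Everything else is an assumption: that each JSONL record is a JSON object with a top-level ISO 8601 `timestamp` string, and that the counters may sit at any nesting depth, which is why the code searches for them recursively. The helper names `daily_tier_totals`, `five_minute_share`, and `print_report` are hypothetical.

```python
import json
from collections import defaultdict
from datetime import datetime
from pathlib import Path

# Counter names as quoted in the GitHub issue.
TIER_KEYS = ("ephemeral_5m_input_tokens", "ephemeral_1h_input_tokens")


def find_tier_counts(node):
    """Recursively sum the two TTL counters wherever they appear in a record."""
    totals = dict.fromkeys(TIER_KEYS, 0)
    if isinstance(node, dict):
        children = []
        for key, value in node.items():
            if key in TIER_KEYS and isinstance(value, (int, float)):
                totals[key] += int(value)
            else:
                children.append(value)
    elif isinstance(node, list):
        children = node
    else:
        children = []
    for child in children:
        found = find_tier_counts(child)
        for key in TIER_KEYS:
            totals[key] += found[key]
    return totals


def daily_tier_totals(lines):
    """Aggregate cache-write tokens per day (date as written in the timestamp)."""
    per_day = defaultdict(lambda: dict.fromkeys(TIER_KEYS, 0))
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partial lines
        if not isinstance(record, dict):
            continue
        stamp = record.get("timestamp")  # ASSUMED field name and location
        if not isinstance(stamp, str):
            continue
        try:
            parsed = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        except ValueError:
            continue
        counts = find_tier_counts(record)
        if any(counts.values()):
            day = parsed.date().isoformat()
            for key in TIER_KEYS:
                per_day[day][key] += counts[key]
    return dict(per_day)


def five_minute_share(counts):
    """Fraction of cache-write tokens that landed in the 5-minute tier."""
    total = sum(counts[key] for key in TIER_KEYS)
    return counts["ephemeral_5m_input_tokens"] / total if total else 0.0


def iter_jsonl_lines(root):
    """Yield every line of every .jsonl file under root."""
    for path in sorted(Path(root).rglob("*.jsonl")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            yield from handle


def print_report(root):
    """Print one row per day: date, 5m tokens, 1h tokens, 5m share."""
    totals = daily_tier_totals(iter_jsonl_lines(root))
    for day in sorted(totals):
        counts = totals[day]
        print(day, counts[TIER_KEYS[0]], counts[TIER_KEYS[1]],
              f"{five_minute_share(counts):.1%}")
```

Calling `print_report` on the directory that holds your session logs (its location depends on your installation) prints the daily split. A 5-minute share that jumps around a single date would be consistent with the pattern the issue describes, though the script may double-count if the same usage block is logged in more than one record.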

What makes the thread important to developers is the operational impact. The report estimates a 20% to 32% increase in cache-creation cost and a noticeable jump in quota burn for subscription users, and it explicitly links the behavior to another quota-exhaustion complaint discussed in the Claude Code issue tracker. That still does not prove Anthropic intentionally changed the default, but it does show how much hidden server-side behavior can surface in tooling logs before a vendor publishes a changelog.

For teams running long coding sessions, the practical takeaway is straightforward: inspect local session JSONL files, compare 5-minute versus 1-hour cache writes, and watch whether prompt reuse patterns changed after March 6, 2026. If the regression is real, batching related work into tighter time windows or reducing unnecessary context churn may matter more than model choice. If Anthropic clarifies the default later, this thread will still stand as a useful example of users reverse-engineering platform economics from developer telemetry.
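To see why tighter time windows matter, the sketch below counts how often a session would have to rewrite its cache under each TTL. It is a deliberately simplified model, not Anthropic's billing logic: it assumes a single unchanged prompt prefix, that any request arriving within the TTL of the previous one is a cache hit that refreshes the expiry, and that any longer gap forces a fresh write. The function name `count_cache_writes` and the example timings are hypothetical.

```python
from datetime import datetime, timedelta


def count_cache_writes(request_times, ttl):
    """Count requests that would (re)create the cache under a given TTL.

    Simplified model: the first request always writes. Each later request
    is a hit if it arrives within `ttl` of the previous request (hits and
    writes both reset the expiry clock); otherwise it is a new write.
    """
    writes = 0
    previous = None
    for moment in sorted(request_times):
        if previous is None or moment - previous > ttl:
            writes += 1
        previous = moment
    return writes


# Hypothetical session: request offsets in minutes from the start.
start = datetime(2026, 3, 8, 9, 0)
offsets_min = [0, 2, 4, 14, 16, 56, 58, 128]
session = [start + timedelta(minutes=m) for m in offsets_min]

writes_5m = count_cache_writes(session, timedelta(minutes=5))
writes_1h = count_cache_writes(session, timedelta(hours=1))
print(writes_5m)  # 4: gaps of 10, 40 and 70 minutes each exceed 5 minutes
print(writes_1h)  # 2: only the 70-minute gap exceeds 1 hour
```

In this toy session the 5-minute TTL doubles the number of cache writes. Real sessions will differ, since edits to the cached context invalidate it regardless of TTL, but the model shows why pauses between 5 and 60 minutes are where the two tiers diverge.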

Sources: Hacker News discussion, GitHub issue #46829, related quota issue #45756.


© 2026 Insights. All rights reserved.