HN spotlights Caveman, a Claude Code plugin that trims tokens with “caveman” responses
Original: A Claude Code skill that makes Claude talk like a caveman, cutting token use View original →
A fast-rising Hacker News thread is centered on the GitHub project Caveman. The branding is playful, but the underlying problem is practical. As Claude Code, Codex, and similar coding agents become part of everyday engineering workflows, overly polite and wordy answers are not just a style annoyance. They also consume money, latency, and precious context window budget. At the time of writing, the HN discussion had reached 399 points and 238 comments.
Caveman is positioned as a lightweight skill/plugin layer rather than a model change. Its README says it can reduce output tokens by roughly 75% by removing filler language and hedging while keeping code blocks and technical terms intact. The before-and-after examples are easy to understand: a long explanation of React re-renders becomes a short note about object references and useMemo, without changing the actual fix.
That is why the HN interest matters. Developers are increasingly treating response formatting itself as an optimization surface. If an agent can preserve the same diagnosis, command, or review guidance while emitting fewer tokens, the upside compounds over long sessions. Less output means lower cost, faster completions, and more room for follow-up context.
The obvious caveat is that compression only helps if accuracy survives. Caveman’s README makes that claim, but the real test is broader use across debugging, design review, and implementation workflows. Even so, the project captures a real shift in AI tooling. People are no longer only asking which model is smartest. They are also asking which interaction style is efficient enough to live inside daily engineering loops.
Related Articles
This was not just another “local models are bad” rant. The thread blew up because it mixed a blunt reality check with a serious counterargument: some of the pain comes from small models, but a lot of it may come from the harness wrapped around them.
HN jumped on the trust problem before the string oddity. A case-sensitive <code>HERMES.md</code> in commit history sent Claude Code requests to extra-usage billing, and the thread zeroed in on how invisible routing rules can burn real money.
HN did not treat this as abstract legal trivia. Once the Claude Code leak became the hook, the thread turned into a practical question for every team shipping AI-assisted software: if the model wrote the bulk of it, what is actually yours?
Comments (0)
No comments yet. Be the first to comment!