HN asked whether AI bug hunting is really just more tokens
Original: AI cybersecurity is not proof of work
“AI cybersecurity is not proof of work” hit HN because it challenged a simple intuition: if more compute helps models search code, maybe the side with more GPUs wins security. The post, submitted on 2026-04-16 10:48:00 UTC, drew more than 230 points and 80 comments because the community wanted to know whether that analogy actually holds.
Antirez’s argument is that hash-style proof of work and bug discovery are not the same kind of search. In proof of work, each hash is an independent trial, so enough attempts will eventually find an input that satisfies the target. In code analysis, repeated LLM runs may explore different branches, but the set of meaningful paths can saturate. At that point the limiting factor is no longer the sample count M but the model’s intelligence level I.
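That contrast can be sketched with a toy probability model (the function names and all the numbers here are illustrative, not taken from the post): proof-of-work odds keep climbing with attempt count, while repeated runs of a fixed-capability model plateau at the fraction of meaningful paths the model can actually reason about.

```python
def pow_success_prob(per_try_hit: float, attempts: int) -> float:
    # Hash-style proof of work: every attempt is an independent trial,
    # so success probability keeps climbing toward 1 as attempts grow.
    return 1 - (1 - per_try_hit) ** attempts

def bug_find_prob(reachable_fraction: float, per_run_hit: float, runs: int) -> float:
    # Toy model of repeated LLM runs: extra samples only resample the
    # fraction of paths the model can understand, so the ceiling is
    # reachable_fraction (set by capability I), not the run count M.
    return reachable_fraction * (1 - (1 - per_run_hit) ** runs)

# Proof of work: 10x the attempts buys meaningfully better odds.
print(pow_success_prob(1e-6, 10**6))   # roughly 0.63
print(pow_success_prob(1e-6, 10**7))   # roughly 0.99995

# Sampling a fixed model: past some point, more runs buy nothing.
print(bug_find_prob(0.3, 0.1, 10))     # roughly 0.195
print(bug_find_prob(0.3, 0.1, 1000))   # roughly 0.3, and it stays there
```

The point of the sketch is the shape of the two curves, not the specific constants: one is unbounded in attempts, the other is capped by capability.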
The essay uses the OpenBSD SACK bug as an example. A weaker model might point at validation or overflow-looking code and appear to get close, but the claim is that it does not necessarily understand how the missing validation, integer overflow, and supposedly non-null branch compose into an exploitable issue. In that view, some “found bugs” are pattern matches that accidentally land near reality, not proof of exploit-level reasoning.
HN did what HN usually does with a strong thesis: it complicated it. Commenters brought up attacker-defender asymmetry, noting that attackers need only one exploitable issue while defenders must find, patch, and deploy fixes broadly. Others argued that more tokens and better models both help, depending on the search surface, much like breadth-first versus depth-first search has no universal winner.
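The search analogy can be made concrete with a toy example (the graphs below are invented for illustration, not from the thread): when the target sits deep along the first branch, depth-first reaches it in fewer node expansions, but when it sits one hop away behind a deep decoy branch, breadth-first wins.

```python
from collections import deque

def bfs_steps(graph, start, goal):
    # Breadth-first: expand level by level; cheap when the goal is shallow.
    seen, queue, steps = {start}, deque([start]), 0
    while queue:
        node = queue.popleft()
        steps += 1
        if node == goal:
            return steps
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

def dfs_steps(graph, start, goal):
    # Depth-first: commit to one branch; cheap when the goal lies deep
    # along the branch tried first.
    seen, stack, steps = set(), [start], 0
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        steps += 1
        if node == goal:
            return steps
        for nxt in reversed(graph.get(node, [])):
            stack.append(nxt)
    return None

# Deep target: a chain hangs off the first child; siblings are dead ends.
deep = {"r": ["c1"] + [f"s{i}" for i in range(9)],
        "c1": ["c2"], "c2": ["c3"], "c3": ["c4"], "c4": ["goal"]}
# Shallow target: the goal is one hop away, behind a deep decoy branch.
shallow = {"r": ["d1", "goal"]}
shallow.update({f"d{i}": [f"d{i+1}"] for i in range(1, 10)})

print(dfs_steps(deep, "r", "goal"), bfs_steps(deep, "r", "goal"))       # 6 15
print(bfs_steps(shallow, "r", "goal"), dfs_steps(shallow, "r", "goal")) # 3 12
```

Neither strategy dominates; which one wins depends entirely on where the target sits, which is the commenters’ point about tokens versus model quality.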
The practical takeaway is that AI security claims need sharper vocabulary. Finding a suspicious code region, explaining a vulnerability chain, producing a reliable exploit, and helping deploy a fix are different achievements. Benchmarks and demos often blur those stages. The community energy around this post came from that blur. As LLM-assisted bug hunting improves, the important question will be less “did the model say bug?” and more “what reasoning and verification chain turned that output into security work?”
Related Articles
In a 1247-point Hacker News thread, AISLE argued that small open-weight models can recover much of Mythos-style exploit analysis when the context is tightly scoped, and the comments pushed back hard on the methodology.
Microsoft announced a $10 billion Japan investment on April 3, 2026 spanning AI infrastructure, cybersecurity, and workforce training. The plan combines in-country GPU access, public-private security partnerships, and AI skilling for more than one million engineers and developers by 2030.
Microsoft says AI is reshaping how the Microsoft Security Response Center discovers, validates, and remediates vulnerabilities. The April 7 post ties that work to Claude Mythos Preview testing, Project Glasswing, and eventual customer access through Microsoft Foundry.