Hacker News Highlights HJB as the Shared Math Behind Continuous RL and Diffusion Models

Original: Hamilton-Jacobi-Bellman Equation: Reinforcement Learning and Diffusion Models

Sciences · Mar 30, 2026 · By Insights AI (HN) · 2 min read

One equation, several modern AI ideas

A March 2026 Hacker News submission on Daniel López Montero’s HJB explainer reached 120 points and 33 comments at crawl time. The article is not a product launch or benchmark thread. It is a mathematical reframing of several modern AI topics around one object: the Hamilton-Jacobi-Bellman equation, or HJB.

The argument starts from Richard Bellman’s 1950s work on dynamic programming. In discrete time, the Bellman equation expresses the value of an action as immediate reward plus continuation value. When the time step shrinks toward zero, the optimization problem turns into a partial differential equation. That PDE is the HJB equation, which Bellman later recognized as structurally identical to the older Hamilton-Jacobi equation from classical mechanics.
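The limit the article describes can be written out. A minimal sketch, under standard assumptions (deterministic dynamics $\dot{x} = f(x,u)$, running reward $r$, discount rate $\rho$ — symbols ours, not the article's):

```latex
% Discrete-time Bellman equation with step size \Delta t:
V(x) = \max_u \left[\, r(x,u)\,\Delta t
       + e^{-\rho \Delta t}\, V\!\big(x + f(x,u)\,\Delta t\big) \right]

% Expand e^{-\rho \Delta t} \approx 1 - \rho \Delta t and
% V(x + f\,\Delta t) \approx V(x) + \nabla V(x) \cdot f(x,u)\,\Delta t,
% subtract V(x), divide by \Delta t, and let \Delta t \to 0:
\rho\, V(x) = \max_u \left[\, r(x,u) + \nabla V(x) \cdot f(x,u) \right]
```

The second line is the (stationary, deterministic) HJB equation: the optimization inside the max is what ties the value function to the dynamics.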

Why the control view matters

The post uses that bridge to connect topics that are often taught separately:

  • continuous-time reinforcement learning as optimal control
  • stochastic control formulations with noise and finite-horizon objectives
  • diffusion models interpreted as control problems rather than only sampling recipes
  • connections to optimal transport and Schrödinger-bridge formulations

That matters because it gives practitioners a cleaner conceptual map. Instead of treating RL, diffusion, and certain transport problems as unrelated subfields with different jargon, the article shows that they share a common optimization backbone. For technical readers, that can change how they think about objectives, state dynamics, and what a model is really optimizing over time.
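To make that shared backbone concrete, here is a self-contained sketch (our construction; the article contains no code). It takes a toy control problem — dynamics dx/dt = u, cost x² + u², discount rate ρ, all chosen by us — discretizes the HJB equation into a Bellman backup, runs value iteration on a grid, and checks the result against the closed-form quadratic value function the HJB gives for this problem.

```python
import numpy as np

# Toy problem (our choice): dynamics dx/dt = u, running cost x^2 + u^2,
# discount rate rho.  The HJB equation reads
#   rho * V(x) = min_u [ x^2 + u^2 + u * V'(x) ].
# Discretizing time with step dt turns it back into a Bellman backup
# that plain value iteration can solve on a grid.

dt, rho = 0.05, 0.1
gamma = np.exp(-rho * dt)                 # one-step discount factor
xs = np.linspace(-2.0, 2.0, 161)          # state grid
us = np.linspace(-3.0, 3.0, 121)          # control grid
V = np.zeros_like(xs)

# Precompute next states and one-step costs for every (state, control) pair.
x_next = np.clip(xs[:, None] + us[None, :] * dt, xs[0], xs[-1])
cost = (xs[:, None] ** 2 + us[None, :] ** 2) * dt

for _ in range(10000):
    V_next = np.interp(x_next, xs, V)     # V at next states, by interpolation
    V_new = (cost + gamma * V_next).min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

# For this linear-quadratic problem the HJB has the closed form V(x) = p*x^2
# with rho*p = 1 - p^2, so the grid solution can be checked analytically.
p = (-rho + np.sqrt(rho ** 2 + 4)) / 2
print(V[np.argmin(np.abs(xs - 1.0))], p)  # numeric vs analytic value at x = 1
```

The same backup — cost now plus discounted value next — is the object that reappears, in different clothing, in RL, stochastic control, and the control reading of diffusion models.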

From theory to implementation

The explainer is also practical enough to matter beyond pure math. It discusses how continuous-time control leads into neural policy iteration and how the value-function viewpoint gives intuition for modern generative modeling. That is useful because many AI engineers interact with diffusion systems and sequential decision problems at the implementation layer without seeing the common mathematics underneath.
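The controlled-dynamics reading of generative sampling is easiest to see in its simplest instance. The following hedged sketch (ours, not the article's) runs Langevin dynamics, where the score ∇log p acts as a drift steering samples toward a target density; a Gaussian with a closed-form score stands in for the score network a diffusion model would learn.

```python
import numpy as np

# Hedged illustration: Langevin dynamics
#   dx = grad log p(x) dt + sqrt(2) dW
# drifts samples toward a target density p.  Here p = N(mu, sigma^2),
# whose score -(x - mu) / sigma^2 is known in closed form and plays the
# role of the learned score in a diffusion model.

rng = np.random.default_rng(0)
mu, sigma = 1.5, 0.5
dt, steps, n = 2e-3, 5000, 2000

x = 3.0 * rng.standard_normal(n)          # start far from the target density
for _ in range(steps):
    score = -(x - mu) / sigma ** 2        # drift: gradient of the log-density
    x = x + score * dt + np.sqrt(2.0 * dt) * rng.standard_normal(n)

print(x.mean(), x.std())                  # approaches mu = 1.5, sigma = 0.5
```

Viewed through the HJB lens, the drift is the "control" and the log-density plays a value-function-like role; the article's point is that this is not a coincidence but the same optimization structure.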

The broader signal from the Hacker News response is that readers still want rigorous connective tissue, not only new model announcements. As AI systems get more agentic and more sequential, control-theory language is becoming harder to ignore. The HJB lens does not replace empirical work, but it does offer a more coherent framework for understanding why certain classes of training and inference procedures behave the way they do.

Primary source: Daniel López Montero’s article. Community discussion: Hacker News.




© 2026 Insights. All rights reserved.