Hacker News Highlights a Continuous-Time Route from RL to Diffusion Models
Original: Hamilton-Jacobi-Bellman Equation: Reinforcement Learning and Diffusion Models
A Hacker News discussion on March 30, 2026 boosted visibility for Daniel López Montero’s March 28 essay on the Hamilton-Jacobi-Bellman (HJB) equation, the partial differential equation that extends Bellman’s optimality principle to continuous time and, by extension, underpins a large part of reinforcement learning. The post argues that continuous-time control is not just historical background; it provides a useful lens for understanding how modern AI systems are trained and optimized. That framing stood out in a feed that often concentrates on products and launches rather than on mathematical structure.
The essay starts from Bellman’s discrete-time dynamic programming and then shows what changes when the time step shrinks toward zero. In that limit, the Bellman equation becomes the HJB partial differential equation. From there, the author moves into controlled diffusions, Itô processes, and the infinitesimal generator that governs state evolution under noise. For readers who mostly encounter reinforcement learning through Markov decision processes and policy gradients, the piece offers a more structural explanation of why these methods exist in the first place.
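For readers who want the statement itself, here is the limit in one standard formulation; the notation (running cost c, discount rate ρ, drift b, diffusion σ) is ours and not necessarily the essay’s. Expanding the discounted Bellman backup with Itô’s formula and sending the time step to zero replaces a fixed-point equation over steps with a partial differential equation:

```latex
% Infinite-horizon, discounted control problem; V is the optimal cost-to-go.
\begin{align*}
  \text{Discrete Bellman:}\quad
    V(x) &= \min_{a}\Big\{ c(x,a)\,\Delta t
            + e^{-\rho\,\Delta t}\,\mathbb{E}\big[\,V(X_{\Delta t}) \mid X_0 = x\,\big] \Big\} \\
  \text{Controlled diffusion:}\quad
    dX_t &= b(X_t,a_t)\,dt + \sigma(X_t,a_t)\,dW_t \\
  \text{Generator:}\quad
    (\mathcal{L}^{a}V)(x) &= b(x,a)\cdot\nabla V(x)
            + \tfrac{1}{2}\,\mathrm{Tr}\big(\sigma\sigma^{\top}(x,a)\,\nabla^{2}V(x)\big) \\
  \text{HJB as } \Delta t \to 0:\quad
    \rho\,V(x) &= \min_{a}\big\{ c(x,a) + (\mathcal{L}^{a}V)(x) \big\}
\end{align*}
```

The minimization over actions survives the limit; what disappears is the explicit expectation over a single noisy step, which collapses into the second-order generator term.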
The most interesting bridge is the one to diffusion models. Rather than treating generative diffusion as a separate toolkit, the article frames it as another problem in stochastic optimal control. That perspective connects sampling, denoising, and control-theoretic objectives, and it helps explain why tools from PDEs, policy iteration, and Monte Carlo evaluation keep reappearing in generative modeling research. The post also grounds the theory in recognizable control settings with concrete examples such as the stochastic linear-quadratic regulator (LQR) and the Merton portfolio problem.
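The control reading also invites simple numerical experiments. Below is a minimal, self-contained sketch (our illustration, not code from the essay; the constants and the helper name policy_cost are made up) of the Monte Carlo evaluation step the post alludes to, applied to a one-dimensional stochastic LQR problem: simulate the controlled SDE under a fixed linear feedback gain and estimate its expected quadratic cost.

```python
# A minimal sketch, not from the essay: Monte Carlo evaluation of a linear
# feedback policy u = -k * x for a 1-D stochastic LQR problem
#   dX_t = (a X_t + b u_t) dt + sigma dW_t,
#   J(k) = E[ integral_0^T (q X_t^2 + r u_t^2) dt + q_T X_T^2 ].
# All constants below are made-up illustration values.
import numpy as np

a, b, sigma = 0.5, 1.0, 0.3   # drift, control gain, noise level
q, r, q_T   = 1.0, 0.1, 1.0   # running and terminal cost weights
T, dt       = 2.0, 1e-2       # horizon and Euler-Maruyama step
n_paths     = 10_000

def policy_cost(k: float, seed: int = 0) -> float:
    """Estimate J(k) by averaging n_paths Euler-Maruyama trajectories."""
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    x = np.ones(n_paths)          # all paths start at x0 = 1
    cost = np.zeros(n_paths)
    for _ in range(n_steps):
        u = -k * x                                   # linear feedback policy
        cost += (q * x**2 + r * u**2) * dt           # accumulate running cost
        dw = rng.normal(scale=np.sqrt(dt), size=n_paths)
        x += (a * x + b * u) * dt + sigma * dw       # Euler-Maruyama update
    cost += q_T * x**2                               # terminal cost
    return cost.mean()

for k in (0.0, 1.0, 3.0, 6.0):
    print(f"k = {k:4.1f}  ->  estimated cost J(k) ~ {policy_cost(k):.3f}")
```

Sweeping the gain k is the evaluation half of policy iteration; the exact optimal gain for this problem comes from a Riccati equation, which the sketch deliberately sidesteps.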
Why did this resonate on Hacker News? Because it pushes back against the idea that current AI progress is only about bigger models and more compute. The essay makes a case that old mathematics still structures new systems, and that understanding those foundations can improve how researchers reason about both reinforcement learning and generative models. For engineers, it is a useful reminder that the gap between theory and practice is often smaller than the tooling stack makes it look.
- Original source: Daniel López Montero’s March 28, 2026 essay
- Core theme: HJB links optimal control, continuous-time RL, and diffusion models
- Main takeaway: classical mathematics still explains much of modern AI behavior