Show HN: Timber Compiles Classical ML Models into Tiny C Binaries for Microsecond Inference
Original title: Show HN: Timber – Ollama for classical ML models, 336x faster than Python
What Timber is proposing
A March 2026 Show HN post introduced Timber, an open-source compiler focused on classical machine learning models rather than LLM inference. The project claims it can take trained models from XGBoost, LightGBM, scikit-learn, and CatBoost, as well as models expressed as ONNX tree operators, and emit a self-contained C99 inference artifact with no runtime dependency on Python. The default serving path includes an Ollama-compatible HTTP API, so teams can expose models via familiar endpoints without building a custom serving layer.
The README positions Timber for low-latency, deterministic environments such as fraud detection, risk scoring, and edge deployments. The maintainers emphasize portability and small artifacts, including an example compiled binary around 48 KB for a sample model.
Compiler pipeline and API surface
Timber’s documented pipeline is parse → IR construction → optimization → C99 emission → native compilation. Listed optimization passes include dead-leaf elimination, threshold quantization, constant-feature folding, and branch sorting. The serving interface then wraps the compiled model behind endpoints such as /api/predict, /api/models, and /api/health.
That architecture is interesting for teams that already maintain tree-based models and want to remove Python from hot inference paths, especially where cold-start behavior and deployment size matter.
Performance claims and interpretation
The project reports around 2 microseconds single-sample latency and roughly 336x speedup versus a Python XGBoost baseline in its benchmark setup (Apple M2 Pro, 50-tree classifier scenario). It also includes comparison numbers against ONNX Runtime and Treelite. These are project-authored benchmarks, so production teams should treat them as directional and reproduce results under their own feature engineering and transport stack.
HN discussion themes
The Hacker News thread had reached 199 points with 33 comments at the time of writing. Discussion centered on practical fit: some readers welcomed renewed focus on "classical ML" infrastructure, while others questioned whether inference speedups remain the dominant bottleneck when data transformation pipelines are still Python-heavy. That debate is the core adoption question for Timber: if your bottleneck is feature preparation, compiler-level inference gains may be secondary; if your bottleneck is repeated scoring in latency-critical paths, the approach can be compelling.