Show HN: Timber Compiles Classical ML Models into Tiny C Binaries for Microsecond Inference

Original: Show HN: Timber – Ollama for classical ML models, 336x faster than Python

By Insights AI (HN) · Mar 4, 2026 · 2 min read

What Timber is proposing

A March 2026 Show HN post introduced Timber, an open-source compiler focused on classical machine learning models rather than LLM inference. The project claims it can take trained models from XGBoost, LightGBM, scikit-learn, CatBoost, and ONNX tree operators, then emit a self-contained C99 inference artifact with no runtime dependency on Python. The default serving path includes an Ollama-compatible HTTP API so teams can expose models via familiar endpoints without building a custom serving layer.
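To make the claim concrete, here is a minimal sketch of the kind of artifact a tree-model compiler might emit. Everything here is invented for illustration: the symbol name `timber_predict`, the feature indices, thresholds, and leaf values are assumptions, not Timber's actual output format.

```c
#include <stddef.h>

/* Hypothetical emitted code: each tree in the ensemble becomes a small
   if/else chain over the feature vector, with thresholds baked in as
   constants. No allocation, no runtime, no Python. */
static float tree_0(const float *f) {
    if (f[2] < 0.5f) {
        if (f[0] < 1.25f) return -0.31f;
        return 0.12f;
    }
    return 0.44f;
}

static float tree_1(const float *f) {
    if (f[1] < 3.0f) return -0.05f;
    return 0.27f;
}

/* Ensemble score: sum of per-tree leaf values (a raw margin; a real
   classifier would typically apply a sigmoid on top). */
float timber_predict(const float *features) {
    return tree_0(features) + tree_1(features);
}
```

Because the whole model is plain branching code, the compiler and linker can inline, reorder, and dead-strip it like any other C, which is what makes very small binaries plausible.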

The README positions Timber for low-latency, deterministic environments such as fraud detection, risk scoring, and edge deployments. The maintainers emphasize portability and small artifacts, including an example compiled binary around 48 KB for a sample model.

Compiler pipeline and API surface

Timber’s documented pipeline is parse → IR construction → optimization → C99 emission → native compilation. Listed optimization passes include dead-leaf elimination, threshold quantization, constant-feature folding, and branch sorting. The serving interface then wraps the compiled model behind endpoints such as /api/predict, /api/models, and /api/health.

That architecture is interesting for teams that already maintain tree-based models and want to remove Python from hot inference paths, especially where cold-start behavior and deployment size matter.

Performance claims and interpretation

The project reports around 2 microseconds single-sample latency and roughly 336x speedup versus a Python XGBoost baseline in its benchmark setup (Apple M2 Pro, 50-tree classifier scenario). It also includes comparison numbers against ONNX Runtime and Treelite. These are project-authored benchmarks, so production teams should treat them as directional and reproduce results under their own feature engineering and transport stack.

HN discussion themes

The Hacker News thread reached 199 points with 33 comments at crawl time. Discussion centered on practical fit: some readers welcomed renewed focus on “classical ML” infrastructure, while others questioned whether inference speedups remain the dominant bottleneck when data transformation pipelines are still Python-heavy. That debate is the core adoption question for Timber: if your bottleneck is feature prep, compiler-level inference gains may be secondary; if your bottleneck is repeated scoring in latency-critical paths, the approach can be compelling.

Timber repository · Hacker News discussion

© 2026 Insights. All rights reserved.