Hacker News Tracks tinybox as Offline AI Hardware Moves Into 120B-Class Territory

Original: Tinybox – Offline AI device 120B parameters

LLM · Mar 22, 2026 · By Insights AI (HN) · 2 min read

The March 21, 2026 Hacker News submission titled "Tinybox – Offline AI device 120B parameters" had 279 points and 163 comments when checked on March 22, 2026. The post linked to tinygrad's tinybox page, which pitches a compact system for deep learning training and inference rather than yet another generic workstation-versus-cloud comparison. That distinction matters because the community is increasingly looking for practical on-prem boxes that can host larger LLM workloads without committing everything to remote infrastructure.

tinygrad currently highlights two shipping configurations. Tinybox Red V2 uses 4x 9070 XT GPUs, advertises 778 TFLOPS of FP16 throughput, and is listed at $12,000. Tinybox Green V2 moves to 4x RTX PRO 6000 Blackwell GPUs, 3,086 TFLOPS FP16 throughput, and a $65,000 price tag. The company also says the broader tinybox line was benchmarked in MLPerf Training 4.0 against systems costing roughly 10x more, framing the machine as a performance-per-dollar play rather than a boutique showcase.

  • Red V2: 4x 9070 XT, FP16 778 TFLOPS, $12,000
  • Green V2: 4x RTX PRO 6000 Blackwell, FP16 3,086 TFLOPS, $65,000
  • tinygrad's framing: a machine designed first for deep learning, then reused for inference
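Taking the product-page numbers at face value, the performance-per-dollar framing can be checked with a quick back-of-the-envelope calculation (the figures below are the advertised specs quoted above, not independent benchmarks):

```python
# FP16 throughput per dollar, using the advertised tinybox specs.
configs = {
    "Red V2 (4x 9070 XT)": {"fp16_tflops": 778, "price_usd": 12_000},
    "Green V2 (4x RTX PRO 6000 Blackwell)": {"fp16_tflops": 3_086, "price_usd": 65_000},
}

for name, c in configs.items():
    # Convert TFLOPS to GFLOPS so the per-dollar number is readable.
    gflops_per_dollar = c["fp16_tflops"] * 1_000 / c["price_usd"]
    print(f"{name}: ~{gflops_per_dollar:.1f} GFLOPS per dollar")
```

On paper, the cheaper Red V2 comes out ahead per dollar (~64.8 vs. ~47.5 GFLOPS/$), while the Green V2 buys absolute throughput; real workloads depend on memory bandwidth and software support as much as peak FLOPS.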

The community interest is easy to explain. Teams building local copilots, retrieval systems, and agent workflows want more control over privacy, bandwidth costs, and predictable capacity. A ready-to-buy appliance sits between DIY multi-GPU rigs and hyperscaler contracts, lowering the barrier for smaller companies that want enough VRAM and bandwidth to experiment with 70B- to 120B-class models on site.
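Why 120B-class models demand an appliance like this comes down to simple arithmetic on weight memory. A minimal sketch (the helper function is ours; it counts weights only, ignoring KV cache, activations, and runtime overhead, and uses the standard per-parameter sizes for each quantization level):

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal GB.

    Ignores KV cache, activations, and framework overhead, which add
    a meaningful margin on top in practice.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# A 120B-parameter model at common precisions:
for bits in (16, 8, 4):
    print(f"120B at {bits}-bit: ~{weight_memory_gb(120, bits):.0f} GB of weights")
```

At FP16 a 120B model needs roughly 240 GB for weights alone, so multi-GPU aggregate VRAM or aggressive quantization (8-bit ≈ 120 GB, 4-bit ≈ 60 GB) is what makes on-site experimentation plausible at all.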

The open question is whether the real user experience matches the product page. Thermals, software tooling, serviceability, and sustained inference behavior matter as much as raw FLOPS. Even so, the HN thread captured a real shift: local AI hardware is no longer a fringe hobby build category. It is becoming a defined product segment with serious buyers and clearer expectations.




© 2026 Insights. All rights reserved.