Reverse Engineered Apple Neural Engine to Train Microgpt
Original: Reverse engineered Apple Neural Engine (ANE) to train Microgpt
Why the Apple Neural Engine?
Apple claims 38 TOPS of INT8 compute for the Neural Engine (ANE) in its M4 chip. Because the ANE is an FP16 processor, the usable figure is roughly half that. Despite this capability, Apple provides no public API for direct ANE access. CoreML is the officially recommended path, but it abstracts the hardware away rather than exposing it directly.
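The halving described above is simple arithmetic. A minimal Python sketch, assuming (as the text does) that FP16 throughput is half the claimed INT8 rate; the function name and the 2:1 ratio parameter are illustrative, not measured values:

```python
def fp16_peak_estimate(int8_claimed_tera_ops: float,
                       int8_to_fp16_ratio: float = 2.0) -> float:
    """Estimate peak FP16 tera-operations per second from a claimed INT8 figure.

    Assumption: FP16 runs at 1/ratio of the INT8 rate. The default of 2.0
    mirrors the "roughly half" statement in the text; it is not a benchmark.
    """
    if int8_to_fp16_ratio <= 0:
        raise ValueError("ratio must be positive")
    return int8_claimed_tera_ops / int8_to_fp16_ratio


print(fp16_peak_estimate(38.0))  # 19.0
```

This is a theoretical ceiling only; sustained throughput on real workloads would be lower.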
The developer, wanting to make full use of the compute in their Mac Mini M4, used Claude to reverse engineer the ANE's private APIs and bypass CoreML to reach the hardware directly. The post earned 457 upvotes on r/LocalLLaMA.
The Reverse Engineering Process
Using Claude as an engineering partner, the developer analyzed Apple's private ANE APIs, ran benchmarks that bypass CoreML, and built a bespoke training pipeline. The result: a 110M-parameter Microgpt model, trained successfully and running entirely on the ANE.
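To get a feel for what training a 110M-parameter model on one chip involves, a back-of-the-envelope estimate can help. The sketch below uses the common rule of thumb of about 6 FLOPs per parameter per training token. The token count, the 19 TFLOPS peak (half of the claimed 38), and the utilization figure are all assumptions for illustration; the post does not report them.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute: about 6 FLOPs per parameter per token
    (forward plus backward pass), a widely used rule of thumb."""
    return 6.0 * n_params * n_tokens


def training_hours(n_params: float, n_tokens: float,
                   peak_tflops: float, utilization: float) -> float:
    """Wall-clock hours at a sustained fraction of peak throughput."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    sustained_flops_per_s = peak_tflops * 1e12 * utilization
    return training_flops(n_params, n_tokens) / sustained_flops_per_s / 3600.0


# Every input except the 110M parameter count is hypothetical.
hours = training_hours(
    n_params=110e6,    # from the post
    n_tokens=2.2e9,    # assumed: roughly 20 tokens per parameter
    peak_tflops=19.0,  # assumed: half of the claimed 38
    utilization=0.3,   # assumed: fraction of peak actually sustained
)
print(f"{hours:.1f} hours")
```

Changing `n_tokens` or `utilization` moves the result proportionally, so the output is best read as an order-of-magnitude guide rather than a prediction.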
Results and Limitations
- Success: Completed training a 110M Microgpt model on a single M4 ANE
- Limitation: A single chip is not practical for training larger models
- Future potential: A cluster of ANE-equipped Apple Silicon devices could theoretically train larger models; even a single device should handle LoRA fine-tuning for 3B/7B models
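The LoRA point in the last bullet comes down to parameter counts: LoRA freezes each weight matrix and trains two low-rank factors instead, so a d_out x d_in matrix contributes only r * (d_in + d_out) trainable parameters. A rough sketch, assuming a hypothetical 7B-class model with hidden size 4096 and 32 layers, rank 8, and adapters on two square projection matrices per layer (none of these settings come from the post):

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one frozen (d_out x d_in) weight:
    factor A is (rank x d_in), factor B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical configuration, chosen only for illustration.
HIDDEN = 4096          # assumed hidden size
LAYERS = 32            # assumed number of transformer layers
ADAPTED_PER_LAYER = 2  # assumed: two square projections adapted per layer
RANK = 8               # assumed LoRA rank

trainable = LAYERS * ADAPTED_PER_LAYER * lora_params(HIDDEN, HIDDEN, RANK)
print(f"{trainable:,} trainable parameters")  # 4,194,304 trainable parameters
print(f"{trainable / 7e9:.4%} of a 7B-parameter model")
```

Under these assumptions the optimizer only has to hold gradients and state for a few million parameters, though the frozen base weights still have to fit in memory for the forward and backward passes.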
Why NPU Training Matters
NPUs are designed to offer much better power efficiency than GPUs on matrix multiplication workloads, and Apple Silicon ANEs are credited with far more operations per watt than discrete GPUs. This project demonstrates a potential path toward democratizing AI training: using the NPU in MacBooks and Mac Minis rather than expensive NVIDIA hardware. It also highlights Claude's utility as a reverse engineering assistant for systems-level work.