Show HN: Off Grid Bundles Text, Vision, Image, and Voice AI Fully Offline on Mobile

Original: Show HN: Off Grid – Run AI text, image gen, vision offline on your phone

LLM · Feb 16, 2026 · By Insights AI (HN) · 1 min read

What launched on Hacker News

A Show HN post highlighted Off Grid, a mobile AI app designed to run end-to-end on local hardware. At crawl time, the thread reached 119 points and 64 comments, indicating strong community interest in practical offline AI. The project is published on GitHub under an MIT license and positions itself as a full on-device AI suite rather than a single chatbot utility.

Feature scope beyond text chat

According to the repository README, Off Grid combines multiple workloads in one app: text generation, image generation, vision analysis, speech-to-text, and document-aware chat. The text stack lists models such as Qwen 3, Llama 3.2, Gemma 3, and Phi-4, and also accepts user-supplied .gguf models. On the imaging side, the project claims on-device Stable Diffusion with model options including DreamShaper and Anything V5.
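Off Grid's own runtime isn't excerpted in the README summary above, but the general pattern of loading a user-supplied .gguf file for fully local text generation is easy to illustrate. Here is a minimal sketch using the llama-cpp-python bindings on a desktop; the model filename is a placeholder, not a file shipped with Off Grid:

```python
# Minimal sketch: fully local text generation from a user-supplied .gguf
# file via llama-cpp-python (pip install llama-cpp-python). No network
# access is needed at inference time. The model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen3-4b-q4_k_m.gguf",  # any local GGUF file
    n_ctx=4096,      # context window
    n_threads=4,     # CPU threads; tune per device
    verbose=False,
)

out = llm(
    "Summarize the benefits of on-device inference in one sentence.",
    max_tokens=64,
    temperature=0.7,
)
print(out["choices"][0]["text"].strip())
```

The same llama.cpp engine underlies most mobile GGUF runtimes, which is what makes "bring your own .gguf" a portable feature across apps like this.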

Performance and hardware claims

The maintainer reports headline performance targets of 15-30 tok/s for text generation on flagship devices and 5-15 tok/s on mid-range devices. For image generation, the README describes NPU acceleration on Snapdragon hardware at roughly 5-10 seconds per image on supported phones, with slower CPU fallback paths. Vision inference is reported at around 7 seconds on flagship devices. The repository says testing covered Snapdragon 8 Gen 2/3 and Apple A17 Pro-class hardware, with results dependent on model size and quantization.
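Throughput claims like these are straightforward to sanity-check on any machine that can run a GGUF model. A minimal measurement sketch, again using llama-cpp-python; the model path and prompt are placeholders, and results vary widely with hardware, quantization level, and thread count:

```python
# Sketch: measure decode throughput (tok/s) for a local GGUF model.
# Model path and prompt are placeholders; numbers depend heavily on
# hardware, quantization level, and thread count.
import time

from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-3.2-3b-q4_k_m.gguf",  # placeholder path
    n_ctx=2048,
    verbose=False,
)

start = time.perf_counter()
out = llm("Explain quantization in two sentences.", max_tokens=128)
elapsed = time.perf_counter() - start

generated = out["usage"]["completion_tokens"]  # tokens actually produced
# Note: elapsed includes prompt processing, so this slightly understates
# pure decode speed.
print(f"{generated} tokens in {elapsed:.2f}s -> {generated / elapsed:.1f} tok/s")
```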

Why this matters

The key architectural point is privacy and offline resilience: the project explicitly states that prompts, audio, and documents remain on-device. If the claims hold across real-world usage, this approach could be useful for teams and users who need lower cloud dependency, predictable latency without network round trips, or stricter data-locality constraints. The tradeoff is that mobile-class compute still imposes model-size and throughput limits, so workload selection and quantization strategy remain central to user experience.
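To see why quantization strategy looms so large on mobile, a back-of-envelope weight-footprint calculation helps. The parameter counts and effective bits-per-weight below are illustrative assumptions, not figures from the Off Grid README:

```python
# Back-of-envelope weight footprint: params * bits_per_weight / 8 bytes.
# Parameter counts and effective bit widths are illustrative assumptions;
# K-quants carry per-block scales, so Q4_K_M lands near 4.5 bits/weight.
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for name, params in [("3B model", 3.0), ("8B model", 8.0)]:
    for scheme, bits in [("FP16", 16.0), ("Q8_0", 8.5), ("Q4_K_M", 4.5)]:
        print(f"{name:8s} @ {scheme:6s} ~ {weights_gb(params, bits):4.1f} GB")
```

At roughly 1.7 GB for a 3B model and 4.5 GB for an 8B model at Q4_K_M, versus 6 GB and 16 GB at FP16, the arithmetic helps explain why quantized 3B-8B models are the practical range on phones with 8-16 GB of RAM.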
