LocalLLaMA Showcases PokeClaw, a Fully On-Device Gemma 4 Agent for Android
Original: [PokeClaw] First working app that uses Gemma 4 to autonomously control an Android phone. Fully on-device, no cloud.
A LocalLLaMA post highlighted PokeClaw, an Android prototype built to answer a simple question: can Gemma 4 actually control a phone locally instead of sending every action to the cloud? The project's answer is yes, at least in prototype form. The Reddit post and README describe a closed on-device loop in which the model interprets the screen, chooses a tool, observes the result, and keeps going until the task is finished.
The interesting part is the tool surface. PokeClaw gives the model actions such as tap, swipe, long press, input text, open an app, send a message, take a screenshot, inspect screen information, and finish a task. It also includes an auto-reply flow for messages. Under the hood, the app runs inference through LiteRT-LM with native tool calling, so the control loop stays on the device rather than bouncing through a remote browser or hosted agent runtime.
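To make the observe, choose, act cycle concrete, here is a minimal Python sketch of such a loop. It is not PokeClaw's code and does not use LiteRT-LM: `FakeDevice`, `ToolCall`, `run_agent`, and `scripted_policy` are hypothetical names invented for illustration, the tool set is trimmed to two actions, and a scripted function stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    """A tool name plus keyword arguments, as a model might emit them."""
    name: str
    args: Dict[str, object] = field(default_factory=dict)


class FakeDevice:
    """Hypothetical stand-in for a phone: records actions instead of performing them."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.text_field = ""

    def screen_info(self) -> str:
        # A real agent would read the UI tree or a screenshot here.
        return f"text_field={self.text_field!r}"

    def tap(self, x: int, y: int) -> str:
        self.log.append(f"tap({x},{y})")
        return "tapped"

    def input_text(self, text: str) -> str:
        self.text_field = text
        self.log.append(f"input_text({text!r})")
        return "typed"


def run_agent(
    task: str,
    device: FakeDevice,
    choose_tool: Callable[[str, str], ToolCall],
    max_steps: int = 10,
) -> str:
    """Loop: observe the screen, let the policy pick a tool, run it, repeat."""
    tools = {"tap": device.tap, "input_text": device.input_text}
    observation = device.screen_info()
    for _ in range(max_steps):
        call = choose_tool(task, observation)
        if call.name == "finish":
            return str(call.args.get("summary", "done"))
        handler = tools.get(call.name)
        if handler is None:
            # Feed the error back so the policy can pick a different tool.
            observation = f"error: unknown tool {call.name}"
            continue
        result = handler(**call.args)
        observation = f"{result}; {device.screen_info()}"
    return "stopped: step limit reached"


def scripted_policy(task: str, observation: str) -> ToolCall:
    """Hypothetical stand-in for the model: tap the field, type, then finish."""
    if "text_field=''" in observation:
        if "tapped" not in observation:
            return ToolCall("tap", {"x": 100, "y": 200})
        return ToolCall("input_text", {"text": "hello"})
    return ToolCall("finish", {"summary": "typed hello"})
```

Calling `run_agent("type hello", FakeDevice(), scripted_policy)` walks through a tap, a text input, and a finish call. The step limit is a design choice in this sketch rather than something the source describes; it keeps a confused policy from looping forever.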
The README does not oversell the current state. It repeatedly calls the app a two-day open-source prototype and warns that it is rough around the edges. Hardware still matters: Android 9+ and arm64 are required, 8 GB RAM is the minimum, 12 GB+ is recommended, and the first model download is about 2.6 GB. On a low-end phone doing CPU-only inference, warmup can take around 45 seconds; stronger Tensor, Snapdragon, or Dimensity devices cut that down significantly.
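The stated requirements can be written down as a simple pre-flight check. The sketch below is an assumption-laden illustration, not anything from the PokeClaw repo: `DeviceSpec` and `check_device` are hypothetical names, Android 9 is mapped to API level 28, arm64 to the `arm64-v8a` ABI, and the free-storage check against the roughly 2.6 GB download is an added assumption.

```python
from dataclasses import dataclass
from typing import List

MIN_API_LEVEL = 28         # Android 9 corresponds to API level 28
REQUIRED_ABI = "arm64-v8a" # Android's ABI name for arm64
MIN_RAM_GB = 8
RECOMMENDED_RAM_GB = 12
MODEL_DOWNLOAD_GB = 2.6    # approximate size of the first model download


@dataclass
class DeviceSpec:
    """Hypothetical description of a phone, filled in by the caller."""
    api_level: int
    abi: str
    ram_gb: float
    free_storage_gb: float


def check_device(spec: DeviceSpec) -> List[str]:
    """Return a list of blocking problems; an empty list means the device qualifies."""
    problems: List[str] = []
    if spec.api_level < MIN_API_LEVEL:
        problems.append("Android 9 or newer is required")
    if spec.abi != REQUIRED_ABI:
        problems.append("an arm64 device is required")
    if spec.ram_gb < MIN_RAM_GB:
        problems.append("at least 8 GB of RAM is required")
    # Assumption: the model needs at least its download size in free storage.
    if spec.free_storage_gb < MODEL_DOWNLOAD_GB:
        problems.append("not enough free storage for the model download")
    return problems


def meets_recommendation(spec: DeviceSpec) -> bool:
    """True when the device qualifies and also has the recommended 12 GB+ of RAM."""
    return not check_device(spec) and spec.ram_gb >= RECOMMENDED_RAM_GB
```

A phone that passes `check_device` but has less than 12 GB of RAM still qualifies under the minimum, only without the recommended headroom.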
That combination of limitations and ambition is why the LocalLLaMA thread landed. PokeClaw is not claiming agent perfection. It is showing that a 2.3B-class on-device model can already navigate apps, fill inputs, and automate messaging flows on commodity phones with no API key and no recurring cloud bill. For the local AI community, that is a meaningful shift from demo chatbots toward embodied mobile automation.
The original Reddit discussion is at r/LocalLLaMA, and the implementation details are in the PokeClaw GitHub repo. Even in its unfinished state, the project is a strong signal that Gemma 4's tool-calling stack is already pushing on-device agents beyond the desktop.