Google expands on-device agentic workflows with Gemma 4

On April 2, 2026, the Google AI Edge Team said it is positioning Gemma 4 as a production-ready foundation for the on-device agent stack. The Gemma 4 family, released under the Apache 2.0 license, is pitched on multi-step planning, autonomous action, offline code generation, audio-visual processing, and support for more than 140 languages. Google's message is clear: developers should be able to run agentic experiences locally on phones, desktops, browsers, IoT hardware, and robotics.

The announcement is a tooling release as much as a model release. Google AI Edge Gallery gained Agent Skills, which Google described as one of the first applications to run multi-step autonomous workflows fully on-device. These skills can look up external knowledge, generate summaries, flashcards, and visualizations, and hand off to other models such as text-to-speech or image generation. In other words, Google did not stop at publishing an open model; it also showed a working pattern for tool use and end-to-end agent behavior.
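To make the "multi-step workflow" idea concrete, here is a minimal sketch of chaining skills, where each step's output feeds the next. Everything in it is an assumption for illustration: the skill names, the `SKILLS` registry, and `run_plan` are hypothetical and are not the Agent Skills API. In a real agent the plan would come from the model rather than being hard-coded.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical skills: plain functions standing in for on-device tools.
def summarize(text: str) -> str:
    """Return the first sentence as a crude stand-in for a summary."""
    return text.split(".")[0].strip() + "."


def make_flashcard(text: str) -> str:
    """Wrap a summary in a question/answer style flashcard string."""
    return f"Q: What is the key point? A: {text}"


# Hypothetical registry mapping skill names to callables.
SKILLS: Dict[str, Callable[[str], str]] = {
    "summarize": summarize,
    "flashcard": make_flashcard,
}


def run_plan(plan: List[str], user_input: str) -> Tuple[str, List[str]]:
    """Run each named skill in order, feeding each output into the next step."""
    trace: List[str] = []
    value = user_input
    for step in plan:
        if step not in SKILLS:
            raise KeyError(f"unknown skill: {step}")
        value = SKILLS[step](value)
        trace.append(f"{step} -> {value}")
    return value, trace


if __name__ == "__main__":
    # The plan is fixed here; an agent would have the model choose the steps.
    result, trace = run_plan(
        ["summarize", "flashcard"],
        "Gemma 4 runs locally. It supports tools.",
    )
    print(result)
```

Keeping the trace alongside the result is a deliberate choice: for autonomous workflows running without a server log, a per-step record is often the only way to see what the agent actually did.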

LiteRT-LM takes center stage as the deployment layer. Google highlighted constrained decoding for structured output, dynamic context handling that makes use of Gemma 4's 128K context window, and optimizations that let Gemma 4 E2B run in under 1.5GB of memory on some devices. It also said the runtime can process 4,000 input tokens spanning two skills in under 3 seconds, and it laid out deployment paths for Android and iOS as well as Raspberry Pi 5 and Qualcomm Dragonwing IQ8. A new litert-lm CLI and Python bindings were released alongside.
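Constrained decoding is the piece that makes structured output reliable, so a toy illustration may help. The sketch below is not LiteRT-LM's implementation or API; the vocabulary, the `GRAMMAR` state machine, and the logits are all made up for illustration. It shows the general technique: at every step, tokens the grammar does not allow are masked out before the best-scoring token is chosen.

```python
import json
import math
from typing import Dict, List

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB: List[str] = ['{', '}', '"tool"', ':', '"search"', '"speak"', 'hello']

# Hypothetical grammar as a state machine: state -> {allowed token: next state}.
# It accepts exactly {"tool":"search"} or {"tool":"speak"}.
GRAMMAR: Dict[str, Dict[str, str]] = {
    "start": {'{': "key"},
    "key": {'"tool"': "colon"},
    "colon": {':': "value"},
    "value": {'"search"': "close", '"speak"': "close"},
    "close": {'}': "done"},
}


def constrained_greedy_decode(logits_per_step: List[List[float]]) -> str:
    """Greedy decoding restricted to tokens the grammar allows at each step."""
    state = "start"
    out: List[str] = []
    for logits in logits_per_step:
        if state == "done":
            break
        allowed = GRAMMAR[state]
        # Mask disallowed tokens to -inf so they can never be selected.
        masked = [
            score if tok in allowed else -math.inf
            for tok, score in zip(VOCAB, logits)
        ]
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        token = VOCAB[best]
        out.append(token)
        state = allowed[token]
    return "".join(out)


if __name__ == "__main__":
    # Made-up logits: unconstrained argmax would pick 'hello' at every step.
    fake_logits = [[0.0, 0.0, 0.0, 0.0, 0.5, 0.2, 9.0]] * 5
    text = constrained_greedy_decode(fake_logits)
    print(text)              # {"tool":"search"}
    print(json.loads(text))  # {'tool': 'search'}
```

Even though the fake model strongly prefers the free-text token `hello`, the output still parses as JSON, which is the property a tool-calling agent depends on.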

Why Gemma 4 matters

The bigger signal is that Google is moving the agent stack out of the cloud and onto the device. That rewrites the tradeoffs among privacy, latency, cost, and offline availability. If an open model with tool calling and structured output runs well enough on consumer hardware, developers gain an alternative to cloud-centric orchestration. Gemma 4 is not just an open model family; it is Google's attempt to make on-device agentic AI a mainstream development target.

It is also worth noting that Google emphasized AICore, Google AI Edge Gallery, and LiteRT-LM at the same time. Shipping the model, runtime, demo application, and CLI together shortens the developer's path from proof of concept to production candidate. It is a sign that competition among open models is shifting from benchmarks alone toward toolchain completeness.

In short, model release and developer distribution are merging into a single package.

The on-device strategy is gaining a larger presence.
