#voice-ai

AI Twitter Apr 18, 2026 1 min read

Grok STT API targets the voice API market with 25+ languages and $0.10-per-hour pricing

Why it matters: xAI has released its Grok Voice stack as standalone STT/TTS APIs, priced at $0.10/hour for batch and $0.20/hour for streaming. Support for 25+ languages, diarization, and word-level timestamps targets the call center and meeting transcription markets directly.

#xai #grok #speech-to-text
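
To put the quoted rates in perspective, here is a minimal sketch of a cost estimate. It assumes billing is strictly linear per hour of audio, with no minimums, rounding increments, or volume tiers; the source states none of these. The function name `stt_cost_usd` is hypothetical and is not part of any xAI SDK.

```python
# Rates as quoted in the summary above (USD per hour of audio).
BATCH_USD_PER_HOUR = 0.10
STREAMING_USD_PER_HOUR = 0.20


def stt_cost_usd(audio_hours: float, streaming: bool = False) -> float:
    """Estimate transcription cost, assuming simple linear per-hour billing.

    Linear billing is an assumption for illustration only; actual billing
    granularity and minimum charges are not stated in the source.
    """
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    rate = STREAMING_USD_PER_HOUR if streaming else BATCH_USD_PER_HOUR
    # Round to whole cents for display.
    return round(audio_hours * rate, 2)


if __name__ == "__main__":
    hours = 1000  # e.g. a month of recorded calls
    print(f"batch:     ${stt_cost_usd(hours):.2f}")
    print(f"streaming: ${stt_cost_usd(hours, streaming=True):.2f}")
```

Under these assumptions, streaming costs exactly twice as much as batch for the same audio volume, so workloads that can tolerate delay (archived meetings, recorded calls) would favor the batch rate.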

LLM Twitter Mar 30, 2026 2 min read

Google expands Gemini 3.1 Flash Live rollout across Gemini Live, Search Live, and AI Studio

Google DeepMind said on March 26, 2026 that Gemini 3.1 Flash Live is rolling out in stages to Gemini Live and Google Search Live, and that developers can use it right away in Google AI Studio. Google describes the model as its highest-quality audio model, highlighting lower latency, improved tonal understanding, and a 90.8% score on ComplexFuncBench Audio.

#google #gemini #voice-ai

LLM Mar 27, 2026 1 min read

Google unveils Gemini 3.1 Flash Live: low-latency voice agents and a global Search Live expansion

Google announced Gemini 3.1 Flash Live on Mar 26, 2026, with across-the-board improvements to real-time voice interaction. The key point is that the same audio stack now extends across the Gemini Live API, Gemini Enterprise for Customer Experience, Search Live, and Gemini Live.

#google #gemini #voice-ai

LLM Hacker News Mar 11, 2026 1 min read

Hacker News boosts an on-device voice AI stack for Apple Silicon

A Launch HN thread boosted RunAnywhere's MetalRT and RCLI, drawing attention to a low-latency voice AI pipeline that chains STT, LLM, and TTS on Apple Silicon without the cloud.

#apple-silicon #on-device-ai #voice-ai

LLM Hacker News Mar 11, 2026 1 min read

Hacker News spotlights RunAnywhere's local Voice AI stack for Apple Silicon

A Launch HN thread put RunAnywhere's RCLI in the spotlight. The project aims to build Voice AI for macOS by running STT, LLM, TTS, local RAG, and 38 macOS actions entirely locally on Apple Silicon.

#apple-silicon #local-ai #voice-ai

AI Mar 7, 2026 1 min read

IBM and Deepgram integrate speech-to-text and text-to-speech into watsonx Orchestrate

IBM and Deepgram announced on Feb 24, 2026 that Deepgram's speech-to-text and text-to-speech will be integrated into watsonx Orchestrate. Deepgram becomes IBM's first voice partner, a move that pushes voice AI deeper into enterprise agent workflows.

#ibm #deepgram #voice-ai