Cohere Releases Transcribe, a 2B-Parameter Speech Recognition Model Under Apache 2.0

What Cohere announced

On March 26, 2026, Cohere introduced Transcribe on X as a new state-of-the-art open-source speech recognition model. The official release page fills in the details behind that claim. According to Cohere, Transcribe is a 2B-parameter, Conformer-based encoder-decoder model, trained from scratch with production-grade automatic speech recognition (ASR) as the goal rather than as a research demo.

Key points from the release page

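Cohere says Transcribe supports 14 languages and is distributed under the Apache 2.0 license. It also claims that the model currently ranks first on the Hugging Face Open ASR Leaderboard with an average word error rate (WER) of 5.42, ahead of other publicly listed open and closed speech systems. Beyond raw benchmark scores, Cohere also stresses the balance between low word error rate and high throughput, which is what matters in real deployments.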
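For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not Cohere's or the leaderboard's evaluation code. The function name `word_error_rate` is hypothetical, and the sketch assumes whitespace tokenization with no text normalization, whereas real evaluations typically normalize casing and punctuation first. It also assumes the 5.42 figure is expressed as a percentage, which is the usual convention.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length.

    Illustrative only: splits on whitespace and applies no normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(f"WER: {wer * 100:.2f}%")
```

Under that percentage assumption, an average of 5.42 would correspond to roughly five or six word errors per hundred reference words, averaged across the leaderboard's test sets.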

The release page positions Transcribe as a practical building block for meeting transcription, speech analytics, audio search, and real-time customer support agents. Cohere offers the model through three channels: open weights on Hugging Face, an API for experimentation, and Model Vault for managed private deployment. That lineup targets both developers who want control over local infrastructure and enterprises that want to run the model without managing it themselves.

Why it matters

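Speech has remained a fragmented part of the AI stack, and the strongest models have often been tied to commercial APIs or more restrictive licenses. By combining an Apache license, leaderboard-leading performance, and a relatively manageable serving footprint, Cohere is trying to pull speech recognition toward the mainstream enterprise toolchain. If the latency and accuracy claims hold up outside the launch benchmarks, Transcribe could become a strong default choice for organizations that want open speech infrastructure without giving up quality.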

Sources: Cohere X post · Cohere release page
