Rust and llama.cpp manga translator: how a local OCR pipeline feels in practice, as seen on LocalLLaMA

A Rust-based manga translator posted on r/LocalLLaMA drew attention less as a demo than for how complete its workflow is. According to the author, the project works for general image translation as well as manga, and combines object detection, visual LLM-based OCR, layout analysis, and a fine-tuned inpainting model. For the LLM part it integrates llama.cpp to support the Gemma 4 and Qwen3.5 families, and it can also connect to LM Studio or OpenRouter through an OpenAI-compatible API.

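The interesting part is not that a single model does the translation, but that several vision stages are tied together by an editor UX. The author emphasized an editing flow like a mini Photoshop: pressing a button runs the pipeline, after which the user can proofread the result and change the font, size, and color. The GitHub repo is public at https://github.com/mayocream/koharu.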
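The run-then-proofread flow can be sketched roughly as follows. This is a minimal illustration, not the project's code: `TextBlock`, `run_pipeline`, and `proofread` are hypothetical names, and the detection, OCR, and translation models are replaced by plain inputs and a closure.

```rust
// Hypothetical data model for illustration; none of these names come from the koharu codebase.
#[derive(Debug, Clone, PartialEq)]
struct TextBlock {
    bbox: (u32, u32, u32, u32), // x, y, width, height of the detected text area
    source: String,             // text read by the OCR stage
    translated: String,         // text produced by the translation stage (editable)
    font_size: u32,
    color: (u8, u8, u8),
}

// Stand-in for detection + OCR + translation. A real pipeline would call models here;
// this sketch takes already-detected regions and a translation closure instead.
fn run_pipeline(
    detected: &[((u32, u32, u32, u32), &str)],
    translate: impl Fn(&str) -> String,
) -> Vec<TextBlock> {
    detected
        .iter()
        .map(|&(bbox, text)| TextBlock {
            bbox,
            source: text.to_string(),
            translated: translate(text),
            font_size: 24,     // assumed default style
            color: (0, 0, 0),  // assumed default: black text
        })
        .collect()
}

// A manual edit overrides the model output and style; the OCR text is left untouched.
fn proofread(block: &mut TextBlock, corrected: &str, font_size: u32) {
    block.translated = corrected.to_string();
    block.font_size = font_size;
}

fn main() {
    let detected = [((40, 60, 120, 80), "こんにちは")];
    let mut blocks = run_pipeline(&detected, |s| format!("[draft] {s}"));
    proofread(&mut blocks[0], "Hello!", 28);
    println!("{:?}", blocks[0]);
}
```

Keeping the OCR text, the translation, and the style as separate fields is what lets a user correct one of them without discarding the others.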

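Local OCR and layout analysis split the work into separate problems: speech bubbles, text areas, and redrawing.
The llama.cpp integration offers local models as an option rather than cloud-only translation.
The OpenAI-compatible path lets users switch between local and remote providers within the same UI.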
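Switching providers behind one interface might look like the sketch below. It assumes the path in question is the OpenAI-style chat completions API that LM Studio and OpenRouter expose. The `Provider` type and its methods are hypothetical, the LM Studio address is its usual local default, and the sketch only builds the URL and auth header without sending a request.

```rust
// Hypothetical provider abstraction; names are illustrative, not the project's API.
#[derive(Debug, Clone, PartialEq)]
enum Provider {
    // Local LM Studio server (assumed default address, no API key).
    LmStudio,
    // Remote OpenRouter endpoint, which needs a bearer token.
    OpenRouter { api_key: String },
    // Any other server speaking the same protocol, e.g. a local llama.cpp server.
    Custom { base_url: String, api_key: Option<String> },
}

impl Provider {
    fn base_url(&self) -> &str {
        match self {
            Provider::LmStudio => "http://localhost:1234/v1",
            Provider::OpenRouter { .. } => "https://openrouter.ai/api/v1",
            Provider::Custom { base_url, .. } => base_url.as_str(),
        }
    }

    // The chat completions endpoint shares the same relative path across providers.
    fn chat_url(&self) -> String {
        format!("{}/chat/completions", self.base_url().trim_end_matches('/'))
    }

    // Value for the Authorization header, if the provider needs one.
    fn auth_header(&self) -> Option<String> {
        match self {
            Provider::LmStudio => None,
            Provider::OpenRouter { api_key } => Some(format!("Bearer {api_key}")),
            Provider::Custom { api_key, .. } => {
                api_key.as_ref().map(|k| format!("Bearer {k}"))
            }
        }
    }
}

fn main() {
    let providers = [
        Provider::LmStudio,
        Provider::OpenRouter { api_key: "sk-placeholder".to_string() },
    ];
    for p in &providers {
        println!("{} (auth: {})", p.chat_url(), p.auth_header().is_some());
    }
}
```

Because only the base URL and credentials differ, the rest of the application can stay unaware of whether the model runs locally or remotely.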

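The community discussion showed that what users actually want is not only full automation. Comments asked for usability improvements such as a browser extension, manual textbox control, font customization, and a better video demo. In manga translation a single OCR error can throw off both speech bubble placement and artwork restoration, which is why an editable pipeline matters. That is also why the thread resonated on LocalLLaMA: it showed a local LLM moving beyond novelty to become a component of a small creative tool.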
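One way to picture why an early error spreads is to treat the stages as a strictly linear chain, where a manual fix at one stage makes every later stage's output stale. That linear ordering is a simplifying assumption for illustration, and `Stage` and `stale_after` are hypothetical names. The real tool may order or couple its stages differently.

```rust
// Hypothetical stage ordering for illustration; not taken from the koharu codebase.
// Deriving PartialOrd/Ord makes variants compare in declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Stage {
    Detect,
    Ocr,
    Layout,
    Translate,
    Inpaint,
    Render,
}

const ALL_STAGES: [Stage; 6] = [
    Stage::Detect,
    Stage::Ocr,
    Stage::Layout,
    Stage::Translate,
    Stage::Inpaint,
    Stage::Render,
];

// Stages whose output is stale after a manual fix at `edited`,
// under the assumption that each stage depends on all earlier ones.
fn stale_after(edited: Stage) -> Vec<Stage> {
    ALL_STAGES.iter().copied().filter(|s| *s > edited).collect()
}

fn main() {
    // Fixing the OCR text invalidates everything downstream of it.
    println!("{:?}", stale_after(Stage::Ocr));
    // Restyling at the final stage invalidates nothing else.
    println!("{:?}", stale_after(Stage::Render));
}
```

Under this model, an editor that allows fixes at each stage only has to redo the later stages, not the whole page.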

The original thread is on r/LocalLLaMA.
