Kreuzberg v4.5 adds faster Rust-native document layout extraction
Original post: "Kreuzberg v4.5.0: We loved Docling's model so much that we gave it a faster engine"
The r/LocalLLaMA thread had 50 points and 21 comments at crawl time. In the post, the author describes Kreuzberg as an MIT-licensed, open-source document intelligence framework written in Rust with bindings for 12 programming languages, including Python, TypeScript/Node.js, PHP, Ruby, Java, C#, Go, Elixir, R, C, and WASM. Its base promise is broad: extract text, structure, and metadata from 88+ formats, run OCR, generate embeddings, and fit into AI-oriented document pipelines.
The headline change in v4.5 is that Kreuzberg now treats documents as structured layout objects instead of only as text sources. The post says the project integrates Docling's RT-DETR v2 layout model, also called Docling Heron, inside a Rust-native pipeline. When a page contains tables, Kreuzberg crops the table regions, runs TATR (Table Transformer), and then aligns the predicted cells with native PDF text positions in order to reconstruct markdown tables.
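The alignment step described above can be illustrated with a minimal, self-contained sketch. This is not Kreuzberg's actual API; the function names and data shapes are hypothetical stand-ins for the idea: take cell bounding boxes as a detector like TATR might predict them, assign native text-layer characters to cells by position, and emit a markdown table.

```python
# Illustrative sketch (hypothetical, not Kreuzberg's API): align predicted
# table-cell boxes with native PDF character positions to rebuild markdown.
# A box is (x0, y0, x1, y1); a char is (x, y, ch) from the text layer.

def cell_text(box, chars):
    """Collect characters whose position falls inside a predicted cell box."""
    x0, y0, x1, y1 = box
    hits = [(x, ch) for x, y, ch in chars if x0 <= x < x1 and y0 <= y < y1]
    return "".join(ch for _, ch in sorted(hits))  # left-to-right reading order

def to_markdown(grid, chars):
    """grid: rows of cell boxes, as a table detector might emit them."""
    rows = [[cell_text(box, chars) for box in row] for row in grid]
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)

# Tiny example: a 2x2 table over a text layer of four positioned characters.
grid = [[(0, 0, 50, 10), (50, 0, 100, 10)],
        [(0, 10, 50, 20), (50, 10, 100, 20)]]
chars = [(5, 5, "A"), (55, 5, "B"), (5, 15, "1"), (55, 15, "2")]
print(to_markdown(grid, chars))
# | A | B |
# | --- | --- |
# | 1 | 2 |
```

The real pipeline has to handle rotated text, merged cells, and overlapping detections, but the core trick is the same: let the model find the geometry and let the PDF text layer supply the characters.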
The benchmark section is detailed enough to matter. Across 171 PDF documents covering academic papers, government and legal files, invoices, OCR scans, and edge cases, the author reports Structure F1 of 42.1% for Kreuzberg versus 41.7% for Docling, Text F1 of 88.9% versus 86.7%, and average processing time of 1,032 ms per document versus 2,894 ms. Taken at face value, that means roughly Docling-class quality with an average throughput advantage of about 2.8x.
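The headline ratio checks out against the reported averages:

```python
# Sanity-check the throughput claim from the reported per-document averages.
kreuzberg_ms, docling_ms = 1032, 2894
speedup = docling_ms / kreuzberg_ms
print(f"{speedup:.2f}x")  # ≈ 2.80x, matching the "about 2.8x" claim
```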
The implementation notes show where the speedup is supposed to come from. Kreuzberg uses pdfium for character-level extraction and font metadata when a native text layer exists, falls back to Tesseract OCR when it does not, relies on ONNX Runtime for inference, and spreads work across pages with Rayon. The post also mentions a repair pass for broken font CMap tables (cutting garbled lines from 406 to 0 on affected test documents), a multi-backend OCR pipeline, PaddleOCR v2 with a unified 18,000+ character multilingual model, and extraction-result caching.
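The native-layer-first strategy can be sketched in a few lines. This is a hedged illustration of the decision logic the post describes, not Kreuzberg's code; the `Page` type and `extract_page` function are invented for the example.

```python
# Hypothetical sketch of the extraction strategy described in the post:
# prefer the native PDF text layer, fall back to OCR only when a page
# carries no extractable text.
from dataclasses import dataclass

@dataclass
class Page:
    native_chars: int   # characters found in the text layer (pdfium's job)
    image: bytes = b""  # rasterized page, handed to an OCR backend on fallback

def extract_page(page: Page) -> str:
    if page.native_chars > 0:
        return "native"  # character-level extraction plus font metadata
    return "ocr"         # Tesseract (or another backend) on the raster

# The post says pages are processed in parallel with Rayon on the Rust side;
# a plain list comprehension stands in for that here.
pages = [Page(native_chars=120), Page(native_chars=0)]
print([extract_page(p) for p in pages])  # ['native', 'ocr']
```

Per-page dispatch like this is what makes the Rayon parallelism pay off: OCR-heavy pages dominate wall-clock time, so pages with a clean text layer skip the expensive path entirely.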
- Bindings: 12 languages
- Input coverage: 88+ formats
- Benchmark: 171 PDFs, 1,032 ms/doc, Structure F1 42.1%, Text F1 88.9%
This release matters because document AI has become a systems problem, not just an OCR problem. Teams increasingly need layout understanding, table recovery, multilingual fallback, and operational efficiency together. Kreuzberg's pitch is that you can get that stack in a Rust-native package rather than through a Python-heavy deployment path. For anyone already using Docling, v4.5 looks like a release worth benchmarking rather than dismissing as a wrapper.