Semble: Open-Source Code Search for AI Agents That Uses 98% Fewer Tokens

Original: Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep

May 18, 2026, by Insights AI (HN)

The Problem

When AI agents like Claude Code navigate large codebases, they typically rely on grep and broad file reads, which consume large amounts of context tokens. In the reported benchmarks, the grep+read approach needs a full 100k-token context window to reach 85% recall. Semble, released by MinishLab, tackles this directly.

How It Works

Semble uses a two-stage retrieval pipeline. The first stage applies tree-sitter for code-aware chunking, then scores candidates using both Model2Vec semantic embeddings and BM25 lexical matching. The second stage reranks results with code-specific signals: definition boosts, identifier stem matching, file coherence, and noise penalties for test and legacy code.
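The two stages described above can be sketched in a few lines of Python. This is only an illustration of the general technique (hybrid lexical plus semantic scoring, followed by a heuristic rerank), not Semble's actual code. The function names, the weights, the `alpha` parameter, and the sample chunks are all hypothetical, and `toy_embed` is a hashed character-trigram stand-in for real Model2Vec embeddings. Chunking with tree-sitter is assumed to have already happened.

```python
import math
import re
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase and split into identifier-like tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


def toy_embed(text, dim=64):
    """Hypothetical stand-in for a semantic embedding: hashed character trigrams."""
    vec = [0.0] * dim
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        vec[zlib.crc32(lowered[i:i + 3].encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def normalize(values):
    """Min-max scale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def search(query, chunks, alpha=0.5, top_k=3):
    """Stage 1: blend lexical and semantic scores. Stage 2: apply rerank heuristics."""
    texts = [c["text"] for c in chunks]
    lexical = normalize(bm25_scores(query, texts))
    q_vec = toy_embed(query)
    semantic = normalize(
        [sum(a * b for a, b in zip(q_vec, toy_embed(t))) for t in texts]
    )
    query_terms = set(tokenize(query))
    ranked = []
    for chunk, lex, sem in zip(chunks, lexical, semantic):
        score = alpha * lex + (1 - alpha) * sem
        # Definition boost: the chunk defines a symbol named in the query.
        definition = re.match(r"\s*(?:def|class)\s+(\w+)", chunk["text"])
        if definition and definition.group(1).lower() in query_terms:
            score += 0.2
        # Noise penalty: demote test and legacy code.
        if "test" in chunk["path"] or "legacy" in chunk["path"]:
            score -= 0.2
        ranked.append((score, chunk["path"]))
    ranked.sort(reverse=True)
    return ranked[:top_k]


# Hypothetical pre-chunked input, one chunk per definition.
CHUNKS = [
    {"path": "src/auth.py",
     "text": "def verify_token(token):\n    return decode(token).is_valid"},
    {"path": "tests/test_auth.py",
     "text": "def test_verify_token():\n    assert verify_token(make_token())"},
    {"path": "src/db.py",
     "text": "class Connection:\n    def open(self): ..."},
]

for score, path in search("verify_token", CHUNKS):
    print(f"{score:.3f}  {path}")
```

The point of the second stage is visible in the sample data: the test file mentions the queried identifier in a shorter chunk, so a purely lexical score can favor it, while the definition boost and the test penalty move the actual definition back to the top.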

Everything runs on CPU. No external APIs, no GPU, no authentication required. A typical repository indexes in ~200ms; queries return in ~1.5ms.

Benchmarks

  • Token efficiency: 94% recall at just 2k tokens, versus a 100k-token context for 85% recall with grep+read
  • NDCG@10: 0.854, which is 99% of the score of the 137M-parameter CodeRankEmbed transformer model
  • Indexing speed: ~200x faster than code-specialized transformers (~200ms per repository)
  • Query speed: ~10x faster than code-specialized transformers (~1.5ms per query)
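For readers unfamiliar with the metrics, the sketch below shows how the headline token saving follows from the 2k and 100k figures, and how NDCG@k is computed in general. It is a generic illustration and not the project's benchmark harness. The function names are hypothetical, and the ideal ranking is taken from the retrieved list itself, which assumes every relevant item appears somewhere in that list.

```python
import math


def token_reduction(baseline_tokens, new_tokens):
    """Fraction of tokens saved relative to the baseline."""
    return 1 - new_tokens / baseline_tokens


def dcg_at_k(relevances, k):
    """Discounted cumulative gain: relevance discounted by log2 of the rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG divided by the DCG of the best possible ordering of the same items."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# 2k tokens instead of 100k is a reduction of about 98%.
print(f"{token_reduction(100_000, 2_000):.0%}")

# Relevance labels in ranked order: relevant hits at ranks 1 and 3.
print(round(ndcg_at_k([1, 0, 1]), 3))
```

A score of 1.0 means the relevant results are already in the best possible order, so an NDCG@10 of 0.854 indicates that relevant chunks mostly, but not always, appear near the top of the first ten results.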

Integration

Add Semble to Claude Code as an MCP server with a single command:

claude mcp add semble -s user -- uvx --from "semble[mcp]" semble

Cursor, Codex, and OpenCode support the same uvx command structure. For shell-based workflows, document the semble search and semble find-related commands in AGENTS.md so that agents know to call them.

Why It Matters

Token efficiency directly impacts cost, speed, and context window limits for AI agent workflows. Semble hits a practical sweet spot: near-transformer search quality with zero external dependencies, running fully offline. As AI coding agents become standard development tools, efficient codebase navigation becomes a first-class engineering concern.
