Semble
Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see benchmarks). Everything runs on CPU with no API keys, GPU, or external services. Use it as an MCP server, a CLI tool via AGENTS.md, or a dedicated sub-agent, and any coding agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
Quickstart
Your agent queries Semble in natural language (e.g. "How is authentication handled?") and gets back only the relevant code snippets, without grepping or reading full files.
The fastest way to get started is the interactive installer. Install uv, then run:
    uv tool install semble
    semble install

semble install detects your installed coding agents (Claude Code, Cursor, Codex, Gemini, OpenCode, and more) and lets you choose which integrations to enable: MCP server, CLI instructions in AGENTS.md, and a dedicated sub-agent. To undo, run semble uninstall.
For manual setup (per-agent MCP config, AGENTS.md snippet, sub-agent files), see Installation.
Main Features
- Fast: indexes an average repo in ~250 ms and answers queries in ~1.5 ms, all on CPU.
- Accurate: NDCG@10 of 0.854 on the benchmarks, on par with code-specialized transformer models at a fraction of the size and cost.
- Token-efficient: returns only the relevant chunks, using ~98% fewer tokens than grep+read.
- Zero setup: runs on CPU with no API keys, GPU, or external services required.
- MCP server: works with Claude Code, Cursor, Codex, OpenCode, VS Code, and any other MCP-compatible agent.
- Local and remote: pass a local path or a git URL.
MCP tools
Once connected, the agent has access to two tools:
| Tool | Description |
|---|---|
| search | Search a codebase with a natural-language or code query. Pass repo as a local path or an https:// git URL. |
| find_related | Given a file_path and line number, return chunks semantically similar to the code at that location. |
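
As a rough illustration of what a call to these tools might look like on the wire, the sketch below builds MCP tools/call requests as JSON-RPC 2.0 messages. Only repo, file_path, and a line number are named in the table above; the argument names query and line, the presence of repo on find_related, and the example paths are assumptions made for illustration. Check the schema the server reports via tools/list before relying on them. In practice your MCP client builds these messages for you.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# Hypothetical arguments: `query` and `line` are assumed names, and the
# paths are placeholders rather than values taken from Semble's docs.
search_request = make_tool_call(
    1,
    "search",
    {"query": "How is authentication handled?", "repo": "/path/to/repo"},
)
related_request = make_tool_call(
    2,
    "find_related",
    {"repo": "/path/to/repo", "file_path": "src/auth.py", "line": 42},
)

print(search_request)
print(related_request)
```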
How it works
Semble splits each file into code-aware chunks using tree-sitter, then scores every query against the chunks with two complementary retrievers: static Model2Vec embeddings using the code-specialized potion-code-16M model for semantic similarity, and BM25 for lexical matches on identifiers and API names. The two score lists are fused with Reciprocal Rank Fusion (RRF).
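
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. It is not Semble's implementation: the function name, the chunk IDs, and the constant k=60 (the value commonly used in the RRF literature) are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk receives 1 / (k + rank) from every list it appears in,
    with ranks starting at 1. `k` dampens the influence of top ranks;
    60 is a common default, not necessarily what Semble uses.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by chunk ID for determinism.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical chunk IDs, ordered best-first by each retriever.
semantic_ranking = ["auth.py:10", "session.py:5", "utils.py:88"]
lexical_ranking = ["session.py:5", "login.py:1", "auth.py:10"]

fused = reciprocal_rank_fusion([semantic_ranking, lexical_ranking])
print(fused)
# → ['session.py:5', 'auth.py:10', 'login.py:1', 'utils.py:88']
```

Because RRF only looks at ranks, the embedding similarities and BM25 scores never need to be normalized onto a common scale before they are combined.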
After fusing, results are reranked with a set of code-aware signals:
- Adaptive weighting. Symbol-like queries (Foo::bar, _private, getUserById) get more lexical weight, while natural-language queries stay balanced between semantic and lexical retrievers.
- Definition boosts. A chunk that defines the queried symbol (a class, def, func, etc.) is ranked above chunks that merely reference it.
- Identifier stems. Query tokens are stemmed and matched against identifier stems in a chunk, giving additional weight to chunks that contain them. For example, querying "parse config" boosts chunks containing parseConfig, ConfigParser, or config_parser.
- File coherence. When multiple chunks from the same file match the query, the file is boosted so the top result reflects broad file-level relevance rather than a single out-of-context chunk.
- Noise penalties. Test files, compat/ and legacy/ shims, example code, and .d.ts declaration stubs are down-ranked so canonical implementations surface first.
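
Two of these signals, symbol-like query detection and identifier-stem matching, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the regular expressions, and the crude suffix-stripping stemmer are all hypothetical stand-ins, and Semble's actual heuristics and stemmer may differ.

```python
import re


def looks_like_symbol(query: str) -> bool:
    """Heuristic: single-token queries with ::, ., _, or camelCase look like symbols."""
    query = query.strip()
    if " " in query:
        return False
    return bool(re.search(r"::|\.|_|[a-z][A-Z]", query))


def split_identifier(identifier: str) -> list[str]:
    """Split camelCase, PascalCase, and snake_case identifiers into lowercase parts."""
    parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", identifier)
    return [part.lower() for part in parts]


def crude_stem(word: str) -> str:
    """Strip one common suffix; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "ers", "er", "ed", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_overlap(query: str, identifiers: list[str]) -> int:
    """Count query stems that also occur among the chunk's identifier stems."""
    query_stems = {crude_stem(token) for token in query.lower().split()}
    chunk_stems = {
        crude_stem(part)
        for identifier in identifiers
        for part in split_identifier(identifier)
    }
    return len(query_stems & chunk_stems)


for name in ("parseConfig", "ConfigParser", "config_parser", "render_template"):
    print(name, stem_overlap("parse config", [name]))
```

With this toy stemmer, the first three identifiers share both stems with the query "parse config", while render_template shares none. A reranker could add a small bonus per overlapping stem, and shift weight toward the lexical retriever when looks_like_symbol returns True.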
Because the embedding model is static with no transformer forward pass at query time, all of this runs in milliseconds on CPU.
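
The sketch below illustrates why a static embedding is cheap at query time: embedding a text is just a table lookup per token followed by a mean, with no model forward pass. The three-dimensional vectors, the whitespace tokenizer, and the function names are made up for illustration; the real model uses its own tokenizer and a much larger learned table.

```python
import math

# Toy lookup table standing in for a learned static embedding matrix.
TOY_VECTORS = {
    "parse": [0.9, 0.1, 0.0],
    "config": [0.1, 0.9, 0.0],
    "load": [0.8, 0.2, 0.1],
    "settings": [0.2, 0.8, 0.1],
    "render": [0.0, 0.1, 0.9],
    "template": [0.1, 0.0, 0.9],
}
DIM = 3


def embed(text: str) -> list[float]:
    """Embed text as the mean of its known token vectors (lookups only)."""
    vectors = [TOY_VECTORS[t] for t in text.lower().split() if t in TOY_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = embed("parse config")
close_chunk = embed("load settings")
far_chunk = embed("render template")

# The semantically closer chunk scores higher than the unrelated one.
print(cosine(query, close_chunk) > cosine(query, far_chunk))
```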
Citing
If you use Semble in your research, please cite the following:
    @software{minishlab2026semble,
      author = {{van Dongen}, Thomas and Stephan Tulkens},
      title = {Semble: Fast and Accurate Code Search for Agents},
      year = {2026},
      publisher = {Zenodo},
      doi = {10.5281/zenodo.19785932},
      url = {https://github.com/MinishLab/semble},
      license = {MIT}
    }