Together Research says LLMs can repair bad database query plans
On April 3, 2026, Together AI’s X account promoted new research claiming that LLMs can repair query plans when a database optimizer misses semantic correlations. The post highlights DBPlanBench, a system that hands the LLM a database’s physical operator graph and asks it to patch the plan directly instead of rewriting the full execution strategy from scratch.
What the research claims
The team says DBPlanBench works on Apache DataFusion plans and uses localized edits plus an evolutionary search loop to refine candidates. In the X post, Together reports up to 4.78x speedups on TPC-H and TPC-DS, says 60.8% of tested queries improved by more than 5%, and cites a build-memory reduction from 3.3 GB to 411 MB in one of its examples. The related arXiv paper describes the motivation in conventional database terms: cost estimators can miss semantic correlations in data, which in turn leads to bad join order, bad access paths, and cascading planning errors.
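The loop described above, localized edits refined by evolutionary search, can be sketched in miniature. Everything below is a hypothetical illustration, not DBPlanBench's actual code: the plan is modeled as a dict of tunable knobs, `propose_edit` stands in for the LLM proposing one bounded change at a time, and `measure_latency` is a toy cost model standing in for real execution.

```python
import random

def measure_latency(plan):
    # Toy cost model standing in for actually executing the plan:
    # hash joins beat nested-loop joins, and a smaller build side is cheaper.
    cost = 100.0
    if plan["join_algo"] == "nested_loop":
        cost += 300.0
    cost += plan["build_side_mb"] * 0.1
    return cost

def propose_edit(plan, rng):
    # Stand-in for the LLM call: mutate exactly one knob (a "localized edit")
    # rather than regenerating the whole plan from scratch.
    child = dict(plan)
    if rng.random() < 0.5:
        child["join_algo"] = rng.choice(["hash", "nested_loop"])
    else:
        child["build_side_mb"] = max(
            1, int(child["build_side_mb"] * rng.uniform(0.3, 1.2))
        )
    return child

def evolve(plan, generations=30, pool=4, seed=0):
    # Evolutionary sampling: each generation, draw a pool of candidate
    # edits off the current best plan, execute/score them, keep the winner.
    rng = random.Random(seed)
    best, best_cost = plan, measure_latency(plan)
    for _ in range(generations):
        for _ in range(pool):
            cand = propose_edit(best, rng)
            cost = measure_latency(cand)
            if cost < best_cost:
                best, best_cost = cand, cost
    return best, best_cost

# Start from a plan resembling the paper's example: a 3.3 GB build side.
initial = {"join_algo": "nested_loop", "build_side_mb": 3300}
best, cost = evolve(initial)
print(best["join_algo"], best["build_side_mb"], round(cost, 1))
```

The design point the sketch captures is that candidates are always small perturbations of an already-optimized plan, so every proposal stays executable and cheap to evaluate; in the real system the scoring step would run the candidate DataFusion plan rather than consult a synthetic cost function.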
Why it matters
This is a useful example of LLMs being applied below the application layer, inside systems infrastructure that normally depends on handwritten heuristics. The notable design choice is not to have the model generate a brand-new plan, but to let it inspect an already-optimized physical plan and suggest bounded changes that can be executed and evaluated. If the approach holds up beyond benchmark settings, it could open a path to narrower, higher-confidence uses of LLMs in database engines and other optimization stacks.
Source materials include Together AI’s X post and the paper “Making Databases Faster with LLM Evolutionary Sampling”.
Related Articles
A Reddit thread in r/LocalLLaMA drew 142 upvotes and 29 comments around CoPaw-9B. The discussion focused on its Qwen3.5-based 9B agent positioning, 262,144-token context window, and whether local users would get GGUF or other quantized builds quickly.
Together Research said on March 31, 2026 that Aurora is an open-source framework for adaptive speculative decoding that learns from live inference traces and updates the speculator asynchronously without interrupting serving. Together’s blog and paper say Aurora reframes the problem as asynchronous RL and can deliver 1.25x additional speedup over a strong static speculator as traffic shifts.
Lemonade packages local AI inference behind an OpenAI-compatible server that targets GPUs and NPUs, aiming to make open models easier to deploy on everyday PCs.