Microsoft turns LLM brain predictions into fMRI-tested explanations

Original: Understanding the brain with AI-driven explanations and experiments

Sciences | Jun 26, 2026 | By Insights AI

AI models can predict brain responses to language, but prediction alone does not explain what a brain region is doing. Microsoft Research’s new work tries to close that gap by turning black-box language-brain models into short verbal hypotheses and testing them in an fMRI scanner.

The June 25, 2026 Microsoft Research post describes generative causal testing, or GCT, developed with researchers from UC Berkeley, UCSF, and Columbia University. The starting problem is familiar in neuroscience and AI: LLM-based models can predict how patches of cortex respond as a person listens to a story, but their learned parameters are not a theory a scientist can read.

GCT distills those models into compact explanations. A cortical patch might be described as responding to “food preparation,” “location names,” or another human-readable concept. The method then tests the explanation rather than leaving it as a label. An LLM writes new stories designed to activate a targeted brain area, people listen to those stories in the scanner, and the explanation gains support only if the region responds as predicted.
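The testing step can be illustrated with a small sketch. It is not the procedure from the Microsoft Research post, whose statistical details are not given here. It assumes a one-sided permutation test, a hypothetical function name `permutation_test`, and made-up response values. The idea is only to show what "the region responds as predicted" could mean in practice: responses to stories written to drive a region should be reliably higher than responses to baseline stories.

```python
import random
from statistics import mean


def permutation_test(targeted, baseline, n_permutations=5000, seed=0):
    """One-sided permutation test on the difference of means.

    Hypothetical helper for illustration: asks whether the mean response
    to targeted stories exceeds the mean response to baseline stories
    by more than label shuffling would produce.
    """
    rng = random.Random(seed)
    observed = mean(targeted) - mean(baseline)
    pooled = list(targeted) + list(baseline)
    n_targeted = len(targeted)

    at_least_as_large = 0
    for _ in range(n_permutations):
        # Shuffle labels: any split of the pooled values is equally
        # likely if the explanation has no effect on the response.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_targeted]) - mean(pooled[n_targeted:])
        if diff >= observed:
            at_least_as_large += 1

    # Add-one correction keeps the estimated p-value above zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Synthetic, made-up responses of one region (arbitrary units).
# These are NOT data from the study.
targeted_responses = [0.9, 1.1, 0.8, 1.3, 1.0, 1.2]   # e.g. "food preparation" stories
baseline_responses = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1]  # unrelated stories

effect, p_value = permutation_test(targeted_responses, baseline_responses)
supported = p_value < 0.05
print(f"effect={effect:.2f}, p={p_value:.4f}, explanation supported: {supported}")
```

A permutation test is used here because it makes no distributional assumptions about the responses. Real fMRI analyses would also have to handle noise, hemodynamic lag, and multiple comparisons, which this sketch ignores.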

Microsoft says the experiments confirmed known selectivity, separated neighboring place-processing regions that had long been treated as similar, and revealed small prefrontal micro-regions tuned to concepts such as dialogue, clock times, and measurements. That makes the work more than a visualization layer over a model; it creates a loop from prediction to hypothesis to intervention.

The scientific stake is interpretability. A model that forecasts brain activity is useful, but a model that proposes testable explanations can change how researchers design experiments. The same loop also shows a productive role for LLMs in science: not as final authorities, but as generators of precise stimuli and candidate theories that can be checked against measurements.

The caution is that GCT explanations remain hypotheses. They do not prove a complete biological mechanism. Still, the work pushes AI-assisted neuroscience toward a stricter standard: explain what a model has found, generate a targeted test, and let the brain data decide whether the explanation survives.