Microsoft turns LLM brain predictions into fMRI-tested explanations
Original: Understanding the brain with AI-driven explanations and experiments
AI models can predict brain responses to language, but prediction alone does not explain what a brain region is doing. Microsoft Research’s new work tries to close that gap by turning black-box language-brain models into short verbal hypotheses and testing them in an fMRI scanner.
The June 25, 2026, Microsoft Research post describes generative causal testing, or GCT, developed with researchers from UC Berkeley, UCSF, and Columbia University. The starting problem is familiar in neuroscience and AI: LLM-based models can predict how patches of cortex respond as a person listens to a story, but their learned parameters are not a theory a scientist can read.
GCT distills those models into compact explanations. A cortical patch might be described as responding to “food preparation,” “location names,” or another human-readable concept. The method then tests the explanation rather than leaving it as a label. An LLM writes new stories designed to activate a targeted brain area, people listen to those stories in the scanner, and the explanation gains support only if the region responds as predicted.
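The final step of that loop, checking whether a region responds as predicted, can be illustrated with a toy example. The sketch below is not the researchers' analysis; it assumes a simple one-sided permutation test as a stand-in, and the function name `explanation_supported`, the response values, and the 0.05 threshold are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def explanation_supported(targeted, baseline, n_perm=5000, alpha=0.05, seed=0):
    """Toy check: do stories written for an explanation drive a region
    more strongly than baseline stories?

    targeted / baseline: hypothetical per-story mean responses of one region.
    Returns (observed difference, permutation p-value, supported flag).
    This is an illustrative stand-in, not the statistic used in the study.
    """
    observed = mean(targeted) - mean(baseline)
    pooled = list(targeted) + list(baseline)
    k = len(targeted)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    hits = 0
    for _ in range(n_perm):
        # Shuffle the story labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            hits += 1

    # Add-one correction keeps the p-value strictly above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value, p_value < alpha


# Made-up responses for a patch hypothesized to respond to "food preparation".
food_stories = [1.2, 0.9, 1.4, 1.1, 1.3, 1.0]
other_stories = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1]

diff, p, supported = explanation_supported(food_stories, other_stories)
print(f"difference={diff:.2f}, p={p:.4f}, supported={supported}")
```

In this framing, an explanation that fails the check is discarded or revised, which is the sense in which the label is tested and not simply attached to the region.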
Microsoft says the experiments confirmed known selectivity, separated neighboring place-processing regions that had long been treated as similar, and revealed small prefrontal micro-regions tuned to concepts such as dialogue, clock times, and measurements. That makes the work more than a visualization layer over a model; it creates a loop from prediction to hypothesis to intervention.
The scientific stake is interpretability. A model that forecasts brain activity is useful, but a model that proposes testable explanations can change how researchers design experiments. The same loop also shows a productive role for LLMs in science: not as final authorities, but as generators of precise stimuli and candidate theories that can be checked against measurements.
The caution is that GCT explanations remain hypotheses. They do not prove a complete biological mechanism. Still, the work pushes AI-assisted neuroscience toward a stricter standard: explain what a model has found, generate a targeted test, and let the brain data decide whether the explanation survives.