Meta introduces TRIBE v2, a tri-modal foundation model for in-silico neuroscience

On March 26, 2026, AI at Meta introduced TRIBE v2 on X as a foundation model for predicting how the human brain responds to sight, sound, and language. The linked paper page and interactive demo make clear that Meta is framing the project not as a narrow benchmark result, but as a reusable computational layer for in-silico neuroscience.

The scale claims are notable. In the X post, Meta said TRIBE v2 draws on 500+ hours of fMRI recordings from 700+ people. The paper abstract describes the training base as over 1,000 hours of fMRI across 720 subjects. The figures differ between the post and the abstract, but the goal is the same in both: TRIBE v2 is meant to replace many specialized, task-specific brain-response models with a single tri-modal foundation model that generalizes across stimuli, tasks, and individuals.

The demo describes a three-stage pipeline. First, pretrained audio, video, and text embeddings represent the incoming stimulus. Second, a transformer learns universal representations shared across modalities, tasks, and subjects. Third, a subject layer maps those shared representations onto individual fMRI voxels. Meta says the result scales to whole-brain prediction across 70,000 voxels, far beyond the 1,000 cortical predictions of TRIBE v1, and achieves a 2-3x improvement over standard methods for zero-shot prediction on new subjects and stimuli.
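The three stages can be sketched in a few lines of plain Python. This is only a toy illustration of the data flow, not Meta's implementation: the dimensions, the subject IDs, and every function name (`embed_stimulus`, `shared_encoder`, `predict_voxels`) are hypothetical, random vectors stand in for the pretrained embeddings, and a single linear map stands in for the transformer.

```python
import random

def linear(x, w):
    """Multiply a vector x (length n) by a weight matrix w (n rows, m columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]

def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these from fMRI data."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)

# Hypothetical sizes, kept tiny for readability
# (the article cites about 70,000 voxels for the real model).
D_AUDIO, D_VIDEO, D_TEXT = 4, 6, 5
D_SHARED = 8
N_VOXELS = 10

def embed_stimulus(rng):
    """Stage 1: placeholder for pretrained audio, video, and text embeddings."""
    return {
        "audio": [rng.gauss(0.0, 1.0) for _ in range(D_AUDIO)],
        "video": [rng.gauss(0.0, 1.0) for _ in range(D_VIDEO)],
        "text": [rng.gauss(0.0, 1.0) for _ in range(D_TEXT)],
    }

# Stage 2: one set of weights shared by all subjects.
# Assumption: a linear map replaces the transformer to keep the sketch short.
shared_weights = random_matrix(D_AUDIO + D_VIDEO + D_TEXT, D_SHARED, rng)

def shared_encoder(embeddings):
    """Fuse the three modalities and project them into the shared space."""
    fused = embeddings["audio"] + embeddings["video"] + embeddings["text"]
    return linear(fused, shared_weights)

# Stage 3: one readout matrix per subject, mapping shared features to voxels.
subject_layers = {
    subject: random_matrix(D_SHARED, N_VOXELS, rng)
    for subject in ("sub-01", "sub-02")
}

def predict_voxels(embeddings, subject):
    """Predict one subject's voxel responses to a single stimulus time step."""
    return linear(shared_encoder(embeddings), subject_layers[subject])

stimulus = embed_stimulus(rng)
pred_a = predict_voxels(stimulus, "sub-01")
pred_b = predict_voxels(stimulus, "sub-02")
print(len(pred_a), len(pred_b))
```

The point of the split is that everything up to `shared_encoder` is identical for every subject, while only the final readout differs per person. How TRIBE v2 handles subjects it has never seen is not described in the materials summarized here, so the sketch does not attempt it.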

The larger significance is what Meta says the model can do once trained. The paper argues that TRIBE v2 can reproduce established visual and neuro-linguistic findings in silico, while the demo says it can turn months of lab preparation into seconds of computation. In other words, Meta is presenting TRIBE v2 as a tool for planning experiments, testing hypotheses, and exploring multisensory brain organization before or alongside expensive real-world studies.

Meta is also releasing the surrounding research assets, not just the announcement. The thread points researchers to the model weights, code, paper, and demo. That open release matters because it turns TRIBE v2 from a one-off research announcement into something outside groups can inspect, reproduce, and build on. It also makes TRIBE v2 one of the clearer recent examples of foundation-model ideas moving beyond text and image generation into scientific measurement itself.
