Anthropic Donates Petri AI Alignment Testing Tool to Independent Nonprofit Meridian Labs
Original: Anthropic Donates Petri Open-Source AI Alignment Testing Tool to Meridian Labs View original →
What is Petri?
Petri is an open-source AI alignment evaluation framework developed by Anthropic. It uses separate auditor and judge models to assess whether AI systems exhibit concerning behaviors such as deception, sycophancy, and cooperation with harmful requests.
What is New in Petri 3.0
Released alongside the donation, Petri 3.0 introduces three major improvements. First, separated components enable adaptability: the framework can now be customized for different evaluation purposes. Second, a new Dish add-on uses real system prompts and deployment scaffolding to prevent models from detecting they are being tested, increasing realism. Third, integration with Bloom enables more thorough behavioral assessments, adding significant evaluation depth.
Why Donate to Meridian Labs?
Anthropic transferred Petri to Meridian Labs, an independent nonprofit, for the same reason it donated the Model Context Protocol to the Linux Foundation: neutrality. A tool owned by a single commercial lab raises questions about bias. Under independent governance, Petri can serve labs, researchers, and governments as a credible third-party resource.
Strengthening the Alignment Ecosystem
With AI systems becoming increasingly capable, the ability to reliably test them for misaligned behavior is critical. By open-sourcing Petri and placing it under neutral stewardship, Anthropic is investing in the shared infrastructure needed to evaluate models responsibly across the entire industry.
Related Articles
Anthropic has donated the Model Context Protocol (MCP) to the Agentic AI Foundation under the Linux Foundation. With participation from OpenAI, Microsoft, Google, and AWS, MCP becomes the standard for AI agent integration.
Anthropic published a new theory explaining why AI assistants like Claude express emotions and use anthropomorphic language—proposing that models select from personas inherited from fictional characters during training.
Anthropic said on March 17, 2026 that open source security is becoming more important as AI grows more capable. In its X post, the company said it is donating to the Linux Foundation to help secure the software foundations AI depends on.