An r/artificial link post resurfaced BullshitBench v2, a community benchmark built around 100 nonsense prompts and a 3-judge panel. The current public leaderboard lists Claude Sonnet 4.6 (high reasoning) at a 91% green rate and a 3% red rate, but the results are better read as a community signal than as a neutral standard.
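
To make the headline numbers concrete, here is a minimal sketch of how per-prompt verdicts from a 3-judge panel could be turned into green and red rates. The aggregation rule (simple majority per prompt), the "other" bucket, and the function names are assumptions for illustration, not the benchmark's documented method, and the votes below are synthetic, constructed only to reproduce the reported percentages.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by more than half of the judges.

    Assumption: a prompt without a strict majority is counted as "other".
    """
    top, count = Counter(votes).most_common(1)[0]
    return top if count > len(votes) / 2 else "other"


def panel_rates(panel_votes):
    """Compute green and red rates over a list of per-prompt vote tuples."""
    labels = [majority_label(votes) for votes in panel_votes]
    counts = Counter(labels)
    total = len(labels)
    return {"green": counts["green"] / total, "red": counts["red"] / total}


# Synthetic data: 100 prompts, three hypothetical judge votes each.
panel_votes = (
    [("green", "green", "red")] * 91      # majority green
    + [("red", "red", "green")] * 3       # majority red
    + [("green", "red", "other")] * 6     # no majority, counted as "other"
)

result = panel_rates(panel_votes)
print(result)
```

Under this majority-vote assumption, a 91% green rate and a 3% red rate on 100 prompts leave 6 prompts in neither bucket. If the leaderboard instead averages over individual judge votes, the same percentages would describe 300 votes rather than 100 prompts.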