LLM 3h ago 1 min read
A new arXiv preprint reports that LLM judges became meaningfully more lenient when prompts framed evaluation consequences, exposing a weak point in automated safety and quality benchmarks.
A new arXiv preprint reports that LLM judges became meaningfully more lenient when prompts framed evaluation consequences, exposing a weak point in automated safety and quality benchmarks.