OpenAI says GPT-5.4 Thinking shows low chain-of-thought controllability in new safety study

On March 5, 2026, OpenAI used X to introduce a new evaluation suite and research paper on Chain-of-Thought (CoT) Controllability. The company said GPT-5.4 Thinking showed low ability to obscure its reasoning, and the linked research summary framed that result positively because it suggests CoT monitoring still provides a useful safety signal.

The card linked from the post described the work as CoT-Control, an evaluation setup for testing whether reasoning models can intentionally change or hide the content of their intermediate reasoning traces. OpenAI's claim is not that reasoning models are fully transparent in every circumstance. Instead, the company is reporting that current frontier reasoning systems still struggle to reliably manipulate those traces on demand. From a safety perspective, that matters because some monitoring approaches depend on the assumption that internal reasoning remains at least partially inspectable.

What OpenAI announced: a new evaluation suite plus a research paper.
Research framing: low controllability of visible reasoning can be beneficial for monitoring.
Model named in the post: GPT-5.4 Thinking.

The announcement stands out because OpenAI treated a limitation as a safety-relevant property. In many model launches, more control is presented as strictly better. Here, the company argued that lower ability to hide or rewrite reasoning traces may reduce one class of audit blind spot. That does not solve broader alignment or reliability questions, but it does strengthen the case for keeping reasoning-monitoring techniques in the evaluation toolbox while models become more capable.

Primary sources are OpenAI's March 5, 2026 X post and the linked research summary, titled "Reasoning models struggle to control their chains of thought, and that's good." Because the company described the result in terms of low ability to obscure reasoning, the most careful reading is that OpenAI is reporting an empirical finding about current model behavior, not making a universal claim that all future reasoning models will remain similarly monitorable.
