How we monitor internal coding agents for misalignment
How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI safety safeguards.
Quick summary
OpenAI employs chain‑of‑thought monitoring to examine its internal coding agents, analyzing real‑world deployments to spot misalignment risks and reinforce AI safety safeguards.
Related tags
Companies and people
Story threads
Continue with this story
Follow the same topic through connected articles, entity pages, and active story threads.
Ad slot
Article monetization slot
Reserved for contextual monetization inside article pages.
Related articles
More stories that share tags, source, or category context.
More from OpenAI News
Fresh reporting and follow-up coverage from the same newsroom.
Inside our approach to the Model Spec
Learn how OpenAI’s Model Spec serves as a public framework for model behavior, balancing safety, user freedom, and accountability as AI systems advance.
Introducing the OpenAI Safety Bug Bounty program
OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration.
Helping developers build safer AI experiences for teens
OpenAI releases prompt-based teen safety policies for developers using gpt-oss-safeguard, helping moderate age-specific risks in AI systems.
Powering product discovery in ChatGPT
ChatGPT introduces richer, visually immersive shopping powered by the Agentic Commerce Protocol, enabling product discovery, side-by-side comparisons, and merchant integration.