News Grower

Independent coverage of AI, startups, and technology.

OpenAI News Feb 23, 2026 at 11:00 AI

Why we no longer evaluate SWE-bench Verified

SWE-bench Verified is increasingly contaminated and no longer measures frontier coding progress accurately. Our analysis finds flawed tests and training-data leakage. We recommend SWE-bench Pro instead.
