OpenAI/Product Releases/Why we no longer evaluate SWE-bench Verified

Why we no longer evaluate SWE-bench Verified

$ npx -y @buildinternet/releases show rel_FvHW_S9Xu5XhuxnBRdesx

SWE-bench Verified is increasingly contaminated and mismeasures frontier coding progress. Our analysis shows flawed tests and training leakage. We recommend SWE-bench Pro.

Fetched April 7, 2026