Hacker News
SlopCodeBench: Benchmarking How Coding Agents Degrade over Long-Horizon Tasks (scbench.ai)
2 points by matt_d 22 days ago

