How We Broke Top AI Agent Benchmarks: And What Comes Next
via rdi.berkeley.edu
Article URL: https://rdi.berkeley.edu/blog/trustworthy-benchmarks-cont/
Comments URL: https://news.ycombinator.com/item?id=47733217
Points: 24
Comments: 3