MCP-Universe benchmark shows GPT-5 fails more than half of real-world orchestration tasks

via venturebeat.com

Short excerpt below. Read at the original source.

A new benchmark from Salesforce research evaluates model and agentic performance on real-life enterprise tasks.
