Information
AI agents are an exciting new research direction, and benchmarks are crucial for driving progress. However, current agent benchmarks and evaluation practices have several shortcomings that hinder their usefulness in real-world applications. ... We present five key findings from our analysis of AI agent benchmarks and evaluations. 1. Cost ...