lexmount/browseruse-agent-bench
Real-world browser-agent benchmark: 210 tasks across 107 websites, multi-agent/multi-browser evaluation, reproducible leaderboard and result submissions.
GitHub repository with 11 stars and 1 fork.
Language: Python
Topics: agent, benchmark, browseruse, agent-evaluation, ai-agents, browser-agent, browser-automation, computer-use, leaderboard, llm-evaluation