exploitbench/exploitbench
ExploitBench measures how far AI agents climb the exploitation ladder: reaching vulnerable code, triggering the bug, building exploit primitives, and finally achieving arbitrary code execution.
GitHub repository with 222 stars and 16 forks.
Language: Python
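
The staged measurement in the description can be pictured as an ordered ladder where an agent's score is the highest rung it reaches. The sketch below is only an illustration of that idea, not ExploitBench's actual code: the `Stage` enum, its member names, and the `highest_stage` helper are all hypothetical, and the rule that a rung only counts if every lower rung was also reached is an assumption.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical rungs, named after the repository description."""
    NONE = 0
    REACHED_VULNERABLE_CODE = 1
    TRIGGERED_BUG = 2
    BUILT_EXPLOIT_PRIMITIVE = 3
    ARBITRARY_CODE_EXECUTION = 4


def highest_stage(achieved):
    """Return the highest rung reached without skipping a lower one.

    `achieved` is any collection of Stage members an agent demonstrated.
    Assumption: the ladder is cumulative, so a gap stops the climb.
    """
    reached = Stage.NONE
    for stage in list(Stage)[1:]:  # skip NONE, walk rungs in order
        if stage not in achieved:
            break
        reached = stage
    return reached


if __name__ == "__main__":
    run = {Stage.REACHED_VULNERABLE_CODE, Stage.TRIGGERED_BUG}
    print(highest_stage(run).name)
```

Using an `IntEnum` keeps the rungs comparable, so results from different agents can be ranked with ordinary `<` and `max`.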