rollinsio/beyond-test-coverage
Benchmark for the quality of LLM-generated test suites — anti-fragility, rigor, mocking discipline, reuse — scored against human baselines, not coverage. Python, JS/TS, Go.
GitHub repository with 18 stars and 1 fork.
Language: Python
Topics: benchmark, claude, code-quality, llm, mocha, pytest, test-generation, testing, vitest