devmirza-bot/frameworks-benchmark
Simple benchmarking tool written in HTML, CSS and JavaScript
GitHub repository with 7 stars and 1 fork.
Language: JavaScript
Topics: benchmark, nextjs, nextjs14, react, reactjs
2026-06-04: 7 stars and 1 fork.
Run unit tests with several test runners, or benchmark inside real browsers with Playwright and other JavaScript runtimes.
GitHub repository with 102 stars and 14 forks.
Trending score: 0.05; stars gained: +0; forks gained: +0.
Language: JavaScript
Topics: mocha, mochajs, tape, testing, testing-tools, playwright
Benchmarks for bundlers and build tools, including Rspack, Rsbuild, webpack, Vite, Rolldown, esbuild, Parcel and Farm.
GitHub repository with 105 stars and 6 forks.
Trending score: 0.04; stars gained: +0; forks gained: +0.
Language: JavaScript
Topics: build-tools, rsbuild, rspack, benchmark, bundler, farm
Unlimited FREE AI coding. Connect Claude Code, Codex, Cursor, Cline, Copilot, Antigravity to FREE Claude/GPT/Gemini via 40+ providers. Auto-fallback, RTK -40% tokens, never hit limits.
GitHub repository with 16,300 stars and 2,451 forks.
Trending score: 5.08; stars gained: +501; forks gained: +73.
Language: JavaScript
Topics: claude-code, cursor, ai-agents, ai-gateway, anthropic, chatgpt
Marketing skills for Claude Code and AI agents. CRO, copywriting, SEO, analytics, and growth engineering.
GitHub repository with 31,919 stars and 5,251 forks.
Trending score: 4.46; stars gained: +432; forks gained: +55.
Language: JavaScript
Git. Ship. Done - Core
GitHub repository with 2,672 stars and 161 forks.
Trending score: 3.96; stars gained: +189; forks gained: +13.
Language: JavaScript
Topics: claude-code, context-engineering, meta-prompting, spec-driven-development
Codex++ tweak system for the Codex desktop app
GitHub repository with 2,831 stars and 124 forks.
Trending score: 3.64; stars gained: +137; forks gained: +5.
Language: JavaScript
GitHub repository with 3,414 stars and 874 forks.
Trending score: 3.34; stars gained: +58; forks gained: +10.
Language: JavaScript
The design language that makes your AI harness better at design.
GitHub repository with 34,407 stars and 1,866 forks.
Trending score: 2.89; stars gained: +838; forks gained: +35.
Language: JavaScript
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · Browser-hosted Android Simulator · Verifiable Evaluation · Scalable Online RL Training
GitHub repository with 498 stars and 79 forks.
Trending score: 3.43; stars gained: +84; forks gained: +10.
Language: TypeScript
Topics: agent, agents, ai, android, automation, benchmark
🔍 The hardest search benchmark in the wild — vague, multi-turn, proactive. 200 long-horizon tasks with persona-driven progressive disclosure, scored by verifiable schema-free knowledge-graph evaluation. No vibes, just triplet F1.
GitHub repository with 774 stars and 2 forks.
Trending score: 1.88; stars gained: +100; forks gained: +0.
Language: Python
Topics: agentic-ai, benchmark, llm, proactive-agent, search, search-agent
Minecraft-style voxel benchmark for comparing AI models (Arena + Sandbox)
GitHub repository with 244 stars and 17 forks.
Trending score: 1.23; stars gained: +19; forks gained: +1.
Language: TypeScript
Topics: ai, benchmark, llm, nlp, voxel, comparison-benchmarks
BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx
GitHub repository with 1,502 stars and 205 forks.
Trending score: 1.17; stars gained: +14; forks gained: +0.
Language: Python
Topics: benchmark, embodied-ai, robotics, simulation
Benchmark for the quality of LLM-generated test suites — anti-fragility, rigor, mocking discipline, reuse — scored against human baselines, not coverage. Python, JS/TS, Go.
GitHub repository with 18 stars and 1 fork.
Trending score: 1.04; stars gained: +9; forks gained: +0.
Language: Python
Topics: benchmark, claude, code-quality, llm, mocha, pytest
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
GitHub repository with 1,273 stars and 328 forks.
Trending score: 0.92; stars gained: +7; forks gained: +1.
Language: Python
Topics: benchmark, llm, ai, language-model-agent, conversational-agents