bencherdev/bencher

🐰 Bencher - Continuous Benchmarking

GitHub repository with 850 stars and 41 forks.

Language: Rust

Topics: benchmark, ci, performance, continuous-benchmarking, cd, ci-cd, code-quality, benchmarking

Latest metric snapshot

2026-06-05: 850 stars and 41 forks.

Similar repositories

  1. pawurb/hotpath-rs

    Quickly find bottlenecks in Rust - one profiler for CPU, time, memory, and async code.

    GitHub repository with 1,529 stars and 45 forks.

    Trending score: 0.60; stars gained: +3; forks gained: +0.

    Language: Rust

    Topics: allocations, benchmark, performance, rust, debugging, mpsc

  2. gungraun/gungraun

    High-precision, one-shot and consistent benchmarking framework/harness for Rust. All Valgrind tools at your fingertips.

    GitHub repository with 274 stars and 23 forks.

    Trending score: 0.05.

    Language: Rust

    Topics: benchmark, cargo, rust, bindings, callgrind, client-request

Trending in Rust

  1. BigPizzaV3/CodexPlusPlus

    An enhanced tool for CodexApp, striving to make Codex better and more comfortable to use.

    GitHub repository with 13,760 stars and 852 forks.

    Trending score: 5.16; stars gained: +916; forks gained: +44.

    Language: Rust

  2. Hmbown/CodeWhale

    DeepSeek + MiMo coding agent in terminal

    GitHub repository with 37,132 stars and 3,195 forks.

    Trending score: 4.80; stars gained: +393; forks gained: +32.

    Language: Rust

    Topics: cli, deepseek, llm, rust, terminal, tui

  3. openai/codex

    Lightweight coding agent that runs in your terminal

    GitHub repository with 88,832 stars and 13,052 forks.

    Trending score: 4.58; stars gained: +326; forks gained: +48.

    Language: Rust

  4. tinyhumansai/openhuman

    Your Personal AI super intelligence. Private, Simple and extremely powerful.

    GitHub repository with 30,826 stars and 2,977 forks.

    Trending score: 4.37; stars gained: +332; forks gained: +50.

    Language: Rust

  5. fallow-rs/fallow

    Codebase intelligence for TypeScript and JavaScript. Free static layer: unused code, duplication, circular deps, complexity hotspots, architecture boundaries. Optional paid runtime layer: hot-path review and cold-path deletion evidence from real production traffic. Rust-native, sub-second, zero-config framework support.

    GitHub repository with 3,058 stars and 94 forks.

    Trending score: 4.05; stars gained: +346; forks gained: +16.

    Language: Rust

    Topics: cli, code-duplication, code-quality, codebase-intelligence, copy-paste-detection, dead-code

  6. aaif-goose/goose

    An open-source, extensible AI agent that goes beyond code suggestions: install, execute, edit, and test with any LLM.

    GitHub repository with 46,568 stars and 4,863 forks.

    Trending score: 3.80; stars gained: +152; forks gained: +28.

    Language: Rust

    Topics: acp, ai, ai-agents, mcp

Trending topic: benchmark

  1. Purewhiter/mobilegym

    MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · Browser-hosted Android Simulator · Verifiable Evaluation · Scalable Online RL Training

    GitHub repository with 517 stars and 82 forks.

    Trending score: 3.00; stars gained: +33; forks gained: +4.

    Language: TypeScript

    Topics: agent, agents, ai, android, automation, benchmark

  2. VibeBench/VibeSearchBench

    🔍 The hardest search benchmark in the wild — vague, multi-turn, proactive. 200 long-horizon tasks with persona-driven progressive disclosure, scored by verifiable schema-free knowledge-graph evaluation. No vibes, just triplet F1.

    GitHub repository with 780 stars and 9 forks.

    Trending score: 1.88; stars gained: +102; forks gained: +0.

    Language: Python

    Topics: agentic-ai, benchmark, llm, proactive-agent, search, search-agent

  3. Ammaar-Alam/minebench

    Minecraft-style voxel benchmark for comparing AI models (Arena + Sandbox)

    GitHub repository with 244 stars and 17 forks.

    Trending score: 1.14; stars gained: +13; forks gained: +0.

    Language: TypeScript

    Topics: ai, benchmark, llm, nlp, voxel, comparison-benchmarks

  4. hogeheer499-commits/strix-halo-guide

    AMD Strix Halo local LLM guide: direct 100.0 t/s 30B Qwen MoE on Ryzen AI MAX+ 395 / Radeon 8060S. Setup, benchmarks, raw evidence.

    GitHub repository with 91 stars and 4 forks.

    Trending score: 0.98; stars gained: +7; forks gained: +0.

    Language: Python

    Topics: amd, benchmark, gfx1151, inference, llama-cpp, llm

  5. sierra-research/tau2-bench

    τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

    GitHub repository with 1,273 stars and 328 forks.

    Trending score: 0.92; stars gained: +7; forks gained: +1.

    Language: Python

    Topics: benchmark, llm, ai, language-model-agent, conversational-agents

  6. open-compass/opencompass

    OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

    GitHub repository with 7,061 stars and 784 forks.

    Trending score: 0.91; stars gained: +4; forks gained: +1.

    Language: Python

    Topics: benchmark, chatgpt, evaluation, large-language-model, llama2, llama3