harbor-framework/harbor

Harbor is a framework for running agent evaluations and creating and using RL environments.

GitHub repository with 2,304 stars and 1,108 forks.

Language: Python

Topics: evals, rl-environments, terminal-bench

24h trend summary

Trending score: 1.37; activity score: 0.05; stars gained: +26; forks gained: +4.

Latest metric snapshot

2026-06-05: 2,304 stars and 1,108 forks.

Similar repositories

  1. Arize-ai/phoenix

    AI Observability & Evaluation

    GitHub repository with 9,988 stars and 909 forks.

    Trending score: 3.03; stars gained: +17; forks gained: +2.

    Language: Python

    Topics: agents, ai-monitoring, ai-observability, aiengineering, anthropic, datasets

  2. future-agi/future-agi

    Open-source, end-to-end platform for evaluating, observing, and improving LLM and AI agent applications. Tracing · Evals · Simulations · Datasets · Gateway · Guardrails. Self-hostable. Apache 2.0.

    GitHub repository with 1,103 stars and 228 forks.

    Trending score: 2.95; stars gained: +27; forks gained: +5.

    Language: Python

    Topics: ai, ai-gateway, evals, llm, observability, simulation

  3. harbor-framework/harbor

    Harbor is a framework for running agent evaluations and creating and using RL environments.

    GitHub repository with 2,304 stars and 1,108 forks.

    Trending score: 1.37; stars gained: +26; forks gained: +4.

    Language: Python

    Topics: evals, rl-environments, terminal-bench

  4. SumanD18/sentinel

    Open-source observability and trust layer for AI agents: trace every step, score every output, catch hallucinations and runaway loops in real time. Self-hostable.

    GitHub repository with 6 stars and 0 forks.

    Trending score: 0.03; stars gained: +0; forks gained: +0.

    Language: Python

    Topics: ai-agents, evals, fastapi, guardrails, hallucination-detection, llm

  5. Swival/calibra

    A benchmarking harness for coding agents.

    GitHub repository with 14 stars and 3 forks.

    Trending score: 0.05.

    Language: Python

    Topics: agent, ai, benchmark, benchmarking, eval, evals

Trending in Python

  1. NousResearch/hermes-agent

    The agent that grows with you

    GitHub repository with 181,581 stars and 31,155 forks.

    Trending score: 5.95; stars gained: +1,867; forks gained: +361.

    Language: Python

    Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

  2. chopratejas/headroom

    Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

    GitHub repository with 13,361 stars and 853 forks.

    Trending score: 5.69; stars gained: +2,829; forks gained: +175.

    Language: Python

    Topics: agent, ai, anthropic, compression, context-engineering, context-window

  3. Imbad0202/academic-research-skills

    Academic Research Skills for Claude Code: research → write → review → revise → finalize

    GitHub repository with 27,422 stars and 2,253 forks.

    Trending score: 5.52; stars gained: +1,079; forks gained: +89.

    Language: Python

    Topics: academic-pipeline, academic-writing, ai-research, claude, claude-code, literature-review

  4. anthropics/financial-services

    GitHub repository with 30,002 stars and 4,224 forks.

    Trending score: 4.88; stars gained: +688; forks gained: +114.

    Language: Python

  5. virgiliojr94/book-to-skill

    Turn any technical book PDF into a Claude Code skill — ready to study, reference, and use while you work.

    GitHub repository with 4,250 stars and 534 forks.

    Trending score: 4.88; stars gained: +476; forks gained: +68.

    Language: Python

  6. vinta/awesome-python

    An opinionated list of Python frameworks, libraries, tools, and resources

    GitHub repository with 301,341 stars and 28,044 forks.

    Trending score: 4.60; stars gained: +518; forks gained: +24.

    Language: Python

    Topics: awesome, python, collections, python-frameworks, python-libraries, python-tools

Trending topic: evals

  1. mastra-ai/mastra

    From the team behind Gatsby, Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.

    GitHub repository with 24,786 stars and 2,198 forks.

    Trending score: 3.40; stars gained: +98; forks gained: +12.

    Language: TypeScript

    Topics: agents, ai, chatbots, evals, javascript, llm

  2. Arize-ai/phoenix

    AI Observability & Evaluation

    GitHub repository with 9,988 stars and 909 forks.

    Trending score: 3.03; stars gained: +17; forks gained: +2.

    Language: Python

    Topics: agents, ai-monitoring, ai-observability, aiengineering, anthropic, datasets

  3. future-agi/future-agi

    Open-source, end-to-end platform for evaluating, observing, and improving LLM and AI agent applications. Tracing · Evals · Simulations · Datasets · Gateway · Guardrails. Self-hostable. Apache 2.0.

    GitHub repository with 1,103 stars and 228 forks.

    Trending score: 2.95; stars gained: +27; forks gained: +5.

    Language: Python

    Topics: ai, ai-gateway, evals, llm, observability, simulation

  4. harbor-framework/harbor

    Harbor is a framework for running agent evaluations and creating and using RL environments.

    GitHub repository with 2,304 stars and 1,108 forks.

    Trending score: 1.37; stars gained: +26; forks gained: +4.

    Language: Python

    Topics: evals, rl-environments, terminal-bench

  5. MCPJam/inspector

    Testing and evaluation platform to chat, inspect, and debug MCP servers, MCP apps, and ChatGPT apps.

    GitHub repository with 1,990 stars and 235 forks.

    Trending score: 0.76; stars gained: +5; forks gained: +1.

    Language: TypeScript

    Topics: anthropic, chatgpt, cicd, debugger, evals, evaluation

  6. spences10/my-pi

    Composable pi coding agent with MCP, LSP, agent chains, prompt presets, and local eval telemetry

    GitHub repository with 41 stars and 10 forks.

    Trending score: 0.61; stars gained: +3; forks gained: +1.

    Language: TypeScript

    Topics: cli, coding-agent, llm, mcp, pi, typescript