VectorInstitute/inspect-mlflow

MLflow integration for Inspect AI evals: log runs, metrics, artifacts, and traces.

GitHub repository with 9 stars and 0 forks.

Language: Python

Topics: evals, mlflow

Latest metric snapshot

2026-06-05: 9 stars and 0 forks.

Similar repositories

1. future-agi/future-agi

Open-source, end-to-end platform for evaluating, observing, and improving LLM and AI agent applications. Tracing · Evals · Simulations · Datasets · Gateway · Guardrails. Self-hostable. Apache 2.0.

GitHub repository with 1,101 stars and 228 forks.

Trending score: 2.95; stars gained: +27; forks gained: +5.

Language: Python

Topics: ai, ai-gateway, evals, llm, observability, simulation

2. harbor-framework/harbor

Harbor is a framework for running agent evaluations and creating and using RL environments.

GitHub repository with 2,298 stars and 1,108 forks.

Trending score: 1.37; stars gained: +26; forks gained: +4.

Language: Python

Topics: evals, rl-environments, terminal-bench

3. Kiln-AI/Kiln

Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.

GitHub repository with 4,863 stars and 368 forks.

Trending score: 0.83; stars gained: +4; forks gained: +0.

Language: Python

Topics: ai, chain-of-thought, collaboration, dataset-generation, fine-tuning, machine-learning

4. METR/hawk

Run Inspect AI evals in the cloud.

GitHub repository with 22 stars and 19 forks.

Trending score: 0.51; stars gained: +0; forks gained: +1.

Language: Python

Topics: aws, evals, inspect-ai, llm

5. SumanD18/sentinel

Open-source observability and trust layer for AI agents: trace every step, score every output, catch hallucinations and runaway loops in real time. Self-hostable.

GitHub repository with 6 stars and 0 forks.

Trending score: 0.03; stars gained: +0; forks gained: +0.

Language: Python

Topics: ai-agents, evals, fastapi, guardrails, hallucination-detection, llm

Trending in Python

1. NousResearch/hermes-agent

The agent that grows with you

GitHub repository with 181,138 stars and 31,078 forks.

Trending score: 5.95; stars gained: +1,867; forks gained: +361.

Language: Python

Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

2. chopratejas/headroom

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

GitHub repository with 12,594 stars and 817 forks.

Trending score: 5.69; stars gained: +2,829; forks gained: +175.

Language: Python

Topics: agent, ai, anthropic, claude-code, compression, context-engineering

3. Imbad0202/academic-research-skills

Academic Research Skills for Claude Code: research → write → review → revise → finalize

GitHub repository with 27,254 stars and 2,241 forks.

Trending score: 5.52; stars gained: +1,079; forks gained: +89.

Language: Python

Topics: academic-pipeline, academic-writing, ai-research, claude, claude-code, literature-review

4. open-webui/open-webui

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

GitHub repository with 140,059 stars and 20,110 forks.

Trending score: 5.04; stars gained: +317; forks gained: +58.

Language: Python

Topics: ollama, ollama-webui, llm, webui, self-hosted, llm-ui

5. ZhuLinsen/daily_stock_analysis

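LLM-powered stock analysis system for A-share, H-share, and US markets: multi-source market data + real-time news + LLM decision dashboard + multi-channel push notifications. Runs on a schedule at zero cost, completely free.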

GitHub repository with 40,774 stars and 38,952 forks.

Trending score: 4.88; stars gained: +836; forks gained: +443.

Language: Python

Topics: a-stock, ai-agent, aigc, llm, quant, quantitative-finance

6. anthropics/financial-services

GitHub repository with 29,960 stars and 4,217 forks.

Trending score: 4.88; stars gained: +688; forks gained: +114.

Language: Python

Trending topic: evals

1. mastra-ai/mastra

2. future-agi/future-agi

3. harbor-framework/harbor

4. Kiln-AI/Kiln

5. MCPJam/inspector

6. spences10/my-pi