samarailly51-pixel/claimpilot-harness
Crash-test insurance claim AI agents before production.
GitHub repository with 79 stars and 2 forks.
Language: Python
Topics: agent-evaluation, ai-agents, insurance, llm-evals, prompt-injection, python, testing
Trending score 1.52, freshness score 0.96, stars gained +21, forks gained +1.
2026-06-15: 79 stars and 2 forks.
OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents
GitHub repository with 12 stars and 0 forks.
Trending score: 1.02; stars gained: +3; forks gained: +0.
Language: Python
Topics: agent-evaluation, ai-agents, benchmark, llm-eval, skill-evaluation
A single interface to use and evaluate different agent frameworks
GitHub repository with 1,176 stars and 94 forks.
Trending score: 0.92; stars gained: +2; forks gained: +0.
Language: Python
Topics: agent-evaluation, agents, ai, a2a, mcp
Catch your AI's mistakes and blind spots before your customers or regulators do. iFixAi runs 45 inspections, 32 graded core plus 13 extended for frontier risks like sabotage, sandbagging, and oversight evasion. It returns a letter grade in under 5 minutes. Industry and model agnostic.
GitHub repository with 479 stars and 92 forks.
Trending score: 0.76; stars gained: +0; forks gained: +0.
Language: Python
Topics: ai, diagnostic-tool, agent-evaluation, ai-alignment, ai-evaluation, ai-governance
Open skill for capturing AI agent work as structured traces.
GitHub repository with 89 stars and 10 forks.
Trending score: 0.71; stars gained: -1; forks gained: +0.
Language: Python
Topics: agent-evaluation, agent-traces, agent-workflows, ai-agents, llm-agents, post-training
Real-world browser-agent benchmark: 210 tasks across 107 websites, multi-agent/multi-browser evaluation, reproducible leaderboard and result submissions.
GitHub repository with 12 stars and 2 forks.
Trending score: 0.23; stars gained: +0; forks gained: +0.
Language: Python
Topics: agent, agent-evaluation, ai-agents, benchmark, browser-agent, browser-automation
Generate high-definition short videos with one click using AI large language models.
GitHub repository with 88,172 stars and 12,648 forks.
Trending score: 6.02; stars gained: +1,097; forks gained: +218.
Language: Python
Topics: shortvideo, automation, chatgpt, moviepy, python, tiktok
Self-hosted AI workspace.
GitHub repository with 71,540 stars and 9,127 forks.
Trending score: 5.98; stars gained: +834; forks gained: +140.
Language: Python
The agent that grows with you
GitHub repository with 194,238 stars and 34,023 forks.
Trending score: 5.92; stars gained: +753; forks gained: +209.
Language: Python
Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude
Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks.
GitHub repository with 5,962 stars and 441 forks.
Trending score: 5.61; stars gained: +874; forks gained: +76.
Language: Python
Learn it. Build it. Ship it for others.
GitHub repository with 32,676 stars and 5,366 forks.
Trending score: 5.59; stars gained: +762; forks gained: +135.
Language: Python
Topics: agents, ai, ai-agents, ai-engineering, computer-vision, course
Generate draw.io diagrams from natural language — 6 presets, vision self-check + up to 5-round refinement, codebase-to-diagram, 10,000+ official shapes & 321 AI/LLM brand logos. Exports PNG/SVG/PDF/JPG.
GitHub repository with 3,445 stars and 240 forks.
Trending score: 5.51; stars gained: +1,369; forks gained: +113.
Language: Python
Topics: agent-skill, agent-skills, architecture-diagram, claude-code, claude-code-skill, claude-skills
Next-generation AI Agent Optimization Platform: Cozeloop addresses challenges in AI agent development by providing full-lifecycle management capabilities from development, debugging, and evaluation to monitoring.
GitHub repository with 5,522 stars and 764 forks.
Trending score: 1.76; stars gained: +8; forks gained: +0.
Language: Go
Topics: agent, agent-evaluation, agent-observability, agentops, ai, coze