Dylsimple60/RLHF_learn

🤖 Enhance reinforcement learning stability and efficiency with advanced algorithms like TRPO, PPO, DPO, GRPO, DAPO, and GSPO for optimized policy training.

GitHub repository with 6 stars and 0 forks.

Language: Python

Topics: ai-safety, attention-mechanisms, datasets, deep-learning, deep-reinforcement-learning, gpt, human-feedback, large-language-models, openai-o1, python


Latest metric snapshot

2026-06-05: 6 stars and 0 forks.

Similar repositories

  1. microsoft/agent-governance-toolkit

    AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/10 OWASP Agentic Top 10.

    GitHub repository with 3,992 stars and 547 forks.

    Trending score: 4.25; stars gained: +167; forks gained: +12.

    Language: Python

    Topics: agent-framework, ai-agents, ai-safety, compliance, governance, microsoft

  2. ifixai-ai/iFixAi

    The open-source diagnostic for AI misalignment. 32 tests across fabrication, manipulation, deception, unpredictability, and opacity. Provider-agnostic. Runs against OpenAI, Anthropic, Bedrock, Azure, Gemini, and more. Letter grade in under 5 minutes, content-addressed manifest for bit-identical replay. Built by iMe.

    GitHub repository with 466 stars and 90 forks.

    Trending score: 1.78; stars gained: +6; forks gained: +3.

    Language: Python

    Topics: ai, diagnostic-tool, misalignment, agent-evaluation, ai-alignment, ai-evaluation

  3. emmanuelgjr/genai_incidents

    Single source of truth for GenAI and agentic AI security incidents, mapped to OWASP LLM Top 10, OWASP Agentic Top 10 (ASI), NIST AI RMF, and MITRE ATLAS.

    GitHub repository with 12 stars and 3 forks.

    Trending score: 0.87; stars gained: +6; forks gained: +1.

    Language: Python

    Topics: agentic-incidents, ai-incidents, ai-safety, cybersecurity, dataset, genai-incidents

  4. OraclesTech/guardian-sdk

    Ethicore Engine™ is an AI safety, ethics, and compliance platform. This repo contains the open-source components of Ethicore Engine™ - Guardian SDK, designed to protect your AI applications from prompt injection, jailbreaks, role hijacking, system-prompt extraction, and 100+ additional threat categories through a multi-layer analysis pipeline.

    GitHub repository with 88 stars and 11 forks.

    Trending score: 0.77; stars gained: +5; forks gained: +0.

    Language: Python

    Topics: adversarial-machine-learning, agent-safety, agent-security, agentic-loop, ai-agents, ai-safety

  5. swarm-ai-research/swarm

    SWARM: System-Wide Assessment of Risk in Multi-agent systems

    GitHub repository with 33 stars and 5 forks.

    Trending score: 0.53; stars gained: +1; forks gained: +0.

    Language: Python

    Topics: agi-safety, ai, ai-agent, ai-agents, ai-safety, alignment

  6. CyberStrategyInstitute/ai-safe2-framework

    The Universal Governance, Risk, Compliance (GRC) Operating System with Integrated Security for Agentic AI, Non-Human Identities, and Swarm Governance. AI SAFE² + AI Sovereignty Maturity Model (AISM) [Dual License: MIT + CC-BY-SA]

    GitHub repository with 127 stars and 19 forks.

    Trending score: 0.47; stars gained: +2; forks gained: +1.

    Language: Python

    Topics: agentic-ai, ai-governance, ai-security, compliance, devsecops, grc

Trending in Python

  1. NousResearch/hermes-agent

    The agent that grows with you

    GitHub repository with 181,454 stars and 31,141 forks.

    Trending score: 5.95; stars gained: +1,867; forks gained: +361.

    Language: Python

    Topics: ai, ai-agent, ai-agents, anthropic, chatgpt, claude

  2. chopratejas/headroom

    Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

    GitHub repository with 12,942 stars and 833 forks.

    Trending score: 5.69; stars gained: +2,829; forks gained: +175.

    Language: Python

    Topics: agent, ai, anthropic, claude-code, compression, context-engineering

  3. Imbad0202/academic-research-skills

    Academic Research Skills for Claude Code: research → write → review → revise → finalize

    GitHub repository with 27,379 stars and 2,251 forks.

    Trending score: 5.52; stars gained: +1,079; forks gained: +89.

    Language: Python

    Topics: academic-pipeline, academic-writing, ai-research, claude, claude-code, literature-review

  4. anthropics/financial-services

    GitHub repository with 30,002 stars and 4,224 forks.

    Trending score: 4.88; stars gained: +688; forks gained: +114.

    Language: Python

  5. virgiliojr94/book-to-skill

    Turn any technical book PDF into a Claude Code skill — ready to study, reference, and use while you work.

    GitHub repository with 4,221 stars and 528 forks.

    Trending score: 4.88; stars gained: +476; forks gained: +68.

    Language: Python

  6. vinta/awesome-python

    An opinionated list of Python frameworks, libraries, tools, and resources

    GitHub repository with 301,341 stars and 28,044 forks.

    Trending score: 4.60; stars gained: +518; forks gained: +24.

    Language: Python

    Topics: awesome, python, collections, python-frameworks, python-libraries, python-tools

Trending topic: ai-safety

  1. microsoft/agent-governance-toolkit

    AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/10 OWASP Agentic Top 10.

    GitHub repository with 3,992 stars and 547 forks.

    Trending score: 4.25; stars gained: +167; forks gained: +12.

    Language: Python

    Topics: agent-framework, ai-agents, ai-safety, compliance, governance, microsoft

  2. ifixai-ai/iFixAi

    The open-source diagnostic for AI misalignment. 32 tests across fabrication, manipulation, deception, unpredictability, and opacity. Provider-agnostic. Runs against OpenAI, Anthropic, Bedrock, Azure, Gemini, and more. Letter grade in under 5 minutes, content-addressed manifest for bit-identical replay. Built by iMe.

    GitHub repository with 466 stars and 90 forks.

    Trending score: 1.78; stars gained: +6; forks gained: +3.

    Language: Python

    Topics: ai, diagnostic-tool, misalignment, agent-evaluation, ai-alignment, ai-evaluation

  3. securelayer7/PROMPTPurify

    Prompt-injection guardrail for LLM applications. Compact model that outperforms larger open-source guards. No regex, no signatures. Demo: anton.securelayer7.net

    GitHub repository with 45 stars and 17 forks.

    Trending score: 0.97; stars gained: +9; forks gained: +5.

    Language: TypeScript

    Topics: ai-firewall, ai-safety, ai-security, application-security, ctf, guardrails

  4. cordum-io/cordum

    The open agent control plane. Govern autonomous AI agents with pre-execution policy enforcement, approval gates, and audit trails. Works with LangChain, CrewAI, MCP, and any framework.

    GitHub repository with 485 stars and 29 forks.

    Trending score: 0.90; stars gained: +1; forks gained: +0.

    Language: Go

    Topics: ai-orchestration, ai-safety, autonomous-agents, governance, llm-agents, workflow-engine

  5. emmanuelgjr/genai_incidents

    Single source of truth for GenAI and agentic AI security incidents, mapped to OWASP LLM Top 10, OWASP Agentic Top 10 (ASI), NIST AI RMF, and MITRE ATLAS.

    GitHub repository with 12 stars and 3 forks.

    Trending score: 0.87; stars gained: +6; forks gained: +1.

    Language: Python

    Topics: agentic-incidents, ai-incidents, ai-safety, cybersecurity, dataset, genai-incidents

  6. trustabl/trustabl

    Static analyzer for agent reliability.

    GitHub repository with 17 stars and 3 forks.

    Trending score: 0.84; stars gained: +1; forks gained: +0.

    Language: Go

    Topics: agent-security, agent-security-eval, agent-security-scanner, agent-security-tools, agent-tools, agents