avnlp/biothink
Self-Reflective Question Answering for Biomedical Reasoning. GRPO fine-tuning via QLoRA & Unsloth with rewards for correctness, relevance, groundedness, utility & XML structure. Structured think → answer → self-reflection output with context grading, relevance assessment & groundedness evaluation. Evaluation with DeepEval LLM-as-a-Judge metrics (GEval, Faithfulness, Relevancy).
GitHub repository with 5 stars and 1 fork.
Language: Python
Topics: biomedical-question-answering, deepeval, grpo, rag, self-rag, self-reflection
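
As an illustration of the XML structure reward mentioned in the description, here is a minimal sketch of how such a reward might be scored. It is not taken from the repository: the tag names (`think`, `answer`, `self_reflection`), the function name `xml_structure_reward`, and the scoring rule are all assumptions chosen to mirror the think → answer → self-reflection layout.

```python
import re

# Hypothetical tag names; the repository's actual tags may differ.
SECTION_TAGS = ("think", "answer", "self_reflection")


def xml_structure_reward(completion: str) -> float:
    """Return the fraction of expected sections that appear exactly once,
    are non-empty, and follow the expected order."""
    position = 0  # search offset; enforces think -> answer -> self_reflection order
    found = 0
    for tag in SECTION_TAGS:
        # Reject sections whose opening or closing tag is missing or duplicated.
        if completion.count(f"<{tag}>") != 1 or completion.count(f"</{tag}>") != 1:
            continue
        pattern = re.compile(rf"<{tag}>(.+?)</{tag}>", re.DOTALL)
        match = pattern.search(completion, position)
        # Count the section only if it has non-whitespace content.
        if match and match.group(1).strip():
            found += 1
            position = match.end()
    return found / len(SECTION_TAGS)


if __name__ == "__main__":
    sample = (
        "<think>Check the dosage evidence.</think>"
        "<answer>Yes</answer>"
        "<self_reflection>Grounded in the context.</self_reflection>"
    )
    print(xml_structure_reward(sample))
```

In a GRPO setup, a scalar like this would typically be one of several reward signals combined with the correctness, relevance, groundedness, and utility rewards; how the repository weights them is not stated here.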