open-compass/opencompass
OpenCompass is an LLM evaluation platform that supports a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) across 100+ datasets.
GitHub repository with 7,061 stars and 784 forks.
Language: Python
Topics: evaluation, benchmark, large-language-model, chatgpt, llm, llama2, openai, llama3