php/real-time-benchmark-data

Results for Real-Time Benchmark for PHP

GitHub repository with 20 stars and 1 fork.

Topics: benchmark, php

Latest metric snapshot

2026-06-15: 20 stars and 1 fork.

Similar repositories

  1. VibeBench/VibeSearchBench

    🔍 The hardest search benchmark in the wild — vague, multi-turn, proactive. 200 long-horizon tasks with persona-driven progressive disclosure, scored by verifiable schema-free knowledge-graph evaluation. No vibes, just triplet F1.

    GitHub repository with 1,008 stars and 63 forks.

    Trending score: 3.33; stars gained: +50; forks gained: +37.

    Language: Python

    Topics: agentic-ai, benchmark, llm, proactive-agent, search, search-agent

  2. wuyoscar/Internal-Safety-Collapse

    Internal Safety Collapse (ISC): Turning the LLM or an AI Agent into a sensitive data generator.

    GitHub repository with 856 stars and 140 forks.

    Trending score: 2.51; stars gained: +11; forks gained: +3.

    Language: Python

    Topics: agent-safety, ai-safety, benchmark, jailbreak, large-language-models, llm-safety

  3. Purewhiter/mobilegym

    MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · Browser-hosted Android Simulator · Verifiable Evaluation · Scalable Online RL Training

    GitHub repository with 618 stars and 98 forks.

    Trending score: 2.50; stars gained: +12; forks gained: +1.

    Language: TypeScript

    Topics: benchmark, mobile-agent, reinforcement-learning, vlm, agents, gym

  4. hogeheer499-commits/strix-halo-guide

    Complete guide to running large language models locally on AMD Strix Halo / Ryzen AI MAX+ 395 with Radeon 8060S (gfx1151) and 96GB/128GB unified memory. Covers BIOS config, Ubuntu/kernel setup, Ollama, llama.cpp Vulkan/RADV, ROCm/HIP, vLLM, and 70B/120B GGUF evidence.

    GitHub repository with 142 stars and 6 forks.

    Trending score: 1.97; stars gained: +9; forks gained: +0.

    Language: Python

    Topics: amd, benchmark, gfx1151, llama-cpp, llm, local-llm

  5. SemiAnalysisAI/InferenceX

    Open Source Continuous Inference Benchmark Research Platform Kimi K2.6, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soon™ TPUv6e/v7/Trainium2/3

    GitHub repository with 1,098 stars and 194 forks.

    Trending score: 1.96; stars gained: +4; forks gained: +1.

    Language: Shell

    Topics: ai, amd, benchmark, cuda, gb200, llm

  6. open-compass/opencompass

    OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

    GitHub repository with 7,084 stars and 788 forks.

    Trending score: 1.55; stars gained: +3; forks gained: +0.

    Language: Python

    Topics: benchmark, chatgpt, evaluation, large-language-model, llama2, llama3
