LiangSu8899/FlashRT
FlashRT is a high-performance realtime inference engine for small-batch, latency-sensitive AI workloads. The flagship integration is production VLA control for Pi0, Pi0.5, GR00T N1.6, and Pi0-FAST. It also supports LLMs, e.g., Qwen3.6-27B.
GitHub repository with 281 stars and 32 forks.
Language: C++
Topics: cuda, cuda-kernels, realtime-inference, realtime-vla, gr00t, gr00t-n1-6-3b, pi, pi05, vla, qwen