MagellaX/StreamAttn
A high-performance attention mechanism that computes softmax normalization in a single streaming pass using running accumulators (online softmax).
GitHub repository with 31 stars and 0 forks.
Language: Python
Topics: cuda, llms, rl, triton