Elijas/token-throttle
Multi-resource rate limiting for LLM APIs. Reserve tokens before you call, refund what you don't use, stay under the limit across workers.
GitHub repository with 18 stars and 2 forks.
Language: Python
Topics: ai, ai-agents, ai-engineering, llm, llms, rate-limit, rate-limit-redis, rate-limiter, rate-limiting, tokens
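
The reserve-then-refund idea in the description can be illustrated with a minimal single-process sketch. This is not token-throttle's actual API: the class `ReserveRefundLimiter`, its methods `try_reserve` and `refund`, and the resource names `requests` and `tokens` are all hypothetical, chosen only to show the pattern. It assumes a simple continuous refill over a fixed period and keeps state in memory, so it does not coordinate across workers; doing that would need a shared store (the repository's topics suggest Redis).

```python
import time


class ReserveRefundLimiter:
    """Hypothetical in-process sketch; NOT token-throttle's real API."""

    def __init__(self, limits, period=60.0, clock=time.monotonic):
        # limits maps a resource name to its budget per period,
        # e.g. {"requests": 500, "tokens": 30_000} per 60 seconds.
        self.limits = dict(limits)
        self.period = period
        self.clock = clock
        self.available = dict(limits)
        self.updated = clock()

    def _refill(self):
        # Restore capacity in proportion to elapsed time, capped at the limit.
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        for name, cap in self.limits.items():
            gained = cap * elapsed / self.period
            self.available[name] = min(cap, self.available[name] + gained)

    def try_reserve(self, usage):
        """Reserve every resource in `usage`, or nothing if any one is short."""
        self._refill()
        if any(self.available[name] < amount for name, amount in usage.items()):
            return False
        for name, amount in usage.items():
            self.available[name] -= amount
        return True

    def refund(self, reserved, actual):
        """Give back the part of a reservation the call did not consume."""
        for name, amount in reserved.items():
            unused = max(0, amount - actual.get(name, amount))
            self.available[name] = min(
                self.limits[name], self.available[name] + unused
            )


# Usage: reserve a worst-case estimate, call the API, refund the difference.
limiter = ReserveRefundLimiter({"requests": 2, "tokens": 1000})
estimate = {"requests": 1, "tokens": 800}
if limiter.try_reserve(estimate):
    actual = {"requests": 1, "tokens": 300}  # stand-in for real usage numbers
    limiter.refund(estimate, actual)
```

Reserving all resources together matters because a call that fits the request budget can still exceed the token budget; checking both before deducting either avoids leaking a partial reservation.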