fkiene/llmtrim
Local proxy that compresses your LLM API requests so you pay less, with no change to the answers. Trims wasted tokens from prompts, history, tool output, and code before they're sent: -31% input / -74% output, measured live. Any provider, no extra model calls. Also an MCP server and embeddable library (Rust, Python, Ruby, Kotlin, Swift).
GitHub repository with 50 stars and 2 forks.
Language: Rust
Topics: agentic-coding, ai, anthropic, claude-code, cost-reduction, developer-tools, llm, llmops, mcp, mitm-proxy
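
To make the idea of "trimming wasted tokens" from tool output concrete, here is a minimal sketch of one kind of lossless-looking trim: stripping trailing whitespace and collapsing runs of blank lines before text is forwarded to a provider. This is an illustration only. The function name `squeeze_blank_lines` is hypothetical, and nothing here is taken from llmtrim's actual code or API, whose compression strategies are not described in this summary.

```rust
/// Hypothetical helper (an assumption for illustration, not llmtrim's API).
/// Removes trailing whitespace from every line and collapses each run of
/// consecutive blank lines into a single blank line.
fn squeeze_blank_lines(text: &str) -> String {
    let mut kept: Vec<&str> = Vec::new();
    let mut previous_blank = false;

    for line in text.lines() {
        // Trailing spaces and tabs carry no meaning for the model.
        let trimmed = line.trim_end();
        let blank = trimmed.is_empty();

        // Skip a blank line if the line before it was already blank.
        if blank && previous_blank {
            continue;
        }

        kept.push(trimmed);
        previous_blank = blank;
    }

    // Rejoin with single newlines; a trailing newline in the input is dropped.
    kept.join("\n")
}
```

Calling `squeeze_blank_lines` on a string of command output returns a shorter string with the same visible content. A real proxy would presumably apply transformations like this to the request body before forwarding it, but how llmtrim does so is not stated here.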