tinybirdco/llm-benchmark
We assessed the ability of popular LLMs to generate accurate and efficient SQL from natural language prompts. Using a 200-million-record dataset from the GH Archive, uploaded to Tinybird, we asked each LLM to generate SQL for 50 prompts.
GitHub repository with 81 stars and 9 forks.
Language: TypeScript