ClevelandMuseumArt/openaccess
GitHub repository with 97 stars and 9 forks.
Topics: dataset
2026-06-05: 97 stars and 9 forks.
Single source of truth for GenAI and agentic AI security incidents, mapped to OWASP LLM Top 10, OWASP Agentic Top 10 (ASI), NIST AI RMF, and MITRE ATLAS.
GitHub repository with 12 stars and 3 forks.
Trending score: 0.87; stars gained: +6; forks gained: +1.
Language: Python
Topics: agentic-incidents, ai-incidents, ai-safety, cybersecurity, dataset, genai-incidents
Browser compatibility data for Web technologies as displayed on MDN
GitHub repository with 5,681 stars and 2,566 forks.
Trending score: 0.63; stars gained: +2; forks gained: +2.
Language: JSON
Topics: compatibility, compat, data, dataset, json
A hand-curated, topic-organized library of the best ML education — 923 docs (391 arXiv papers, 474 Stanford/MIT/Karpathy/fast.ai lectures, 58 explainer articles), normalized to Markdown with full provenance. Open it in Obsidian or point your agent at it. A clean ML corpus for learning, RAG & fine-tuning.
GitHub repository with 120 stars and 13 forks.
Trending score: 0.58; stars gained: +3; forks gained: +1.
Language: Python
Topics: arxiv, corpus, dataset, deep-learning, education, llm
Crowd-sourced lists of URLs to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ for the code.
GitHub repository with 69 stars and 93 forks.
Trending score: 0.42; stars gained: +1; forks gained: +2.
Topics: crawling, dataset, language-detection
200+ auto-updated space, astronomy & physics datasets on Hugging Face (NASA, NOAA, ESA, JPL, SpaceX, Wikidata). Satellites, asteroids, space probes (Voyager, Cassini, Mars), space weather, exoplanets, pulsars, radio/X-ray surveys, cosmic rays, particle physics, and more. Parquet format, no API keys.
GitHub repository with 7 stars and 1 fork.
Trending score: 0.36; stars gained: +1; forks gained: +1.
Language: Python
Topics: asteroids, huggingface-datasets, machine-learning-dataset, nasa, noaa, open-data
The National Gallery of Art Open Data Program
GitHub repository with 745 stars and 120 forks.
Trending score: 0.32; stars gained: +1; forks gained: +0.
Language: Python
Topics: open, data, art, collection, csv, national-gallery