academic/awesome-datascience

An awesome Data Science repository to learn and apply to real-world problems.

GitHub repository with 29,341 stars and 6,547 forks.

Topics: data-science, machine-learning, data-visualization, science, data-mining, awesome-list, deep-learning, analytics, data-scientists, hacktoberfest

24h trend summary

Trending score 1.38, activity score 0.05, stars gained +27, forks gained +5.

Latest metric snapshot

2026-06-05: 29,341 stars and 6,547 forks.

Similar repositories

  1. apache/superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    GitHub repository with 73,180 stars and 17,522 forks.

    Trending score: 2.73; stars gained: +24; forks gained: +19.

    Language: TypeScript

    Topics: analytics, apache, apache-superset, asf, bi, business-analytics

  2. marimo-team/marimo

    A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.

    GitHub repository with 21,312 stars and 1,126 forks.

    Trending score: 2.54; stars gained: +20; forks gained: +7.

    Language: Python

    Topics: notebooks, python, data-science, machine-learning, artificial-intelligence, data-visualization

  3. streamlit/streamlit

    Streamlit — A faster way to build and share data apps.

    GitHub repository with 44,830 stars and 4,264 forks.

    Trending score: 2.52; stars gained: +25; forks gained: +4.

    Language: Python

    Topics: data-analysis, data-science, data-visualization, deep-learning, developer-tools, machine-learning

  4. ray-project/ray

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    GitHub repository with 42,779 stars and 7,642 forks.

    Trending score: 2.37; stars gained: +15; forks gained: +13.

    Language: Python

    Topics: ray, distributed, parallel, machine-learning, reinforcement-learning, deep-learning

  5. SimplifyJobs/Summer2026-Internships

    Summer 2026 software engineering, data science, AI, quant, product management, and hardware internship postings. Updated daily by Simplify and Pitt CSC.

    GitHub repository with 44,806 stars and 3,182 forks.

    Trending score: 2.09; stars gained: +17; forks gained: +1.

    Language: Python

    Topics: data-science, fall-2026, github, internship, internships, interview-preparation

  6. lance-format/lance

    Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.

    GitHub repository with 6,582 stars and 695 forks.

    Trending score: 1.70; stars gained: +5; forks gained: +3.

    Language: Rust

    Topics: apache-arrow, computer-vision, data-analysis, data-analytics, data-centric, data-format

Trending topic: data-science

  1. apache/superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    GitHub repository with 73,180 stars and 17,522 forks.

    Trending score: 2.73; stars gained: +24; forks gained: +19.

    Language: TypeScript

    Topics: analytics, apache, apache-superset, asf, bi, business-analytics

  2. marimo-team/marimo

    A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.

    GitHub repository with 21,312 stars and 1,126 forks.

    Trending score: 2.54; stars gained: +20; forks gained: +7.

    Language: Python

    Topics: notebooks, python, data-science, machine-learning, artificial-intelligence, data-visualization

  3. streamlit/streamlit

    Streamlit — A faster way to build and share data apps.

    GitHub repository with 44,830 stars and 4,264 forks.

    Trending score: 2.52; stars gained: +25; forks gained: +4.

    Language: Python

    Topics: data-analysis, data-science, data-visualization, deep-learning, developer-tools, machine-learning

  4. ray-project/ray

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    GitHub repository with 42,779 stars and 7,642 forks.

    Trending score: 2.37; stars gained: +15; forks gained: +13.

    Language: Python

    Topics: ray, distributed, parallel, machine-learning, reinforcement-learning, deep-learning

  5. SimplifyJobs/Summer2026-Internships

    Summer 2026 software engineering, data science, AI, quant, product management, and hardware internship postings. Updated daily by Simplify and Pitt CSC.

    GitHub repository with 44,806 stars and 3,182 forks.

    Trending score: 2.09; stars gained: +17; forks gained: +1.

    Language: Python

    Topics: data-science, fall-2026, github, internship, internships, interview-preparation

  6. lance-format/lance

    Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.

    GitHub repository with 6,582 stars and 695 forks.

    Trending score: 1.70; stars gained: +5; forks gained: +3.

    Language: Rust

    Topics: apache-arrow, computer-vision, data-analysis, data-analytics, data-centric, data-format