corpusmusic/millionsongtestset

Extracted data from the intersection of the Million Song Dataset's 10,000-song subset, the Tagtraum genre "ground truth" dataset, and the musiXmatch lyrics dataset. This combined dataset is designed to help researchers develop genre-classification algorithms and statistical analysis methods for the portion of the Million Song Dataset that has reliable lyric information and user-generated genre tags tied to specific songs.

GitHub repository with 6 stars and 2 forks.

Language: R

Latest metric snapshot

2026-06-05: 6 stars and 2 forks.

Similar repositories

1. prisma-flowdiagram/PRISMA2020

Produce PRISMA-2020 compliant flow diagrams

GitHub repository with 276 stars and 116 forks.

Trending score: 0.49; stars gained: +2; forks gained: +0.

Language: R

2. SequoiApp/Rsequoia2

GitHub repository with 10 stars and 0 forks.

Trending score: 0.33; stars gained: +1; forks gained: -1.

Language: R

3. Hinna0818/UKBAnalytica_v2

📊 A Scalable Phenotyping and Statistical Pipeline for UK Biobank RAP Data Analysis

GitHub repository with 33 stars and 4 forks.

Trending score: 0.33; stars gained: +1; forks gained: +0.

Language: R

4. riazarbi/sp500-scraper

Constituent history of the S&P 500 from various data sources

GitHub repository with 35 stars and 13 forks.

Trending score: 0.33; stars gained: +1; forks gained: +1.

Language: R

Topics: backtesting, equity-data, equity-research, sp500, sp500-data-analysis

5. openair-project/openair

🧭 Open source tools for air quality data analysis

GitHub repository with 357 stars and 122 forks.

Trending score: 0.32; stars gained: +1; forks gained: +0.

Language: R

Topics: air-quality, air-quality-data, meteorology, openair, package, r

6. yrosseel/lavaan

an R package for structural equation modeling and more

GitHub repository with 504 stars and 120 forks.

Trending score: 0.32; stars gained: +1; forks gained: +0.

Language: R

Topics: factor-analysis, growth-curve-models, latent-variables, missing-data, multilevel-models, multivariate-analysis
