sparkdq-community/sparkdq
A lightweight, declarative PySpark framework for data quality validation: check columns, rows, and entire datasets directly in your Spark pipelines.
GitHub repository with 75 stars and 8 forks.
Language: Python
Topics: data-quality, data-validation, pyspark, data-engineering, pyspark-validation, spark-data-quality, apache-spark, databricks, etl