benseverndev-oss/goldenmatch
Zero-config entity resolution. The zero-tuning Fellegi-Sunter path beats hand-rolled Splink head-to-head; scales from a CSV to a verified 100M-row dedupe in 9.2 min on Ray. Fuzzy/exact/probabilistic + PPRL + LLM, identity graph. Python + edge-safe TypeScript (optional WASM), SQL-native in Postgres & DuckDB, MCP/REST + dbt/Airflow.
GitHub repository with 111 stars and 10 forks.
Language: Python
Topics: active-learning, agent, airflow, auto-config, data-engineering, data-quality, deduplication, entity-resolution, fuzzy-matching, human-in-the-loop
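
The description names Fellegi-Sunter probabilistic matching without showing what the model computes. Below is a minimal, generic sketch of Fellegi-Sunter match weights. It does not use goldenmatch's API. The field names, the `FIELD_PARAMS` table, its m/u values, and the `match_weight` helper are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not taken from goldenmatch):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELD_PARAMS = {
    "name": (0.9, 0.1),
    "zip": (0.8, 0.2),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum the Fellegi-Sunter log2 weights over all compared fields.

    Agreement on a field adds log2(m / u); disagreement adds
    log2((1 - m) / (1 - u)). A higher total means stronger evidence
    that the two records refer to the same entity.
    """
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a.get(field) == rec_b.get(field):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


if __name__ == "__main__":
    a = {"name": "ana lopez", "zip": "94110"}
    b = {"name": "ana lopez", "zip": "94110"}
    c = {"name": "bo chen", "zip": "10001"}
    print(match_weight(a, b))  # positive: both fields agree
    print(match_weight(a, c))  # negative: both fields disagree
```

In practice the m and u probabilities are estimated from data (for example with expectation-maximization) instead of being set by hand, and exact equality is usually replaced by graded similarity levels. The sketch hard-codes both to keep the arithmetic visible.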