Clustering Algorithms — Interactive Comparison
Three classic unsupervised algorithms (K-Means, DBSCAN, HAC) compared on three benchmark 2D datasets where the "right" choice differs.
The point isn't that one algorithm is best; it's that matching the algorithm to the geometry of your data usually matters more than parameter tuning. Try K-Means on the spiral dataset and watch it fail.
Scope: these are synthetic 2D toy datasets chosen because they make geometric trade-offs visible at a glance. Real-world clustering rarely looks this clean — the metrics shown (silhouette, Davies-Bouldin, Calinski-Harabasz) are useful relative comparisons here, less so as absolute production benchmarks.
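To make the first of those metrics concrete, here is a minimal sketch of the silhouette score in plain Python. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the point's score is (b - a) / max(a, b). The function name, the O(n^2) loop, and the four-point toy data are illustrative assumptions, not the code behind this page; libraries such as scikit-learn ship an equivalent metric.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (simple O(n^2) sketch)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to every other index in `members`.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: two tight pairs of points, far apart.
blobs = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(blobs, [0, 0, 1, 1])  # labels follow the two pairs
bad = silhouette_score(blobs, [0, 1, 0, 1])   # each label straddles both pairs
print(round(good, 3), round(bad, 3))
```

The labeling that follows the two pairs scores close to 1, while the labeling that straddles them scores below 0, which is the kind of relative comparison the metric supports.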
3 entwined spiral-shaped clusters (N=312). DBSCAN dominates here: centroid-based methods such as K-Means assume compact, roughly convex clusters and get fooled by the geometry.
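The effect can be reproduced offline with a small, standard-library-only sketch. Everything below is an assumption made for illustration: the noise-free spiral generator (3 arms of 104 points, N=312), the values eps=0.6 and min_samples=3, the fixed K-Means seed, and the purity helper are not taken from this page's dataset or tuned parameters. HAC is left out to keep the sketch short.

```python
import math
import random
from collections import Counter

def make_spirals(n_per_arm=104, arms=3):
    """Noise-free interleaved Archimedean spirals (assumed stand-in dataset)."""
    points, labels = [], []
    for k in range(arms):
        for i in range(n_per_arm):
            t = i / (n_per_arm - 1)
            r = 0.5 + 4.5 * t
            theta = 2 * math.pi * k / arms + 3 * math.pi * t
            points.append((r * math.cos(theta), r * math.sin(theta)))
            labels.append(k)
    return points, labels

def dbscan(points, eps, min_samples):
    """Minimal O(n^2) DBSCAN; returns one label per point, -1 for noise."""
    n = len(points)
    # Each neighborhood includes the point itself, so min_samples counts it too.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                frontier.extend(neighbors[j])  # only core points keep expanding
    return labels

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm with randomly sampled initial centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return labels

def purity(true_labels, pred_labels):
    """Fraction of points that belong to their cluster's majority class."""
    clusters = {}
    for t, p in zip(true_labels, pred_labels):
        clusters.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in clusters.values()) / len(true_labels)

points, truth = make_spirals()
db_labels = dbscan(points, eps=0.6, min_samples=3)
km_labels = kmeans(points, k=3)
print("DBSCAN purity: ", purity(truth, db_labels))
print("K-Means purity:", purity(truth, km_labels))
```

In this setup, neighboring points along an arm are closer together than eps while separate arms are farther apart than eps, so DBSCAN can follow each arm from end to end. K-Means can only carve the plane into three convex regions, and each region cuts across all three arms.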
Run K-Means, DBSCAN, and HAC side-by-side with parameters pre-tuned for each dataset. Useful for seeing why no single algorithm "wins" across all geometries.
Pick a dataset above and press Compare to run all three algorithms with their pre-tuned parameters.