Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery
Paper in proceeding, 2017

Standard clustering methods such as K-means, Gaussian mixture models, and hierarchical clustering are beset by local minima, which are sometimes drastically suboptimal. Moreover, the number of clusters K must be known in advance. The recently introduced sum-of-norms (SON) or Clusterpath convex relaxation of K-means and hierarchical clustering shrinks cluster centroids toward one another and ensures a unique global minimizer. We give a scalable stochastic incremental algorithm based on proximal iterations to solve the SON problem with convergence guarantees. We also show that the algorithm recovers clusters under quite general conditions which have a similar form to the unifying proximity condition introduced in the approximation algorithms community (that covers paradigm cases such as Gaussian mixtures and planted partition models). We give experimental results to confirm that our algorithm scales much better than previous methods while producing clusters of comparable quality.
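To make the abstract's terms concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes the commonly used SON objective, 0.5 * sum_i ||x_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||_2, with one centroid u_i per data point x_i. The function names (son_objective, prox_pair, son_incremental), the 1/k step size, and the uniform pair sampling are illustrative assumptions; the paper's actual update rule and step-size schedule may differ.

```python
import math
import random


def son_objective(X, U, lam):
    """Assumed SON objective: data fit plus lam times the pairwise centroid distances."""
    fit = 0.5 * sum((x - u) ** 2 for xi, ui in zip(X, U) for x, u in zip(xi, ui))
    fuse = sum(
        math.dist(U[i], U[j]) for i in range(len(U)) for j in range(i + 1, len(U))
    )
    return fit + lam * fuse


def prox_pair(a, b, t):
    """Proximal operator of (a, b) -> t * ||a - b||_2.

    The midpoint of a and b is preserved and their difference is
    block soft-thresholded with threshold 2 * t.
    """
    d = [ai - bi for ai, bi in zip(a, b)]
    norm = math.sqrt(sum(di * di for di in d))
    scale = max(0.0, 1.0 - 2.0 * t / norm) if norm > 0 else 0.0
    mid = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
    new_a = [m + 0.5 * scale * di for m, di in zip(mid, d)]
    new_b = [m - 0.5 * scale * di for m, di in zip(mid, d)]
    return new_a, new_b


def son_incremental(X, lam, n_iters=2000, seed=0):
    """Illustrative stochastic incremental scheme (an assumption, not the paper's method).

    Each iteration samples one pair (i, j), takes a gradient step on that
    pair's share of the data-fit term, then applies the pairwise prox.
    """
    rng = random.Random(seed)
    n = len(X)
    U = [list(x) for x in X]
    for k in range(1, n_iters + 1):
        step = 1.0 / k  # assumed diminishing step size
        i, j = rng.sample(range(n), 2)
        for idx in (i, j):
            # Each point appears in n - 1 pairs, so each pair carries 1/(n - 1) of its fit term.
            U[idx] = [u - step * (u - x) / (n - 1) for u, x in zip(U[idx], X[idx])]
        U[i], U[j] = prox_pair(U[i], U[j], step * lam)
    return U
```

In this sketch, centroids that end up (numerically) equal are read as belonging to the same cluster, which is why K does not have to be fixed in advance; larger values of lam fuse more centroids.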

Authors

Ashkan Panahi

North Carolina State University

Devdatt Dubhashi

Chalmers, Computer Science and Engineering, Computer Science

Fredrik Johansson

Massachusetts Institute of Technology (MIT)

Chiranjib Bhattacharyya

CSA Department

Proceedings of Machine Learning Research

Vol. 6 4247-4260
978-151085514-4 (ISBN)

34th International Conference on Machine Learning, ICML 2017
Sydney, Australia

Subject categories

Control Engineering

Signal Processing

Computer Science

More information

Last updated

2022-04-26