PARMA-CC: A Family of Parallel Multiphase Approximate Cluster Combining Algorithms
Journal article, 2023
We show that PARMA-CC algorithms yield equivalent clustering outcomes despite their different approaches. Furthermore, we show that certain PARMA-CC algorithms can achieve higher efficiency with respect to certain properties of the data to be clustered. Generally speaking, in PARMA-CC algorithms, parallel threads compute summaries associated with clusters of data (sub)sets. As the threads concurrently combine the summaries, they construct a comprehensive summary of the sets of clusters. By approximating a cluster with its respective geometrical summaries, PARMA-CC algorithms scale well with increased data volumes, and, by computing and efficiently combining the summaries in parallel, they enable latency improvements. PARMA-CC algorithms utilize special data structures that enable parallelism through in-place data processing. As we show in our analysis and evaluation, PARMA-CC algorithms can complement and outperform well-established methods, with significantly better scalability, while still providing highly accurate results in a variety of data sets, even with skewed data distributions, which cause the traditional approaches to exhibit their worst-case behaviour.
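The summarize-then-combine pattern described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithms: axis-aligned bounding boxes stand in for the geometrical summaries, a naive distance-threshold grouping stands in for the local clustering step, and the combining step runs sequentially after the worker threads finish rather than concurrently. All function names and the `eps` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def find(parent, i):
    """Union-find lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, i, j):
    """Merge the sets containing i and j."""
    ri, rj = find(parent, i), find(parent, j)
    if ri != rj:
        parent[rj] = ri


def summarize_partition(points, eps):
    """Cluster one partition locally; return one bounding box per local cluster.

    Assumption: points closer than eps belong to the same local cluster.
    Each box is (min_x, min_y, max_x, max_y).
    """
    parent = list(range(len(points)))
    for i, j in combinations(range(len(points)), 2):
        (xa, ya), (xb, yb) = points[i], points[j]
        if (xa - xb) ** 2 + (ya - yb) ** 2 <= eps ** 2:
            union(parent, i, j)
    boxes = {}
    for i, (x, y) in enumerate(points):
        root = find(parent, i)
        x0, y0, x1, y1 = boxes.get(root, (x, y, x, y))
        boxes[root] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return list(boxes.values())


def boxes_close(a, b, eps):
    """True if the Euclidean gap between two boxes is at most eps."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return dx * dx + dy * dy <= eps * eps


def combine(summaries, eps):
    """Merge summaries whose boxes are close; return one label per summary."""
    parent = list(range(len(summaries)))
    for i, j in combinations(range(len(summaries)), 2):
        if boxes_close(summaries[i], summaries[j], eps):
            union(parent, i, j)
    return [find(parent, i) for i in range(len(summaries))]


def approximate_clusters(partitions, eps):
    """Summarize each partition in its own thread, then combine the summaries."""
    with ThreadPoolExecutor() as pool:
        per_partition = list(pool.map(lambda p: summarize_partition(p, eps), partitions))
    summaries = [box for boxes in per_partition for box in boxes]
    return summaries, combine(summaries, eps)


if __name__ == "__main__":
    parts = [[(0, 0), (0.5, 0)], [(1, 0), (10, 10)]]
    summaries, labels = approximate_clusters(parts, eps=1.0)
    print(len(summaries), "summaries in", len(set(labels)), "combined clusters")
```

Because only the summaries are compared during combining, the cost of that step depends on the number of local clusters rather than on the number of points, which is the intuition behind the scalability claim above. The thread pool here shows the structure only; in CPython it would not by itself deliver the latency improvements the abstract reports.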
Parallel Clustering
Synchronization
Data Structures
Approximation
Authors
Amir Keramatian
Networks and Systems
Vincenzo Massimiliano Gulisano
Networks and Systems
Marina Papatriantafilou
Networks and Systems
Philippas Tsigas
Networks and Systems
Journal of Parallel and Distributed Computing
0743-7315 (ISSN) 1096-0848 (eISSN)
Vol. 177, p. 68-88
HAREN: Self-deploying and adaptive data streaming analytics in the fog
Swedish Research Council (VR) (2016-03800), 2017-01-01 -- 2020-12-31.
Cloud-based products and production (FiC)
Swedish Foundation for Strategic Research (SSF) (GMT14-0032), 2016-01-01 -- 2020-12-31.
Subject Categories
Computer Engineering
Media Engineering
Computer Science
Computer Systems
Areas of Advance
Information and Communication Technology
Production
Driving Forces
Sustainable development
DOI
10.1016/j.jpdc.2023.02.001