Two divide-and-conquer clustering algorithms proven to be consistent for large networks
In this article, we present two division-to-rule algorithms for grouping large graphs. Both algorithms apply a basic algorithm to several small subgraphs, and then use these individual local groupings to obtain a global grouping. We show that our methods help to scale computation-intensive basic clustering algorithms to large networks and improve the algorithmic stability of some well-known algorithms.
In this article, we propose division-to-rule strategies to solve the problem of community detection in networks. We propose two algorithms which perform clustering on several small subgraphs and finally patch the results in a single clustering. The main advantage of these algorithms is that they significantly reduce the computational cost of traditional algorithms, including spectral clustering, semi-defined programs, modularity-based methods, likelihood-based methods, etc., without lose precision, and sometimes even improve precision. These algorithms are also, by nature, parallelizable. Since most traditional algorithms are precise, and the corresponding optimization problems are much simpler in small problems, our divide-and-conquer methods provide an omnibus recipe for scaling traditional algorithms to large networks. We prove the consistency of these algorithms under various subgraph selection procedures and perform in-depth simulations and real-world data analyzes to understand the benefits of the divide-and-conquer approach in various contexts.
- Accepted August 30, 2021.
Author contributions: research designed by SSM, PS and PJB; SSM, PS and PJB carried out research; SSM and PS provided new reagents / analytical tools; SSM and PS analyzed the data; and SSM and PS wrote the paper.
Editors: CEP, Johns Hopkins University; and KR, University of Wisconsin-Madison.
The authors declare no competing interests.
This article contains additional information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2100482118/-/DCSupplemental.