Skip to main navigation Skip to search Skip to main content

As strong as the weakest link: Mining diverse cliques in weighted graphs

  • Petko Bogdanov
  • , Ben Baumer
  • , Prithwish Basu
  • , Amotz Bar-Noy
  • , Ambuj K. Singh
  • Smith College
  • RTX Corporation
  • City University of New York
  • University of California at Santa Barbara

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

12 Scopus citations

Abstract

Mining for cliques in networks provides an essential tool for the discovery of strong associations among entities. Applications vary, from extracting core subgroups in team performance data arising in sports, entertainment, research and business; to the discovery of functional complexes in high-throughput gene interaction data. A challenge in all of these scenarios is the large size of real-world networks and the computational complexity associated with clique enumeration. Furthermore, when mining for multiple cliques within the same network, the results need to be diversified in order to extract meaningful information that is both comprehensive and representative of the whole dataset. We formalize the problem of weighted diverse clique mining (mDKC) in large networks, incorporating both individual clique strength (measured by its weakest link) and diversity of the cliques in the result set. We show that the problem is NP-hard due to the diversity requirement. However, our formulation is sub-modular, and hence can be approximated within a constant factor from the optimal. We propose algorithms for mDKC that exploit the edge weight distribution in the input network and produce performance gains of more than 3 orders of magnitude compared to an exhaustive solution. One of our algorithms, Diverse Cliques (DiCliQ), guarantees a constant factor approximation while the other, Bottom Up Diverse Cliques (BUDiC), scales to large and dense networks without compromising the solution quality. We evaluate both algorithms on 5 real-world networks of different genres and demonstrate their utility for discovery of gene complexes and effective collaboration subgroups in sports and entertainment.

Original languageEnglish
Title of host publicationMachine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings
PublisherSpringer Verlag
Pages525-540
Number of pages16
EditionPART 1
ISBN (Print)9783642409875
DOIs
StatePublished - 2013
Event13th Joint European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2013 - Prague, Czech Republic
Duration: Sep 23 2013Sep 27 2013

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
NumberPART 1
Volume8188 LNAI

Conference

Conference13th Joint European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2013
Country/TerritoryCzech Republic
CityPrague
Period09/23/1309/27/13

Fingerprint

Dive into the research topics of 'As strong as the weakest link: Mining diverse cliques in weighted graphs'. Together they form a unique fingerprint.

Cite this