Skip to main navigation Skip to search Skip to main content

Data structures for maintaining set partitions

Research output: Contribution to journalArticlepeer-review

3 Scopus citations

Abstract

Efficiently maintaining the partition induced by a set of features is an important problem in building decision-tree classifiers. In order to identify a small set of discriminating features, we need the capability of efficiently adding and removing specific features and determining the effect of these changes on the induced classification or partition. In this paper we introduce a variety of randomized and deterministic data structures to support these operations on both general and geometrically induced set partitions. We give both Monte Carlo and Las Vegas data structures that realize near-optimal time bounds and are practical to implement. We then provide a faster solution to this problem in the geometric setting. Finally, we present a data structure that efficiently estimates the number of partitions separating elements.

Original languageEnglish
Pages (from-to)43-67
Number of pages25
JournalRandom Structures and Algorithms
Volume25
Issue number1
DOIs
StatePublished - Aug 2004

Keywords

  • Approximation algorithms
  • Data structures
  • Decision trees
  • Random walks
  • Randomized algorithms
  • Set partitions

Fingerprint

Dive into the research topics of 'Data structures for maintaining set partitions'. Together they form a unique fingerprint.

Cite this