Abstract
Efficiently maintaining the partition induced by a set of features is an important problem in building decision-tree classifiers. In order to identify a small set of discriminating features, we need the capability of efficiently adding and removing specific features and determining the effect of these changes on the induced classification or partition. In this paper we introduce a variety of randomized and deterministic data structures to support these operations on both general and geometrically induced set partitions. We give both Monte Carlo and Las Vegas data structures that realize near-optimal time bounds and are practical to implement. We then provide a faster solution to this problem in the geometric setting. Finally, we present a data structure that efficiently estimates the number of partitions separating elements.
| Original language | English |
|---|---|
| Pages (from-to) | 43-67 |
| Number of pages | 25 |
| Journal | Random Structures and Algorithms |
| Volume | 25 |
| Issue number | 1 |
| DOIs | |
| State | Published - Aug 2004 |
Keywords
- Approximation algorithms
- Data structures
- Decision trees
- Random walks
- Randomized algorithms
- Set partitions
Fingerprint
Dive into the research topics of 'Data structures for maintaining set partitions'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver