Efficient Join Synopsis Maintenance for Data Warehouse

  • University of Utah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

17 Scopus citations

Abstract

Sources such as daily business operations and sensors in IoT applications constantly generate large volumes of data, which are often loaded into a data warehouse system for complex analysis. Such analysis can be extremely costly when queries involve joins, especially many-to-many joins over multiple large tables. A join synopsis, i.e., a small uniform random sample over the join result, often suffices as a representative alternative to the full join result for many applications, such as histogram construction and model training. To that end, we propose a novel algorithm, SJoin, that can maintain a join synopsis over a pre-specified general θ-join query in a dynamic database with a continuous inflow of updates. Central to SJoin is a weighted join graph index, which helps efficiently replace join results in the synopsis upon each update. We conduct extensive experiments using TPC-DS and simulated road sensor data over several complex join queries, and the results demonstrate the clear advantage of SJoin over the best available baseline.
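To make the notion of a join synopsis concrete, here is a minimal Python sketch of a naive, insert-only baseline for a two-table equi-join. It is not SJoin and does not use a weighted join graph index. It enumerates every new join result produced by an insertion and feeds it to a standard reservoir sampler, which is the kind of per-update cost a dedicated index aims to avoid. The class name `NaiveJoinSynopsis`, its methods, and the restriction to equality joins without deletions are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class NaiveJoinSynopsis:
    """Hypothetical baseline: uniform sample of R JOIN S on an equality key.

    Insert-only; every new join result is enumerated explicitly.
    """

    def __init__(self, k, seed=None):
        self.k = k                          # synopsis size
        self.rng = random.Random(seed)
        self.r_by_key = defaultdict(list)   # rows of R grouped by join key
        self.s_by_key = defaultdict(list)   # rows of S grouped by join key
        self.sample = []                    # current join synopsis
        self.join_size = 0                  # join results seen so far

    def _offer(self, result):
        # Reservoir sampling (Algorithm R): after n offers, each result
        # is in the sample with probability min(1, k / n).
        self.join_size += 1
        if len(self.sample) < self.k:
            self.sample.append(result)
        else:
            j = self.rng.randrange(self.join_size)
            if j < self.k:
                self.sample[j] = result

    def insert_r(self, key, row):
        # A new R row joins with every existing S row sharing its key.
        self.r_by_key[key].append(row)
        for s_row in self.s_by_key.get(key, ()):
            self._offer((row, s_row))

    def insert_s(self, key, row):
        # A new S row joins with every existing R row sharing its key.
        self.s_by_key[key].append(row)
        for r_row in self.r_by_key.get(key, ()):
            self._offer((r_row, row))
```

In a many-to-many join, a single insertion can produce as many new join results as there are matching rows on the other side, so this baseline's update cost grows with the join fan-out. General θ-joins, more than two tables, and deletions are outside the scope of this sketch.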

Original language: English
Title of host publication: SIGMOD 2020 - Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 2027-2042
Number of pages: 16
ISBN (Electronic): 9781450367356
State: Published - Jun 14 2020
Event: 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020 - Portland, United States
Duration: Jun 14 2020 to Jun 19 2020

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data

Conference

Conference: 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020
Country/Territory: United States
City: Portland
Period: 06/14/20 to 06/19/20

Keywords

  • join synopsis
  • random sampling
