
Query log compression for workload analytics

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations

Abstract

Analyzing database access logs is a key part of performance tuning, intrusion detection, benchmark development, and many other database administration tasks. Unfortunately, it is common for production databases to handle millions of queries or more each day, so these logs must be summarized before they can be used. Designing an appropriate summary encoding requires trading off conciseness against information content. For example, simple workload sampling may miss rare but high-impact queries. In this paper, we present LogR, a lossy log compression scheme suitable for use in many automated log analytics tools, as well as for human inspection. We formalize and analyze the space/fidelity trade-off in the context of a broader family of "pattern" and "pattern mixture" log encodings to which LogR belongs. We show through a series of experiments that LogR compressed encodings can be created efficiently, come with provable information-theoretic bounds on their accuracy, and outperform state-of-the-art log summarization strategies.

Original language: English
Pages (from-to): 183-196
Number of pages: 14
Journal: Proceedings of the VLDB Endowment
Volume: 12
Issue number: 3
DOIs
State: Published - 2018
Event: 45th International Conference on Very Large Data Bases, VLDB 2019 - Los Angeles, United States
Duration: Aug 26 2019 → Aug 30 2019
