
Parsing XML using parallel traversal of streaming trees

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

5 Scopus citations

Abstract

XML has been adopted across a wide spectrum of applications. Its parsing efficiency, however, remains a concern and can be a bottleneck. With the current trend toward multicore CPUs, parallelization to improve performance is increasingly relevant. In many applications, the XML is streamed from the network, so the complete document is never in memory at any one time. Parallel parsing of such a stream can be equated to parallel depth-first traversal of a streaming tree. Existing research on parallel tree traversal has assumed that the entire tree is available in memory, and thus cannot be directly applied. In this paper, we investigate parallel, SAX-style parsing of XML via a parallel, depth-first traversal of the streaming document. We show good scalability up to about 6 cores on a Linux platform.
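The correspondence between SAX-style events and a depth-first traversal can be illustrated with a minimal sequential sketch. It is not the parallel algorithm of the paper: it only shows that when a document arrives in chunks, the start-element events are delivered in depth-first (pre-order) order even though the whole tree is never held in memory. The class `DepthFirstRecorder`, the function `parse_stream`, and the sample document are hypothetical names chosen for this illustration; the parser calls come from Python's standard `xml.sax` module.

```python
import xml.sax


class DepthFirstRecorder(xml.sax.ContentHandler):
    """Hypothetical handler: records when each element is entered and left."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.events = []

    def startElement(self, name, attrs):
        # Entering a node: this is the pre-order visit of the traversal.
        self.events.append(("enter", name, self.depth))
        self.depth += 1

    def endElement(self, name):
        # Leaving a node: its whole subtree has been streamed past.
        self.depth -= 1
        self.events.append(("leave", name, self.depth))


def parse_stream(chunks):
    """Feed the document chunk by chunk, as if it arrived from a network."""
    handler = DepthFirstRecorder()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)  # incremental parsing: only this chunk is needed
    parser.close()
    return handler.events


# Assumed sample document, split at arbitrary byte boundaries.
CHUNKS = [b"<a><b><c/", b"></b><d", b"/></a>"]
EVENTS = parse_stream(CHUNKS)

if __name__ == "__main__":
    for kind, name, depth in EVENTS:
        print("  " * depth + kind + " " + name)
```

A parallel version would have to divide this event stream among workers before the end of the document is known, which is the difficulty the abstract points to when it notes that in-memory tree traversal techniques cannot be applied directly.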

Original language: English
Title of host publication: High Performance Computing - HiPC 2008 - 15th International Conference, Proceedings
Publisher: Springer Verlag
Pages: 142-156
Number of pages: 15
ISBN (Print): 354089893X, 9783540898931
State: Published - 2008
Event: 15th International Conference on High Performance Computing, HiPC 2008 - Bangalore, India
Duration: Dec 17 2008 to Dec 20 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5374 LNCS

Conference

Conference: 15th International Conference on High Performance Computing, HiPC 2008
Country/Territory: India
City: Bangalore
Period: 12/17/08 to 12/20/08
