
Model-driven autoscaling for Hadoop clusters

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

In this paper, we present the design and implementation of a model-driven autoscaling solution for Hadoop clusters. We first develop novel performance models for Hadoop workloads that relate job completion times to workload and system parameters such as input size and resource allocation. We then employ statistical techniques to tune the models for specific workloads, including TeraSort and K-means. Finally, we use the tuned models to determine the resources required to complete Hadoop jobs within the user-specified response time SLA. We implement our solution on an OpenStack-based cloud cluster running Hadoop. Our experimental results across different workloads demonstrate the autoscaling capabilities of our solution and show that it enables significant resource savings without compromising performance.
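The abstract does not give the form of the performance models, so the following is only a minimal sketch of the general idea (fit a model to past runs, then invert it to size the cluster for an SLA). It assumes a deliberately simple model, completion time = a + b × (input size / nodes), fitted by ordinary least squares. The model form, the function names `fit_model` and `nodes_for_sla`, the `max_nodes` cap, and the measurements are all hypothetical and are not taken from the paper.

```python
def fit_model(samples):
    """Least-squares fit of T = a + b * (input_gb / nodes).

    samples: list of (input_gb, nodes, seconds) tuples from past runs.
    This model form is an assumption for illustration only.
    """
    xs = [size / nodes for size, nodes, _ in samples]
    ts = [t for _, _, t in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    b = sxt / sxx            # seconds per (GB per node)
    a = mean_t - b * mean_x  # fixed overhead in seconds
    return a, b


def nodes_for_sla(a, b, input_gb, sla_seconds, max_nodes=64):
    """Smallest node count whose predicted time meets the SLA, or None."""
    for nodes in range(1, max_nodes + 1):
        if a + b * input_gb / nodes <= sla_seconds:
            return nodes
    return None  # SLA not reachable within max_nodes under this model


# Hypothetical measurements: (input GB, nodes, completion seconds)
history = [(10, 2, 160), (10, 4, 110), (20, 4, 160), (20, 8, 110), (40, 8, 160)]
a, b = fit_model(history)
needed = nodes_for_sla(a, b, input_gb=40, sla_seconds=120)
print(a, b, needed)
```

Returning `None` when no node count up to `max_nodes` satisfies the SLA keeps the "cannot meet the deadline" case explicit, which a real autoscaler would need to handle separately from a normal scaling decision.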

Original language: English
Title of host publication: Proceedings - IEEE International Conference on Autonomic Computing, ICAC 2015
Editors: Philippe Lalanda, Samuel Kounev, Ada Diaconescu, Lucy Cherkasova
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 155-156
Number of pages: 2
ISBN (Electronic): 9781467369701
DOIs
State: Published - Sep 14 2015
Event: 12th IEEE International Conference on Autonomic Computing, ICAC 2015 - Grenoble, France
Duration: Jul 7 2015 to Jul 10 2015

Publication series

Name: Proceedings - IEEE International Conference on Autonomic Computing, ICAC 2015

Conference

Conference: 12th IEEE International Conference on Autonomic Computing, ICAC 2015
Country/Territory: France
City: Grenoble
Period: 07/7/15 to 07/10/15

Keywords

  • Auto Scaling
  • Hadoop
  • Performance Modeling
