Reliability in cloud computing: Downtime predictions for virtual servers

Research output: Contribution to conference, Paper, peer-review

Abstract

Reliability in cloud computing, i.e., failure resilience and availability, is a critically important and timely issue. Google, Amazon and other cloud computing service providers have all experienced severe service outages in the recent past. Poor reliability in the cloud not only affects existing users and applications, but also deters new users and applications from adopting cloud services, resulting in millions of dollars of losses in revenue and productivity. In this work, we address a fundamental reliability question faced by cloud computing service providers: how to predict the downtime of provisioned server infrastructure over a given finite duration, taking concurrent machine failures into account. We design a set of algorithms using an analytical and an empirical approach, based on the limiting behavior of the birth-death process and on sample path analysis, and provide a roadmap for the empirical analysis to be carried out in the future.

Original language: English
Pages: 25-30
Number of pages: 6
State: Published - 2011
Event: 21st Workshop on Information Technologies and Systems, WITS 2011 - Shanghai, China
Duration: Dec 3 2011 to Dec 4 2011

Conference

Conference: 21st Workshop on Information Technologies and Systems, WITS 2011
Country/Territory: China
City: Shanghai
Period: 12/3/11 to 12/4/11
