Reduction of discounted continuous-time MDPs with unbounded jump and reward rates to discrete-time total-reward MDPs

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

24 Scopus citations

Abstract

This chapter discusses a reduction of discounted continuous-time Markov decision processes (CTMDPs) to discrete-time Markov decision processes (MDPs). This reduction is based on the equivalence of a randomized policy that chooses actions only at jump epochs to a nonrandomized policy that can switch actions between jumps. For discounted CTMDPs with bounded jump rates, this reduction was introduced by the author in 2004 as a reduction to discounted MDPs. Here we show that this reduction also holds for unbounded jump and reward rates, but the corresponding MDP may not be discounted. However, the analysis of the equivalent total-reward MDP leads to the description of optimal policies for the CTMDP and provides methods for their computation.

Original language: English
Title of host publication: Systems and Control
Subtitle of host publication: Foundations and Applications
Publisher: Birkhauser
Pages: 77-97
Number of pages: 21
Edition: 9780817683368
State: Published - 2012

Publication series

Name: Systems and Control: Foundations and Applications
Number: 9780817683368

Keywords

  • Jump Rate
  • Optimal Policy
  • Reward Function
  • Reward Rate
  • Total Reward
