
Sufficiency of deterministic policies for atomless discounted and uniformly absorbing MDPs with multiple criteria

  • University of Liverpool

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

This paper studies Markov decision processes (MDPs) with atomless initial state distributions and atomless transition probabilities. Such MDPs are called atomless. The initial state distribution is considered to be fixed. We show that for discounted MDPs with bounded one-step reward vector-functions, for each policy there exists a deterministic (that is, nonrandomized and stationary) policy with the same performance vector. This fact is proved in the paper for a more general class of uniformly absorbing MDPs with expected total rewards, and then it is extended under certain assumptions to MDPs with unbounded rewards. For problems with multiple criteria and constraints, these results imply that for the atomless MDPs studied in this paper it is sufficient to consider only deterministic policies, whereas without the atomless assumption it is well known that randomized policies can outperform deterministic ones. We also provide an example of an MDP demonstrating that if a vector measure is defined on a standard Borel space, then Lyapunov’s convexity theorem is a special case of the described results.
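
The contrast drawn in the abstract, that without the atomless assumption randomized policies can outperform deterministic ones, can be illustrated with a small sketch. The example below is not taken from the paper: the one-state MDP, the reward vectors, the discount factor, the constraint threshold, and the function names are all hypothetical choices made for illustration. A single state is an atom, so the atomless assumption fails, and "deterministic" is used in the abstract's sense (nonrandomized and stationary).

```python
from fractions import Fraction

# Hypothetical one-state MDP with two actions and two reward criteria.
# Action "a" pays (1, 0) per step, action "b" pays (0, 1) per step.
REWARDS = {"a": (Fraction(1), Fraction(0)), "b": (Fraction(0), Fraction(1))}


def performance_vector(p_a, beta):
    """Expected discounted reward vector of the stationary policy that
    plays "a" with probability p_a and "b" with probability 1 - p_a."""
    total_weight = 1 / (1 - beta)  # sum of beta**t over t = 0, 1, 2, ...
    return tuple(
        total_weight * (p_a * REWARDS["a"][i] + (1 - p_a) * REWARDS["b"][i])
        for i in range(2)
    )


def best_feasible_value(policies, beta, threshold):
    """Maximize criterion 0 over the given policies, subject to
    criterion 1 being at least `threshold`."""
    values = [performance_vector(p, beta) for p in policies]
    feasible = [v for v in values if v[1] >= threshold]
    return max(v[0] for v in feasible)


beta = Fraction(9, 10)      # assumed discount factor
threshold = Fraction(5)     # assumed constraint level on criterion 1

# Stationary deterministic policies: always "a" (p_a = 1) or always "b" (p_a = 0).
deterministic_best = best_feasible_value([Fraction(0), Fraction(1)], beta, threshold)

# Allowing one randomized stationary policy that mixes the actions equally.
randomized_best = best_feasible_value(
    [Fraction(0), Fraction(1, 2), Fraction(1)], beta, threshold
)

print(deterministic_best, randomized_best)
```

In this toy problem, always playing "a" violates the constraint and always playing "b" earns nothing on the first criterion, while the equal mixture satisfies the constraint with a strictly positive objective. The paper's result says that this kind of gap cannot occur in the atomless MDPs it studies.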

Original language: English
Pages (from-to): 163-191
Number of pages: 29
Journal: SIAM Journal on Control and Optimization
Volume: 57
Issue number: 1
State: Published - 2019

Keywords

  • Atomless
  • Compact
  • Convex
  • Deterministic policy
  • Discounted
  • Markov decision process
