Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming

Research output: Contribution to journal › Article › peer-review

Abstract

This note shows that the number of arithmetic operations required by any member of a broad class of optimistic policy iteration algorithms to solve a deterministic discounted dynamic programming problem with three states and four actions may grow arbitrarily. Therefore any such algorithm is not strongly polynomial. In particular, the modified policy iteration and λ-policy iteration algorithms are not strongly polynomial.
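To illustrate the class of algorithms the note analyzes, here is a minimal sketch of modified policy iteration on a small deterministic discounted MDP. The toy MDP below (3 states, 2 actions) is an assumption for illustration only; it is not the three-state, four-action counterexample constructed in the paper.

```python
import numpy as np

def modified_policy_iteration(P, r, gamma, m=5, tol=1e-8, max_iter=1000):
    """Modified policy iteration for a deterministic discounted MDP.

    P[a, s] -> next state under action a in state s (deterministic).
    r[a, s] -> immediate reward.
    m is the number of partial-evaluation backups per iteration
    (m = 0 recovers value iteration; m -> infinity approaches policy iteration).
    """
    n_actions, n_states = r.shape
    v = np.zeros(n_states)
    for _ in range(max_iter):
        # Greedy policy improvement with respect to the current estimate v.
        q = r + gamma * v[P]                  # q[a, s] = r(s,a) + gamma * v(next state)
        policy = np.argmax(q, axis=0)
        v_new = q[policy, np.arange(n_states)]
        # Partial policy evaluation: m extra backups under the fixed policy.
        for _ in range(m):
            nxt = P[policy, np.arange(n_states)]
            v_new = r[policy, np.arange(n_states)] + gamma * v_new[nxt]
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    return policy, v

# Toy deterministic MDP (hypothetical example): 3 states, 2 actions.
P = np.array([[1, 2, 0],      # action 0: cycle 0 -> 1 -> 2 -> 0
              [0, 0, 0]])     # action 1: always jump to state 0
r = np.array([[1.0, 0.0, 2.0],
              [0.5, 0.5, 0.5]])
policy, v = modified_policy_iteration(P, r, gamma=0.9)
```

The paper's result is that for members of this class (including λ-policy iteration), the number of arithmetic operations needed to reach the optimal policy cannot be bounded by a polynomial in the number of states and actions alone.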

Original language: English
Pages (from-to): 429-431
Number of pages: 3
Journal: Operations Research Letters
Volume: 42
Issue number: 6-7
DOIs
State: Published - Sep 2014

Keywords

  • Algorithm
  • Markov decision process
  • Modified policy iteration
  • Policy
  • Strongly polynomial
