The value iteration algorithm is not strongly polynomial for discounted dynamic programming

Research output: Contribution to journal › Article › peer-review

25 Scopus citations

Abstract

This note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. In particular, the number of iterations can be exponential in the number of actions. Thus, unlike the policy iteration algorithm, the value iteration algorithm is not strongly polynomial for discounted dynamic programming.
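To make the algorithm under discussion concrete, here is a minimal sketch of value iteration for a discounted Markov decision process, run until successive value functions agree to a tolerance. The two-state, two-action MDP below is a hypothetical toy instance for illustration only; it is not the example constructed in the paper.

```python
# Minimal value iteration sketch for a discounted MDP (illustrative only;
# the toy MDP below is NOT the paper's worst-case example).
import numpy as np

def value_iteration(P, r, gamma, tol=1e-10, max_iter=10_000):
    """P[a][s, s'] = transition probabilities, r[a][s] = rewards,
    gamma in (0, 1) = discount factor. Returns (values, policy, iterations)."""
    n = r[0].shape[0]
    v = np.zeros(n)
    for k in range(1, max_iter + 1):
        # Bellman optimality update:
        # v'(s) = max_a [ r(s, a) + gamma * sum_{s'} P(s' | s, a) v(s') ]
        q = np.stack([r[a] + gamma * P[a] @ v for a in range(len(r))])
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, q.argmax(axis=0), k
        v = v_new
    return v, q.argmax(axis=0), max_iter

# Toy 2-state, 2-action MDP (assumed for illustration).
P = [np.array([[1.0, 0.0], [0.0, 1.0]]),   # action 0: stay in place
     np.array([[0.0, 1.0], [1.0, 0.0]])]   # action 1: switch states
r = [np.array([1.0, 0.0]),                 # action 0 rewards per state
     np.array([0.0, 2.0])]                 # action 1 rewards per state
v, policy, iters = value_iteration(P, r, gamma=0.9)
```

The paper's point is about the iteration count `iters`: on this toy instance it is small, but the note's construction shows that, with exact arithmetic, the number of iterations before the greedy policy is optimal can grow exponentially in the number of actions.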

Original language: English
Pages (from-to): 130-131
Number of pages: 2
Journal: Operations Research Letters
Volume: 42
Issue number: 2
State: Published - Mar 2014

Keywords

  • Algorithm
  • Markov decision process
  • Policy
  • Strongly polynomial
  • Value iteration
