Skip to main navigation Skip to search Skip to main content

An asymptotically efficient simulation-based algorithm for finite horizon stochastic dynamic programming

  • Hyeong Soo Chang
  • , Michael C. Fu
  • , Jiaqiao Hu
  • , Steven I. Marcus
  • Sogang University
  • University of Maryland, College Park

Research output: Contribution to journalArticlepeer-review

18 Scopus citations

Abstract

We present a simulation-based algorithm called "Simulated Annealing Multiplicative Weights" (SAMW) for solving large finite-horizon stochastic dynamic programming problems. At each iteration of the algorithm, a probability distribution over candidate policies is updated by a simple multiplicative weight rule, and with proper annealing of a control parameter, the generated sequence of distributions converges to a distribution concentrated only on the best policies. The algorithm is "asymptotically efficient," in the sense that for the goal of estimating the value of an optimal policy, a provably convergent finite-time upper bound for the sample mean is obtained.

Original languageEnglish
Pages (from-to)89-94
Number of pages6
JournalIEEE Transactions on Automatic Control
Volume52
Issue number1
DOIs
StatePublished - Jan 2007

Keywords

  • Learning algorithms
  • Markov decision processes
  • Simulated annealing
  • Simulation
  • Stochastic dynamic programming

Fingerprint

Dive into the research topics of 'An asymptotically efficient simulation-based algorithm for finite horizon stochastic dynamic programming'. Together they form a unique fingerprint.

Cite this