
Optimality of pure strategies in stochastic decision processes

Research output: Contribution to journal › Conference article › peer-review

Abstract

Discrete-time, infinite-horizon stochastic decision processes with various reward criteria are addressed. Sufficient conditions are obtained under which the value over a class of strategies equals the value over the subclass of nonrandomized strategies from this class. Two different methods for proving that nonrandomized strategies are as good as arbitrary ones are considered. The first method is based on the fact that the strategic measure, i.e., the measure on the set of trajectories generated by a strategy and an initial distribution, of an arbitrary strategy may be represented as a linear combination (or, more generally, as the image under a linear operator) of strategic measures generated by nonrandomized strategies and the same initial distribution. This method is applicable to various criteria and classes of strategies. The second method is applicable to Markov decision processes with the expected total reward criterion. It is based on linearity properties of optimality equations, on the approximation of dynamic programming models by negative dynamic programming models, and on the replacement of the initial model by another one whose states represent information about the past in the initial model.
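The first method's decomposition can be checked numerically in a small finite-horizon setting. The sketch below (a hypothetical illustration, not code from the paper) builds a two-state, two-action, two-step decision process with an arbitrary transition kernel and a randomized Markov strategy, then verifies that the strategic measure of the randomized strategy is exactly the convex combination of the strategic measures of all deterministic (history-dependent) strategies, with mixture weights given by the product of the randomized action probabilities at each decision point.

```python
import itertools

# Hypothetical finite model for illustration; all numbers are arbitrary.
S, A = (0, 1), (0, 1)   # states and actions
s0 = 0                  # fixed initial state (degenerate initial distribution)

# Transition kernel P[s][a][s']
P = {0: {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}},
     1: {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}}

# Randomized Markov strategy pi[s][a]
pi = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}

# Decision points: the empty history at t=0 and each history (a0, s1) at t=1.
histories = [()] + [(a0, s1) for a0 in A for s1 in S]
state_of = {(): s0, **{(a0, s1): s1 for a0 in A for s1 in S}}

def measure(policy):
    """Strategic measure over two-step trajectories (a0, s1, a1, s2),
    where policy maps each history to a distribution over actions."""
    m = {}
    for a0, s1, a1, s2 in itertools.product(A, S, A, S):
        m[(a0, s1, a1, s2)] = (policy[()][a0] * P[s0][a0][s1]
                               * policy[(a0, s1)][a1] * P[s1][a1][s2])
    return m

# Strategic measure generated by the randomized strategy.
rand_pol = {h: pi[state_of[h]] for h in histories}
m_pi = measure(rand_pol)

# Mix over all deterministic strategies: one fixed action per history.
mix = {traj: 0.0 for traj in m_pi}
for choice in itertools.product(A, repeat=len(histories)):
    d = dict(zip(histories, choice))
    # Weight of this deterministic strategy under the randomized one.
    w = 1.0
    for h in histories:
        w *= pi[state_of[h]][d[h]]
    det_pol = {h: {a: 1.0 if a == d[h] else 0.0 for a in A}
               for h in histories}
    for traj, p in measure(det_pol).items():
        mix[traj] += w * p

# The mixture of nonrandomized strategic measures reproduces the
# randomized strategic measure trajectory by trajectory.
assert all(abs(mix[t] - m_pi[t]) < 1e-12 for t in m_pi)
print("decomposition verified")
```

Because the weights multiply to 1 when summed over all deterministic strategies, this is a genuine convex combination; the same bookkeeping extends to longer horizons, where the number of deterministic history-dependent strategies grows but the factorization argument is unchanged.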

Original language: English
Pages (from-to): 2149-2154
Number of pages: 6
Journal: Proceedings of the IEEE Conference on Decision and Control
Volume: 4
State: Published - 1990
Event: 29th IEEE Conference on Decision and Control, Part 6 (of 6) - Honolulu, HI, USA
Duration: Dec 5, 1990 to Dec 7, 1990
