Splitting Randomized Stationary Policies in Markov Decision Processes

Event Details
  • Date/Time:
    • Thursday April 2, 2009
      11:00 am - 12:00 pm
  • Location: Executive classroom
  • Phone:
  • URL:
  • Email:
  • Fee(s):
    $0.00
  • Extras:
Contact
Anita Race
H. Milton Stewart School of Industrial and Systems Engineering
Contact Anita Race
Summaries

Summary Sentence: Splitting Randomized Stationary Policies in Markov Decision Processes

Full Summary: Splitting Randomized Stationary Policies in Markov Decision Processes

TITLE: Splitting Randomized Stationary Policies in Markov Decision Processes

SPEAKER: Eugene A. Feinberg

ABSTRACT:

This talk discusses whether an occupation measure for a given randomized stationary policy for a Markov Decision Process (MDPs) can be represented as a convex combination of occupation measures for simpler policies. In particular, the question is whether an occupation measure for a randomized stationary policy can be represented as a convex integral combination of occupation measures for deterministic policies. If this is possible, we say that the policy can be split.

We present general results on such representations and focus our attention on two particular situations: (i) the given subset of states is a singleton (splitting at a state), and (ii) the given subset is finite and the policy uses a finite number of actions at this set (finite splitting). For splitting at a state, we present new formulas and provide necessary and sufficient conditions when a policy can be split. For finite splitting, we describe the structure of the split and provide an algorithm for its computation. Important properties of the structure of the finite splitting are that each two consecutive policies in the split differ only at one state and the number of summands in the convex combination is relatively small.

We apply finite splitting to constrained discounted MDPs with countable state spaces and prove that an optimal mixed stationary policy can be selected in such a way that each two consecutive stationary policies in the split differ at one state. Though this talk discusses infinite state and action MDPs, the results on finite splitting are new for discounted MDPs with finite state and action sets. We shall also discuss splitting of state-action frequencies in unichain MDPs with average rewards per unit time.

This talk is based in part on a joint paper with Eric V. Denardo and Uriel G. Rothblum.

Additional Information

In Campus Calendar
No
Groups

H. Milton Stewart School of Industrial and Systems Engineering (ISYE)

Invited Audience
No audiences were selected.
Categories
Seminar/Lecture/Colloquium
Keywords
MDPs
Status
  • Created By: Anita Race
  • Workflow Status: Published
  • Created On: Oct 12, 2009 - 4:36pm
  • Last Updated: Oct 7, 2016 - 9:47pm