Infinite horizon discounted dynamic programming subject to total variation ambiguity on conditional distribution
Paper in proceedings, 2016

We analyze the infinite-horizon minimax discounted-cost Markov Control Model (MCM) for a class of controlled-process conditional distributions that belong to a ball, with respect to the total variation distance metric, centered at a known nominal controlled conditional distribution with radius R ∈ [0, 2]. The minimization is over the control strategies and the maximization is over the conditional distributions. Through our analysis we (i) derive a new discounted dynamic programming equation, (ii) show the associated contraction property, and (iii) develop a new policy iteration algorithm. Finally, the application of the new dynamic programming equation and the corresponding policy iteration algorithm is shown via an illustrative example.
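The minimax recursion described in the abstract can be illustrated for a finite state and control space. The sketch below is not the paper's policy iteration algorithm; it swaps in plain value iteration, and it assumes the total variation distance is taken as the sum of absolute differences between probability vectors (consistent with R ∈ [0, 2]). Under that assumption, the inner maximization has a simple greedy solution: move up to R/2 of probability mass from the lowest-value states to the highest-value state. All function and parameter names (`worst_case_expectation`, `robust_value_iteration`, `cost`, `nominal`) are hypothetical.

```python
def worst_case_expectation(values, nominal, radius):
    """Maximize sum(nu[i] * values[i]) over distributions nu satisfying
    sum(|nu[i] - nominal[i]|) <= radius, on a finite state space."""
    n = len(values)
    nu = list(nominal)
    top = max(range(n), key=lambda i: values[i])
    # Movable mass: at most radius / 2, and no more than the mass outside `top`.
    budget = min(radius / 2.0, 1.0 - nu[top])
    nu[top] += budget
    # Take the same amount of mass away from the lowest-value states first.
    for i in sorted(range(n), key=lambda i: values[i]):
        if budget <= 0.0:
            break
        if i == top:
            continue
        moved = min(nu[i], budget)
        nu[i] -= moved
        budget -= moved
    return sum(p * v for p, v in zip(nu, values))


def robust_value_iteration(cost, nominal, radius, discount,
                           tol=1e-10, max_iter=10000):
    """Iterate V(x) = min_u [cost[x][u] + discount * max_nu E_nu[V]].

    cost[x][u] is the one-stage cost; nominal[x][u] is the nominal
    next-state distribution (a list of probabilities over states).
    """
    n_states = len(cost)
    n_controls = len(cost[0])

    def q_value(v, x, u):
        return cost[x][u] + discount * worst_case_expectation(
            v, nominal[x][u], radius)

    v = [0.0] * n_states
    for _ in range(max_iter):
        new_v = [min(q_value(v, x, u) for u in range(n_controls))
                 for x in range(n_states)]
        gap = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    # Greedy policy with respect to the final value function.
    policy = [min(range(n_controls), key=lambda u: q_value(v, x, u))
              for x in range(n_states)]
    return v, policy
```

With `radius = 0` the recursion reduces to ordinary discounted dynamic programming under the nominal model, and with `radius = 2` the maximizer may place all mass on the most costly next state. Because the discount factor is below one, the iteration is a contraction in the sup norm in this finite setting, so the loop converges from any starting value.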

Authors

I. Tzortzis

University of Cyprus

C. D. Charalambous

University of Cyprus

Themistoklis Charalambous

Chalmers, Signals and Systems, Communication Systems, Information Theory and Antennas, Communication Systems

Proceedings of the 55th IEEE Conference on Decision and Control (CDC 2016); Las Vegas; United States; 12-14 December 2016

0743-1546 (ISSN)

Art no 7798559, Pages 2010-2015

Subject Categories

Electrical Engineering and Electronics

DOI

10.1109/CDC.2016.7798559

ISBN

978-1-5090-1837-6