On the Risk Sensitive Optimality Criteria for Markov Decision Processes

Author: Stanford University. Department of Operations Research
Total Pages: 28
Release: 1975

Discrete dynamic programming models with an exponential utility function are studied with respect to the asymptotic behavior of the dynamic programming recursion for the expected utility. Preliminary results on maximizing the asymptotic growth rate of the expected utility within the class of stationary policies are presented. Under the condition that there exists a stationary 'optimal' policy with an irreducible, aperiodic transition probability matrix, useful limiting properties of the maximum expected utilities are established. Moreover, it is shown how to generate a monotone sequence of lower and upper bounds on the maximum growth rate of the expected utility. Under certain additional assumptions, the results extend to Markov decision processes with a denumerable state space.
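The recursion behind these bounds is easy to sketch. For a risk-sensitivity parameter gamma, the expected-utility iterates satisfy u_{n+1}(i) = max_a sum_j p(j|i,a) exp(gamma r(i,a,j)) u_n(j), and the componentwise ratios of successive iterates squeeze the optimal growth rate from below and above (a Collatz-Wielandt-type argument: the minimum ratio never decreases, the maximum never increases). The following is a minimal sketch with hypothetical two-state, two-action data, not the report's own algorithm:

```python
import numpy as np

# Sketch of the expected-utility recursion
#   u_{n+1}(i) = max_a sum_j p(j|i,a) * exp(gamma * r(i,a,j)) * u_n(j)
# with monotone ratio bounds on the optimal growth rate.

gamma = 0.5                                # risk-sensitivity parameter
P = np.array([[[0.8, 0.2], [0.3, 0.7]],    # P[a, i, j]: transition probabilities (hypothetical)
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[[1.0, 0.0], [0.5, 2.0]],    # R[a, i, j]: one-step rewards (hypothetical)
              [[0.2, 1.5], [1.0, 0.3]]])

Q = P * np.exp(gamma * R)                  # utility-weighted transition kernels
u = np.ones(2)                             # any positive starting vector works

for n in range(100):
    u_next = (Q @ u).max(axis=0)           # maximize over the two actions
    ratios = u_next / u
    lower, upper = ratios.min(), ratios.max()   # lower never decreases, upper never increases
    u = u_next / u_next.max()              # rescale to avoid overflow; ratios are scale-invariant
    if upper - lower < 1e-10:
        break

print(f"maximum growth rate is within [{lower:.8f}, {upper:.8f}]")
```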

State-Augmentation Transformations for Risk-Sensitive Markov Decision Processes

Author: Shuai Ma
Release: 2020

Markov decision processes (MDPs) provide a mathematical framework for modeling sequential decision making (SDM), where system evolution and reward are partly under the control of a decision maker and partly random. MDPs have been widely adopted in numerous fields, such as finance, robotics, manufacturing, and control systems. For stochastic control problems, MDPs serve as the underlying models in dynamic programming and reinforcement learning (RL) algorithms. This thesis studies risk estimation in MDPs, where the variability of random rewards is taken into account.

First, we categorize rewards into four classes: deterministic or stochastic, and state-based or transition-based. Although numerous theoretical methods are designed for MDPs or Markov processes with a deterministic, state-based reward, many practical problems are naturally modeled by processes with stochastic, transition-based rewards. When the optimality criterion is the risk-neutral expectation of a (discounted) total reward, a model (reward) simplification can bridge the gap. When the criterion is risk-sensitive, however, a model simplification changes the risk value, because most, if not all, inherent risk measures depend on the reward sequence (Rt). To bridge the gap between theoretical methods and practical problems under risk-sensitive criteria, we propose a state-augmentation transformation (SAT). Four cases are studied in detail, each requiring a different form of SAT for risk preservation. In numerical experiments, we compare the results of the model simplifications with those of the SAT and illustrate that (i) the model simplifications change (Rt) as well as the return (total reward) distribution, and (ii) the proposed SAT transforms processes with complicated rewards, such as stochastic, transition-based rewards, into processes with deterministic, state-based rewards while leaving (Rt) intact.

Second, we consider constrained risk-sensitive SDM problems in dynamic environments, treating three factors simultaneously: constraints, risk, and a dynamic environment. We propose a scheme that generates a synthetic dataset for training an approximator. We avoid historical data for two reasons. The first is information incompleteness: historical data usually carries no information on the criterion parameters (which risk objective and constraints are of concern) or on the optimal policy (often just one action per data item), and in many cases even the environmental parameters (such as the costs involved) are incomplete. The second concerns optimality: decision makers may prefer an easy-to-use policy (such as an EOQ policy) over an optimal one, and it is hard to determine whether such a preferred policy is optimal, since practical problems can differ from the theoretical model in diverse and subtle ways. We therefore propose to evaluate or estimate risk measures with RL methods and to train an approximator, such as a neural network, on a synthetic dataset. A numerical experiment validates the proposed scheme.

The contributions of this study are threefold. First, for risk evaluation in different cases, we propose the SAT theorem and its corollaries, which enable theoretical methods to solve practical problems while preserving (Rt). Second, we estimate three risk measures involving the return variance as examples, to illustrate the difference between the results of the SAT and of the model simplification. Third, we present a scheme for constrained, risk-sensitive SDM problems in a dynamic environment, illustrated with an inventory control example.
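The simplest of the SAT cases can be sketched directly: an MDP whose reward r(s, a, s') is deterministic but transition-based becomes an MDP with a deterministic state-based reward by augmenting each state with the transition that produced it. The sketch below is a minimal illustration of that idea under hypothetical data layouts, not the thesis's full construction:

```python
import numpy as np

# An MDP with deterministic transition-based reward r(s, a, s2) is turned into
# one with a deterministic state-based reward.  The reward sequence (Rt) is
# preserved because the augmented state records exactly what the reward
# depended on.

def state_augmentation(P, R):
    """P[s, a, s2]: transition kernel; R[s, a, s2]: transition-based reward.
    Returns (P_aug, r_aug) on the augmented state space {(s, a, s2)}."""
    S, A, _ = P.shape
    aug = [(s, a, s2) for s in range(S) for a in range(A) for s2 in range(S)]
    idx = {x: k for k, x in enumerate(aug)}
    P_aug = np.zeros((len(aug), A, len(aug)))
    r_aug = np.zeros(len(aug))
    for (s, a, s2), k in idx.items():
        r_aug[k] = R[s, a, s2]            # reward is now a function of the state alone
        for a2 in range(A):               # next augmented state remembers the next transition
            for s3 in range(S):
                P_aug[k, a2, idx[(s2, a2, s3)]] = P[s2, a2, s3]
    return P_aug, r_aug

# Usage on a hypothetical 2-state, 2-action model:
P = np.random.dirichlet(np.ones(2), size=(2, 2))   # P[s, a, :] sums to 1
R = np.random.randn(2, 2, 2)
P_aug, r_aug = state_augmentation(P, R)
print(P_aug.shape, r_aug.shape)                    # (8, 2, 8) and (8,)
```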

Markov Decision Processes with Applications to Finance

Author: Nicole Bäuerle
Publisher: Springer Science & Business Media
Total Pages: 393
Release: 2011-06-06
Genre: Mathematics
ISBN: 3642183247

The theory of Markov decision processes focuses on controlled Markov chains in discrete time. The authors establish the theory for general state and action spaces and at the same time show its application by means of numerous examples, mostly taken from the fields of finance and operations research. By using a structural approach, many measure-theoretic technicalities are avoided. They cover problems with finite and infinite horizons, as well as partially observable Markov decision processes, piecewise deterministic Markov decision processes, and stopping problems. The book presents Markov decision processes in action and includes various state-of-the-art applications with a particular view towards finance. It is useful for upper-level undergraduates, Master's students, and researchers in both applied probability and finance, and provides exercises (without solutions).

Risk Management by Markov Decision Processes

Author: You Liang
Release: 2015

The model of Markov decision processes is a very important and powerful tool in the study of mathematical finance, including risk management. My PhD research in risk management focuses on three major problems: (1) risk-sensitive partially observable Markov decision processes, (2) dynamic risk measures, and (3) dynamic deviation measures. The first part extends the classic Markov decision process model in two directions simultaneously: partially observable states and risk sensitivity. Another direction for extending the classic model is to incorporate risk measures into its optimality criteria. The second part of the thesis uses Markov decision processes to characterize and derive new forms of dynamic risk measures. The third part applies the model to characterize and derive a sequence of dynamic deviation measures.
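As one concrete instance of a dynamic risk measure of the recursive kind that parts two and three build on, the sketch below replaces the expectation in policy evaluation with a one-step CVaR mapping. This is a standard nested-CVaR construction with hypothetical data, offered for orientation rather than as the thesis's own derivation:

```python
import numpy as np

def lower_cvar(values, probs, alpha):
    """Mean of the worst alpha-fraction of outcomes (risk-averse CVaR for rewards)."""
    order = np.argsort(values)
    v, p = values[order], probs[order]
    c = np.cumsum(p)
    prev = np.concatenate(([0.0], c[:-1]))
    w = np.clip(np.minimum(c, alpha) - prev, 0.0, None)  # tail weights summing to alpha
    return float((v * w).sum() / alpha)

P = np.array([[0.9, 0.1],      # fixed policy's transition matrix (hypothetical)
              [0.4, 0.6]])
r = np.array([1.0, -2.0])      # state-based rewards (hypothetical)
alpha, beta = 0.3, 0.9         # risk level and discount factor

v = np.zeros(2)
for _ in range(300):           # risk-averse policy evaluation: E is replaced by CVaR
    v = r + beta * np.array([lower_cvar(v, P[s], alpha) for s in range(2)])
print("nested-CVaR value:", v)
```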

Continuous-Time Markov Decision Processes

Author: Xianping Guo
Publisher: Springer Science & Business Media
Total Pages: 240
Release: 2009-09-18
Genre: Mathematics
ISBN: 3642025471

Continuous-time Markov decision processes (MDPs), also known as controlled Markov chains, are used for modeling decision-making problems that arise in operations research (for instance, inventory, manufacturing, and queueing systems), computer science, communications engineering, control of populations (such as fisheries and epidemics), and management science, among many other fields. This volume provides a unified, systematic, self-contained presentation of recent developments on the theory and applications of continuous-time MDPs. The MDPs in this volume include most of the cases that arise in applications, because they allow unbounded transition and reward/cost rates. Much of the material appears for the first time in book form.
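For bounded-rate models, the classical device for reducing a continuous-time MDP to a discrete-time one is uniformization; the unbounded-rate models treated in this volume are precisely those where that reduction is unavailable. A minimal sketch of uniformization with hypothetical two-state, two-action data:

```python
import numpy as np

# Uniformization: a CTMDP with bounded transition rates becomes a
# discrete-time discounted MDP solvable by ordinary value iteration.

alpha = 0.1                                 # continuous-time discount rate
Q = np.array([[[-3.0, 3.0], [2.0, -2.0]],   # Q[a, i, j]: generator under action a (hypothetical)
              [[-1.0, 1.0], [4.0, -4.0]]])
r = np.array([[5.0, 1.0],                   # r[a, i]: reward rates (hypothetical)
              [4.0, 2.0]])

Lam = np.max(-Q[:, [0, 1], [0, 1]])         # uniformization constant >= every exit rate
P = np.eye(2) + Q / Lam                     # discrete-time kernels, rows sum to 1
beta = Lam / (alpha + Lam)                  # equivalent discrete discount factor
rd = r / (alpha + Lam)                      # equivalent one-step rewards

v = np.zeros(2)
for _ in range(2000):                       # ordinary discounted value iteration
    v = (rd + beta * (P @ v)).max(axis=0)
print("discounted value of the CTMDP:", v)
```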

Constrained Markov Decision Processes

Author: Eitan Altman
Publisher: Routledge
Total Pages: 256
Release: 2021-12-17
Genre: Mathematics
ISBN: 1351458248

This book provides a unified approach for the study of constrained Markov decision processes with a finite state space and unbounded costs. Unlike the single-objective case considered in many other books, the author considers a single controller with several objectives, such as minimizing delay and loss probabilities while maximizing throughput. It is desirable to design a controller that minimizes one cost objective, subject to inequality constraints on the other cost objectives. This framework describes dynamic decision problems that arise frequently in many engineering fields. A thorough overview of these applications is presented in the introduction. The book is then divided into three sections that build upon each other.
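The discounted version of this design problem admits a classical linear-programming formulation over occupation measures, in the spirit of this book's approach. The following is a minimal sketch with hypothetical data (two states, two actions, one constraint), solved with scipy:

```python
import numpy as np
from scipy.optimize import linprog

# Constrained discounted MDP as an LP: minimize one expected cost subject to
# an inequality constraint on another.  Variables rho[s, a] are discounted
# state-action frequencies (occupation measures).

S, A, beta = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],   # P[s, a, s']: transitions (hypothetical)
              [[0.5, 0.5], [0.7, 0.3]]])
c  = np.array([[1.0, 4.0], [2.0, 0.5]])   # c[s, a]: cost to minimize (hypothetical)
d  = np.array([[0.0, 3.0], [1.0, 2.0]])   # d[s, a]: constrained cost (hypothetical)
D  = 1.5                                  # bound on the discounted d-cost
mu = np.array([1.0, 0.0])                 # initial distribution

# Flow constraints: for each s',
#   sum_a rho[s',a] - beta * sum_{s,a} P[s,a,s'] * rho[s,a] = (1 - beta) * mu[s']
A_eq = np.zeros((S, S * A))
for s in range(S):
    for a in range(A):
        k = s * A + a
        A_eq[s, k] += 1.0
        A_eq[:, k] -= beta * P[s, a]
b_eq = (1 - beta) * mu

res = linprog(c=c.ravel(),                          # minimize discounted c-cost
              A_ub=d.ravel()[None, :], b_ub=[D],    # subject to discounted d-cost <= D
              A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
rho = res.x.reshape(S, A)
policy = rho / rho.sum(axis=1, keepdims=True)       # optimal, possibly randomized, policy
print("optimal stationary policy:\n", policy)
```

A well-known feature this example exhibits: the optimal policy of a constrained MDP is generally randomized, unlike the deterministic optima of unconstrained MDPs.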

Markov Decision Processes

Author: Martin L. Puterman
Publisher: John Wiley & Sons
Total Pages: 544
Release: 2014-08-28
Genre: Mathematics
ISBN: 1118625870

The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. "This text is unique in bringing together so many results hitherto found only in part in other texts and papers. . . . The text is fairly self-contained, inclusive of some basic mathematical results needed, and provides a rich diet of examples, applications, and exercises. The bibliographical material at the end of each chapter is excellent, not only from a historical perspective, but because it is valuable for researchers in acquiring a good perspective of the MDP research potential." —Zentralblatt für Mathematik ". . . it is of great value to advanced-level students, researchers, and professional practitioners of this field to have now a complete volume (with more than 600 pages) devoted to this topic. . . . Markov Decision Processes: Discrete Stochastic Dynamic Programming represents an up-to-date, unified, and rigorous treatment of theoretical and computational aspects of discrete-time Markov decision processes." —Journal of the American Statistical Association

Discrete-Time Markov Control Processes

Author: Onesimo Hernandez-Lerma
Publisher: Springer Science & Business Media
Total Pages: 223
Release: 2012-12-06
Genre: Mathematics
ISBN: 1461207290

This book presents the first part of a planned two-volume series devoted to a systematic exposition of some recent developments in the theory of discrete-time Markov control processes (MCPs). Interest is mainly confined to MCPs with Borel state and control (or action) spaces, and possibly unbounded costs and noncompact control constraint sets. MCPs are a class of stochastic control problems, also known as Markov decision processes, controlled Markov processes, or stochastic dynamic programs; sometimes, particularly when the state space is a countable set, they are also called Markov decision (or controlled Markov) chains. Regardless of the name used, MCPs appear in many fields, for example, engineering, economics, operations research, statistics, renewable and nonrenewable resource management, (control of) epidemics, etc. However, most of the literature (say, at least 90%) is concentrated on MCPs for which (a) the state space is a countable set, and/or (b) the costs-per-stage are bounded, and/or (c) the control constraint sets are compact. But curiously enough, the most widely used control model in engineering and economics, namely the LQ (linear system/quadratic cost) model, satisfies none of these conditions. Moreover, when dealing with "partially observable" systems, a standard approach is to transform them into equivalent "completely observable" systems in a larger state space (in fact, a space of probability measures), which is uncountable even if the original state process is finite-valued.
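The transformation mentioned at the end is the usual Bayes (belief) filter: the controller tracks the posterior distribution over the hidden state, which lives in a space of probability measures even when the hidden state is finite. A minimal sketch with hypothetical transition and observation matrices:

```python
import numpy as np

# Belief update for a partially observable system: the belief b is the
# posterior over hidden states and serves as the (completely observable)
# state of the equivalent MDP.

def belief_update(b, a, o, P, O):
    """b: current belief; a: action taken; o: observation received.
    P[a, s, s2]: hidden-state transitions; O[a, s2, o]: observation likelihoods."""
    predicted = b @ P[a]                   # prior over the next hidden state
    unnormalized = predicted * O[a][:, o]  # Bayes: weight by observation likelihood
    return unnormalized / unnormalized.sum()

P = np.array([[[0.9, 0.1], [0.2, 0.8]]])   # one action, two hidden states (hypothetical)
O = np.array([[[0.7, 0.3], [0.1, 0.9]]])   # two observations (hypothetical)
b = np.array([0.5, 0.5])
b = belief_update(b, a=0, o=1, P=P, O=O)
print("posterior belief:", b)              # roughly [0.29, 0.71]
```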

Selected Topics on Continuous-time Controlled Markov Chains and Markov Games

Author: Tomás Prieto-Rumeau
Publisher: World Scientific
Total Pages: 292
Release: 2012
Genre: Mathematics
ISBN: 1848168489

This book concerns continuous-time controlled Markov chains, also known as continuous-time Markov decision processes. They form a class of stochastic control problems in which a single decision-maker wishes to optimize a given objective function. The book is also concerned with Markov games, where two decision-makers (or players) each try to optimize their own objective function. Both decision-making processes appear in a large number of applications in economics, operations research, engineering, and computer science, among other areas. An extensive, self-contained, up-to-date analysis of basic optimality criteria (such as discounted and average reward) and advanced optimality criteria (e.g., bias, overtaking, sensitive discount, and Blackwell optimality) is presented. Particular emphasis is placed on applications of the results: algorithmic and computational issues are discussed, and applications to population models and epidemic processes are shown. The book is addressed to students and researchers in the fields of stochastic control and stochastic games. It may also be of interest to undergraduate and beginning graduate students, because the reader is not assumed to have an advanced mathematical background: a working knowledge of calculus, linear algebra, probability, and continuous-time Markov chains should suffice to understand the contents of the book.
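The most basic of these criteria is easy to make concrete: for a fixed stationary policy with generator Q and reward-rate vector r, the alpha-discounted value solves the linear system (alpha * I - Q) v = r. A minimal sketch with a hypothetical three-state generator:

```python
import numpy as np

# Discounted value of a continuous-time Markov chain under a fixed policy:
#   v(i) = E_i[ integral_0^inf exp(-alpha * t) * r(X_t) dt ]
# solves (alpha * I - Q) v = r.

alpha = 0.2
Q = np.array([[-1.0,  1.0,  0.0],   # generator of the controlled chain (hypothetical)
              [ 0.5, -1.5,  1.0],
              [ 0.0,  2.0, -2.0]])
r = np.array([3.0, 0.0, 1.0])       # reward rates per state (hypothetical)

v = np.linalg.solve(alpha * np.eye(3) - Q, r)
print("discounted value per starting state:", v)
```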

Markov Decision Processes with Their Applications

Author: Qiying Hu
Publisher: Springer Science & Business Media
Total Pages: 305
Release: 2007-09-14
Genre: Business & Economics
ISBN: 0387369511

Written by two leading researchers in East Asia, this text examines Markov decision processes, also called stochastic dynamic programming, and their applications in the optimal control of discrete event systems, optimal replacement, and optimal allocation in sequential online auctions.