Abstract

This paper models a decision maker as a rational probabilistic decider (RPD) and investigates its behavior in stationary and symmetric Markov switch environments. RPDs make their decisions based on penalty functions defined by the environment. The quality of decision making depends on a parameter referred to as the level of rationality. The dynamic behavior of RPDs is described by an ergodic Markov chain. Two classes of RPDs are considered: local and global. The former make their decisions based on the penalty in the current state, while the latter take the penalties in all states into account. It is shown that, asymptotically in time and in the level of rationality, both classes behave similarly. However, the second-largest eigenvalue of the Markov transition matrix for global RPDs is smaller than that for local ones, indicating faster convergence to the optimal state. As an illustration, the behavior of a chief executive officer, modeled as a global RPD, is considered, and it is shown that company performance may or may not be optimized, depending on the pay structure employed. While the present paper investigates individual RPDs, a companion paper will address collective behavior.