Fixed-Point Techniques and Applications to Real World Problems
On a Unique Solution of the Stochastic Functional Equation Arising in Gambling Theory and Human Learning Process
The term “learning” is often used to refer to a generally stable behavioral change resulting from practice. However, it is a fundamental biological capacity far more developed in humans than in other living beings. In an animal or human being, the learning phase may often be viewed as a series of choices between multiple possible reactions. Here, we analyze a specific type of human learning process related to gambling in which a subject inserts a poker chip to operate a two-armed bandit device and then presses one of the two keys. Through the use of an electromagnet, one or more poker chips are given to the individual in a container located in the apparatus’s center. If a chip is provided, it is declared a winner; otherwise, it is considered a loser. The goal of this paper is to look at the subject’s actions in such situations and provide a mathematical model that is appropriate for it. The existence of a unique solution to the suggested human learning model is examined using relevant fixed point results.
1. Introduction
Learning is a fundamental biological capacity that is much more evolved in humans than in any other living being. The central topic in learning philosophy is how multiple forms of learning take place in a human brain and body since this was explicitly formulated in the discipline of learning psychology, but with additional feedback from other psychological disciplines and the adjacent areas of sociology, pedagogy, and biology, including contemporary brain science.
In modern mathematical learning experiments, researchers concluded that a basic learning experiment was compatible with any stochastic process. Thus, it is not a novel concept (for details, see [1]). However, after 1950, two critical features emerged, mainly in the research initiated by Bush, Estes, and Mosteller. Firstly, the learning method egalitarian essence was a core feature of the developed model. Secondly, these frameworks were studied and applied in areas that did not conceal their quantitative aspects.
Several studies on human actions in probability-learning scenarios have produced different results (for details, see [2–5]).
In 2019, Turab and Sintunavarat [6, 7] proposed a functional equation to examine the experimental work of Bush and Wilson [8] on paradise fish. In this experiment, a fish could swim to either side (right or left) of the far end of the tank.
In [9], the authors recently addressed a traumatic avoidance learning experiment for normal dogs suggested by Solomon and Wynne [10]. They examined the psychological responses of 30 dogs enclosed in a small steel grid cage and proposed a mathematical model. The existence and uniqueness of a solution to the suggested avoidance learning model were investigated using an appropriate fixed point method.
For research in this area, especially on two-choice behavior, we refer to [11–13] and the references therein. It is worth noting that most of the animal behavior studies in two-choice situations discussed above have focused only on the animals’ approach toward an inevitable conclusion. Bush and Wilson [8], on the other hand, divided such responses into four categories depending on the food source and the side chosen (right-reward, right-nonreward, left-reward, and left-nonreward).
In this work, following the work presented by Turab and Sintunavarat [6, 9] and the ideas discussed in [8, 14], we aim to discuss the two-armed bandit experiment proposed by Goodnow and Pettigrew [15] and to propose a convenient mathematical model. We evaluate our findings under experimenter-subject-controlled events to assess the feasibility of the suggested model. The existence of a unique solution to the proposed model is examined by using an appropriate fixed point theorem. In the end, we raise some open problems for interested readers.
2. A Two-Armed Bandit Experiment
In [15], Goodnow and Pettigrew presented an experiment related to gambling theory. This gambling activity involves playing a poker game with chips worth one penny each (see Figure 1). The subject (S) is given 200 chips by an experimenter (E). He/she inserts one of these chips into the machine and pushes one of two buttons. When the bet is successful, a chip drops into the payoff box with a clatter. The payoff box has a glass face, so S can see the heap of chips he/she has won. The subject is not permitted to take the chips out of this box until the end of the experiment. Whatever the outcome of the bet, the machine becomes unusable for several seconds between trials, and S waits until two signal lights and a loud buzz indicate that the device is ready to take the next bet. The apparatus is programmed so that inserting a chip before the device is ready is useless for S.
When S inserts a chip (upper center light) and presses a key (left or right lower), the lights on the face of the machine flash on successively (upper outer lights in Figure 1). These lights are parallel to the lights of the control machine, which is operated by E in an adjacent room. The control machine also includes a master switch to turn the device on or off, along with a key that makes the machine eject a chip into the payoff box when pushed. A one-way mirror enables E to view S’s activities from the control room.
The method and directions of the task were given to S and E. S was told that he/she was playing for cash and would be paid the difference between the number of wins and losses. Each S was allowed 120 trials, divided into 12 blocks of 10 trials each. The payoff probabilities used in the task were 50 : 50, 70 : 30, and 90 : 10. When the experiment was completed, S was asked the following questions: (1) How did you decide which alternative you should choose? (2) What did you think about the strategy of always betting on one key?
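The trial structure described above (120 trials in 12 blocks of 10, with payment equal to wins minus losses) can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `run_session`, the fixed choice probability `p_choose_left`, and the reading of a ratio such as 70 : 30 as "the left key pays with probability 0.7 and the right key with probability 0.3" are hypothetical and are not taken from the original experiment.

```python
import random


def run_session(p_left_pays, p_choose_left, n_trials=120, block_size=10, seed=0):
    """Simulate one hypothetical subject.

    Returns the proportion of left-key presses in each block and the
    net payoff (wins minus losses) over the whole session.
    """
    rng = random.Random(seed)
    left_per_block = []
    wins = 0
    losses = 0
    for _ in range(0, n_trials, block_size):
        left_count = 0
        for _ in range(block_size):
            # Assumption: the subject presses the left key with a fixed probability.
            chose_left = rng.random() < p_choose_left
            left_count += int(chose_left)
            # Assumption: the left key pays with p_left_pays, the right key with 1 - p_left_pays.
            p_win = p_left_pays if chose_left else 1.0 - p_left_pays
            if rng.random() < p_win:
                wins += 1
            else:
                losses += 1
        left_per_block.append(left_count / block_size)
    return left_per_block, wins - losses


if __name__ == "__main__":
    blocks, net = run_session(p_left_pays=0.7, p_choose_left=0.5)
    print("proportion of left presses per block:", blocks)
    print("net payoff (wins - losses):", net)
```

A real subject's choice probability changes from trial to trial, which is what the learning model in the next section is meant to describe; the fixed probability here only serves to show the bookkeeping of blocks and payoff.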
The results were described in terms of the average proportion of choices of one alternative, namely pushing the left button, which was the alternative with the greater likelihood of paying off in every condition except the 50 : 50 one. The findings are presented in Table 1.
3. Mathematical Modeling of the Two-Armed Bandit Experiment
In the above experiment, significant interest lies in the behavior of a subject S; press right or left button, ` or `,’ and get the reward in terms of a poker chip. In our view, if a subject chooses the reward side, there would be an occurrence of alternative and if the subject made a move to the other side, then there will be an occurrence of alternative . Thus, according to the mathematical point of view, there would be four possibilities of events, depending on the action of the subject and the reward. These events are listed in Table 2.
Depending on the action of the subject and getting the chance of the reward, we have the following four events (see Table 3).
The probability of the outcomes and are and , respectively, where ]. The experimental pattern asks for the outcomes of the responses (whether the subject get the reward or not), trials’ fixed proportion of ]. Therefore, we get the event probabilities stated below (see Table 4).
We define as the learning rate parameters and their values can be recognized as a measure of the ineffectiveness of the corresponding events in altering the response probability.
If, on some trial, is the possibility of response with outcome and is fulfilled, the next possibility of with outcome will be , and if is achieved with outcome then the new probability would be with the event probability Similarly, if is performed with outcomes and then the new probabilities of are and , with the event probabilities and , respectively. For the four events , we can define the transition operators ] as for all ].
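To make the role of the four transition operators concrete, the following sketch simulates a learning path under an assumed linear form. The operators `toward_one` (u ↦ αu + 1 − α) and `toward_zero` (u ↦ αu), the assignment of each operator to an event, and the single reward probability `reward_prob` are assumptions chosen for illustration in the spirit of linear-operator learning models; they are not necessarily the operators defined in this paper. With this form, α = 1 leaves the response probability unchanged, which matches the reading of the learning rate parameters as measures of ineffectiveness.

```python
import random


def toward_one(u, alpha):
    """Assumed operator that raises the response probability (alpha = 1: no change)."""
    return alpha * u + (1.0 - alpha)


def toward_zero(u, alpha):
    """Assumed operator that lowers the response probability (alpha = 1: no change)."""
    return alpha * u


def simulate(u0, alphas, reward_prob, n_trials, seed=0):
    """Follow the probability u of the first response over n_trials trials.

    alphas = (a1, a2, a3, a4) are hypothetical learning rate parameters for the
    events: first response rewarded, first response not rewarded,
    second response rewarded, second response not rewarded.
    """
    rng = random.Random(seed)
    a1, a2, a3, a4 = alphas
    u = u0
    path = [u]
    for _ in range(n_trials):
        chose_first = rng.random() < u
        rewarded = rng.random() < reward_prob
        if chose_first and rewarded:
            u = toward_one(u, a1)    # success on the first response reinforces it
        elif chose_first:
            u = toward_zero(u, a2)   # failure on the first response weakens it
        elif rewarded:
            u = toward_zero(u, a3)   # success on the second response weakens the first
        else:
            u = toward_one(u, a4)    # failure on the second response favors the first
        path.append(u)
    return path


if __name__ == "__main__":
    path = simulate(u0=0.5, alphas=(0.9, 0.95, 0.9, 0.95), reward_prob=0.7, n_trials=120)
    print("final response probability:", round(path[-1], 4))
```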
By considering the work presented in [6, 8, 9] and the above transition operators with their corresponding probabilities and events given in Table 4, we introduce the following functional equation, which can discuss all the aspects of the two-armed bandit model.
Fixed point theory, on the other hand, began in the second half of the nineteenth century as a method of using iterative estimations to demonstrate the existence and uniqueness of solutions to ordinary differential and integral equations. It is a wonderful combination of basic and applied analysis, geometry, and topology. A fixed point theoretic viewpoint can be seen in Picard’s work, which is a fundamental notion in the field of metric fixed point theory. Nevertheless, the principle is credited to the Polish mathematician Banach, who abstracted the underlying ideas into a framework that can be applied to establish the existence of a unique solution in a broad range of applications beyond differential and integral equations. It has been extended and generalized in numerous directions (for details, see [16–18]). We refer the reader to [19–21] for further information on fixed point theory and its applications in various spaces.
The following result will be needed in the sequel.
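Theorem 1 (see [22]). Let $(X, d)$ be a complete metric space and $T : X \to X$ be a Banach contraction mapping (shortly, BCM), that is, $d(Tx, Ty) \le k\, d(x, y)$ for some $k \in [0, 1)$ and for all $x, y \in X$. Then, $T$ has exactly one fixed point. Furthermore, the Picard iteration $\{x_n\}$ in $X$, defined by $x_n = T x_{n-1}$ for all $n \in \mathbb{N}$, where $x_0 \in X$, converges to the unique fixed point of $T$.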
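The Picard iteration in Theorem 1 can be illustrated numerically. The sketch below uses a standard textbook example that is not part of this paper: the map x ↦ cos x sends [0, 1] into itself and satisfies |cos x − cos y| ≤ sin(1) |x − y| there, with sin(1) < 1, so it is a Banach contraction on a complete metric space. The helper name `picard` and the stopping tolerance are illustrative choices.

```python
import math


def picard(T, x0, tol=1e-12, max_iter=1000):
    """Iterate x_n = T(x_{n-1}) until two successive iterates differ by less than tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("Picard iteration did not converge within max_iter steps")


if __name__ == "__main__":
    fixed_point, steps = picard(math.cos, x0=0.0)
    print("approximate fixed point:", fixed_point)
    print("iterations used:", steps)
    print("residual |cos(x) - x|:", abs(math.cos(fixed_point) - fixed_point))
```

Starting from a different point of [0, 1] changes the number of iterations but not the limit, which is the uniqueness statement of the theorem in action.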
4. Existence and Uniqueness Results
We let For the rest of this article, represents the class with consisting of all real-valued continuous functions which satisfy the following relation
Clearly, is a Banach space with for all .
Following that, we can rewrite the functional equation (2) as where is an unknown function, .
Theorem 2. For and with where If there is a such that is -invariant, that is, , where is defined for each as for all then is a BCM.
Proof. Let . For each distinct points , we obtain
By applying the definition of the norm (5), we obtain
where is defined in (7). This gives that
As a result of we can claim that is a BCM with the metric imposed by .☐
From Theorem 2, we get the following conclusion about the uniqueness of the solution of the functional equation (6).
Theorem 3. The stochastic equation (6) has a unique solution with where is defined in (7). Assume that there is a such that is -invariant, that is, , where defined for each as for all Furthermore, the following iteration in defined by converges to the unique solution of (12).
Proof. We reach the conclusion of this theorem by combining the Banach fixed point theorem with Theorem 2.☐
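The kind of iteration used in Theorem 3 can be sketched on a grid. Because the exact functional equation is not reproduced here, the example works with an assumed, simplified two-operator equation, L(x) = x L(αx + 1 − α) + (1 − x) L(βx) with L(0) = 0 and L(1) = 1, which is not claimed to be equation (6). The names `interp`, `apply_operator`, and `solve`, the grid size, the piecewise-linear interpolation, and the starting function L_0(x) = x² are all illustrative choices. For α = β, a direct substitution shows that L(x) = x solves this assumed equation, which gives a simple check of the iteration.

```python
def interp(values, x):
    """Piecewise-linear interpolation of grid values given on a uniform grid of [0, 1]."""
    n = len(values) - 1
    t = min(max(x, 0.0), 1.0) * n
    i = min(int(t), n - 1)
    frac = t - i
    return (1.0 - frac) * values[i] + frac * values[i + 1]


def apply_operator(values, alpha, beta):
    """One step L_n = Z(L_{n-1}) for the assumed operator
    (Z L)(x) = x * L(alpha*x + 1 - alpha) + (1 - x) * L(beta*x)."""
    n = len(values) - 1
    new_values = []
    for j in range(n + 1):
        x = j / n
        new_values.append(
            x * interp(values, alpha * x + 1.0 - alpha)
            + (1.0 - x) * interp(values, beta * x)
        )
    return new_values


def solve(alpha, beta, n_grid=200, n_iter=300):
    """Run the iteration from L_0(x) = x**2, which satisfies L(0) = 0 and L(1) = 1."""
    values = [(j / n_grid) ** 2 for j in range(n_grid + 1)]
    for _ in range(n_iter):
        values = apply_operator(values, alpha, beta)
    return values


if __name__ == "__main__":
    approx = solve(alpha=0.5, beta=0.5)
    n = len(approx) - 1
    error = max(abs(v - j / n) for j, v in enumerate(approx))
    print("max deviation from L(x) = x:", error)
```

For other parameter values the limit is generally not the identity, and convergence of such a scheme should be checked against the contraction condition of the relevant theorem rather than assumed.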
The following corollaries arise from the preceding findings.
Corollary 4. For and with where If there is a such that is -invariant, that is, , where defined for each as for all then is a BCM.
Corollary 5. The stochastic equation (6) has a unique solution with where is defined in (7). Assume that there is a such that is -invariant, that is, , where defined for each as for all Furthermore, the iteration in () defined by converges to the unique solution of (12).
5. A Certain Case with Experimenter-Subject-Controlled Events
It has been highlighted that the examination of any experiment rests on assumptions. Accordingly, experiments are classified as contingent or noncontingent, based on how the outcomes occur. Contingent experiments correspond to experimenter-subject-controlled events, whereas noncontingent experiments correspond to experimenter-controlled events.
In the previous models on imitation problems such as T-maze experiments with fish and dog (see [6, 9]), it was already mentioned that such experiments required a contingent approach; the result of the trials was entirely dependent on the subject’s choice. Thus, such types of models required experimenter-subject-controlled events. The two responses and along with outcomes and are choosing the right or left side or pushing the right or left button, which coincides with rewarding and non-rewarding or correct and incorrect, respectively. Now we define the probabilities and which indicate the conditional probability of outcomes and of the given alternatives and respectively. With such conditions, we have the following Table 5.
We have the following functional equation from the data given above: where is an unknown function, and . We shall begin with the following finding.
Theorem 6. For and with where Assume that, if there is a such that is -invariant, that is, , where defined for each as for all then is a BCM.
Proof. Let . For each distinct points , we obtain By applying the definition of the norm (5), we obtain where is defined in (19). Thus, we have As a result of one can see that is a BCM.☐
For the unique solution of (18), we get the subsequent conclusion from Theorem 6.
Theorem 7. The stochastic equation (18) has a unique solution with Assume that there is a such that is -invariant, that is, , where defined for each as for all Furthermore, the iteration in defined by converges to the unique solution of (24).
Proof. The conclusion of this theorem can be found by combining Theorem 6 with the Banach fixed point theorem.☐
6. Conclusion
In this work, we have discussed a special type of stochastic process related to the two-armed bandit experiment [15], which plays a vital role in observing the subject’s behavior in a two-choice situation. We reviewed the operant’s responses under such conditions and provided a mathematical model for them. The Banach fixed point theorem was used to establish the existence of a unique solution to the two-armed bandit learning model. We investigated the proposed model’s adaptability by subjecting it to some controlled events. Moreover, the presented approach is straightforward and easy to verify. Thus, the proposed approach can be used to investigate more psychological learning experiments related to animals and humans in the future.
Now, for interested readers, we propose the following open problems.
Question 1. If a subject does not press any button on a specific trial, how can we describe such an event by a model?
Finally, we also leave the stability problem (for details, see [23–27]) of the stochastic equation given below as an open problem: where and is an unknown function.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare no conflict of interest.
Authors’ Contributions
All authors contributed equally to the manuscript and typed, read, and approved the final manuscript.
Acknowledgments
This work was funded by the University of Jeddah, Saudi Arabia, under grant No. (UJ-21-DR-93). The authors, therefore, acknowledge with thanks the university’s technical and financial support.
References
[1] R. R. Bush and F. Mosteller, Stochastic Models for Learning, John Wiley & Sons, Inc., 1955.
[2] D. A. Grant, H. W. Hake, and J. P. Hornseth, “Acquisition and extinction of a verbal conditioned response with differing percentages of reinforcement,” Journal of Experimental Psychology, vol. 42, no. 1, pp. 1–5, 1951.
[3] W. K. Estes and J. H. Straughan, “Analysis of a verbal conditioning situation in terms of statistical learning theory,” Journal of Experimental Psychology, vol. 47, no. 4, pp. 225–234, 1954.
[4] L. G. Humphreys, “Acquisition and extinction of verbal expectations in a situation analogous to conditioning,” Journal of Experimental Psychology, vol. 25, no. 3, pp. 294–301, 1939.
[5] M. E. Jarvik, “Probability learning and a negative recency effect in the serial anticipation of alternative symbols,” Journal of Experimental Psychology, vol. 41, no. 4, pp. 291–297, 1951.
[6] A. Turab and W. Sintunavarat, “On analytic model for two-choice behavior of the paradise fish based on the fixed point method,” Journal of Fixed Point Theory and Applications, vol. 21, no. 2, p. 56, 2019.
[7] A. Turab and W. Sintunavarat, “Corrigendum: On analytic model for two-choice behavior of the paradise fish based on the fixed point method, J. Fixed Point Theory Appl. 2019, 21:56,” Journal of Fixed Point Theory and Applications, vol. 22, no. 4, p. 82, 2020.
[8] R. R. Bush and T. R. Wilson, “Two-choice behavior of paradise fish,” Journal of Experimental Psychology, vol. 51, no. 5, pp. 315–322, 1956.
[9] A. Turab and W. Sintunavarat, “On the solution of the traumatic avoidance learning model approached by the Banach fixed point theorem,” Journal of Fixed Point Theory and Applications, vol. 22, no. 2, 2020.
[10] R. L. Solomon and L. C. Wynne, “Traumatic avoidance learning: acquisition in normal dogs,” Psychological Monographs: General and Applied, vol. 67, no. 4, pp. 1–19, 1953.
[11] W. Sintunavarat and A. Turab, “Some particular aspects of certain type of probabilistic predator-prey model with experimenter-subject-controlled events and the fixed point method,” AIP Conference Proceedings, vol. 2423, no. 60005, 2021.
[12] V. Berinde and A. R. Khan, “On a functional equation arising in mathematical biology and theory of learning,” Creative Mathematics and Informatics, vol. 24, no. 1, pp. 9–16, 2015.
[13] V. I. Istrăţescu, “On a functional equation,” Journal of Mathematical Analysis and Applications, vol. 56, no. 1, pp. 133–136, 1976.
[14] M. H. Detambel, “A test of a model for multiple-choice behavior,” Journal of Experimental Psychology, vol. 49, no. 2, pp. 97–104, 1955.
[15] J. J. Goodnow and T. F. Pettigrew, “Effect of prior patterns of experience upon strategies and learning sets,” Journal of Experimental Psychology, vol. 49, no. 6, pp. 381–389, 1955.
[16] H. Aydi, E. Karapinar, and V. Rakocevic, “Nonunique fixed point theorems on b-metric spaces via simulation functions,” Jordan Journal of Mathematics and Statistics, vol. 12, no. 3, pp. 265–288, 2019.
[17] E. Karapinar, “Ciric type nonunique fixed points results: a review,” Applied and Computational Mathematics an International Journal, vol. 1, pp. 3–21, 2019.
[18] H. H. Alsulami, E. Karapinar, and V. Rakocevic, “Ciric type nonunique fixed point theorems on b-metric spaces,” Filomat, vol. 31, no. 11, pp. 3147–3156, 2017.
[19] H. K. Nashine, R. Pant, and R. George, “Common positive solution of two nonlinear matrix equations using fixed point results,” Mathematics, vol. 9, no. 18, p. 2199, 2021.
[20] S. Etemad, S. Rezapour, and M. E. Samei, “α-ψ-contractions and solutions of a q-fractional differential inclusion with three-point boundary value conditions via computational results,” Advances in Difference Equations, vol. 2020, no. 1, Article ID 218, 2020.
[21] I. Iqbal, N. Hussain, and N. Sultana, “Fixed points of multivalued non-linear F-contractions with application to solution of matrix equations,” Filomat, vol. 31, no. 11, pp. 3319–3333, 2017.
[22] S. Banach, “Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales,” Fundamenta Mathematicae, vol. 3, pp. 133–181, 1922.
[23] D. H. Hyers, “On the stability of the linear functional equation,” Proceedings of the National Academy of Sciences of the United States of America, vol. 27, no. 4, pp. 222–224, 1941.
[24] P. Gavruta, “A generalization of the Hyers-Ulam-Rassias stability of approximately additive mappings,” Journal of Mathematical Analysis and Applications, vol. 184, no. 3, pp. 431–436, 1994.
[25] S. M. Ulam, A Collection of the Mathematical Problems, no. 8, Interscience Tracts in Pure and Applied Mathematics, New York, 1960.
[26] T. Aoki, “On the stability of the linear transformation in Banach spaces,” Journal of the Mathematical Society of Japan, vol. 2, no. 1-2, pp. 64–66, 1950.
[27] D. H. Hyers, G. Isac, and T. M. Rassias, Stability of Functional Equations in Several Variables, vol. 34, Springer Science & Business Media, 2012.