Nonlinear Problems: Mathematical Modeling, Analyzing, and Computing for Finance
Research Article, Open Access
Dynamic Nonlinear Pricing Model Based on Adaptive and Sophisticated Learning
Abstract
Existing dynamic pricing models that take consumers' learning behavior into account generally assume that consumers learn through either reinforcement learning or belief-based learning. Nevertheless, abundant empirical evidence from behavioral game theory indicates that consumers' learning is better described as a process of mixed learning. In particular, for experience goods, a consumer's purchase decision is based not only on his own previous purchase behavior (adaptive learning) but also on that of other consumers (sophisticated learning). Assuming that consumers are both adaptive and sophisticated learners, we study a dynamic pricing model for repeated decision problems in a duopoly market. Specifically, we build a dynamic game model based on the sophisticated experience-weighted attraction (SEWA) learning model and analyze the existence of the equilibrium. Finally, we use numerical results to show the characteristics of, and the differences between, the steady-state solutions of models with adaptive consumers and models with sophisticated consumers.
1. Introduction
For experience goods, consumers can learn their own preferences for a product only after experiencing it. Studies have shown that a learning process does exist when consumers buy experience goods and that consumers tend to be boundedly rational for goods bought repeatedly [1]. A learning model describes consumers' behavior by assuming that consumers apply some simple learning rules when they make repeated decisions. Most previous research on dynamic pricing of experience goods describes consumers' learning process with an adaptive learning model; that is, consumers' purchase decisions are based mainly on their own previous purchase behavior and usage experience. However, a large body of empirical evidence indicates that sophisticated consumers also exist in the market. Unlike adaptive consumers, sophisticated consumers can use other consumers' purchase information to maximize their own utility. Hence, their learning process differs from that of adaptive consumers, and the way they purchase repeatedly can affect their purchase behavior [2]. With the development of information technology and e-commerce, information is easy to obtain and is shared worldwide, which increases the number of sophisticated consumers in the market, so it is more realistic to consider consumers' sophisticated behavior when studying dynamic pricing problems. Therefore, we develop a sophisticated learning model based on adaptive learning and apply it to dynamic pricing problems.
Most previous literature on dynamic pricing of experience goods has assumed fully rational consumers [3–5]. However, consumers' boundedly rational behavior has a significant impact on market demand and corporate profits [6]. Moreover, more attention should be paid to consumers' learning behavior when they buy experience goods. Several kinds of learning models describe consumers' learning process. Brandts and Holt [7] propose a Bayesian learning model when studying signaling games. Sarin and Vahid [8] develop belief learning models. Börgers and Sarin [9] propose a reinforcement learning model. Oyarzun and Sarin [10] study the reinforcement learning model in decisions under risk. All these learning models assume that consumers follow a single learning process. However, considerable empirical evidence suggests that learning is often a mixed process. Camerer and Ho [11] develop the experience-weighted attraction (EWA) learning model, a mixed learning model that combines the reinforcement learning model with the belief learning model. Furthermore, they show that the EWA learning model fits remarkably better than the existing learning models in several different classes of games. Ho et al. [12] simplify the EWA model further and put forward the self-tuning EWA learning model, which has the same predictive function as the EWA learning model. However, the EWA model is still an adaptive learning model, so it cannot describe sophisticated consumers' learning process. Camerer et al. [2] propose the sophisticated experience-weighted attraction (SEWA) model, based on the EWA learning model, to describe sophisticated consumers' learning process.
It is generally accepted that learning models are used to explain behavior in games and individual decision making, but considerable evidence suggests that learning models can also explain actual choice behavior [13, 14]. So far, many scholars have introduced learning models into their research on dynamic pricing problems. Some use them to solve monopoly pricing problems for experience goods [15]. In general, however, the market for experience goods is highly competitive. Chintagunta and Rao [16] consider the reinforcement learning model in a duopoly dynamic pricing model. Building on their study, Hopkins [17] compares and analyzes the steady states of the reinforcement learning model and the belief learning model in dynamic pricing. All of this literature assumes that consumers' purchase behavior is a process of adaptive learning, but Camerer et al. [2] suggest that sophisticated learning by consumers is more realistic. It is therefore necessary to take sophisticated consumers' learning process into consideration in dynamic pricing. Amaldoss and Jain [18] found, in their experimental study of the dynamic pricing of luxury goods, that the sophisticated experience-weighted attraction (SEWA) learning model explains actual customers' purchase behavior better.
Therefore, we study a dynamic pricing model of experience goods with both adaptive and sophisticated consumers and try to illustrate more clearly the effect of sophisticated consumers on the market structure. However, because consumers cannot perceive the utility of experience goods before purchasing them, the EWA and SEWA models have limitations when used to describe the learning process of adaptive and sophisticated consumers, respectively. To address these problems, we simplify the EWA and SEWA models on the basis of the belief learning proposed by Sarin and Vahid [8] and the reinforcement learning proposed by Börgers and Sarin [9]. When studying dynamic oligopoly problems, in addition to determining the learning model of consumers' behavior, we also need to choose a suitable equilibrium concept. There are several equilibrium concepts: open-loop equilibrium, closed-loop equilibrium, and Markov perfect equilibrium. We mainly use open-loop equilibrium because it allows qualitative analysis of the steady-state solutions and their stability and because the analysis is simple [19]. The main contributions of this paper are as follows: we propose a dynamic pricing model based on adaptive learning and sophisticated learning, and we perform a numerical comparison of the dynamic pricing equilibrium solutions under sophisticated and adaptive learning. We conclude that, for nonlinear dynamic pricing problems, both models have groups of equilibrium solutions. Compared with the dynamic pricing model with adaptive consumers, however, when the model with sophisticated consumers reaches a symmetric Nash equilibrium, the difference between the two companies' market shares decreases.
The rest of this paper is organized as follows. The next section introduces the duopoly dynamic pricing model based on the adaptive learning of EWA and analyzes the characteristics of its steady states. In Section 3, we study the duopoly dynamic pricing model based on the sophisticated learning model and analyze the properties of the steady-state solutions numerically. In Section 4, we summarize and conclude.
2. Dynamic Pricing Model of Duopoly Based on Adaptive Learning
We consider an infinite-horizon duopoly dynamic pricing model with adaptive consumers. In a market with repeated purchasing, assume that there are two firms, firm 1 and firm 2, producing two brands of experience goods (such as daily necessities). There are many consumers in the market, and each one chooses only one brand to purchase in every period. When deciding which brand to buy in a period, adaptive consumers update their propensity for a brand according to their previous experience. We use EWA to describe consumers' learning process. The adaptive learning model based on EWA is composed of three elements. The first is consumers' propensity for the two goods, which we denote by , and it is determined by the goods' prices and by consumers' evaluation of the quality or goodwill of the goods, . The prices of the two goods are denoted by , (). Consumers' evaluation of the quality or goodwill of the goods is denoted by , (). We assume consumers' propensity for goods is ; thus, and . The second is the choice rule, which describes the probability of choosing a good on the basis of its propensity. The last is the updating rule, which describes how consumers update their propensity in each period. Table 1 lists all of the symbols used in this paper.

2.1. Choice Probabilities
Consumers' preference for good will affect the probability of choosing it. Furthermore, the more a consumer prefers a certain good, the more likely he is to choose it. That means should increase monotonically with and decrease monotonically with (where ). Several probability functions meet these requirements, such as the Logit, power, and probit forms. Previous studies show that the Logit form fits better than the others [20]. We therefore use the Logit function, which is commonly used in studies of brand choice under risk and uncertainty [21, 22]. Based on the Logit rule, the probability of a consumer choosing the good from the first firm is The probability that a consumer chooses good 2 is . is the degree of optimization of consumers, and the probability that a consumer chooses the optimal product increases with . If , the consumer will choose each good with equal probability, and if , then he will choose the optimal product with probability 1. is the relative quality or goodwill of the two goods for adaptive consumers; is the relative price of the two goods. According to (1), the consumers' choice probability () is nonlinear, and the degree of its nonlinearity depends on the optimization parameter . For example, if is very small, approximates a linear function.
We consider the customers in the market as a whole; the number of consumers is and the total demand of the market is . Thus, we can treat the probability that the consumer chooses good as the market demand for that good.
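The choice rule of this subsection can be illustrated with a minimal sketch. The names are assumptions made for illustration, since the paper's own symbols are not reproduced here: `lam` stands for the optimization parameter, `q_rel` for the relative quality or goodwill of good 1 over good 2, and `p_rel` for the relative price of good 1 over good 2. The standard binary Logit form is assumed to correspond to (1).

```python
import math


def choice_prob_good1(q_rel: float, p_rel: float, lam: float) -> float:
    """Probability of choosing good 1 under a binary Logit rule (assumed form).

    q_rel: relative quality/goodwill of good 1 over good 2 (hypothetical name)
    p_rel: relative price of good 1 over good 2 (hypothetical name)
    lam:   optimization parameter, lam >= 0 (hypothetical name)
    """
    z = lam * (q_rel - p_rel)
    # Numerically stable evaluation of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def choice_prob_good2(q_rel: float, p_rel: float, lam: float) -> float:
    """Probability of choosing good 2: the complement of good 1."""
    return 1.0 - choice_prob_good1(q_rel, p_rel, lam)
```

With `lam = 0` the two goods are chosen with equal probability, and as `lam` grows the consumer picks the good with the higher net attraction almost surely, which matches the two limiting cases described in the text.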
2.2. Updating Rules
Prices for experience goods are usually clearly marked on the shelves. Thus, what the consumer has to learn is neither the prices nor their distribution but the quality or goodwill of the experience goods. Moreover, consumers receive a payoff from a product only after experiencing it. For this case of information asymmetry, Erev and Roth [13] propose a reinforcement learning model, whose updating rule is where , is the consumers' estimate of the quality or goodwill of good and at , respectively. is the utility acquired by consuming good at . A consumer's choice behavior is random under a given Logit choice rule. In this context we assume that the consumer's experience is also random, so is a random variable. We assume is the average value of . is the "recency" parameter, . means only the last period is remembered. If approaches zero, previous experience has a great effect on present belief. In the reinforcement learning model, the good is purchased in the previous period and the good is not purchased in the previous period.
Sarin and Vahid [8] propose a "belief-based" learning model, whose updating rule is where , , , , and in (3) are the same as in (2).
However, considerable empirical evidence suggests that the learning process is often a process of mixed learning. Therefore, to describe the consumer's learning process accurately, we combine rules (2) and (3) in a way similar to Camerer and Ho [11] and obtain an adaptive learning model based on EWA. The updating rule is where is the "recency" parameter of preference for purchased goods and is the "recency" parameter of preference for goods not purchased, , . and are the same as in (2).
Assumption 1. The "recency" parameters , can be seen as memory coefficients, and consumers are more impressed by the purchased good than by the nonpurchased good. Therefore, we assume that and . means that the EWA model reduces to the belief learning model and means that the EWA model reduces to the reinforcement learning model.
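To make the combined updating rule concrete, here is a minimal sketch of one discrete-time update. The functional form is an assumption, because the paper's equations (2) to (4) are not reproduced in this text: the purchased good's estimate is pulled toward the realized payoff at rate `phi1`, and the nonpurchased good's estimate decays at rate `phi2`. Under this assumed form, `phi2 = 0` leaves the nonpurchased good unchanged (a belief-style update) and `phi2 = phi1` applies the same recency weight to both goods (a reinforcement-style update). All names are hypothetical.

```python
def ewa_update(q_bought: float, q_other: float, payoff: float,
               phi1: float, phi2: float) -> tuple[float, float]:
    """One-period update of quality estimates (assumed EWA-style form).

    q_bought: current estimate for the good purchased this period
    q_other:  current estimate for the good not purchased
    payoff:   utility realized from consuming the purchased good
    phi1:     recency parameter for the purchased good, 0 <= phi1 <= 1
    phi2:     recency parameter for the nonpurchased good, 0 <= phi2 <= phi1
    """
    if not (0.0 <= phi2 <= phi1 <= 1.0):
        raise ValueError("expected 0 <= phi2 <= phi1 <= 1")
    # Purchased good: weighted average of the old estimate and the new payoff.
    new_bought = (1.0 - phi1) * q_bought + phi1 * payoff
    # Nonpurchased good: no new information, the estimate only decays.
    new_other = (1.0 - phi2) * q_other
    return new_bought, new_other
```

Setting `phi1 = 1` makes the purchased good's estimate equal to the latest payoff, which is the "only the last period is remembered" case described in Section 2.2.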
The following Proposition 2 shows the continuous time updating rule of adaptive consumers.
Proposition 2. The continuous time updating rule of adaptive consumers is
Proof. See the Appendix.
2.3. Equilibrium Analysis of Dynamic Pricing Model
We suppose that the two firms each produce a product at a constant marginal cost at time , and the marginal cost for both brands is normalized to zero. Then the optimization problem for the firm's long-term profit is subject to
For a myopic firm, the short-term profit is described as , which is the value function. To analyze the existence of a Nash equilibrium in our model, we further make the following general assumption on the value function.
Assumption 3. The following assumptions hold for all , . (1) The strategy sets are nonempty and convex. (2) The firm 's instantaneous profits are uniformly bounded. That is, .
Lemma 4. The firm 's instantaneous profits are quasiconcave in ().
Proof. Taking the first derivative of with respect to yields
Let and consider the following two cases. (1) The function for all . Because is continuous and , it follows that for . Hence, strictly decreases in for and therefore is strongly quasiconcave. (2) There exists such that . The second derivative of with respect to evaluated at is equal to
where the third equality follows from . The preceding shows that is a local maximum of and that there does not exist an interior minimum for . It then follows that is unique because otherwise there must exist an interior minimum for . Consequently, the function increases for and decreases for , and therefore is strongly quasiconcave.
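The single-peakedness used in Lemma 4 can also be seen from a short worked derivation. The notation here is an assumption made for illustration, since the paper's symbols are not reproduced in this text: $x$ is the Logit probability of choosing good 1, $\lambda > 0$ the optimization parameter, $q$ the relative quality or goodwill, $p_1, p_2$ the prices, and $\pi_1 = p_1 x$ firm 1's instantaneous profit with market size normalized to one.

```latex
\begin{aligned}
x(p_1,p_2) &= \frac{1}{1+e^{-\lambda\,(q-(p_1-p_2))}}, \qquad
\frac{\partial x}{\partial p_1} = -\lambda\,x\,(1-x),\\[4pt]
\pi_1 &= p_1\,x, \qquad
\frac{\partial \pi_1}{\partial p_1}
  = x + p_1\,\frac{\partial x}{\partial p_1}
  = x\,\bigl[\,1-\lambda\,p_1\,(1-x)\,\bigr] = x\,g(p_1),\\[4pt]
g'(p_1) &= -\lambda\,(1-x) - \lambda^{2}\,p_1\,x\,(1-x) \;<\; 0
\qquad (p_1 \ge 0,\ \lambda > 0).
\end{aligned}
```

Because $x > 0$, the sign of $\partial \pi_1 / \partial p_1$ is the sign of $g(p_1)$. Under these assumptions $g(0) = 1$ and $g$ is strictly decreasing on $p_1 \ge 0$, so the derivative changes sign at most once, from positive to negative. The profit therefore rises and then falls in the firm's own price, which is the quasiconcavity property the lemma states.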
Lemma 5. There exists at least one Nash equilibrium in dynamic pricing model of duopoly based on adaptive learning (see (6)).
Proof. From Theorem 2.1 in Vives [23], if the strategy sets are nonempty, convex, and compact, and the firm 's instantaneous profits are continuous in the prices of all firms and quasiconcave in its own price, then a Nash equilibrium exists. In our model, although each firm can choose a price from , which is not compact, the firm 's instantaneous profits are uniformly bounded, which allows us to construct an equivalent model by restricting firm to choose a price from a nonempty, convex, and compact set. In addition, the firm 's instantaneous profits are continuous and quasiconcave in (). So, there exists a Nash equilibrium in the dynamic pricing model of duopoly based on adaptive learning.
By analyzing (6) and (5), we obtain the following steady states of Nash equilibrium.
Proposition 6. The steady states of the open-loop Nash equilibrium satisfy the following (10) and (11):
Proof. See the Appendix.
According to (10) and (11), the consumer's choice probability is nonlinear, so the steady-state conditions are also nonlinear, and there may be multiple steady-state solutions. In addition, it can be seen from (10) and (11) that if is close to zero, there is only one steady-state solution.
The following proposition shows the relationship between the myopic optimal price and the optimal steady-state price. We set as the firms' myopic optimal price, and the solution is . (Caplin and Nalebuff [24] established that the Nash equilibrium in an oligopoly with Logit demand functions exists and is unique.)
Proposition 7. If is the solution of (10) and (11), at any path of the optimal price, each firm's optimal price is (the equality holds . ).
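The myopic benchmark referred to before Proposition 7 can be computed numerically. The sketch below finds the myopic (one-period) price equilibrium of a Logit duopoly by iterating best responses. The setup is an assumption made for illustration: instantaneous profit is price times Logit share, marginal cost is zero, and the names `lam` (optimization parameter) and `q_rel` (relative quality of good 1) are hypothetical. Under these assumptions the first-order condition is 1 = lam * p * (1 - share), so with identical goods the symmetric solution is p = 2 / lam.

```python
import math


def logistic(z: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def best_response(p_other: float, q_rel_own: float, lam: float) -> float:
    """Own price maximizing p * share, via bisection on the first-order condition."""
    def foc(p: float) -> float:
        share = logistic(lam * (q_rel_own - (p - p_other)))
        # Strictly decreasing in p for p >= 0, positive at p = 0.
        return 1.0 - lam * p * (1.0 - share)

    lo, hi = 0.0, 1.0 / lam
    while foc(hi) > 0.0:          # expand the bracket until the sign changes
        hi *= 2.0
    for _ in range(80):           # bisection
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def myopic_equilibrium(q_rel: float, lam: float, sweeps: int = 100):
    """Iterate best responses of firm 1 and firm 2 until prices settle."""
    p1 = p2 = 1.0 / lam
    for _ in range(sweeps):
        p1 = best_response(p2, q_rel, lam)
        p2 = best_response(p1, -q_rel, lam)
    return p1, p2
```

For example, `myopic_equilibrium(0.0, 2.0)` returns prices close to 1.0 for both firms, in line with the closed form 2 / lam for identical goods. Best-response iteration is used here only because it is simple; it is not claimed to be the method used in the paper.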
Proof. See the Appendix.
2.4. Numerical Results
The steady-state prices satisfy (10) and (11). Because is a nonlinear function of and , we turn (10) and (11) into nonlinear equations in and and obtain the steady-state solutions by solving these equations. In fact, (10) and (11) have several different sets of steady-state solutions. For instance, we assume that . That is, the two goods are identical. If , , , and , there are three groups of steady-state solutions for , that is , , and . As shown in Figure 1, the curve of (10) and the curve of (11) have three intersections. The corresponding probabilities are, respectively, , , and . If , , , and ( means that the adaptive learning model based on EWA reduces to the reinforcement learning model), there are also three groups of steady-state solutions for , that is , , and . As shown in Figure 2, the curve of (10) and the curve of (11) have three intersections. The corresponding probabilities are, respectively, , , and . If , , , and ( means that the adaptive learning model based on EWA reduces to the belief learning model), we can see directly that (10) and (11) have only one steady-state solution , and the corresponding probability is .
Based on Figures 1 and 2, if and is large enough, there will always be three groups of steady-state solutions, and the greater is, the more divergent the three sets of steady-state solutions are. If is a small positive number, as long as firm 1 charges a price lower than the myopic optimal price, the third equilibrium solution will emerge, and . In other words, by selecting a lower initial price, the firm can attract some naive consumers and gain a greater market share. Moreover, comparing Figure 1 with Figure 2 shows that the three solutions in Figure 2 are more divergent, which indicates that the two firms differ more in market share in equilibrium. To illustrate the relationship between the number of steady-state solutions and , we obtain Figure 3 with , , and . We can therefore conclude that, compared with the adaptive learning model based on EWA, the reinforcement learning model makes consumers more likely to choose goods they are familiar with. As a result, consumers can gradually become trapped in products of inferior quality.
What happens if one firm holds a quality advantage? We assume that and , , , and ; the resulting relationship between and is illustrated in Figure 4. Figure 4 shows that there exists one steady state when and , and there are three steady states when . This illustrates that dominance by a low-quality firm remains possible if the initial value of goodwill is sufficiently close to the appropriate steady state.
3. Dynamic Pricing Model of Duopoly Based on Sophisticated Learning
With the development of information technology and e-commerce, access to information is getting easier. Some consumers may update their beliefs about the quality or goodwill of experience goods not only through their own experience but also by using other consumers' purchase information. We call these consumers sophisticated consumers. The way they purchase repeatedly can affect their purchase behavior, and consumers' behavior can affect the market structure. Thus, taking sophisticated consumers into account when studying dynamic pricing problems is more realistic.
As before, there are two firms, 1 and 2, producing experience goods under two brands. At each point in time the consumer seeks to buy one unit of the good, either from firm 1 or from firm 2. Here we assume that there are two types of consumers in the market: adaptive consumers, whose proportion is , and sophisticated consumers, whose proportion is .
Like the adaptive learning model based on EWA, the sophisticated learning model is composed of three elements. The first is consumers' propensity for the two goods, which we denote by . The prices of the two goods are denoted by , and consumers' evaluation of the quality or goodwill of the goods is denoted by . (). The second is the choice probabilities. The last is the updating rule. The significant difference between the sophisticated learning model and the adaptive learning model is the updating rule. We describe this difference in detail in Section 3.2.
3.1. Choice Probabilities
As for adaptive consumers, the sophisticated consumers' choice probability can be given as and , where is the sophisticated consumers' estimate of the quality or goodwill of good and is the relative quality or goodwill of the two goods for sophisticated consumers. According to (12), the consumers' choice probability () is nonlinear. The nonlinearity arises from the nonlinearity of the demand function, and its degree depends on the optimization parameter . For example, if is very small, approximates a linear function.
We treat the customers in the market as a whole () as in Section 2.1. So the total demand of the market is .
3.2. Updating Rules
Sophisticated consumers are easily influenced by other consumers when they make purchase decisions. We assume that, for a consumer, the more of a good he buys, the better the quality of the good is taken to be; in other words, the less he purchases, the poorer the quality is taken to be. Therefore, we assume that the credit evaluation for sophisticated consumers after purchasing good 1 is where represents sophisticated consumers' strategy for purchasing good 1. means other consumers' purchase strategy. is the number of customers buying good 1, while is the number who buy good 2. So the overall number of customers in the market is . Considering customers in the market as a whole, we get .
According to the SEWA model proposed by Camerer et al. [2], we have where . Similarly, when consumers do not buy good ,
Then we obtain the updating rule of sophisticated consumers as follows.
Proposition 8. The continuous-time updating rule of sophisticated consumers is
The proof of Proposition 8 is similar to the proof of Proposition 2. If , there are only adaptive consumers in the market, namely, the experience-inspired model. If , there are only sophisticated consumers in the market, namely, the AQRE model. It is apparent that sophisticated consumers become adaptive consumers if , and the larger is, the greater the effect of other consumers on sophisticated consumers' purchase decisions.
3.3. Equilibrium Analysis of Dynamic Pricing Model
We again suppose that the two firms each produce a product at a constant marginal cost and that the marginal cost for both brands is normalized to zero. Based on the assumptions in Sections 3.1 and 3.2, the maximization problem for the firm's long-term profit is subject to
By analyzing (17) and (18), we obtain the following steady states of Nash equilibrium.
Proposition 9. The steady states of the open-loop Nash equilibrium satisfy where
Proof. See the Appendix.
Equations (19) are the conditions that steady-state solutions must meet. According to these equations, consumers' choice probability is nonlinear, so the steady-state conditions are also nonlinear, and there may be multiple steady-state solutions.
3.4. Numerical Results
In fact, (19) has three different steady-state solutions. For example, we assume . That is, the two goods are completely identical. If , , , , and , there are three groups of steady-state solutions for ; we get , , and . The corresponding probabilities , are, respectively, , , and .
Compared with the steady states of the dynamic pricing model in the previous section, which considers only adaptive consumers, the difference in market share between the two firms is smaller in the steady states obtained in this section. Whether there are sophisticated consumers in the market, and their degree of cognition, is mainly reflected in the value of in the model. We therefore analyze the relationship between the steady-state solution and the value of . In the steady-state solutions, the relationship among , , and is shown in Figures 5 and 6.
As Figures 5 and 6 show, the steady states gradually converge as increases. That is to say, when consumers perceive products of the same quality differently, the market share disparity in the steady state is smaller under sophisticated learning than under adaptive learning. Compared with a market with only adaptive consumers, when there are also sophisticated consumers and , is small but positive, firm 1 should choose a lower price, which places it at the third stable state, but its market share is smaller. This indicates that, when there are sophisticated consumers in the market, a firm that simply raises consumers' initial evaluation of its product earns less than it would in a market with only adaptive consumers. Therefore, when faced with sophisticated consumers, firms should not only raise consumers' initial evaluation of the product but also improve product quality to increase market share.
4. Conclusion
This paper considers the dynamic pricing problems of experience goods in a duopoly market. First, we presented an adaptive learning model based on the EWA model and applied it to dynamic pricing problems. Using the concept of open-loop equilibrium, we obtained groups of steady-state solutions by nonlinear dynamic programming. The analysis of the steady-state solutions shows that there will be a dominant company in the symmetrical steady-state solutions. Second, we put forward a sophisticated learning model on the basis of previous studies and applied it to dynamic pricing as well. The dynamic pricing model with sophisticated learning also has multiple steady-state solutions, and one firm must dominate when a symmetric steady-state solution exists, which is similar to the adaptive learning dynamic pricing model based on EWA. However, in the dynamic pricing model based on sophisticated learning, the difference in market share between the two firms decreases when the steady-state solution is symmetric. This shows that, compared with adaptive consumers, sophisticated consumers are less easily locked into the habit of purchasing inferior goods. Therefore, when there are sophisticated consumers in the market, firms should not only improve consumers' initial evaluation but also devote themselves to improving product quality in order to occupy the dominant position in the market. When most consumers in the market are adaptive, firms can lock in some naive consumers merely by increasing consumers' initial evaluation of their products. This paper studies only a duopoly market, whereas in practice many firms produce experience goods. Hence, future research can consider the case of many firms in the market. In addition, we assume that consumers are homogeneous and do not consider heterogeneous consumers.
Furthermore, in reality, decision makers always face imprecise and vague operational conditions [25]. Uncertainties have been tackled in many ways, and fuzzy set theory has a long history of handling imprecise values [26].
Appendix
Proof of Proposition 2. Because the evolution of consumers' actual preferences is random, firms need to use a stochastic dynamic optimization method to predict consumer preferences. There are many stochastic dynamic optimization methods [27], and in this paper we assume that firms use stochastic approximation theory [28]. First, we calculate the expected change of :
Stochastic approximation, which has been widely used in the recent literature on learning, shows that if is small, the solution of the original stochastic difference equation will be closely approximated by the solution of the following parallel continuous-time system, the differential equation (5) [29].
Proof of Proposition 6. The current-value Hamiltonian of (6) is given by
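The approximation argument in this proof can be illustrated with a small simulation. It compares one random path of quality estimates with the deterministic path obtained by replacing each update with its expectation. The model is a sketch under stated assumptions, not the paper's exact equations: prices are held fixed, payoffs are deterministic (`u1`, `u2`), choices follow a binary Logit rule, the purchased good's estimate moves toward its payoff at rate `phi1`, and the nonpurchased good's estimate decays at rate `phi2`. All names are hypothetical.

```python
import math
import random


def logistic(z: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(phi1, phi2, lam, p_rel, u1, u2, steps, seed):
    """Return ((q1, q2), (m1, m2)): stochastic path end and mean-dynamics end."""
    rng = random.Random(seed)
    q1 = q2 = 0.0   # stochastic estimates
    m1 = m2 = 0.0   # deterministic (expected-change) estimates
    for _ in range(steps):
        # Mean dynamics: expected one-step change given the current state.
        x = logistic(lam * ((m1 - m2) - p_rel))
        dm1 = x * phi1 * (u1 - m1) - (1.0 - x) * phi2 * m1
        dm2 = (1.0 - x) * phi1 * (u2 - m2) - x * phi2 * m2
        m1, m2 = m1 + dm1, m2 + dm2

        # Stochastic dynamics: one consumer choice drawn from the Logit rule.
        xs = logistic(lam * ((q1 - q2) - p_rel))
        if rng.random() < xs:      # good 1 purchased
            q1 = (1.0 - phi1) * q1 + phi1 * u1
            q2 = (1.0 - phi2) * q2
        else:                      # good 2 purchased
            q2 = (1.0 - phi1) * q2 + phi1 * u2
            q1 = (1.0 - phi2) * q1
    return (q1, q2), (m1, m2)
```

With small recency parameters, for example `simulate(0.002, 0.001, 1.0, 0.0, 1.0, 1.0, 50000, seed=0)`, the random path stays near the deterministic one, which is the qualitative content of the stochastic approximation result cited above. Larger `phi1` and `phi2` make the random path noisier, so the approximation degrades.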
, are costate variables. Let , we have
Differentiating with respect to , separately and using the equations and , we have
Because the steady state meets the following conditions: , , , , , and , combined with (A.3), we obtain the steady-state price satisfying (10) and (11).
Proof of Proposition 7. By the proof of Proposition 6, we have (1) Obviously, if , we get and . (2) Because , , , and , we obtain .
Proof of Proposition 9. The current-value Hamiltonian of (16) is given by , , , and are costate variables. Let , we have Differentiating with respect to , , , and separately and using the equations , , , and , we have Because the steady state meets the following conditions: , , , , , , , , , , , and , combined with (A.7), we obtain the steady-state price satisfying (19).
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
This research is supported by the National Natural Science Foundation of China (Grant nos. 71371191, 70971139, 71210003, and 71271219).
References
[1] P. Nelson, "Information and consumer behavior," The Journal of Political Economy, vol. 78, pp. 311–329, 1970.
[2] C. F. Camerer, T.-H. Ho, and J.-K. Chong, "Sophisticated experience-weighted attraction learning and strategic teaching in repeated games," Journal of Economic Theory, vol. 104, no. 1, pp. 137–188, 2002.
[3] K. Bagwell and M. H. Riordan, "High and declining prices signal product quality," The American Economic Review, vol. 81, no. 1, pp. 224–239, 1991.
[4] C. Shapiro, "Optimal pricing of experience goods," The Bell Journal of Economics, vol. 14, no. 2, pp. 497–507, 1983.
[5] P. Milgrom and J. Roberts, "Price and advertising signals of product quality," The Journal of Political Economy, vol. 94, pp. 796–821, 1986.
[6] R. Spiegler, Bounded Rationality and Industrial Organization, Oxford University Press, 2011.
[7] J. Brandts and C. A. Holt, "Naive Bayesian learning and adjustment to equilibrium in signaling games," Journal of Economic Behavior and Organization. In press.
[8] R. Sarin and F. Vahid, "Payoff assessments without probabilities: a simple dynamic model of choice," Games and Economic Behavior, vol. 28, no. 2, pp. 294–309, 1999.
[9] T. Börgers and R. Sarin, "Naive reinforcement learning with endogenous aspirations," International Economic Review, vol. 41, no. 4, pp. 921–950, 2000.
[10] C. Oyarzun and R. Sarin, "Learning and risk aversion," Journal of Economic Theory, vol. 148, no. 1, pp. 196–225, 2013.
[11] C. Camerer and T.-H. Ho, "Experience-weighted attraction learning in normal form games," Econometrica, vol. 67, no. 4, pp. 827–874, 1999.
[12] T. H. Ho, C. F. Camerer, and J.-K. Chong, "Self-tuning experience weighted attraction learning in games," Journal of Economic Theory, vol. 133, no. 1, pp. 177–198, 2007.
[13] I. Erev and A. E. Roth, "Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria," American Economic Review, vol. 88, no. 4, pp. 848–881, 1998.
[14] I. Erev and G. Barron, "On adaptation, maximization, and reinforcement learning among cognitive strategies," Psychological Review, vol. 112, no. 4, pp. 912–931, 2005.
[15] D. Bergemann and J. Valimaki, "Monopoly pricing of experience goods," Cowles Foundation Discussion Paper, vol. 1463, 2005.
[16] P. K. Chintagunta and V. R. Rao, "Pricing strategies in a dynamic duopoly: a differential game model," Management Science, vol. 42, no. 11, pp. 1501–1514, 1996.
[17] E. Hopkins, "Adaptive learning models of consumer behavior," Journal of Economic Behavior and Organization, vol. 64, no. 3-4, pp. 348–368, 2007.
[18] W. Amaldoss and S. Jain, "Conspicuous consumption and sophisticated thinking," Management Science, vol. 51, no. 10, pp. 1449–1466, 2005.
[19] R. Cellini and L. Lambertini, "A dynamic model of differentiated oligopoly with capital accumulation," Journal of Economic Theory, vol. 83, no. 1, pp. 145–155, 1998.
[20] C. Camerer and T.-H. Ho, "Experience-weighted attraction learning in coordination games: probability rules, heterogeneity, and time-variation," Journal of Mathematical Psychology, vol. 42, no. 2-3, pp. 305–326, 1998.
[21] M. Ben-Akiva and S. R. Lerman, Discrete Choice Analysis: Theory and Application to Travel Demand, 1985.
[22] S. P. Anderson, A. de Palma, and J.-F. Thisse, Discrete Choice Theory of Product Differentiation, MIT Press, 1992.
[23] X. Vives, Oligopoly Pricing: Old Ideas and New Tools, The MIT Press, 2001.
[24] A. Caplin and B. Nalebuff, "Aggregation and imperfect competition: on the existence of equilibrium," Econometrica, vol. 59, no. 1, pp. 25–59, 1991.
[25] L. Wang, Q. L. Fu, C. G. Lee, and Y. R. Zeng, "Model and algorithm of fuzzy joint replenishment problem under credibility measure on fuzzy goal," Knowledge-Based Systems, vol. 39, pp. 57–66, 2013.
[26] L. Wang, H. Qu, Y. Li, and J. He, "Modeling and optimization of stochastic joint replenishment and delivery scheduling problem with uncertain costs," Discrete Dynamics in Nature and Society, vol. 2013, Article ID 657465, 12 pages, 2013.
[27] D. Fudenberg and D. K. Levine, The Theory of Learning in Games, vol. 2, MIT Press, 1998.
[28] M. Benaïm, "Dynamics of stochastic approximation algorithms," in Séminaire de Probabilités, XXXIII, vol. 1709, pp. 1–68, Springer, 1999.
[29] M. Benaïm, "Recursive algorithms, urn processes and chaining number of chain recurrent sets," Ergodic Theory and Dynamical Systems, vol. 18, no. 1, pp. 53–87, 1998.
Copyright
Copyright © 2014 Wenjie Bi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.