Journal of Probability and Statistics

Research Article | Open Access

John W. Lau, Ed Cripps, "Bayesian Non-Parametric Mixtures of GARCH(1,1) Models", Journal of Probability and Statistics, vol. 2012, Article ID 167431, 16 pages, 2012.

Bayesian Non-Parametric Mixtures of GARCH(1,1) Models

Academic Editor: Ori Rosen
Received: 02 Mar 2012
Revised: 16 May 2012
Accepted: 18 May 2012
Published: 16 Jul 2012


Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process is a generalisation of the Dirichlet process typically used in nonparametric models; it provides a richer clustering structure, and its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.

1. Introduction

Generalised autoregressive conditional heteroscedastic (GARCH) models estimate the time-varying fluctuations of a time series around its mean level, known as the volatility of the series [1, 2]. The standard GARCH model specifies the volatility persistence at time $t$ as a linear combination of previous volatilities and squared residual terms. The persistence is assumed constant for all $t$, resulting in smooth transitions of volatility levels. However, many nonstationary time series exhibit abrupt changes in volatility, suggesting fluctuating levels of volatility persistence. In this case the GARCH parameters undergo regime changes over time. If the maximum number of potential regimes is known, Markov-switching GARCH models are an appealing option [3-8]. However, often the number of volatility regimes is not known and can be difficult to preselect. In this case, Bayesian nonparametric mixture models are attractive because they allow the data to determine the number of volatility regimes or mixture components. For example, nonparametric mixing over the Dirichlet process has recently been described by Lau and So [9] for GARCH(1,1) models, Jensen and Maheu [10] for stochastic volatility models, and Griffin [11] for Ornstein-Uhlenbeck processes.

The Dirichlet process [12] is the random probability measure most widely applied in nonparametric mixture models and, within a hierarchical framework, was introduced in Lo [13] for independent data. Lau and So [9] extend the work of Lo [13] to time-dependent data, where time-varying GARCH parameters are mixed over the Dirichlet process. The additional flexibility of this approach allows a range of GARCH regimes, from all observations generated by a single GARCH model to each observation in the series generated by a unique set of GARCH parameters. Lau and So [9] conclude with a discussion of the possibility of extending their method to alternative random probability measures that provide greater clustering flexibility than the Dirichlet process. We continue this discussion by outlining a novel method for a class of GARCH mixture models mixed over the Poisson-Kingman process [14, 15] derived from the stable subordinator (known henceforth as PKSS). To illustrate the richer clustering mechanisms of the PKSS process, we describe three of its special cases: the Dirichlet process [12], the Poisson-Dirichlet (PD) process [14, 16], and the normalized generalized gamma (NGG) process [17, 18].

Theoretical developments and recent applications of the PKSS process are discussed in Lijoi et al. [19]. However, we note that the PD and NGG processes have yet to be developed for volatility estimation, or indeed for time series applications in general, and in this sense the work in this paper is novel. Although the Dirichlet process has now been used extensively in applications, the implementation of more general nonparametric mixture models for applied work is not always obvious. We therefore describe three Markov chain Monte Carlo (MCMC) algorithms. First, we develop a weighted Chinese restaurant Gibbs type process for partition sampling to explore the posterior distribution. The basis of this algorithm is developed for time-dependent data in Lau and So [9], and we extend it to allow for the PKSS process. We also note recent developments in the sampling of Bayesian nonparametric models in Walker [20] and Papaspiliopoulos and Roberts [21] and describe how these algorithms can be constructed to estimate our model.

The methodology is illustrated through volatility and predictive density estimation for a GARCH(1,1) model applied to the Standard and Poor's 500 financial index from 2003 to 2009. Results are compared between a no-mixture model and nonparametric mixtures over the Dirichlet, PD, and NGG processes. Under the criterion of marginal likelihood the NGG process performs best. Also, the PD and NGG processes outperform the previously studied Dirichlet process, which in turn outperforms the no-mixture model. The results suggest that alternatives to the Dirichlet process should be considered for applications of nonparametric mixture models to time-dependent data.

The paper proceeds as follows. Section 2 presents a Bayesian mixture of GARCH models over an unknown mixing distribution, outlines a convenient Bayesian estimator based on quadratic loss, and describes some of the time series properties of our model. Section 3 discusses the class of random probability measures we consider as mixing distributions, details the clustering mechanisms associated with the three special cases mentioned above via the Pólya urn representation, and derives the consequences for the posterior distribution of the partitions resulting from the PKSS process. Our MCMC algorithm for sampling the posterior distribution is presented in Section 4, and the alternative MCMC algorithms of Walker [20] and Papaspiliopoulos and Roberts [21] are presented in Section 5. Section 6 describes the application, and Section 7 concludes the paper.

2. The Mixture Model

Let $Y_t$ be the observed variable at time $t$, let $\mathbf{Y}_{t-1}=\{Y_1,\ldots,Y_{t-1}\}$ be the observations from time $1$ to time $t-1$, and set $Y_t=0$ for $t\le 0$. The GARCH(1,1) model specifies

$$Y_t\mid\mathbf{Y}_{t-1},(\nu,\chi,\psi)\sim N\big(0,\sigma_t^2\big),\qquad \sigma_t^2=\nu+\chi Y_{t-1}^2+\psi\sigma_{t-1}^2, \quad(2.1)$$

where $\nu>0$, $\chi\ge 0$, $\psi\ge 0$, and $\sigma_t^2=0$ for $t\le 0$. In (2.1) the GARCH parameters $(\nu,\chi,\psi)$ are not time varying, implying that volatility persistence is constant over time with smooth transitions of volatility levels. To allow abrupt changes to volatilities, we extend (2.1) by writing $Z_t=\{\nu_t,\chi_t,\psi_t\}$, with $\nu_t>0$, $\chi_t\ge 0$, $\psi_t\ge 0$ for $t=1,\ldots,n$, and $\mathbf{Z}_t=\{Z_1,\ldots,Z_t\}$ as the joint latent variables from time $1$ to time $t$. The model is now a dynamic GARCH model with each observation potentially generated by its own set of GARCH parameters:

$$Y_t\mid\mathbf{Y}_{t-1},\mathbf{Z}_t\sim N\big(0,\sigma_t^2\big),\qquad \sigma_t^2=\nu_t+\chi_t Y_{t-1}^2+\psi_t\sigma_{t-1}^2. \quad(2.2)$$

Note that in model (2.2) the data controls the maximum potential number of GARCH regimes, the sample size $n$. In contrast, finite switching models preallocate a maximum number of regimes, typically much smaller than the number of observations. As the potential number of regimes grows, estimation of the associated transition probabilities and GARCH parameters in finite switching models becomes prohibitive. However, the model becomes manageable if we assume that the latent variables $\mathbf{Z}_n=\{Z_1,\ldots,Z_n\}$ are independent of each other and complete the hierarchy by modelling the GARCH parameters contained in $\mathbf{Z}_n$ with an unknown mixing distribution, $G$, with law $\mathcal{P}$; that is,

$$Z_t\mid G\overset{\text{iid}}{\sim}G(d\nu,d\chi,d\psi),\qquad G\sim\mathcal{P}, \quad(2.3)$$

with $\mathbf{Z}_n$ and the mixing distribution, $G$, parameters that we may estimate.
Depending on the posterior distribution of the clustering structure associated with the mixing distribution, the results may suggest anything from $Z_t=Z$ for $t=1,2,\ldots,n$ (a single-regime GARCH model) up to a unique $Z_t$ for each $t$, indicating a separate GARCH regime for each time point. This illustrates the flexibility of the model.
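To make the regime-switching behaviour of (2.2) concrete, the following minimal sketch simulates a path under two fixed GARCH regimes. The parameter values and the regime change point are hypothetical, chosen for illustration only; they are not taken from the paper.

```python
import math
import random

def simulate_mixture_garch(params, seed=1):
    """Simulate Y_1..Y_n from model (2.2): Y_t ~ N(0, sigma_t^2) with
    sigma_t^2 = nu_t + chi_t * Y_{t-1}^2 + psi_t * sigma_{t-1}^2,
    Y_0 = sigma_0^2 = 0, and params[t] = (nu_t, chi_t, psi_t)."""
    rng = random.Random(seed)
    y_prev, s2_prev = 0.0, 0.0
    ys, s2s = [], []
    for nu, chi, psi in params:
        s2 = nu + chi * y_prev ** 2 + psi * s2_prev
        y = rng.gauss(0.0, math.sqrt(s2))
        ys.append(y)
        s2s.append(s2)
        y_prev, s2_prev = y, s2
    return ys, s2s

# Hypothetical regimes: a calm regime for t = 1..100 (stationary variance 1),
# then a more volatile regime (stationary variance 10) for t = 101..200.
params = [(0.05, 0.05, 0.90)] * 100 + [(0.50, 0.10, 0.85)] * 100
ys, s2s = simulate_mixture_garch(params)
```

Under the full model each $Z_t$ would instead be drawn from $G$ as in (2.3); the fixed two-regime sequence here simply mimics one realisation of the clustering structure.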

We write ๐‘ (๐™๐‘›,๐บ) as a positive integrable function of the latent variables, ๐™๐‘›, and the mixing distribution, ๐บ, to represent various quantities that may be of interest for inference. Under quadratic loss the Bayesian estimator is the posterior expectation ๐ธ[๐‘ (๐™๐‘›,๐บ)โˆฃ๐˜๐‘›]. For our model this is an appealing estimator because it does not require the posterior of ๐บ but only the posterior distribution of the sequence {๐‘1,โ€ฆ,๐‘๐‘›}, that is, ๐ธ๎€บ๐‘ ๎€ท๐™๐‘›,๐บ๎€ธโˆฃ๐˜๐‘›๎€ป=๎€œ๐’ต๐‘›โ„Ž๎€ท๐˜๐‘›,๐™๐‘›๎€ธ๐œ‹๎€ท๐‘‘๐™๐‘›โˆฃ๐˜๐‘›๎€ธ,(2.4) because โ„Ž๎€ท๐˜๐‘›,๐™๐‘›๎€ธ=๎€œ๐’ข๐‘ ๎€ท๐™๐‘›,๐บ๎€ธ๐œ‹๎€ท๐‘‘๐บโˆฃ๐˜๐‘›,๐™๐‘›๎€ธ,(2.5) where ๐œ‹(๐‘‘๐บโˆฃ๐˜๐‘›,๐™๐‘›) represents the posterior law of the random probability measure ๐บ, and ๐œ‹(๐‘‘๐™๐‘›โˆฃ๐˜๐‘›) is the posterior distribution of the sequence ๐™๐‘›={๐‘1,โ€ฆ,๐‘๐‘›}.

In Lau and So [9] the unknown mixing distribution for the GARCH parameters, $G$, is taken to be the Dirichlet process. This paper combines the theoretical groundwork of Lijoi et al. [19, 22] with Lau and So [9] by allowing $G$ to be the PKSS process. The result is a nonparametric GARCH model which contains (among others) the Dirichlet process typically used in time series, as well as the PD and NGG processes, as special cases.

Understanding conditions for stationarity of a time series model is fundamental for statistical inference. Since our model is specified with zero mean over time, we provide a necessary and sufficient condition for the existence of a second-order stationary solution for the infinite mixture of GARCH(1,1) models. The derivation closely follows Embrechts et al. [23] and Zhang et al. [24], and we state the conditions without proof. Letting $\epsilon_t$ be a standard normal random variable and replacing $Y_{t-1}^2$ by $\sigma_{t-1}^2\epsilon_{t-1}^2$ then, conditioned on $Y_i$ for $i=1,\ldots,t-1$ and $\{\nu_i,\chi_i,\psi_i\}$ for $i=1,\ldots,t$, $\sigma_t^2$ in (2.2) becomes

$$\sigma_t^2=\nu_t+\chi_t Y_{t-1}^2+\psi_t\sigma_{t-1}^2=\nu_t+\big(\chi_t\epsilon_{t-1}^2+\psi_t\big)\sigma_{t-1}^2. \quad(2.6)$$

Here (2.6) is well known to be a univariate stochastic difference equation expressed as

$$X_t=B_t+A_tX_{t-1}, \quad(2.7)$$

where $X_t=\sigma_t^2$, $A_t=\chi_t\epsilon_{t-1}^2+\psi_t$, and $B_t=\nu_t$. The stationarity of (2.7) implies the second-order stationarity of (2.2); that is, $X_t\overset{d}{\to}X$ as $t\to\infty$ for some random variable $X$, and $X$ satisfies $X\overset{d}{=}B+AX$, where the random variable pair $(A_t,B_t)\overset{d}{\to}(A,B)$ as $t\to\infty$ for some pair $(A,B)$. A stationary solution of (2.7) exists if $E[\ln|A_t|]<0$ and $E[\ln^+|B_t|]<\infty$, where $\ln^+x=\ln[\max\{x,1\}]$, as given in Embrechts et al. [23, Section 8.4, pages 454-481], Zhang et al. [24, Theorems 2 and 3], Vervaat [25], and Brandt [26].
The conditions for stationarity in our model are therefore

$$\int_{(0,\infty)}\ln^+\nu\,H_1(d\nu)<\infty,\qquad\int_{(0,\infty)^2}E\big[\ln\big|\chi\epsilon^2+\psi\big|\big]\,H_2(d\chi,d\psi)<0, \quad(2.8)$$

where the expectation in the second condition applies only to $\epsilon$, a standard normal random variable, and both $H_1(d\nu)=\int_{(0,\infty)^2}H(d\nu,d\chi,d\psi)$ and $H_2(d\chi,d\psi)=\int_{(0,\infty)}H(d\nu,d\chi,d\psi)$ are marginal measures of $H=E[G]$, the mean measure of $G$.
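The second condition in (2.8) can be checked numerically for a candidate regime: for fixed $(\chi,\psi)$ the inner expectation $E[\ln(\chi\epsilon^2+\psi)]$ has no simple closed form, but a Monte Carlo sketch along the following lines (parameter values illustrative) indicates whether a given regime is compatible with stationarity.

```python
import math
import random

def expected_log_persistence(chi, psi, n=100_000, seed=42):
    """Monte Carlo estimate of E[ln(chi * eps^2 + psi)] for eps ~ N(0,1),
    the regime-level quantity entering the second condition in (2.8);
    a negative value is the requirement for that regime."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        total += math.log(chi * eps * eps + psi)
    return total / n

# A persistent-but-stationary regime (illustrative values): negative estimate.
est = expected_log_persistence(0.1, 0.8)
```

Note that any regime with $\chi+\psi<1$ satisfies the condition automatically, since by Jensen's inequality $E[\ln(\chi\epsilon^2+\psi)]\le\ln E[\chi\epsilon^2+\psi]=\ln(\chi+\psi)<0$.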

Now consider the first two conditional moments from model (2.2). Obviously, the first conditional moment is zero, and the second conditional moment is identical to $\sigma_t^2$ in (2.6). The distinguishing feature of model (2.2) is that the parameters change over time and have the distribution $G$. Considering $\sigma_t^2$ as a scale of the model results in a scale mixture model over time. From the representation in (2.6), $\sigma_t^2$ can be rewritten as

$$\sigma_t^2=\nu_t+\sum_{j=1}^{t-1}\nu_{t-j}\prod_{i=t-j+1}^{t}\big(\chi_i\epsilon_{i-1}^2+\psi_i\big). \quad(2.9)$$

The unconditional second moment can be derived from this representation by marginalising over all the random variates. Also, $\sigma_t^2$ in (2.9) can be viewed as a weighted sum of the random sequence $\{\nu_t,\ldots,\nu_1\}$, and the random weights decay to zero at a polynomial rate, as long as the model is stationary. In fact, the rate can be irregular over time, and this is a substantial difference between the mixture of GARCH models and traditional GARCH models.

Finally, one might also be interested in the connection between models such as (2.2), with parameters having the distribution (2.3), and those having the Markov-switching characteristic that results in Markov-switching GARCH models [6, 7]. Markov-switching GARCH models have a similar structure to (2.3), and we can replace (2.3) by

$$Z_t\mid\big(S_t=i\big)=\big\{\nu_i,\chi_i,\psi_i\big\},\qquad S_t\mid S_{t-1}\sim\eta_{ij}=P\big(S_t=i\mid S_{t-1}=j\big),\quad\text{for }i,j=1,\ldots,K, \quad(2.10)$$

where $S_t$ denotes the state variables, usually latent and unobserved. Marginalising the current state variable $S_t$ in (2.10) yields the conditional distribution for $Z_t$ given the previous state $S_{t-1}$,

$$Z_t\mid S_{t-1}=j\sim\sum_{i=1}^{K}\eta_{ij}\,\delta_{(\nu_i,\chi_i,\psi_i)}. \quad(2.11)$$

So (2.11) can be viewed as a random probability measure, but one with a finite number of components and dependent on the previous state $S_{t-1}$.

3. The Random Probability Measures

We now describe the PKSS process and detail the Dirichlet, PD, and NGG processes to illustrate how the more general PKSS process allows for richer clustering mechanisms. Let $\mathcal{Z}$ be a complete and separable metric space and $\mathcal{B}(\mathcal{Z})$ be the corresponding Borel $\sigma$-field. Let $G\in\mathcal{G}$ be a probability measure on the space $(\mathcal{Z},\mathcal{B}(\mathcal{Z}))$, where $\mathcal{G}$ is the set of probability measures equipped with a suitable $\sigma$-field $\mathcal{B}(\mathcal{G})$ and the corresponding probability measure $\mathcal{P}$ (see Chapter 2 of Ghosh and Ramamoorthi [27] for more details). The random probability measure $G$ is sampled from the law $\mathcal{P}$ and operates as the unknown mixing distribution of the GARCH parameters in (2.2).

All random probability measures within the class of PKSS processes are almost surely discrete and can be represented as

$$G(A)=\sum_{i=1}^{\infty}W_i\,\delta_{Z_i}(A)\quad\text{for }A\in\mathcal{B}(\mathcal{Z}), \quad(3.1)$$

where $\delta_{Z_i}$ denotes the Dirac measure concentrated at $Z_i$, the sequence of random variables $\{Z_1,Z_2,\ldots\}$ has a nonatomic probability measure $H$, and the sequence of random weights $\{W_1,W_2,\ldots\}$ sums to 1 [28]. Also, the mean measure of the process $G$ with respect to $\mathcal{P}$ is $H$:

$$E[G(A)]=H(A)\quad\text{for }A\in\mathcal{B}(\mathcal{Z}). \quad(3.2)$$
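The representation (3.1) can be sketched by truncation. The code below draws the weights $\{W_i\}$ via the stick-breaking construction noted in Section 5 ($W_i=V_i\prod_{j<i}(1-V_j)$ with $V_i\sim\text{Beta}(1-\alpha,\theta+i\alpha)$), which covers the PD process and, at $\alpha=0$, the Dirichlet process; the truncation level and parameter values are illustrative.

```python
import random

def stick_breaking_pd(alpha, theta, n_atoms, seed=0):
    """Truncated stick-breaking draw of the weights of an a.s. discrete
    random measure G = sum_i W_i * delta_{Z_i} as in (3.1), for the
    Poisson-Dirichlet(alpha, theta) process; alpha = 0 gives the
    Dirichlet process.  V_i ~ Beta(1 - alpha, theta + i * alpha)."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for i in range(1, n_atoms + 1):
        v = rng.betavariate(1.0 - alpha, theta + i * alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights  # sums to 1 - remaining, slightly below 1 under truncation

w = stick_breaking_pd(alpha=0.0, theta=1.0, n_atoms=200)
```

Pairing each weight with an atom $Z_i\sim H$ then gives a draw of $G$ truncated to `n_atoms` components.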

A common characterization of (3.1) is the well-known Pólya urn prediction distribution described in Pitman [28]. For the purposes of this paper the Pólya urn warrants further discussion for two reasons. First, it is important for developing our MCMC algorithm to explore the posterior distribution discussed in Section 4. Second, it explicitly details how the PKSS process generalises the Dirichlet, PD, and NGG processes and how the different cluster tuning mechanisms operate.

Let $\{Z_1,\ldots,Z_r\}$ be a sequence of size $r$ drawn from $G$, where $r$ is a positive integer, and let $\mathbf{p}_r$ denote a partition of the integers $\{1,\ldots,r\}$. A partition $\mathbf{p}_r=\{C_{r,1},\ldots,C_{r,c_r}\}$ of size $c_r$ contains disjoint clusters $C_{r,j}$ of size $e_{r,j}$ indicated by the distinct values $\{Z_1^*,\ldots,Z_{c_r}^*\}$. The Pólya urn prediction distribution for the PKSS process can now be written as

$$\pi\big(dZ_{i+1}\mid\mathbf{Z}_i\big)=\frac{V_{i+1,c_i+1}}{V_{i,c_i}}H\big(dZ_{i+1}\big)+\frac{V_{i+1,c_i}}{V_{i,c_i}}\sum_{j=1}^{c_i}\big(e_{i,j}-\alpha\big)\,\delta_{Z_j^*}\big(dZ_{i+1}\big), \quad(3.3)$$

for $i=1,\ldots,r-1$, with $\pi(dZ_1)=H(dZ_1)$, $V_{1,1}=1$, and

$$V_{i,c_i}=\big(i-c_i\alpha\big)V_{i+1,c_i}+V_{i+1,c_i+1}. \quad(3.4)$$

Under the Pólya urn prediction distribution, $Z_{i+1}$ takes a new value from $H$ with mass $V_{i+1,c_i+1}/V_{i,c_i}$ and one of the existing values, $\{Z_1^*,\ldots,Z_{c_i}^*\}$, with mass $(i-c_i\alpha)V_{i+1,c_i}/V_{i,c_i}$. This yields the joint prior distribution

$$\pi\big(dZ_1,dZ_2,\ldots,dZ_n\big)=\pi\big(dZ_1\big)\prod_{i=1}^{n-1}\pi\big(dZ_{i+1}\mid Z_1,\ldots,Z_i\big), \quad(3.5)$$

a product of easily managed conditional densities useful for our MCMC scheme below.

The PKSS process reduces to the Dirichlet, PD, or NGG process through the following choices of $V_{i,c_i}$ in (3.3).

(1) Taking $0\le\alpha<1$ and

$$V_{i,c_i}=\frac{\Gamma(\theta)}{\Gamma(\theta+i)}\prod_{j=1}^{c_i}\big(\theta+(j-1)\alpha\big), \quad(3.6)$$

for $\theta>-\alpha$ results in the PD process.

(2) Setting $\alpha=0$, so that

$$V_{i,c_i}=\frac{\Gamma(\theta)}{\Gamma(\theta+i)}\theta^{c_i}, \quad(3.7)$$

the PD process becomes the Dirichlet process.

(3) The NGG process takes $0<\alpha<1$, such that

$$V_{i,c_i}=\frac{\alpha^{c_i-1}e^{\beta}}{\Gamma(i)}\sum_{k=0}^{i-1}\binom{i-1}{k}(-1)^k\beta^{k/\alpha}\,\Gamma\Big(c_i-\frac{k}{\alpha},\beta\Big), \quad(3.8)$$

for $\beta>0$.
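A small numerical sketch of the allocation masses in (3.3), using the Poisson-Dirichlet weights $V_{i,c}=\Gamma(\theta)\prod_{j=1}^{c}(\theta+(j-1)\alpha)/\Gamma(\theta+i)$ (normalised so that $V_{1,1}=1$), and assuming $\theta>0$ so that the log-gamma function applies; the function names are our own.

```python
import math

def log_V_pd(i, c, alpha, theta):
    """log V_{i,c} for the PD(alpha, theta) process, theta > 0 assumed
    so that all gamma-function arguments are positive."""
    out = math.lgamma(theta) - math.lgamma(theta + i)
    for j in range(1, c + 1):
        out += math.log(theta + (j - 1) * alpha)
    return out

def new_cluster_mass(i, c, alpha, theta):
    """Mass placed on a fresh draw from H in the Polya urn (3.3):
    the ratio V_{i+1,c+1} / V_{i,c}."""
    return math.exp(log_V_pd(i + 1, c + 1, alpha, theta)
                    - log_V_pd(i, c, alpha, theta))

# Dirichlet process (alpha = 0): mass theta / (theta + i), free of c.
# PD process (alpha > 0): mass (theta + c*alpha) / (theta + i), grows with c.
```

For example, with $i=10$ and $\theta=1$, the Dirichlet process gives mass $1/11$ to a new cluster whether $c=2$ or $c=8$, while the PD process with $\alpha=0.5$ gives $2/11$ and $5/11$, respectively.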

In the above, $\Gamma(\cdot)$ is the complete gamma function, and $\Gamma(\cdot,\cdot)$ is the incomplete gamma function. Examining the predictive distribution (3.3), the ratios $V_{i+1,c_i+1}/V_{i,c_i}$ and $V_{i+1,c_i}/V_{i,c_i}$ reveal the difference between the Dirichlet process and the other processes. Substituting the values of $V_{i,c_i}$ into the allocation masses shows that, for the Dirichlet process, the ratios do not depend on the number of existing clusters, $c_i$: the Dirichlet process assigns probability to a new value independent of the number of existing clusters, and the rate of increment of the partition size is constant. In contrast, the PD and NGG processes assign probability to a new value dependent on the number of existing clusters. The comparison of these three special cases illustrates the richer clustering mechanisms of the PKSS process over the Dirichlet process. Furthermore, the PKSS process contains many other random measures, whose clustering behaviours would be of interest for further investigation.

Turning to the distribution of partitions, Pitman [28] shows that the joint prior distribution of the sequence $\{Z_1,\ldots,Z_r\}$ is

$$\pi\big(dZ_1,\ldots,dZ_r\big)=V_{r,c_r}\prod_{j=1}^{c_r}\frac{\Gamma\big(e_{r,j}-\alpha\big)}{\Gamma(1-\alpha)}H\big(dZ_j^*\big). \quad(3.9)$$
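As a sanity check on (3.9), the partition factor $V_{r,c_r}\prod_j\Gamma(e_{r,j}-\alpha)/\Gamma(1-\alpha)$ must sum to one over all set partitions of $\{1,\ldots,r\}$. The sketch below verifies this by brute force for the Poisson-Dirichlet weights $V_{i,c}=\Gamma(\theta)\prod_{j=1}^{c}(\theta+(j-1)\alpha)/\Gamma(\theta+i)$ with $\theta>0$, in a small illustrative case ($r=5$).

```python
from math import exp, lgamma, log

def set_partitions(items):
    """Enumerate all set partitions of a list, by recursive insertion."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in set_partitions(rest):
        for k in range(len(p)):
            yield p[:k] + [p[k] + [first]] + p[k + 1:]
        yield p + [[first]]

def log_V(i, c, alpha, theta):
    """log V_{i,c} for the PD(alpha, theta) process, theta > 0 assumed."""
    out = lgamma(theta) - lgamma(theta + i)
    for j in range(1, c + 1):
        out += log(theta + (j - 1) * alpha)
    return out

def eppf(blocks, alpha, theta):
    """The partition probability for a partition given as a list of blocks:
    V_{r,c_r} * prod_j Gamma(e_j - alpha) / Gamma(1 - alpha)."""
    r = sum(len(b) for b in blocks)
    out = log_V(r, len(blocks), alpha, theta)
    for b in blocks:
        out += lgamma(len(b) - alpha) - lgamma(1.0 - alpha)
    return exp(out)

total = sum(eppf(p, 0.5, 1.0) for p in set_partitions(list(range(5))))
```

The 52 set partitions of five items receive probabilities that sum to one, for the PD case shown and for the Dirichlet case $\alpha=0$ alike.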

Notice that the joint distribution depends on the partition $\mathbf{p}_r$ of the $r$ integers $\{1,\ldots,r\}$, and we can decompose (3.9) into $\pi(dZ_1^*,\ldots,dZ_{c_r}^*,\mathbf{p}_r)=\pi(dZ_1^*,\ldots,dZ_{c_r}^*\mid\mathbf{p}_r)\,\pi(\mathbf{p}_r)$. The distribution of the partition, $\pi(\mathbf{p}_r)$, is

$$\pi\big(\mathbf{p}_r\big)=V_{r,c_r}\prod_{j=1}^{c_r}\frac{\Gamma\big(e_{r,j}-\alpha\big)}{\Gamma(1-\alpha)} \quad(3.10)$$

and is known as the exchangeable partition probability function. For many nonparametric models, this representation also helps MCMC construction by partitioning the posterior distribution in the form of the exchangeable partition probability function. To do so it would be necessary to obtain the posterior distribution of the partition, $\pi(\mathbf{p}_n\mid\mathbf{Y}_n)$, analytically; we could then generate $\mathbf{p}_n$ and approximate the posterior expectation. However, this is not possible in general, so we consider the joint distribution of $\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\}$ instead. We write the posterior expectation of $s(\mathbf{Z}_n,G)$ as a marginalization over the joint posterior distribution of $\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\}$ by

$$E\big[s\big(\mathbf{Z}_n,G\big)\mid\mathbf{Y}_n\big]=\sum_{\mathbf{p}_n}\underbrace{\int_{\mathcal{Z}}\cdots\int_{\mathcal{Z}}}_{c_n}h\big(\mathbf{Y}_n,\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)\,\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\mid\mathbf{Y}_n\big). \quad(3.11)$$

Here the joint posterior distribution of $(\mathbf{Z}_{c_n}^*,\mathbf{p}_n)$ is given by

$$\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\mid\mathbf{Y}_n\big)=\frac{\prod_{t=1}^{n}\phi\big(Y_t\mid 0,\sigma_t^2\big(\big\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\big\}\big)\big)\,\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big)}{\sum_{\mathbf{p}_n}\int_{\mathcal{Z}}\cdots\int_{\mathcal{Z}}\prod_{t=1}^{n}\phi\big(Y_t\mid 0,\sigma_t^2\big(\big\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\big\}\big)\big)\,\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big)}, \quad(3.12)$$

where $\phi(x\mid a,b)$ represents a normal density with mean $a$ and variance $b$ evaluated at $x$.
The variance $\sigma_t^2(\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\})$ is identical to $\sigma_t^2$; the notation emphasizes that $\sigma_t^2$ is a function of $\{Z_1,Z_2,\ldots,Z_t\}$, represented by $\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\}$. This representation leads to the development of the MCMC algorithm in the next section. For the sake of simplicity, we prefer the following expression for the variance:

$$\sigma_t^2\big(\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)=\sigma_t^2\big(\mathbf{Z}_n\big)=\sigma_t^2\big(\big\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\big\}\big)=\sigma_t^2\big(\mathbf{Z}_t\big)=\sigma_t^2. \quad(3.13)$$

We emphasise that we can always express $\mathbf{Z}_n$ by two elements, namely, a partition and distinct values. In this case $\mathbf{p}_n$ is a partition of the integers $\{1,\ldots,n\}$, the indices of $\mathbf{Z}_n$, and $\mathbf{Z}_{c_n}^*=\{Z_1^*,\ldots,Z_{c_n}^*\}$ represents the distinct values of $\mathbf{Z}_n$. The partition $\mathbf{p}_n$ locates the distinct values from $\mathbf{Z}_n$ to $\mathbf{Z}_{c_n}^*$ or vice versa. As a result, we have the following equivalent representations:

$$\mathbf{Z}_n=\big\{Z_1,\ldots,Z_n\big\}=\big\{Z_1^*,\ldots,Z_{c_n}^*\big\}=\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}. \quad(3.14)$$

In time series analysis, we usually consider the first $t$ items, $\mathbf{Z}_t=\{Z_1,\ldots,Z_t\}$, the corresponding partition, $\mathbf{p}_t$, and distinct values, $\mathbf{Z}_{c_t}^*$, such that

$$\mathbf{Z}_t=\big\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\big\}. \quad(3.15)$$

Here $\mathbf{Z}_t$ contains the first $t$ elements of $\mathbf{Z}_n=\{Z_1,\ldots,Z_n\}$, and adding $\{t+1,\ldots,n\}$ to $\mathbf{p}_t$ would yield $\mathbf{p}_n$ according to $\mathbf{Z}_{c_t}^*$ and the distinct values of $\{Z_{t+1},\ldots,Z_n\}$. Combining $\mathbf{Z}_{c_t}^*$ and the distinct values of $\{Z_{t+1},\ldots,Z_n\}$ gives $\mathbf{Z}_{c_n}^*$, providing the connection between $\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\}$ and $\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\}$. To simplify the likelihood expression and the sampling algorithm, we replace $\sigma_t^2(\mathbf{Z}_t)=\sigma_t^2(\{\mathbf{Z}_{c_t}^*,\mathbf{p}_t\})$ by $\sigma_t^2(\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\})$, since the subscript $t$ in $\sigma_t^2$ already tells us that only the first $t$ items of $\mathbf{Z}_n$ are considered. We then have a more accessible representation of the likelihood function as

$$L\big(\mathbf{Y}_n\mid\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)=\prod_{t=1}^{n}\phi\big(Y_t\mid 0,\sigma_t^2\big(\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)\big), \quad(3.16)$$

and (3.12) becomes

$$\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\mid\mathbf{Y}_n\big)=\frac{L\big(\mathbf{Y}_n\mid\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)\,\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big)}{\sum_{\mathbf{p}_n}\int_{\mathcal{Z}}\cdots\int_{\mathcal{Z}}L\big(\mathbf{Y}_n\mid\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)\,\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big)}. \quad(3.17)$$

We are now equipped to describe the MCMC algorithm.

4. The Algorithm for Sampling the Partitions and Distinct Values

Our Markov chain Monte Carlo (MCMC) sampling procedure generates distinct values and partitions alternately from the posterior distribution, $\pi(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\mid\mathbf{Y}_n)$. For $S$ iterations our MCMC algorithm is as follows.

(1) Initialise $\mathbf{Z}_{c_n}^*=(\mathbf{Z}_{c_n}^*)^{[0]}$.

For $s=1,2,\ldots,S$:

(2) generate $\mathbf{p}_n^{[s]}$ from $\pi(\mathbf{p}_n^{[s]}\mid(\mathbf{Z}_{c_n}^*)^{[s-1]},\mathbf{Y}_n)$;

(3) generate $(\mathbf{Z}_{c_n}^*)^{[s]}$ from $\pi(d\mathbf{Z}_{c_n}^*\mid\mathbf{p}_n^{[s]},\mathbf{Y}_n)$.

End.

To obtain our estimates we use the weighted Chinese restaurant Gibbs type process introduced in Lau and So [9] for time series models mixed over the Dirichlet process. We have extended this scheme to allow for the more general PKSS process. In what follows, the extension from the Dirichlet to the PKSS process lies in the weights of the Pólya urn predictive distribution in (3.3).

The main idea of this algorithm is the "leave one out" principle, which removes item $t$ from the partition and then replaces it. This gives an update of both $\mathbf{Z}_{c_n}^*$ and $\mathbf{p}_n$. The idea has been applied to partition sampling in many Bayesian nonparametric models based on the Dirichlet process (see [17] for a review). The strategy is a simple evaluation of the product of the likelihood function (3.16) and the Pólya urn distribution (3.3) of $Z_t$, conditioned on the remaining parameters, yielding a joint updating distribution for $\mathbf{Z}_{c_n}^*$ and $\mathbf{p}_n$. We now describe the distributions $\pi(\mathbf{p}_n\mid\mathbf{Z}_{c_n}^*,\mathbf{Y}_n)$ and $\pi(d\mathbf{Z}_{c_n}^*\mid\mathbf{p}_n,\mathbf{Y}_n)$ used in the sampling scheme.

Define $\mathbf{p}_{n,-t}$ to be the partition $\mathbf{p}_n$ less item $t$. Then $\mathbf{p}_{n,-t}=\{C_{1,-t},C_{2,-t},\ldots,C_{c_{n,-t},-t}\}$, with corresponding distinct values given by $\mathbf{Z}_{c_{n,-t}}^*=\{Z_{1,-t}^*,Z_{2,-t}^*,\ldots,Z_{c_{n,-t},-t}^*\}$. To generate from $\pi(\mathbf{p}_n\mid\mathbf{Z}_{c_n}^*,\mathbf{Y}_n)$, for each $t=1,\ldots,n$ the item $t$ is assigned either to a new cluster $C_{c_{n,-t}+1,-t}$ (that is, one empty before $t$ is assigned) with probability

$$\frac{\pi\big(\tilde{\mathbf{p}}_{c_{n,-t}+1}\big)\int_{\mathcal{Z}}L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_{c_{n,-t}+1}^*,\tilde{\mathbf{p}}_{c_{n,-t}+1}\big\}\big)H\big(dZ_{c_{n,-t}+1,-t}\big)}{\pi\big(\tilde{\mathbf{p}}_{c_{n,-t}+1}\big)\int_{\mathcal{Z}}L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_{c_{n,-t}+1}^*,\tilde{\mathbf{p}}_{c_{n,-t}+1}\big\}\big)H\big(dZ_{c_{n,-t}+1,-t}\big)+\sum_{j=1}^{c_{n,-t}}\pi\big(\tilde{\mathbf{p}}_j\big)L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_j^*,\tilde{\mathbf{p}}_j\big\}\big)} \quad(4.1)$$

or to an existing cluster, $C_{j,-t}$ for $j=1,\ldots,c_{n,-t}$, with probability

$$\frac{\pi\big(\tilde{\mathbf{p}}_j\big)L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_j^*,\tilde{\mathbf{p}}_j\big\}\big)}{\pi\big(\tilde{\mathbf{p}}_{c_{n,-t}+1}\big)\int_{\mathcal{Z}}L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_{c_{n,-t}+1}^*,\tilde{\mathbf{p}}_{c_{n,-t}+1}\big\}\big)H\big(dZ_{c_{n,-t}+1,-t}\big)+\sum_{j=1}^{c_{n,-t}}\pi\big(\tilde{\mathbf{p}}_j\big)L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_j^*,\tilde{\mathbf{p}}_j\big\}\big)}, \quad(4.2)$$

where

$$\tilde{\mathbf{p}}_j=\mathbf{p}_{n,-t}\cup\big\{t\in C_{j,-t}\big\},\qquad\tilde{\mathbf{Z}}_j^*=\big\{\mathbf{Z}_{c_{n,-t},-t}^*,Z_t=Z_{j,-t}^*\big\}, \quad(4.3)$$

for $j=1,\ldots,c_{n,-t}+1$.
In addition, if $j=c_{n,-t}+1$ and a new cluster is selected, a sample of $Z_{c_{n,-t}+1,-t}^*$ is drawn from

$$\frac{L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_{c_{n,-t}+1}^*,\tilde{\mathbf{p}}_{c_{n,-t}+1}\big\}\big)H\big(dZ_{c_{n,-t}+1,-t}^*\big)}{\int_{\mathcal{Z}}L\big(\mathbf{Y}_n\mid\big\{\tilde{\mathbf{Z}}_{c_{n,-t}+1}^*,\tilde{\mathbf{p}}_{c_{n,-t}+1}\big\}\big)H\big(dZ_{c_{n,-t}+1,-t}^*\big)} \quad(4.4)$$

for the next iteration.

To generate from $\pi(\mathbf{Z}_{c_n}^*\mid\mathbf{p}_n,\mathbf{Y}_n)$, for $j=1,\ldots,c_n$ generate $Z_j^*$ given $\{Z_1^*,\ldots,Z_{c_n}^*\}\setminus\{Z_j^*\}$, $\mathbf{p}_n$, and $\mathbf{Y}_n$ from the conditional distribution

$$\pi\big(dZ_j^*\mid\mathbf{Z}_{c_n}^*\setminus\big\{Z_j^*\big\},\mathbf{p}_n,\mathbf{Y}_n\big)=\frac{L\big(\mathbf{Y}_n\mid\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)H\big(dZ_j^*\big)}{\int_{\mathcal{Z}}L\big(\mathbf{Y}_n\mid\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)H\big(dZ_j^*\big)}. \quad(4.5)$$

This step uses the standard Metropolis-Hastings algorithm to draw the posterior samples $Z_j^*$. Precisely, (4.5) is given by

$$\pi\big(dZ_j^*\mid\mathbf{Z}_{c_n}^*\setminus\big\{Z_j^*\big\},\mathbf{p}_n,\mathbf{Y}_n\big)=\frac{\prod_{t=1}^{n}\phi\big(Y_t\mid 0,\sigma_t^2\big(\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)\big)H\big(dZ_j^*\big)}{\int_{\mathcal{Z}}\prod_{t=1}^{n}\phi\big(Y_t\mid 0,\sigma_t^2\big(\big\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big\}\big)\big)H\big(dZ_j^*\big)}. \quad(4.6)$$

In (4.6) the elements of the sequence $\{Z_1^*,\ldots,Z_{c_n}^*\}$, conditional on $\mathbf{p}_n$ and $\mathbf{Y}_n$, are no longer independent, and they must be sampled individually, each conditional on the remaining elements.

We note that a special case of the above algorithm can be found for independent data under the normal mixture models of West et al. [29] (see also [17]). Taking $\chi=0$ and $\psi=0$ in (2.2) yields $Z_t=\nu_t=\sigma_t^2$ for $t=1,\ldots,n$ with distribution $G$. Let $G$ be a Dirichlet process with parameter $\theta H$. Then (4.1) and (4.2) become

$$\frac{\theta\int_{\mathcal{Z}}\phi\big(Y_t\mid 0,Z\big)H(dZ)}{\theta\int_{\mathcal{Z}}\phi\big(Y_t\mid 0,Z\big)H(dZ)+\sum_{j=1}^{c_{n,-t}}e_{j,-t}\,\phi\big(Y_t\mid 0,Z_{j,-t}^*\big)},\qquad\frac{e_{j,-t}\,\phi\big(Y_t\mid 0,Z_{j,-t}^*\big)}{\theta\int_{\mathcal{Z}}\phi\big(Y_t\mid 0,Z\big)H(dZ)+\sum_{j=1}^{c_{n,-t}}e_{j,-t}\,\phi\big(Y_t\mid 0,Z_{j,-t}^*\big)}. \quad(4.7)$$

Furthermore, the joint distribution of $\{Z_1^*,\ldots,Z_{c_n}^*\}$ conditional on $\mathbf{p}_n$ and $\mathbf{Y}_n$ is given by

$$\prod_{j=1}^{c_n}\prod_{t\in C_j}\phi\big(Y_t\mid 0,Z_j^*\big)H\big(dZ_j^*\big). \quad(4.8)$$

In this case $\{Z_1^*,\ldots,Z_{c_n}^*\}$ are independent in both the prior, $\pi(\mathbf{Z}_{c_n}^*\mid\mathbf{p}_n)$, and the posterior, $\pi(\mathbf{Z}_{c_n}^*\mid\mathbf{p}_n,\mathbf{Y}_n)$. However, this is not true in the more general dynamic GARCH model we consider.
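For this special case the allocation probabilities (4.7) are directly computable once $H$ admits the integral $\int\phi(Y_t\mid 0,Z)H(dZ)$ in closed form. The sketch below assumes, for illustration, an inverse-gamma $H$ with shape $a$ and scale $b$ (a common conjugate choice for a normal variance, not one prescribed by the paper), for which the integral is a Student-$t$ type density; the function names are our own.

```python
import math

def marginal(y, a, b):
    """m(y) = integral of phi(y | 0, z) H(dz) for H = inverse-gamma(a, b)
    (shape a, scale b): b^a Gamma(a + 1/2) /
    (Gamma(a) sqrt(2 pi) (b + y^2/2)^(a + 1/2))."""
    return math.exp(
        a * math.log(b) + math.lgamma(a + 0.5) - math.lgamma(a)
        - 0.5 * math.log(2.0 * math.pi)
        - (a + 0.5) * math.log(b + 0.5 * y * y)
    )

def norm_pdf(y, var):
    """phi(y | 0, var)."""
    return math.exp(-0.5 * y * y / var) / math.sqrt(2.0 * math.pi * var)

def reassignment_probs(y_t, cluster_sizes, cluster_vars, theta, a, b):
    """Allocation probabilities (4.7) for item t: the first entry is the
    new-cluster probability, the rest follow the existing clusters."""
    weights = [theta * marginal(y_t, a, b)]
    weights += [e * norm_pdf(y_t, z)
                for e, z in zip(cluster_sizes, cluster_vars)]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative state: two clusters of sizes 5 and 3 with variances 0.8 and 4.
probs = reassignment_probs(1.2, [5, 3], [0.8, 4.0], theta=1.0, a=2.0, b=2.0)
```

The same structure carries over to the GARCH mixture, except that the closed-form `marginal` is replaced by the integral in (4.1), which generally needs numerical or Monte Carlo evaluation.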

Usually, the parameters of interest are both the volatility sequence $\{\sigma_1,\ldots,\sigma_n\}$ and the predictive density $\int_{\mathcal{Z}}k(Y_{n+1}\mid\mathbf{Y}_n,\mathbf{Z}_n,Z_{n+1})\,G(dZ_{n+1})$. These two sets of parameters are functions of $\mathbf{Z}_n$ under the mixture of GARCH(1,1) models, and the Bayesian estimators are taken to be the posterior expectations outlined in Section 2. That is, writing the volatility as the vector $\sigma_n(\mathbf{Z}_n)=\{\sigma_1(\mathbf{Z}_n),\ldots,\sigma_n(\mathbf{Z}_n)\}$,

$$E\big[\sigma_n\big(\mathbf{Z}_n\big)\mid\mathbf{Y}_n\big]=\sum_{\mathbf{p}_n}\int_{\mathcal{Z}}\cdots\int_{\mathcal{Z}}\sigma_n\big(\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big)\,\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\mid\mathbf{Y}_n\big),$$

$$E\bigg[\int_{\mathcal{Z}}k\big(Y_{n+1}\mid\mathbf{Y}_n,Z_{n+1},\mathbf{Z}_n\big)G\big(dZ_{n+1}\big)\mid\mathbf{Y}_n\bigg]=\sum_{\mathbf{p}_n}\int_{\mathcal{Z}}\cdots\int_{\mathcal{Z}}\int_{\mathcal{Z}}k\big(Y_{n+1}\mid\mathbf{Y}_n,Z_{n+1},\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big)\,\pi\big(dZ_{n+1}\mid\mathbf{Z}_{c_n}^*,\mathbf{p}_n\big)\,\pi\big(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\mid\mathbf{Y}_n\big), \quad(4.9)$$

where $\pi(d\mathbf{Z}_{c_n}^*,\mathbf{p}_n\mid\mathbf{Y}_n)$ denotes the posterior distribution of $\mathbf{Z}_n=\{\mathbf{Z}_{c_n}^*,\mathbf{p}_n\}$ and $\pi(dZ_{n+1}\mid\mathbf{Z}_{c_n}^*,\mathbf{p}_n)$ represents the Pólya urn predictive density of $Z_{n+1}$ given $\mathbf{Z}_n$.

5. Alternative Algorithmic Estimation Procedures

We now outline how the algorithms of Walker [20] and Papaspiliopoulos and Roberts [21] may be applied to our GARCH(1,1) model mixed over the PKSS process. First, consider the approach in Walker [20]. Beginning with (3.1), the weights can be written as

$$W_i=\int_{(0,W_i)}du=W_i\int_{(0,\infty)}W_i^{-1}\mathbb{I}_{\{(0,W_i)\}}(u)\,du=W_i\int_{(0,\infty)}U\big(u\mid 0,W_i\big)\,du, \quad(5.1)$$

where $\mathbb{I}_{\{(0,W_i)\}}(u)$ denotes the indicator function with $\mathbb{I}_{\{(0,W_i)\}}(u)=1$ if $0<u<W_i$ and $\mathbb{I}_{\{(0,W_i)\}}(u)=0$ otherwise, and $U(u\mid 0,W_i)$ represents the uniform density on the interval $(0,W_i)$. Then, substituting (5.1), but without the integral over $u$, into (3.1), we obtain the joint measure

$$G(dz,du)=\sum_{i=1}^{\infty}W_i\,U\big(u\mid 0,W_i\big)\,du\,\delta_{Z_i}(dz). \quad(5.2)$$

Furthermore, we can introduce classification variables $\{\delta_1,\ldots,\delta_n\}$ to indicate which atoms $\{Z_{\delta_1},\ldots,Z_{\delta_n}\}$ are taken from the measure. The classification variables take values in the integers $\{1,2,\ldots\}$ and assign a configuration to model (2.2) so that the likelihood takes a simpler form, without the product of sums. Combining (5.2) with model (2.2) yields
\[
L\left(\mathbf{Y}_n, \mathbf{u}_n, \boldsymbol{\delta}_n \mid \mathbf{Z}, \mathbf{W}\right) = \prod_{t=1}^{n} W_{\delta_t}\, U\!\left(u_t \mid 0, W_{\delta_t}\right) \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right), \tag{5.3}
\]
where $\boldsymbol{\delta}_n = \{\delta_1,\ldots,\delta_n\}$ and $\mathbf{Z}_{\boldsymbol{\delta}_t} = \{Z_{\delta_1},\ldots,Z_{\delta_t}\}$ for $t=1,\ldots,n$. Here the random jumps that build up the random measure $G$ in (5.2) can be reexpressed as $W_1 \overset{d}{=} V_1$ and $W_i \overset{d}{=} V_i \prod_{j=1}^{i-1}(1-V_j)$ for $i = 2,3,\ldots$. This is called the stick-breaking representation. Unfortunately, among the processes considered here, this representation is currently available in explicit form only for the Poisson-Dirichlet $(\alpha, \theta)$ process, for which the $V_i$ are independent $\mathrm{Beta}(1-\alpha, \theta+i\alpha)$ random variables. Further development will be required to fully utilise the approach of Walker [20] for the PKSS process in general.
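The stick-breaking weights of the Poisson-Dirichlet $(\alpha, \theta)$ process can be simulated directly. The sketch below truncates the infinite sequence at a fixed number of atoms; the parameter values match those used for the PD process in Section 6, but the truncation level and seed are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(42)

def pd_stick_breaking(n_atoms, alpha, theta):
    # First n_atoms weights of the Poisson-Dirichlet (alpha, theta)
    # stick-breaking construction: V_i ~ Beta(1 - alpha, theta + i*alpha),
    # W_1 = V_1, W_i = V_i * prod_{j<i} (1 - V_j).
    i = np.arange(1, n_atoms + 1)
    v = rng.beta(1.0 - alpha, theta + alpha * i)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining

# Parameter values as used for the PD process in Section 6.
w = pd_stick_breaking(2000, alpha=0.5, theta=0.6769)
```

Setting `alpha = 0` recovers the Dirichlet process weights, with $V_i \sim \mathrm{Beta}(1, \theta)$.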

The likelihood (5.3) can then be written as
\[
L\left(\mathbf{Y}_n, \mathbf{u}_n, \boldsymbol{\delta}_n \mid \mathbf{Z}, \mathbf{V}\right) = \prod_{t=1}^{n} W_{\delta_t}(\mathbf{V})\, U\!\left(u_t \mid 0, W_{\delta_t}(\mathbf{V})\right) \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right), \tag{5.4}
\]
and MCMC algorithms for sampling $\mathbf{u}_n$, $\boldsymbol{\delta}_n$, and $\mathbf{V}$ are straightforward and already given in Walker ([20], Section 3). Completing the algorithm for our model requires sampling $\mathbf{Z}$. This can be achieved by sampling $Z_j$, for each $j$ such that $\delta_t = j$ for some $t$, from
\[
\frac{\prod_{t=1}^{n} \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right) H(dZ_j)}{\int_{\mathcal{Z}} \prod_{t=1}^{n} \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right) H(dZ_j)}, \tag{5.5}
\]
and otherwise sampling $Z_j$ from $H(dz)$ if no $\delta_t$ equals $j$. Notice that $\mathbf{Z}$ contains infinitely many $Z_j$s, but at most $n$ of them need to be sampled, and the number of sampled $Z_j$s varies over iterations (see [20], Section 3, for details).
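Because the GARCH kernel is not conjugate to $H$, the conditional (5.5) for an occupied $Z_j$ has no closed form. One standard option, which the paper does not prescribe, is a random-walk Metropolis step inside the Gibbs sweep. In the sketch below the user supplies the map from candidate parameters to the $\sigma_t^2$ path (which in the full model also depends on the other clusters' parameters) and the log prior density of $H$; both names are ours.

```python
import numpy as np

rng = np.random.default_rng(7)

def log_target(z, y, log_prior, volatility):
    # Unnormalised log of (5.5): the Gaussian likelihood
    # prod_t phi(Y_t | 0, sigma_t^2) times the prior density of H,
    # with sigma_t^2 computed from the candidate parameters z via the
    # user-supplied volatility map.
    lp = log_prior(z)
    if not np.isfinite(lp):
        return -np.inf  # candidate outside the prior support
    sig2 = volatility(z, y)
    return lp - 0.5 * np.sum(np.log(2.0 * np.pi * sig2) + y ** 2 / sig2)

def metropolis_update(z, y, log_prior, volatility, step=0.05):
    # One random-walk Metropolis step on the cluster parameters z_j.
    prop = z + step * rng.standard_normal(len(z))
    log_ratio = (log_target(prop, y, log_prior, volatility)
                 - log_target(z, y, log_prior, volatility))
    return prop if np.log(rng.uniform()) < log_ratio else z
```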

Papaspiliopoulos and Roberts [21] suggest an approach similar to Walker [20]. Consider the classification variables $\{\delta_1,\ldots,\delta_n\}$ and the stick-breaking representation of $\{W_1, W_2, \ldots\}$ in terms of $\{V_1, V_2, \ldots\}$ defined above. Then the likelihood is immediately given by
\[
L\left(\mathbf{Y}_n, \boldsymbol{\delta}_n \mid \mathbf{Z}, \mathbf{V}\right) = \prod_{t=1}^{n} W_{\delta_t}(\mathbf{V})\, \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right). \tag{5.6}
\]
The most challenging task is the reallocation of the $n$ observations over the infinitely many components in (3.1), equivalent to sampling the classification variables $\{\delta_1,\ldots,\delta_n\}$ over the MCMC iterations. Here we briefly discuss this task as it involves the variance $\sigma_t^2(\mathbf{Z}_{\boldsymbol{\delta}_t})$ in our model (see also Papaspiliopoulos and Roberts [21, Section 3]). Let
\[
\boldsymbol{\delta}_n^{(i,j)} = \left\{\delta_1, \ldots, \delta_{i-1}, j, \delta_{i+1}, \ldots, \delta_n\right\} \tag{5.7}
\]
be the vector produced from $\boldsymbol{\delta}_n$ by replacing the $i$th element with $j$. This is a proposed move from $\boldsymbol{\delta}_n$ to $\boldsymbol{\delta}_n^{(i,j)}$, where $j = 1, 2, \ldots$. Notice that it is not possible to consider an infinite number of $Z_j$s directly, since only a finite number are available. Instead, we can employ a Metropolis-Hastings sampler whose proposal probability mass function requires only a finite number of $Z_j$s. The probabilities for the proposed moves are given by
\[
q_n(i,j) = \frac{W_j}{C\left(\boldsymbol{\delta}_n\right)} \times
\begin{cases}
\displaystyle \prod_{t=1}^{n} \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right) \Big|_{\delta_i = j}, & j = 1, \ldots, \max\{\delta_1,\ldots,\delta_n\}, \\[6pt]
M\left(\boldsymbol{\delta}_n\right), & j > \max\{\delta_1,\ldots,\delta_n\},
\end{cases} \tag{5.8}
\]
where $M(\boldsymbol{\delta}_n) = \max_{j=1,\ldots,\max\{\delta_1,\ldots,\delta_n\}} \prod_{t=1}^{n} \phi(Y_t \mid 0, \sigma_t^2(\mathbf{Z}_{\boldsymbol{\delta}_t})) \big|_{\delta_i=j}$ and the normalising constant is given by $C(\boldsymbol{\delta}_n) = \sum_{j=1}^{\max\{\delta_1,\ldots,\delta_n\}} W_j \prod_{t=1}^{n} \phi(Y_t \mid 0, \sigma_t^2(\mathbf{Z}_{\boldsymbol{\delta}_t})) \big|_{\delta_i=j} + \big(1 - \sum_{j=1}^{\max\{\delta_1,\ldots,\delta_n\}} W_j\big) M(\boldsymbol{\delta}_n)$.
Then simulate a Uniform$(0,1)$ random variable $U_i$ and propose the move to $\boldsymbol{\delta}_n^{(i,j)} = \{\delta_1, \ldots, \delta_{i-1}, j, \delta_{i+1}, \ldots, \delta_n\}$ for the $j$ satisfying $\sum_{\ell=0}^{j-1} q_n(i,\ell) < U_i \leq \sum_{\ell=1}^{j} q_n(i,\ell)$, where $q_n(i,0) = 0$. The acceptance probability for this Metropolis-Hastings step is given by
\[
\alpha\left(\boldsymbol{\delta}_n, \boldsymbol{\delta}_n^{(i,j)}\right) =
\begin{cases}
1, & \text{if } j \leq \max\{\delta_1,\ldots,\delta_n\} \text{ and } \max\{\boldsymbol{\delta}_n^{(i,j)}\} = \max\{\delta_1,\ldots,\delta_n\}, \\[6pt]
\min\left\{1, \dfrac{C\left(\boldsymbol{\delta}_n\right)}{C\left(\boldsymbol{\delta}_n^{(i,j)}\right)} \dfrac{M\left(\boldsymbol{\delta}_n^{(i,j)}\right)}{\prod_{t=1}^{n} \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right)}\right\}, & \text{if } j \leq \max\{\delta_1,\ldots,\delta_n\} \text{ and } \max\{\boldsymbol{\delta}_n^{(i,j)}\} < \max\{\delta_1,\ldots,\delta_n\}, \\[6pt]
\min\left\{1, \dfrac{C\left(\boldsymbol{\delta}_n\right)}{C\left(\boldsymbol{\delta}_n^{(i,j)}\right)} \dfrac{\prod_{t=1}^{n} \phi\!\left(Y_t \mid 0, \sigma_t^2\!\left(\mathbf{Z}_{\boldsymbol{\delta}_t}\right)\right)\big|_{\delta_i=j}}{M\left(\boldsymbol{\delta}_n\right)}\right\}, & \text{if } j > \max\{\delta_1,\ldots,\delta_n\}.
\end{cases} \tag{5.9}
\]
This completes the task of sampling $\boldsymbol{\delta}_n$. Finally, as in Walker [20], sampling $\mathbf{Z}$ only requires (5.5). That is, for each $j$ such that $\delta_t = j$ for some $t$, sample $Z_j$ from (5.5); otherwise sample $Z_j$ from $H(dz)$.
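The inverse-CDF selection of the proposed label $j$ from $q_n(i,\cdot)$ can be sketched as follows. The `residual_mass` argument is our stand-in for the lumped probability $(1 - \sum_j W_j)\,M(\boldsymbol{\delta}_n)$ assigned to labels beyond the current maximum; in the full algorithm that mass would be unpacked retrospectively by simulating further stick-breaking weights.

```python
import numpy as np

rng = np.random.default_rng(3)

def propose_label(q_existing, residual_mass):
    # q_existing[j-1] is proportional to q_n(i, j) for the currently
    # occupied labels j = 1, ..., K; residual_mass stands in for the
    # lumped probability of all labels j > K.  Returns a 1-based label,
    # with K + 1 meaning "propose a brand-new component".
    probs = np.append(q_existing, residual_mass)
    probs = probs / probs.sum()  # plays the role of C(delta_n)
    u = rng.uniform()
    j = int(np.searchsorted(np.cumsum(probs), u))
    return min(j, len(probs) - 1) + 1  # clamp guards float round-off
```

For example, with `q_existing = [0.6, 0.2]` and `residual_mass = 0.2`, labels 1, 2, and 3 are proposed with probabilities 0.6, 0.2, and 0.2; the Metropolis-Hastings ratio (5.9) then decides whether the move is accepted.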

6. Application to the Standard & Poorโ€™s 500 Financial Index

The methodology is illustrated on the daily logarithmic returns of the S&P500 (Standard & Poor's 500) financial index from 3 January 2006 to 31 December 2009. The data comprise a total of 1007 trading days and are available from Yahoo Finance. The log return is defined as $Y_t = 100(\ln I_t - \ln I_{t-1})$, where $I_t$ is the index at time $t$. The algorithm described in Section 4 is used to estimate the nonparametric mixture models.
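The log-return transformation is straightforward to compute; a minimal sketch:

```python
import numpy as np

def log_returns(index):
    # Y_t = 100 * (ln I_t - ln I_{t-1}) for a price/index series I_t.
    index = np.asarray(index, dtype=float)
    return 100.0 * np.diff(np.log(index))
```

Applied to a series of 1008 index levels this yields the 1007 daily returns analysed below.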

To compare the three different mixture models of Section 3, $G$ is taken in turn to be the Dirichlet process, the PD process, and the NGG process. In each case the mean measure, denoted $H$, is a Gamma-Dirichlet distribution given by $H(d\nu, d\chi, d\psi) = H_1(d\nu) H_2(d\chi, d\psi)$, where $H_1(d\nu)$ is the Gamma$(1,1)$ distribution and $H_2(d\chi, d\psi)$ is the Dirichlet$(1,1,1)$ distribution. We set the parameters of each process so that the variance of each process, evaluated over the same measure, is equal. This results in $\theta = 2.3538$ for the Dirichlet process, $\theta = 0.6769$ and $\alpha = 1/2$ for the PD process, and $\beta = 1$ and $\alpha = 1/2$ for the NGG process. We also compare the results with a no-mixture GARCH(1,1) model in which the parameters $(\nu, \chi, \psi)$ have prior distribution $H_1(d\nu) H_2(d\chi, d\psi)$. We initialise the MCMC algorithm with the partition that separates all integers, that is, $\mathbf{p}_n = \{C_1 = \{1\}, C_2 = \{2\}, \ldots, C_n = \{n\}\}$. We run the MCMC algorithm for 20,000 iterations, of which the first 10,000 are discarded. The last 10,000 iterations are treated as a sample from the posterior density of $\{\mathbf{Z}^*_n, \mathbf{p}_n\}$.

Figure 1 contains the volatility estimates (fitted data) for the no-mixture model, the Dirichlet process, the PD process, and the NGG process. The no-mixture model, the Dirichlet process, and the PD process appear to give similar results. The NGG process, however, is easy to distinguish from the other models: its volatility estimates better fit the observed spikes in the data. Figure 2 presents the predictive densities for each model. Again, the no-mixture model, the Dirichlet process, and the PD process give similar predictive density estimates, in the sense that the distribution tails are all similar. However, the NGG process model estimates a predictive density with substantially wider tails than the other three models. Figures 1 and 2 suggest that the Dirichlet and PD processes allocate fewer clusters and treat the periods of increased volatility as outliers within the data. The NGG process, on the other hand, allocates more clusters and incorporates the periods of increased volatility directly into its predictive density.

Finally, we evaluate goodness of fit in terms of marginal likelihoods. The logarithms of the marginal likelihoods of the no-mixture model, the Dirichlet process model, the PD process model, and the NGG process model are −1578.085, −1492.086, −1446.275, and −1442.269, respectively. Under the marginal likelihood criterion all three mixture models outperform the no-mixture GARCH(1,1) model. Further, the NGG process outperforms the PD process, which in turn outperforms the Dirichlet process model proposed in Lau and So [9]. These results suggest that generalisations of the Dirichlet process mixture model should be further investigated for time-dependent data.

7. Conclusion

In this paper we have extended nonparametric mixture modelling for GARCH models to the Poisson-Kingman process. This process includes the previously applied Dirichlet process as well as the Poisson-Dirichlet and Normalised Generalised Gamma processes. The latter two provide richer clustering structures than the Dirichlet process and had not previously been adapted to time series data. An application to the S&P500 financial index suggests that these more general random probability measures can outperform the Dirichlet process. Finally, we developed an MCMC algorithm that is easy to implement, which we hope will facilitate further investigation into the application of nonparametric mixture models to time series data.


  1. R. F. Engle, โ€œAutoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation,โ€ Econometrica, vol. 50, no. 4, pp. 987โ€“1007, 1982. View at: Google Scholar
  2. T. Bollerslev, โ€œGeneralized autoregressive conditional heteroskedasticity,โ€ Journal of Econometrics, vol. 31, no. 3, pp. 307โ€“327, 1986. View at: Google Scholar
  3. L. Bauwens, A. Preminger, and J. V. K. Rombouts, โ€œTheory and inference for a Markov switching GARCH model,โ€ Econometrics Journal, vol. 13, no. 2, pp. 218โ€“244, 2010. View at: Publisher Site | Google Scholar
  4. Z. He and J. M. Maheu, โ€œReal time detection of structural breaks in GARCH models,โ€ Computational Statistics and Data Analysis, vol. 54, no. 11, pp. 2628โ€“2640, 2010. View at: Publisher Site | Google Scholar
  5. R. T. Baillie and C. Morana, โ€œModelling long memory and structural breaks in conditional variances: an adaptive FIGARCH approach,โ€ Journal of Economic Dynamics and Control, vol. 33, no. 8, pp. 1577โ€“1592, 2009. View at: Publisher Site | Google Scholar
  6. M. Haas, S. Mittnik, and M. Paolella, “Mixed normal conditional heteroskedasticity,” Journal of Financial Econometrics, vol. 2, pp. 211–250, 2004. View at: Google Scholar
  7. M. Haas, S. Mittnik, and M. Paolella, “A new approach to Markov-switching GARCH models,” Journal of Financial Econometrics, vol. 2, pp. 493–530, 2004. View at: Google Scholar
  8. S. Kaufmann and S. Frühwirth-Schnatter, “Bayesian analysis of switching ARCH models,” Journal of Time Series Analysis, vol. 23, no. 4, pp. 425–458, 2002. View at: Publisher Site | Google Scholar
  9. J. W. Lau and M. K. P. So, โ€œA Monte Carlo Markov chain algorithm for a class of mixture time series models,โ€ Statistics and Computing, vol. 21, no. 1, pp. 69โ€“81, 2011. View at: Publisher Site | Google Scholar
  10. M. J. Jensen and J. M. Maheu, โ€œBayesian semiparametric stochastic volatility modeling,โ€ Journal of Econometrics, vol. 157, no. 2, pp. 306โ€“316, 2010. View at: Publisher Site | Google Scholar
  11. J. E. Griffin, โ€œInference in infinite superpositions of Non-Gaussian Ornstein-Uhlenbeck processes using Bayesian nonparametic methods,โ€ Journal of Financial Econometrics, vol. 9, no. 3, Article ID nbq027, pp. 519โ€“549, 2011. View at: Publisher Site | Google Scholar
  12. T. S. Ferguson, โ€œA Bayesian analysis of some nonparametric problems,โ€ Annals of Statistics, vol. 1, pp. 209โ€“230, 1973. View at: Google Scholar
  13. A. Y. Lo, โ€œOn a class of Bayesian nonparametric estimates. I. Density estimates,โ€ Annals of Statistics, vol. 12, no. 1, pp. 351โ€“357, 1984. View at: Google Scholar
  14. J. Pitman and M. Yor, โ€œThe two-parameter Poisson-Dirichlet distribution derived from a stable subordinator,โ€ Annals of Probability, vol. 25, no. 2, pp. 855โ€“900, 1997. View at: Google Scholar
  15. J. Pitman, โ€œPoisson-Kingman partitions,โ€ in Statistics and Science: A Festschrift for Terry Speed, vol. 40 of IMS Lecture Notes Monogr, pp. 1โ€“34, Beachwood, Ohio, USA, 2003. View at: Google Scholar
  16. J. F. C. Kingman, “Random discrete distributions,” Journal of the Royal Statistical Society, Series B, vol. 37, pp. 1–22, 1975. View at: Google Scholar
  17. L. F. James, โ€œPoisson process partition calculus with applications to exchangeable models and Bayesian nonparametrics,โ€ View at: Google Scholar
  18. L. F. James, “Bayesian Poisson process partition calculus with an application to Bayesian Lévy moving averages,” Annals of Statistics, vol. 33, no. 4, pp. 1771–1799, 2005. View at: Publisher Site | Google Scholar
  19. A. Lijoi, I. Prünster, and S. G. Walker, โ€œBayesian nonparametric estimators derived from conditional Gibbs structures,โ€ Annals of Applied Probability, vol. 18, no. 4, pp. 1519โ€“1547, 2008. View at: Publisher Site | Google Scholar
  20. S. G. Walker, โ€œSampling the Dirichlet mixture model with slices,โ€ Communications in Statistics, vol. 36, no. 1, pp. 45โ€“54, 2007. View at: Publisher Site | Google Scholar
  21. O. Papaspiliopoulos and G. O. Roberts, โ€œRetrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models,โ€ Biometrika, vol. 95, no. 1, pp. 169โ€“186, 2008. View at: Publisher Site | Google Scholar
  22. A. Lijoi, I. Prünster, and S. G. Walker, โ€œInvestigating nonparametric priors with Gibbs structure,โ€ Statistica Sinica, vol. 18, no. 4, pp. 1653โ€“1668, 2008. View at: Google Scholar
  23. P. Embrechts, C. Klüppelberg, and T. Mikosch, Modelling Extremal Events for Insurance and Finance, Springer, 1997.
  24. Z. Zhang, W. K. Li, and K. C. Yuen, โ€œOn a mixture GARCH time-series model,โ€ Journal of Time Series Analysis, vol. 27, no. 4, pp. 577โ€“597, 2006. View at: Publisher Site | Google Scholar
  25. W. Vervaat, โ€œOn a stochastic difference equation and a representation of nonnegative infinitely divisible random variables,โ€ Advances in Applied Probability, vol. 11, pp. 750โ€“783, 1979. View at: Publisher Site | Google Scholar
  26. A. Brandt, โ€œThe stochastic equation Yn+1=AnYn+Bn with stationary coefficients,โ€ Advances in Applied Probability, vol. 18, no. 1, pp. 211โ€“220, 1986. View at: Publisher Site | Google Scholar | Zentralblatt MATH
  27. J. K. Ghosh and R. V. Ramamoorthi, Bayesian Nonparametrics, Springer Series in Statistics, Springer, New York, NY, USA, 2003.
  28. J. Pitman, Combinatorial Stochastic Processes, vol. 1875 of Lecture Notes in Mathematics, Springer, Berlin, Germany, 2006. Lectures from the 32nd Summer School on Probability Theory held in Saint-Flour, July 7–24, 2002, with a foreword by Jean Picard.
  29. M. West, P. Müller, and M. D. Escobar, โ€œHierarchical priors and mixture models, with applications in regression and density estimation,โ€ in A Tribute to D. V. Lindley, pp. 363โ€“386, John Wiley & Sons, New York, NY, USA, 1994. View at: Google Scholar

Copyright © 2012 John W. Lau and Ed Cripps. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
