Mathematical Problems in Engineering

Volume 2017 (2017), Article ID 7892507, 10 pages

https://doi.org/10.1155/2017/7892507

## Estimation of Poisson-Dirichlet Parameters with Monotone Missing Data

^{1}Department of Statistics and Actuarial Science, East China Normal University, Shanghai, China
^{2}China Pacific Property Insurance Co., Ltd., Shanghai, China

Correspondence should be addressed to Xueqin Zhou; xueqinzhou@163.com

Received 22 March 2017; Accepted 29 August 2017; Published 12 October 2017

Academic Editor: Giuseppe Vairo

Copyright © 2017 Xueqin Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

This article considers the estimation of the unknown numerical parameters and of the density of the base measure in a Poisson-Dirichlet process prior with grouped monotone missing data. The numerical parameters are estimated by maximum likelihood, and the density function is estimated by the kernel method. A simulation study shows that the estimates perform well.

#### 1. Introduction

As a young but fast-growing field of statistics, Bayesian nonparametrics (abbreviated as BNP below) focuses on Bayesian solutions of nonparametric and other infinite-dimensional statistical models. Compared with frequentist statistics and classical Bayesian statistics, BNP provides highly flexible and robust models for infinite-dimensional parameter spaces. The most extensively investigated priors in BNP include the famous Dirichlet process prior [1] and the Polya tree process prior [2–5], which have played fundamental roles in the development of Bayesian nonparametrics.

Dirichlet processes are also referred to as one-parameter Poisson-Dirichlet processes by Kingman [6]. As the first and foremost generalization of Dirichlet processes, two-parameter Poisson-Dirichlet processes (abbreviated as Poisson-Dirichlet processes below) were first discussed by Pitman and Yor [7] and have since seen great success in Bayesian nonparametric modeling in language, images, ecology, biology, genomics, and so on. Remarkable examples include the following: Goldwater et al. [8] used a Poisson-Dirichlet process as an adaptor to justify the appearance of type frequencies in formal analyses of natural language and improved the performance of an earlier model for unsupervised learning of morphology; Sudderth and Jordan [9] modeled object frequencies and segment sizes by Poisson-Dirichlet processes and developed a statistical framework for the simultaneous, unsupervised segmentation and discovery of visual object categories from image databases; Favaro et al. [10] used a Poisson-Dirichlet model to address prediction within species sampling problems; and Hoshino [11] studied microdata disclosure risk assessment with Pitman’s sampling formula, clarified some of its theoretical implications, and compared various models based on the Akaike Information Criterion by applying them to real data sets. For more references on the application of Poisson-Dirichlet processes to language learning, see Johnson et al. [12], Wood and Teh [13], and Wallach et al. [14].

While exact Bayesian methods assume that prior distributions are completely specified, empirical Bayesian methods deal with situations where prior distributions are at most partially specified and thus need to be estimated. Empirical Bayesian methods for parametric and semiparametric models have been investigated in a huge volume of literature. However, the study of empirical Bayesian methods in the framework of Bayesian nonparametrics is quite limited. A recently published paper is that by Yang and Wu [15], who studied the estimation of the prior with monotone missing data when the prior is a Dirichlet process.

In this paper, we aim at estimating the unknown numerical parameters and the density of the base measure from independent and identically distributed (i.i.d.) groups of observations under a Poisson-Dirichlet process prior. Because the Dirichlet process prior is a special case of the Poisson-Dirichlet process prior, we in fact extend the methodology of Yang and Wu [15] to a larger model. The estimation of the unknown parameters is carried out by two different methods, the maximum likelihood method and a naive method proposed by Carlton [16], whose performances are compared in a simulation study.

Because there are two numerical parameters in Poisson-Dirichlet process priors, the maximum likelihood estimates (MLEs) of the unknown parameters are discussed under three different settings (see the next section for the definition of the discount parameter, the concentration parameter, and the density function): (i) the discount parameter is unknown but the concentration parameter is known; (ii) the concentration parameter is unknown but the discount parameter is known; and (iii) both parameters are unknown. Favaro et al. [10] gave the empirical Bayes estimates when both parameters were unknown on complete data without missingness. The estimates of Favaro et al. [10] and ours are compared under the same sample size in terms of bias, standard deviation (SD), and mean squared error (MSE).
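As an illustration of setting (i), the likelihood of the numerical parameters given the block sizes of the distinct values in a sample follows Pitman’s sampling formula. The sketch below is ours, not the paper’s estimator: it assumes a complete sample without missingness and uses a simple grid search; the function names are illustrative.

```python
import math

def pd_log_likelihood(sigma, theta, block_sizes):
    """Log of Pitman's sampling formula for a partition with the given
    block sizes under a PD(sigma, theta) prior, 0 <= sigma < 1, theta > 0:
    prod_{i=1}^{k-1}(theta + i*sigma) / (theta+1)_{n-1} * prod_j (1-sigma)_{n_j-1},
    with rising factorials written via log-gamma functions."""
    n = sum(block_sizes)
    k = len(block_sizes)
    ll = sum(math.log(theta + i * sigma) for i in range(1, k))
    ll -= math.lgamma(theta + n) - math.lgamma(theta + 1)
    ll += sum(math.lgamma(nj - sigma) - math.lgamma(1 - sigma)
              for nj in block_sizes)
    return ll

def mle_sigma(theta, block_sizes, grid=1000):
    """Grid-search MLE of the discount parameter sigma when theta is known."""
    candidates = [i / grid for i in range(grid)]  # sigma in [0, 1)
    return max(candidates,
               key=lambda s: pd_log_likelihood(s, theta, block_sizes))
```

In practice a numerical optimizer would replace the grid search; the grid keeps the sketch dependency-free.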

The remainder of this paper is structured as follows. In Section 2, we review the basic model and the definition of the Poisson-Dirichlet process prior; the data structure and model assumptions are also described there. In Section 3, the MLEs of the prior parameters are discussed in detail under the three settings mentioned above; a naive estimate of the discount parameter is also discussed. Section 4 discusses the estimation of the base distribution density by the kernel method. Section 5 presents a small simulation study to show the performance of the estimates discussed in Section 3.
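For readers unfamiliar with the kernel method used for the density estimate, a minimal Gaussian kernel density sketch follows; the paper’s specific construction for the base measure is given in Section 4, so the function names and the Silverman bandwidth choice here are illustrative assumptions, not the paper’s estimator.

```python
import numpy as np

def kde_gaussian(x, sample, h):
    """Gaussian kernel density estimate, evaluated at the points x,
    built from the given sample with bandwidth h."""
    x = np.asarray(x, dtype=float)[:, None]
    s = np.asarray(sample, dtype=float)[None, :]
    kernels = np.exp(-0.5 * ((x - s) / h) ** 2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(sample) * h)

def silverman_bandwidth(sample):
    """Silverman's rule-of-thumb bandwidth for a Gaussian kernel."""
    s = np.asarray(sample, dtype=float)
    return 1.06 * s.std(ddof=1) * len(s) ** (-1 / 5)
```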

#### 2. The Data and Model

The data are observed in successive time periods and accordingly organized into groups, each group labeled by the calendar time point at which its individuals begin to be observed. Each individual in a group is represented by a random vector whose coordinates are observed sequentially in time, so that groups entering at later time points have fewer observed coordinates and the observations are subject to monotone missingness. A clearer picture of the data structure is exhibited in Table 1.
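A toy illustration of such a monotone missing pattern may help; the number of periods, group sizes, and entry times below are hypothetical, not the values in Table 1.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5                          # number of observation periods (assumed)
group_sizes = [4, 3, 3, 2, 2]  # hypothetical group sizes

# Group j enters at calendar time j, so by time T its individuals have
# T - j + 1 observed coordinates; later coordinates are missing (NaN).
data = []
for j, nj in enumerate(group_sizes, start=1):
    observed = T - j + 1
    block = np.full((nj, T), np.nan)
    block[:, :observed] = rng.normal(size=(nj, observed))
    data.append(block)
```

Stacking the groups row-wise reproduces the staircase of observed entries that characterizes monotone missingness.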