Journal of Probability and Statistics


Research Article | Open Access

Volume 2011 | Article ID 372512 | 14 pages | https://doi.org/10.1155/2011/372512

Joint Estimation Using Quadratic Estimating Function

Academic Editor: Ricardas Zitikis
Received: 12 Jan 2011
Revised: 10 Mar 2011
Accepted: 11 Apr 2011
Published: 12 Jul 2011

Abstract

A class of martingale estimating functions provides a convenient framework and plays an important role in inference for nonlinear time series models. However, when information about the first four conditional moments of the observed process is available, quadratic estimating functions are more informative. In this paper, a general framework for joint estimation of the conditional mean and variance parameters in time series models using quadratic estimating functions is developed. The superiority of the approach is demonstrated by comparing the information associated with the optimal quadratic estimating function with the information associated with other estimating functions. The method is used to study the optimal quadratic estimating functions of the parameters of autoregressive conditional duration (ACD) models, random coefficient autoregressive (RCA) models, doubly stochastic models, and regression models with ARCH errors. Closed-form expressions for the information gain are also discussed in some detail.

1. Introduction

Godambe [1] was the first to study inference for discrete-time stochastic processes using the estimating function method. Thavaneswaran and Abraham [2] studied nonlinear time series estimation problems using linear estimating functions. Naik-Nimbalkar and Rajarshi [3] and Thavaneswaran and Heyde [4] studied filtering and prediction problems using linear estimating functions in the Bayesian context. Chandra and Taniguchi [5], Merkouris [6], and Ghahramani and Thavaneswaran [7], among others, have studied estimation problems using estimating functions. In this paper, we study linear and quadratic martingale estimating functions and show that the quadratic estimating functions are more informative when the conditional mean and variance of the observed process depend on the same parameter of interest.

This paper is organized as follows. The rest of Section 1 presents the basics of estimating functions and information associated with estimating functions. Section 2 presents the general model for the multiparameter case and the form of the optimal quadratic estimating function. In Section 3, the theory is applied to four different models.

Suppose that $\{\mathbf{y}_t, t = 1, \dots, n\}$ is a realization of a discrete-time stochastic process whose distribution depends on a vector parameter $\boldsymbol{\theta}$ belonging to an open subset $\Theta$ of the $p$-dimensional Euclidean space. Let $(\Omega, \mathcal{F}, P_{\boldsymbol{\theta}})$ denote the underlying probability space, and let $\mathcal{F}_t^y$ be the $\sigma$-field generated by $\{\mathbf{y}_1, \dots, \mathbf{y}_t\}$, $t \ge 1$. Let $\mathbf{h}_t = \mathbf{h}_t(\mathbf{y}_1, \dots, \mathbf{y}_t, \boldsymbol{\theta})$, $1 \le t \le n$, be specified $q$-dimensional vectors that are martingales. We consider the class $\mathcal{M}$ of zero-mean, square-integrable $p$-dimensional martingale estimating functions of the form
$$\mathcal{M} = \left\{ \mathbf{g}_n(\boldsymbol{\theta}) : \mathbf{g}_n(\boldsymbol{\theta}) = \sum_{t=1}^n \mathbf{a}_{t-1} \mathbf{h}_t \right\}, \tag{1.1}$$
where the $\mathbf{a}_{t-1}$ are $p \times q$ matrices depending on $\mathbf{y}_1, \dots, \mathbf{y}_{t-1}$, $1 \le t \le n$. The estimating functions $\mathbf{g}_n(\boldsymbol{\theta})$ are further assumed to be almost surely differentiable with respect to the components of $\boldsymbol{\theta}$ and such that $\mathrm{E}[\partial \mathbf{g}_n(\boldsymbol{\theta})/\partial \boldsymbol{\theta} \mid \mathcal{F}_{n-1}^y]$ and $\mathrm{E}[\mathbf{g}_n(\boldsymbol{\theta}) \mathbf{g}_n(\boldsymbol{\theta})' \mid \mathcal{F}_{n-1}^y]$ are nonsingular for all $\boldsymbol{\theta} \in \Theta$ and for each $n \ge 1$. The expectations are always taken with respect to $P_{\boldsymbol{\theta}}$. Estimators of $\boldsymbol{\theta}$ can be obtained by solving the estimating equation $\mathbf{g}_n(\boldsymbol{\theta}) = \mathbf{0}$. Furthermore, the $p \times p$ matrix $\mathrm{E}[\mathbf{g}_n(\boldsymbol{\theta}) \mathbf{g}_n(\boldsymbol{\theta})' \mid \mathcal{F}_{n-1}^y]$ is assumed to be positive definite for all $\boldsymbol{\theta} \in \Theta$. Then, in the class of all zero-mean, square-integrable martingale estimating functions $\mathcal{M}$, the optimal estimating function $\mathbf{g}_n^*(\boldsymbol{\theta})$, which maximizes, in the partial order of nonnegative definite matrices, the information matrix
$$\mathbf{I}_{\mathbf{g}_n}(\boldsymbol{\theta}) = \left( \mathrm{E}\left[ \frac{\partial \mathbf{g}_n(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \,\Big|\, \mathcal{F}_{n-1}^y \right] \right)' \left( \mathrm{E}\left[ \mathbf{g}_n(\boldsymbol{\theta}) \mathbf{g}_n(\boldsymbol{\theta})' \mid \mathcal{F}_{n-1}^y \right] \right)^{-1} \left( \mathrm{E}\left[ \frac{\partial \mathbf{g}_n(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \,\Big|\, \mathcal{F}_{n-1}^y \right] \right), \tag{1.2}$$
is given by
$$\mathbf{g}_n^*(\boldsymbol{\theta}) = \sum_{t=1}^n \mathbf{a}_{t-1}^* \mathbf{h}_t = \sum_{t=1}^n \left( \mathrm{E}\left[ \frac{\partial \mathbf{h}_t}{\partial \boldsymbol{\theta}} \,\Big|\, \mathcal{F}_{t-1}^y \right] \right)' \left( \mathrm{E}\left[ \mathbf{h}_t \mathbf{h}_t' \mid \mathcal{F}_{t-1}^y \right] \right)^{-1} \mathbf{h}_t, \tag{1.3}$$
and the corresponding optimal information reduces to $\mathrm{E}[\mathbf{g}_n^*(\boldsymbol{\theta}) \mathbf{g}_n^*(\boldsymbol{\theta})' \mid \mathcal{F}_{n-1}^y]$.

The function $\mathbf{g}_n^*(\boldsymbol{\theta})$ is also called the "quasi-score" and has properties similar to those of a score function in the sense that $\mathrm{E}[\mathbf{g}_n^*(\boldsymbol{\theta})] = \mathbf{0}$ and $\mathrm{E}[\mathbf{g}_n^*(\boldsymbol{\theta}) \mathbf{g}_n^*(\boldsymbol{\theta})'] = -\mathrm{E}[\partial \mathbf{g}_n^*(\boldsymbol{\theta})/\partial \boldsymbol{\theta}']$. This result is more general in the sense that its validity does not require the true underlying distribution to belong to the exponential family. The maximal correlation between the optimal estimating function and the true unknown score justifies the terminology "quasi-score" for $\mathbf{g}_n^*(\boldsymbol{\theta})$. Moreover, it follows from Lindsay [8, page 916] that if we solve an unbiased estimating equation $\mathbf{g}_n(\boldsymbol{\theta}) = \mathbf{0}$ to obtain an estimator, then the asymptotic variance of the resulting estimator is the inverse of the information $\mathbf{I}_{\mathbf{g}_n}$. Hence, the estimator obtained from a more informative estimating equation is asymptotically more efficient.
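As a concrete illustration, the following sketch (not from the paper; the AR(1) model, the constant conditional variance, and all parameter values are assumptions made for this example) solves the optimal linear martingale estimating equation $g_n^*(\theta) = 0$ for a scalar parameter, where the root happens to have a closed form.

```python
import numpy as np

# Sketch: for an AR(1) model y_t = theta * y_{t-1} + e_t, with
# mu_t(theta) = theta * y_{t-1} and constant <m>_t = sigma^2, the
# optimal linear estimating equation reduces to
#   g*_n(theta) = sum_t y_{t-1} * (y_t - theta * y_{t-1}) = 0.

rng = np.random.default_rng(0)
n, theta_true = 500, 0.6
y = np.zeros(n)
for t in range(1, n):
    y[t] = theta_true * y[t - 1] + rng.normal()

def g_star(theta):
    """Optimal linear martingale estimating function for theta."""
    return np.sum(y[:-1] * (y[1:] - theta * y[:-1]))

# g* is linear in theta, so the estimating equation has an exact root.
theta_hat = np.sum(y[:-1] * y[1:]) / np.sum(y[:-1] ** 2)
print(theta_hat, g_star(theta_hat))  # g*(theta_hat) ~ 0 up to rounding
```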

2. General Model and Method

Consider a discrete-time stochastic process $\{y_t, t = 1, 2, \dots\}$ with conditional moments
$$\mu_t(\boldsymbol{\theta}) = \mathrm{E}[y_t \mid \mathcal{F}_{t-1}^y], \qquad \sigma_t^2(\boldsymbol{\theta}) = \mathrm{Var}(y_t \mid \mathcal{F}_{t-1}^y),$$
$$\gamma_t(\boldsymbol{\theta}) = \frac{1}{\sigma_t^3(\boldsymbol{\theta})} \mathrm{E}\left[ (y_t - \mu_t(\boldsymbol{\theta}))^3 \mid \mathcal{F}_{t-1}^y \right], \qquad \kappa_t(\boldsymbol{\theta}) = \frac{1}{\sigma_t^4(\boldsymbol{\theta})} \mathrm{E}\left[ (y_t - \mu_t(\boldsymbol{\theta}))^4 \mid \mathcal{F}_{t-1}^y \right] - 3. \tag{2.1}$$
That is, we assume that the skewness and the excess kurtosis of the standardized variable $y_t$ do not contain any additional parameters. In order to estimate the parameter $\boldsymbol{\theta}$ based on the observations $y_1, \dots, y_n$, we consider the two classes of martingale differences $\{m_t(\boldsymbol{\theta}) = y_t - \mu_t(\boldsymbol{\theta}), t = 1, \dots, n\}$ and $\{s_t(\boldsymbol{\theta}) = m_t^2(\boldsymbol{\theta}) - \sigma_t^2(\boldsymbol{\theta}), t = 1, \dots, n\}$ such that
$$\langle m \rangle_t = \mathrm{E}[m_t^2 \mid \mathcal{F}_{t-1}^y] = \mathrm{E}\left[ (y_t - \mu_t)^2 \mid \mathcal{F}_{t-1}^y \right] = \sigma_t^2,$$
$$\langle s \rangle_t = \mathrm{E}[s_t^2 \mid \mathcal{F}_{t-1}^y] = \mathrm{E}\left[ (y_t - \mu_t)^4 + \sigma_t^4 - 2\sigma_t^2 (y_t - \mu_t)^2 \mid \mathcal{F}_{t-1}^y \right] = \sigma_t^4 (\kappa_t + 2),$$
$$\langle m, s \rangle_t = \mathrm{E}[m_t s_t \mid \mathcal{F}_{t-1}^y] = \mathrm{E}\left[ (y_t - \mu_t)^3 - \sigma_t^2 (y_t - \mu_t) \mid \mathcal{F}_{t-1}^y \right] = \sigma_t^3 \gamma_t. \tag{2.2}$$

The optimal estimating functions based on the martingale differences $m_t$ and $s_t$ are $\mathbf{g}_M^*(\boldsymbol{\theta}) = -\sum_{t=1}^n (\partial \mu_t/\partial \boldsymbol{\theta})(m_t/\langle m \rangle_t)$ and $\mathbf{g}_S^*(\boldsymbol{\theta}) = -\sum_{t=1}^n (\partial \sigma_t^2/\partial \boldsymbol{\theta})(s_t/\langle s \rangle_t)$, respectively. The information matrices associated with $\mathbf{g}_M^*(\boldsymbol{\theta})$ and $\mathbf{g}_S^*(\boldsymbol{\theta})$ are $\mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta}) = \sum_{t=1}^n (\partial \mu_t/\partial \boldsymbol{\theta})(\partial \mu_t/\partial \boldsymbol{\theta}')(1/\langle m \rangle_t)$ and $\mathbf{I}_{\mathbf{g}_S^*}(\boldsymbol{\theta}) = \sum_{t=1}^n (\partial \sigma_t^2/\partial \boldsymbol{\theta})(\partial \sigma_t^2/\partial \boldsymbol{\theta}')(1/\langle s \rangle_t)$, respectively. Crowder [9] studied the optimal quadratic estimating function for independent observations. For the discrete-time stochastic process $\{y_t\}$, the following theorem establishes the optimality of the quadratic estimating function in the multiparameter case.

Theorem 2.1. For the general model in (2.1), consider the class of all quadratic estimating functions of the form $\mathcal{G}_Q = \{\mathbf{g}_Q(\boldsymbol{\theta}) : \mathbf{g}_Q(\boldsymbol{\theta}) = \sum_{t=1}^n (\mathbf{a}_{t-1} m_t + \mathbf{b}_{t-1} s_t)\}$. Then:

(a) the optimal estimating function is given by $\mathbf{g}_Q^*(\boldsymbol{\theta}) = \sum_{t=1}^n (\mathbf{a}_{t-1}^* m_t + \mathbf{b}_{t-1}^* s_t)$, where
$$\mathbf{a}_{t-1}^* = \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(-\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{1}{\langle m\rangle_t} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right), \qquad \mathbf{b}_{t-1}^* = \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t} - \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{1}{\langle s\rangle_t}\right); \tag{2.3}$$

(b) the information $\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta})$ is given by
$$\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'} \frac{1}{\langle m\rangle_t} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} \frac{1}{\langle s\rangle_t} - \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'}\right) \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right); \tag{2.4}$$

(c) the gain in information $\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) - \mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta})$ is given by
$$\sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'} \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t^2 \langle s\rangle_t} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} \frac{1}{\langle s\rangle_t} - \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'}\right) \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right); \tag{2.5}$$

(d) the gain in information $\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) - \mathbf{I}_{\mathbf{g}_S^*}(\boldsymbol{\theta})$ is given by
$$\sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'} \frac{1}{\langle m\rangle_t} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t^2} - \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'}\right) \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right). \tag{2.6}$$

Proof. We choose two orthogonal martingale differences $m_t$ and $\psi_t = s_t - \sigma_t \gamma_t m_t$, where the conditional variance of $\psi_t$ is given by $\langle \psi \rangle_t = (\langle m \rangle_t \langle s \rangle_t - \langle m, s \rangle_t^2)/\langle m \rangle_t = \sigma_t^4 (\kappa_t + 2 - \gamma_t^2)$. That is, $m_t$ and $\psi_t$ are uncorrelated with conditional variances $\langle m \rangle_t$ and $\langle \psi \rangle_t$, respectively. Moreover, the optimal martingale estimating function and associated information based on the martingale differences $\psi_t$ are
$$\mathbf{g}_\Psi^*(\boldsymbol{\theta}) = \sum_{t=1}^n \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t} - \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}}\right) \frac{\psi_t}{\langle\psi\rangle_t} = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(-\left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t^2 \langle s\rangle_t} - \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right) m_t + \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t} - \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{1}{\langle s\rangle_t}\right) s_t\right),$$
$$\mathbf{I}_{\mathbf{g}_\Psi^*}(\boldsymbol{\theta}) = \sum_{t=1}^n \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t} - \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}}\right) \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}'} \frac{\langle m,s\rangle_t}{\langle m\rangle_t} - \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'}\right) \frac{1}{\langle\psi\rangle_t} = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'} \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t^2 \langle s\rangle_t} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} \frac{1}{\langle s\rangle_t} - \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'}\right) \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right). \tag{2.7}$$
Then, the quadratic estimating function based on $m_t$ and $\psi_t$ becomes
$$\mathbf{g}_Q^*(\boldsymbol{\theta}) = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\left(-\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{1}{\langle m\rangle_t} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right) m_t + \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t} - \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{1}{\langle s\rangle_t}\right) s_t\right) \tag{2.8}$$
and satisfies the sufficient condition for optimality
$$\mathrm{E}\left[\frac{\partial \mathbf{g}_Q(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \,\Big|\, \mathcal{F}_{t-1}^y\right] = \mathrm{Cov}\left(\mathbf{g}_Q(\boldsymbol{\theta}), \mathbf{g}_Q^*(\boldsymbol{\theta}) \mid \mathcal{F}_{t-1}^y\right) \mathbf{K}, \quad \forall\, \mathbf{g}_Q(\boldsymbol{\theta}) \in \mathcal{G}_Q, \tag{2.9}$$
where $\mathbf{K}$ is a constant matrix. Hence, $\mathbf{g}_Q^*(\boldsymbol{\theta})$ is optimal in the class $\mathcal{G}_Q$, and part (a) follows. Since $m_t$ and $\psi_t$ are orthogonal, the information $\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) = \mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta}) + \mathbf{I}_{\mathbf{g}_\Psi^*}(\boldsymbol{\theta})$, and part (b) follows; parts (c) and (d) are immediate consequences of (b). Hence, for each component $\theta_i$, $i = 1, \dots, p$, neither $g_M^*(\theta_i)$ nor $g_S^*(\theta_i)$ is fully informative; that is, $I_{g_Q^*}(\theta_i) \ge I_{g_M^*}(\theta_i)$ and $I_{g_Q^*}(\theta_i) \ge I_{g_S^*}(\theta_i)$.

Corollary 2.2. When the conditional skewness $\gamma$ and kurtosis $\kappa$ are constants, the optimal quadratic estimating function and associated information, based on the martingale differences $m_t = y_t - \mu_t$ and $s_t = m_t^2 - \sigma_t^2$, are given by
$$\mathbf{g}_Q^*(\boldsymbol{\theta}) = \left(1 - \frac{\gamma^2}{\kappa+2}\right)^{-1} \sum_{t=1}^n \frac{1}{\sigma_t^3} \left(\left(-\sigma_t \frac{\partial\mu_t}{\partial\boldsymbol{\theta}} + \frac{\gamma}{\kappa+2} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}}\right) m_t + \frac{1}{\kappa+2} \left(\gamma \frac{\partial\mu_t}{\partial\boldsymbol{\theta}} - \frac{1}{\sigma_t} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}}\right) s_t\right),$$
$$\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) = \left(1 - \frac{\gamma^2}{\kappa+2}\right)^{-1} \left(\mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta}) + \mathbf{I}_{\mathbf{g}_S^*}(\boldsymbol{\theta}) - \frac{\gamma}{\kappa+2} \sum_{t=1}^n \frac{1}{\sigma_t^3} \left(\frac{\partial\mu_t}{\partial\boldsymbol{\theta}} \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}'} + \frac{\partial\sigma_t^2}{\partial\boldsymbol{\theta}} \frac{\partial\mu_t}{\partial\boldsymbol{\theta}'}\right)\right). \tag{2.10}$$
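In computations, the weights (2.3) can be evaluated directly from the conditional moments. The helper below is a sketch of this bookkeeping (the function name is hypothetical, and a scalar parameter is assumed for simplicity).

```python
import numpy as np

# Sketch of the Theorem 2.1 weights for a scalar parameter theta:
# given d mu_t/d theta, d sigma_t^2/d theta and the conditional
# quantities <m>_t, <s>_t, <m,s>_t, return a*_{t-1} and b*_{t-1} of (2.3).

def optimal_quadratic_weights(dmu, dsig2, var_m, var_s, cov_ms):
    rho2 = cov_ms ** 2 / (var_m * var_s)   # squared conditional correlation
    c = 1.0 / (1.0 - rho2)
    a_star = c * (-dmu / var_m + dsig2 * cov_ms / (var_m * var_s))
    b_star = c * (dmu * cov_ms / (var_m * var_s) - dsig2 / var_s)
    return a_star, b_star

# The optimal quadratic estimating function is then assembled as
#   g*_Q(theta) = sum_t a*_{t-1} m_t + b*_{t-1} s_t,
# with m_t = y_t - mu_t(theta) and s_t = m_t**2 - sigma_t**2(theta).
```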

3. Applications

3.1. Autoregressive Conditional Duration (ACD) Models

There is growing interest in the analysis of intraday financial data such as transaction and quote data. Such data have increasingly been made available by many stock exchanges. Unlike closing prices, which are measured daily, monthly, or yearly, intraday or high-frequency data tend to be irregularly spaced. Furthermore, the durations between events are themselves random variables. The autoregressive conditional duration (ACD) process of Engle and Russell [10] was proposed to model such durations and to study the dynamic structure of the adjusted durations $x_i = t_i - t_{i-1}$, where $t_i$ is the time of the $i$th transaction. The crucial assumption underlying the ACD model is that the time dependence is described by a function $\psi_i$, where $\psi_i$ is the conditional expectation of the adjusted duration between the $(i-1)$th and the $i$th trades. The basic ACD model is defined as
$$x_i = \psi_i \varepsilon_i, \qquad \psi_i = \mathrm{E}\left[x_i \mid \mathcal{F}_{i-1}^x\right], \tag{3.1}$$
where the $\varepsilon_i$ are iid nonnegative random variables with density function $f(\cdot)$ and unit mean, and $\mathcal{F}_{i-1}^x$ is the information available at the $(i-1)$th trade; in particular, $\varepsilon_i$ is independent of $\mathcal{F}_{i-1}^x$. It is clear that the types of ACD models vary according to the distributions of $\varepsilon_i$ and the specifications of $\psi_i$. In this paper, we discuss a specific class of models, known as the ACD($p$, $q$) model, given by
$$x_t = \psi_t \varepsilon_t, \qquad \psi_t = \omega + \sum_{j=1}^p a_j x_{t-j} + \sum_{j=1}^q b_j \psi_{t-j}, \tag{3.2}$$
where $\omega > 0$, $a_j > 0$, $b_j > 0$, and $\sum_{j=1}^{\max(p,q)} (a_j + b_j) < 1$. We assume that the $\varepsilon_t$ are iid nonnegative random variables with mean $\mu_\varepsilon$, variance $\sigma_\varepsilon^2$, skewness $\gamma_\varepsilon$, and excess kurtosis $\kappa_\varepsilon$. In order to estimate the parameter vector $\boldsymbol{\theta} = (\omega, a_1, \dots, a_p, b_1, \dots, b_q)'$, we use the estimating function approach. For this model, the conditional moments are $\mu_t = \mu_\varepsilon \psi_t$, $\sigma_t^2 = \sigma_\varepsilon^2 \psi_t^2$, $\gamma_t = \gamma_\varepsilon$, and $\kappa_t = \kappa_\varepsilon$. Let $m_t = x_t - \mu_t$ and $s_t = m_t^2 - \sigma_t^2$ be the sequences of martingale differences, so that $\langle m \rangle_t = \sigma_\varepsilon^2 \psi_t^2$, $\langle s \rangle_t = \sigma_\varepsilon^4 (\kappa_\varepsilon + 2) \psi_t^4$, and $\langle m, s \rangle_t = \sigma_\varepsilon^3 \gamma_\varepsilon \psi_t^3$. The optimal estimating function and associated information based on $m_t$ are given by $\mathbf{g}_M^*(\boldsymbol{\theta}) = -(\mu_\varepsilon/\sigma_\varepsilon^2) \sum_{t=1}^n (1/\psi_t^2)(\partial \psi_t/\partial \boldsymbol{\theta}) m_t$ and $\mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta}) = (\mu_\varepsilon^2/\sigma_\varepsilon^2) \sum_{t=1}^n (1/\psi_t^2)(\partial \psi_t/\partial \boldsymbol{\theta})(\partial \psi_t/\partial \boldsymbol{\theta}')$. The optimal estimating function and associated information based on $s_t$ are given by $\mathbf{g}_S^*(\boldsymbol{\theta}) = -(2/(\sigma_\varepsilon^2 (\kappa_\varepsilon + 2))) \sum_{t=1}^n (1/\psi_t^3)(\partial \psi_t/\partial \boldsymbol{\theta}) s_t$ and $\mathbf{I}_{\mathbf{g}_S^*}(\boldsymbol{\theta}) = (4/(\kappa_\varepsilon + 2)) \sum_{t=1}^n (1/\psi_t^2)(\partial \psi_t/\partial \boldsymbol{\theta})(\partial \psi_t/\partial \boldsymbol{\theta}')$.
Then, by Corollary 2.2, the optimal quadratic estimating function and associated information are given by
$$\mathbf{g}_Q^*(\boldsymbol{\theta}) = \frac{1}{\sigma_\varepsilon^2 (\kappa_\varepsilon + 2 - \gamma_\varepsilon^2)} \sum_{t=1}^n \left(\frac{-\mu_\varepsilon (\kappa_\varepsilon + 2) + 2\sigma_\varepsilon \gamma_\varepsilon}{\psi_t^2} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}} m_t + \frac{\mu_\varepsilon \gamma_\varepsilon - 2\sigma_\varepsilon}{\sigma_\varepsilon \psi_t^3} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}} s_t\right),$$
$$\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) = \left(1 - \frac{\gamma_\varepsilon^2}{\kappa_\varepsilon+2}\right)^{-1} \left(\mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta}) + \mathbf{I}_{\mathbf{g}_S^*}(\boldsymbol{\theta}) - \frac{4\mu_\varepsilon \gamma_\varepsilon}{\sigma_\varepsilon (\kappa_\varepsilon+2)} \sum_{t=1}^n \frac{1}{\psi_t^2} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}'}\right) = \frac{4\sigma_\varepsilon^2 + \mu_\varepsilon^2 (\kappa_\varepsilon+2) - 4\mu_\varepsilon \sigma_\varepsilon \gamma_\varepsilon}{\sigma_\varepsilon^2 (\kappa_\varepsilon + 2 - \gamma_\varepsilon^2)} \sum_{t=1}^n \frac{1}{\psi_t^2} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}'}, \tag{3.3}$$
the information gain in using $\mathbf{g}_Q^*(\boldsymbol{\theta})$ over $\mathbf{g}_M^*(\boldsymbol{\theta})$ is
$$\frac{(2\sigma_\varepsilon - \mu_\varepsilon \gamma_\varepsilon)^2}{\sigma_\varepsilon^2 (\kappa_\varepsilon + 2 - \gamma_\varepsilon^2)} \sum_{t=1}^n \frac{1}{\psi_t^2} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}'}, \tag{3.4}$$
and the information gain in using $\mathbf{g}_Q^*(\boldsymbol{\theta})$ over $\mathbf{g}_S^*(\boldsymbol{\theta})$ is
$$\frac{(\mu_\varepsilon (\kappa_\varepsilon+2) - 2\sigma_\varepsilon \gamma_\varepsilon)^2}{\sigma_\varepsilon^2 (\kappa_\varepsilon + 2 - \gamma_\varepsilon^2)(\kappa_\varepsilon+2)} \sum_{t=1}^n \frac{1}{\psi_t^2} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}} \frac{\partial\psi_t}{\partial\boldsymbol{\theta}'}, \tag{3.5}$$
both of which are nonnegative definite.

When $\varepsilon_t$ follows an exponential distribution with rate $\lambda$, $\mu_\varepsilon = 1/\lambda$, $\sigma_\varepsilon^2 = 1/\lambda^2$, $\gamma_\varepsilon = 2$, and $\kappa_\varepsilon = 6$. Then $\mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta}) = \sum_{t=1}^n (1/\psi_t^2)(\partial \psi_t/\partial \boldsymbol{\theta})(\partial \psi_t/\partial \boldsymbol{\theta}')$, $\mathbf{I}_{\mathbf{g}_S^*}(\boldsymbol{\theta}) = (1/2) \sum_{t=1}^n (1/\psi_t^2)(\partial \psi_t/\partial \boldsymbol{\theta})(\partial \psi_t/\partial \boldsymbol{\theta}')$, and $\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) = \sum_{t=1}^n (1/\psi_t^2)(\partial \psi_t/\partial \boldsymbol{\theta})(\partial \psi_t/\partial \boldsymbol{\theta}')$, and hence $\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) = \mathbf{I}_{\mathbf{g}_M^*}(\boldsymbol{\theta}) > \mathbf{I}_{\mathbf{g}_S^*}(\boldsymbol{\theta})$.
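For illustration, here is a sketch (not from the paper; the sample size, the parameter values, and the finite-difference gradient are assumptions of this example) that simulates an ACD(1,1) series with unit-rate exponential innovations and evaluates $\mathbf{g}_M^*$, which for this case carries the same information as $\mathbf{g}_Q^*$.

```python
import numpy as np

# Sketch: ACD(1,1) recursion psi_t = omega + a x_{t-1} + b psi_{t-1}
# and the optimal linear estimating function g*_M of Section 3.1 with
# unit-rate exponential innovations (mu_eps = sigma_eps^2 = 1).

def acd_psi(x, omega, a, b):
    psi = np.empty_like(x)
    psi[0] = omega / (1.0 - a - b)        # start at the unconditional mean
    for t in range(1, len(x)):
        psi[t] = omega + a * x[t - 1] + b * psi[t - 1]
    return psi

def g_m(x, omega, a, b, eps=1e-6):
    """g*_M(theta) = -sum_t (1/psi_t^2) (dpsi_t/dtheta) (x_t - psi_t);
    dpsi_t/dtheta is approximated by forward finite differences."""
    theta = np.array([omega, a, b])
    psi = acd_psi(x, *theta)
    g = np.zeros(3)
    for k in range(3):
        th = theta.copy()
        th[k] += eps
        dpsi = (acd_psi(x, *th) - psi) / eps
        g[k] = -np.sum(dpsi * (x - psi) / psi ** 2)
    return g

rng = np.random.default_rng(1)
n, omega, a, b = 2000, 0.1, 0.2, 0.7
x = np.empty(n)
psi = omega / (1.0 - a - b)
for t in range(n):
    x[t] = psi * rng.exponential(1.0)
    psi = omega + a * x[t] + b * psi
print(g_m(x, omega, a, b))  # fluctuates around zero at the true parameters
```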

3.2. Random Coefficient Autoregressive Models

In this section, we investigate the properties of quadratic estimating functions for the random coefficient autoregressive (RCA) time series models first introduced by Nicholls and Quinn [11].

Consider the RCA model
$$y_t = (\theta + b_t) y_{t-1} + \varepsilon_t, \tag{3.6}$$
where $\{b_t\}$ and $\{\varepsilon_t\}$ are uncorrelated zero-mean processes with unknown variance $\sigma_b^2$ and variance $\sigma_\varepsilon^2 = \sigma_\varepsilon^2(\theta)$ with unknown parameter $\theta$, respectively. Further, we denote the skewness and excess kurtosis of $\{b_t\}$ by $\gamma_b$ and $\kappa_b$, which are known, and those of $\{\varepsilon_t\}$ by $\gamma_\varepsilon(\theta)$ and $\kappa_\varepsilon(\theta)$, respectively. In the model (3.6), both the parameter $\theta$ and $\beta = \sigma_b^2$ need to be estimated. Letting $\boldsymbol{\theta} = (\theta, \beta)'$, we discuss the joint estimation of $\theta$ and $\beta$. In this model, the conditional mean is $\mu_t = \theta y_{t-1}$, and the conditional variance is $\sigma_t^2 = \beta y_{t-1}^2 + \sigma_\varepsilon^2(\theta)$; the parameter $\theta$ appears simultaneously in the mean and the variance. Let $m_t = y_t - \mu_t$ and $s_t = m_t^2 - \sigma_t^2$, so that $\langle m \rangle_t = \sigma_b^2 y_{t-1}^2 + \sigma_\varepsilon^2$, $\langle s \rangle_t = \sigma_b^4 (\kappa_b + 2) y_{t-1}^4 + \sigma_\varepsilon^4 (\kappa_\varepsilon + 2) + 4 \sigma_b^2 \sigma_\varepsilon^2 y_{t-1}^2$, and $\langle m, s \rangle_t = \sigma_b^3 \gamma_b y_{t-1}^3 + \sigma_\varepsilon^3 \gamma_\varepsilon$. Then the conditional skewness is $\gamma_t = \langle m, s \rangle_t/\sigma_t^3$, and the conditional excess kurtosis is $\kappa_t = \langle s \rangle_t/\sigma_t^4 - 2$.

Since $\partial \mu_t/\partial \boldsymbol{\theta} = (y_{t-1}, 0)'$ and $\partial \sigma_t^2/\partial \boldsymbol{\theta} = (\partial \sigma_\varepsilon^2/\partial \theta, y_{t-1}^2)'$, by applying Theorem 2.1, the optimal quadratic estimating function for $\theta$ and $\beta$ based on the martingale differences $m_t$ and $s_t$ is given by $\mathbf{g}_Q^*(\boldsymbol{\theta}) = \sum_{t=1}^n (\mathbf{a}_{t-1}^* m_t + \mathbf{b}_{t-1}^* s_t)$, where
$$\mathbf{a}_{t-1}^* = \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(-\frac{y_{t-1}}{\langle m\rangle_t} + \frac{\partial\sigma_\varepsilon^2}{\partial\theta} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t},\; y_{t-1}^2 \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right)', \qquad \mathbf{b}_{t-1}^* = \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(y_{t-1} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t} - \frac{\partial\sigma_\varepsilon^2}{\partial\theta} \frac{1}{\langle s\rangle_t},\; -\frac{y_{t-1}^2}{\langle s\rangle_t}\right)'. \tag{3.7}$$
Hence, the component quadratic estimating function for $\theta$ is
$$g_Q^*(\theta) = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\left(-\frac{y_{t-1}}{\langle m\rangle_t} + \frac{\partial\sigma_\varepsilon^2}{\partial\theta} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right) m_t + \left(y_{t-1} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t} - \frac{\partial\sigma_\varepsilon^2}{\partial\theta} \frac{1}{\langle s\rangle_t}\right) s_t\right), \tag{3.8}$$
and the component quadratic estimating function for $\beta$ is
$$g_Q^*(\beta) = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(y_{t-1}^2 \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t} m_t - \frac{y_{t-1}^2}{\langle s\rangle_t} s_t\right). \tag{3.9}$$
Moreover, the information matrix of the optimal quadratic estimating function for $\theta$ and $\beta$ is given by
$$\mathbf{I}_{\mathbf{g}_Q^*}(\boldsymbol{\theta}) = \begin{pmatrix} I_{\theta\theta} & I_{\theta\beta} \\ I_{\beta\theta} & I_{\beta\beta} \end{pmatrix}, \tag{3.10}$$
where
$$I_{\theta\theta} = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\frac{y_{t-1}^2}{\langle m\rangle_t} + \left(\frac{\partial\sigma_\varepsilon^2}{\partial\theta}\right)^2 \frac{1}{\langle s\rangle_t} - 2 \frac{\partial\sigma_\varepsilon^2}{\partial\theta} y_{t-1} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right), \tag{3.11}$$
$$I_{\theta\beta} = I_{\beta\theta} = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \left(\frac{\partial\sigma_\varepsilon^2}{\partial\theta} \frac{1}{\langle s\rangle_t} - y_{t-1} \frac{\langle m,s\rangle_t}{\langle m\rangle_t \langle s\rangle_t}\right) y_{t-1}^2, \tag{3.12}$$
$$I_{\beta\beta} = \sum_{t=1}^n \left(1 - \frac{\langle m,s\rangle_t^2}{\langle m\rangle_t \langle s\rangle_t}\right)^{-1} \frac{y_{t-1}^4}{\langle s\rangle_t}. \tag{3.13}$$

Considering the parameter $\theta$ only, the conditional least squares (CLS) estimating function and the associated information are given directly by $g_{\mathrm{CLS}}(\theta) = \sum_{t=1}^n y_{t-1} m_t$ and $I_{\mathrm{CLS}}(\theta) = \left(\sum_{t=1}^n y_{t-1}^2\right)^2 / \sum_{t=1}^n y_{t-1}^2 \langle m \rangle_t$. The optimal martingale estimating function and the associated information based on $m_t$ are given by $g_M^*(\theta) = -\sum_{t=1}^n y_{t-1} m_t/\langle m \rangle_t$ and $I_{g_M^*}(\theta) = \sum_{t=1}^n y_{t-1}^2/\langle m \rangle_t$. Moreover, the Cauchy-Schwarz inequality
$$\left(\sum_{t=1}^n y_{t-1}^2 \langle m \rangle_t\right) \left(\sum_{t=1}^n \frac{y_{t-1}^2}{\langle m \rangle_t}\right) \ge \left(\sum_{t=1}^n y_{t-1}^2\right)^2 \tag{3.14}$$
implies that $I_{\mathrm{CLS}}(\theta) \le I_{g_M^*}(\theta)$; hence the optimal estimating function is more informative than the conditional least squares one. The optimal quadratic estimating function for $\theta$, given in (3.8) with information (3.11), is based on both martingale differences $m_t$ and $s_t$, and its information is at least that of $g_M^*(\theta)$. Therefore, for the RCA model, $I_{\mathrm{CLS}}(\theta) \le I_{g_M^*}(\theta) \le I_{g_Q^*}(\theta)$, and hence the estimate obtained by solving the optimal quadratic estimating equation is more efficient than both the CLS estimate and the estimate obtained by solving the optimal linear estimating equation.
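The ordering $I_{\mathrm{CLS}}(\theta) \le I_{g_M^*}(\theta)$ can also be checked numerically; the sketch below (illustrative parameter values and a constant $\sigma_\varepsilon^2$, both assumptions of this example) simulates an RCA(1) series and compares the two informations.

```python
import numpy as np

# Numerical check of I_CLS <= I_{g*_M} for the RCA model (3.6) with
# normal b_t and eps_t and constant sigma_eps^2, using
# <m>_t = beta * y_{t-1}^2 + sigma_eps^2.

rng = np.random.default_rng(2)
n, theta, beta, sig_eps2 = 5000, 0.4, 0.3, 1.0
y = np.zeros(n)
for t in range(1, n):
    b_t = rng.normal(scale=np.sqrt(beta))
    y[t] = (theta + b_t) * y[t - 1] + rng.normal(scale=np.sqrt(sig_eps2))

y2 = y[:-1] ** 2
var_m = beta * y2 + sig_eps2                 # <m>_t
i_cls = y2.sum() ** 2 / np.sum(y2 * var_m)   # (sum y^2)^2 / sum y^2 <m>_t
i_m = np.sum(y2 / var_m)                     # sum y^2 / <m>_t
print(i_cls <= i_m)  # True, by the Cauchy-Schwarz inequality (3.14)
```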

3.3. Doubly Stochastic Time Series Model

The random coefficient autoregressive models discussed in the previous section are special cases of what Tjøstheim [12] refers to as doubly stochastic time series models. In the nonlinear case, these models are given by
$$y_t = \theta_t f\left(t, \mathcal{F}_{t-1}^y\right) + \varepsilon_t, \tag{3.15}$$
where $\{\theta + b_t\}$ of (3.6) is replaced by a more general stochastic sequence $\{\theta_t\}$ and $y_{t-1}$ is replaced by a function of the past, $f(t, \mathcal{F}_{t-1}^y)$. Suppose that $\{\theta_t\}$ is a moving average sequence of the form
$$\theta_t = \theta + a_t + a_{t-1}, \tag{3.16}$$
where $\{a_t\}$ consists of square-integrable independent random variables with mean zero and variance $\sigma_a^2$. We further assume that $\{\varepsilon_t\}$ and $\{a_t\}$ are independent; then $\mathrm{E}[y_t \mid \mathcal{F}_{t-1}^y]$ depends on the posterior mean $u_t = \mathrm{E}[a_t \mid \mathcal{F}_t^y]$ and variance $v_t = \mathrm{E}[(a_t - u_t)^2 \mid \mathcal{F}_t^y]$ of $a_t$. Under the normality assumption on $\{\varepsilon_t\}$ and $\{a_t\}$, and the initial condition $y_0 = 0$, $u_t$ and $v_t$ satisfy the following Kalman-like recursive algorithms (see [13, page 439]):
$$u_t(\theta) = \frac{\sigma_a^2 f\left(t, \mathcal{F}_{t-1}^y\right) \left(y_t - (\theta + u_{t-1}(\theta)) f\left(t, \mathcal{F}_{t-1}^y\right)\right)}{\sigma_\varepsilon^2(\theta) + f^2\left(t, \mathcal{F}_{t-1}^y\right) \left(\sigma_a^2 + v_{t-1}(\theta)\right)}, \qquad v_t(\theta) = \sigma_a^2 - \frac{\sigma_a^4 f^2\left(t, \mathcal{F}_{t-1}^y\right)}{\sigma_\varepsilon^2(\theta) + f^2\left(t, \mathcal{F}_{t-1}^y\right) \left(\sigma_a^2 + v_{t-1}(\theta)\right)}, \tag{3.17}$$
where $u_0 = 0$ and $v_0 = \sigma_a^2$. Hence, the conditional mean and variance of $y_t$ are given by
$$\mu_t(\theta) = (\theta + u_{t-1}(\theta)) f\left(t, \mathcal{F}_{t-1}^y\right), \qquad \sigma_t^2(\theta) = \sigma_\varepsilon^2(\theta) + f^2\left(t, \mathcal{F}_{t-1}^y\right) \left(\sigma_a^2 + v_{t-1}(\theta)\right), \tag{3.18}$$
which can be computed recursively.
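The recursions (3.17)-(3.18) translate directly into code. The following sketch (the choice of $f$, the parameter values, and the helper name are assumptions of this example) computes $u_t$, $v_t$, and the conditional moments in a single forward pass.

```python
import numpy as np

# Sketch of the Kalman-like recursions (3.17)-(3.18): posterior mean u_t
# and variance v_t of a_t, and the conditional moments mu_t, sigma_t^2,
# for y_t = theta_t f(t, F_{t-1}) + eps_t with theta_t = theta + a_t + a_{t-1}.

def posterior_recursion(y, theta, sig_a2, sig_eps2, f):
    n = len(y)
    u, v = 0.0, sig_a2                 # initial conditions u_0 = 0, v_0 = sigma_a^2
    mu = np.empty(n)
    sig2 = np.empty(n)
    for t in range(n):
        ft = f(t, y[:t])               # f(t, F_{t-1}): any function of the past
        mu[t] = (theta + u) * ft       # conditional mean (3.18)
        sig2[t] = sig_eps2 + ft ** 2 * (sig_a2 + v)   # conditional variance (3.18)
        d = sig2[t]                    # common denominator in (3.17)
        u, v = (sig_a2 * ft * (y[t] - (theta + u) * ft) / d,
                sig_a2 - sig_a2 ** 2 * ft ** 2 / d)
    return mu, sig2

# Example with an RCA-type choice f(t, F_{t-1}) = y_{t-1}:
y = np.random.default_rng(3).normal(size=200)
mu, sig2 = posterior_recursion(y, 0.5, 0.2, 1.0,
                               lambda t, past: past[-1] if t > 0 else 0.0)
```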

Let $m_t = y_t - \mu_t$ and $s_t = m_t^2 - \sigma_t^2$; then $\{m_t\}$ and $\{s_t\}$ are sequences of martingale differences. Writing $f_t = f(t, \mathcal{F}_{t-1}^y)$ for brevity, we can derive that $\langle m, s \rangle_t = 0$, $\langle m \rangle_t = \sigma_\varepsilon^2(\theta) + f_t^2 (\sigma_a^2 + v_{t-1}(\theta))$, and $\langle s \rangle_t = 2\sigma_\varepsilon^4(\theta) + 4 f_t^2 \sigma_\varepsilon^2(\theta)(\sigma_a^2 + v_{t-1}(\theta)) + 2 f_t^4 (\sigma_a^2 + v_{t-1}(\theta))^2 = 2\langle m \rangle_t^2$. The optimal estimating function and associated information based on $m_t$ are given by
$$g_M^*(\theta) = -\sum_{t=1}^n f_t \left(1 + \frac{\partial u_{t-1}(\theta)}{\partial\theta}\right) \frac{m_t}{\langle m \rangle_t}, \qquad I_{g_M^*}(\theta) = \sum_{t=1}^n \frac{f_t^2 \left(1 + \partial u_{t-1}(\theta)/\partial\theta\right)^2}{\langle m \rangle_t}. \tag{3.19}$$
Then, the Cauchy-Schwarz inequality
$$\left(\sum_{t=1}^n f_t^2 \left(1 + \frac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^2 \langle m \rangle_t\right) \left(\sum_{t=1}^n \frac{f_t^2 \left(1 + \partial u_{t-1}(\theta)/\partial\theta\right)^2}{\langle m \rangle_t}\right) \ge \left(\sum_{t=1}^n f_t^2 \left(1 + \frac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^2\right)^2 \tag{3.20}$$
implies that
$$I_{\mathrm{CLS}}(\theta) = \frac{\left(\sum_{t=1}^n f_t^2 \left(1 + \partial u_{t-1}(\theta)/\partial\theta\right)^2\right)^2}{\sum_{t=1}^n f_t^2 \left(1 + \partial u_{t-1}(\theta)/\partial\theta\right)^2 \langle m \rangle_t} \le I_{g_M^*}(\theta), \tag{3.21}$$
that is, the optimal linear estimating function $g_M^*(\theta)$ is more informative than the conditional least squares estimating function $g_{\mathrm{CLS}}(\theta)$.

The optimal estimating function and the associated information based on $s_t$ are given by
$$g_S^*(\theta) = -\sum_{t=1}^n \left(\frac{\partial\sigma_\varepsilon^2(\theta)}{\partial\theta} + f_t^2 \frac{\partial v_{t-1}(\theta)}{\partial\theta}\right) \frac{s_t}{\langle s \rangle_t}, \qquad I_{g_S^*}(\theta) = \sum_{t=1}^n \left(\frac{\partial\sigma_\varepsilon^2(\theta)}{\partial\theta} + f_t^2 \frac{\partial v_{t-1}(\theta)}{\partial\theta}\right)^2 \frac{1}{\langle s \rangle_t}. \tag{3.22}$$
Hence, by Theorem 2.1, since $\langle m, s \rangle_t = 0$, the optimal quadratic estimating function is given by
$$g_Q^*(\theta) = -\sum_{t=1}^n \frac{1}{\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)} \left(f_t \left(1 + \frac{\partial u_{t-1}(\theta)}{\partial\theta}\right) m_t + \frac{\partial\sigma_\varepsilon^2(\theta)/\partial\theta + f_t^2\, \partial v_{t-1}(\theta)/\partial\theta}{2\left(\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)\right)} s_t\right), \tag{3.23}$$
and the associated information, $I_{g_Q^*}(\theta) = I_{g_M^*}(\theta) + I_{g_S^*}(\theta)$, is given by
$$I_{g_Q^*}(\theta) = \sum_{t=1}^n \frac{1}{\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)} \left(f_t^2 \left(1 + \frac{\partial u_{t-1}(\theta)}{\partial\theta}\right)^2 + \frac{\left(\partial\sigma_\varepsilon^2(\theta)/\partial\theta + f_t^2\, \partial v_{t-1}(\theta)/\partial\theta\right)^2}{2\left(\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)\right)}\right). \tag{3.24}$$
Clearly, the information of $g_Q^*$ is larger than that of $g_M^*$ and $g_S^*$, and hence the estimate obtained by solving the optimal quadratic estimating equation is more efficient than the CLS estimate and the estimate obtained by solving the optimal linear estimating equation. Moreover, the relations
$$\frac{\partial u_t(\theta)}{\partial\theta} = -\frac{f_t^2 \sigma_a^2 \left(1 + \partial u_{t-1}(\theta)/\partial\theta\right) \left(\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)\right)}{\left(\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)\right)^2} - \frac{\sigma_a^2 f_t \left(y_t - f_t \left(\theta + u_{t-1}(\theta)\right)\right) \left(\partial\sigma_\varepsilon^2(\theta)/\partial\theta + f_t^2\, \partial v_{t-1}(\theta)/\partial\theta\right)}{\left(\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)\right)^2},$$
$$\frac{\partial v_t(\theta)}{\partial\theta} = \frac{\sigma_a^4 f_t^2 \left(\partial\sigma_\varepsilon^2(\theta)/\partial\theta + f_t^2\, \partial v_{t-1}(\theta)/\partial\theta\right)}{\left(\sigma_\varepsilon^2(\theta) + f_t^2 \left(\sigma_a^2 + v_{t-1}(\theta)\right)\right)^2} \tag{3.25}$$
can be applied to calculate the estimating functions and associated information recursively.

3.4. Regression Model with ARCH Errors

Consider a regression model with ARCH($s$) errors $\varepsilon_t$ of the form
$$y_t = \mathbf{x}_t' \boldsymbol{\beta} + \varepsilon_t, \tag{3.26}$$
such that $\mathrm{E}[\varepsilon_t \mid \mathcal{F}_{t-1}^y] = 0$ and $\mathrm{Var}(\varepsilon_t \mid \mathcal{F}_{t-1}^y) = h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \dots + \alpha_s \varepsilon_{t-s}^2$. In this model, the conditional mean is $\mu_t = \mathbf{x}_t' \boldsymbol{\beta}$, the conditional variance is $\sigma_t^2 = h_t$, and the conditional skewness and excess kurtosis are assumed to be constants $\gamma$ and $\kappa$, respectively. Writing $\mathbf{z}_t = (1, \varepsilon_{t-1}^2, \dots, \varepsilon_{t-s}^2)'$, and noting that $\partial\mu_t/\partial\boldsymbol{\beta} = \mathbf{x}_t$, $\partial h_t/\partial\boldsymbol{\beta} = -2\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}$, and $\partial h_t/\partial\boldsymbol{\alpha} = \mathbf{z}_t$, it follows from Theorem 2.1 that the optimal component quadratic estimating functions for the parameter vector $\boldsymbol{\theta} = (\beta_1, \dots, \beta_r, \alpha_0, \dots, \alpha_s)' = (\boldsymbol{\beta}', \boldsymbol{\alpha}')'$ are
$$\mathbf{g}_Q^*(\boldsymbol{\beta}) = \frac{1}{(\kappa+2)\left(1 - \frac{\gamma^2}{\kappa+2}\right)} \sum_{t=1}^n \frac{1}{h_t^2} \left(-\left((\kappa+2) h_t \mathbf{x}_t + 2 h_t^{1/2} \gamma \sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) m_t + \left(h_t^{1/2} \gamma\, \mathbf{x}_t + 2 \sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) s_t\right),$$
$$\mathbf{g}_Q^*(\boldsymbol{\alpha}) = \frac{1}{(\kappa+2)\left(1 - \frac{\gamma^2}{\kappa+2}\right)} \sum_{t=1}^n \frac{1}{h_t^2} \left(h_t^{1/2} \gamma\, \mathbf{z}_t m_t - \mathbf{z}_t s_t\right). \tag{3.27}$$
Moreover, the information matrix for $\boldsymbol{\theta} = (\boldsymbol{\beta}', \boldsymbol{\alpha}')'$ is given by
$$\mathbf{I} = \left(1 - \frac{\gamma^2}{\kappa+2}\right)^{-1} \begin{pmatrix} \mathbf{I}_{\boldsymbol{\beta}\boldsymbol{\beta}} & \mathbf{I}_{\boldsymbol{\beta}\boldsymbol{\alpha}} \\ \mathbf{I}_{\boldsymbol{\alpha}\boldsymbol{\beta}} & \mathbf{I}_{\boldsymbol{\alpha}\boldsymbol{\alpha}} \end{pmatrix}, \tag{3.28}$$
where
$$\mathbf{I}_{\boldsymbol{\beta}\boldsymbol{\beta}} = \sum_{t=1}^n \left(\frac{\mathbf{x}_t \mathbf{x}_t'}{h_t} + \frac{4 \left(\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) \left(\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}'\right)}{h_t^2 (\kappa+2)} + \frac{2\gamma \left(\mathbf{x}_t \sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}' + \left(\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) \mathbf{x}_t'\right)}{h_t^{3/2} (\kappa+2)}\right),$$
$$\mathbf{I}_{\boldsymbol{\beta}\boldsymbol{\alpha}} = \mathbf{I}_{\boldsymbol{\alpha}\boldsymbol{\beta}}' = -\sum_{t=1}^n \frac{\left(h_t^{1/2} \gamma\, \mathbf{x}_t + 2 \sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) \mathbf{z}_t'}{h_t^2 (\kappa+2)}, \qquad \mathbf{I}_{\boldsymbol{\alpha}\boldsymbol{\alpha}} = \sum_{t=1}^n \frac{\mathbf{z}_t \mathbf{z}_t'}{h_t^2 (\kappa+2)}. \tag{3.29}$$

It is of interest to note that when the $\varepsilon_t$ are conditionally Gaussian, so that $\gamma = 0$ and $\kappa = 0$, and
$$\mathrm{E}\left[\frac{\left(\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) \mathbf{z}_t'}{h_t^2 (\kappa+2)}\right] = \mathbf{0}, \tag{3.30}$$
the optimal quadratic estimating functions for $\boldsymbol{\beta}$ and $\boldsymbol{\alpha}$, based on the martingale differences $m_t = y_t - \mathbf{x}_t' \boldsymbol{\beta}$ and $s_t = m_t^2 - h_t$, are, respectively, given by
$$\mathbf{g}_Q^*(\boldsymbol{\beta}) = -\sum_{t=1}^n \frac{1}{h_t^2} \left(h_t \mathbf{x}_t m_t - \left(\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) s_t\right), \qquad \mathbf{g}_Q^*(\boldsymbol{\alpha}) = -\sum_{t=1}^n \frac{1}{2 h_t^2} \mathbf{z}_t s_t. \tag{3.31}$$
Moreover, the information matrix for $\boldsymbol{\theta} = (\boldsymbol{\beta}', \boldsymbol{\alpha}')'$ in (3.28) has $\mathbf{I}_{\boldsymbol{\beta}\boldsymbol{\alpha}} = \mathbf{I}_{\boldsymbol{\alpha}\boldsymbol{\beta}} = \mathbf{0}$ and
$$\mathbf{I}_{\boldsymbol{\beta}\boldsymbol{\beta}} = \sum_{t=1}^n \frac{h_t \mathbf{x}_t \mathbf{x}_t' + 2 \left(\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}\right) \left(\sum_{j=1}^s \alpha_j \varepsilon_{t-j} \mathbf{x}_{t-j}'\right)}{h_t^2}, \qquad \mathbf{I}_{\boldsymbol{\alpha}\boldsymbol{\alpha}} = \sum_{t=1}^n \frac{\mathbf{z}_t \mathbf{z}_t'}{2 h_t^2}. \tag{3.32}$$
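As a sketch of how $h_t$ and the Gaussian-case component estimating function for $\boldsymbol{\alpha}$ in (3.31) are evaluated (the function name and the use of precomputed residuals are assumptions of this example):

```python
import numpy as np

# Sketch of the Gaussian-case component estimating function g*_Q(alpha)
# in (3.31): with residuals eps_t = y_t - x_t' beta, h_t = alpha' z_t,
# z_t = (1, eps_{t-1}^2, ..., eps_{t-s}^2)', and s_t = eps_t^2 - h_t,
#   g*_Q(alpha) = -sum_t z_t s_t / (2 h_t^2).

def g_q_alpha(eps, alpha):
    s = len(alpha) - 1                   # ARCH order
    g = np.zeros(s + 1)
    for t in range(s, len(eps)):
        z = np.concatenate(([1.0], eps[t - s:t][::-1] ** 2))
        h = alpha @ z                    # conditional variance h_t
        g -= z * (eps[t] ** 2 - h) / (2.0 * h ** 2)
    return g

# Solving g*_Q(alpha) = 0 (e.g., by a Newton iteration) yields the
# optimal quadratic estimate of (alpha_0, ..., alpha_s).
```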

4. Conclusions

In this paper, we use appropriate martingale differences and derive the general form of the optimal quadratic estimating function for the multiparameter case with dependent observations. We also show that the optimal quadratic estimating function is more informative than the estimating function used in Thavaneswaran and Abraham [2]. Following Lindsay [8], we conclude that the resulting estimates are more efficient in general. Examples based on ACD models, RCA models, doubly stochastic models, and the regression model with ARCH errors are also discussed in some detail. For RCA models and doubly stochastic models, we have shown the superiority of the approach over the CLS method.

References

  1. V. P. Godambe, "The foundations of finite sample estimation in stochastic processes," Biometrika, vol. 72, no. 2, pp. 419–428, 1985.
  2. A. Thavaneswaran and B. Abraham, "Estimation for nonlinear time series models using estimating equations," Journal of Time Series Analysis, vol. 9, no. 1, pp. 99–108, 1988.
  3. U. V. Naik-Nimbalkar and M. B. Rajarshi, "Filtering and smoothing via estimating functions," Journal of the American Statistical Association, vol. 90, no. 429, pp. 301–306, 1995.
  4. A. Thavaneswaran and C. C. Heyde, "Prediction via estimating functions," Journal of Statistical Planning and Inference, vol. 77, no. 1, pp. 89–101, 1999.
  5. S. A. Chandra and M. Taniguchi, "Estimating functions for nonlinear time series models. Nonlinear non-Gaussian models and related filtering methods," Annals of the Institute of Statistical Mathematics, vol. 53, no. 1, pp. 125–141, 2001.
  6. T. Merkouris, "Transform martingale estimating functions," The Annals of Statistics, vol. 35, no. 5, pp. 1975–2000, 2007.
  7. M. Ghahramani and A. Thavaneswaran, "Combining estimating functions for volatility," Journal of Statistical Planning and Inference, vol. 139, no. 4, pp. 1449–1461, 2009.
  8. B. G. Lindsay, "Using empirical partially Bayes inference for increased efficiency," The Annals of Statistics, vol. 13, no. 3, pp. 914–931, 1985.
  9. M. Crowder, "On linear and quadratic estimating functions," Biometrika, vol. 74, no. 3, pp. 591–597, 1987.
  10. R. F. Engle and J. R. Russell, "Autoregressive conditional duration: a new model for irregularly spaced transaction data," Econometrica, vol. 66, no. 5, pp. 1127–1162, 1998.
  11. D. F. Nicholls and B. G. Quinn, "The estimation of random coefficient autoregressive models. I," Journal of Time Series Analysis, vol. 1, no. 1, pp. 37–46, 1980.
  12. D. Tjøstheim, "Some doubly stochastic time series models," Journal of Time Series Analysis, vol. 7, no. 1, pp. 51–72, 1986.
  13. A. N. Shiryayev, Probability, vol. 95 of Graduate Texts in Mathematics, Springer, New York, NY, USA, 1984.

Copyright Β© 2011 Y. Liang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

