Abstract

Bezerra et al. (2008) proposed a new method, based on Yule-Walker equations, to estimate the ARMA spectral model. In this paper, a Bayesian approach is developed for this model by using the noninformative prior proposed by Jeffreys (1967). The Bayesian computations, simulation via Markov Monte Carlo (MCMC) is carried out and characteristics of marginal posterior distributions such as Bayes estimator and confidence interval for the parameters of the ARMA model are derived. Both methods are also compared with the traditional least squares and maximum likelihood approaches and a numerical illustration with two examples of the ARMA model is presented to evaluate the performance of the procedures.

1. Introduction

The spectral estimation of autoregressive moving average (ARMA) is considered a topic of interest in several applied areas of knowledge, for example, engineering, econometrics, and so forth [14]. Different methods have been studied, comprising two focuses, among which, we can point out: (i) the optimal methods, such as the maximum likelihood estimates, which are computational difficult to deal [1, 3, 4]; (ii) the suboptimal methods based on the Yule-Walker equations, which employ linear equations in the parameters estimation [28]. Another type of suboptimal method, proposed by [9], makes a simultaneous estimate of both parameters AR and MA using the reformulated Steiglitz-McBride least-squares algorithm.

The estimation methods for the ARMA process have been widely studied, especially for the separate estimation of parameters, where Yule-Walker equations are used to estimate the AR process parameters, and following this, the parameters MA process are estimated through the Durbin method [3, 4]. Among these methods, the best known are the modified Yule-Walker equations (MYWEs) and the method of least squares Yule-Walker (LSYW). Moreover, these methods provide similar results for ARMA spectral estimation. One of the methods used to make the comparisons is proposed by [8].

Currently, Bayesian inference has been used successfully in practical problems also. In Bayesian inference, the unknown parameter is considered a random variable, thus assuming a probability distribution associated with (prior), which is specified from past data or in the expert opinion.

The main objective of this paper is to show the viability of the Bayesian approach in estimating the parameters of the ARMA process, considering a noninformative prior proposed by Jeffrey [10]. Thus, a comparison between the classical and Bayesian estimation methods is carried out by comparing the estimates of the parameters of the ARMA model and consequently the estimates of the power spectrum. The mean estimates of the parameters and their respective standard deviations are obtained from chains generated by simple Monte Carlo simulations MCMC algorithms.

The paper is organized as follows. In Section 2, the ARMA model and its main features are presented. Section 3 is devoted to the separate estimation methods using Yule-Walker equations, and presents the development of the method proposed, based on the methods described in section. In Section 4, we present a Bayesian analysis considering the Jeffreys and independent normal prior distributions for the parameters. Section 5 deals with applications using Monte Carlo simulations for two numerical examples.

2. The ARMA Spectral Model

Consider an ARMA stationary random process of order , defined by the following difference equation: where , are the parameters of the process and is a white noise with mean zero and variance . Thus, the vector of parameters to be estimated is defined by

We define the power spectral density of ARMA process, in plane-, from the system function, by where , , , are the poles, and , and , , are the zeros. As usual, the stars denote complex conjugation.

The ARMA power spectral density on frequency domain is obtained by substituting into (2.3) resulting in

The function is the Fourier transform of the autocorrelation function , presumably nonnegative and periodical in frequency with period 1 Hz. It is assumed to be band limited to ±0.5 Hz [13].

The problem of ARMA spectral estimation consists initially on selecting the suitable model such that we can estimate the vector of parameters of the process and then replace the estimates value in the spectral density function. This parametrization of the function is obtained by using the vector of parameters .

3. Estimation Methods Using Yule-Walker Equations

There are several methods which use Yule-Walker equations in separate spectral estimation for the ARMA model, as previously mentioned in the introduction. The methods the best known are modified Yule-Walker equations (MYWEs) and the least squares modified Yule-Walker (LSMYW). This uses more than linear equations in the parameters estimation. In both methods, the errors can be weighted in order to stabilize the variance. Moreover, these methods provide similar results for ARMA spectral estimation.

3.1. Least-Squares Yule-Walker Method

This method is an attempt to reduce the variance of the MYWE estimator and improve the quality of their estimates. It is known that the autocorrelation function of the ARMA process is defined as [6]

From the MYWE, initially we will describe the least squares method of the modified Yule-Walker equations (LSYW) for the estimation of the parameters of the AR process, given that the expanded equations of Yule-Walker for the ARMA process can be represented as [24, 6] where is the expanded autocorrelation matrix , which is given by and is the autocorrelations vector given by

The consistent estimates of AR vector parameters can be obtained by expression (3.2). As , there are more equations than unknowns. One error of size in order to estimate the autocorrelations should be introduced given by where and correspond to and estimators, respectively.

It is convenient to use the unbiased autocorrelations to estimate and , so that the bias of approximation error is null. Thus, the method of least squares can be applied to find the vector which minimizes the sum of squares of errors, that is, [2, 6] or, in the matrix form:

Substituting the equation of the errors in (3.7), we obtain The development of the products above in (3.9) one has the following expression results: Taking the partial derivatives in expression (3.10), We can obtain the estimator of the AR process, which is given by

The matrix is usually positive definite, hermitian, and estimator cannot be a minimum-phase estimator [2, 3].

The estimation of AR parameters of the ARMA model through LSYW method can be interpreted as an application of the covariance method considering the set of values [2, 3]. Moreover, the estimated process parameters MA can be obtained applying the method of Durbin [3].

The performance of the LSYW, , is usually better than the modified Yule-Walker equations method (MYWE), where (number of equations), using only equations, but the results will also depend on the kind of spectrum to be estimated [2, 3, 5].

3.2. The Method Using AR Filter in MA Estimates

The new method for separate estimation consists of making a new AR filtering of the signal by using the estimates of the MA parameters obtained with the Durbin method. Thus determining a new signal , proposed by [8], as shown in the Figure 1 and according to the expression:

In Figure 1, the is the system transfer function, and . This procedure is like repeating the second step of the Durbin method. This is the key idea of the separate estimation method [3, 6].

It is known that an AR process (large order) can be an approach for the MA process [14]. In this method, a new signal estimate will be used to obtain a new AR estimate through the LSYW method in order to subsequently determine a new MA estimate through the Durbin method. The key idea of the method proposed is to consist of obtain a new signal, from the AR filtering, which will provide better estimates when the parameters of the ARMA process are reestimated.

4. Bayesian Estimation

In a Bayesian framework, the inference is based on the posterior distribution of parameters , denoted by (), which, in turn, is used for inferences and decisions involving .

The posterior distribution () is obtained from the combination of the information provided by a prior distribution ( ) and the information supplied by the data through the likelihood (). Thus, using Bayes' theorem, the posterior distribution is given by

The prior distribution represents the knowledge or uncertainty state about the parameter before the experiment is run and the posterior distribution describes the updated information about after the data is observed.

For an ARMA(, ) model, we need to estimate the parameters and where (). The likelihood function proposed in [11] for the observations given the vector parameters (, ) can be written as follows: where ,

The likelihood above combined with the prior distribution results in the posterior distribution given by

The values of the variables and error terms are arbitrarily fixed.

To progress with the Bayesian analysis, it is necessary to specify a prior distribution over the parameter space. Different prior distributions can be used in our study according to all currently available information.

If prior information on study parameters is unavailable or does not exist for a device, then initial uncertainty about the parameters can be quantified with a noninformative prior distribution. This is the same to include in the analysis just the information provided by the data.

Therefore, for the ARMA coefficients, we assume that little is known about these parameters such that an uniform distribution can be used as a prior distribution. We also assume a prior that the components of are independent and hence the overall prior for is the product of the priors for its components. Besides, a conventional noninformative prior distribution for the model variance can be represented by Therefore, the joint prior distribution of and has the form: where and .

Other prior specifications also could be considered, as independent informative normal distributions for the components of , that is, ,, ,, with mean and variance specified a prior and an informative prior as the inverse Gamma distribution for the parameter , that is, , with hyperparameters and known. Thus, the joint prior for and would be given by

To obtain the marginal posterior distribution for each parameter from the model, one needs to solve integrals involving the joint posterior density that are not analytically tractable and which standard integral approximations can perform poorly. In this case, we use of Markov chain Monte Carlo (MCMC) approaches to carry out the Bayesian posterior inference. Specifically, we can run an algorithm for simulating a long chain of draws from the posterior distribution, and base inferences on posterior summaries of the parameters or functional of the parameters calculated from the samples. MCMC is essentially Monte Carlo integration using Markov chains.

Constructing such a Markov chain is not difficult. We first describe the Metropolis-Hastings algorithm. This algorithm is due to [12], which is a generalization of the method first proposed by [13].

Let be the distribution of interest. Suppose at time , is chosen by first sampling a candidate point from a proposal distribution .

The candidate is then accepted with probability:

If the candidate point is accepted, the next state becomes . If it is rejected, then and the chain does not move. The proposal distribution is arbitrary and, provided the chain is irreducible and aperiodic, the equilibrium distribution of the chain will be ().

After obtaining a random sample from the MCMC algorithm for each component of , it is important to investigate issues such as convergence and mixing, to determine whether the sample can reasonably be treated as a set of random realizations from the target posterior distribution. Looking at marginal trace plots is the simplest way to examine the output besides formal procedures. This way, all the values of the chain have marginal distribution given by the equilibrium distribution.

For more details of MCMC in a variety ways to construct these chains, see, for example, [1416].

We now describe the MCMC implementation used in our time series framework. We generate a posterior Monte Carlo sample by simulating a Markov chain described as follows:(i)choose starting values and . At step , we draw a new sample ,   conditional on the current sample using the following proposal distributions (and the usual Metropolis-Hastings acceptance rule);(ii)sample a new vector from the multivariate normal distribution where is a diagonal matrix;(iii)the candidate for , denoted by , will be accepted with a probability given by the Metropolis ratio: (iv)sample the new variance from inverse Gamma distribution ;(v)because this proposal is not symmetric, we find

The proposal distribution parameters and were chosen to obtain good mixing of the chains.

5. Numerical Illustration

In this section, we describe the Monte Carlo simulations [17], in order to obtain the estimates of the parameters and the spectral power density corresponding to each of the cited below ARMA processes: ARMA(4,2) [18] and ARMA(4,4) [19]. The simulation and the programs were implemented in the MATLAB and softwares.

The methods used on the simulations are modified Yule-Walker equations (MYWE); least squares modified Yule-Walker equations (LSYW); least squares modified Yule-Walker equations with AR filtering (LSYWS—proposed method); maximum likelihood estimator (MLE); Bayesian estimator considering the noninformative priors as Jeffreys and independence normal priors for the parameters.

In order to evaluate the performance of the methods, the random data have been generated from the same seed and that change it with the sequences, , and is the replications number in each simulation. We carried out this using observations and replications. The random numbers are generated from a standard Gaussian distribution (white noise) with mean equal zero and variance equal one, then generated a signal of the process ARMA.

The modified covariance was the criterion in the Durbin method [3], and the large AR process order was chosen using the criterion described by [20] and the number of equations, , was chosen in accordance with the spectral characteristics to be analyzed. On the other hand, for the spectrum with poles far from the unit circle (UC), as it was given in Example 5.1, a small number was used for , which has one pole near and other far from the unit cycle (UC).

In the first example, the ARMA(4,2) model was used with , to , and in the second example, we used ARMA(4,4) model with to . The dashed curve is theoretical and solid curve is the average of estimates.

Example 5.1. The ARMA(4,2) model presented in [18] is given by

Example 5.2. The ARMA(4,4) model presented in [19] is given by

5.1. Discussion

Firstly, we consider a comparative analysis between the classical methods applied to the ARMA(4,2) model fitted by the data of Example 5.1. Thus, Table 1 shows that the LSYWS method provided the best averages for estimate parameters than other methods, but with a slightly larger variability than the LSYW method for the AR model. The LSYW method produces intermediate estimates but with a superior performance than the MYWE and MLE methods. These facts can be observed through the average estimates of the spectra power in Figures 2 and 3, which shows that the best estimate is really provided by LSYWS method proposed by [8].

Comparing the classical methods and the Bayesian methods studied in this paper, Table 2 shows that the estimates of parameters from both Bayesian methods are similar to the LSYWS method, but with a smaller variability (see Figure 4).

By using the noninformative Jeffreys prior, we obtained a slightly better estimate than the empirical prior, most likely due to the mean prior given by LME does not produce so good results. Tables 3 and 4 present the results of estimation methods MEYW, LSYW, LSYWS, and MLE applied to the ARMA(4,4) model from Example 5.2 (see Figures 5 and 6). In this case, the average estimates of the parameters were approximately equal for all considered methods. Just the standard deviations of the mean estimates showed a difference with the Bayesian standard deviation, see Tables 5 and 6. The standard deviation, with Jeffreys prior, had lower value than the other methods, but with a value slightly smaller than the standard deviation obtained by the MLE method.

This analysis can also be confirmed by plots of the power spectra estimates 2 (see Figures 6 and 7). In this case, this similarity can be due to the ARMA(4,4) has a stable power spectrum and with more parameters to be estimated than case previous.

6. Conclusions

In this article, we describe the Bayesian approach by using the noninformative prior distribution proposed by Jeffreys [10] to estimate the density spectral of the ARMA model. This Bayesian approach is compared with non-Bayesian approaches LSYWS method and LSYW, both based on Yule-Walker equations and least-squares method. Other methods proposed in the literature as MLE and MYWE (Mod-fitted Yule-Walker Equations) are also compared.

Two ARMA models with different orders were introduced to illustrate the proposed methodologies and to examine their performances.

The study showed that the Bayesian approaches produced more accurate estimates than the other approaches by considering ARMA(4,2) model, but similar results are found for ARMA(4,4). This way, we conclude that Bayesian approach for less stable ARMA models is justified, and more stable power spectrums any one procedures could be chosen. The Bayesian approach provides best fitting of spectral applicable for any order of the ARMA model than the other approaches. Finally, the choice of the approaches becomes irrelevant for a moderate large size. In addition, generally speaking, it was seen that the results depend on the spectrum to be estimated.

Acknowledgments

This work is supported by Capes, RH-TVD, CNPq, Faepex/Unicamp, and FUNDUNESP.