About this Journal Submit a Manuscript Table of Contents
Journal of Probability and Statistics
Volume 2012 (2012), Article ID 451076, 31 pages
doi:10.1155/2012/451076
Research Article

Transfer Function Models with Time-Varying Coefficients

1Department of Statistics, Federal University of São Carlos, 13565-905 São Carlos, SP, Brazil
2Department of Statistics, University of São Paulo, 05508-090 São Paulo, SP, Brazil

Received 15 August 2011; Revised 26 January 2012; Accepted 28 January 2012

Academic Editor: Zhidong Bai

Copyright © 2012 Maria Sílvia de A. Moura et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We consider a transfer function model with time-varying coefficients. We propose an estimation procedure, based on the least squares method and wavelet expansions of the time-varying coefficients. We discuss some statistical properties of the estimators and assess the validity of the methodology through a simulation study. We also present an application of the proposed procedure to a real pair of series.

1. Introduction

In this paper, we consider the transfer function model 𝑌 𝑡 , 𝑇 = 𝑚 𝑖 = 1 𝛿 𝑖 𝑡 𝑇 𝑌 𝑡 𝑖 , 𝑇 + 𝑛 𝑗 = 0 𝜔 𝑗 𝑡 𝑇 𝑋 𝑡 𝑗 , 𝑇 + 𝜀 𝑡 , 𝑡 = 1 , , 𝑇 , ( 1 . 1 ) where 𝑇 is the number of observations and 𝜀 𝑡 is i.i.d. ( 0 , 𝜎 2 ) . We assume that the error and the input series are independent. The functions 𝜔 𝑗 ( 𝑢 ) , 𝑗 = 0 , 1 , 2 , , 𝑛 , and 𝛿 𝑗 ( 𝑢 ) , 𝑗 = 1 , 2 , , 𝑚 , are supported on the interval [ 0 , 1 ] and connected to the underlying series by an appropriate rescaling of time, 𝑢 = 𝑡 / 𝑇 .

As noted by Nason and Sapatinas [1], when both time series 𝑋 𝑡 and 𝑌 𝑡 belong to the class of ARMA models, the transfer function models made popular by Box and Jenkins are appropriate. See Box et al. [2].

Model (1.1) is important because it can be used when either or both time series are not stationary, but it is perhaps more suitable when the series exhibit a locally stationary behaviour, for example, in the sense of Dahlhaus [3] or of Nason et al. [4]. In several areas, such as signal analysis (speech analysis in particular) and economics, empirical and theoretical works suggest that models with time-varying coefficients are useful, reflecting, for example, the evolution of the economy over time.

Chiann and Morettin [5, 6] considered a special case of (1.1), namely, 𝑌 𝑡 , 𝑇 = 𝑗 = 0 𝑎 𝑗 𝑡 𝑇 𝑋 𝑡 𝑗 , 𝑇 𝑡 + 𝜎 𝑇 𝜀 𝑡 , ( 1 . 2 ) while Dahlhaus et al. [7] entertained the model 𝑋 𝑡 , 𝑇 = 𝑝 𝑗 = 1 𝑎 𝑗 𝑡 𝑇 𝑋 𝑡 𝑗 , 𝑇 𝑡 + 𝜎 𝑇 𝜀 𝑡 . ( 1 . 3 )

These authors used wavelet expansions of the time-varying coefficients in order to obtain estimates.

Related references are Brockwell et al. [8, 9], who used a state-space approach to transfer function modelling, allowing for nonstationary input and output sequences, and Nason and Sapatinas [1] who show how a nondecimated wavelet packet transform (NDWPT) can be used to model a response time series 𝑌 𝑡 in terms of an explanatory time series 𝑋 𝑡 , both assumed to be nonstationary. Rather than building a model directly between 𝑋 𝑡 and 𝑌 𝑡 some regression-type model is built between 𝑌 𝑡 and an NDWPT version of 𝑋 𝑡 .

We consider the problem of estimating 𝜔 𝑗 ( 𝑢 ) , 𝑗 = 0 , 1 , 2 , , 𝑛 and 𝛿 𝑗 ( 𝑢 ) , 𝑗 = 1 , 2 , , 𝑚 , in time domain, using wavelet expansions. We use least squares to obtain the estimators of the wavelet coefficients. Then the empirical detail coefficients are shrunk before the inverse wavelet transform is applied to obtain the final estimates of 𝜔 𝑗 ( 𝑢 ) and 𝛿 𝑗 ( 𝑢 ) . This results in a nonlinear smoothing procedure. See Section 2 for further details.

We will apply the methodology to real data, namely, a pair of hourly wind speeds recorded at two Welsh Meteorological Office stations, Valley and Aberporth. Here the idea is to relate the target wind speeds (Valley) to wind speeds measured at a nearby reference site (Aberporth). See Nason and Sapatinas [1] for a complete description of this data and Section 6.

The plan of the paper is as follows. In Section 2, we introduce some background material on wavelets. In Section 3, we present the estimation procedure, and in Section 4, we give some properties of the empirical wavelet coefficients. In Section 5, we perform some simulations and in Section 6, we apply the methodology to the data described above. The paper ends with some remarks in Section 7.

2. Wavelets

In this section, we discuss some basic ideas on wavelets. For more details, see the books by Vidakovic [10], Percival and Walden [11], and Nason [12]. From two basic functions, the scaling function 𝜙 ( 𝑥 ) and the wavelet 𝜓 ( 𝑥 ) , we define infinite collections of translated and scaled versions, 𝜙 𝑗 , 𝑘 ( 𝑥 ) = 2 𝑗 / 2 𝜙 ( 2 𝑗 𝑥 𝑘 ) , 𝜓 𝑗 , 𝑘 ( 𝑥 ) = 2 𝑗 / 2 𝜓 ( 2 𝑗 𝑥 𝑘 ) , 𝑗 , 𝑘 = { 0 , ± 1 , ± 2 , } . We assume that { 𝜙 , 𝑘 ( ) } 𝑘 { 𝜓 𝑗 , 𝑘 ( ) } 𝑗 ; 𝑘 forms an orthonormal basis of 𝐿 2 ( ) , for some coarse scale . A key point [13] is that it is possible to construct compactly supported 𝜙 and 𝜓 that generate an orthonormal system and have space-frequency localization, which allows parsimonious representations for wide classes of functions in wavelet series.

In some applications, the functions involved are defined in a compact interval, such as [ 0 , 1 ] . This will be the case of our functions 𝜔 𝑗 ( 𝑢 ) and 𝛿 𝑗 ( 𝑢 ) in (1.1). So it will be necessary to consider an orthonormal system that spans 𝐿 2 ( [ 0 , 1 ] ) . Several solutions were proposed, the most satisfactory one being that by Cohen et al. [14]. As in any proposal, close to the boundaries the basis functions are modified to preserve orthogonality. Cohen et al. [14] use a preconditioned step. Periodized wavelets are defined by 𝜙 𝑗 , 𝑘 ( 𝑥 ) = 𝑛 𝜙 𝑗 , 𝑘 ( 𝑥 𝑛 ) , 𝜓 𝑗 , 𝑘 ( 𝑥 ) = 𝑛 𝜓 𝑗 , 𝑘 ( 𝑥 𝑛 ) , ( 2 . 1 ) and these generate a multiresolution level ladder 𝑉 0 𝑉 1 , in which the spaces 𝑉 𝑗 are generated by the 𝜓 𝑗 , 𝑘 . Negative values of 𝑗 are not necessary, since 𝜙 𝜙 = 0 , 0 = 1 and if 𝑗 0 , 𝜓 𝑗 , 𝑘 ( 𝑥 ) = 2 𝑗 / 2 . See Vidakovic [10] for details and Walter and Cai [15] for a different approach. Other instances are the boundary corrected Meyer wavelets [16]. From now on, we denote the periodized wavelets simply by 𝜓 𝑗 , 𝑘 .

Accordingly, for any function 𝑓 𝐿 2 ( [ 0 , 1 ] ) , we can expand it in an orthogonal series 𝑓 ( 𝑥 ) = 𝛼 0 , 0 𝜙 ( 𝑥 ) + 𝑗 0 𝑘 𝐼 𝑗 𝛽 𝑗 , 𝑘 𝜓 𝑗 , 𝑘 ( 𝑥 ) , ( 2 . 2 ) where we take = 0 and 𝐼 𝑗 = { 𝑘 𝑘 = 0 , , 2 𝑗 1 } . For each 𝑗 , the set 𝐼 𝑗 brings the values 𝑘 , such that 𝛽 𝑗 𝑘 belongs to scale 2 𝑗 . For example, for 𝑗 = 3 , we have 8 wavelet coefficients in scale 23, while for 𝑗 = 2 we have half of this number, namely, 4 coefficients in scale 2 2 . The wavelet coefficients are given by 𝛼 0 , 0 = 𝑓 ( 𝑥 ) 𝜙 ( 𝑥 ) 𝑑 𝑥 , 𝛽 𝑗 , 𝑘 = 𝑓 ( 𝑥 ) 𝜓 𝑗 , 𝑘 ( 𝑥 ) 𝑑 𝑥 . ( 2 . 3 )

Often we consider the sum in (2.2) for a maximum level 𝐽 , 𝑓 ( 𝑥 ) 𝛼 0 0 𝜙 ( 𝑥 ) + 𝐽 1 𝑗 = 0 𝑘 𝐼 𝑗 𝛽 𝑗 𝑘 𝜓 𝑗 𝑘 ( 𝑥 ) , ( 2 . 4 ) in such a way that we approximate 𝑓 in the 𝑉 𝐽 space.

Wavelet shrinkage provides a simple tool for nonparametric function estimation. It is an active research area where the methodology is based on optimal shrinkage estimators for the location parameters. Some references are Donoho and Johnstone [17, 18] and Donoho et al. [19]. In this paper, we focus on the simplest, yet most important shrinkage strategy-wavelet thresholding. Earlier models considered nonparametric regression with i.i.d. normal errors. These were further extended by Johnstone and Silverman [20] for correlated errors. In this context, the paper by Opsomer at al. [21] is a good survey of methods using kernels, splines, and wavelets.

The thresholding technique consists of reducing the noise included in a signal through the application of a threshold to the estimated wavelets coefficients ̂ 𝛽 𝑗 𝑘 . Some commonly used forms are the soft and hard thresholds, given by 𝛿 ( 𝑠 ) ̂ 𝛽 𝑗 𝑘 = | | ̂ 𝛽 , 𝜆 𝑗 𝑘 | | 𝜆 + ̂ 𝛽 s g n 𝑗 𝑘 , 𝛿 ( ) ̂ 𝛽 𝑗 𝑘 = ̂ 𝛽 , 𝜆 𝑗 𝑘 𝐼 | | ̂ 𝛽 𝑗 𝑘 | | , 𝜆 ( 2 . 5 ) respectively. One crucial issue is the choice of the parameter 𝜆 . If the empirical wavelet coefficients ̂ 𝛽 𝑗 𝑘 are assumed to be Gaussian with mean 𝛽 𝑗 𝑘 and constant variance, a popular method is to apply the so-called universal threshold 𝜆 = 𝜎 2 l o g 2 𝐽 , where 𝜎 2 is the noise variance. Other methods include the SURE procedure, cross validation, and Bayesian approaches. See Percival and Walden [11] and Vidakovic [10] for details.

But despite the good properties of these procedures, which are spatially adaptive and nearly optimum over wide classes of function spaces, they are not appropriate when the design is not regular or it is random. Recently, several approaches were proposed dealing with nonregular designs. Delouille [22] introduced the concept of design adapted wavelets, and Kerkyacharian and Picard [23] introduced the warped wavelets. Cai and Brown [24, 25] consider wavelet estimation for models with nonequispaced or random uniform designs, and Sweldens [26] introduced the so-called lifting scheme; now the wavelets are no longer obtained as dilations and translations of some mother wavelet. See also Jansen and Oonincx [27]. Some applications and extensions of these approaches are given in Porto et al. [28, 29] and Morettin and Chiann [30].

In this paper, we will use ordinary wavelets, as in Dahlhaus et al. [7], since these wavelets work well for our purposes. See also a related discussion in Morettin and Chiann [30]. Concerning the choice of the mother wavelet, there are no established rules. In our experience, we agree with Nason and Sapatinas [1], when they say that “how the choice of of wavelets affects the final model and its interpretation is an area of further research.” We will make use in particular of the Daubechies compactly supported wavelets in the simulations and application. Concerning thresholding, we use hard thresholds and the parameter 𝜆 chosen as explained in Sections 4 and 5.

3. Estimators

The aim is to estimate the functions 𝛿 𝑖 ( 𝑢 ) , 𝑖 = 1 , , 𝑚 , and 𝜔 𝑖 ( 𝑢 ) , 𝑖 = 0 , 1 , , 𝑛 , appearing in model (1.1), given 𝑇 observations of the series 𝑋 𝑡 , 𝑇 and 𝑌 𝑡 , 𝑇 . We assume that the orders 𝑚 and 𝑛 are fixed and known. The idea is to expand these functions in wavelet series 𝛿 𝑖 ( 𝑢 ) = 𝛼 ( 𝛿 𝑖 ) 0 0 𝜙 ( 𝑢 ) + 𝐽 1 𝑗 = 0 𝑘 𝐼 𝑗 𝛽 ( 𝛿 𝑖 ) 𝑗 𝑘 𝜓 𝑗 𝑘 𝜔 ( 𝑢 ) , 𝑖 = 1 , , 𝑚 , 𝑖 ( 𝑢 ) = 𝛼 ( 𝜔 𝑖 ) 0 0 𝜙 ( 𝑢 ) + 𝐽 1 𝑗 = 0 𝑘 𝐼 𝑗 𝛽 ( 𝜔 𝑖 ) 𝑗 𝑘 𝜓 𝑗 𝑘 ( 𝑢 ) , 𝑖 = 0 , , 𝑛 . ( 3 . 1 )

The empirical wavelet coefficients are obtained minimizing 𝑇 𝑡 = 𝑣 + 1 𝑌 𝑡 , 𝑇 𝑚 𝑖 = 1 𝛿 𝑖 ( 𝑢 ) 𝑌 𝑡 𝑖 , 𝑇 𝑛 𝑗 = 0 𝜔 𝑗 ( 𝑢 ) 𝑋 𝑡 𝑗 , 𝑇 2 , ( 3 . 2 ) with 𝛿 𝑖 ( 𝑢 ) and 𝜔 𝑖 ( 𝑢 ) replaced by (3.1), 𝑣 = m a x ( 𝑚 , 𝑛 ) . These empirical wavelet coefficients are then modified using a hard threshold, and finally we build estimators of 𝛿 𝑖 ( 𝑢 ) and 𝜔 𝑖 ( 𝑢 ) by applying the inverse wavelet transform to these thresholded coefficients.

For ease of exposition, we restrict our attention from now on to the simple model with 𝑚 = 1 and 𝑛 = 0 , namely, 𝑌 𝑡 , 𝑇 = 𝛿 1 𝑡 𝑇 𝑌 𝑡 1 , 𝑇 + 𝜔 0 𝑡 𝑇 𝑋 𝑡 , 𝑇 + 𝜀 𝑡 . ( 3 . 3 )

So we regress 𝑌 𝑡 , 𝑇 on 𝑋 𝑡 , 𝑇 and 𝑌 𝑡 1 , 𝑇 , using the expansion (3.1) for 𝛿 1 ( 𝑡 / 𝑇 ) and 𝜔 0 ( 𝑡 / 𝑇 ) . Let Δ = 2 𝐽 1 . In matrix notation, we have 𝑌 2 , 𝑇 𝑌 3 , 𝑇 𝑌 𝑇 , 𝑇 = 𝜙 0 0 2 𝑇 𝑌 1 , 𝑇 𝜓 0 0 2 𝑇 𝑌 1 , 𝑇 𝜓 𝐽 1 , Δ 2 𝑇 𝑌 1 , 𝑇 𝜙 0 0 3 𝑇 𝑌 2 , 𝑇 𝜓 0 0 3 𝑇 𝑌 2 , 𝑇 𝜓 𝐽 1 , Δ 3 𝑇 𝑌 2 , 𝑇 𝜙 0 0 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 𝜓 0 0 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 𝜓 𝐽 1 , Δ 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 𝛼 ( 𝛿 1 ) 0 0 𝛽 𝛿 1 0 0 𝛽 ( 𝛿 1 ) 𝐽 1 , Δ + 𝜙 0 0 2 𝑇 𝑋 2 , 𝑇 𝜓 0 0 2 𝑇 𝑋 2 , 𝑇 𝜓 𝐽 1 , Δ 2 𝑇 𝑋 2 , 𝑇 𝜙 0 0 3 𝑇 𝑋 3 , 𝑇 𝜓 0 0 3 𝑇 𝑋 3 , 𝑇 𝜓 𝐽 1 , Δ 3 𝑇 𝑋 3 , 𝑇 𝜙 0 0 𝑇 𝑇 𝑋 𝑇 , 𝑇 𝜓 0 0 𝑇 𝑇 𝑋 𝑇 , 𝑇 𝜓 𝐽 1 , Δ 𝑇 𝑇 𝑋 𝑇 , 𝑇 𝛼 ( 𝜔 0 ) 0 0 𝛽 ( 𝜔 0 ) 0 0 𝛽 ( 𝜔 0 ) 𝐽 1 , Δ + 𝜀 2 𝜀 3 𝜀 𝑇 ( 3 . 4 )

or 𝚽 𝐘 = 𝑌 𝚿 𝑌 ( 0 ) 𝚿 𝑌 ( 1 ) 𝚿 𝑌 ( 𝐽 1 ) 𝜷 ( 𝛿 1 ) + 𝚽 𝑋 𝚿 𝑋 ( 0 ) 𝚿 𝑋 ( 1 ) 𝚿 𝑋 ( 𝐽 1 ) 𝜷 ( 𝜔 0 ) + 𝜀 , ( 3 . 5 ) where 𝑌 𝐘 = 2 , 𝑇 𝑌 3 , 𝑇 𝑌 𝑇 , 𝑇 , 𝜷 ( 𝛿 1 ) = 𝛼 ( 𝛿 1 ) 0 0 𝛽 ( 𝛿 1 ) 0 0 𝛽 ( 𝛿 1 ) 1 0 𝛽 ( 𝛿 1 ) 1 1 𝛽 ( 𝛿 1 ) 𝐽 1 , 0 𝛽 ( 𝛿 1 ) 𝐽 1 , Δ , 𝜷 ( 𝜔 0 ) = 𝛼 ( 𝜔 0 ) 0 0 𝛽 ( 𝜔 0 ) 0 0 𝛽 ( 𝜔 0 ) 1 0 𝛽 ( 𝜔 0 ) 1 1 𝛽 ( 𝜔 0 ) 𝐽 1 , 0 𝛽 ( 𝜔 0 ) 𝐽 1 , Δ , 𝜀 𝜀 = 2 𝜀 3 𝜀 𝑇 . ( 3 . 6 )

The vector Φ 𝑌 is given by 𝚽 𝑌 = 𝜙 0 0 2 𝑇 𝑌 1 , 𝑇 𝜙 0 0 3 𝑇 𝑌 2 , 𝑇 𝜙 0 0 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 , ( 3 . 7 ) and the matrices Ψ 𝑌 ( 𝑚 ) are given, for 0 𝑚 𝐽 1 , by 𝚿 𝑌 ( 𝑚 ) = 𝜓 𝑚 0 2 𝑇 𝑌 1 , 𝑇 𝜓 𝑚 1 2 𝑇 𝑌 1 , 𝑇 𝜓 𝑚 , 2 𝑚 1 2 𝑇 𝑌 1 , 𝑇 𝜓 𝑚 0 3 𝑇 𝑌 2 , 𝑇 𝜓 𝑚 1 3 𝑇 𝑌 2 , 𝑇 𝜓 𝑚 , 2 𝑚 1 3 𝑇 𝑌 2 , 𝑇 𝜓 𝑚 0 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 𝜓 𝑚 1 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 𝜓 𝑚 , 2 𝑚 1 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 . ( 3 . 8 )

The vector Φ 𝑋 is given by 𝚽 𝑋 = 𝜙 0 0 2 𝑇 𝑋 2 , 𝑇 𝜙 0 0 3 𝑇 𝑋 3 , 𝑇 𝜙 0 0 𝑇 𝑇 𝑋 𝑇 , 𝑇 , ( 3 . 9 ) and the matrices Ψ 𝑋 ( 𝑚 ) are given, for 0 𝑚 𝐽 1 , by 𝚿 𝑋 ( 𝑚 ) = 𝜓 𝑚 0 2 𝑇 𝑋 2 , 𝑇 𝜓 𝑚 1 2 𝑇 𝑋 2 , 𝑇 𝜓 𝑚 , 2 𝑚 1 2 𝑇 𝑋 2 , 𝑇 𝜓 𝑚 0 3 𝑇 𝑋 3 , 𝑇 𝜓 𝑚 1 3 𝑇 𝑋 3 , 𝑇 𝜓 𝑚 , 2 𝑚 1 3 𝑇 𝑋 3 , 𝑇 𝜓 𝑚 0 𝑇 𝑇 𝑋 𝑇 , 𝑇 𝜓 𝑚 1 𝑇 𝑇 𝑋 𝑇 , 𝑇 𝜓 𝑚 , 2 𝑚 1 𝑇 𝑇 𝑋 𝑇 , 𝑇 . ( 3 . 1 0 )

Let Ψ 𝑋 = [ Φ 𝑋 Ψ 𝑋 ( 0 ) Ψ 𝑋 ( 𝐽 1 ) ] and Ψ 𝑌 = [ Φ 𝑌 Ψ 𝑌 ( 0 ) Ψ 𝑌 ( 𝐽 1 ) ] .

It follows easily that the least squares estimators of the coefficients are then given by 𝜷 ( 𝛿 1 ) 𝜷 ( 𝜔 0 ) = Ψ 𝑌 Ψ 𝑌 Ψ 𝑌 Ψ 𝑋 Ψ 𝑋 Ψ 𝑌 Ψ 𝑌 Ψ 𝑋 1 Ψ 𝑌 𝐘 Ψ 𝑋 𝐘 . ( 3 . 1 1 )

Having obtained the estimates given by (3.11), we plug them in (3.1), resulting in linear estimates ̂ 𝛿 𝑖 ( 𝑢 ) and 𝜔 𝑖 ( 𝑢 ) . Finally, nonlinear smoothed estimators ̃ 𝛿 𝑖 ( 𝑢 ) and 𝜔 𝑖 ( 𝑢 ) , respectively, are obtained applying some threshold to the empirical detail coefficients ̂ 𝛽 ( 𝑖 ) 𝑗 𝑘 and taking the inverse wavelet transform.

We might think of using other procedures, as the LASSO (least absolute shrinkage and selection operator) procedure of Tibshirani [31] or the nonnegative garotte of Breiman [32]. For the case of orthogonal design, LASSO has a relationship with the soft shrinkage procedure of Donoho and coworkers. But using these procedures would make the derivations of the properties of the estimators above considerably more difficult, or even impossible.

4. Properties of Empirical Coefficients

Now we present some properties of the linear empirical wavelet coefficients. The techniques used to prove the results are quite involved and are based on function space theory. Basically we adapt the results of Dahlhaus et al. [7] for the transfer function model (1.1). The proofs may be obtained from the authors upon request or from the site of the journal.

We assume that functions 𝜔 𝑖 ( 𝑢 ) , 𝑖 = 0 , 1 , , 𝑛 and 𝛿 𝑖 ( 𝑢 ) , 𝑖 = 1 , , 𝑚 belong to some function spaces 𝑖 given by 𝑖 = 𝑓 𝑖 = 𝛼 ( 𝑖 ) 0 0 𝜙 + 𝑗 , 𝑘 𝛽 ( 𝑖 ) 𝑗 𝑘 𝜓 𝑗 𝑘 𝛼 ( 𝑖 ) 0 0 𝐶 𝑖 1 , 𝛽 ( 𝑖 ) . . , 𝑝 , 𝑞 𝐶 𝑖 2 , ( 4 . 1 ) where 𝛽 ( 𝑖 ) . . 𝑖 , 𝑝 𝑖 , 𝑞 𝑖 = 𝑗 0 2 𝑗 𝑠 𝑖 𝑝 𝑖 𝑘 𝐼 𝑗 | | | 𝛽 ( 𝑖 ) 𝑗 𝑘 | | | 𝑝 𝑖 𝑞 𝑖 / 𝑝 𝑖 1 / 𝑞 𝑖 , ( 4 . 2 )

𝑠 𝑖 = 𝑖 + 1 / 2 1 / 𝑝 𝑖 . Here, 𝑖 is the smoothness degree of each space, 𝑝 𝑖 and 𝑞 𝑖 ( 1 𝑝 𝑖 , 𝑞 𝑖 ) specify the norm, and 𝐶 𝑖 1 and 𝐶 𝑖 2 are positive constants. For these function spaces, the following result is valid [19]: s u p 𝑓 𝑖 𝑖 𝑗 𝐽 𝑘 | | | 𝛽 ( 𝑖 ) 𝑗 𝑘 | | | 2 2 = 𝑂 2 𝐽 ̃ 𝑠 𝑖 , ( 4 . 3 ) where ̃ 𝑠 𝑖 = 𝑖 + 1 / 2 1 / ̃ 𝑝 𝑖 , with ̃ 𝑝 𝑖 = m i n { 𝑝 𝑖 , 2 } .

These classes contain Besov, Hölder, and 𝐿 2 - Sobolev spaces, see, for example, Vidakovic [10], Dahlhaus et al. [7], and Triebel [33].

It can be shown, see Donoho et al. [19], that the loss incurred by truncating at level 𝐽 is of order 𝑇 2 𝑖 / ( 2 𝑖 + 1 ) if we choose 𝐽 such that 2 𝐽 1 𝑇 1 / 2 2 𝐽 . This is achieved if ̃ 𝑠 𝑖 > 1 .

Concerning the wavelets, we assume that 𝜙 and 𝜓 are compactly supported on [ 0 , 1 ] and have continuous derivatives up to order 𝑟 > 𝑙 , with = m a x 𝑖 . We denote the spectral norm by 2 and the sup norm by .

In order to analyze the statistical behavior of the estimated coefficients, it is convenient to assume that we take expansions of these coefficients as linear combinations of functions in the 𝑉 𝐽 spaces, generated by { 𝜙 𝐽 , 1 , 𝜙 𝐽 , 2 , , 𝜙 𝐽 , 2 𝐽 } . With this basis, we can write (3.3) as 𝑌 𝑡 , 𝑇 = 2 𝐽 𝑖 = 1 𝜁 ( 𝛿 1 ) 𝐽 , 𝑖 𝜙 𝐽 , 𝑖 𝑡 𝑇 𝑌 𝑡 1 , 𝑇 + 2 𝐽 𝑖 = 1 𝜁 ( 𝜔 0 ) 𝐽 , 𝑖 𝜙 𝐽 , 𝑖 𝑡 𝑇 𝑋 𝑡 , 𝑇 + 𝛾 𝑡 , 𝑇 , ( 4 . 4 ) where 𝛾 𝑡 , 𝑇 = 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝛿 1 ) 𝑗 𝑘 𝜓 𝑗 𝑘 𝑡 𝑇 𝑌 𝑡 1 , 𝑇 + 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝜔 0 ) 𝑗 𝑘 𝜓 𝑗 𝑘 𝑡 𝑇 𝑋 𝑡 , 𝑇 + 𝜀 𝑡 . ( 4 . 5 )

Equation (4.4) in matrix form becomes 𝜑 𝐘 = 𝑌 𝜑 𝑋 𝜻 ( 𝛿 1 ) 𝜻 ( 𝜔 0 ) + 𝛾 , ( 4 . 6 ) with 𝜑 𝑌 = 𝜙 𝐽 , 1 2 𝑇 𝑌 1 , 𝑇 𝜙 𝐽 , 2 2 𝑇 𝑌 1 , 𝑇 𝜙 𝐽 , 2 𝐽 2 𝑇 𝑌 1 , 𝑇 𝜙 𝐽 , 1 3 𝑇 𝑌 2 , 𝑇 𝜙 𝐽 , 2 3 𝑇 𝑌 2 , 𝑇 𝜙 𝐽 , 2 𝐽 3 𝑇 𝑌 2 , 𝑇 𝜙 𝐽 , 1 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 𝜙 𝐽 , 2 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 𝜙 𝐽 , 2 𝐽 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 , 𝜑 𝑋 = 𝜙 𝐽 , 1 2 𝑇 𝑋 2 , 𝑇 𝜙 𝐽 , 2 2 𝑇 𝑋 2 , 𝑇 𝜙 𝐽 , 2 𝐽 2 𝑇 𝑋 2 , 𝑇 𝜙 𝐽 , 1 3 𝑇 𝑋 3 , 𝑇 𝜙 𝐽 , 2 3 𝑇 𝑋 3 , 𝑇 𝜙 𝐽 , 2 𝐽 3 𝑇 𝑋 3 , 𝑇 𝜙 𝐽 , 1 𝑇 𝑇 𝑋 𝑇 , 𝑇 𝜙 𝐽 , 2 𝑇 𝑇 𝑋 𝑇 , 𝑇 𝜙 𝐽 , 2 𝐽 𝑇 𝑇 𝑋 𝑇 , 𝑇 . ( 4 . 7 )

In (4.6), 𝜻 ( 𝛿 1 ) and 𝜻 ( 𝜔 0 ) contain the coefficients in the expansion (4.4) and 𝛾 contains lines given by (4.5).

Analogously to (3.11), we have the least squares estimates 𝜻 ( 𝛿 1 ) 𝜻 ( 𝜔 0 ) = 𝜑 𝑌 𝜑 𝑌 𝜑 𝑌 𝜑 𝑋 𝜑 𝑋 𝜑 𝑌 𝜑 𝑋 𝜑 𝑋 1 𝜑 𝑌 𝐘 𝜑 𝑋 𝐘 . ( 4 . 8 )

The relationship between 𝜷 1 ) ( 𝛿 𝜷 0 ) ( 𝜔 and 𝜻 1 ) ( 𝛿 𝜻 0 ) ( 𝜔 is 𝜷 ( 𝛿 1 ) 𝜷 ( 𝜔 0 ) 𝜻 = 𝚪 ( 𝛿 1 ) 𝜻 ( 𝜔 0 ) , ( 4 . 9 ) where the Γ is a ( 2 𝐽 + 1 × 2 𝐽 + 1 ) block diagonal matrix. The matrix Γ does the transformation ( 𝛼 ( 𝑖 ) 0 0 , ̂ 𝛽 ( 𝑖 ) 0 0 , ̂ 𝛽 ( 𝑖 ) 1 0 , ̂ 𝛽 ( 𝑖 ) 1 1 ̂ 𝛽 , , ( 𝑖 ) 𝐽 1 , 0 ̂ 𝛽 , , ( 𝑖 ) 𝐽 1 , Δ ̂ 𝜁 ) = Γ ( ( 𝑖 ) 𝐽 , 1 ̂ 𝜁 , , ( 𝑖 ) 𝐽 , 2 𝐽 ) ; so each coefficient estimate ̂ 𝛽 ( 𝑖 ) 𝑗 𝑘 , 𝑖 = 𝛿 1 , 𝜔 0 in (3.11) can be written as ̂ 𝛽 ( 𝑖 ) 𝑗 𝑘 = Γ 𝑖 , 𝑗 𝑘 ̂ 𝜁 and Γ 𝛿 1 , 𝑗 , 𝑘 𝐿 2 = Γ 𝜔 0 , 𝑗 , 𝑘 𝐿 2 = 1 .

Equation (4.6) can be written in the form 𝐘 = 𝚼 𝜻 + 𝛾 , ( 4 . 1 0 ) with Υ = [ 𝜑 𝑌 𝜑 𝑋 ] , 𝜻 = 𝜻 1 ) ( 𝛿 𝜻 0 ) ( 𝜔 and the vector 𝛾 as defined above. The least squares estimator of 𝜻 is given by 𝚼 𝜻 = 𝚼 1 𝚼 𝐘 . ( 4 . 1 1 )

We notice that the error term 𝛾 is not independent from the regressors 𝚼 , which means that the estimator is biased and its squared error is of order 𝑂 𝑝 ( 2 𝐽 𝑇 1 / 2 + b i a s ) . But this order can be improved, as shown below. To take care of the bias, we might use some robust procedure, as in Martin et al. [34], for example.

From (4.10) and (4.11), we obtain 𝜻 = ( 𝚼 𝚼 ) 1 𝚼 ( 𝚼 𝜻 + 𝛾 ) . ( 4 . 1 2 ) If we write 𝛾 = 𝜀 + 𝐒 , we get 𝚼 𝜻 = 𝚼 1 𝚼 ( 𝚼 𝜻 + 𝜀 + 𝐒 ) , ( 4 . 1 3 ) where the matrix 𝐒 is an ( ( 𝑇 1 ) × 1 ) matrix, given by 𝐒 = 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝛿 1 ) 𝑗 𝑘 𝜓 𝑗 𝑘 2 𝑇 𝑌 1 , 𝑇 + 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝜔 0 ) 𝑗 𝑘 𝜓 𝑗 𝑘 2 𝑇 𝑋 2 , 𝑇 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝛿 1 ) 𝑗 𝑘 𝜓 𝑗 𝑘 3 𝑇 𝑌 2 , 𝑇 + 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝜔 0 ) 𝑗 𝑘 𝜓 𝑗 𝑘 3 𝑇 𝑋 3 , 𝑇 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝛿 1 ) 𝑗 𝑘 𝜓 𝑗 𝑘 𝑇 𝑇 𝑌 𝑇 1 , 𝑇 + 𝑗 𝐽 𝑘 𝐼 𝑗 𝛽 ( 𝜔 0 ) 𝑗 𝑘 𝜓 𝑗 𝑘 𝑇 𝑇 𝑋 𝑇 , 𝑇 . ( 4 . 1 4 )

We now state results on the squared error and mean squared error of the linear empirical wavelet coefficients. The following assumptions are needed.(A1) The wavelets 𝜙 and 𝜓 are compactly supported on [ 0 , 1 ] and have continuous derivatives up to order 𝑟 > 𝑙 , with = m a x 𝑖 .(A2) In the estimation procedure, we have used first a linear estimator, truncating the wavelet expansion at scale 2 𝐽 . In order to get functions with the appropriate smoothness, 𝐽 was chosen such that ̃ 𝑠 𝑖 > 1 .(A3) The matrix 𝚼 in (4.10) satisfies 𝐸 ( 𝚼 𝚼 ) 1 𝐿 2 + 𝛿 2 = 𝑂 ( 𝑇 2 𝛿 ) , for some 𝛿 > 0 .

The Daubechies Extremal Phase wavelets 𝐷 2 𝑁 , for an appropriate choice of 𝑁 , satisfy (A1). See Härdle et al. [35] for details. The number of continuous derivatives has a connection with the number of null moments of 𝜓 . Also, the support of 𝜓 is contained in the interval [ 𝑁 + 1 , 𝑁 ] .

Since 𝛽 𝑗 𝑘 is the coefficient of the regressor 𝜓 𝑗 𝑘 , the squared error of the least squares estimator is ̂ 𝛽 𝑗 𝑘 𝛽 𝑗 𝑘 2 = 𝑂 𝑝 ( 𝑇 1 ) , which implies that the square error ̂ 𝛿 ( 𝑢 ) 𝛿 ( 𝑢 ) 2 = 𝑂 𝑝 ( 2 𝐽 𝑇 1 / 2 + b i a s ) , for example. We can improve this rate as in the next result.

Proposition 4.1. Under Assumptions (A1)–(A3), we have that 𝜻 𝜻 2 = 𝑂 𝑝 ( 2 𝐽 𝑇 1 l o g ( 𝑇 ) ) .

This proposition allows us to derive the rate of convergence of the mean square error of the original linear estimators of the wavelet coefficients.

Proposition 4.2. If Assumptions (A1)–(A3) hold, then ̂ 𝛽 𝐸 ( ( 𝑖 ) 𝑗 𝑘 𝛽 ( 𝑖 ) 𝑗 𝑘 ) 2 = 𝑂 ( 𝑇 1 ) holds uniformly in 𝑖 , 𝑘 and 𝑗 < 𝐽 .

5. Some Simulations

We now present three simulation examples for the linear and nonlinear estimates ̂ 𝛿 𝑖 ( ) , 𝑤 𝑖 ( ) and ̃ 𝛿 𝑖 ( ) , 𝑤 𝑖 ( ) , respectively. In the first, the coefficients are very smooth functions, the second presents an intermediary case, and, in the third, the functions are piecewise constants.

The performance of the estimators are assessed via the square root of the mean squared errors (RMSE), namely, R M S E ( 𝛿 ) 𝑗 = 1 𝑇 𝑣 𝑇 𝑡 = 𝑣 + 1 ̂ 𝛿 𝑗 𝑡 𝑇 𝛿 𝑗 𝑡 𝑇 2 1 / 2 , R M S E ( 𝑤 ) 𝑗 = 1 𝑇 𝑣 𝑇 𝑡 = 𝑣 + 1 𝑤 𝑗 𝑡 𝑇 𝑤 𝑗 𝑡 𝑇 2 1 / 2 , ( 5 . 1 ) the same for the nonlinear estimators. Recall that 𝑣 = m a x ( 𝑚 , 𝑛 ) .

Concerning the choice of the wavelet, we have tried the Daubechies Extremal Phase D8 and D12, the Daubechies Least Asymmetric S8 and S12 and the Haar wavalets. The best results were obtained with the wavelet mentioned in each example. Of course, in a real problem situation, this cannot be done, since we do not know the true function coefficients. In our experience, the Haar and D8 wavelets have proven useful.

Other important issues are the choice of 𝐽 , how to estimate 𝜎 2 , the noise variance, and the choice of the threshold parameter 𝜆 . Concerning 𝐽 , we have used the rule specified in Section 4, below (4.3), resulting 𝐽 = 6 for all examples. The variance was estimated using the empirical wavelet coefficients in all scales. Hard thresholding was used with the parameter 𝜆 chosen as 𝜆 = 𝜎 2 l o g ( 2 𝐽 ) .

In all cases, the input time series was generated as an AR(2) model, 𝑋 𝑡 , 𝑇 = 𝑎 1 ( 𝑡 / 𝑇 ) 𝑋 ( 𝑡 1 ) , 𝑇 + 𝑎 2 ( 𝑡 / 𝑇 ) 𝑋 ( 𝑡 2 ) , 𝑇 + 𝜍 𝑡 , where 𝜍 𝑡 are independent, normally distributed random variables, with zero mean and variance one, for 𝑡 = 1 , 2 , , 𝑇 , and the coefficients are given by 𝑎 1 𝑎 ( 𝑢 ) = 1 . 6 9 , 0 < 𝑢 0 . 6 , 0 . 3 , 0 . 6 < 𝑢 1 , 2 ( 𝑢 ) = 0 . 8 1 0 < 𝑢 1 . ( 5 . 2 )

Example 5.1. Let 𝑚 = 2 and 𝑛 = 0 , so the transfer function model is 𝑌 𝑡 , 𝑇 = 𝛿 1 𝑡 𝑇 𝑌 𝑡 1 , 𝑇 + 𝛿 2 𝑡 𝑇 𝑌 𝑡 2 , 𝑇 + 𝜔 0 𝑡 𝑇 𝑋 𝑡 , 𝑇 + 𝜖 𝑡 , 𝑡 = 1 , 2 , , 𝑇 . ( 5 . 3 )

We choose 𝜔 0 ( 𝑢 ) , 𝛿 1 ( 𝑢 ) , and 𝛿 2 ( 𝑢 ) as 𝜔 0 ( 𝑢 ) = 0 . 9 c o s ( 2 𝜋 𝑢 + 𝜋 ) , 𝛿 1 ( 𝑢 ) = 0 . 5 s i n ( 2 𝜋 𝑢 ) , and 𝛿 2 ( 𝑢 ) = 0 . 4 c o s ( 2 𝜋 𝑢 + 𝜋 / 4 ) .

For the simulation, we generated 𝑇 = 2 0 4 8 data values for 𝑋 𝑡 , 𝑇 and 𝑌 𝑡 , 𝑇 with 5000 replicates. For this example, the Daubechies D12 wavelet is employed for our estimation procedure.

Figure 1 presents the input and output series, and Figure 2 presents the true and estimated coefficients from a typical sample. The boxplots for the 5000 RMSE values are presented in Figure 3. We see that both types of estimators perform well, but clearly the nonlinear estimator perform, better, both visually and in RMSE sense.

fig1
Figure 1: Example 5.1: (a) input series and (b) output series.
fig2
Figure 2: Example 5.1: True parameters and estimates using the Daubechies D12 wavelet: (a) 𝛿 1 , (b) 𝛿 2 , (c) 𝜔 0 , (d) ̂ 𝛿 1 , (e) ̂ 𝛿 2 (f) 𝜔 0 , (g) ̃ 𝛿 1 , (h) ̃ 𝛿 2 (i) 𝜔 0 .
451076.fig.003
Figure 3: Example 5.1: Boxplots of the 5000 RMSE-values using estimators with the Daubechies D12 wavelet, for linear estimates (a) ̂ 𝛿 1 ( ) , (c) ̂ 𝛿 2 , (e) 𝑤 0 ( ) and nonlinear estimators (b) ̃ 𝛿 1 ( ) , (d) ̃ 𝛿 2 , (f) 𝑤 0 ( ) .

In Figure 4, we use bootstrap with 𝐵 = 3 0 0 replications to see that the true transfer function coefficients lie within the 9 5 % confidence regions. See Section 6 for the description of the bootstrap procedure.

fig4
Figure 4: Example 5.1: Bootstrap 95% confidence regions for linear estimators, (a), (c), and (e) and nonlinear estimators, (b), (d), and (f), with true curves inside.

In Figures 17 and 18 of the appendix, we show histograms for linear and nonlinear estimators of the parameter 𝛿 1 at some points of the interval [ 0 , 1 ] . Notice that the normal approximation for the estimators is a possibility to be considered under the theoretical point of view.

Example 5.2. Let 𝑚 = 1 and 𝑛 = 0 ; so the transfer function model is 𝑌 𝑡 , 𝑇 = 𝛿 1 𝑡 𝑇 𝑌 𝑡 1 , 𝑇 + 𝜔 0 𝑡 𝑇 𝑋 𝑡 , 𝑇 + 𝜖 𝑡 , 𝑡 = 1 , 2 , , 𝑇 . ( 5 . 4 )

We choose 𝜔 0 ( 𝑢 ) and 𝛿 1 ( 𝑢 ) as 𝜔 0 ( 𝑢 ) = 9 c o s ( 2 𝜋 𝑢 + 𝜋 ) , 𝑢 ( 0 , 1 ] ; 𝛿 1 ( 𝑢 ) = 0 . 8 c o s ( 1 . 5 c o s ( 4 𝜋 𝑢 + 𝜋 ) ) , for 0 < 𝑢 0 . 2 5 and 0 . 7 5 < 𝑢 1 , and 𝛿 1 ( 𝑢 ) = 0 . 8 c o s ( 3 c o s ( 4 𝜋 𝑢 + 𝜋 / 2 ) ) , for 0 . 2 5 < 𝑢 0 . 7 5 .

As in the Example 5.1, we generated 𝑇 = 2 0 4 8 data values for 𝑋 𝑡 , 𝑇 and 𝑌 𝑡 , 𝑇 with 5000 replicates and the Daubechies Extremal Phase wavelet D8 is employed.

From a typical sample, the input and output series are shown in Figure 5 and the estimated coefficients in Figure 6. The boxplots for the 5000 RMSE values are presented in Figure 7. We see that, visually, both types of estimators perform well, but the thresholded estimator is better for the smoother parameter 𝜔 0 in terms of RMSE. This is a well-known fact, see, for example, Percival and Walden [11]. In Figure 8, we use bootstrap with 𝐵 = 3 0 0 replications to see that the true transfer function coefficients lie within the 9 5 % confidence regions.

fig5
Figure 5: Example 5.2: (a) input series and (b) output series.
fig6
Figure 6: Example 5.2: True parameters and estimates using the Daubechies D8 wavelet: (a) 𝛿 1 , (b) 𝜔 0 , (c) ̂ 𝛿 1 , (d) 𝜔 0 , (e) ̃ 𝛿 1 , (f) 𝜔 0 .
451076.fig.007
Figure 7: Example 5.2: Boxplots of the 5000 RMSE-values using estimators with the Daubechies D8 wavelet, for linear estimates (a) ̂ 𝛿 1 ( ) , (c) 𝑤 0 ( ) , and nonlinear estimators (b) ̃ 𝛿 1 ( ) , (d) 𝑤 0 ( ) .
fig8
Figure 8: Example 5.2: Bootstrap 9 5 % confidence regions for linear estimators, (a) and (c), and nonlinear estimators, (b) and (d), with true curves inside.

In Figures 6(c) and 6(e), we see some unwanted artifacts next to discontinuity points. These do not disappear, even with expansions with many terms. But this Gibbs effect is less severe than in Fourier expansions. See Vidakovic [10] and Percival and Walden [36] for details.

In Figures 19 and 20 of the appendix, we show histograms for linear and nonlinear estimators of the parameter 𝛿 1 at some points of the interval [ 0 , 1 ] .

Example 5.3. Let Let 𝑚 = 1 and 𝑛 = 0 ; so the transfer function model is as in Example 5.2.

We choose the coefficients 𝜔 0 ( 𝑢 ) and 𝛿 1 ( 𝑢 ) as piecewise constant functions: 𝜔 0 ( 𝑢 ) = 2 , for 0 < 𝑢 0 . 5 and 𝜔 0 ( 𝑢 ) = 2 , for 0 . 5 < 𝑢 1 ; 𝛿 1 ( 𝑢 ) = 0 . 6 , for 0 < 𝑢 0 . 2 5 or 0 . 5 0 < 𝑢 0 . 7 5 , and 𝛿 1 ( 𝑢 ) = 0 . 5 , for 0 . 2 5 < 𝑢 0 . 5 or 0 . 7 5 < 𝑢 1 .

We simulated 𝑇 = 2 0 4 8 data values for 𝑋 𝑡 , 𝑇 and 𝑌 𝑡 , 𝑇 with 5000 replicates, and, due to the nature of the functions, the Haar wavelet is appropriate in this situation.

Figure 9 presents the input and output series, and Figure 10 presents the true and estimated coefficients from a typical sample. The boxplots for the 5000 RMSE values are presented in Figure 11. We see that both types of estimators perform well, but clearly the nonlinear estimator perform better, both visually and in RMSE sense.

fig9
Figure 9: Example 5.3: (a) input series and (b) output series.
fig10
Figure 10: Example 5.3: True parameters and estimates using the Haar wavelet: (a) 𝛿 1 , (b) 𝜔 0 , (c) ̂ 𝛿 1 , (d) 𝜔 0 , (e) ̃ 𝛿 1 , (f) 𝜔 0 .
451076.fig.0011
Figure 11: Example 5.3: Boxplots of the 5000 RMSE-values using estimators with the Haar wavelet, for linear estimates (a) ̂ 𝛿 1 ( ) , (c) 𝑤 0 ( ) and nonlinear estimators (b) ̃ 𝛿 1 ( ) , (d) 𝑤 0 ( ) .

In Figure 12, we use bootstrap with 𝐵 = 3 0 0 replications to see that the true transfer function coefficients lie within the 95% confidence regions.

fig12
Figure 12: Example 5.3: Bootstrap 9 5 % confidence regions for linear estimators, (a) and (c), and nonlinear estimators, (b) and (d), with true curves inside.

In Figures 21 and 22 of the appendix, we present histograms for linear and nonlinear estimators of 𝛿 1 at some points of the interval [ 0 , 1 ] .

6. An Application

The data that we analyze are hourly wind speeds recorded from 00:00 on 6th April 1995 at two Welsh Meteorological Office stations: Valley and Aberporth, with 2048 observations. Valley is located approximately 120 km north of Aberporth, and they are mostly separated by Cardigan Bay. Part of these data were analysed by Nason and Sapatinas [1]. Figure 13 shows the data we analyze. We are grateful to N&M Wind Power for providing the data.

fig13
Figure 13: Hourly wind speeds from 00:00 on 6th April 1995: (a) Aberporth (b) Valley.

In this application, our aim is to model Valley’s wind speeds, { 𝑌 𝑡 , 𝑇 } , in terms of its lagged values and present and lagged values of wind speeds at Aberporth, { 𝑋 𝑡 , 𝑇 } . We should remember that both series are not stationary and so classical methods should not be used. We fitted four potential models, namely, 𝑀 1 𝑦 𝑡 , 𝑇 = 𝛿 1 𝑡 𝑇 𝑦 𝑡 1 , 𝑇 + 𝜔 0 𝑡 𝑇 𝑥 𝑡 , 𝑇 + 𝜖 𝑡 , 𝑀 2 𝑦 𝑡 , 𝑇 = 𝛿 1 𝑡 𝑇 𝑦 𝑡 1 , 𝑇 + 𝜔 0 𝑡 𝑇 𝑥 𝑡 , 𝑇 + 𝜔 1 𝑡 𝑇 𝑥 𝑡 1 , 𝑇 + 𝜖 𝑡 , 𝑀 3 𝑦 𝑡 , 𝑇 = 𝛿 1 𝑡 𝑇 𝑦 𝑡 1 , 𝑇 + 𝛿 2 𝑡 𝑇 𝑦 𝑡 2 , 𝑇 + 𝜔 0 𝑡 𝑇 𝑥 𝑡 , 𝑇 + 𝜖 𝑡 , 𝑀 4 𝑦 𝑡 , 𝑇 = 𝛿 1 𝑡 𝑇 𝑦 𝑡 1 , 𝑇 + 𝛿 2 𝑡 𝑇 𝑦 𝑡 2 , 𝑇 + 𝜔 0 𝑡 𝑇 𝑥 𝑡 , 𝑇 + 𝜔 1 𝑡 𝑇 𝑥 𝑡 1 , 𝑇 + 𝜖 𝑡 . ( 6 . 1 )

The Daubechies Extremal Phase D8 wavelet was used in the computations. The model 𝑀 4 was marginally superior (in terms of RMSE) to model 𝑀 3 , so it was discarded. The mean residual sums of squares for 𝑀 1 , 𝑀 2 , and 𝑀 3 are 0.02258, 0.02186, and 0.02177, respectively, for linear estimation. For nonlinear estimation, the figures are 0.02391, 0.02359, and 0.02178, respectively. The final coefficient estimates for model 𝑀 3 are shown in Figure 14.

fig14
Figure 14: Model 𝑀 3 : (a) ̂ 𝛿 1 ( 𝑢 ) (full line) and ̃ 𝛿 1 ( 𝑢 ) (dashed line), (b) ̂ 𝛿 2 ( 𝑢 ) (full line) and ̃ 𝛿 2 ( 𝑢 ) (dashed line), (c) 𝜔 0 ( 𝑢 ) (full line) and 𝜔 0 ( 𝑢 ) (dashed line).

For Model 𝑀 3 , Figures 15 and 16 show the 9 5 % confidence regions based on bootstrap, using linear and nonlinear estimates, respectively.

fig15
Figure 15: Model 𝑀 3 : Bootstrap 95 % confidence regions for (a) 𝛿 1 ( 𝑢 ) , (b) 𝛿 2 ( 𝑢 ) , (c) 𝜔 0 ( 𝑢 ) , using linear estimates.
fig16
Figure 16: Model 𝑀 3 : Bootstrap 95 % confidence regions for (a) 𝛿 1 ( 𝑢 ) , (b) 𝛿 2 ( 𝑢 ) , (c) 𝜔 0 ( 𝑢 ) , using nonlinear estimates.
451076.fig.0017
Figure 17: Example 5.1: Histograms of linear estimate of 𝛿 1 at some points of [0,1].
451076.fig.0018
Figure 18: Example 5.1: Histograms of nonlinear estimate of 𝛿 1 at some points of [0, 1].
451076.fig.0019
Figure 19: Example 5.2: Histograms of linear estimate of 𝛿 1 at some points of [0, 1].
451076.fig.0020
Figure 20: Example 5.2: Histograms of nonlinear estimate of 𝛿 1 at some points of [0, 1].
451076.fig.0021
Figure 21: Example 5.3: Histograms of linear estimate of 𝛿 1 at some points of [0, 1].
451076.fig.0022
Figure 22: Example 5.3: Histograms of nonlinear estimate of 𝛿 1 at some points of [0, 1].

The bootstrap procedure works as follows:(a)using the real series, we obtain the estimated transfer function coefficients;(b)we obtain the fitted series 𝑌 𝑡 , 𝑇 using these estimated coefficients;(c)we obtain the residuals from the fitted model;(d)we resample these residuals and obtain bootstrap samples of the 𝑌 𝑡 , 𝑇 series;(e)we obtain bootstrap estimators of the transfer function coefficients.

We used 𝐵 = 3 0 0 bootstrap replications of ̂ 𝛿 1 , ̂ 𝛿 2 , 𝜔 0 , and ̃ 𝛿 1 , ̃ 𝛿 2 , 𝜔 0 , to obtain bootstrap standard errors and the 95 % confidence regions. We also fitted the model 𝑀 3 with constant coefficients given an MSE value of 0.02354, indicating that the model with varying coefficients is better. We see that is not possible to plot constant lines totally within these regions, showing that, in fact, the coefficients are time varying.

7. Further Remarks

In this paper, we have proposed an estimation procedure for a transfer function model with time-varying coefficients. Basically it is a least squares procedure, with the use of wavelets to expand the function coefficients. Firstly, linear estimators for the time-varying coefficients are obtained, truncating the wavelet expansion at an appropriate scale. Then thresholds are applied to the empirical wavelet coefficients, and inverse wavelet transformation gives the nonlinear smoothed estimators for the function coefficients.

Simulations have shown that this procedure leads to estimators with a good performance. Some statistical properties for the empirical wavelet coefficients in the case of linear estimators were presented.

Further studies are needed on the estimation of the variance and asymptotic normality of the empirical wavelet coefficients in the linear case, on the rate of convergence for the risk of the nonlinear estimator over the smoothness classes 𝑖 and on issues related to the identification, diagnostics, and forecasting for these nonstationary models.

The issue of forecasting was discussed in the literature for some special and related models. Fryzlewicz et al. [37] addressed the problem of how to forecast locally stationary wavelet processes in the sense of Nason et al. [4] by means of nondecimated wavelets. Van Bellegem and von Sachs [38] consider a special modulated locally stationary process, where a stationary process is modulated by a time-varying variance, so the problem reduces to forecast this variance.

Concerning asymptotic distribution of the empirical wavelet estimators, in the linear case, we believe that this can be obtained under further assumptions, namely on the cumulants of the 𝜀 𝑡 , on the thresholds used and on the roots of the polynomials associated with the model (1.1). This will be the subject of further research.

Appendix

For more details, see Figures 17, 18, 19, 20, 21, and 22.

Acknowledgments

The authors are grateful to two anonymous referees, whose comments improved substantially a previous version of the paper. The first author acknowledges the support of CNPq, and the last three authors acknowledge the suport of Fapesp (Grant 2008/51097-6) and Capes/Procad (Grant 177/2007).

References

  1. G. P. Nason and T. Sapatinas, “Wavelet packet transfer function modelling of nonstationary time series,” Statistics and Computing, vol. 12, no. 1, pp. 45–56, 2002. View at Publisher · View at Google Scholar
  2. G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, Time Series Analysis: Forecastingand Control, Prentice Hall, Englewood Cliffs, NJ, USA, 3rd edition, 1994. View at Zentralblatt MATH
  3. R. Dahlhaus, “Fitting time series models to nonstationary processes,” The Annals of Statistics, vol. 25, no. 1, pp. 1–37, 1997. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  4. G. P. Nason, R. von Sachs, and G. Kroisandt, “Wavelet processes and adaptive estimation of the evolutionary wavelet spectrum,” Journal of the Royal Statistical Society. Series B, vol. 62, no. 2, pp. 271–292, 2000. View at Publisher · View at Google Scholar
  5. C. Chiann and P. A. Morettin, “Estimation of time varying linear systems,” Statistical Inference for Stochastic Processes, vol. 2, no. 3, pp. 253–285, 1999. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  6. C. Chiann and P. A. Morettin, “Time-domain estimation of time-varying linear systems,” Journal of Nonparametric Statistics, vol. 17, no. 3, pp. 365–383, 2005. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  7. R. Dahlhaus, M. H. Neumann, and R. von Sachs, “Nonlinear wavelet estimation of time-varying autoregressive processes,” Bernoulli, vol. 5, no. 5, pp. 873–906, 1999. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  8. P. J. Brockwell, R. A. Davis, and H. Salehi, “Maximum likelihood transfer functionmodelling. Computer Science and Statistics,” in Proceedings of the 22nd Symposium on theInterface, R. Le Page and C. Page, Eds., Springer, New York, NY, USA, 1991. View at Publisher · View at Google Scholar
  9. P. J. Brockwell, R. A. Davis, and H. Salehi, “Transfer-function models with nonstationary input,” in New Directions in Time Series Analysis, Part I, vol. 45 of The IMA Volumes in Mathematics and Its Applications, pp. 65–74, Springer, New York, NY, USA, 1992.
  10. B. Vidakovic, Statistical Modelling by Wavelets, John Wiley & Sons, New York, NY, USA, 1999.
  11. D. B. Percival and A. T. Walden, Wavelet Methods for Time Series Analysis, vol. 4 of Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge, UK, 2000.
  12. G. P. Nason, Wavelet Methods in Statistics with R, Use R!, Springer, New York, NY, USA, 2008. View at Publisher · View at Google Scholar
  13. I. Daubechies, Ten Lectures on Wavelets, vol. 61, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa, USA, 1992.
  14. A. Cohen, I. Daubechies, and P. Vial, “Wavelets on the interval and fast wavelet transforms,” Applied and Computational Harmonic Analysis, vol. 1, no. 1, pp. 54–81, 1993. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  15. G. G. Walter and L. Cai, “Periodic wavelets from scratch,” Journal of Computational Analysis and Applications, vol. 1, no. 1, pp. 25–41, 1999. View at Publisher · View at Google Scholar
  16. Y. Meyer, Ondelettes et Opérateurs. I, Actualités Mathématiques, Hermann, Paris, France, 1990.
  17. D. L. Donoho and I. M. Johnstone, “Ideal spatial adaptation by wavelet shrinkage,” Biometrika, vol. 81, no. 3, pp. 425–455, 1994. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  18. D. L. Donoho and I. M. Johnstone, “Adapting to unknown smoothness via wavelet shrinkage,” Journal of the American Statistical Association, vol. 90, no. 432, pp. 1200–1224, 1995. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  19. D. L. Donoho, I. M. Johnstone, G. Kerkyacharian, and D. Picard, “Wavelet shrinkage: asymptopia?” Journal of the Royal Statistical Society. Series B, vol. 57, no. 2, pp. 301–369, 1995. View at Zentralblatt MATH
  20. I. M. Johnstone and B. W. Silverman, “Wavelet threshold estimators for data with correlated noise,” Journal of the Royal Statistical Society. Series B, vol. 59, no. 2, pp. 319–351, 1997. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  21. J. Opsomer, Y. Wang, and Y. Yang, “Nonparametric regression with correlated errors,” Statistical Science, vol. 16, no. 2, pp. 134–153, 2001. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  22. V. Delouille, Nonparametric stochastic regression using design-adapted wavelets, Ph.D. thesis, Institut de Statistique, Université catholique de Louvain, 2002.
  23. G. Kerkyacharian and D. Picard, “Regression in random design and warped wavelets,” Bernoulli, vol. 10, no. 6, pp. 1053–1105, 2004. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  24. T. T. Cai and L. D. Brown, “Wavelet shrinkage for nonequispaced samples,” The Annals of Statistics, vol. 26, no. 5, pp. 1783–1799, 1998. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  25. T. T. Cai and L. D. Brown, “Wavelet estimation for samples with random uniform design,” Statistics & Probability Letters, vol. 42, no. 3, pp. 313–321, 1999. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  26. W. Sweldens, “The lifting scheme: a construction of second generation wavelets,” SIAM Journal on Mathematical Analysis, vol. 29, no. 2, pp. 511–546, 1998. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  27. M. Jansen and P. Oonincx, Second Generation Wavelets and Applications, Springer, London, UK, 2005.
  28. R. F. Porto, P. A. Morettin, and E. C. Q. Aubin, “Wavelet regression with correlated errors on a piecewise Hölder class,” Statistics & Probability Letters, vol. 78, no. 16, pp. 2739–2743, 2008. View at Publisher · View at Google Scholar
  29. R. F. Porto, P. A. Morettin, and E. C. Q. Aubin, “Smoothed design-adapted Haar wavelet regression with autocorrelated errors,” Journal of Time Series Econometrics, to appear.
  30. P. A. Morettin and C. Chiann, “Estimation of functional-coefficient autoregressive models by wavelet methods,” Current Development in Theory and Applications of Wavelets, vol. 1, no. 3, pp. 309–340, 2007. View at Zentralblatt MATH
  31. R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B, vol. 58, no. 1, pp. 267–288, 1996. View at Zentralblatt MATH
  32. L. Breiman, “Better subset regression using the nonnegative garrote,” Technometrics, vol. 37, no. 4, pp. 373–384, 1995. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  33. H. Triebel, Theory of Function Spaces II, Birkhäuser, Basel, Switzerland, 1992.
  34. R. D. Martin, V. J. Yohai, and R. H. Zamar, “Min-max bias robust regression,” The Annals of Statistics, vol. 17, no. 4, pp. 1608–1630, 1989. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  35. W. Härdle, G. Kerkyacharian, D. Picard, and A. Tsybakov, Wavelets, Approximation, and Statistical Applications, vol. 129 of Lecture Notes in Statistics, Springer, New York, NY, USA, 1998.
  36. D. B. Percival and A. T. Walden, Spectral Analysis for Physical Applications, Cambridge University Press, Cambridge, UK, 1993. View at Publisher · View at Google Scholar
  37. P. Fryzlewicz, S. Van Bellegem, and R. von Sachs, “Forecasting non-stationary time series by wavelet process modelling,” Annals of the Institute of Statistical Mathematics, vol. 55, no. 4, pp. 737–764, 2003. View at Publisher · View at Google Scholar · View at Zentralblatt MATH
  38. S. Van Bellegem and R. von Sachs, “Forecasting economic time series with unconditional time-varying variance,” International Journal of Forecasting, vol. 20, no. 4, pp. 611–627, 2004. View at Publisher · View at Google Scholar