Abstract

We consider two well-known facts in econometrics: (i) the failure of the orthogonality assumption (i.e., the lack of independence between the regressors and the error term), which implies biased and inconsistent Least Squares (LS) estimates, and (ii) the consequences of using nonstationary variables, acknowledged since the seventies: LS may yield spurious estimates when the variables have a trend component, whether stochastic or deterministic. In this work, an optimistic corollary is provided: it is proven that an LS regression of nonstationary and cointegrated variables for which the orthogonality assumption is not satisfied provides estimates that converge to their true values. Monte Carlo evidence suggests that this property holds in samples of a practical size.

1. Introduction

Two well-known facts lie behind this work: (i) the behavior of LS estimates whenever variables are nonstationary and (ii) the failure of the orthogonality assumption between the independent variables and the error term, also in an LS regression. (1) The reappraisal of the impact of unit roots in time-series observations, initiated in the late seventies, had profound consequences for modern econometrics. It became clear that (i) insufficient attention was being paid to trending mechanisms and (ii) most macroeconomic variables are probably nonstationary; such an appraisal gave rise to an extraordinary development that substantially modified the way empirical studies in time-series econometrics are carried out. Research into nonstationarity has advanced significantly since it was reassessed in several important papers, such as those of [15]. (2) The orthogonality problem constitutes another significant research program in econometrics; its formal seed can be traced back to [6], where a proposal is made to solve the identification problem in the estimation of demand and supply curves (see [7]). Typically, in textbooks, the method of Instrumental Variables (IVs) is proposed as a solution to the problem of simultaneous equations and, broadly speaking, whenever there is no independence between the error term and the regressors, that is, when the orthogonality assumption is not satisfied.

This paper aims to study the consequences of using nonstationary variables in an LS regression when the regressor is related to the error term; this is done in a simple regression framework. The specification under particular scrutiny is

$$y_t = \alpha + \beta x_t + u_t. \quad (1.1)$$

To the best of our knowledge, the asymptotics, and the finite-sample properties, of the combination of nonstationarity, nonorthogonality between $x_t$ and $u_{y,t}$, and LS estimation have scarcely been studied (but see [8]). That said, we acknowledge that there are several comprehensive studies concerning the use of IV in the presence of nonstationarity; [9, 10], for example, studied the asymptotics as well as the finite-sample properties of the IV estimator in the context of a cointegrated relationship and proved that even spurious instruments (i.e., $I(1)$ instruments not structurally related to the regressors) provide consistent estimates. Phillips [11] proved that, when there is no structural relationship between the regressand and a single regressor, that is, when there is no cointegration between $y$ and $x$, the use of spurious instruments does not prevent the spurious-regression phenomenon (this is a simple extension of [5]). We derive the asymptotic behavior of LS estimates where the data generating processes (DGPs) consist of two cointegrated variables in which the regressor bears a relationship with the error term. In this case, LS provides consistent estimates. Additionally, some Monte Carlo evidence is presented to assess the adequacy of the asymptotic results in finite samples. In other words, LS estimates of the true DGP parameters, $\mu_y$ and $\beta_y$ (see (2.4) in the next section), do not require information on the parameters of $x$; that is, $x_t$ is weakly exogenous for the estimation of $\mu_y$ and $\beta_y$ as defined by [12].

2. Relevant DGPs

This work aims to study the asymptotic properties of LS estimates when neither the orthogonality nor the stationarity assumption is satisfied. Our approach is twofold: we assume (i) that the variable $x$ is statistically related to the innovations of $y$, as in the problem of independent variables measured with error, and (ii) that the DGPs of both variables are interdependent, as in the problem of simultaneity. All the cases studied consider nonstationary and cointegrated variables (DGP (2.1) is included because it eases the comprehension of the paper):

$$x_t = \mu_x + u_{x,t}, \quad (2.1)$$
$$x_t = \mu_x + x_{t-1} + u_{x,t}, \quad (2.2)$$
$$x_t = X_0 + (\mu_x + \rho_{y1})t + \xi_{x,t-1} + \rho_{y2}\,\xi_{y,t-1}, \quad (2.3)$$
$$y_t = \mu_y + \beta_y x_t + u_{y,t} + \rho_x u_{x,t}, \quad (2.4)$$

where $u_{z,t}$, for $z = x, y$, are independent white noises with zero mean and constant variance $\sigma_z^2$, $\xi_{z,t} = \sum_{i=0}^{t} u_{z,i}$, and $Z_0$ is an initial condition. We could relax the assumptions made on the innovations; for example, we could require them to obey the general conditions in [5, Assumption 1]. Nevertheless, although the asymptotic results would still hold in that case, our primary target is the problem of orthogonality between the regressor and the error term, not autocorrelation or heteroskedasticity. These DGPs allow for an interesting variety of cases (note that the asymptotics of the LS estimates when $x$ and $y$ have been independently generated by any of the first three DGPs can be found, e.g., in [13]; notwithstanding, the authors can provide these cases as Mathematica code upon request).

(1) Bookcase no. 1: the DGP of $x$ is (2.1) and that of $y$ is (2.4) with $\rho_x = 0$. When the variables are generated in this manner, the classical assumptions made in most basic econometrics textbooks are fulfilled: the variables are stationary, the innovations are homoskedastic and independent, and so forth. It is straightforward to show that $\hat{\alpha} \to_p \mu_y$, $\hat{\beta} \to_p \beta_y$, and $\hat{\sigma}^2 \to_p \sigma_y^2$.

(2) Bookcase no. 2: the DGP of $x$ is (2.1) and that of $y$ is (2.4) with $\rho_x \neq 0$. These DGPs also represent a typical textbook example of an orthogonality problem. Although the variables are stationary and the innovations are homoskedastic and independent, the explanatory variable is related to the innovations of $y$. It is well known that the estimates do not converge to their true values; in particular, it is straightforward to show that $\hat{\alpha} \to_p \mu_y - \mu_x\rho_x$, $\hat{\beta} \to_p \beta_y + \rho_x$, and $\hat{\sigma}^2 \to_p \sigma_y^2$.

(3) Bookcase no. 3: the DGP of $x$ is (2.2) and that of $y$ is (2.4) with $\rho_x = 0$. These DGPs make the relationship between $x$ and $y$ cointegrated à la [14]. Once again, the asymptotic results have been known for a long time, and obtaining them does not entail any particular difficulty: $\hat{\alpha} \to_p \mu_y$, $\hat{\beta} \to_p \beta_y$, $\hat{\sigma}^2 \to_p \sigma_y^2$, and $1 - R^2 = O_p(T^{-2})$.

(4) Nonstationarity and non-orthogonality case no. 1: the DGP of $x$ is (2.2) and that of $y$ is (2.4). Notwithstanding the obvious orthogonality problem between $x$ and the error term, the variables remain cointegrated. The artifact employed to induce the orthogonality problem can be thought of as, for example, measurement error in the explanatory variable. One would expect that, in the presence of this problem, the estimates would not converge to their true values. We prove below that, contrary to expectations, this is not the case.

(5) Nonstationarity and non-orthogonality case no. 2: the DGP of $x$ is (2.3) and that of $y$ is (2.4). As in the previous case, there is a cointegrated relationship between $x$ and $y$, only now the orthogonality problem between the regressor and the error term is even more explicit; the artifact employed to induce it can be related to the typical simultaneous-equations case. We also prove below that Least Squares (LS) provides consistent estimates.
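Case no. 4 is easy to reproduce numerically. The following sketch (ours, not the paper's code, with illustrative parameter values) generates $x_t$ from DGP (2.2) and $y_t$ from DGP (2.4) with $\rho_x \neq 0$, so the regressor is correlated with the error term, and then estimates regression (1.1) by LS:

```python
import numpy as np

# Sketch of "nonstationarity and non-orthogonality case no. 1":
# x_t follows DGP (2.2), a random walk with drift, and y_t follows DGP (2.4)
# with rho_x != 0, so Cov(x_t, u_{y,t} + rho_x*u_{x,t}) = rho_x*sigma_x^2 != 0.
# Parameter values are illustrative, not taken from the paper.
rng = np.random.default_rng(12345)
T = 5000
mu_x, mu_y, beta_y, rho_x = 0.5, 1.0, 2.0, 0.8

u_x = rng.standard_normal(T)
u_y = rng.standard_normal(T)
x = mu_x * np.arange(1, T + 1) + np.cumsum(u_x)  # x_t = mu_x + x_{t-1} + u_{x,t}, X_0 = 0
y = mu_y + beta_y * x + u_y + rho_x * u_x        # DGP (2.4)

# LS regression y_t = alpha + beta*x_t + u_t
X = np.column_stack([np.ones(T), x])
alpha_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
# beta_hat lands close to beta_y despite the endogeneity
```

Despite the failure of orthogonality, `beta_hat` is numerically close to `beta_y`, in line with the consistency result proven below.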

The common belief as regards the last two cases is that the failure of the orthogonality assumption induces LS to generate inconsistent estimates, even in a cointegrated relationship. In fact, when the variables are generated as in (2.2)–(2.4), the estimates of the parameters converge to their true values (note that we do not consider the case where the orthogonality assumption fails because of the omission of a relevant variable; [15] studied the latter case and proved that the LS estimates do not converge to their true values). This is proven in Theorem 2.1:

Theorem 2.1. Let $y_t$ be generated by (2.4).

(i) Let $x_t$ be generated by (2.2). The innovations of both DGPs, $u_{z,t}$ for $z = y, x$, are independent white noises with zero mean and constant variance $\sigma_z^2$; use $y_t$ and $x_t$ to estimate regression (1.1) by LS. Then, as $T \to \infty$:
(a) $\hat{\alpha} \to_p \mu_y$;
(b) $\hat{\beta} \to_p \beta_y$;
(c) $T^{-3/2}\, t_{\beta} = O_p(1)$;
(d) $\hat{\sigma}^2 \to_p \sigma_y^2 + \rho_x^2 \sigma_x^2$;
(e) $T^2(1 - R^2) \to_p 12\,(\sigma_y^2 + \rho_x^2 \sigma_x^2)/(\beta_y \mu_x)^2$.

(ii) Let $x_t$ be generated by (2.3). The innovations of both DGPs, $u_{z,t}$ for $z = y, x$, are independent white noises with zero mean and constant variance $\sigma_z^2$; use $y_t$ and $x_t$ to estimate regression (1.1) by LS. Then, as $T \to \infty$:
(a) $\hat{\alpha} \to_p \mu_y$;
(b) $\hat{\beta} \to_p \beta_y$;
(c) $T^{-3/2}\, t_{\beta} = O_p(1)$;
(d) $\hat{\sigma}^2 \to_p \sigma_y^2 + \rho_x^2 \sigma_x^2$;
(e) $T^2(1 - R^2) \to_p 12\,(\sigma_y^2 + \rho_x^2 \sigma_x^2)/[\beta_y(\mu_x + \rho_{y1})]^2$.

Proof. See Appendix A.

These asymptotic results show that a relationship between the innovations of $y_t$ and $x_t$, as stated by DGPs (2.2), (2.3), and (2.4), does not obstruct the consistency of LS estimates when the variables are nonstationary and cointegrated (our results are in line with those of [8]). In other words, the failure of the orthogonality assumption does not preclude adequate asymptotic properties of LS. Furthermore, it can be said that $x_t$ is weakly exogenous for the estimation of $\mu_y$ and $\beta_y$, but not for the estimation of $\sigma^2$. The formula of the variance is noteworthy, and the asymptotic expression of $t_\beta$ depends on the values of $\sigma_x^2$, $\sigma_y^2$, and $\rho_x$.
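The variance limit is easy to check numerically. The following sketch (our illustration, with assumed parameter values) fits regression (1.1) to data generated by (2.2) and (2.4) and compares the residual variance with $\sigma_y^2 + \rho_x^2\sigma_x^2$:

```python
import numpy as np

# Numerical check of Theorem 2.1(i)(d): the LS residual variance converges to
# sigma_y^2 + rho_x^2 * sigma_x^2, not to sigma_y^2 alone.
# Illustrative parameter values, not taken from the paper.
rng = np.random.default_rng(7)
T = 20000
mu_x, mu_y, beta_y, rho_x = 1.0, 4.2, 2.0, 0.8
sigma_x, sigma_y = 1.0, 1.0

u_x = sigma_x * rng.standard_normal(T)
u_y = sigma_y * rng.standard_normal(T)
x = mu_x * np.arange(1, T + 1) + np.cumsum(u_x)  # DGP (2.2), X_0 = 0
y = mu_y + beta_y * x + u_y + rho_x * u_x        # DGP (2.4)

X = np.column_stack([np.ones(T), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
sigma2_hat = np.mean((y - X @ b) ** 2)           # sigma_hat^2 = SSR / T
limit_value = sigma_y**2 + rho_x**2 * sigma_x**2
```

With these values, `sigma2_hat` clusters around `limit_value` rather than around `sigma_y**2`, while the slope estimate still converges to $\beta_y$.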

In order to emphasize the relevance of this result, we modified the DGPs of the variables in an effort to strengthen the link between the DGPs and the literature on simultaneous equations. The modifications are twofold and appear in the following propositions. As in Theorem 2.1, the results in Proposition 2.2 are derived under the assumption that the innovations are i.i.d. processes.

Proposition 2.2. Let $y_t$ and $x_t$ be generated by

$$y_t - \beta_y x_t - \mu_y = u_{y,t}, \qquad -\beta_x y_t + x_t - \mu_x - \gamma_x t = u_{x,t}, \quad (2.5)$$

where $u_{z,t}$, for $z = x, y$, are independent white noises with zero mean and variance $\sigma_z^2$. Let these variables be used to estimate regression (1.1) by LS. Then, as $T \to \infty$:
(1) $\hat{\alpha} \to_p \mu_y$;
(2) $\hat{\beta} \to_p \beta_y$;
(3) $T^{-3/2}\, t_\beta = O_p(1)$;
(4) $\hat{\sigma}^2 \to_p \sigma_y^2$;
(5) $T^2(1 - R^2) \to_p 12\,\sigma_y^2 (1 - \beta_x\beta_y)^2/(\beta_y\gamma_x)^2$.
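System (2.5) can be simulated through its reduced form (derived in Appendix A.3). The sketch below (ours, with assumed parameter values) confirms numerically that LS applied to (1.1) recovers $\mu_y$ and $\beta_y$:

```python
import numpy as np

# Simulate the simultaneous system (2.5) via its reduced form (Appendix A.3):
#   x_t = C1 + C2*t + (beta_x*u_{y,t} + u_{x,t}) / C0,  C0 = 1 - beta_x*beta_y,
#   y_t = mu_y + beta_y*x_t + u_{y,t}.
# Illustrative parameter values, not taken from the paper.
rng = np.random.default_rng(42)
T = 5000
beta_y, beta_x, mu_y, mu_x, gamma_x = 2.0, 0.3, 1.0, 0.5, 0.4

u_x = rng.standard_normal(T)
u_y = rng.standard_normal(T)
C0 = 1.0 - beta_x * beta_y
C1 = (mu_x + mu_y * beta_x) / C0
C2 = gamma_x / C0
t = np.arange(1, T + 1)
x = C1 + C2 * t + (beta_x * u_y + u_x) / C0  # x_t depends on u_{y,t}: simultaneity
y = mu_y + beta_y * x + u_y

X = np.column_stack([np.ones(T), x])
alpha_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
```

Even though $x_t$ is built from the same innovation $u_{y,t}$ that enters the error term, `beta_hat` lies close to `beta_y`.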

Proof. See Appendix A.

Proposition 2.3. Let $y_t$ and $x_t$ be generated by

$$y_t - \beta_y x_t - \mu_y = u_{y,t}, \qquad -\beta_x y_t + x_t - X_0 - \mu_x t = \xi_{x,t}, \quad (2.6)$$

where $u_{z,t}$, for $z = x, y$, are independent white noises with zero mean and variance $\sigma_z^2$, and $\xi_{x,t} = \sum_{i=0}^{t} u_{x,i}$. Let these variables be used to estimate regression (1.1) by LS. Then, as $T \to \infty$:
(1) $\hat{\alpha} \to_p \mu_y$;
(2) $\hat{\beta} \to_p \beta_y$;
(3) $T^{-3/2}\, t_\beta = O_p(1)$;
(4) $\hat{\sigma}^2 \to_p \sigma_y^2$;
(5) $T^2(1 - R^2) \to_p 12\,\sigma_y^2 (1 - \beta_x\beta_y)^2/(\mu_x\beta_y)^2$.
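System (2.6) can be simulated in the same way through its reduced form (Appendix A.4), where the regressor now carries a stochastic as well as a deterministic trend. A sketch (ours, with assumed parameter values):

```python
import numpy as np

# Simulate system (2.6) via its reduced form (Appendix A.4):
#   x_t = D1 + D2*t + (beta_x/D0)*u_{y,t} + xi_{x,t-1}/D0,  D0 = 1 - beta_x*beta_y,
#   y_t = mu_y + beta_y*x_t + u_{y,t}.
# Illustrative parameter values, not taken from the paper.
rng = np.random.default_rng(99)
T = 3000
beta_y, beta_x, mu_y, mu_x, X0 = 2.0, 0.3, 1.0, 0.5, 0.0

u_x = rng.standard_normal(T)
u_y = rng.standard_normal(T)
D0 = 1.0 - beta_x * beta_y
D1 = (X0 + mu_y * beta_x) / D0
D2 = mu_x / D0
xi_x = np.concatenate(([0.0], np.cumsum(u_x)[:-1]))  # xi_{x,t-1}
t = np.arange(1, T + 1)
x = D1 + D2 * t + (beta_x / D0) * u_y + xi_x / D0    # simultaneity plus a stochastic trend
y = mu_y + beta_y * x + u_y

X = np.column_stack([np.ones(T), x])
alpha_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
```

As in Proposition 2.2, the LS estimates land near their true values despite the simultaneity.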

Proof. See Appendix A.

The two systems, represented in (2.5) and (2.6), bear a striking resemblance to classical examples of simultaneous equations in econometrics. The fundamental variations are (i) a deterministic trend in the variable $x_t$ in system (2.5) and (ii) both a stochastic and a deterministic trend in system (2.6). The asymptotics of the LS estimates do not differ significantly from those in Theorem 2.1. Note, however, that $x_t$ is now weakly exogenous for the estimation of $\mu_y$, $\beta_y$, and $\sigma_y^2$. The main result is in fact identical: the failure of orthogonality between $x_t$ and the error term does not prevent the estimates from converging to their true values.

The asymptotic properties of the LS estimators clearly provide an encouraging perspective in time-series econometrics. Notwithstanding, we should bear in mind that asymptotic properties may be a poor approximation in finite samples. In order to observe the behavior of LS estimates in finite samples, we present two Monte Carlo experiments. First, we represent graphically the convergence of $\hat{\beta}$ towards its true value, $\beta_y$. In accordance with the asymptotic results, $\hat{\beta} - \beta_y \to_p 0$ as $T \to \infty$. We reproduce the behavior of this difference in Figure 1. The variables $x$ and $y$ are generated according to (2.3) and (2.4), respectively. The sample size varies from 50 to 700, whilst $\beta_y$ goes from −5 to 5. The remaining parameters appear below the figure.

A brief glance at Figure 1 reveals that the asymptotic results stated in Theorem 2.1 conveniently approximate the finite-sample results for $T > 150$. For smaller sample sizes, the difference between the parameter and its estimate is usually about 1.5% or less of the value of the former (we tried different variables on the $y$-axis ($\rho_x$, $\rho_{y1}$, $\rho_{y2}$, $\sigma_y^2$, $\sigma_x^2$); all of these trials produced similar figures).

The second Monte Carlo experiment is built upon the same basis. In Table 1, each cell reports the sample mean of $\hat{\beta} - \beta_y$ and, below it, its estimated standard deviation (in parentheses). The number of replications is 10,000. The parameter values used in the simulation are stated in the table. The variables $x$ and $y$ are generated according to (2.3) and (2.4), respectively. The sample size ranges from $T = 50$ to 700; $\rho_{y1} = 0.15$; $\rho_x = 4$; $\sigma_y^2 = \sigma_x^2 = 1$; $\mu_y = 4.20$; the error term is a white noise with variance $\sigma_\epsilon^2 = 1$.

Table 1 shows that the LS estimates of a nonstationary relationship with a nonorthogonality problem converge quickly to their true value: with a sample size as small as 50 observations, the difference between $\beta_y$ and its estimate averages, at most, 0.015, which represents a deviation of 1.5% from the true value; in many other cases, the deviation is even smaller, of order $10^{-3}$–$10^{-4}$. These differences diminish further as the sample size grows; with 700 observations, their order of magnitude oscillates between $10^{-5}$ and $10^{-8}$. We performed the same experiment with autocorrelated AR(1) disturbances with $\phi = 0.7$ (results available upon request); such disturbances severely deteriorate the efficiency of the LS estimates, although $\hat{\beta} - \beta_y$ still converges to zero. We do not pursue this issue because, as mentioned earlier, neither autocorrelation nor heteroskedasticity is under scrutiny in this work.
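A compact Monte Carlo in the spirit of Table 1 can be sketched as follows (our code, not the paper's; $\rho_{y1}$ and $\rho_x$ follow the text, while $\mu_x$, $\rho_{y2}$, $\beta_y$, and the reduced replication count are assumptions):

```python
import numpy as np

# Monte Carlo sketch in the spirit of Table 1: x_t from DGP (2.3), y_t from
# DGP (2.4); report the mean of (beta_hat - beta_y) across replications for
# two sample sizes. rho_y1 and rho_x follow the text; mu_x, rho_y2, beta_y,
# mu_y, and the replication count (500 instead of 10,000) are assumptions.
rng = np.random.default_rng(2024)
mu_x, mu_y, beta_y = 1.0, 4.20, 2.0
rho_y1, rho_y2, rho_x = 0.15, 0.5, 4.0
reps = 500

def mean_beta_gap(T):
    gaps = np.empty(reps)
    t = np.arange(1, T + 1)
    for r in range(reps):
        u_x = rng.standard_normal(T)
        u_y = rng.standard_normal(T)
        xi_x = np.concatenate(([0.0], np.cumsum(u_x)[:-1]))  # xi_{x,t-1}
        xi_y = np.concatenate(([0.0], np.cumsum(u_y)[:-1]))  # xi_{y,t-1}
        x = (mu_x + rho_y1) * t + xi_x + rho_y2 * xi_y       # DGP (2.3), X_0 = 0
        y = mu_y + beta_y * x + u_y + rho_x * u_x            # DGP (2.4)
        X = np.column_stack([np.ones(T), x])
        gaps[r] = np.linalg.lstsq(X, y, rcond=None)[0][1] - beta_y
    return gaps.mean()

gap_small, gap_large = mean_beta_gap(50), mean_beta_gap(300)
```

Under these assumed parameters, the mean gap is already small at $T = 50$ and shrinks markedly by $T = 300$, consistent with the pattern reported in Table 1.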

3. Concluding Remarks

Using cointegrated variables in an LS regression where the regressor is not independent of the error term does not preclude the method from yielding consistent estimates. In other words, it is proven that, under these circumstances, the regressor remains weakly exogenous for the estimation of $\mu_y$ and $\beta_y$ (and of $\sigma_y^2$ in systems (2.5) and (2.6)) as defined by [12]. Furthermore, the finite-sample evidence indicates that LS provides good estimates even in samples of a practical size.

Notwithstanding, one should note the striking resemblance between the properties of the DGPs used in the propositions and those of variables belonging to a classical simultaneous-equations model. It is thus possible that the estimation of such models, even when the macroeconomic variables that feed them are nonstationary, yields correct estimates. Of course, such a possibility rules out structural shifts, parameter instability, the omission of a relevant variable, and any other major failure of the assumptions.

Appendix

A. Proof of Theorem 2.1 and Propositions 2.2 and 2.3

The estimated specification in Theorem 2.1 and Propositions 2.2 and 2.3 is $y_t = \alpha + \beta x_t + u_t$. In all three cases, we employ the following classical LS formulae (all sums run from $t = 1$ to $T$ unless otherwise specified):

(i) $\hat{B} = (X'X)^{-1} X'Y$,
(ii) $\hat{\sigma}^2 = T^{-1}\sum \hat{u}_t^2 = T^{-1}\bigl[\sum y_t^2 + \hat{\alpha}^2 T + \hat{\beta}^2 \sum x_t^2 - 2\hat{\alpha}\sum y_t - 2\hat{\beta}\sum y_t x_t + 2\hat{\alpha}\hat{\beta}\sum x_t\bigr]$,
(iii) $t_\beta = \hat{\beta}\,\bigl((X'X)^{-1}_{22}\,\hat{\sigma}^2\bigr)^{-1/2}$,
(iv) $R^2 = 1 - \mathrm{SSR}/\mathrm{TSS}$,

where

$$X'X = \begin{pmatrix} T & \sum x_t \\ \sum x_t & \sum x_t^2 \end{pmatrix}, \qquad X'Y = \begin{pmatrix} \sum y_t \\ \sum y_t x_t \end{pmatrix}, \qquad \hat{B} = \begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix}, \quad (A.1)$$

$\mathrm{SSR} = \sum \hat{u}_t^2$, $\mathrm{TSS} = \sum (y_t - \bar{y})^2 = \sum y_t^2 - T^{-1}\bigl(\sum y_t\bigr)^2$, and $(X'X)^{-1}_{22}$ is the element in row 2, column 2, of the matrix $(X'X)^{-1}$.
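These formulae translate directly into code. The following sketch (ours, in Python rather than the paper's Mathematica) computes $\hat{\alpha}$, $\hat{\beta}$, $\hat{\sigma}^2$, $t_\beta$, and $R^2$ from the raw sums and cross-checks the coefficients against a standard solver:

```python
import numpy as np

# Closed-form LS statistics for y_t = alpha + beta*x_t + u_t, following the
# appendix formulae: B = (X'X)^{-1} X'Y, sigma2 = SSR/T (note: no
# degrees-of-freedom correction), t_beta = beta_hat/sqrt((X'X)^{-1}_{22}*sigma2),
# R2 = 1 - SSR/TSS.
def ls_stats(y, x):
    T = len(y)
    XtX = np.array([[T, x.sum()], [x.sum(), (x**2).sum()]])
    XtY = np.array([y.sum(), (x * y).sum()])
    iXtX = np.linalg.inv(XtX)
    alpha_hat, beta_hat = iXtX @ XtY
    resid = y - alpha_hat - beta_hat * x
    ssr = (resid**2).sum()
    tss = ((y - y.mean())**2).sum()
    sigma2 = ssr / T
    t_beta = beta_hat / np.sqrt(iXtX[1, 1] * sigma2)
    r2 = 1.0 - ssr / tss
    return alpha_hat, beta_hat, sigma2, t_beta, r2

# Cross-check on arbitrary simulated data
rng = np.random.default_rng(1)
x = rng.standard_normal(200)
y = 1.0 + 2.0 * x + 0.5 * rng.standard_normal(200)
a, b, s2, tb, r2 = ls_stats(y, x)
```

The explicit $2 \times 2$ inversion mirrors the matrix in (A.1); for anything beyond a single regressor, a numerically stable solver such as `np.linalg.lstsq` is preferable.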

To obtain the asymptotics of $\hat{\alpha}$, $\hat{\beta}$, $\hat{\sigma}^2$, $t_\beta$, and $R^2$, we need to ascertain the behavior, as $T \to \infty$, of the following sums: $\sum x_t$, $\sum y_t$, $\sum x_t^2$, $\sum y_t^2$, and $\sum x_t y_t$. The behavior of these sums varies with the DGPs of $x_t$ and $y_t$; we present it for the DGPs underlying Theorem 2.1 and Propositions 2.2 and 2.3. All of the orders in probability attached to the sums below can be found in [5, 13, 16–18]. It is important to clarify that the computation of the asymptotics follows [5] and was assisted by Mathematica; the relevant expressions are therefore also reproduced below as Mathematica code.

A.1. Theorem 2.1: First Result

The expressions needed to compute the asymptotic values of $\hat{\alpha}$, $\hat{\beta}$, $\hat{\sigma}^2$, and $R^2$ are

$$\sum x_t = X_0 T + \mu_x \sum t + \underbrace{\sum \xi_{x,t-1}}_{O_p(T^{3/2})},$$

$$\sum x_t^2 = X_0^2 T + \mu_x^2 \sum t^2 + \underbrace{\sum \xi_{x,t-1}^2}_{O_p(T^2)} + 2X_0\mu_x \sum t + 2X_0 \sum \xi_{x,t-1} + 2\mu_x \underbrace{\sum \xi_{x,t-1}\, t}_{O_p(T^{5/2})},$$

$$\sum x_t u_{x,t} = \underbrace{X_0 \sum u_{x,t}}_{O_p(T^{1/2})} + \mu_x \underbrace{\sum u_{x,t}\, t}_{O_p(T^{3/2})} + \underbrace{\sum \xi_{x,t-1} u_{x,t}}_{O_p(T)},$$

$$\sum x_t u_{y,t} = X_0 \sum u_{y,t} + \mu_x \sum u_{y,t}\, t + \sum \xi_{x,t-1} u_{y,t},$$

$$\sum y_t = \mu_y T + \beta_y \sum x_t + \sum u_{y,t} + \rho_x \sum u_{x,t},$$

$$\sum y_t^2 = \mu_y^2 T + \beta_y^2 \sum x_t^2 + \rho_x^2 \sum u_{x,t}^2 + \sum u_{y,t}^2 + 2\mu_y\beta_y \sum x_t + 2\mu_y\Bigl(\rho_x \sum u_{x,t} + \sum u_{y,t}\Bigr) + 2\beta_y\Bigl(\rho_x \sum x_t u_{x,t} + \sum x_t u_{y,t}\Bigr) + 2\rho_x \underbrace{\sum u_{x,t} u_{y,t}}_{O_p(T^{1/2})},$$

$$\sum x_t y_t = \mu_y \sum x_t + \beta_y \sum x_t^2 + \sum x_t u_{y,t} + \rho_x \sum x_t u_{x,t}, \quad (A.1)$$

where $\xi_{y,t} = \sum_{i=1}^{t} u_{y,i}$ and $Y_0$ is an initial condition. The sums involving only the deterministic trend component are

$$\sum t = \tfrac{1}{2}\bigl(T^2 + T\bigr), \qquad \sum t^2 = \tfrac{1}{6}\bigl(2T^3 + 3T^2 + T\bigr). \quad (A.2)$$

The code in this case is represented below. To understand it, a brief glossary is required and appears in Table 2.

These expressions were written as Mathematica 7.0 code.

ClearAll;
St = (1/2) (T^2 + T);
St2 = (1/6) (2 T^3 + 3 T^2 + T);
Sx = X0 T + μx St + Sξx T^(3/2);
Sx2 = X0^2 T + μx^2 St2 + Sξx2 T^2 + 2 X0 μx St + 2 X0 Sξx T^(3/2) + 2 μx Sξxt T^(5/2);
Sy = μy T + βy Sx + ρx Sux T^(1/2) + Suy T^(1/2);
Sxux = X0 Sux T^(1/2) + μx Suxt T^(3/2) + Sξxux T;
Sxuy = X0 Suy T^(1/2) + μx Suyt T^(3/2) + Sξxuy T;
Sy2 = μy^2 T + βy^2 Sx2 + ρx^2 Sux2 T + Suy2 T + 2 μy βy Sx + 2 μy (ρx Sux T^(1/2) + Suy T^(1/2)) + 2 βy (ρx Sxux + Sxuy) + 2 ρx Suxuy T^(1/2);
Sxy = μy Sx + βy Sx2 + ρx Sxux + Sxuy;
(A.3)
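The two deterministic sums used throughout the code can be verified directly; a quick check (ours) in Python:

```python
# Verify the closed forms sum(t) = (T^2 + T)/2 and sum(t^2) = (2T^3 + 3T^2 + T)/6
# used throughout the appendix (the latter equals T(T+1)(2T+1)/6).
T = 250
sum_t = sum(t for t in range(1, T + 1))
sum_t2 = sum(t * t for t in range(1, T + 1))
closed_t = (T**2 + T) // 2
closed_t2 = (2 * T**3 + 3 * T**2 + T) // 6
```

Both closed forms are exact for every integer $T$, which is why they can be substituted symbolically before taking limits.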

A.2. Theorem 2.1: Second Result

The expressions needed to compute the asymptotic values of $\hat{\alpha}$, $\hat{\beta}$, $\hat{\sigma}^2$, and $R^2$ appear below. Note that $\sum y_t$, $\sum y_t^2$, and $\sum y_t x_t$ are identical to the ones presented in the previous appendix and have therefore been omitted:

$$\sum x_t = X_0 T + (\mu_x + \rho_{y1}) \sum t + \sum \xi_{x,t-1} + \rho_{y2} \sum \xi_{y,t-1},$$

$$\sum x_t^2 = X_0^2 T + (\mu_x + \rho_{y1})^2 \sum t^2 + \sum \xi_{x,t-1}^2 + \rho_{y2}^2 \sum \xi_{y,t-1}^2 + 2X_0(\mu_x + \rho_{y1}) \sum t + 2X_0 \sum \xi_{x,t-1} + 2\rho_{y2} X_0 \sum \xi_{y,t-1} + 2(\mu_x + \rho_{y1}) \sum \xi_{x,t-1}\, t + 2\rho_{y2}(\mu_x + \rho_{y1}) \sum \xi_{y,t-1}\, t + 2\rho_{y2} \underbrace{\sum \xi_{x,t-1}\xi_{y,t-1}}_{O_p(T^2)},$$

$$\sum x_t u_{x,t} = X_0 \sum u_{x,t} + (\mu_x + \rho_{y1}) \sum u_{x,t}\, t + \sum \xi_{x,t-1} u_{x,t} + \rho_{y2} \sum \xi_{y,t-1} u_{x,t},$$

$$\sum x_t u_{y,t} = X_0 \sum u_{y,t} + (\mu_x + \rho_{y1}) \sum u_{y,t}\, t + \sum \xi_{x,t-1} u_{y,t} + \rho_{y2} \sum \xi_{y,t-1} u_{y,t}. \quad (A.1)$$

The code in this case is represented below.

ClearAll;
St = (1/2) (T^2 + T);
St2 = (1/6) (2 T^3 + 3 T^2 + T);
Sx = X0 T + (μx + ρy1) St + Sξx T^(3/2) + ρy2 Sξy T^(3/2);
Sx2 = X0^2 T + (μx + ρy1)^2 St2 + Sξx2 T^2 + ρy2^2 Sξy2 T^2 + 2 X0 (μx + ρy1) St + 2 X0 Sξx T^(3/2) + 2 X0 ρy2 Sξy T^(3/2) + 2 (μx + ρy1) Sξxt T^(5/2) + 2 (μx + ρy1) ρy2 Sξyt T^(5/2) + 2 ρy2 Sξyξx T^2;
Sy = μy T + βy Sx + ρx Sux T^(1/2) + Suy T^(1/2);
Sxux = X0 Sux T^(1/2) + (μx + ρy1) Suxt T^(3/2) + Sξxux T + ρy2 Sξyux T;
Sxuy = X0 Suy T^(1/2) + (μx + ρy1) Suyt T^(3/2) + Sξxuy T + ρy2 Sξyuy T;
Sy2 = μy^2 T + βy^2 Sx2 + ρx^2 Sux2 T + Suy2 T + 2 μy βy Sx + 2 μy (ρx Sux T^(1/2) + Suy T^(1/2)) + 2 βy (ρx Sxux + Sxuy) + 2 ρx Suxuy T^(1/2);
Sxy = μy Sx + βy Sx2 + ρx Sxux + Sxuy;
(A.2)

A.3. Proposition 2.2

First note that DGP (2.5) can be written as

$$y_t = \mu_y + \beta_y x_t + u_{y,t}, \qquad x_t = C_1 + C_2 t + \frac{\beta_x}{C_0} u_{y,t} + \frac{1}{C_0} u_{x,t}, \quad (A.1)$$

where $C_0 = 1 - \beta_x\beta_y$, $C_1 = (\mu_x + \mu_y\beta_x)/C_0$, and $C_2 = \gamma_x/C_0$. The expressions needed to compute the asymptotic values of $\hat{\alpha}$, $\hat{\beta}$, $\hat{\sigma}^2$, and $R^2$ are

$$\sum x_t = C_1 T + C_2 \sum t + \frac{\beta_x}{C_0} \sum u_{y,t} + \frac{1}{C_0} \sum u_{x,t},$$

$$\sum x_t^2 = C_1^2 T + C_2^2 \sum t^2 + \Bigl(\frac{\beta_x}{C_0}\Bigr)^2 \sum u_{y,t}^2 + \Bigl(\frac{1}{C_0}\Bigr)^2 \sum u_{x,t}^2 + 2C_1C_2 \sum t + 2C_1\frac{\beta_x}{C_0} \sum u_{y,t} + \frac{2C_1}{C_0} \sum u_{x,t} + 2C_2\frac{\beta_x}{C_0} \sum u_{y,t}\, t + \frac{2C_2}{C_0} \sum u_{x,t}\, t + \frac{2\beta_x}{C_0^2} \sum u_{x,t}u_{y,t},$$

$$\sum x_t u_{y,t} = C_1 \sum u_{y,t} + C_2 \sum u_{y,t}\, t + \frac{\beta_x}{C_0} \sum u_{y,t}^2 + \frac{1}{C_0} \sum u_{x,t}u_{y,t},$$

$$\sum y_t = \mu_y T + \beta_y \sum x_t + \sum u_{y,t},$$

$$\sum y_t^2 = \mu_y^2 T + \beta_y^2 \sum x_t^2 + \sum u_{y,t}^2 + 2\mu_y\beta_y \sum x_t + 2\mu_y \sum u_{y,t} + 2\beta_y \sum x_t u_{y,t},$$

$$\sum x_t y_t = \mu_y \sum x_t + \beta_y \sum x_t^2 + \sum x_t u_{y,t}. \quad (A.2)$$

The code in this case is represented below.

ClearAll;
St = (1/2) (T^2 + T);
St2 = (1/6) (2 T^3 + 3 T^2 + T);
C0 = 1 - βx βy;
C1 = (μx + μy βx)/(1 - βx βy);
C2 = γx/(1 - βx βy);
Sx = C1 T + C2 St + (βx/C0) Suy T^(1/2) + (1/C0) Sux T^(1/2);
Sx2 = C1^2 T + C2^2 St2 + (βx/C0)^2 Suy2 T + (1/C0)^2 Sux2 T + 2 C1 C2 St + 2 C1 (βx/C0) Suy T^(1/2) + (2 C1/C0) Sux T^(1/2) + 2 C2 (βx/C0) Suyt T^(3/2) + (2 C2/C0) Suxt T^(3/2) + (2 βx/C0^2) Suxuy T^(1/2);
Sy = μy T + βy Sx + Suy T^(1/2);
Sxuy = C1 Suy T^(1/2) + C2 Suyt T^(3/2) + (βx/C0) Suy2 T + (1/C0) Suxuy T^(1/2);
Sy2 = μy^2 T + βy^2 Sx2 + Suy2 T + 2 μy βy Sx + 2 μy Suy T^(1/2) + 2 βy Sxuy;
Sxy = μy Sx + βy Sx2 + Sxuy;
(A.3)

A.4. Proposition 2.3

As in the previous appendix, first rewrite DGP (2.6) as

$$y_t = \mu_y + \beta_y x_t + u_{y,t}, \qquad x_t = D_1 + D_2 t + \frac{\beta_x}{D_0} u_{y,t} + \frac{1}{D_0}\xi_{x,t-1}, \quad (A.1)$$

where $D_0 = 1 - \beta_x\beta_y$, $D_1 = (X_0 + \mu_y\beta_x)/D_0$, and $D_2 = \mu_x/D_0$. The expressions needed to compute the asymptotic values of $\hat{\alpha}$, $\hat{\beta}$, $\hat{\sigma}^2$, and $R^2$ appear below. Note that $\sum y_t$, $\sum y_t^2$, and $\sum y_t x_t$ are identical to the ones presented in the previous appendix and have therefore been omitted:

$$\sum x_t = D_1 T + D_2 \sum t + \frac{\beta_x}{D_0} \sum u_{y,t} + \frac{1}{D_0} \sum \xi_{x,t-1},$$

$$\sum x_t^2 = D_1^2 T + D_2^2 \sum t^2 + \Bigl(\frac{\beta_x}{D_0}\Bigr)^2 \sum u_{y,t}^2 + \Bigl(\frac{1}{D_0}\Bigr)^2 \sum \xi_{x,t-1}^2 + 2D_1D_2 \sum t + 2D_1\frac{\beta_x}{D_0} \sum u_{y,t} + \frac{2D_1}{D_0} \sum \xi_{x,t-1} + 2D_2\frac{\beta_x}{D_0} \sum u_{y,t}\, t + \frac{2D_2}{D_0} \sum \xi_{x,t-1}\, t + \frac{2\beta_x}{D_0^2} \sum \xi_{x,t-1} u_{y,t},$$

$$\sum x_t u_{y,t} = D_1 \sum u_{y,t} + D_2 \sum u_{y,t}\, t + \frac{\beta_x}{D_0} \sum u_{y,t}^2 + \frac{1}{D_0} \sum \xi_{x,t-1} u_{y,t}. \quad (A.2)$$

The code in this case is represented below:

ClearAll;
St = (1/2) (T^2 + T);
St2 = (1/6) (2 T^3 + 3 T^2 + T);
D0 = 1 - βx βy;
D1 = (X0 + μy βx)/(1 - βx βy);
D2 = μx/(1 - βx βy);
Sx = D1 T + D2 St + (βx/D0) Suy T^(1/2) + (1/D0) Sξx T^(3/2);
Sx2 = D1^2 T + D2^2 St2 + (βx/D0)^2 Suy2 T + (1/D0)^2 Sξx2 T^2 + 2 D1 D2 St + 2 D1 (βx/D0) Suy T^(1/2) + (2 D1/D0) Sξx T^(3/2) + 2 D2 (βx/D0) Suyt T^(3/2) + (2 D2/D0) Sξxt T^(5/2) + (2 βx/D0^2) Sξxuy T;
Sy = μy T + βy Sx + Suy T^(1/2);
Sxuy = D1 Suy T^(1/2) + D2 Suyt T^(3/2) + (βx/D0) Suy2 T + (1/D0) Sξxuy T;
Sy2 = μy^2 T + βy^2 Sx2 + Suy2 T + 2 μy βy Sx + 2 μy Suy T^(1/2) + 2 βy Sxuy;
Sxy = μy Sx + βy Sx2 + Sxuy;
(A.3)

A.5. Computation of the Asymptotics

The previous three appendices provide the Mathematica code for $\sum x_t$, $\sum y_t$, $\sum x_t^2$, $\sum y_t^2$, and $\sum x_t y_t$ under the different DGP combinations. We now present the code that computes the asymptotics of (1.1) for any such combination. Note that the code computes the asymptotics in the following order: the matrix $(X'X)^{-1}$, $(X'X)^{-1}_{22}$, $\hat{\alpha}$, $\hat{\beta}$, $\hat{\sigma}^2$, $t_\beta$, and $1 - R^2$. Comments appear inside (* ... *) delimiters.

(* Matrix X'X *)
Mx = {{T, Sx}, {Sx, Sx2}};

(* Inverse of matrix X'X *)
iMx = Inverse[Mx];

(* Elements of (X'X)^-1 *)
R1 = Extract[iMx, {1, 1}];
R2 = Extract[iMx, {1, 2}];
R3 = Extract[iMx, {2, 1}];
R4 = Extract[iMx, {2, 2}];

(* Factorization *)
R40 = Factor[R4];
R4num = Numerator[R40];
R4den = Denominator[R40];

(* Highest power of T in numerator and in denominator *)
K1 = Exponent[R4num, T];
K2 = Exponent[R4den, T];

(* Limit of the numerator divided by T^K1 *)
R4num2 = Limit[Expand[R4num/T^K1], T -> Infinity];

(* Limit of the denominator divided by T^K2 *)
R4den2 = Limit[Expand[R4den/T^K2], T -> Infinity];

(* Limit of element 2,2 of (X'X)^-1, multiplied by T^K1/T^K2 *)
R42 = Factor[Expand[(R4num2/R4den2) T^K1/T^K2]];

(* Parameter alpha-hat *)
P10 = Factor[Expand[R1 Sy + R2 Sxy]];
P11num = Numerator[P10];
K5 = Exponent[P11num, T];
Anum = Limit[Expand[P11num/T^K5], T -> Infinity];
P12den = Denominator[P10];
K6 = Exponent[P12den, T];
Aden = Limit[Expand[P12den/T^K6], T -> Infinity];
Apar = Factor[Expand[(Anum/Aden) T^K5/T^K6]]

(* Parameter beta-hat *)
P20 = Factor[Expand[R3 Sy + R4 Sxy]];
P21num = Numerator[P20];
K7 = Exponent[P21num, T];
Bnum = Limit[Expand[P21num/T^K7], T -> Infinity];
P22den = Denominator[P20];
K8 = Exponent[P22den, T];
Bden = Limit[Expand[P22den/T^K8], T -> Infinity];
Bpar = Factor[Expand[(Bnum/Bden) T^K7/T^K8]]

(* Parameter sigma-hat^2 *)
P40 = Factor[Expand[Sy2 + P10^2 T + P20^2 Sx2 - 2 P10 Sy - 2 P20 Sxy + 2 P10 P20 Sx]];
P41num = Numerator[P40];
K11 = Exponent[P41num, T];
Vnum = Factor[Limit[Expand[P41num/T^K11], T -> Infinity]];
P42den = Denominator[P40];
K12 = Exponent[P42den, T];
Vden = Factor[Limit[Expand[P42den/T^K12], T -> Infinity]];
Vpar = Factor[Expand[T^(-1) (Vnum/Vden) T^K11/T^K12]]

(* SSR/TSS *)
P50 = Factor[Expand[P40/(Sy2 - T (Sy/T)^2)]];
P51num = Numerator[P50];
K13 = Exponent[P51num, T];
Rcnum = Factor[Limit[Expand[P51num/T^K13], T -> Infinity]];
P52den = Denominator[P50];
K14 = Exponent[P52den, T];
Rcden = Factor[Limit[Expand[P52den/T^K14], T -> Infinity]];
Rc = Factor[Expand[(Rcnum/Rcden) T^K13/T^K14]];

(* t_beta *)
tβ = FullSimplify[Bpar/(Vpar R42)^(1/2)]

(* 1 - R^2 *)
P70 = FullSimplify[Rc]
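The same kind of symbolic limit can be sketched with open-source tools. The following SymPy fragment (our illustration, not the paper's code) builds the sums for Theorem 2.1(i) using $O_p(1)$ placeholders in the spirit of the Table 2 glossary (the placeholder names `sxi`, `sux`, and so on are ours), forms $\hat{\beta} = (T\sum x_t y_t - \sum x_t \sum y_t)/(T\sum x_t^2 - (\sum x_t)^2)$, and recovers its limit, $\beta_y$:

```python
import sympy as sp

# SymPy analogue of the Mathematica pipeline for Theorem 2.1(i): write each sum
# with O_p(1) placeholders for the normalized stochastic terms, substitute
# T = s^2 so every power of T^(1/2) becomes a polynomial in s, and take the
# limit as the ratio of leading coefficients (numerator and denominator have
# the same degree in s, namely 8).
T = sp.symbols('T', positive=True)
mu_x, beta_y = sp.symbols('mu_x beta_y', positive=True)
mu_y, rho_x, X0 = sp.symbols('mu_y rho_x X0', real=True)
# O_p(1) placeholders for the normalized stochastic sums (our naming)
sxi, sxi2, sxit, sux, suy, suxt, suyt, sxiux, sxiuy, sux2, suy2, suxuy = sp.symbols(
    'sxi sxi2 sxit sux suy suxt suyt sxiux sxiuy sux2 suy2 suxuy', real=True)

St = (T**2 + T) / 2
St2 = (2*T**3 + 3*T**2 + T) / 6
Sx = X0*T + mu_x*St + sxi*T**sp.Rational(3, 2)
Sx2 = (X0**2*T + mu_x**2*St2 + sxi2*T**2 + 2*X0*mu_x*St
       + 2*X0*sxi*T**sp.Rational(3, 2) + 2*mu_x*sxit*T**sp.Rational(5, 2))
Sxux = X0*sux*sp.sqrt(T) + mu_x*suxt*T**sp.Rational(3, 2) + sxiux*T
Sxuy = X0*suy*sp.sqrt(T) + mu_x*suyt*T**sp.Rational(3, 2) + sxiuy*T
Sy = mu_y*T + beta_y*Sx + rho_x*sux*sp.sqrt(T) + suy*sp.sqrt(T)
Sxy = mu_y*Sx + beta_y*Sx2 + rho_x*Sxux + Sxuy

s = sp.symbols('s', positive=True)
num = sp.Poly(sp.expand((T*Sxy - Sx*Sy).subs(T, s**2)), s)
den = sp.Poly(sp.expand((T*Sx2 - Sx**2).subs(T, s**2)), s)
beta_limit = sp.cancel(num.LC() / den.LC())  # ratio of leading coefficients
```

Both polynomials have leading coefficient proportional to $\mu_x^2/12$ (the numerator's carries an extra factor $\beta_y$), so the ratio collapses to $\beta_y$, matching Theorem 2.1(i)(b).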

Acknowledgments

The authors would like to thank an anonymous referee for insightful comments. The opinions in this paper correspond to the authors and do not necessarily reflect the point of view of Banco de México.