Abstract

We provide sufficient conditions for the definition and the existence of strongly consistent indirect estimators when the binding function is a compact valued correspondence. We use conditions that concern the asymptotic behavior of the epigraphs of the criteria involved, a relevant notion of continuity for the binding correspondence as well as an indirect identification condition that restricts the behavior of the aforementioned correspondence. These are generalizations of the analogous results in the relevant literature and hence permit a broader scope of statistical models. We examine simple examples involving Levy and ergodic conditionally heteroskedastic processes.

1. Introduction

Indirect estimators (henceforth IEs) are multistep M-estimators defined in the context of (semi-) parametric inference. They are minimizers of criteria (inversion criteria) that are functions of an auxiliary estimator, itself derived as an extremum estimator. The latter minimizes a criterion function (auxiliary criterion) that partially reflects the structure of a possibly misspecified auxiliary statistical model. The inversion criterion is usually a (possibly stochastic) distance function evaluated on the auxiliary estimator as well as on some functional approximation of a mapping between the statistical models involved that is termed the binding function. This is constructed by some limiting argument concerning the auxiliary criterion. The IE is finally defined by minimization of the inversion criterion. This definition is conceptually justified by properties of the binding function that guarantee indirect identification and the subsequent use of the analogy principle. Given the auxiliary criterion, differences between IEs hinge on differences in the distance functions, and/or the approximations of the binding function and/or the optimization errors involved.
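The multistep procedure described above can be illustrated by a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the structural model is exponential with rate theta, the auxiliary criterion is a least squares location criterion (so the auxiliary estimator is the sample mean), the binding function is b(theta) = 1/theta, and the inversion criterion is the absolute distance minimized over a finite grid standing in for a compact parameter space.

```python
import random

def auxiliary_estimator(sample):
    # Minimizer of the (assumed) auxiliary criterion beta -> mean((x - beta)^2),
    # which is the sample mean.
    return sum(sample) / len(sample)

def binding(theta):
    # Assumed binding function: the a.s. limit of the sample mean
    # under an exponential model with rate theta.
    return 1.0 / theta

def indirect_estimator(beta_hat, theta_grid):
    # Minimize the inversion criterion |beta_hat - b(theta)| over the grid.
    return min(theta_grid, key=lambda t: abs(beta_hat - binding(t)))

rng = random.Random(0)                       # fixed seed for reproducibility
sample = [rng.expovariate(2.0) for _ in range(20000)]  # true rate theta = 2
beta_hat = auxiliary_estimator(sample)
theta_grid = [0.5 + 0.001 * i for i in range(4501)]    # grid on [0.5, 5.0]
theta_hat = indirect_estimator(beta_hat, theta_grid)
```

Here the binding function is single valued and 1-1, so the indirect step simply inverts it at the auxiliary estimate; the multivalued case studied in the paper replaces the absolute distance by a distance between sets.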

Indirect inference algorithms were initially employed in [1], formally introduced by [2], complemented by [3], and extended by [4]. Furthermore, econometric applications of these estimators have become increasingly popular. They have been applied to stochastic volatility and equity return models (e.g., [5–7]), exchange rate models (e.g., [8, 9]), commodity price and storage models (e.g., [10]), dynamic panel data (e.g., [11]), stochastic differential equation models (e.g., [12, 13]), and in models (e.g., [14–17]).

In the present paper, we are concerned with the issue of the existence of strongly consistent IEs, allowing for cases where the binding function is compact valued (hence possibly multivalued). Therefore we perform our study in a more general framework than the ones employed in the relevant literature.

Our motivation lies in cases where the auxiliary criterion is a quasilikelihood function involving a class of stationary-ergodic volatility processes defined by some or type models that represent the statistical model at hand and an auxiliary class of stationary-ergodic invertible processes living in the premises of a possibly misspecified analogous model. In such frameworks the limit criterion could assume extended real values due to the possible existence of parameter values that imply nonexistence of relevant moments. Furthermore since the limit criterion is a statistical divergence between the two classes of processes the binding function can in principle be multivalued due to the geometry involved. Notice that even when this is true, it is possible to study single valued reductions of it via measurable selections. This could however imply stricter conditions for indirect identification. Furthermore since these frameworks are generally suboptimal with respect to asymptotic efficiency, actual multivaluedness could lead to efficiency gains.

This study has the form of a calculus of escalating weak sufficient conditions that enable the definition of IEs in this framework and the proof of existence of strongly consistent ones. First, using mild assumptions on the structure of the auxiliary criterion functions, we work with a notion of convergence of the relevant sequence of criterion functions that is weaker than uniform convergence. This is termed epiconvergence and essentially concerns the almost sure asymptotic behavior of their epigraphs. By construction it is suitable for the study of the asymptotic behavior of their minimizers. This form of convergence has been extensively studied in the statistical literature (see among others [18–20]) and it enables the definition of the binding function and the determination of the limiting relation between this function and the auxiliary estimator.

Then we strengthen our assumptions in order to obtain a form of continuity of the binding function that enables the definition of IEs derived from this function and the verification of their existence. Finally the imposition of a relatively weak condition of indirect identification on the behavior of the binding function along with the limiting relation already established enables the proof of the existence of strongly consistent IEs via the use of the same limit arguments also used for the pseudoconsistency of the auxiliary estimator and the subsequent definition of the binding function. This framework readily enables the description of conditions concerning the behavior of any approximation of the binding function that could also be used for the definition of IEs in a similar manner.

Hence we manage to extend the framework for the definition of IEs in a threefold manner. We allow for the auxiliary and/or the inversion criteria and/or their appropriate limits to assume extended real values. We study their asymptotic behavior via the use of the weaker known topology associated with convergence of minimizers and we allow for the binding function to be a correspondence with values on the collection of nonempty and compact subsets of the relevant parameter space. This incorporates the definitions used in the existing literature but simultaneously generalizes the set of the statistical models that are in accordance with these conventions.

The structure of the paper is as follows. We formulate our setup and define and study the asymptotic behavior of the auxiliary estimator, the binding correspondence, and finally of the IE. We then exhibit some of our results by a set of simple examples. We conclude by posing some questions for future research. In the appendix we briefly describe some general notions that are essentially used in the main body.

2. Assumptions and Main Results

2.1. General Setup

We construct our framework and describe the underlying statistical problem. Let the triad denote a complete probability space. Let also and denote two compact separable metric spaces. Let also and denote the corresponding Borel algebras. denotes the extended real line. Again denotes the corresponding Borel algebra.

The auxiliary criterion is a function , which is of the form with , for , for some appropriate space, usually homeomorphic to for some . represents the sample. reflects part of the structure of an auxiliary model, a statistical model defined on the measurable space , with as its parameter space (e.g., it can be a likelihood function or a type criterion; see Section 3) (which in general is a correspondence , with the set of probability measures on ). is measurable for any and thereby represents the underlying statistical model which is essentially the set . These two models need not coincide.

We abbreviate with a.s. any statement that concerns elements of of unit probability. When not necessary we avoid denoting the potential dependence of those elements on the parameters.

In the following we provide an escalating description of a set of sufficient conditions that enable first the existence of the auxiliary estimator, second the construction of the binding function and the description of the asymptotic relation between the two, third an appropriate form of continuity of the binding function which along with the previous enables the definition of the IE, and finally consistency.

2.2. Definition and Existence of the Auxiliary Estimator

We begin with a sufficient weak assumption on the behavior of that enables the definition and the existence of the auxiliary estimator. It consists of a joint measurability condition along with a pointwise with respect to and a.s. with respect to continuity and some condition concerning the facilitation of minimization. All these conditions are weak enough so that their verification is easy in many cases. Remember that a function with values in the extended real line is called proper, if it does not take the value and its image contains at least one real number. It is called -compact, if its level sets (for , the level set of with respect to ( ) is defined by ) are compact. -compactness follows trivially when the function is and its domain is compact.

Assumption 1. Let the following hold. (1) is measurable. (2) is lower semicontinuous ( ) and proper a.s., for all . (It is obvious that the element of of unit probability with respect to which Assumption 1 holds can depend on ).

In our examples presented in Section 3, the issue of joint measurability is handled easily due to the fact that the ’s considered are in fact Caratheodory functions, that is, jointly continuous (with respect to ) and pointwise measurable. Separability of and Lemma 4.51 of Aliprantis and Border [21] implies the required measurability. Properness is an ad hoc consideration that is easily established in many cases. For instance, when has the form of a quasilikelihood function it a.s. does not attain extended values. This is also the case in instances where has the form of a hemimetric (see Appendix) as in Section 2 due to the compactness of its arguments.

Remark 2. The joint measurability and the pointwise semicontinuity imply that is a normal integrand (see Definition 3.5 and Proposition 3.6 in Chapter 5 of [22]). The compactness of then implies that is -compact a.s. for all .

We are now ready to define and prove existence for the auxiliary estimator. We remind the reader that the Appendix provides the definition of a measurable compact valued correspondence.

Definition 3. The auxiliary correspondence satisfies where is an a.s. nonnegative random variable defined on .

Notice that the dependence of on is due to the assumed dependence of on . In practice is evaluated by the minimization appearing in the first equality of the previous display.
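The role of the optimization error in Definition 3 can be sketched numerically. The display of the definition is not reproduced here, so the sketch assumes the usual approximate-minimizer form: the set of parameter values whose criterion value is within eps of the infimum. The function name `eps_argmin`, the double-well criterion, and the finite grid are hypothetical choices made for illustration only.

```python
import math

def eps_argmin(criterion, grid, eps):
    # Set of grid points whose criterion value is within eps of the minimum.
    # Extended values (math.inf) are allowed for the criterion.
    values = [criterion(b) for b in grid]
    inf_value = min(values)
    return [b for b, v in zip(grid, values) if v <= inf_value + eps]

def double_well(b):
    # Hypothetical criterion with two global minimizers, at -1 and 1.
    return (b * b - 1.0) ** 2

grid = [i / 2.0 for i in range(-4, 5)]      # -2.0, -1.5, ..., 2.0
exact = eps_argmin(double_well, grid, 0.0)  # [-1.0, 1.0]
loose = eps_argmin(double_well, grid, 0.6)  # [-1.0, -0.5, 0.5, 1.0]
```

The exact minimizer set is already multivalued, and a positive optimization error enlarges it; this is why the auxiliary estimator is treated as a set valued object rather than as a single point.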

Proposition 4. Under Assumption 1   is -measurable and a.s. nonempty, compact valued for all .

Proof. For any is non empty and compact due to Assumption 1(2) and the compactness of . (For the for which nonemptiness of does not necessitate Assumption 1(2) due to the properties of the ). Then the joint measurability of due to Assumption 1 along with Proposition 3.10(iii) in Chapter 5 of [22] guarantees the measurability for . The result follows from the separability of .

Obviously . In the following, dependence on will be suppressed (where possible) for notational simplicity. Dependence on , , and the “optimization error” will be kept. (The fundamental selection theorem (Theorem 2.13 of [22]) implies the existence of a measurable selection, i.e., a -measurable random element termed auxiliary selection, defined by . We will not use selections to define and explore the subsequent definition of the IE since this would imply stricter conditions for identification.)

2.3. Epilimits and Existence of a Fell Consistent Auxiliary Correspondence

Assumption 1 is not sufficient for the construction of the binding function as an appropriate limit of the auxiliary correspondence. The following assumption facilitates the investigation of the issue of (pseudo-) consistency for the auxiliary correspondence and thereby permits this construction. It indicates the almost sure epiconvergence of the auxiliary criterion to a proper, semicontinuous asymptotic counterpart. The first part is essentially the sequential characterization of this form of convergence. For the topological definition of epiconvergence along with some properties please see the Appendix. For the equivalence between the topological and the sequential definitions see among others [20]. The second part imposes on the limit criterion a property that is not generally preserved under epiconvergence.

Assumption 5. There exists a function such that(1) , a.s. for any :(i) for all such that , and(ii) for some such that ,(2) is proper .
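For orientation, the standard sequential characterization of epiconvergence, to which part (1) of the assumption refers, can be written as follows. The notation is hypothetical, since the original symbols are not reproduced here: $c_n(\omega,\theta,\beta)$ stands for the auxiliary criterion, $c(\theta,\beta)$ for its limit, and $B$ for the auxiliary parameter space.

```latex
% Hypothetical notation: c_n = auxiliary criterion, c = limit criterion,
% B = auxiliary parameter space. For every \beta \in B:
\begin{align*}
\text{(i)}\quad  & \liminf_{n\to\infty} c_n(\omega,\theta,\beta_n) \ge c(\theta,\beta)
   && \text{for all sequences } \beta_n \to \beta, \\
\text{(ii)}\quad & \limsup_{n\to\infty} c_n(\omega,\theta,\beta_n) \le c(\theta,\beta)
   && \text{for some sequence } \beta_n \to \beta .
\end{align*}
```

Condition (i) prevents the criteria from dipping below the limit along any approaching sequence, while (ii) requires only one recovery sequence; together they are weaker than uniform convergence yet sufficient for cluster points of minimizers to minimize the limit.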

Remark 6. For any , is lower semicontinuous since the local compactness of and the subsequent coincidence of convergence with respect to the Fell topology with the Painleve-Kuratowski one (see Appendix), the epigraph of is closed. In the case that and is ergodic for any , the assumed epiconvergence would follow if for any there exists an open cover of , so that for any in the cover (condition and Theorem 2.3 of [23]). In the considered cases is proper since it is either an expectation that cannot assume the value , and there exists at least some parameter value for which it is finite or it is defined by composition with a (hemi-) metric and assumes the value for at least one parameter value (see, for example, [24] Part 1, (ii) in association with Part 2 of the proof of Theorem 5.3.1, where is a quasilikelihood function and coincides with for the former or Section 2 for the latter case). -compactness follows from the compactness of .

The assumption essentially enables the use of the fact that the correspondence is upper continuous as a function defined on the relevant space of functions equipped with the topology of epiconvergence. Analogous assumptions have been used for the establishment of strong consistency of various estimators. See among others [18–20, 25]. Hence it makes possible the definition of the binding function as The following proposition establishes its existence and is essentially similar to Proposition 4.

Proposition 7. Under Assumptions 1 and 5 the binding correspondence is nonempty compact valued.

Proof. It follows from Remark 6.

Both the auxiliary and the binding correspondence will be used for the definition of the IE via some intuition that utilizes an analogy principle. The following result explores their asymptotic relation. For its construction the hemimetric is needed that is defined by where , where and , are nonempty closed subsets of some topological space. For several continuity and measurability properties of see the Appendix.
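A small numerical sketch may help fix ideas about hemimetrics between sets. Since the defining display is not reproduced here, the sketch assumes the familiar one-sided Hausdorff excess, that is, the largest distance from a point of the first set to the second set; the direction of the excess and the function names are assumptions, and finite subsets of the real line stand in for closed sets.

```python
def excess(a_set, b_set):
    # One-sided Hausdorff excess of a_set over b_set:
    # sup over a in a_set of the distance from a to b_set.
    return max(min(abs(a - b) for b in b_set) for a in a_set)

def hausdorff(a_set, b_set):
    # Symmetric Hausdorff distance as the larger of the two excesses.
    return max(excess(a_set, b_set), excess(b_set, a_set))

A = [0.0, 1.0]
B = [0.0, 1.0, 5.0]
d_ab = excess(A, B)    # 0.0, because A is contained in B
d_ba = excess(B, A)    # 4.0, because 5.0 is at distance 4 from A
d_h = hausdorff(A, B)  # 4.0
```

The excess is asymmetric and vanishes whenever the first set is contained in the second, which is the feature that distinguishes a hemimetric from a metric and that shapes the identification condition used later.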

The first and last implications of the following proposition are already well known. Its second implication is a partial generalization of Theorems 7.30, 7.32 of [26] in our setting. Given the definitions of the upper and the Fell topology in the Appendix, the first essentially establishes the upper pseudoconsistency and the other two the Fell pseudoconsistency of with respect to .

Proposition 8. Under Assumptions 1 and 5 the following holds:(1)for any     such that     a.s. then     a.s.,(2)there exists a nonnegative random variable,   , such that     a.s. and     a.s.,(3)if     is singleton then for any     such that     a.s. then     a.s.

For its proof we will use the following lemmata. Let which is well defined due to Remark 6 and the compactness of . Remember that for a sequence of non empty sets is the set comprised of the limit points of any possible sequence such that , and is the one comprised of the analogous cluster points. Also for .

Lemma 9. Under Assumptions 1 and 5,

Proof. Consider the family of -parameterized correspondences . Due to the fact that is locally compact, is a random closed set in the sense of the previous paragraph, that is, a -measurable correspondence. Hence is an -measurable correspondence due to the measurability of the relevant projection. Now due to Assumption 5 (see Section 4) we have that for large and for all in an element of of unit probability since is open in the relevant product topology. Hence for all described previously.

The next result will be used for the proof of Proposition 8(2)-(3).

Lemma 10. Under Assumptions 1 and 5 there exists a sequence of random variables defined on , say (it is obvious from the proof that depends also on ), such that a.s. and

Proof. From the sequential implication of epiconvergence in Remark 6, we have that, for any , there exists a measurable such that a.s. Obviously for which is measurable, we have that . Since is compact, it is totally bounded and therefore, for any , there exist an and , such that the collection of balls (in ) covers . For some real sequence , consider and extract analogously random sequences such that a.s. Define and which is well defined due to Egoroff’s theorem and the fact that is finite. Obviously is nondecreasing in . Then define which is measurable and a.s. Then for , it follows that, for any , there exist an and a measurable such that a.s.

Proof of Proposition 8. For (1) we have first that for any measurable nonnegative that need not converge to zero if measurable, such that a subsequence a.s. where that last inequality follows from Lemma 9. This establishes that for any nonnegative random variable Now (1) follows from the fact that - - if . For (2) notice that, from the definition of the Fell topology in Section 4 for any , we have that for large , a.s. since is compact in the relevant product topology. This implies that a.s. and in conjunction with Lemma 9 that a.s. Then using Lemma 10 set which is obviously measurable and converges to zero a.s. (3) follows from (1) the single valuedness of and the compactness of .

The proof of Lemma 10 implies that the sequence that appears in Proposition 8(2) is nonunique. However, the fact that this implication does not hold for any sequence of nonnegative random variables that a.s. converge to zero is the reason why, in what follows, we can only prove the existence of strongly consistent indirect estimators among the set of the ones to be defined.

2.4. Upper Hemicontinuity of the Binding Correspondence

Proposition 8 enables the use of as the inversion criterion. The following assumption concerns the upper continuity of the binding correspondence which along with the relevant properties of would imply the analogous continuity property for the particular inversion criterion and thereby facilitate the issue of existence and consistency of the IE to be defined.

Assumption 11. is upper hemicontinuous that is, for any and , .

The following proposition provides sufficient conditions for this to hold. It essentially strengthens Assumption 5 in that it requires that the relevant   a.s. epiconvergence be continuous on . Notice that its requirements are also stricter with respect to measurability compared to the ones in Assumption 5 since the former requires that for any the relevant set of unit probability does not depend on the sequence that converges to .

Proposition 12. If, for any ,   a.s. for any , any   , and any ,(1) ,   for all     such that   ,(2) ,   for some     such that   ,
then Assumption 11 holds.

Proof. Suppose that metrizes on (see Section 4). Then it is obvious that Proposition 12(1)-(2) is equivalent to the requirement that, for any and any , converges to zero a.s. Then, for any and any , epiconverges to ( ), that is, is epicontinuous on . This is due to the following standard argument: for an arbitrary and , we have that a.s. for any in some open neighborhood of and large enough , due to the assumed form of convergence and Egoroff’s theorem. By the same reasoning a.s. for any such . The result follows from the fact that is independent of . This along with 3.1 of Theorem 5.3.4 and proposition Appendix D.2 of [22] implies that the composite mapping is appropriately continuous.

An examination of the proof of Proposition 8 shows that if the conditions (1)-(2) of Proposition 12 hold then Proposition 8 could be restated with replacing when this occurs. This means that the premises of Proposition 12 render strongly continuously (upper in the first case or Fell in the second and third cases of Proposition 8) -consistent. Furthermore the compactness of then implies that these notions are also uniform with respect to .

Remark 13. Proposition 12(1)-(2) would obviously be implied if is a.s. jointly continuous and converges jointly uniformly a.s. to . Since we allow and/or to assume extended values, the relevant notion of uniform convergence must also be extended as in Definition 7.12 of [26]. The following lemma provides a set of even weaker sufficient conditions than extended jointly uniform a.s. convergence when has the form of an arithmetic mean with respect to stationary and ergodic processes.

Lemma 14. Suppose that , is ergodic for any , is jointly continuous a.s., there exists a finite open cover of , such that , for any in the cover, assumes values in for any , and in a countable dense subset of . Then Proposition 12 holds.

Proof. Proposition 12(1) follows from the fact that the assumption framework of the lemma implies condition and thereby Theorem 2.3 of [23], which implies the joint a.s. epiconvergence of to . For Proposition 12(2) notice that the separability of and the a.s. continuity of implies the existence of a countable dense such that for any and any there exists a such that By assumption the subset of of unit probability can be chosen independent of and can be chosen arbitrarily small. Hence, Proposition 12(2) would be implied for , if, for any , , a.s. and any , due to the countability of . Notice that is also stationary-ergodic for any (see, e.g., Proposition 2.1.1. of [24]), and hence the uniform version of Birkhoff’s implies that converges a.s. to for any . Due to the separability of the subset of of unit probability can be chosen independent of . Hence from Definition 7.12 of [26] we obtain .

This lemma explores sufficient conditions for the required continuity of solely via restrictions on the behavior of which in applications is generally more analytically tractable than . Furthermore it combines joint epiconvergence with pointwise (on ) extended uniform (with respect to ) almost sure convergence. Finally notice that an analogous result would also hold if ergodicity is replaced by any kind of mixing condition that would justify the s used in the previous proof or implied in Remark 13.

2.5. Definition, Existence, and Consistency of the Indirect Estimator

We are now ready to define the IE and explore the issues of its existence and consistency. Proposition 8 along with the measurability of the auxiliary correspondence and the upper hemicontinuity of facilitate the use of for some distinguished , for the definition of the IE and the subsequent existence argument. Again an almost surely nonnegative random variable will assume the role of the “optimization error” in this second step of the estimation procedure.

Definition 15. The indirect correspondence satisfies where is a nonnegative random variable defined on .

The dependence of on follows from the analogous dependence of which in turn follows from the dependence of on . Obviously in practice this dependence is not “visible.” We are initially concerned with the question of existence of the IE. We again suppress the dependence of on when there is no risk of confusion.

Proposition 16. Under Assumptions 1 and 11   is -measurable, a.s. nonempty, compact valued correspondence.

Proof. First, notice that due to Lemma A.9, Proposition 4 (implied by Assumption 1), Assumption 11, and the facts that is independent of and is independent of , we obtain that is -measurable. Due to Lemma A.4 and Proposition 4   is a.s. and therefore a normal integrand. It is also a.s. proper due to the fact that and are a.s. compact valued. Hence the result follows from Proposition 4 where , when we consider and (the left-hand sides correspond to the notation of the latter lemma).

The fundamental selection theorem (Theorem 2.13 of [22]) would also enable the definition of the IE as a measurable function with values in . Having established existence we turn to the issue of consistency. We need an assumption of indirect identification that is essentially derived from the form of the roots of the hemimetric used. Notice that this assumption along with the proof of the following proposition justifies the definition of the IE in Definition 15 by an analogy principle. Mathematically both the definition and the existence argument do not require the following assumption in order to be valid.

Assumption 17. If .

Remark 18. This condition is weaker than a condition of the form “If ” and stronger than a condition of the form “If .” The latter cannot be used due to the properties of upon which the definition of the IE is based. In the case that the binding correspondence is single valued, these become equivalent. This also makes evident the claim that if the auxiliary estimator is defined by a measurable selection of the corresponding identification condition cannot be weaker than the one above.

The main result of the current section follows. It merely concerns the existence of strongly consistent IEs inside the established framework. Denote by the which is nonempty and compact due to the compactness of , the properness of , the hemicontinuity of , and Lemma A.3. Obviously, , while Assumption 17 holds. Since we once again are dealing with compact valued correspondences, convergence is metrized by and/or .

Proposition 19. Under Assumptions 1, 5, and 11 and if a.s. then(1)   a.s. where     is defined in Proposition 8(2),(2)if      is singleton then     a.s. for any     a.s.,
if furthermore Assumption 17 holds then   a.s. where     is defined in Proposition 8(2), if      is singleton then     a.s. for any   a.s.

Proof. First notice that due to Proposition 8(2), Assumption 11, and Lemma A.6 we have that for any and and that, for any and , due to Lemma A.4 Hence (1) follows from Proposition 8(1) for and if we denote by (in the notation of this lemma) the space and by (in the notation of this lemma) . (2) follows in the same manner if we replace any invocation of Proposition 8(2) with Proposition 8(3). Finally, notice that if Assumption 17 holds, then establishing and via another use of Proposition 8(3).

If Assumption 17 does not hold then the implications of Proposition 19(1)-(2) correspond to the fact that the statistical model is only indirectly set identified given this framework. They are trivial when and the closer to zero is the more informative they become. We once again point out that the implications Proposition 19(1) and merely explore the issue of the existence of strongly consistent estimators among those that comply with Definition 15. The properties of the function along with Proposition 8 do not permit a stronger result without strengthening the assumption framework. Finally notice that this framework enables both the definition and the result on consistency of the IE to be derived via the use of the exact same notions that were used for the analogous results concerning the auxiliary one.

2.6. Extension

In most cases is analytically unknown even if several of its properties, such as some of the ones discussed above, can be established. In these cases the estimators defined in Definition 15 are infeasible. However, it may be the case that (possibly) stochastic and algorithmically feasible approximations of the unknown can be used for the construction of several other classes of feasible IEs. In such a context the results derived previously could be used to describe properties of such approximations that would imply that these estimators are well defined and among them strongly consistent ones exist. Let denote such an approximation. We readily obtain the following result for the indirect estimator defined by the substitution of with in Definition 15.

Proposition 20. Consider the defined by where as before.(i)Under Assumption 1 and if     is   -measurable,     a.s. compact valued and upper hemicontinuous, then     is   -measurable,     a.s. nonempty, compact valued correspondence.(ii)If moreover Assumption 5 holds and     a.s. for any     and any   ,   and     converge to zero then the implications of Proposition 19(1)-(2) hold also for   . If furthermore Assumption 17 holds of then the implications Proposition 19 - hold also for   .

Proof. (i) As in the proof of Proposition 16 from Lemma A.9, Proposition 4 the a.s. upper hemicontinuity of the facts that is independent of , and is jointly measurable, we obtain that is -measurable. Due to Lemma A.4 and Proposition 4   is a.s. and therefore a normal integrand. It is also a.s. proper due to the fact that and are a.s. compact valued. Hence the result follows from Proposition 4. (ii) It suffices to prove that a.s. when   satisfies the implications (2) or (3) of Proposition 8. The rest would then follow as in the proof of Proposition 19. Notice that for any and any due to Lemma A.6, the definition of and the a.s. continuous w.r.t upper convergence of to . Moreover for where the first inequality follows from the triangle inequality and the second from the definition of and the a.s. Fell convergence of to and Lemma A.4. Due to the a.s. pointwise with respect to Fell convergence of to we have that a.s. and therefore we obtain the needed result.

Notice that this proposition generalizes the results of Propositions 16 and 19. For a simple example consider the case where . This is possible when by some sort of resampling technique (e.g., bootstrap or Monte Carlo) realizations of the random elements are available to the practitioner for any and thereby so is for any and some optimization error   independent of . Then a feasible IE can be defined by the approximate minimization of with respect to . In this case the joint measurability of would follow from the joint measurability of and , the separability of , and the subsequent joint measurability of the relevant projection. The a.s. upper hemicontinuity of would follow from an easy extension of the implication Proposition 8(1) if Assumption 1 is strengthened so that the mapping is a.s. Fell continuous. The a.s. joint continuity of would suffice. Then the a.s. pointwise Fell convergence to would follow as in Proposition 8(2) or (3) and the a.s. continuous w.r.t upper convergence to would follow if Proposition 12 holds with the set of unit probability independent of . It would suffice that is a.s. jointly continuous and converges to jointly uniformly.
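The Monte Carlo approximation of the binding function described above can be sketched as follows. All modelling choices are assumptions made for illustration and do not come from the paper: the structural model is a Gaussian MA(1) process, the auxiliary estimator is the least squares slope of an AR(1) regression without intercept, the simulated paths reuse one seed across parameter values (common random numbers), and the names `simulated_binding` and `feasible_ie` are hypothetical.

```python
import random

def simulate_ma1(theta, n, seed):
    # y_t = e_t + theta * e_{t-1} with standard Gaussian innovations.
    rng = random.Random(seed)
    e_prev = rng.gauss(0.0, 1.0)
    ys = []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        ys.append(e + theta * e_prev)
        e_prev = e
    return ys

def ar1_slope(ys):
    # Auxiliary estimator: least squares slope of y_t on y_{t-1}.
    num = sum(a * b for a, b in zip(ys[1:], ys[:-1]))
    den = sum(a * a for a in ys[:-1])
    return num / den

def simulated_binding(theta, n_sim, seed):
    # Stochastic approximation of the binding function at theta.
    return ar1_slope(simulate_ma1(theta, n_sim, seed))

def feasible_ie(beta_hat, grid, n_sim, seed):
    # Approximate minimization of |beta_hat - simulated binding| over the grid.
    return min(grid, key=lambda t: abs(beta_hat - simulated_binding(t, n_sim, seed)))

observed = simulate_ma1(0.5, 20000, seed=1)       # data with true theta = 0.5
beta_hat = ar1_slope(observed)
grid = [i / 20.0 for i in range(-18, 19)]         # -0.9, -0.85, ..., 0.9
theta_sim = feasible_ie(beta_hat, grid, n_sim=10000, seed=2)
```

Reusing the same seed for every parameter value keeps the simulated binding function smooth in the parameter, which is one simple way to make the joint continuity requirement plausible in practice.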

3. Examples

In this section we consider four simple examples that illustrate some of the previous results. The first concerns the case of a linear semiparametric model, the second a model comprised of Levy processes, and the final two emerge in the context of conditionally heteroskedastic ones. In each of these, is a compact subset of and a compact subset of . In the second and the fourth ones the binding function is actually single valued (hence a fortiori compact valued) and 1-1 enabling the direct application of Proposition 19 . The first and second examples include cases in which the IE can be interpreted as performing “inconsistency” correction to the auxiliary one.

Example 21 (semi-parametric linear model with linear auxiliary). Consider the and dimensional random matrices and , respectively, where . Suppose that , , a.s., where and . For an random vector, let the underlying statistical model be the set of “regressions” , . For a large enough compact and convex subset of and any , let , which clearly satisfies Assumption 1 due to continuity with respect to and the compactness of . Obviously, is constructed by the auxiliary set of regression with respect to . Proposition 4 ensures the existence of which in the light of the previous can be interpreted as an in the context of the auxiliary model. Let be the (generally nonlinear) projection defined by the optimization problem for in . is well defined due to the compactness and the convexity of and the linearity and continuity of and continuous. Furthermore, for any , consider the linear system , which is always satisfied by any member of the coset , where is a matrix of rank and is a -dimensional subspace of , which is trivial if and only if whereas and maximal in the case that . For , , assume that , and a.s. The previous imply the joint uniform a.s. convergence of to which implies both Assumptions 5 and 11 (via Proposition 12 and Remark 13). Notice that due to the convexity of with respect to for any and the definition of . If for any then Assumption 17 applies. (More precisely we have that Hence, Proposition 19 implies the existence of a consistent for . In the special case where and then and Proposition 19 implies that any defined by Definition 15 can be perceived as an “inconsistency corrector” of the underlying for .

We now consider the case of the estimation of the drift of a continuous time cadlag process.

Example 22 (the drift of a Levy process with bounded jumps). Let denote a standard Brownian motion and a finite measure on the Borel algebra of , such that when for . Obviously is a Levy measure (see paragraph 1.2.4 of [27]). For consider the stochastic process on defined by the following Levy-Ito decomposition (see Theorem 2.4.16 of [27]) where denotes the Poisson random measure of that is independent of , the existence of which is established by Theorem 2.3.6 of [27]. Let the underlying statistical model be the set of the previous stochastic processes and for a large enough compact subset of and any , let , where . This can be perceived to emerge as an approximate likelihood function of the auxiliary model that contains the relevant discretizations of the processes that satisfy the for each . Obviously Assumption 1 is satisfied, due to continuity with respect to and the compactness of . Proposition 4 ensures the existence of which in the light of the previous can be interpreted as an (approximate) in the context of the auxiliary model. Furthermore since and independent of , we have that for for . Due to the definition of the process is , and this along with the compactness of , and and the existence of the previous moments imply the joint uniform a.s. convergence of to which implies both Assumptions 5 and 11 (via Proposition 12 and Remark 13). If then . In this case Assumption 17 applies and therefore Proposition 19 implies that any IE defined by Definition 15 is consistent for any . When (and therefore ) whereas the IE can be perceived as an “inconsistency corrector” of the underlying for the estimation of the drift of a geometric Brownian motion (see, e.g., paragraph 6.1.1 of [13]).

For the last pair of examples, let be an i.i.d. (doubly infinite) sequence of random variables, with and . Consider a random element , with the product space equipped with with independent of , , . Analogously, define the random element as

Then for all , is called a conditionally heteroskedastic process, while the random element a conditionally heteroskedastic model. Our examples will solely concern ergodic heteroskedastic models. (The establishment of the ergodicity is initiated by the analogous establishment for . Sufficient conditions for that are described and employed in a variety of heteroskedastic models in Chapter 5 of [24] via Theorem 5.2.1. Then the ergodicity of and follows from the definition of , the previous assumption and Proposition 2.2.1 of [24]).

Example 23 (instrumental variables estimation in regressions on squared ARCH(1) processes). Let and . Suppose that and consider the stochastic difference equation Due to the fact that Theorem 5.2.1 of [24] implies that for any the equation admits a unique stationary and ergodic solution defining the analogous process. Consider the random vector and the dimensional random matrices jointly measurable with respect to , where and ergodic for any . For , let which clearly satisfies Assumption 1 due to joint continuity with respect to , the compactness of , the joint measurability, and the fact that is defined via composition with a norm. This consideration is motivated from the representation of the process with respect to the martingale difference noise (see, e.g., [28]) and can be perceived to emerge from an auxiliary model that consists of the set of “auxiliary” regression functions of on , along with the instrumental variables appearing in the columns of where the element in any column is clearly orthogonal to for . Proposition 4 ensures the existence of which in the light of the previous sentence can be interpreted as an estimator in the context of the auxiliary model. Due to the compactness of , the definition of the model, and the definitions of and we have that for which along with (the uniform version of) Birkhoff’s Ergodic theorem (see, e.g., [24], Theorem 2.2.1) implies both Assumptions 5 and 11 (via Proposition 12 and Remark 13) for In fact a simple calculation shows that which clearly implies Assumption 17. Hence Proposition 19 implies that any IE defined by Definition 15 is consistent if , and Proposition 19 implies the existence of an analogously consistent IE when . It is easy to see that, given , where denotes projection to the -axis and denotes the midpoint of the smallest interval that contains . Notice that in our case is well defined due to the fact that is a.s. compact valued hence its is an a.s. compact subset of the real line. Finally and due to the fact that bootstrap resampling techniques are readily available in the context of this model, Proposition 20 implies also the analogous properties for defined by when this equals the auxiliary estimator derived from bootstrap resampling for any .
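The representation of a squared ARCH(1) process with respect to martingale difference noise can be illustrated by simulation. The sketch below assumes the textbook parameterization y_t = sqrt(h_t) z_t with h_t = omega + alpha*y_{t-1}^2 and Gaussian z_t; the names `omega` and `alpha` and the chosen values are hypothetical, and plain least squares on the lagged square stands in for the instrumental variables criterion of the example. Under these assumptions y_t^2 = omega + alpha*y_{t-1}^2 + noise, so the regression of y_t^2 on its own lag should recover (omega, alpha) approximately.

```python
import math
import random


def simulate_arch1(omega, alpha, n, rng, burn_in=500):
    """Simulate y_t = sqrt(h_t) * z_t with h_t = omega + alpha * y_{t-1}^2, z_t i.i.d. N(0, 1)."""
    y_prev = 0.0
    path = []
    for t in range(n + burn_in):
        h = omega + alpha * y_prev ** 2
        y_prev = math.sqrt(h) * rng.gauss(0.0, 1.0)
        if t >= burn_in:          # discard the start-up segment
            path.append(y_prev)
    return path


def ols_on_lagged_squares(path):
    """Regress y_t^2 on (1, y_{t-1}^2) and return (intercept, slope)."""
    squares = [y * y for y in path]
    x, v = squares[:-1], squares[1:]
    mean_x = sum(x) / len(x)
    mean_v = sum(v) / len(v)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxv = sum((a - mean_x) * (b - mean_v) for a, b in zip(x, v))
    slope = sxv / sxx
    return mean_v - slope * mean_x, slope


rng = random.Random(1)
path = simulate_arch1(omega=1.0, alpha=0.2, n=50000, rng=rng)
intercept_hat, slope_hat = ols_on_lagged_squares(path)
print(intercept_hat, slope_hat)
```

The value alpha = 0.2 is chosen so that sufficiently high moments of the simulated process exist and the sample averages are well behaved.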

The final example is about an asymmetric heteroskedastic process.

Example 24 ( is the quasilikelihood function of an approximate to model). Let and , and consider for , , and the stochastic difference equation For any , the previous defines a unique stationary and ergodic volatility process with existing moments that is uniformly bounded from below away from zero (see Lemmas 2.1 and 3.4 and Remark R.2 of Arvanitis and Louka [29]). Notice that Jensen’s inequality allows which in turn implies that . For consider the process defined by where is well defined due to the definition of and it is stationary and ergodic with existing moments due to the previous and Proposition 2.1.1 of [24]. Now consider where can be considered as an approximation of (a monotonic transformation of) the conditional quasilikelihood function of the auxiliary conditionally heteroskedastic model defined by 3 and . Also the ergodicity of for any follows from the previous and Proposition 2.1.1 of [24]. (In practice is unknown but approximated by an analogous dependent on nonergodic solutions of the stochastic difference equation that defines based on arbitrary initial conditions. In this case, due to ergodicity, Proposition 5.2.12 of [24] can be employed in order to ensure that converges almost surely to zero for any (see the first part of the proof of Theorem 5.3.1 of [24]), thereby facilitating the asymptotic analysis of minimizers of by the analogous analysis of minimizers of .) Assumption 1 follows readily from the form of and the a.s. continuity with respect to . Hence is well defined and can be interpreted as an approximate in the context of the auxiliary model. Now, consider an arbitrary finite open cover of and notice that and that for an arbitrary member of the partition. Notice also that for all due to the fact that Hence, Remark 6 implies that Assumption 5 holds with Notice that is uniquely minimized at (see, e.g., Part 1 of the proof of Theorem 5.3.1 of [24] to obtain the analogous arguments along with the fact that if and only if ). When then due to the fact that and that when , then . Furthermore, using the fact that by 3 h is a.s. twice differentiable with respect to for and since as well as dominated convergence, we have that which is zero if and only if , and establishing along with the previous that . This validates simultaneously both Assumptions 11 and 17. Notice that Assumption 11 could also be verified by the use of Lemma 14 due to the continuity of and with respect to the parameters, the existence of moments, and the fact that is uniformly bounded from below away from zero. Then Proposition 19 implies that any IE defined by Definition 15 is consistent for any .

4. Conclusions

In this paper we generalize the definition of IEs and address the questions of their existence and strong consistency. We allow for cases where the binding function is a compact valued correspondence. We have used conditions that concern the asymptotic behavior of the epigraphs of the criteria involved in the relevant procedures, a relevant notion of continuity for the binding correspondence and an indirect identification condition that restricts the behavior of the aforementioned correspondence. These results are generalizations of the analogous ones in the relevant literature and hence permit a broader scope of statistical models.

First, notice that our framework could still be extended in the following manner. The established results would remain almost intact if the underlying parameter spaces were only locally compact under more restrictive assumptions on the behavior of the criteria involved. In such a case Proposition 4.2.1.(i) of [30] would permit the validity of the results, except for the compactness of the auxiliary and the binding correspondences, under the additional condition that a.s. , and have totally bounded level sets and nonempty argmins.

Second, the present generalization is certainly nonunique. Again under stricter conditions on the behavior of the auxiliary criteria, possibly relevant to the ones in Proposition 3.42 of [31], the implication Proposition 8(2) could be strengthened to hold for any asymptotically null sequence of optimization errors. In this case in the definition of the could be replaced by and this would initially allow the identification condition in Assumption 17 to be replaced by the weaker “if .” If Assumption 11 were also strengthened to require Fell continuity then the strong consistency result would be valid for any IE defined in this framework. We leave this for future research.

We also leave for future research the questions of the definition and consistency of IEs when the appearing in Proposition 20 is some sort of integral of (see, e.g., [32]) or some (possibly) stochastic approximation of it. The same holds for the issues of the establishment of the rates of convergence and the asymptotic distribution of the IE in this general framework. Notice that this limit theory could in principle be quite complex due to complexities in the analogous theory for and/or to different properties of potential polynomial approximations for different selections of around and so forth. The establishment of primitive conditions that guarantee the indirect identification Assumption 17 is also of separate interest.

Appendices

A. Some General Notions

In this section we briefly describe some general notions that are essentially used in the main body.

A.1. Fell and Upper Topology

Let denote a topological space. We identify it with when there is no risk of confusion. We denote by the set of closed nonempty subsets of . We will use two topologies on constructed via and the inclusion partial order on . The first is called the upper topology and we define it as the restriction of the topology in Definition 1.3.1 of [30] on .

Definition A.1. The upper topology on is generated by the subbase consisting of

The upper topology is extremely useful for the analysis of the asymptotic behavior of sequences of sets of minimizers. If is generated by a metric (say ) with respect to which is compact, then is hemimetrizable (see Definition 4.1.3 for the term hemimetric and Proposition 4.2.2 of [30]) by , defined by where (where ). Obviously, when then . The following properties of will be useful in what follows.

Lemma A.2. consider

Proof. Since is closed if and , then . But then and therefore .

Lemma A.3. is a lower semicontinuous (lsc) real function with respect to the first argument.

Proof. If with respect to the upper topology on , then , and hence .

Lemma A.4. is an upper semicontinuous (usc) real function with respect to the second argument.

Proof. Suppose that with respect to the upper topology on ; then by the triangle inequality we have that establishing that .
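A small numerical sketch may help fix ideas about the hemimetric. It assumes finite subsets of the real line with the usual distance, and it assumes the convention that the hemimetric is the one-sided excess of the first set over the second (the supremum, over points of the first set, of the distance to the second set); the function names are hypothetical. With this convention the excess vanishes under inclusion, is not symmetric, and obeys the triangle inequality.

```python
def dist_to_set(x, B):
    """Distance from the point x to the finite nonempty set B."""
    return min(abs(x - b) for b in B)


def excess(A, B):
    """One-sided excess of A over B: the largest distance from a point of A to B."""
    return max(dist_to_set(a, B) for a in A)


A = [0.0, 1.0]
B = [0.0, 1.0, 3.0]
C = [0.5]

print(excess(A, B))   # A is contained in B, so the excess is zero
print(excess(B, A))   # the point 3.0 of B is at distance 2.0 from A
print(excess(B, C), excess(B, A) + excess(A, C))   # triangle inequality check
```

Because the excess of A over B is zero without the two sets being equal, this quantity separates sets only up to inclusion, which is the sense in which it is a hemimetric rather than a metric.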

The second topology on , known as the Fell topology, is defined by the use of the following subbase (see [22], paragraph 1.1, and [30], Definition 4.5.1).

Definition A.5. The Fell topology, say , is the smallest topology on containing both (1) , , nonempty, and (2) , nonempty and compact.

From Theorems 4.5.3-5 of [30] we have that when is locally compact and Hausdorff then is locally compact and with respect to the Fell topology if and only if where is the set comprised of the limit points of any possible sequence such that , and is the one comprised of the analogous cluster points. Hence, in this case this type of convergence coincides with the Painleve-Kuratowski convergence (see among others, Appendix B of [22], or Definition 3.1.4. of [30]). If is also separable then the Fell topology is metrizable. If furthermore is compact and metrized by , then the Fell topology is actually metrized by the Hausdorff extended metric defined via a symmetrization of ; that is, where . In this case we can prove the following lemma.

Lemma A.6. If is compact and metrized by , then is a lower semicontinuous (lsc) real function with respect to the product topology on , when the first factor is endowed with and the second with .

Proof. If with respect to the aforementioned product topology on , then establishing that .
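The symmetrization that yields the Hausdorff metric can be sketched for finite subsets of the unit interval. The code assumes the usual distance on the real line and takes the Hausdorff distance as the maximum of the two one-sided excesses; the helper names and the use of a fine grid as a stand-in for the interval [0, 1] are assumptions made for illustration.

```python
def excess(A, B):
    """Largest distance from a point of A to the finite set B."""
    return max(min(abs(a - b) for b in B) for a in A)


def hausdorff(A, B):
    """Hausdorff distance as the symmetrization of the one-sided excess."""
    return max(excess(A, B), excess(B, A))


def grid(n):
    """The n + 1 equally spaced points 0, 1/n, ..., 1."""
    return [k / n for k in range(n + 1)]


target = grid(1000)   # fine grid standing in for the compact set [0, 1]
dists = [hausdorff(grid(n), target) for n in (2, 10, 100)]
print(dists)          # shrinks as the coarse grids fill in the interval
```

Each coarse grid is contained in the fine one, so only one of the two excesses is nonzero here; the Hausdorff distance is driven by the points of the target that the coarse grid misses.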

A.2. Epigraphs of Semicontinuous Functions and Epiconvergence

Consider now the case where is locally compact and Hausdorff and .

Definition A.7. The epigraph of is

Note that despite the fact that the image of may include infinities, is by definition a subset of . If and only if is lower semicontinuous ( ) we have that, due to Proposition A.2 of [22], with respect to the obvious product topology. Hence any relevant function can be identified with its epigraph, which in turn lies in a space endowed with Fell topology, which in turn implies a notion of convergence.

Definition A.8. A sequence of lsc functions epiconverges to if and only if with respect to the Fell topology.

It is easy to see that uniform convergence implies epiconvergence (see, e.g., Remark 6 above). Furthermore the relevant set of is closed with respect to the Fell topology. This notion is particularly suitable for the description of the asymptotic behavior of the set of minimizers of sequences of functions (see Theorem 3.4 of [22] along with Theorem 7.1.4 of [30], Definition D.1 and Proposition D.2 of [22]).
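The link between uniform convergence and the behavior of minimizers can be checked on a toy sequence. The functions below are hypothetical and chosen only for illustration: f_n(x) = (x - 1)^2 + sin(n x)/n on the compact interval [-2, 3], which converges uniformly to (x - 1)^2 at rate 1/n. Minimization is carried out on a fixed grid, so the reported minimizers are grid approximations.

```python
import math


def f_limit(x):
    """Limit criterion, uniquely minimized at x = 1."""
    return (x - 1.0) ** 2


def f_n(n, x):
    """Perturbed criterion; sup |f_n - f_limit| is at most 1/n."""
    return (x - 1.0) ** 2 + math.sin(n * x) / n


GRID = [-2.0 + 0.001 * i for i in range(5001)]   # grid on the compact set [-2, 3]


def argmin_on_grid(f):
    """A grid minimizer of f."""
    return min(GRID, key=f)


def sup_distance(n):
    """Grid approximation of the uniform distance between f_n and f_limit."""
    return max(abs(f_n(n, x) - f_limit(x)) for x in GRID)


results = {}
for n in (10, 100, 1000):
    x_n = argmin_on_grid(lambda x: f_n(n, x))
    results[n] = (sup_distance(n), x_n)
    print(n, results[n])
```

Since any minimizer x_n satisfies (x_n - 1)^2 <= 2/n up to the grid error, the minimizers approach 1 at least at rate sqrt(2/n), in line with the convergence of argmins that epiconvergence delivers on compact sets.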

A.3. Closed and Compact Valued Correspondences-Random Closed Sets

A closed valued correspondence is by definition a representation of an underlying function from a set to (i.e., a closed valued multifunction with domain the set ), when this is considered as a relation in . The benefit of not directly working with the underlying function is the fact that we can consider the graph of the correspondence as the set which resides in instead of the set inside . When is compact for any , then the correspondence is obviously termed compact valued. In the following we do not make explicit distinction between the correspondence and the underlying multifunction.

The Borel -algebra on generated by will be abbreviated by and is usually termed Effros algebra (see Paragraph 1.1 of [22]). If is a measurable space, then is a random closed set if and only if for any . Analogously we abbreviate by the Borel -algebra on generated by and by the Borel -algebra on generated by the product topology described in Lemma A.6. Finally denote by the Borel -algebra of the extended real numbers with respect to the usual topology.

Lemma A.9. If is compact, separable, and metrized by , then is measurable.

Proof. The separability of implies the separability of and for if is dense in then the countable subset of , intersects any basic open set with respect to either topology. This implies the separability of when equipped with the topology discussed in Lemma A.6. This in turn implies that the Borel -algebra with respect to the product topology on coincides with by Lemma 1.4.1 of [33]. The rest follows by Lemma A.6 along with the fact that the sets in the subbase of the upper topology of generate . (It is also possible to prove that in the context of separability ).