Research Article | Open Access
The Fréchet Derivative of an Analytic Function of a Bounded Operator with Some Applications
The main result in this paper is the determination of the Fréchet derivative of an analytic function of a bounded operator, tangentially to the space of all bounded operators. Some applied problems from statistics and numerical analysis are included as a motivation for this study. The perturbation operator (increment) is not of any special form and is not supposed to commute with the operator at which the derivative is evaluated. This generality is important for the applications. In the Hermitian case, moreover, some results on perturbation of an isolated eigenvalue, its eigenprojection, and its eigenvector if the eigenvalue is simple, are also included. Although these results are known in principle, they are not in general formulated in terms of arbitrary perturbations as required for the applications. Moreover, these results are presented as corollaries to the main theorem, so that this paper also provides a short, essentially self-contained review of these aspects of perturbation theory.
1. Introduction
Motivated by certain applications in numerical analysis and, in particular, statistics, this paper deals with the Fréchet derivative of an analytic function of a bounded linear operator on a separable Hilbert space (in the sense of the usual functional calculus), tangentially to the Banach space of all bounded linear operators mapping into itself. More precisely, a first-order approximation to the difference is obtained, including the order of magnitude of the remainder. An example of such a function is a generalized or regularized inverse of the square root, where is the identity operator. Once the Fréchet derivative has been established (Section 2), it yields the asymptotic distribution of functions of certain random operators via an ensuing delta method, a well-known statistical technique (see Section 4).
Clearly can be regarded as a perturbed version of , and it is not surprising that perturbation methods are employed to obtain the desired result. The authors are aware of the possibility that the rather straightforward result on the Fréchet derivative might be hidden somewhere in the rich literature on perturbation theory [1–3]. Yet they have not been successful in identifying a reference that states the result in its present form, tailored to the applications they have in mind. Some remarks are particularly in order.
(a) The perturbations are typically of small norm but otherwise arbitrary bounded or Hermitian. In the literature, they are often of the form for operators , and a small number . In statistics, there is no point in representing the perturbation in such a form.
(b) The perturbation and the operator are not assumed to commute, because in our applications such an assumption would not in general be fulfilled. If the operators do commute, however, the Fréchet derivative would reduce to , in the sense of functional calculus with the derivative of . In the case considered here, the actual Fréchet derivative and may differ considerably.
(c) A central theme in perturbation theory concerns the perturbation of an isolated eigenvalue and corresponding eigenprojection (see, e.g., the references mentioned before). Some of the results are included, because they can be easily derived from the main result on the Fréchet derivative by choosing a special function (Section 3). In this way, the paper presents a concise and essentially self-contained review of some basic results in this area. They are again presented in terms of a general (Hermitian) perturbation , as required for statistical application, in the same vein as, but somewhat more general than, Dauxois et al. [4].
As has already been mentioned in the beginning, will be a separable Hilbert space and the Banach space of all bounded linear operators mapping into itself. The inner product on will be denoted by and the norm by The norm on will be written , and the notation and will be used to denote the subspace of all Hermitian and all compact Hermitian operators, respectively.
We will exclusively deal with infinite dimensional Hilbert spaces and will not attempt to include the simpler finite dimensional case in our formulation. The Fréchet derivative for arbitrary perturbations is well known in the finite dimensional matrix case. This result and further references can be found in the recent monograph by Bhatia [5]. In the finite dimensional case, this derivative is also implicitly present in Theorem 2.1 of Ruymgaart and Yang [6], where it is used to obtain the asymptotic distribution of a function of a random matrix.
2. The Fréchet Derivative
Let us fix an arbitrary with spectrum and a bounded open region in the complex plane with smooth boundary , such that Furthermore, let us consider functions of type where is an open neighborhood of . Let us write
The resolvent is analytic on the resolvent set , and the operator is well defined. This relation establishes an algebra homomorphism [7, Section 17.2] which implies in particular that if is also analytic. In particular, we have
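The contour-integral definition of the functional calculus can be checked numerically in a toy setting. The following is a minimal sketch, not the paper's construction: it assumes a 2x2 matrix in place of the bounded operator, a circle of radius 4 as the contour, and the trapezoid rule for the integral. All function and variable names are illustrative assumptions. It checks that the function z^2 reproduces the matrix square and that the homomorphism property makes exp(T) exp(-T) the identity.

```python
import cmath

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def resolvent(t, z):
    """R(z) = (zI - T)^{-1} for a 2x2 matrix T, via the adjugate formula."""
    a, b = z - t[0][0], -t[0][1]
    c, d = -t[1][0], z - t[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def phi_of(t, phi, radius=4.0, n=400):
    """Trapezoid approximation of (1/(2 pi i)) * integral of phi(z) R(z) dz
    over the circle |z| = radius, which must enclose the spectrum of T."""
    total = [[0j, 0j], [0j, 0j]]
    for m in range(n):
        z = radius * cmath.exp(2j * cmath.pi * m / n)
        weight = phi(z) * z / n  # phi(z) dz / (2 pi i), since dz = i z (2 pi / n)
        r = resolvent(t, z)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * r[i][j]
    return total

def max_gap(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Illustrative matrix (an assumption); its eigenvalues lie inside |z| < 4.
T = [[2.0, 1.0], [0.5, -1.0]]

# phi(z) = z^2 should reproduce the matrix square.
err_square = max_gap(phi_of(T, lambda z: z * z), mat_mul(T, T))

# Homomorphism property: exp(T) exp(-T) should be the identity.
product = mat_mul(phi_of(T, cmath.exp), phi_of(T, lambda z: cmath.exp(-z)))
err_identity = max_gap(product, [[1.0, 0.0], [0.0, 1.0]])

print(err_square, err_identity)
```

Because the integrand is analytic and periodic along the circle, the trapezoid rule converges very quickly here, which is why a few hundred nodes suffice in this sketch.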
The operators are well defined for every sufficiently small. Note that according to Dunford and Schwartz [8, Lemma VII.6.11], there is a constant , such that
Theorem 2.1 (Fréchet Derivative). Let and suppose that satisfies (2.2). Then maps the neighborhood into , when defined in the usual way of functional calculus. This mapping is Fréchet differentiable at , tangentially to , with bounded derivative as defined by (2.8). More specifically, we have where is defined in (2.9) and
Proof. For to be well defined on the neighborhood, let us first show that To verify this, note that by (2.10) we have for such . Consequently, the operator is bounded for each , which entails (2.14). Hence, is well defined for with .
Applying a Neumann series expansion [9, Section 5.2] to the inverse on the left in (2.15), we obtain just as in Watson [10] for matrices. Term-wise integration yields (2.11).
The upper bounds in (2.12) and (2.13) are immediate from (2.8) and (2.9), respectively, by exploiting (2.3) and (2.15). The boundedness of as a linear operator mapping into itself follows at once from (2.12).
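The first-order expansion and the quadratic order of the remainder can be illustrated numerically. The sketch below rests on stated assumptions: T denotes the operator and Π the perturbation (notation chosen here), both replaced by 2x2 matrices, and the derivative in (2.8) is taken to have the standard contour-integral form (1/(2πi)) ∮ φ(z) R(z) Π R(z) dz, which is the first-order term of the Neumann expansion of the perturbed resolvent. The helper names are hypothetical. For φ(z) = z^2 the derivative should equal TΠ + ΠT, which differs from 2TΠ when T and Π do not commute; for φ = exp, halving the perturbation should divide the remainder by roughly four.

```python
import cmath

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin_comb(a, b, s):
    """Entrywise a + s * b."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

def resolvent(t, z):
    """R(z) = (zI - T)^{-1} for a 2x2 matrix T."""
    a, b = z - t[0][0], -t[0][1]
    c, d = -t[1][0], z - t[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def contour(phi, kernel, radius=4.0, n=400):
    """Trapezoid rule for (1/(2 pi i)) * integral of phi(z) kernel(z) dz on |z| = radius."""
    total = [[0j, 0j], [0j, 0j]]
    for m in range(n):
        z = radius * cmath.exp(2j * cmath.pi * m / n)
        total = lin_comb(total, kernel(z), phi(z) * z / n)  # dz / (2 pi i) = z / n
    return total

def phi_of(t, phi):
    """phi(T) by the contour-integral functional calculus."""
    return contour(phi, lambda z: resolvent(t, z))

def frechet(t, pert, phi):
    """Assumed form of the derivative: (1/(2 pi i)) * integral of phi(z) R(z) Pi R(z) dz."""
    return contour(phi, lambda z: mat_mul(mat_mul(resolvent(t, z), pert), resolvent(t, z)))

# Illustrative matrices (assumptions); PI does not commute with T.
T = [[2.0, 1.0], [0.5, -1.0]]
PI = [[0.0, 0.1], [-0.2, 0.05]]

# phi(z) = z^2: the derivative should be T PI + PI T, not 2 T PI.
d_square = frechet(T, PI, lambda z: z * z)
anticommutator = lin_comb(mat_mul(T, PI), mat_mul(PI, T), 1.0)
err_square = max_abs(lin_comb(d_square, anticommutator, -1.0))
gap_commuting = max_abs(lin_comb(d_square, mat_mul(T, PI), -2.0))

def remainder(eps):
    """Size of exp(T + eps PI) - exp(T) - eps * (derivative applied to PI)."""
    shifted = lin_comb(T, PI, eps)
    diff = lin_comb(phi_of(shifted, cmath.exp), phi_of(T, cmath.exp), -1.0)
    return max_abs(lin_comb(diff, frechet(T, PI, cmath.exp), -eps))

ratio = remainder(0.1) / remainder(0.05)
print(err_square, gap_commuting, ratio)
```

The entrywise maximum is used as a convenient stand-in for the operator norm; in finite dimensions all norms are equivalent, so the quadratic scaling of the remainder is unaffected by this choice.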
Remark 2.2. It will be seen in Section 4 that for the applications we have in mind it is important that we do not require that and commute. If they do, however, it is clear that the Fréchet derivative in (2.8) reduces to It has been shown in Dunford and Schwartz [8, proof of Theorem VII.6.10] that Combination of (2.11) with (2.18) and (2.19) yields writing, for any , to indicate any quantity (operator, vector, number) whose norm or absolute value is of the given order. Note that in (2.20) the operator is to be understood in the sense of the usual functional calculus as in (2.5) with replaced by its derivative. In this situation of commuting operators, Dunford and Schwartz [8] obtain the Taylor expansion which implies, of course, (2.20).
Keeping the perturbation as before, we now restrict to the class of compact Hermitian operators. The bounded and countable spectrum consists of the number , whether an eigenvalue or not, and all the nonzero eigenvalues . In this work, we avoid technical issues related to being an eigenvalue, and assume that is one-to-one, that is, implies that . It is well known that such a can be represented as where the are the corresponding orthogonal eigenprojections onto the mutually orthogonal finite dimensional eigenspaces. These projections provide a resolution of the identity in , that is, The resolvent has the expansion
Proof. Let us substitute the expansion (2.25) for into the expression for in (2.8). Application of the partial fraction method yields The right-hand side of (2.27) reduces at once to the expression on the right in (2.26) by an application of Cauchy's integral formula.
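The outcome of the partial fraction argument can be illustrated in finite dimensions. The sketch below assumes that (2.26) has the familiar divided-difference form: the component of the derivative between the j-th and k-th eigenspaces is the perturbation's component multiplied by (φ(λ_j) − φ(λ_k))/(λ_j − λ_k), with φ'(λ_j) when j = k. This form, the diagonal 3x3 matrix standing in for the compact Hermitian operator, and all names are assumptions made for illustration. For a diagonal matrix the eigenprojections are the coordinate projections, so the formula acts entrywise. The check uses φ(z) = z^3, whose derivative in the direction Π is T²Π + TΠT + ΠT² by direct expansion.

```python
def divided_difference(phi, dphi, a, b):
    """First divided difference of phi, with the derivative on the diagonal."""
    return dphi(a) if a == b else (phi(a) - phi(b)) / (a - b)

def frechet_diagonal(eigs, pert, phi, dphi):
    """Assumed divided-difference formula for T = diag(eigs), applied entrywise."""
    n = len(eigs)
    return [[divided_difference(phi, dphi, eigs[j], eigs[k]) * pert[j][k]
             for k in range(n)] for j in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Illustrative eigenvalues and symmetric perturbation (assumptions).
eigs = [1.0, 0.5, 0.25]
T = [[eigs[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
PI = [[0.3, -0.1, 0.2], [-0.1, 0.0, 0.4], [0.2, 0.4, -0.5]]

formula = frechet_diagonal(eigs, PI, lambda z: z ** 3, lambda z: 3 * z ** 2)

# Direct expansion of (T + PI)^3 - T^3 to first order in PI.
T2 = mat_mul(T, T)
terms = [mat_mul(T2, PI), mat_mul(mat_mul(T, PI), T), mat_mul(PI, T2)]
direct = [[sum(term[i][j] for term in terms) for j in range(3)] for i in range(3)]

err = max(abs(formula[i][j] - direct[i][j]) for i in range(3) for j in range(3))
print(err)
```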
Example 2.4. The function , , is analytic on the entire complex plane so that Corollary 2.3 applies. The Fréchet derivative in (2.26) now reduces to. Of course this result is immediate because in this simple case .
Example 2.5. Next let us, for , consider the function for . Note that the choice of ensures that the pole at remains outside the contour . Clearly there exists an open region of the type required, such that is analytic on some open neighborhood of . Hence Corollary 2.3 applies again. The operator represents a regularized or generalized inverse of Tikhonov type, according to whether is injective or not. The Fréchet derivative in (2.26) now equals for .
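For a Tikhonov type inverse the derivative can also be obtained directly, without the spectral expansion. The following worked step assumes that the function in (2.29) is φ(z) = (α + z)^{-1} with α > 0, and writes T for the operator, Π for the perturbation, and I for the identity; this notation is an assumption made here. The first line is the second resolvent identity, and the second line follows by applying the same identity once more to the leading factor.

```latex
\begin{align*}
(\alpha I + T + \Pi)^{-1} - (\alpha I + T)^{-1}
  &= -(\alpha I + T + \Pi)^{-1}\,\Pi\,(\alpha I + T)^{-1} \\
  &= -(\alpha I + T)^{-1}\,\Pi\,(\alpha I + T)^{-1} + O\!\left(\|\Pi\|^{2}\right),
  \qquad \|\Pi\| \to 0 .
\end{align*}
```

Under these assumptions the derivative at T in the direction Π is −(αI + T)^{-1} Π (αI + T)^{-1}. If, in addition, T is nonnegative Hermitian, then ‖(αI + T)^{-1}‖ ≤ 1/α, so the norm of the derivative is at most ‖Π‖/α².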
3. Perturbation of Eigenvalues and Eigenvectors
Throughout this section, both and are assumed to be Hermitian, so that also . In addition to this, we assume that with one-dimensional eigenspace. Consequently, the eigenprojection can be written where for the operator is defined by .
The region will now be chosen in such a way that it has a connected component with the properties A special analytic function such that will play an important role in the sequel. Note, for instance, that
For the Fréchet derivative of at , a special expression can be obtained. Let us write where is Hermitian with spectrum . According to the spectral theorem, there exists a resolution of the identity , , such that It should be noted that where is the zero operator, and that Let us define
Lemma 3.1. The Fréchet derivative of at is given by
Proof. This follows by substitution of (3.9) in the expression on the right in for this derivative; see also (2.8). We thus obtain
By Cauchy's integral formula so that the first term on the right side in (3.13) is the zero operator. Regarding the second, note that because each lies outside the contour . Consequently, the second term equals Similarly, the third term equals . The last term cancels, because since both and lie outside .
Some results about the perturbation of and in a given direction as in (1.3) that are well known in the literature [1, 2] can be partly recovered for perturbations in some neighborhood, in an essentially self-contained manner, as simple consequences of the results in Section 2.
Corollary 3.2. Under the assumptions (3.1), (3.2), and for sufficiently small, the operator has an isolated eigenvalue with eigenprojection for some unit vector , satisfying where is defined in (3.10).
Proof. In view of (3.5) and (3.11), application of (2.11) with yields . Clearly is Hermitian, and because by (2.6), it is also idempotent so that it is in fact some projection , for example, it follows that for all sufficiently small, and hence the range of must also have dimension 1 so that for some with .
Next, let , , be the identity function. By (2.6), again, on the one hand we have , and on the other . Hence is an eigenvector of with eigenvalue .
Corollary 3.3. Under the assumptions of Corollary 3.2, we have
Proof. Let us first observe that because of (3.8). Hence (3.18) yields , where It suffices to show that for . The idea of the proof can be found in Dauxois et al. [4].
Regarding , note that , once more using (3.18). Hence , as , and therefore for sufficiently small. This entails
For we have as can be seen from (3.21).
Corollary 3.4. Under the assumptions of Corollary 3.2, we have
Proof. With the help of (3.19), we see that . The result follows from a routine calculation combined with the equalities , , and . For the last two equalities we assume that and are Hermitian and by (3.8).
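The eigenvalue expansion lends itself to a quick numerical check. The sketch below assumes that the expansion in Corollary 3.4 has the familiar first-order form: the perturbed eigenvalue equals λ + ⟨Πp, p⟩ up to a remainder of order ‖Π‖², where λ is the simple eigenvalue of T with unit eigenvector p and Π is the Hermitian perturbation. This notation, the 2x2 symmetric matrices, and the helper names are assumptions for illustration. Halving the perturbation should divide the remainder by roughly four.

```python
import math

def top_eigenvalue(m):
    """Largest eigenvalue of a symmetric 2x2 matrix, in closed form."""
    trace = m[0][0] + m[1][1]
    spread = math.hypot(m[0][0] - m[1][1], 2.0 * m[0][1])
    return 0.5 * (trace + spread)

# Illustrative data (assumptions): T has the simple eigenvalue 2 with eigenvector p.
T = [[2.0, 0.0], [0.0, 1.0]]
LAM = 2.0
P = [1.0, 0.0]
PI = [[0.3, 0.4], [0.4, -0.2]]   # symmetric perturbation direction

def first_order(eps):
    """lambda + eps * <PI p, p>, the assumed first-order approximation."""
    quad = sum(P[i] * PI[i][j] * P[j] for i in range(2) for j in range(2))
    return LAM + eps * quad

def remainder(eps):
    """Gap between the exact perturbed eigenvalue and its first-order approximation."""
    perturbed = [[T[i][j] + eps * PI[i][j] for j in range(2)] for i in range(2)]
    return abs(top_eigenvalue(perturbed) - first_order(eps))

ratio = remainder(1e-2) / remainder(5e-3)
print(remainder(1e-2), ratio)
```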
Remark 3.6. The assumption that be Hermitian is in fact not necessary. Of course, if we just require to be bounded, the perturbed operator is not in general Hermitian anymore. In particular, a suitably modified version of Corollary 3.3 will now claim the existence of a pair of eigenvectors, for and for , with expansions as .
4. Applications
In this section, we will sketch three applications: two in statistics and one in numerical analysis.
4.1. Noisy Integral Equations
Let be a compact injective integral operator, with measurable real kernel denoted by the same symbol without confusion. More specifically, input and output are related according to In practice, only finitely many data regarding the output are available, usually blurred by random measurement error. If the data are collected according to a random design, we may think of the data set as of independent copies of a pair of random variables, where the design variable has a Uniform distribution, the error variable has finite variance and zero mean, and where and are stochastically independent.
The purpose is to recover from these data. It is expedient to “precondition” with the adjoint operator and recover from the equation where is compact, Hermitian, and strictly positive. Under suitable conditions, is an unbiased and -consistent estimator of ; see, for instance, van Rooij and Ruymgaart [12]. Since is unbounded, an estimator of the input is obtained by applying a regularized inverse of to . Here we will use the Tikhonov type inverse where ; see also (2.29). This yields the input estimator To assess the quality of the estimator, one considers the mean integrated squared error (MISE) The behavior of the MISE is well studied in the literature.
Recently, there has been interest in certain econometric models where the operator (or ) is unknown but can be estimated from the data. Let denote an estimator of and assume that is also compact, Hermitian, and nonnegative. In this case, the input estimator will be employed. One expects that estimation of will increase the MISE, and naturally the question arises how much bigger the MISE of will be than that of .
An upper bound for this increase of the MISE can be easily found from the results in Section 2. For large sample size , will be close to , and can be considered as a small random perturbation of . Writing for the Fréchet derivative at , we see from Theorem 2.1 that Apparently, is an extra error term due to the estimation of .
To find an upper bound for its MISE, let us first observe that (2.30) simplifies for and yields where now the and the are the spectral characteristics of . Let us write, for brevity, and note that We thus arrive at
Hence, under suitable assumptions, estimation of the kernel yields an extra term in the MISE of the input estimator which is of order . In the Russian literature, sharper bounds can be found; see in particular Bakushinsky and Kokurin [13, Section 2.2]. For results of this type in the statistical literature, obtained in a different manner, see, for instance, Hall and Horowitz [14] and Florens [15].
4.2. Some Asymptotics for Functional Canonical Correlations
Let be a real random element in the Hilbert space and assume that . Its mean and covariance operator are well defined by the relations , for all . The operator is known to be of finite trace and hence Hilbert-Schmidt and compact. It is also nonnegative Hermitian. Without real loss of generality, we will assume to be injective, so that it will be strictly positive.
Next suppose that we are given a random sample of independent copies of . The usual estimators of and are and , respectively, where shares all the properties of , except that it cannot be injective because it has a finite dimensional kernel whose range has dimension at most .
Because cannot be injective, the finite dimensional definition of sample canonical correlation has to be modified, and some kind of smoothing or regularization is recommended in literature . Regularization might even be useful when the population is considered, although is injective . This regularization yields Tikhonov type inverses in an expression for the canonical correlation .
For a precise definition, let and be two closed subspaces of and the orthogonal projection onto (). Let us write , and note that for . Similar notation will be used for . The regularized squared principal canonical correlation for the population is now defined as Its sample analogue is obtained by replacing the with in (4.14). The supremum is actually a maximum, and pairs of maximizers will be denoted by , , and , , respectively. The corresponding canonical variates then are
For an alternative description of these canonical correlations, let us introduce the operator Interchanging the indices and yields , and replacing with yields and . It can be seen that all these operators are Hilbert-Schmidt and strictly positive Hermitian. It will be assumed that has the largest eigenvalue with one-dimensional eigenspace generated by with . Under this condition, it has been shown in Cupidon et al. [17] that for . A similar result holds true for .
It is well known that the asymptotic distribution of the eigenvalues and eigenfunctions of a random operator can be derived from the asymptotic distribution of this random operator itself (see  for Euclidean spaces and  for Hilbert spaces). This technique is based on the results of Section 3. In the present situation, this means that we have to show the convergence in distribution of the suitably standardized . Because all operators are Hilbert-Schmidt, it can be shown that
Result (4.18) follows easily if convergence in distribution can be established for each of the factors defining , for instance, where this time , compare (2.29). It is known that for some Gaussian random element , where is the Hilbert space of all Hilbert-Schmidt operators mapping into itself. Writing for the Fréchet derivative evaluated at (Section 2) and exploiting the fact that the imbedding of is continuous, we obtain via a kind of delta-method [18, 19] the desired result. A combination of results like this for each of the factors of yields (4.18).
4.3. Solution of a Nonlinear Operator Equation
In Bakushinsky and Kokurin [13], the following problem is considered. Let and be Hilbert spaces and an operator, not necessarily linear. The (nonlinear) equation is studied. Let be a solution of (4.22) and introduce a set , for some . It is assumed that is Fréchet differentiable on . If is the derivative at it is, moreover, assumed that where is a given number. Given an initial point and a sequence , , of regularization parameters, these authors show that, under some further conditions, the generalized Gauss-Newton method generates a sequence of points such that
In their proof of this result, the authors need a crucial upper bound. Under some additional assumptions, we want to derive this upper bound as an immediate consequence of Theorem 2.1. In order to relate the present problem to the setup of our paper, let us assume that , and note that For , let and set where obviously . It is not hard to see that (4.23) entails for some . Let be the contour in (2.1) and the corresponding domain. As in Bakushinsky and Kokurin [13], a function , , is employed in the iteration scheme, which is analytic on .
Narrowing down the generality in Bakushinsky and Kokurin [13] somewhat further, so that the current conditions are satisfied, their proof of the convergence of the iterations requires an upper bound for the expression (in our notation) Keeping fixed, let us briefly write this last expression as . Now Theorem 2.1 applies with , and application yields at once for some , by (4.28).
Acknowledgments
The authors are grateful to the referee for some useful comments. For this research, D. S. Gilliam was supported by AFOSR Grant no. FA9550-04-1027 and F. H. Ruymgaart by NSF Grant no. DMS-0605167.
References
[1] T. Kato, Perturbation Theory for Linear Operators, Springer, Berlin, Germany, 1966.
[2] F. Rellich, Perturbation Theory of Eigenvalue Problems, Gordon and Breach, New York, NY, USA, 1969.
[3] F. Chatelin, Spectral Approximation of Linear Operators, Computer Science and Applied Mathematics, Academic Press, New York, NY, USA, 1983.
[4] J. Dauxois, A. Pousse, and Y. Romain, “Asymptotic theory for the principal component analysis of a vector random function: some applications to statistical inference,” Journal of Multivariate Analysis, vol. 12, no. 1, pp. 136–154, 1982.
[5] R. Bhatia, Positive Definite Matrices, Princeton Series in Applied Mathematics, Princeton University Press, Princeton, NJ, USA, 2007.
[6] F. Ruymgaart and S. Yang, “Some applications of Watson's perturbation approach to random matrices,” Journal of Multivariate Analysis, vol. 60, no. 1, pp. 48–60, 1997.
[7] P. D. Lax, Functional Analysis, Pure and Applied Mathematics, John Wiley & Sons, New York, NY, USA, 2002.
[8] N. Dunford and J. T. Schwartz, Linear Operators. Part I: General Theory, Wiley Classics Library, Wiley-Interscience, New York, NY, USA, 1988.
[9] L. Debnath and P. Mikusiński, Introduction to Hilbert Spaces with Applications, Academic Press, San Diego, Calif, USA, 2nd edition, 1999.
[10] G. S. Watson, Statistics on Spheres, vol. 6 of University of Arkansas Lecture Notes in the Mathematical Sciences, John Wiley & Sons, New York, NY, USA, 1983.
[11] F. Riesz and B. Sz.-Nagy, Functional Analysis, Dover Books on Advanced Mathematics, Dover, New York, NY, USA, 1990.
[12] A. C. M. van Rooij and F. Ruymgaart, “Asymptotic minimax rates for abstract linear estimators,” Journal of Statistical Planning and Inference, vol. 53, no. 3, pp. 389–402, 1996.
[13] A. B. Bakushinsky and M. Yu. Kokurin, Iterative Methods for Approximate Solution of Inverse Problems, vol. 577 of Mathematics and Its Applications, Springer, Dordrecht, The Netherlands, 2004.
[14] P. Hall and J. L. Horowitz, “Nonparametric methods for inference in the presence of instrumental variables,” The Annals of Statistics, vol. 33, no. 6, pp. 2904–2929, 2005.
[15] J.-P. Florens, “Inverse problems and structural econometrics: the example of instrumental variables,” in Advances in Economics and Econometrics: Theory and Applications, M. Dewatripont, L. P. Hansen, and S. J. Turnovsky, Eds., vol. 2, pp. 284–311, Cambridge University Press, Cambridge, UK, 2003.
[16] S. E. Leurgans, R. A. Moyeed, and B. W. Silverman, “Canonical correlation analysis when the data are curves,” Journal of the Royal Statistical Society. Series B, vol. 55, no. 3, pp. 725–740, 1993.
[17] J. Cupidon, R. Eubank, D. S. Gilliam, and F. Ruymgaart, “Some properties of canonical correlations and variates in infinite dimensions,” Journal of Multivariate Analysis, vol. 99, no. 6, pp. 1083–1104, 2008.
[18] J. Cupidon, D. S. Gilliam, R. Eubank, and F. Ruymgaart, “The delta method for analytic functions of random operators with application to functional data,” Bernoulli, vol. 13, no. 4, pp. 1179–1194, 2007.
[19] A. W. van der Vaart, Asymptotic Statistics, Cambridge University Press, Cambridge, UK, 1998.
Copyright © 2009 D. S. Gilliam et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.