The main result in this paper is the determination of the Fréchet derivative of an
analytic function of a bounded operator, tangentially to the space of all bounded operators. Some applied problems from statistics and numerical analysis are included as a motivation for this study. The perturbation operator (increment) is not of any special form and is not supposed to commute with the operator at which the derivative is evaluated. This generality is important for the applications. In the Hermitian case, moreover, some results on perturbation of an isolated eigenvalue, its eigenprojection, and its eigenvector if the eigenvalue is simple, are also included. Although these results are known in principle, they are not in general formulated in terms of arbitrary perturbations as required for the applications. Moreover, these results are presented as corollaries to the main theorem, so that this paper also provides a short, essentially self-contained review of these aspects of perturbation theory.
1. Introduction
Motivated by certain applications in numerical analysis and, in particular, statistics, this paper deals with the Fréchet derivative of an analytic function $\varphi$ of a bounded linear operator $T$ on a separable Hilbert space $\mathbb{H}$ (in the sense of the usual functional calculus), tangentially to the Banach space $\mathcal{L}$ of all bounded linear operators mapping $\mathbb{H}$ into itself. More precisely, a first order approximation to the difference
$$\varphi(T + \Pi) - \varphi(T), \qquad \Pi \in \mathcal{L},$$
is obtained, including the order of magnitude of the remainder. An example of such a function is a generalized or regularized inverse of the square root, where $I$ is the identity operator. Once the Fréchet derivative has been established (Section 2), it yields the asymptotic distribution of functions of certain random operators via an ensuing delta method, a well-known statistical technique (see Section 4).
Clearly $T + \Pi$ can be regarded
as a perturbed version of $T$, and it is not surprising that perturbation methods
are employed to obtain the desired result. The authors are aware of the
possibility that the rather straightforward result on the Fréchet derivative
might be hidden somewhere in the rich literature on perturbation theory [1–3]. Yet they have not been successful in
identifying a reference that states the result in its present form, tailored to
the applications they have in mind. Some remarks are particularly in order.
(a) The perturbations $\Pi$ are typically of small norm but otherwise arbitrary bounded or Hermitian operators. In the literature, they are often of a special form, built from fixed operators and a small number. In statistics, there is no point in representing the perturbation in such a form.
(b) The perturbation $\Pi$ and the operator $T$ are not assumed to commute, because in our applications such an assumption would not in general be fulfilled. If the operators do commute, however, the Fréchet derivative would reduce to $\varphi'(T)\Pi$, in the sense of functional calculus with the derivative $\varphi'$ of $\varphi$. In the case considered here, the actual Fréchet derivative and $\varphi'(T)\Pi$ may differ considerably.
(c) A central theme in perturbation theory concerns the perturbation of an isolated eigenvalue and the corresponding eigenprojection (see, e.g., the references mentioned before). Some of these results are included, because they can be easily derived from the main result on the Fréchet derivative by choosing a special function $\varphi$ (Section 3). In this way, the paper presents a concise and essentially self-contained review of some basic results in this area. They are again presented in terms of a general (Hermitian) perturbation $\Pi$, as required for statistical applications, in the same vein as, but somewhat more general than, Dauxois et al. [4].
As has already been mentioned in the beginning, $\mathbb{H}$ will be a separable Hilbert space and $\mathcal{L}$ the Banach space of all bounded linear operators mapping $\mathbb{H}$ into itself. The inner product on $\mathbb{H}$ will be denoted by $\langle \cdot, \cdot \rangle$ and the norm by $\|\cdot\|$. The norm on $\mathcal{L}$ will be written $\|\cdot\|_{\mathcal{L}}$, and the notation $\mathcal{L}_H$ and $\mathcal{C}_H$ will be used to denote the subspaces of all Hermitian and all compact Hermitian operators, respectively.
We will exclusively deal with infinite dimensional
Hilbert spaces and will not attempt to include the simpler finite dimensional
case in our formulation. The Fréchet derivative for arbitrary perturbations is
well known in the finite dimensional matrix case. This result and further
references can be found in the recent monograph by Bhatia [5]. In the finite
dimensional case, this derivative is also implicitly present in Theorem 2.1 of
Ruymgaart and Yang [6] to obtain the asymptotic distribution of a function
of a random matrix.
2. The Fréchet Derivative
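Let us fix an arbitrary $T \in \mathcal{L}$ with spectrum $\sigma(T)$ and a bounded open region $\Omega$ in the complex plane with smooth boundary $\Gamma = \partial\Omega$, such that
$$\sigma(T) \subset \Omega. \tag{2.1}$$
Furthermore, let us consider functions of type
$$\varphi : D \to \mathbb{C} \ \text{analytic}, \tag{2.2}$$
where $D \supset \overline{\Omega}$ is an open neighborhood of $\overline{\Omega}$. Let us write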
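The resolvent
$$R(z) = (zI - T)^{-1}, \qquad z \in \rho(T), \tag{2.4}$$
is analytic on the resolvent set $\rho(T) = \mathbb{C} \setminus \sigma(T)$, and the operator
$$\varphi(T) = \frac{1}{2\pi i} \oint_{\Gamma} \varphi(z) R(z)\, dz \tag{2.5}$$
is well defined. This relation establishes an algebra homomorphism [7, Section 17.2] which implies in particular that
$$(\varphi\psi)(T) = \varphi(T)\,\psi(T) \tag{2.6}$$
if $\psi$ is also analytic. In particular, we have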
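The operators
$$\varphi'_T \Pi = \frac{1}{2\pi i} \oint_{\Gamma} \varphi(z) R(z) \Pi R(z)\, dz, \tag{2.8}$$
$$S_{\varphi,\Pi} = \frac{1}{2\pi i} \oint_{\Gamma} \varphi(z) R(z) (\Pi R(z))^{2} (I - \Pi R(z))^{-1}\, dz \tag{2.9}$$
are well defined for every $\Pi \in \mathcal{L}$ with $\|\Pi\|$ sufficiently small. Note that according to Dunford and Schwartz [8, Lemma VII.6.11], there is a constant $0 < K < \infty$, such that
$$\|R(z)\| \le K \quad \text{for all } z \in \Gamma. \tag{2.10}$$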
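Theorem 2.1 (Fréchet Derivative). Let $T \in \mathcal{L}$ and suppose that $\varphi$ satisfies (2.2). Then $\varphi$ maps a sufficiently small neighborhood of $T$ into $\mathcal{L}$, when defined in the usual way of functional calculus. This mapping is Fréchet differentiable at $T$, tangentially to $\mathcal{L}$, with bounded derivative $\varphi'_T$ as defined by (2.8). More specifically, we have
$$\varphi(T + \Pi) = \varphi(T) + \varphi'_T \Pi + S_{\varphi,\Pi}, \tag{2.11}$$
where $S_{\varphi,\Pi}$ is defined in (2.9), and where $\varphi'_T \Pi$ and $S_{\varphi,\Pi}$ obey the upper bounds (2.12) and (2.13), of order $\|\Pi\|$ and $\|\Pi\|^{2}$, respectively.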
Proof. For $\varphi(T + \Pi)$ to be well defined on the neighborhood, let us first show that (2.14) holds. To verify this, note that by (2.10) we have $\|\Pi R(z)\| < 1$ for such $\Pi$ and all $z \in \Gamma$. Consequently, the operator in (2.15) is bounded for each $z \in \Gamma$, which entails (2.14). Hence, $\varphi(T + \Pi)$ is well defined for all such $\Pi$.
Applying a Neumann series expansion [9, Section 5.2] to the inverse on the left in (2.15), we obtain an expansion of the perturbed resolvent in powers of $\Pi R(z)$, just as in Watson [10] for matrices. Term-wise integration yields (2.11).
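The Neumann series step can be spelled out as follows. This is a sketch in assumed notation: $R(z) = (zI - T)^{-1}$ for the resolvent, $\Pi$ for the perturbation, and $\|\Pi\|\,\|R(z)\| < 1$ for all $z$ on the contour $\Gamma$.

```latex
\begin{align*}
(zI - T - \Pi)^{-1}
  &= \bigl[(I - \Pi R(z))\,(zI - T)\bigr]^{-1}
   = R(z)\,(I - \Pi R(z))^{-1} \\
  &= R(z) \sum_{k=0}^{\infty} (\Pi R(z))^{k} \\
  &= R(z) + R(z)\,\Pi\,R(z)
     + R(z)\,(\Pi R(z))^{2}\,(I - \Pi R(z))^{-1}.
\end{align*}
```

Multiplying by $\varphi(z)/(2\pi i)$ and integrating over $\Gamma$ term by term then gives $\varphi(T)$, a term that is linear in $\Pi$, and a remainder whose norm is of order $\|\Pi\|^{2}$.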
The upper bounds in (2.12) and
(2.13) are immediate
from (2.8) and (2.9), respectively, by exploiting
(2.3) and (2.15). The boundedness of $\varphi'_T$ as a linear
operator mapping $\mathcal{L}$ into itself
follows at once from (2.12).
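The first order expansion can be checked numerically in a finite dimensional stand-in, although the paper itself works in infinite dimensions. The sketch below assumes a circular contour, the trapezoid rule, the resolvent convention $(zI - T)^{-1}$, and the test function $\exp$; the helper names `circle`, `phi_of`, `frechet`, and `remainder_norm` are hypothetical and not taken from the paper.

```python
import numpy as np

def circle(center, radius, n=400):
    """Trapezoid nodes z_k and increments dz_k on a circular contour."""
    theta = 2.0 * np.pi * np.arange(n) / n
    z = center + radius * np.exp(1j * theta)
    dz = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / n)
    return z, dz

def phi_of(T, phi, z, dz):
    """phi(T) as (1 / 2 pi i) times the contour integral of phi(z) (zI - T)^{-1}."""
    eye = np.eye(T.shape[0])
    total = np.zeros(T.shape, dtype=complex)
    for zk, dzk in zip(z, dz):
        total += phi(zk) * np.linalg.inv(zk * eye - T) * dzk
    return total / (2j * np.pi)

def frechet(T, Pi, phi, z, dz):
    """Candidate derivative: contour integral of phi(z) R(z) Pi R(z)."""
    eye = np.eye(T.shape[0])
    total = np.zeros(T.shape, dtype=complex)
    for zk, dzk in zip(z, dz):
        R = np.linalg.inv(zk * eye - T)
        total += phi(zk) * (R @ Pi @ R) * dzk
    return total / (2j * np.pi)

def remainder_norm(T, Pi, phi, z, dz):
    """Spectral norm of phi(T + Pi) - phi(T) - (derivative applied to Pi)."""
    S = phi_of(T + Pi, phi, z, dz) - phi_of(T, phi, z, dz) - frechet(T, Pi, phi, z, dz)
    return np.linalg.norm(S, 2)

# Assumed test data: random non-commuting matrices of spectral norm 1.
rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))
T /= np.linalg.norm(T, 2)
D = rng.standard_normal((4, 4))
D /= np.linalg.norm(D, 2)
z, dz = circle(0.0, 3.0)  # the circle of radius 3 encloses both spectra

for eps in (1e-2, 5e-3):
    print(eps, remainder_norm(T, eps * D, np.exp, z, dz))
```

Halving the size of the perturbation should reduce the remainder by roughly a factor of four, which is what a remainder of order $\|\Pi\|^{2}$ predicts.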
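Remark 2.2. It will be seen in Section 4 that for the applications we have in mind it is important that we do not require that $T$ and $\Pi$ commute. If they do, however, it is clear that the Fréchet derivative in (2.8) reduces to
$$\varphi'_T \Pi = \left( \frac{1}{2\pi i} \oint_{\Gamma} \varphi(z) R^{2}(z)\, dz \right) \Pi. \tag{2.18}$$
It has been shown in Dunford and Schwartz [8, proof of Theorem VII.6.10] that
$$\frac{1}{2\pi i} \oint_{\Gamma} \varphi(z) R^{2}(z)\, dz = \varphi'(T). \tag{2.19}$$
Combination of (2.11) with (2.18) and (2.19) yields
$$\varphi(T + \Pi) = \varphi(T) + \varphi'(T)\Pi + O(\|\Pi\|^{2}), \tag{2.20}$$
writing, for any exponent $k$, $O(\|\Pi\|^{k})$ to indicate any quantity (operator, vector, number) whose norm or absolute value is of the given order. Note that in (2.20) the operator $\varphi'(T)$ is to be understood in the sense of the usual functional calculus as in (2.5) with $\varphi$ replaced by its derivative $\varphi'$.
In this situation of commuting operators, Dunford and Schwartz [8] obtain the Taylor expansion
$$\varphi(T + \Pi) = \sum_{k=0}^{\infty} \frac{\varphi^{(k)}(T)}{k!}\, \Pi^{k},$$
which implies, of course, (2.20).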
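A small matrix example makes the role of commutativity concrete. The sketch assumes the illustrative choice $\varphi(z) = z^{2}$, for which the derivative in a direction $\Pi$ is $T\Pi + \Pi T$ and the commuting-case formula is $\varphi'(T)\Pi = 2T\Pi$; the function names are hypothetical.

```python
import numpy as np

def frechet_square(T, Pi):
    """Derivative of T -> T @ T in the direction Pi: T Pi + Pi T."""
    return T @ Pi + Pi @ T

def commuting_formula(T, Pi):
    """phi'(T) Pi with phi'(z) = 2z; valid only when T and Pi commute."""
    return 2.0 * T @ Pi

T = np.array([[2.0, 1.0], [1.0, 3.0]])
Pi_commuting = 0.1 * (T @ T)                      # a polynomial in T commutes with T
Pi_generic = np.array([[0.0, 0.1], [0.0, 0.0]])   # does not commute with T

for name, Pi in (("commuting", Pi_commuting), ("generic", Pi_generic)):
    gap = np.linalg.norm(frechet_square(T, Pi) - commuting_formula(T, Pi))
    print(name, gap)
```

For the commuting perturbation the two expressions agree, while for the generic one they differ by $\Pi T - T\Pi$, a difference of the same order as $\Pi$ itself.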
Keeping the perturbation $\Pi \in \mathcal{L}$ as before, we now restrict $T$ to the class of compact Hermitian operators. The bounded and countable spectrum $\sigma(T)$ consists of the number $0$, whether an eigenvalue or not, and all the nonzero eigenvalues $\lambda_1, \lambda_2, \ldots$. In this work, we avoid technical issues related to $0$ being an eigenvalue, and assume that $T$ is one-to-one, that is, $Tx = 0$ implies that $x = 0$. It is well known [7] that such a $T$ can be represented as
$$T = \sum_{k} \lambda_k P_k, \tag{2.23}$$
where the $P_k$ are the corresponding orthogonal eigenprojections onto the mutually orthogonal finite dimensional eigenspaces. These projections provide a resolution of the identity in $\mathbb{H}$, that is,
$$\sum_{k} P_k = I. \tag{2.24}$$
The resolvent has the expansion
$$R(z) = \sum_{k} \frac{1}{z - \lambda_k}\, P_k, \qquad z \in \rho(T). \tag{2.25}$$
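Corollary 2.3. Let the conditions of Theorem 2.1 be fulfilled for $T$ with expansion (2.23). In this case the Fréchet derivative is given by
$$\varphi'_T \Pi = \sum_{j} \varphi'(\lambda_j)\, P_j \Pi P_j + \sum_{j \neq k} \frac{\varphi(\lambda_k) - \varphi(\lambda_j)}{\lambda_k - \lambda_j}\, P_j \Pi P_k. \tag{2.26}$$
Proof. Let us substitute the expansion (2.25) for $R(z)$ into the expression for $\varphi'_T \Pi$ in (2.8). Application of the partial fraction method yields
$$\varphi'_T \Pi = \sum_{j} \left( \frac{1}{2\pi i} \oint_{\Gamma} \frac{\varphi(z)}{(z - \lambda_j)^{2}}\, dz \right) P_j \Pi P_j + \sum_{j \neq k} \frac{1}{\lambda_k - \lambda_j} \left( \frac{1}{2\pi i} \oint_{\Gamma} \varphi(z) \left[ \frac{1}{z - \lambda_k} - \frac{1}{z - \lambda_j} \right] dz \right) P_j \Pi P_k. \tag{2.27}$$
The right-hand side of (2.27) reduces at once to the expression on the right in (2.26) by an application of Cauchy's integral formula.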
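For a Hermitian matrix, a derivative formula of this spectral type (first derivatives on the diagonal blocks, divided differences off the diagonal) can be evaluated in the eigenbasis and compared with a finite difference. The sketch below is a finite dimensional illustration with assumed names (`apply_function`, `frechet_spectral`) and the test function $\exp$; the perturbation is taken symmetric only so that `numpy.linalg.eigh` can be reused for the finite difference.

```python
import numpy as np

def apply_function(T, phi):
    """phi(T) for Hermitian T via its eigendecomposition."""
    lam, U = np.linalg.eigh(T)
    return (U * phi(lam)) @ U.conj().T

def frechet_spectral(T, Pi, phi, dphi, tol=1e-12):
    """Sum over j, k of F[j, k] P_j Pi P_k, where F holds phi' on the
    diagonal and divided differences of phi off the diagonal."""
    lam, U = np.linalg.eigh(T)
    diff = lam[:, None] - lam[None, :]
    close = np.abs(diff) <= tol
    safe = np.where(close, 1.0, diff)          # avoid division by zero
    num = phi(lam)[:, None] - phi(lam)[None, :]
    F = np.where(close, dphi(lam)[:, None], num / safe)
    B = U.conj().T @ Pi @ U                    # Pi expressed in the eigenbasis of T
    return U @ (F * B) @ U.conj().T

# Assumed test data: random symmetric T and symmetric direction Pi.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
T = (A + A.T) / 2.0
B0 = rng.standard_normal((5, 5))
Pi = (B0 + B0.T) / 2.0

h = 1e-5
finite_diff = (apply_function(T + h * Pi, np.exp)
               - apply_function(T - h * Pi, np.exp)) / (2.0 * h)
print(np.linalg.norm(finite_diff - frechet_spectral(T, Pi, np.exp, np.exp)))
```

Working in the eigenbasis turns the double sum into an entrywise (Hadamard) product, which is why no explicit loop over pairs of eigenprojections is needed.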
Example 2.4. The function , , is analytic on the entire complex plane so that
Corollary 2.3 applies. The Fréchet derivative in (2.26) now reduces
to. Of course this result is immediate because in this
simple case .
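Example 2.5. Next let us, for $\alpha > 0$, consider the function
$$\varphi(z) = \frac{1}{\alpha + z} \tag{2.29}$$
for $z \neq -\alpha$. Note that the pole at $-\alpha$ remains outside the contour $\Gamma$ by this choice. Clearly there exists an open region $\Omega$ of the type required, such that $\varphi$ is analytic on some open neighborhood of $\overline{\Omega}$. Hence Corollary 2.3 applies again. The operator $\varphi(T)$ represents a regularized or generalized inverse of Tikhonov type, according to whether $T$ is injective or not. The Fréchet derivative in (2.26) now equals
$$\varphi'_T \Pi = -\sum_{j} \sum_{k} \frac{1}{(\alpha + \lambda_j)(\alpha + \lambda_k)}\, P_j \Pi P_k \tag{2.30}$$
for $\Pi \in \mathcal{L}$.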
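A Tikhonov type inverse is easy to examine in finite dimensions. The sketch below assumes the regularized inverse has the form $(\alpha I + T)^{-1}$, so that the derivative in a direction $\Pi$ is $-(\alpha I + T)^{-1} \Pi (\alpha I + T)^{-1}$; the function names and test data are hypothetical.

```python
import numpy as np

def tikhonov(T, alpha):
    """Regularized inverse (alpha I + T)^{-1}, assumed form."""
    return np.linalg.inv(alpha * np.eye(T.shape[0]) + T)

def tikhonov_frechet(T, Pi, alpha):
    """First order change of the regularized inverse in the direction Pi."""
    A = tikhonov(T, alpha)
    return -A @ Pi @ A

# Assumed test data: positive semidefinite T, arbitrary small Pi.
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
T = B @ B.T
Pi = 1e-2 * rng.standard_normal((4, 4))   # not symmetric, does not commute with T
alpha = 0.5

A = tikhonov(T, alpha)
A_pert = tikhonov(T + Pi, alpha)
remainder = A_pert - A - tikhonov_frechet(T, Pi, alpha)
print(np.linalg.norm(remainder, 2), np.linalg.norm(Pi, 2) ** 2)
```

In this case the remainder has the closed form $A \Pi A \Pi \tilde{A}$, with $A$ and $\tilde{A}$ the unperturbed and perturbed inverses, which makes its second order size in $\Pi$ explicit.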
Remark 2.6. For $T$ and $\Pi$ commuting, the double sum on the right in (2.26) cancels and we obtain
$$\varphi'_T \Pi = \sum_{j} \varphi'(\lambda_j)\, P_j \Pi = \varphi'(T)\Pi,$$
in accordance with (2.20). Apparently, the double sum is a correction term needed when $T$ and $\Pi$ do not commute.
3. Perturbation of Eigenvalues and
Eigenvectors
Throughout this section, both and are assumed to
be Hermitian, so that also . In addition to this, we assume thatwith one-dimensional eigenspace.
Consequently, the eigenprojection can be writtenwhere for the operator is defined by .
The region will now be
chosen in such a way that it has a connected component with the propertiesA special analytic function such
thatwill play an important role in
the sequel. Note, for instance, that
For the Fréchet derivative of at , a special
expression can be obtained. Let us writewhere is Hermitian
with spectrum . According to the spectral theorem, there exists a
resolution of the identity , , such thatIt should be noted
thatwhere is the zero
operator, and thatLet us define
Lemma 3.1. The Fréchet derivative of at is given
by
Proof. This follows by substitution of (3.9) in the
expression on the right infor this derivative; see also
(2.8). We thus obtain
By Cauchy's integral formulaso that the first term on the
right side in (3.13) is the zero operator. Regarding the second, note
thatbecause each lies outside
the contour . Consequently, the second term equalsSimilarly, the third term equals . The last term cancels, becausesince both and lie outside .
Some results about the perturbation of and in a given direction as in (1.3) that are well known in the literature [1, 2] can be partly recovered for perturbations in some neighborhood, in an essentially self-contained manner, as simple consequences of the results in Section 2.
Corollary 3.2. Under the assumptions (3.1), (3.2), and for sufficiently
small, the operator has an isolated
eigenvalue with
eigenprojection for some unit
vector , satisfying where is defined in
(3.10).
Proof. In view of (3.5) and (3.11), application of (2.11)
with yields . Clearly is Hermitian,
and because by (2.6), it is
also idempotent so that it is in fact some projection , for example, it follows that for all sufficiently
small, and hence the range of must also have
dimension 1 [11]
so that for some with .
Next, let , , be the identity function. By (2.6), again, on the
one hand we have , and on the other . Hence is an
eigenvector of with eigenvalue .
Corollary 3.3. Under the assumptions of Corollary 3.2, we
have
Proof. Let us first observe that because of
(3.8). Hence (3.18) yields , where
It suffices to show that for . The idea of the proof can be found in Dauxois et al. [4].
Regarding , note that , once more using (3.18). Hence , as , and therefore for sufficiently
small. This entails
For we
haveas can be seen from (3.21).
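A first order eigenvector expansion of this kind can be illustrated for a symmetric matrix with a simple eigenvalue. The sketch assumes the standard form $\tilde{p} \approx p + Q \Pi p$ with reduced resolvent $Q = \sum_{k \neq 1} (\lambda_1 - \lambda_k)^{-1} P_k$, and fixes the sign of the perturbed eigenvector by requiring a positive inner product with $p$; names and test data are hypothetical.

```python
import numpy as np

def first_order_eigenvector(T, Pi, idx=-1):
    """Return the unperturbed eigenvector p and the approximation p + Q Pi p."""
    lam, U = np.linalg.eigh(T)
    p = U[:, idx]
    others = np.delete(np.arange(lam.size), idx)
    # Reduced resolvent: sum over the other eigenpairs.
    Q = sum(np.outer(U[:, k], U[:, k]) / (lam[idx] - lam[k]) for k in others)
    return p, p + Q @ Pi @ p

# Assumed test data: eigenvalues 0.2, 0.5, 1, 3 in a random orthonormal basis.
rng = np.random.default_rng(3)
U0, _ = np.linalg.qr(rng.standard_normal((4, 4)))
T = U0 @ np.diag([0.2, 0.5, 1.0, 3.0]) @ U0.T
T = (T + T.T) / 2.0
S = rng.standard_normal((4, 4))
S = (S + S.T) / 2.0
eps = 1e-3
Pi = eps * S / np.linalg.norm(S, 2)

p, approx = first_order_eigenvector(T, Pi)
exact = np.linalg.eigh(T + Pi)[1][:, -1]
if exact @ p < 0:            # resolve the sign ambiguity of the eigenvector
    exact = -exact
print(np.linalg.norm(exact - approx))
```

The printed error should be of order $\|\Pi\|^{2}$, far smaller than the first order correction $Q \Pi p$ itself.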
Corollary 3.4. Under the assumptions of Corollary 3.2, we
have
Proof. With the help of (3.19), we see that . The result follows from a routine calculation
combined with the equalities , , and . For the last two equalities we assume that and are Hermitian
and by (3.8).
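The corresponding first order eigenvalue approximation can be checked in the same finite dimensional setting. The sketch assumes the standard form $\tilde{\lambda} \approx \lambda + \langle \Pi p, p \rangle$ for a simple eigenvalue $\lambda$ with unit eigenvector $p$; names and test data are hypothetical.

```python
import numpy as np

def first_order_eigenvalue(T, Pi, idx=-1):
    """lambda + <Pi p, p> for the eigenpair of Hermitian T selected by idx."""
    lam, U = np.linalg.eigh(T)
    p = U[:, idx]
    return lam[idx] + np.vdot(p, Pi @ p).real

# Assumed test data: the top eigenvalue 3 is separated from the rest by a gap of 2.
rng = np.random.default_rng(4)
U0, _ = np.linalg.qr(rng.standard_normal((4, 4)))
T = U0 @ np.diag([0.2, 0.5, 1.0, 3.0]) @ U0.T
T = (T + T.T) / 2.0
S = rng.standard_normal((4, 4))
S = (S + S.T) / 2.0
eps = 1e-3
Pi = eps * S / np.linalg.norm(S, 2)

exact = np.linalg.eigh(T + Pi)[0][-1]
approx = first_order_eigenvalue(T, Pi)
print(abs(exact - approx))
```

With a spectral gap of 2, the second order term is bounded by $\|\Pi\|^{2}/2$, so the error stays below $\|\Pi\|^{2}$ for this small perturbation.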
Corollary 3.5. Let be given by
(2.23) and satisfy (3.2). Then (3.18) and (3.19) remain true
with
Proof. All nonzero eigenvalues of are isolated,
in particular . It is immediate from (2.23) that , and this leads
to the special expression for in (3.24).
Remark 3.6. The assumption that be Hermitian is
in fact not necessary. Of course, if we just require to be bounded,
the perturbed operator is not in
general Hermitian anymore. In particular, a suitably modified version of
Corollary 3.3 will now claim the existence of a pair of eigenvectors, for and for , with expansionsas .
4. Applications
In this section, we will sketch three applications: two
in statistics and one in numerical analysis.
4.1. Noisy Integral Equations
Let be a compact
injective integral operator, with measurable real kernel denoted by the same
symbol without confusion. More specifically, input and output are related
according toIn practice, only finitely many
data regarding the output are available, usually blurred by random measurement
error. If the data are collected according to a random design, we may think of
the data set as of independent
copies of a pair of random
variables, wherethe design variable has a Uniform distribution,
the error variable has finite
variance and zero mean, and where and are
stochastically independent.
It is the purpose to recover from these
data. It is expedient to “precondition” with the adjoint operator and recover from the
equationwhere is compact,
Hermitian, and strictly positive. Under suitable conditions, is an unbiased and -consistent
estimator of ; see, for instance, van Rooij and Ruymgaart
[12]. Since is unbounded,
an estimator of the input is obtained by
applying a regularized inverse of to . Here we will use the Tikhonov type
inversewhere ; see also (2.29). This yields the input
estimatorTo assess the quality of the
estimator, one considers the mean integrated squared error
(MISE). The behavior of the MISE is
well studied in the literature.
Recently, there is an interest in certain econometric
models where the operator (or ) is unknown
but can be estimated from the data. Let denote an
estimator of and assume that is also
compact, Hermitian, and nonnegative. In this case, the input
estimatorwill be employed. One expects
that estimation of will increase
the MISE, and naturally the question arises how much bigger the MISE of will be than
that of .
An upper bound for this increase of the MISE can be
easily found from the results in Section 2. For large sample size , will be close
to , and can be
considered as a small random perturbation of . Writing for the Fréchet
derivative at , we see from Theorem 2.1 thatApparently, is an extra
error term due to the estimation of .
To find an upper bound for its MISE, let us first
observe that (2.30) simplifies for and
yieldswhere now the and the are the
spectral characteristics of . Let us write, for brevity,and note thatWe thus arrive
at
Hence, under suitable assumptions, estimation of the
kernel yields an extra term in the MISE of the input estimator which is of
order . In the Russian literature, sharper bounds can be
found; see in particular Bakushinsky and Kokurin [13, Section 2.2]. For
results of this type in the statistical literature, obtained in a different
manner, see, for instance, Hall and Horowitz [14] and Florens [15].
4.2. Some Asymptotics for Functional Canonical Correlations
Let be a real
random element in the Hilbert space and assume that . Its mean and covariance
operator are
well defined by the relations , for all . The operator is known to be
of finite trace and hence Hilbert-Schmidt and compact. It is also nonnegative
Hermitian. Without real loss of generality, we will assume to be
injective, so that it will be strictly positive.
Next suppose that we are given a random sample of independent
copies of . The usual estimators of and are and