Abstract and Applied Analysis

Volume 2008 (2008), Article ID 192679, 18 pages

http://dx.doi.org/10.1155/2008/192679

## Minimization of Tikhonov Functionals in Banach Spaces

^{1}Center for Industrial Mathematics, University of Bremen, Bremen 28334, Germany

^{2}Fakultät für Maschinenbau, Helmut-Schmidt-Universität, Universität der Bundeswehr Hamburg, Holstenhofweg 85, Hamburg 22043, Germany

Received 3 July 2007; Accepted 31 October 2007

Academic Editor: Simeon Reich

Copyright © 2008 Thomas Bonesky et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Tikhonov functionals are known to be well suited for obtaining regularized solutions of linear operator equations. We analyze two iterative methods for finding the minimizer of norm-based Tikhonov functionals in Banach spaces. One is the steepest descent method, whereby the iterations are directly carried out in the underlying space, and the other one performs iterations in the dual space. We prove strong convergence of both methods.

#### 1. Introduction

This article is concerned with the stable solution of operator equations of the first kind in Banach spaces. More precisely, we aim at computing a solution of
$$Ax = y^\delta \tag{1.1}$$
for a linear, continuous mapping $A : X \to Y$, where $X$ and $Y$ are Banach spaces and $y^\delta$ denotes the measured data, which are contaminated by some noise of level $\delta$. There exists a large variety of regularization methods for (1.1) in case that $X$ and $Y$ are Hilbert spaces, such as the truncated singular value decomposition, the Tikhonov-Phillips regularization, or iterative solvers like the Landweber method and the method of conjugate gradients. We refer to the monographs of Louis [1], Rieder [2], and Engl et al. [3] for a comprehensive study of solution methods for inverse problems in Hilbert spaces.

The development of explicit solvers for operator equations in Banach spaces is a current field of research of great importance, since the Banach space setting allows one to deal with inverse problems in a mathematical framework that is often better adjusted to the requirements of a certain application. Alber [4] established an iterative regularization scheme in Banach spaces to solve (1.1) in the particular case that $A$ is a monotone operator. Plato [5] applied a linear Landweber method together with the discrepancy principle in order to get a solution to (1.1) after a discretization. Osher et al. [6] developed an iterative algorithm for image restoration by minimizing the total variation norm. Butnariu and Resmerita [7] used Bregman projections to obtain a weakly convergent algorithm for solving (1.1) in a Banach space setting. Schöpfer et al. [8] proved strong convergence and stability of a nonlinear Landweber method for solving (1.1) in connection with the discrepancy principle in a fairly general setting, where $X$ has to be smooth and uniformly convex.

The idea of this paper is to get a solver for (1.1) by minimizing a Tikhonov functional, where we use Banach space norms in the data term as well as in the penalty term. Since we only consider the case of exact data, we put $y^\delta = y$ in (1.1). That means that we investigate the problem
$$\min_{x \in X} T_\alpha(x), \tag{1.2}$$
where the Tikhonov functional is given by
$$T_\alpha(x) = \frac{1}{p}\|Ax - y\|^p + \frac{\alpha}{q}\|x\|^q \tag{1.3}$$
with a continuous linear operator $A$ mapping between two Banach spaces $X$ and $Y$.

If $X$ and $Y$ are Hilbert spaces, many results exist for problem (1.2) concerning solution methods, their convergence and stability, and parameter choice rules for $\alpha$ can be found in the literature. In case that only the data space $Y$ is a Hilbert space, this problem has been thoroughly studied and many solvers have been established; see [9, 10]. One possibility to get an approximate solution for (1.2) is to use the steepest descent method. Assume for the moment that both $X$ and $Y$ are Hilbert spaces and $p = q = 2$. Then $T_\alpha$ is Gâteaux differentiable and the steepest descent method applied to (1.2) coincides with the well-known Landweber method
$$x_{n+1} = x_n - \mu_n\bigl(A^*(Ax_n - y) + \alpha x_n\bigr). \tag{1.4}$$
This iterative method converges to the unique minimizer of problem (1.2) if the stepsize $\mu_n$ is chosen properly.
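In the Hilbert-space case this damped Landweber iteration is easy to try out numerically. The following sketch, with a small random matrix as a stand-in for $A$, compares the iterates against the closed-form Tikhonov solution $(A^*A + \alpha I)^{-1}A^*y$; the stepsize and iteration count are illustrative choices, not prescriptions from the paper.

```python
import numpy as np

def landweber_tikhonov(A, y, alpha, n_iter):
    """Damped Landweber iteration (1.4) for
    T(x) = 0.5*||Ax - y||^2 + 0.5*alpha*||x||^2 in Euclidean space."""
    mu = 1.0 / (np.linalg.norm(A, 2) ** 2 + alpha)  # safe constant stepsize
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x - mu * (A.T @ (A @ x - y) + alpha * x)  # x - mu * grad T(x)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))
y = rng.standard_normal(8)
alpha = 0.1
x_iter = landweber_tikhonov(A, y, alpha, n_iter=5000)
# the unique minimizer solves the normal equations (A^T A + alpha I) x = A^T y
x_direct = np.linalg.solve(A.T @ A + alpha * np.eye(5), A.T @ y)
```

With any stepsize below $2/(\|A\|^2 + \alpha)$ the iteration contracts linearly toward the minimizer.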

In the present paper, we consider two generalizations of (1.4). First, we notice that the natural extension of the gradient for convex, but not necessarily smooth, functionals is the notion of the subdifferential $\partial T_\alpha$. We will elaborate the details later, but for the time being we note that $\partial T_\alpha$ is a set-valued mapping, that is, $\partial T_\alpha : X \to 2^{X^*}$. Here we make use of the usual notation in the context of convex analysis, where $F : X \to 2^{X^*}$ means a mapping from $X$ to the set of subsets of $X^*$. We then consider a formally defined iterative scheme in which the subgradient step is performed in the dual space and the iterates are mapped back by a duality mapping $J$ of $X$. In the case of smooth $X$, we also consider a second generalization of (1.4), a steepest descent iteration carried out directly in $X$. We will show that both schemes converge strongly to the unique minimizer of problem (1.2) if the stepsize $\mu_n$ is chosen properly.

Alber et al. presented in [11] an algorithm for the minimization of convex and not necessarily smooth functionals on uniformly smooth and uniformly convex Banach spaces which looks very similar to our first method in Section 3 and where the authors impose summation conditions on the stepsizes $\mu_n$. However, only weak convergence of the proposed scheme is shown. Another interesting approach to obtaining convergence results for descent methods in general Banach spaces can be found in the recent papers by Reich and Zaslavski [12, 13]. We want to emphasize that the most important novelties of the present paper are the strong convergence results.

In the next section, we give the necessary theoretical tools and apply them in Sections 3 and 4 to describe the methods and prove their convergence properties.

#### 2. Preliminaries

Throughout the paper, let $X$ and $Y$ be Banach spaces with duals $X^*$ and $Y^*$. Their norms will be denoted by $\|\cdot\|$. We omit indices indicating the space, since it will become clear from the context which one is meant. For $x \in X$ and $x^* \in X^*$, we write $\langle x^*, x \rangle = x^*(x)$ for the dual pairing.

Let $p, q > 1$ be conjugate exponents such that
$$\frac{1}{p} + \frac{1}{q} = 1.$$

##### 2.1. Convexity and Smoothness of Banach Spaces

We introduce some definitions and preliminary results about the geometry of Banach spaces, which can be found in [14, 15].

The functions $\delta_X : [0, 2] \to [0, 1]$ and $\rho_X : [0, \infty) \to [0, \infty)$ defined by
$$\delta_X(\epsilon) = \inf\Bigl\{1 - \tfrac{1}{2}\|x + y\| : \|x\| = \|y\| = 1,\ \|x - y\| \ge \epsilon\Bigr\},$$
$$\rho_X(\tau) = \tfrac{1}{2}\sup\Bigl\{\|x + y\| + \|x - y\| - 2 : \|x\| = 1,\ \|y\| \le \tau\Bigr\}$$
are referred to as *the modulus of convexity of $X$* and *the modulus of smoothness of $X$*.

*Definition 2.1. *A Banach space $X$ is said to be

(1) uniformly convex if $\delta_X(\epsilon) > 0$ for all $\epsilon \in (0, 2]$,
(2) $p$-convex or convex of power type $p$ if $\delta_X(\epsilon) \ge C\,\epsilon^p$ for some $C > 0$ and all $\epsilon \in (0, 2]$,
(3) smooth if for every $x \ne 0$, there is a unique $x^* \in X^*$ such that $\|x^*\| = 1$ and $\langle x^*, x \rangle = \|x\|$,
(4) uniformly smooth if $\lim_{\tau \to 0} \rho_X(\tau)/\tau = 0$,
(5) $q$-smooth or smooth of power type $q$ if $\rho_X(\tau) \le C\,\tau^q$ for some $C > 0$ and all $\tau \ge 0$.

There is a tight connection between the modulus of convexity and the modulus of smoothness. The Lindenstrauss duality formula
$$\rho_{X^*}(\tau) = \sup\Bigl\{\tfrac{\tau\epsilon}{2} - \delta_X(\epsilon) : \epsilon \in [0, 2]\Bigr\}$$
implies that $X$ is $p$-convex if and only if $X^*$ is $q$-smooth, with conjugate exponents $p$ and $q$ (cf. [16], Chapter II, Theorem 2.12). From Dvoretzky's theorem [17], it follows that necessarily $p \ge 2$ and $q \le 2$. For Hilbert spaces, the polarization identity
$$\|x - y\|^2 = \|x\|^2 - 2\langle x, y \rangle + \|y\|^2 \tag{2.7}$$
asserts that every Hilbert space is $2$-convex and $2$-smooth. For the sequence spaces $\ell^p$, Lebesgue spaces $L^p$, and Sobolev spaces $W^{m,p}$ with $1 < p < \infty$, it is also known [18, 19] that they are $p$-smooth and $2$-convex for $1 < p \le 2$, and $2$-smooth and $p$-convex for $2 \le p < \infty$.

##### 2.2. Duality Mapping

For $p > 1$, the set-valued mapping $J_p : X \to 2^{X^*}$ defined by
$$J_p(x) = \bigl\{x^* \in X^* : \langle x^*, x \rangle = \|x\|\,\|x^*\|,\ \|x^*\| = \|x\|^{p-1}\bigr\}$$
is called the *duality mapping* of $X$ (with weight function $t \mapsto t^{p-1}$). By $j_p$ we denote a single-valued selection of $J_p$.

One can show [15, Theorem I.4.4] that $J_p$ is monotone, that is,
$$\langle x^* - y^*, x - y \rangle \ge 0 \quad \text{for all } x^* \in J_p(x),\ y^* \in J_p(y).$$
If $X$ is smooth, the duality mapping is single valued, that is, one can identify it as a mapping $J_p : X \to X^*$ [15, Theorem I.4.5].

If $X$ is uniformly convex or uniformly smooth, then $X$ is reflexive [15, Theorems II.2.9 and II.2.15]. By $J_q^{X^*}$, we then denote the duality mapping from $X^*$ into $X^{**} = X$.

Let $\partial f$ be the subdifferential of the convex functional $f : X \to \mathbb{R}$. At $x \in X$ it is defined by
$$\partial f(x) = \bigl\{x^* \in X^* : f(y) \ge f(x) + \langle x^*, y - x \rangle \ \text{for all } y \in X\bigr\}.$$
Another important property of $J_p$ is due to the theorem of Asplund [15, Theorem I.4.4]:
$$J_p(x) = \partial\Bigl(\tfrac{1}{p}\|x\|^p\Bigr).$$
This equality is also valid in the case of set-valued duality mappings.

*Example 2.2. * In $L^p$ spaces with $1 < p < \infty$, we have
$$J_p(f) = |f|^{p-1}\,\mathrm{sgn}(f) = |f|^{p-2} f.$$

In the sequence spaces $\ell^p$ with $1 < p < \infty$, we have
$$J_p(x) = \bigl(|x_k|^{p-1}\,\mathrm{sgn}(x_k)\bigr)_{k \in \mathbb{N}}.$$

We also refer the interested reader to [20] where additional information on duality mappings may be found.
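The defining properties of the duality mapping can be checked numerically in $\ell^p$, assuming the componentwise formula above: $\langle J_p(x), x \rangle$ should equal $\|x\|_p^p$, and the dual ($\ell^q$) norm of $J_p(x)$ should equal $\|x\|_p^{p-1}$. A minimal sketch:

```python
import numpy as np

def J_p(x, p):
    """Duality mapping of l^p with weight t -> t^(p-1),
    applied componentwise: (J_p x)_k = |x_k|^(p-1) sgn(x_k)."""
    return np.abs(x) ** (p - 1) * np.sign(x)

p = 1.5
q = p / (p - 1)                                # conjugate exponent
x = np.array([1.0, -2.0, 0.5])
jx = J_p(x, p)

pairing = jx @ x                               # <J_p(x), x>
norm_p = np.sum(np.abs(x) ** p) ** (1 / p)     # ||x||_p
norm_q = np.sum(np.abs(jx) ** q) ** (1 / q)    # ||J_p(x)||_q (dual norm)
```

Here `pairing` equals $\|x\|_p^p$ and `norm_q` equals $\|x\|_p^{p-1}$, as required by the definition with weight $t \mapsto t^{p-1}$.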

##### 2.3. Xu-Roach Inequalities

The next theorem (see [19]) provides us with inequalities which will be of great relevance for proving the convergence of our methods.

Theorem 2.3. *(1) Let $X$ be a $p$-smooth Banach space. Then there exists a positive constant $G_p$ such that for all $x, y \in X$:*
$$\|x - y\|^p \le \|x\|^p - p\,\langle J_p(x), y \rangle + G_p\,\|y\|^p.$$
*(2) Let $X$ be a $p$-convex Banach space. Then there exists a positive constant $C_p$ such that for all $x, y \in X$:*
$$\|x - y\|^p \ge \|x\|^p - p\,\langle J_p(x), y \rangle + C_p\,\|y\|^p.$$

We remark that in a real Hilbert space these inequalities reduce to the well-known polarization identity (2.7). Further, we refer to [19] for the exact values of the constants $G_p$ and $C_p$. For special cases like $L^p$-spaces these constants have a simple form; see [8].

##### 2.4. Bregman Distances

It turns out that, due to the geometrical characteristics of Banach spaces other than Hilbert spaces, it is often more appropriate to use Bregman distances instead of conventional norm-based functionals such as $\|x_n - x\|$ for convergence analysis. The idea to use such distances to design and analyze optimization algorithms goes back to Bregman [21], and since then his ideas have been successfully applied in various ways [4, 8, 22–26].

*Definition 2.4. * Let $X$ be smooth and convex of power type. Then the Bregman distance is defined as
$$\Delta_p(x, y) = \tfrac{1}{q}\|x\|^p - \langle J_p(x), y \rangle + \tfrac{1}{p}\|y\|^p, \quad x, y \in X.$$
We summarize a few facts concerning Bregman distances and their relationship to the norm in $X$ (see also [8, Theorem 2.12]).

Theorem 2.5. *Let $X$ be smooth and convex of power type. Then for all $x, y \in X$ and all sequences $(x_n) \subset X$ the following holds:*

(1) *$\Delta_p(x, y) \ge 0$, and $\Delta_p(x, y) = 0$ if and only if $x = y$,*
(2) *$\lim_{n \to \infty} \Delta_p(x_n, x) = 0$ if and only if $\lim_{n \to \infty} \|x_n - x\| = 0$,*
(3) *$\Delta_p(\cdot, y)$ is coercive, that is, the sequence $(x_n)$ remains bounded if the sequence $(\Delta_p(x_n, y))$ is bounded.*

*Remark 2.6. * $\Delta_p$ is in general not a metric. In a real Hilbert space, $\Delta_2(x, y) = \tfrac{1}{2}\|x - y\|^2$.
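These facts are easy to verify numerically in $\ell^p$. The sketch below uses one common convention for the Bregman distance of $f = \tfrac{1}{p}\|\cdot\|_p^p$, namely $\Delta_p(x, y) = f(y) - f(x) - \langle J_p(x), y - x\rangle$ (equivalent, via Asplund's theorem, to the expression in Definition 2.4); for $p = 2$ it reduces to $\tfrac{1}{2}\|x - y\|^2$ as in Remark 2.6.

```python
import numpy as np

def J_p(x, p):
    # componentwise duality mapping of l^p with weight t -> t^(p-1)
    return np.abs(x) ** (p - 1) * np.sign(x)

def bregman(x, y, p):
    """Bregman distance of f(x) = (1/p)*||x||_p^p:
    Delta_p(x, y) = f(y) - f(x) - <J_p(x), y - x>."""
    f = lambda z: np.sum(np.abs(z) ** p) / p
    return f(y) - f(x) - J_p(x, p) @ (y - x)

x = np.array([1.0, -0.5, 2.0])
y = np.array([0.3, 1.0, -1.0])
d2 = bregman(x, y, 2.0)    # Hilbert case: equals 0.5*||x - y||^2
d15 = bregman(x, y, 1.5)   # nonnegative by convexity of f
```

Note that `bregman(x, y, p)` and `bregman(y, x, p)` generally differ: the Bregman distance is not symmetric, which is one reason it fails to be a metric.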

To shorten the proof in Section 3, we formulate and prove the following.

Lemma 2.7. *Let $X$ be a $2$-convex Banach space. Then there exists a positive constant $C_2$ such that*
$$\Delta_2(x, y) \ge \tfrac{C_2}{2}\,\|x - y\|^2.$$

*Proof. *We have $\Delta_2(x, y) = \tfrac{1}{2}\|y\|^2 - \tfrac{1}{2}\|x\|^2 - \langle J_2(x), y - x \rangle$ and $\langle J_2(x), x \rangle = \|x\|^2$, hence
$$\Delta_2(x, y) = \tfrac{1}{2}\|y\|^2 + \tfrac{1}{2}\|x\|^2 - \langle J_2(x), y \rangle.$$
By Theorem 2.3, applied with $y$ replaced by $x - y$, we obtain
$$\|y\|^2 = \|x - (x - y)\|^2 \ge \|x\|^2 - 2\langle J_2(x), x - y \rangle + C_2\|x - y\|^2,$$
and inserting this into the identity above yields $\Delta_2(x, y) \ge \tfrac{C_2}{2}\|x - y\|^2$. This completes the proof.

#### 3. The Dual Method

This section deals with an iterative method for minimizing functionals of Tikhonov type. In contrast to the algorithm described in the next section, we iterate directly in the dual space $X^*$.

For simplicity, we restrict ourselves to the Tikhonov functional (3.1), the case $p = q = 2$ of (1.3), where $X$ is a $2$-convex and smooth Banach space, $Y$ is an arbitrary Banach space, and $A : X \to Y$ is a linear, continuous operator. For minimizing the functional, we choose an arbitrary starting point and consider the following scheme (3.2), in which the subgradient step is performed on the dual iterates in $X^*$ and the primal iterates are recovered via the duality mapping.

We show the convergence of this method in a constructive way. This will be done via the following steps.

(1) We show a basic inequality for the iterates, where $x^\dagger$ denotes the unique minimizer of the Tikhonov functional (3.1).
(2) We choose admissible stepsizes $\mu_n$ and show that the iterates approach $x^\dagger$ in the Bregman sense, under a smallness assumption whose parameter will be specified later.
(3) We establish an upper estimate for the Bregman distance in the case that this condition is violated.
(4) We choose the stepsizes such that in that case the iterates stay in a certain Bregman ball around $x^\dagger$, whose radius is some a priori chosen precision we want to achieve.
(5) Finally, we state the iterative minimization scheme.
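To make the idea of iterating in the dual space concrete, here is a hypothetical numerical sketch on $X = \ell^p$ (so $X^* = \ell^q$), using the illustrative functional $T(x) = \tfrac{1}{2}\|Ax - y\|_2^2 + \tfrac{\alpha}{p}\|x\|_p^p$ rather than the paper's exact (3.1): the gradient step is taken on the dual variable $x^*$, and the primal iterate is recovered through the componentwise inverse duality map. Operator, stepsize, and iteration count are made-up example values.

```python
import numpy as np

def J(x, r):
    # componentwise map s -> |s|^(r-1) sgn(s); J(., q) inverts J(., p)
    return np.abs(x) ** (r - 1) * np.sign(x)

def dual_method(A, y, alpha, p, mu, n_iter=5000):
    """Sketch of a dual-space iteration for
    T(x) = 0.5*||Ax - y||_2^2 + (alpha/p)*||x||_p^p on l^p:
    step on x* in l^q, then map back to the primal space."""
    q = p / (p - 1)
    xs = np.zeros(A.shape[1])            # dual iterate x*, start at 0
    for _ in range(n_iter):
        x = J(xs, q)                     # primal iterate x = J_q(x*)
        grad = A.T @ (A @ x - y) + alpha * J(x, p)  # (sub)gradient of T at x
        xs = xs - mu * grad              # descent step in the dual space
    return J(xs, q)

A = np.array([[1.0, 0.2], [0.1, 0.9], [0.3, 0.4]])
y = np.array([1.0, 0.5, 0.2])
# for p = 2 the duality maps are identities and the scheme reduces to
# plain gradient descent on the classical Tikhonov functional
x2 = dual_method(A, y, alpha=0.1, p=2.0, mu=0.5)
x15 = dual_method(A, y, alpha=0.1, p=1.5, mu=0.1)
```

For `p=2` the result matches the normal-equations solution; for `p=1.5` the returned iterate approximately satisfies the stationarity condition of $T$.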

(i) First, we calculate an estimate for the Bregman distance between the iterates and the minimizer.

Under our assumptions on $X$, we know that the Tikhonov functional has a unique minimizer $x^\dagger$. Using (3.2), we can expand the Bregman distance of the iterates to $x^\dagger$. We recall that $X$ is $2$-convex, hence $X^*$ is $2$-smooth; see Section 2.1. By Theorem 2.3 applied to $X^*$, we get

Therefore,

We have the standard subdifferential calculus rules (cf. [27], Chapter I, Propositions 5.6, 5.7). By definition, $x^\dagger$ is the minimizer of the Tikhonov functional, hence zero belongs to the subdifferential at $x^\dagger$. Therefore, with the monotonicity of the subdifferential, we get the following estimate. Consider

Finally, we arrive at the desired inequality

(ii) Next, we choose admissible stepsizes. Assume for the moment that the following condition holds.

We see that the choice of stepsize below minimizes the right-hand side of (3.12). We do not know the distance of the current iterate to $x^\dagger$, therefore, we set the stepsize accordingly. We will impose additional conditions on it later. For the time being, assume that the involved parameter is small. The constant appearing in the stepsize is defined as follows. The Tikhonov functional is bounded on norm-bounded sets, and by Lemma 2.7, bounded Bregman distances imply bounded norms. Hence, this constant is finite for finite arguments.

*Remark 3.1. * Under the stated assumption, and with the help of Lemma 2.7, the definition of the above constant, and the duality mapping, we get an estimate for the quantities entering the stepsize. In practice, we will not determine this estimate exactly, but choose the constant, in this sense, large enough.

For stepsizes chosen in this way, we approach the minimizer $x^\dagger$ in the Bregman sense; this descent is ensured as long as the above condition is fulfilled.

(iii) We know the behavior of the Bregman distances if the condition holds. Next, we need to know what happens if it is violated. By (3.12), we then have

(iv) We choose the stepsize in terms of the accuracy we aim at. For the opposite case, this choice assures that the growth of the Bregman distance is controlled. Note that this choice of the stepsize also implies the required smallness condition.

Next, we calculate an index which ensures that the subsequent iterates are located in a Bregman ball with the prescribed radius around $x^\dagger$. We know that if an iterate fulfills the condition of step (ii), then all following iterates fulfill this condition as well.

Hence, it remains to consider the opposite case. By (3.20), we know that this is only the case if the corresponding estimate holds. By choosing the index large enough that this estimate is dominated by the prescribed accuracy, we get the desired inclusion.
Figure 1 illustrates the behavior of the iterates.

(v) We are now in the same situation as described in step (2). If we replace the accuracy by a smaller one, the starting index by the index just computed, and the starting iterate by the current one, and repeat the argumentation in steps (2)–(4), we obtain a contracting sequence of Bregman balls.

If the sequence of radii is a null sequence, then by Lemma 2.7 the iterates converge strongly to $x^\dagger$. This proves the following.

Theorem 3.2. *The iterative method, defined by*

(S_0) *choose an arbitrary starting point and a decreasing positive null sequence of accuracies; set the iteration counter to zero;*
(S_1) *compute the stepsize parameters and the required number of iterations as described in steps (2)–(4);*
(S_2) *iterate by the scheme (3.2) for at least that number of iterations;*
(S_3) *pass to the next accuracy, reset the iteration counter, and go to step (S_1);*

*defines an iterative minimization method for the Tikhonov functional defined in (3.1), and the iterates converge strongly to the unique minimizer $x^\dagger$.*

*Remark 3.3. *A similar construction can be carried out for any $p$-convex and smooth Banach space.

#### 4. Steepest Descent Method

Let $X$ be uniformly convex and uniformly smooth and let $Y$ be uniformly smooth. Then the Tikhonov functional (4.1) is strictly convex, weakly lower semicontinuous, coercive, and Gâteaux differentiable with derivative (4.2). Hence, there exists the unique minimizer $x^\dagger$ of the functional, which is characterized by the optimality condition (4.3) that the derivative vanishes. In this section, we consider the steepest descent method to find $x^\dagger$. In [28, 29], it has already been proven that for a general continuously differentiable functional every cluster point of such a steepest descent method is a stationary point. Recently, Canuto and Urban [30] have shown strong convergence under the additional assumption of ellipticity, which our functional in (4.1) would fulfill if we required $X$ to be convex of power type. Here we prove strong convergence without this additional assumption. To make the proof of convergence more transparent, we confine ourselves here to the case of $p$-smooth $Y$ and $q$-smooth $X$ (with $p$ and $q$ then being the exponents appearing in the definition of the Tikhonov functional (4.1)), and refer the interested reader to the appendix, where we prove the general case.
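In the Hilbert-space specialization ($p = q = 2$, duality mappings the identity), the steepest descent method of this section becomes gradient descent with a line search; for the quadratic functional $T(x) = \tfrac{1}{2}\|Ax - y\|^2 + \tfrac{\alpha}{2}\|x\|^2$ the exact line-search step even has a closed form. A small numpy sketch with made-up data:

```python
import numpy as np

def steepest_descent(A, y, alpha, n_iter=500, tol=1e-12):
    """Steepest descent with exact line search for
    T(x) = 0.5*||Ax - y||^2 + 0.5*alpha*||x||^2 (Hilbert case).
    Along the direction -g, this quadratic is minimized by
    mu = ||g||^2 / (||A g||^2 + alpha*||g||^2)."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - y) + alpha * x    # gradient T'(x_n)
        if np.linalg.norm(g) < tol:          # stopping criterion
            break
        mu = (g @ g) / ((A @ g) @ (A @ g) + alpha * (g @ g))
        x = x - mu * g
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
y = rng.standard_normal(6)
x_sd = steepest_descent(A, y, alpha=0.5)
# the unique minimizer solves (A^T A + alpha I) x = A^T y
x_ref = np.linalg.solve(A.T @ A + 0.5 * np.eye(4), A.T @ y)
```

In the Banach-space setting of Theorem 4.1 no such closed-form step exists, and the line search must be performed on the one-dimensional function of the stepsize discussed in Remark 4.2.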

Theorem 4.1. *The sequence $(x_n)$, generated by*

(S_0) *choose an arbitrary starting point $x_0 \in X$ and set the iteration counter to zero;*
(S_1) *if the derivative of the Tikhonov functional vanishes at $x_n$, then STOP; else do a line search to find a stepsize $\mu_n$ minimizing the functional along the descent direction;*
(S_2) *set $x_{n+1}$ accordingly and go to step (S_1);*

*converges strongly to the unique minimizer $x^\dagger$ of the Tikhonov functional.*

*Remark 4.2. * (a) If the stopping criterion is fulfilled for some $n$, then by (4.3), we already have $x_n = x^\dagger$ and we can stop iterating.

(b) Due to the properties of the Tikhonov functional, the function of the stepsize appearing in the line search of step (S_1) is strictly convex and differentiable with continuous derivative. Since its derivative is negative at zero and increasing by the monotonicity of the duality mappings, we know that $\mu_n$ must in fact be positive.

*Proof of Theorem 4.1. * By the above remark, it suffices to prove convergence in the case that the stopping criterion is never fulfilled. We fix $n$ and show that there exists a positive stepsize such that a decrease estimate holds, which will finally assure convergence. To establish this relation, we use the characteristic inequalities in Theorem 2.3 to estimate the functional value at the trial point for all positive stepsizes. By (4.1) and (4.2), for the chosen descent direction we can further estimate, whereby we introduce an auxiliary function of the stepsize. This function is continuous and increasing, vanishing at zero and tending to infinity. Hence, there exists a stepsize such that (4.12) holds, and we get (4.13). We show that the derivatives of the Tikhonov functional at the iterates tend to zero. From (4.13), we infer that the sequence of functional values is decreasing, and especially bounded, and that the summability relation (4.14) holds. Since the Tikhonov functional is coercive, the sequence $(x_n)$ remains bounded, and (4.2) then implies that the sequence of derivatives is bounded as well. Suppose the derivatives do not tend to zero, and pass to a subsequence along which their norms stay bounded away from zero. Then we must have $\mu_n \to 0$ along this subsequence by (4.14). But by the definition of the auxiliary function (4.11) and the choice of the stepsizes (4.12), we get, for some constant, a lower bound on $\mu_n$ whose right-hand side converges to zero, and this leads to a contradiction. So the derivatives tend to zero. We finally show that $(x_n)$ converges strongly to $x^\dagger$. By (4.3) and the monotonicity of the duality mapping, since $(x_n)$ is bounded and the derivatives tend to zero, we infer that $(x_n)$ converges strongly to $x^\dagger$ in the uniformly convex space $X$ [15, Theorem II.2.17].

#### 5. Conclusions

We have analyzed two conceptually quite different nonlinear iterative methods for finding the minimizer of norm-based Tikhonov functionals in Banach spaces. One is the steepest descent method, where the iterations are directly carried out in the space $X$ by pulling the gradient of the Tikhonov functional back to $X$ via duality mappings. The method is shown to be strongly convergent in case the involved spaces are nice enough. In the other one, the iterations are performed in the dual space $X^*$. Though this method seems to be inherently slow, strong convergence can be shown without restrictions on the space $Y$.

#### Appendix

#### Steepest Descent Method in Uniformly Smooth Spaces

As already pointed out in Section 4, we prove here Theorem 4.1 for the general case of $X$ being uniformly convex and uniformly smooth and $Y$ being uniformly smooth, with general exponents in the definition of the Tikhonov functional (4.1). To do so, we need some additional results based on the paper of Xu and Roach [19].

In what follows, $C$ and $c$ always denote (generic) positive constants. Let $h$ be the function defined by $h(\tau) = \rho_X(\tau)/\tau$, where $\rho_X$ is the modulus of smoothness of a Banach space $X$. The function $h$ is known to be continuous and nondecreasing [14, 31].

The next lemma allows us to estimate the difference of duality mappings via the modulus of smoothness, which in turn will be used to derive a version of the characteristic inequality that is more convenient for our purpose.

Lemma A.1. *Let $X$ be a uniformly smooth Banach space with duality mapping $J_p$ with weight $t \mapsto t^{p-1}$. Then for all $x, y \in X$ the following inequalities are valid: the estimate (A.3) for the difference of the duality mappings (hence, $J_p$ is uniformly continuous on bounded sets), and the characteristic inequality (A.4).*

*Proof. * We first prove (A.3). By [19, formula (3.1)], we have a basic estimate, which we treat similarly as after inequality (3.5) in the same paper. In the first case, we get the claim by the monotonicity of $h$, and therefore (A.3) is valid. In the second case, we use the fact that $\rho_X(\tau)/\tau^2$ is equivalent to a decreasing function [14] and obtain the corresponding bound; we thus arrive at the claim, so also in this case (A.3) is valid.

Let us prove (A.4). As in [19], we consider a continuously differentiable auxiliary function and differentiate it. Substituting the relevant argument yields an intermediate identity. By the monotonicity of $J_p$, the resulting term has the correct sign, and by (A.3), we thus obtain (A.4).

The proof of Theorem 4.1 is now quite similar to the case of smoothness of power type, though it is more technical, and we only give the main modifications.

*Proof of Theorem 4.1 (for uniformly smooth spaces). * We fix $n$, and for the line search, we choose the stepsize $\mu_n$ such that (A.14) holds. Here the auxiliary function is defined by (A.15), with the constants being the ones appearing in the respective characteristic inequalities (A.4). This choice of $\mu_n$ is possible since, by the properties of the moduli of smoothness of $X$ and $Y$, this function is continuous, increasing, and unbounded. We again aim at an inequality of the decrease form, which will finally assure convergence. Here we use the characteristic inequalities (A.4) to estimate the new functional value. Since the derivative does not vanish, and by the definition of the auxiliary function (A.15), we can further estimate. The choice (A.14) finally yields the decrease estimate. It remains to show that this implies that the derivatives tend to zero; the rest then follows analogously as in the proof of Theorem 4.1. From (A.19), we infer that the functional values decrease and that the relevant sequences are bounded.

Suppose the derivatives do not tend to zero, and pass to a subsequence along which their norms stay bounded away from zero. Then we must have $\mu_n \to 0$ along this subsequence by (A.20). We show that this leads to a contradiction. On the one hand, by (A.15), we get an upper bound whose right-hand side converges to zero, so the corresponding quantity tends to zero as well. On the other hand, by (A.14), we get a positive lower bound for all sufficiently large indices, which contradicts the former. So the derivatives tend to zero, and the proof is complete.

#### Acknowledgment

The first author was supported by Deutsche Forschungsgemeinschaft, Grant no. MA 1657/15-1.

#### References

1. A. K. Louis, *Inverse und schlecht gestellte Probleme*, Teubner Studienbücher Mathematik, B. G. Teubner, Stuttgart, Germany, 1989.
2. A. Rieder, *No Problems with Inverse Problems*, Vieweg & Sohn, Braunschweig, Germany, 2003.
3. H. Engl, M. Hanke, and A. Neubauer, *Regularization of Inverse Problems*, Kluwer Academic, Dordrecht, The Netherlands, 2000.
4. Y. I. Alber, “Iterative regularization in Banach spaces,” *Soviet Mathematics*, vol. 30, no. 4, pp. 1–8, 1986.
5. R. Plato, “On the discrepancy principle for iterative and parametric methods to solve linear ill-posed equations,” *Numerische Mathematik*, vol. 75, no. 1, pp. 99–120, 1996.
6. S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, “An iterative regularization method for total variation-based image restoration,” *Multiscale Modeling & Simulation*, vol. 4, no. 2, pp. 460–489, 2005.
7. D. Butnariu and E. Resmerita, “Bregman distances, totally convex functions, and a method for solving operator equations in Banach spaces,” *Abstract and Applied Analysis*, vol. 2006, Article ID 84919, 39 pages, 2006.
8. F. Schöpfer, A. K. Louis, and T. Schuster, “Nonlinear iterative methods for linear ill-posed problems in Banach spaces,” *Inverse Problems*, vol. 22, no. 1, pp. 311–329, 2006.
9. I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” *Communications on Pure and Applied Mathematics*, vol. 57, no. 11, pp. 1413–1457, 2004.
10. K. Bredies, D. Lorenz, and P. Maass, “A generalized conditional gradient method and its connection to an iterative shrinkage method,” to appear in *Computational Optimization and Applications*.
11. Y. I. Alber, A. N. Iusem, and M. V. Solodov, “Minimization of nonsmooth convex functionals in Banach spaces,” *Journal of Convex Analysis*, vol. 4, no. 2, pp. 235–255, 1997.
12. S. Reich and A. J. Zaslavski, “Generic convergence of descent methods in Banach spaces,” *Mathematics of Operations Research*, vol. 25, no. 2, pp. 231–242, 2000.
13. S. Reich and A. J. Zaslavski, “The set of divergent descent methods in a Banach space is $\sigma$-porous,” *SIAM Journal on Optimization*, vol. 11, no. 4, pp. 1003–1018, 2001.
14. J. Lindenstrauss and L. Tzafriri, *Classical Banach Spaces. II*, vol. 97 of *Results in Mathematics and Related Areas*, Springer, Berlin, Germany, 1979.
15. I. Cioranescu, *Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems*, vol. 62 of *Mathematics and Its Applications*, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1990.
16. R. Deville, G. Godefroy, and V. Zizler, *Smoothness and Renormings in Banach Spaces*, vol. 64 of *Pitman Monographs and Surveys in Pure and Applied Mathematics*, Longman Scientific & Technical, Harlow, UK, 1993.
17. A. Dvoretzky, “Some results on convex bodies and Banach spaces,” in *Proceedings of the International Symposium on Linear Spaces*, pp. 123–160, Jerusalem Academic Press, Jerusalem, Israel, 1961.
18. O. Hanner, “On the uniform convexity of ${L}^{p}$ and ${l}^{p}$,” *Arkiv för Matematik*, vol. 3, no. 3, pp. 239–244, 1956.
19. Z. B. Xu and G. F. Roach, “Characteristic inequalities of uniformly convex and uniformly smooth Banach spaces,” *Journal of Mathematical Analysis and Applications*, vol. 157, no. 1, pp. 189–210, 1991.
20. S. Reich, “Review of I. Cioranescu, ‘Geometry of Banach spaces, duality mappings and nonlinear problems’,” *Bulletin of the American Mathematical Society*, vol. 26, no. 2, pp. 367–370, 1992.
21. L. M. Bregman, “The relaxation method for finding common points of convex sets and its application to the solution of problems in convex programming,” *USSR Computational Mathematics and Mathematical Physics*, vol. 7, pp. 200–217, 1967.
22. C. Byrne and Y. Censor, “Proximity function minimization using multiple Bregman projections, with applications to split feasibility and Kullback-Leibler distance minimization,” *Annals of Operations Research*, vol. 105, no. 1–4, pp. 77–98, 2001.
23. Y. I. Alber and D. Butnariu, “Convergence of Bregman projection methods for solving consistent convex feasibility problems in reflexive Banach spaces,” *Journal of Optimization Theory and Applications*, vol. 92, no. 1, pp. 33–61, 1997.
24. H. H. Bauschke, J. M. Borwein, and P. L. Combettes, “Bregman monotone optimization algorithms,” *SIAM Journal on Control and Optimization*, vol. 42, no. 2, pp. 596–636, 2003.
25. H. H. Bauschke and A. S. Lewis, “Dykstra's algorithm with Bregman projections: a convergence proof,” *Optimization*, vol. 48, no. 4, pp. 409–427, 2000.
26. J. D. Lafferty, S. D. Pietra, and V. D. Pietra, “Statistical learning algorithms based on Bregman distances,” in *Proceedings of the 5th Canadian Workshop on Information Theory*, Toronto, Ontario, Canada, June 1997.
27. I. Ekeland and R. Temam, *Convex Analysis and Variational Problems*, North-Holland, Amsterdam, The Netherlands, 1976.
28. R. R. Phelps, “Metric projections and the gradient projection method in Banach spaces,” *SIAM Journal on Control and Optimization*, vol. 23, no. 6, pp. 973–977, 1985.
29. R. H. Byrd and R. A. Tapia, “An extension of Curry's theorem to steepest descent in normed linear spaces,” *Mathematical Programming*, vol. 9, no. 1, pp. 247–254, 1975.
30. C. Canuto and K. Urban, “Adaptive optimization of convex functionals in Banach spaces,” *SIAM Journal on Numerical Analysis*, vol. 42, no. 5, pp. 2043–2075, 2005.
31. T. Figiel, “On the moduli of convexity and smoothness,” *Studia Mathematica*, vol. 56, no. 2, pp. 121–155, 1976.