Abstract

We give an elementary derivation of an extension of the Ritz method to trial functions that do not satisfy essential boundary conditions. As in the Babuška-Brezzi approach boundary conditions are treated as variational constraints and Lagrange multipliers are used to remove them. However, we avoid the saddle point reformulation of the problem and therefore do not have to deal with the Babuška-Brezzi inf-sup condition. In higher dimensions boundary weights are used to approximate the boundary conditions, and the assumptions in our convergence proof are stated in terms of completeness of the trial functions and of the boundary weights. These assumptions are much more straightforward to verify than the Babuška-Brezzi condition. We also discuss limitations of the method and implementation issues that follow from our analysis and examine a number of examples, both analytic and numerical.

1. Introduction

In variational problems linear boundary conditions are often divided into essential (geometric) and natural (dynamic) [1, II.12] and [2, 4.4.7]. More generally, one calls the boundary conditions essential if they involve derivatives of order less than half of the order of the differential equation and natural otherwise [3, I.1.2]. In the standard exposition of the Ritz method the trial functions may violate the natural conditions but must satisfy all the essential ones [2, 4.4.7] and [4]. The reason is that the variational equations force the natural conditions on the trial solutions anyway, even if the trial functions themselves do not satisfy them.

But what if we wish to use trial functions that violate the essential conditions as well? For instance, in problems involving parametric asymptotics, the trial functions are prescribed in advance with no regard for boundary conditions [5, 6], and in initial-boundary value problems with time-dependent boundary conditions the (time-independent) trial functions can not satisfy them in principle. One may also wish to use such violating trial functions simply because they are simpler; see [7] for other possible reasons. Thus, there is abundant motivation to generalize the Ritz method to trial functions that do not satisfy the essential conditions. From a theoretical viewpoint this is a particular case of approximating solutions by nonconforming functions, the nonconformity here being at the boundary [8].

A natural idea is to treat the essential boundary conditions as variational constraints and to remove them, as any other constraints, using Lagrange multipliers. Such an approach has been taken, somewhat naively, in applied works since at least 1946 [9]; see also [10], where the authors explicitly cite the simplicity of the trial functions as a reason for using them. Babuška et al. [8, Sec.7] and [11] were the first to analyse theoretically the use of Lagrange multipliers in the context of the nonconforming Finite Element Method (FEM); their work was extended to more general trial functions by Brezzi [12] and [13, II.1]. Their analysis relies on the saddle point reformulation of the original variational problem and leads to the celebrated Babuška-Brezzi inf-sup condition, which dictates a strict relation between the choices of spaces for the trial functions and for the Lagrange multipliers; see also [13, 14] for applications in a non-FEM context. This approach, however, is very far from the intuitive reasoning behind the naive application of Lagrange multipliers in [9] or in [10]. As a result, the meaning of the Babuška-Brezzi condition remains obscure, which explains why it is relatively unfamiliar to nonexperts, and its verification is often quite involved mathematically [14]. The standard approach applies to problems with quadratic functionals [13, II.1] but produces very strong stability and approximation results.

It is not our purpose to match the technical sophistication of the Babuška-Brezzi approach, but to give a more elementary and natural derivation of the Ritz method with Lagrange multipliers, the Ritz-Lagrange method for short. It applies to general convex functionals and is much closer to the standard theory of the Ritz method [15–17]. We avoid the saddle point reformulation altogether and work directly with the original variational problem. A system of boundary weights is used to approximate the Lagrange multipliers, and our assumptions are stated in terms of completeness of the trial functions and of the boundary weights. While completeness is implicitly contained in the Babuška-Brezzi condition, its role is by no means obvious or well-known (see [18]), and it has significant practical consequences. On the other hand, completeness is much more intuitive and straightforward to verify than an inf-sup condition. As a trade-off, we only prove convergence of the method rather than obtain an explicit error estimate, so the relation between the spaces of the trial functions and of the boundary weights is less precise. But hopefully a more intuitive approach provides a better understanding of the analytic and numerical issues involved.

This paper is largely inspired by observations in [18] on intriguing and counterintuitive effects that the lack of completeness has on convergence of trial solutions to a true solution. We illustrate these effects further in a number of examples and explain them in the general context of convex analysis. Our approach leads to a number of practically useful observations. In particular, extra care is needed compared to the usual Ritz method: the variational functional has to be more regular on a larger space, the trial functions have to be complete in this larger space as well, and the multipliers can not be eliminated from the approximating systems using the usual variational formulas because of convergence issues. In higher-dimensional problems one needs to balance the number of boundary weights against the number of trial functions to obtain well-posed approximating problems. While we do not prescribe this balance precisely, which would involve an analog of the Babuška-Brezzi condition, we still obtain a practical rule of thumb that works well in examples.

In Section 2 we introduce the Ritz-Lagrange method using simple one-dimensional examples, where not only the exact solution but also all the trial solutions are computed analytically. We also introduce the notation and terminology of convex analysis needed to analyse the method theoretically. The proof of convergence is obtained in this case by a straightforward reduction to the classical Ritz method. Unfortunately, this direct approach does not carry over to higher dimensions, and we develop a suitable generalization in Section 3. Numerical applications to multidimensional problems follow in Section 4. The paper ends with Conclusions, where we summarize our findings and discuss Galerkin type generalizations. Technical proofs are collected in the Appendix.

2. Boundary Conditions as Variational Constraints

As a motivation, consider a boundary value problem for a second-order equation with essential boundary conditions at both ends of the interval. The coefficients and the right-hand side are set for convenience so that everything is explicitly computable, and the exact solution is easily found in closed form. The corresponding variational functional is given in (1), and the boundary value problem is equivalent to minimizing it over functions satisfying the boundary conditions.

We select the cosines as our trial functions; they obviously do not satisfy the boundary conditions. Taking finitely many of them, the trial solution is their linear combination with unknown coefficients (the factor one-half in front of the leading term is for agreement with the convention for cosine series). Since our boundary conditions are essential, and our trial functions do not satisfy them, the Ritz method has to be modified in some way. Our approach is to treat the essential conditions as variational constraints and to remove them using Lagrange multipliers. The Lagrange functional is obtained by adding to the variational functional the boundary values of the trial solution multiplied by the two Lagrange multipliers. Substituting the trial solution, we write down the variational equations with respect to the coefficients and, adding the two boundary conditions, obtain the Ritz-Lagrange system. Solving these equations for the coefficients and the multipliers, we find them in closed form (the formula involves the floor function returning the largest integer not exceeding its argument). Thus, the trial solutions converge to the sum of a cosine series. By extending the exact solution to an even function and expanding it into a cosine series [19, 12.1], one finds that this limit is exactly the cosine series of the exact solution.
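To make the procedure concrete, here is a minimal sympy sketch of the one-dimensional Ritz-Lagrange system with cosine trial functions and endpoint multipliers. The model data are assumptions standing in for the example's data (the problem -u'' = f on (0, 1) with u(0) = u(1) = 0 and the load f(x) = x); only the overall structure follows the text.

```python
# Minimal Ritz-Lagrange sketch: cosine trial functions that violate the
# essential conditions, plus one Lagrange multiplier per endpoint.
# Assumed model data: -u'' = f on (0, 1), u(0) = u(1) = 0, f(x) = x.
import sympy as sp

x = sp.symbols('x')
N = 6                                      # number of cosine modes kept
f = x                                      # assumed right-hand side

a = sp.symbols(f'a0:{N + 1}')              # trial coefficients a_0 .. a_N
lam0, lam1 = sp.symbols('lambda0 lambda1') # multipliers at x = 0 and x = 1

# Trial solution a_0/2 + sum a_k cos(k*pi*x); it need not vanish at 0 or 1.
u = a[0] / 2 + sum(a[k] * sp.cos(k * sp.pi * x) for k in range(1, N + 1))

# Variational functional J(u) = int (u'^2/2 - f*u) dx and Lagrange functional.
J = sp.integrate(sp.diff(u, x)**2 / 2 - f * u, (x, 0, 1))
L = J + lam0 * u.subs(x, 0) + lam1 * u.subs(x, 1)

# Internal equations dL/da_k = 0 plus the two boundary equations u(0)=u(1)=0.
eqs = [sp.diff(L, ak) for ak in a] + [u.subs(x, 0), u.subs(x, 1)]
sol = sp.solve(eqs, list(a) + [lam0, lam1])
print(sp.simplify(u.subs(sol)))            # the Ritz-Lagrange trial solution
```

The printed trial solution is a partial cosine sum whose coefficients converge, as the number of modes grows, to the cosine coefficients of the even extension of the exact solution, mirroring the computation described above.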

Variational problems typically come with a natural energy space, where convergence of solutions is considered [18]. On its energy space a variational functional typically has two important properties: it is continuous and it is weakly coercive, that is, it tends to infinity along sequences whose norms tend to infinity [15, 6.2] and [16, III.10.2]. For a convex functional (we will only consider those) these two properties are sufficient to prove convergence of the usual Ritz approximations in the energy norm [15, 6.2A]. For the functional from (1) the energy space is the Sobolev space of functions that are square integrable together with their first derivatives and vanish at the endpoints of the interval. Its norm is stronger than the mean-square norm in the sense that any sequence convergent in the energy norm converges in the mean-square norm, but not conversely.

For our purposes the concept of energy space is not quite suitable because it incorporates the essential boundary conditions, and our trial functions do not satisfy them. Instead we start with a reflexive Banach space (the reader may assume it to be Hilbert without much loss) that has nothing to do with the boundary conditions, and a convex functional on it. Next, we introduce the boundary operator, a linear map that sends functions to their boundary values; its kernel consists of the functions that satisfy the boundary conditions. In our example the space is the Sobolev space of the interval without boundary conditions, the functional is the one from (1), and the boundary operator sends a function to the pair of its endpoint values. The following three assumptions turn this setup into a generalized analog of the energy space:

(1) the functional is convex and continuous;
(2) the functional is weakly coercive on the kernel of the boundary operator;
(3) the boundary operator is linear and continuous.

This setup applies to homogeneous boundary conditions only. Nonhomogeneous conditions can be accommodated by selecting a function that satisfies them and switching to the differences with it. These differences solve the corresponding homogeneous problem, and all convergence issues reduce to them; see, for example, [20, 2.1].

We now turn to the trial functions. Recall that a system of elements in a Banach space is called complete if any element of the space can be approximated by their finite linear combinations in the norm of the space. Fix a complete system of trial functions and consider the linear spans of its initial segments. The Ritz-Lagrange approach amounts to minimizing the functional on such a span subject to the boundary conditions, which is equivalent to minimizing it on the intersection of the span with the kernel of the boundary operator. Although, by the completeness assumption, all elements of the space can be approximated by linear combinations of the trial functions, it is not a priori clear that functions satisfying the boundary conditions can be approximated by linear combinations that themselves satisfy them. The next lemma, proved in the Appendix, assures us that this is the case.

Lemma 1. For any complete system of elements in the ambient space there exists a system of their finite linear combinations that satisfy the boundary conditions and are complete in the subspace of functions satisfying them.

This lemma effectively reduces the Ritz-Lagrange method to the traditional Ritz method. Indeed, if we take the complete system produced by Lemma 1 as the trial functions, then the Ritz method amounts to minimizing the functional on spans of functions that already satisfy the boundary conditions. In other words, the Ritz-Lagrange method with the original trial functions produces the same approximations (up to reindexing) as the Ritz method with the modified system. This allows us to use well-known results on convergence of the Ritz method [15, 6.2A], [16, IV.12.4], and [21, 42.5] to prove convergence of its Ritz-Lagrange generalization. Producing the Ritz system involves differentiating the functional, so in addition to convexity and continuity we also have to assume its Gateaux differentiability [16, I.2.1] and [15, 3.2].
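The construction behind Lemma 1 can be illustrated concretely for the first example: one boundary functional at a time, subtract from each trial function a multiple of a fixed one so that the resulting combinations are annihilated by that functional. The sketch below uses assumed notation (the helper remove_functional and the interval (0, 1) are illustrative, not the Appendix's exact construction).

```python
# Form finite combinations of cosines that vanish at x = 0 and x = 1,
# treating one boundary functional at a time (sketch of Lemma 1's idea).
import sympy as sp

x = sp.symbols('x')
phis = [sp.Rational(1, 2)] + [sp.cos(k * sp.pi * x) for k in range(1, 6)]

def remove_functional(system, ell):
    """Assuming ell does not vanish on system[0], return combinations of the
    remaining functions that ell annihilates."""
    phi0, rest = system[0], system[1:]
    return [phi - (ell(phi) / ell(phi0)) * phi0 for phi in rest]

ell0 = lambda v: v.subs(x, 0)          # boundary functional: value at x = 0
ell1 = lambda v: v.subs(x, 1)          # boundary functional: value at x = 1
psis = remove_functional(remove_functional(phis, ell0), ell1)
print([(p.subs(x, 0), p.subs(x, 1)) for p in psis])   # every pair is (0, 0)
```

Each output pair is exactly (0, 0), confirming that the combinations satisfy both essential conditions while remaining finite combinations of the original cosines.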

Theorem 2. Let the functional be convex, continuous, and Gateaux differentiable, let the boundary operator be a bounded linear operator, and let the trial functions form a complete system in the ambient space. If the functional is weakly coercive on the subspace of functions satisfying the boundary conditions, then it has a minimizer on that subspace, as well as minimizers on its intersections with the spans of the trial functions, and a subsequence of the latter converges weakly to a minimizer on the subspace. Moreover, if the functional is strictly convex on the subspace, then the minimizers are unique and the entire sequence converges weakly. In both cases the values of the functional at the approximate minimizers converge to its minimum on the subspace.

The proof is fairly standard, but we outline it in the Appendix for the convenience of the reader. For general convex functionals only weak convergence can be expected; we discuss a stronger convexity assumption that guarantees convergence by norm in the next section. Note also that weak convergence in the Sobolev space implies convergence in weaker norms due to the Sobolev embedding theorems [20, I.6] and [22, 1.8].

Theorem 2 mostly justifies the Ritz-Lagrange method used to solve our example. The required properties of the functional and of the boundary operator are easily verified, except for the strict convexity, which follows from the Poincaré-Friedrichs inequality [20, I.6]. The completeness is a trickier issue: completeness of the cosines in the mean-square sense follows from the standard theorems on cosine series [19, Ch.12], but we need the stronger completeness in the Sobolev norm. Fortunately, completeness of the cosines in this norm reduces to completeness of the sines. The minimality below means that the system becomes incomplete after deleting any single function.

Lemma 3. The system of cosines is complete and minimal in the first-order Sobolev space on the interval.

The proof is given in the Appendix.

As emphasized in [18], completeness is not a mere technicality in this context; it imposes a practical restriction on the choice of trial functions. To underscore the point, consider the biharmonic equation with essential boundary conditions at both endpoints, with the data again chosen so that the exact solution is explicitly computable. If the cosine system is used again and we proceed as above, this time the trial solutions do not converge to the exact one; in fact the limit of the trial solutions can be computed explicitly and differs from the exact solution. This is because the cosines are incomplete in the second-order Sobolev space. As observed in [18], the second derivatives of the cosines do not include a constant and therefore can not approximate the second derivative of the exact solution in the mean-square sense. But then the cosines can not approximate the exact solution in the second-order Sobolev norm, since that norm incorporates the norm of the second derivatives. In [18] the authors add one extra function to the cosine system, but, as we just saw, the limit difference is a linear combination of two functions, so at least one more function needs to be added. We show in the Appendix that nothing else is needed.

Lemma 4. The cosine system augmented by the two extra functions just mentioned is complete and minimal in the second-order Sobolev space on the interval.

Note that verifying completeness in the correct space can not be avoided even if one uses the usual Ritz method with trial functions satisfying the essential boundary conditions. The second example demonstrates that solutions can not always be approximated in the space where the trial functions happen to be complete; only completeness in the norm dictated by the variational functional counts. Completeness of trial functions in a norm weaker than the energy norm does not simply weaken the convergence to the exact solution; the trial solutions may not converge to it at all.

After adding the two extra functions to the trial system, the trial solutions acquire two additional terms and the Lagrange functional changes accordingly. The Ritz-Lagrange system now has two extra variables and two extra equations. Solving it, we find that the resulting approximation matches the exact solution, as expected.
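The completeness effect can be reproduced numerically. The sketch below solves a fourth-order model problem with assumed data (u'''' = 1 on (0, 1) with essential conditions u(0) = u(1) = 0 and exact solution (x^4 - 2x^3 + x)/24, which is not the paper's example), first with cosines alone and then with the cosine system augmented by x^2 and x^3; this particular augmentation is likewise an assumption standing in for the completed system of Lemma 4.

```python
# Effect of (in)completeness on a fourth-order Ritz-Lagrange problem.
# Assumed data: u'''' = 1 on (0, 1), u(0) = u(1) = 0 essential, natural
# conditions handled by the functional; exact solution (x^4 - 2x^3 + x)/24.
import sympy as sp

x = sp.symbols('x')
exact = (x**4 - 2 * x**3 + x) / 24

def ritz_lagrange(basis):
    """Minimize int(u''^2/2 - u)dx over span(basis) subject to u(0)=u(1)=0."""
    a = sp.symbols(f'a0:{len(basis)}')
    lam0, lam1 = sp.symbols('lambda0 lambda1')
    u = sum(ai * phi for ai, phi in zip(a, basis))
    J = sp.integrate(sp.diff(u, x, 2)**2 / 2 - u, (x, 0, 1))
    L = J + lam0 * u.subs(x, 0) + lam1 * u.subs(x, 1)
    eqs = [sp.diff(L, ai) for ai in a] + [u.subs(x, 0), u.subs(x, 1)]
    return u.subs(sp.solve(eqs, list(a) + [lam0, lam1]))

cosines = [sp.Rational(1, 2)] + [sp.cos(k * sp.pi * x) for k in range(1, 5)]
for basis in (cosines, cosines + [x**2, x**3]):
    u_trial = ritz_lagrange(basis)
    err = abs(float((u_trial - exact).subs(x, sp.Rational(1, 2))))
    print(len(basis), 'trial functions, error at x = 1/2:', err)
```

With cosines alone the midpoint error stays of the order of the solution itself no matter how many modes are kept, while the augmented basis brings it down by orders of magnitude, in line with the discussion above.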

3. The Ritz-Lagrange Method in Higher Dimensions

The Ritz-Lagrange method described in Section 2 can not be applied directly to multidimensional boundary value problems. In this section we develop a suitable generalization and prove that it works. The main distinction is that the boundary operator no longer maps into a finite-dimensional space. Indeed, in dimensions two and higher the boundary values are not arrays of numbers but functions on the boundary, forming an infinite-dimensional space. The induction proof of the key Lemma 1 no longer works, and its claim is itself false: it is easy to find complete systems of functions with no (finite) nontrivial linear combinations satisfying the boundary conditions. If we are committed to using arbitrary complete systems of trial functions, we must find a way to form their linear combinations that satisfy the essential boundary conditions "approximately."

To this end we will use a complete system of linear functionals on the space of boundary values, that is, elements of its dual space (as before, the reader may assume the space to be Hilbert, in which case the dual can be identified with the space itself). Pairing the boundary values with the first finitely many of these functionals defines a truncated boundary operator with a finite-dimensional image, and requiring those pairings to vanish cuts out a subspace containing the functions that satisfy the boundary conditions exactly; we can thus think of the truncated operators as approximations to the boundary operator and of the corresponding subspaces as approximations to its kernel. Assuming the boundary operator is continuous, the truncated operators are continuous as well, and we can apply Theorem 2 with each of them in place of the boundary operator. This gives us, for each truncation, a sequence of approximations converging to a minimizer of the functional on the corresponding enlarged subspace. The remaining question is whether we can count on these minimizers to approximate the overall minimizer on the subspace of functions satisfying the boundary conditions exactly. Before proceeding, let us describe the approximating procedure that our approach suggests.

Multidimensional Ritz-Lagrange Method. To minimize a functional subject to essential boundary conditions, select internal trial functions and boundary weight functions, keeping the number of boundary weights below the number of internal trial functions. A Ritz-Lagrange trial solution is obtained by solving the system of equations for the trial coefficients and the multiplier coefficients, consisting of the internal equations (the derivatives of the Lagrange functional with respect to the trial coefficients set to zero) and the boundary equations (the boundary values of the trial solution paired with each boundary weight set to zero), where the Lagrange multiplier is expanded in the boundary weights with unknown coefficients.
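As a sketch of the bookkeeping, for a quadratic functional the internal and boundary equations assemble into one bordered linear system. In the snippet below the routines a_form, f_form, and b_form are assumed user-supplied quadrature rules for the energy form, the load, and the pairing of boundary weights with trial traces; they are not part of the paper.

```python
# Bordered system of the multidimensional Ritz-Lagrange method for a
# quadratic functional F(u) = (1/2) a(u, u) - f(u) (notation assumed).
import numpy as np

def ritz_lagrange_system(phis, weights, a_form, f_form, b_form):
    """phis: internal trial functions; weights: boundary weight functions;
    a_form/f_form/b_form: assumed quadrature routines for the bilinear form,
    the load functional, and the boundary pairing."""
    N, M = len(phis), len(weights)
    assert M < N, 'keep fewer boundary weights than internal trial functions'
    A = np.array([[a_form(p, q) for q in phis] for p in phis])    # N x N
    F = np.array([f_form(p) for p in phis])                       # N
    B = np.array([[b_form(w, p) for p in phis] for w in weights]) # M x N
    # Internal equations: A c + B^T mu = F;  boundary equations: B c = 0.
    K = np.block([[A, B.T], [B, np.zeros((M, M))]])
    sol = np.linalg.solve(K, np.concatenate([F, np.zeros(M)]))
    return sol[:N], sol[N:]           # trial coefficients c, multipliers mu
```

The block structure makes the rule of thumb visible: the lower-left block must have full row rank for the system to be solvable, which already requires fewer boundary equations than internal unknowns.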

A justification of our approach is based on Theorem 6. The reader not interested in the justification may skip the rest of this section and look at the applications in the next one. Even in the simplest cases we can not expect the approximations to converge to the exact minimizer in the same generality as in Theorem 2. The root cause is that the minimizer is now approximated by elements that do not satisfy the boundary conditions, which is why we need the functional to be well-behaved on the entire space and not just on the constrained subspace, something avoidable if the trial functions do satisfy the boundary conditions. In particular, the values of the functional at the intermediate minimizers are potentially smaller than the constrained minimum because they are obtained by minimizing over larger subspaces. As a consequence, standard properties of convex functionals, which we relied on in Theorem 2, no longer guarantee convergence of the values to the minimum if the minimizers converge only weakly (in technical terms, convex functionals are weakly lower semicontinuous but not necessarily weakly upper semicontinuous [16, III.8.5] and [17, 41.2]).

To make our proof work we need to assume a stronger form of convexity. For Gateaux differentiable functionals convexity is equivalent to monotonicity of their derivatives, that is, to the nonnegativity of the pairing of the difference of derivatives at two points with the difference of those points [16, II.5.3]; this generalizes the familiar fact that convex functions have monotone derivatives. We will need a form of uniform monotonicity, compare [16, VI.18.6] and [21, 25.3]: in condition (8) the monotonicity pairing is bounded from below by a continuous monotone increasing function of the norm of the difference that vanishes only at zero. The point is that if the monotonicity pairing tends to zero along a sequence, then so does the norm of the difference, and hence the sequence converges by norm. The next lemma answers in the affirmative the question about convergence of the intermediate minimizers under the uniform monotonicity assumption.

Lemma 5. Let the functional be convex and Gateaux differentiable, and let the boundary operator be a bounded linear map. Let the boundary functionals form a complete system in the dual of the space of boundary values, and consider the subspaces cut out by the first finitely many of them. If the functional is weakly coercive and uniformly monotone on one of these subspaces, then it has a minimizer on the subspace of functions satisfying the boundary conditions exactly, as well as minimizers on all subsequent approximating subspaces, and these minimizers converge to it in norm.

Uniform monotonicity also allows us to improve on Theorem 2 by replacing weak limits with strong limits leading to our main result.

Theorem 6. Let the functional be convex, continuous, and Gateaux differentiable, and let the boundary operator be a bounded linear map. Let the trial functions form a complete system in the ambient space, and let the boundary functionals form a complete system in the dual of the space of boundary values; consider the subspaces cut out from the ambient space, and from the spans of the trial functions, by the first finitely many boundary functionals. Suppose that the functional is weakly coercive and uniformly monotone on one of the cut-out subspaces. Then, from that subspace on, the functional has unique minimizers on each cut-out subspace and on its finite-dimensional sections; as the number of trial functions grows, the finite-dimensional minimizers converge to the subspace minimizer in the norm of the ambient space, and the corresponding values of the functional converge to its minimum over the subspace.

In examples it is typical that the functional does not satisfy (8) on the entire space but does satisfy it on subspaces much larger than the one cut out by the boundary conditions. Let us discuss the case of quadratic functionals in more detail. By direct calculation, for a quadratic functional the monotonicity pairing of the derivative reduces to the underlying quadratic form evaluated on the difference of the two points; therefore, (8) holds exactly when the quadratic form is strictly positive definite. The multidimensional analog of the functional from our first example involves the Dirichlet form, the integral of the squared gradient over a domain with smooth boundary. It follows from the Poincaré-Friedrichs inequality [20, I.6] and [23] that this form is strictly positive definite on the subspace of functions vanishing on the boundary, but it certainly is not on the entire space, since it annihilates the constants. Nevertheless, it still follows from the calculus of variations that the form satisfies (8) on any subspace complementary to the subspace of constants; see, for example, [24, VI.1]. Similar considerations apply to other quadratic forms related to strongly elliptic equations, like the biharmonic equation: they are usually strictly positive definite on complements of the finite-dimensional subspaces that they annihilate [23, 22.11]. Such conditions are sometimes called Ker-ellipticity [7, 12].
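For the quadratic case the calculation mentioned above is short; in generic notation (the symbols F, a, f are not necessarily the paper's):

```latex
% Quadratic case in generic notation: F(u) = (1/2) a(u,u) - f(u) with a
% symmetric bounded bilinear form a and a bounded linear functional f.
\[
  \langle F'(u),\,h\rangle = a(u,h) - f(h),
  \qquad
  \langle F'(u)-F'(v),\,u-v\rangle = a(u-v,\,u-v),
\]
% so the uniform monotonicity condition (8) for F amounts to strict positive
% definiteness of the form a on the subspace under consideration.
```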

It should be emphasized that Theorem 6 does not imply that the double sequence of finite-dimensional minimizers converges to the exact minimizer; that is, the repeated limit in it can not be replaced by the double limit. In fact, suppose the number of boundary weights is at least the number of trial functions and the boundary equations are linearly independent on the trial span. Then the boundary equations alone force the trial solution to vanish, no matter how large both numbers are. In practice, this means that one should always take many more internal trial functions than boundary weights; hence the prescription above. This way, for a large number of trial functions, the approximation will be close to the minimizer over the subspace cut out by the boundary weights, which in turn will be close to the exact minimizer if the number of boundary weights is itself large enough. Readers familiar with nonconforming Finite Element Methods will recognize this as a reflection of the Ladyzhenskaya-Babuška-Brezzi type condition. One of its consequences is that the mesh size on the boundary has to be larger than in the interior, yielding a smaller number of boundary elements [7, 11]. A modern approach can be found in [25] for finite elements and in [14] for Galerkin approximations.

As in the one-dimensional examples, one will have to verify completeness of the trial functions in the appropriate space; the same applies to the boundary weights as well. One has to be extra careful with functionals involving higher-order derivatives because the values of functions and of their derivatives have to be approximated simultaneously. Natural spaces to use are the Sobolev spaces of functions with integrable powers of a given exponent along with all of their derivatives up to a given order. A generalization of the Weierstrass theorem implies that polynomials form a complete system in such a space for any exponent, any bounded domain, and any order of derivatives (in fact, polynomials are even uniformly complete [24, II.4.3]). However, polynomials may not always be convenient in a particular problem. The following lemma can be useful in finding other complete systems.

Lemma 7. Let two systems of functions be complete in the Sobolev spaces over two bounded domains, respectively. Then the system of their pairwise products is complete in the corresponding Sobolev space over the product of the domains.

If one starts from one-dimensional systems the lemma will only produce complete systems in box-like domains . However, any system of functions complete on a domain will be complete on any of its subdomains, so for an arbitrary domain one can always use a system complete on the smallest box that contains it. A more targeted choice is to take eigenfunctions of an operator on the same domain that is simpler than the one involved but is somewhat similar to it. Various spectral theorems often ensure completeness of eigenfunctions in suitable Sobolev spaces [23, 22.11a].

4. Multidimensional Examples

In this section we illustrate the multidimensional Ritz-Lagrange method developed in Section 3 by applying it to some typical problems. Since calculations by hand quickly become intractable we performed them using a computer algebra system.

Consider a boundary value problem for the Laplace equation on the unit disk, with a homogeneous boundary condition and a prescribed right-hand side. This equation describes the transverse deflection of a membrane fixed everywhere at the boundary and subjected to a given pressure [24, IV.10.3]. The pressure profile was chosen so that the problem has an analytic solution which is not a polynomial. Specifically, the exact solution can be represented as the rapidly convergent series (10), whose coefficients involve the Euler-Mascheroni constant and the cosine integral.

To solve the problem we use the multidimensional Ritz-Lagrange method. A variational formulation is to minimize the functional giving the total potential energy of the membrane, subject to the boundary condition. In the notation of Section 3, the boundary operator is the restriction of a function to the boundary circle, and it is continuous for a suitable choice of the space of boundary values. Our internal trial functions are the monomials in the two coordinates, which obviously do not satisfy the boundary condition, and the trial solution is their linear combination with unknown coefficients; note that the single index of Section 3 is replaced here by a double index, and the number of internal trial functions is counted accordingly. As the boundary weight functions we choose piecewise linear ones on uniform partitions of the boundary circle. Unlike some circle-specific choices, for example, the trigonometric functions, such weights can be used on a wide variety of boundaries. Instead of using a single indexed system it is convenient to split it into the constant terms and the linear terms on the segments of the partition, so the number of boundary weight functions, denoted by a single letter in Section 3, is proportional to the number of segments. The Lagrange multiplier is expanded in the boundary weights, and the Lagrange functional is formed as in Section 3. The unknown trial coefficients and multiplier coefficients are determined from the system of internal and boundary equations.
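A self-contained numerical sketch of this setup is given below, with assumed data: the load is taken to be 1 (so the exact solution is (1 - x^2 - y^2)/4 and the center deflection is 1/4), the boundary weights are periodic hat functions of the polar angle, and all integrals are done by simple quadrature. None of these choices reproduces the paper's exact load or weight construction; the sketch only mirrors the structure of the method.

```python
# Membrane sketch with assumed data: -Laplace(u) = 1 on the unit disk, u = 0
# on the boundary; internal trial functions are monomials x^i y^j (i+j <= d),
# boundary weights are periodic hat functions of the polar angle.
import numpy as np

d, M = 6, 8                                   # polynomial degree, hat functions
powers = [(i, j) for i in range(d + 1) for j in range(d + 1 - i)]
N = len(powers)
assert M < N                                  # rule of thumb from Section 3

# Midpoint quadrature grid in polar coordinates for the disk integrals.
nr, nt = 200, 400
dr, dth = 1.0 / nr, 2 * np.pi / nt
r = (np.arange(nr) + 0.5) * dr
t = np.arange(nt) * dth
R, T = np.meshgrid(r, t, indexing='ij')
X, Y = R * np.cos(T), R * np.sin(T)
dA = R * dr * dth                             # area weight per quadrature node

def mono(i, j, x, y):    return x**i * y**j
def mono_dx(i, j, x, y): return i * x**(i - 1) * y**j if i else 0 * x
def mono_dy(i, j, x, y): return j * x**i * y**(j - 1) if j else 0 * x

def hat(m, theta):
    """Periodic piecewise-linear weight centered at the m-th partition node."""
    s = (theta - 2 * np.pi * m / M) % (2 * np.pi)
    s = np.where(s > np.pi, s - 2 * np.pi, s)
    return np.maximum(0.0, 1.0 - np.abs(s) * M / (2 * np.pi))

# Internal equations: stiffness A and load F; boundary equations: matrix B.
A = np.zeros((N, N)); F = np.zeros(N); B = np.zeros((M, N))
for p, (i, j) in enumerate(powers):
    F[p] = np.sum(mono(i, j, X, Y) * dA)                       # load f = 1
    for q, (k, l) in enumerate(powers):
        grad = (mono_dx(i, j, X, Y) * mono_dx(k, l, X, Y)
                + mono_dy(i, j, X, Y) * mono_dy(k, l, X, Y))
        A[p, q] = np.sum(grad * dA)
xb, yb = np.cos(t), np.sin(t)                                   # boundary circle
for m in range(M):
    for q, (k, l) in enumerate(powers):
        B[m, q] = np.sum(hat(m, t) * mono(k, l, xb, yb)) * dth

K = np.block([[A, B.T], [B, np.zeros((M, M))]])
c = np.linalg.solve(K, np.concatenate([F, np.zeros(M)]))[:N]
u0 = sum(c[q] * mono(k, l, 0.0, 0.0) for q, (k, l) in enumerate(powers))
print('approximate center deflection:', u0, ' (exact value 1/4)')
```

The printed center deflection should land near the exact value 1/4, with the residual error reflecting both the finite polynomial degree and the weak (weighted) enforcement of the boundary condition.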

The relative errors of the Ritz-Lagrange solutions versus the exact solution (10) are shown in Table 1 as percentages of the maximum deflection, attained at the center of the disk. They are quite small considering the modest number of coefficients that has to be determined in each case. Note that we always keep the number of boundary weights below the number of trial functions, as recommended in the description of the method, to ensure that the system matrices have full rank and are invertible.

Our next example involves a fourth-order equation. Consider the problem of bending a uniformly loaded, isotropic, square plate of constant thickness, unit stiffness, and unit edge length, simply supported on all sides (SS-SS-SS-SS). Simply supported means that the displacement vanishes on the boundary of the plate. We do not need to list the natural boundary conditions since the variational formulation incorporates them automatically. The variational functional giving the potential energy of the plate is given in (12) [24, IV.10.3]; it involves the displacement of the plate, the Poisson ratio, and a distributed load.

The Euler-Lagrange equation induced by (12) is the biharmonic equation; the terms multiplied by the Poisson ratio form a divergence and only affect the natural boundary conditions. For the trial functions we choose products of cosines in the two coordinates, so that the trial solution is a double cosine sum with unknown coefficients. Obviously, the trial functions do not satisfy the boundary condition. In the Lagrange functional it is convenient to split the Lagrange multiplier into its restrictions to the four edges of the plate. This way the set of boundary weight functions can be represented as the union of four sets selected separately for each edge, each entering with its own unknown multiplier coefficients. The number of internal equations is again the number of trial coefficients, the number of boundary equations is four times the number of weights per edge, and the nondegeneracy condition requires the former to exceed the latter.

The exact solution to this problem can be expressed as a rapidly convergent series; we use its first ten terms to calculate the errors. We do not tabulate them here because they are very large (up to 70%), and increasing the number of terms does not improve the approximation. At this point this is not surprising, since we are dealing with the same completeness issue as in the second example of Section 2. As we know, the system of cosines is incomplete on the interval, so the system of their products is naturally incomplete on the product of intervals that represents the plate. A more intuitive explanation is that products of cosines have vanishing normal derivatives on all edges of the plate, and all their linear combinations inherit this property. This does not matter for second-order equations, because the normal derivative is a discontinuous functional on the relevant spaces, but it does matter for higher-order equations like the biharmonic equation.

The choice of cosine products unwittingly enforces an additional boundary condition: the vanishing of the normal derivative on the boundary. We therefore ended up solving a different variational problem. Together with the vanishing of the displacement on the boundary this describes, physically, a plate clamped on all sides (C-C-C-C) rather than a simply supported one. One can also check that in the weak formulation the boundary terms that multiply the variation of the solution's derivatives are removed because of the vanishing normal derivatives. Thus, we should be comparing our Ritz-Lagrange solutions to the answers for the C-C-C-C plate. Unfortunately, an analytic solution for a plate clamped on all sides is not known, but one can use the values obtained numerically in [26, VI.44] to make the comparison. The relative errors as percentages of the maximum deflection at the center of the plate are shown in Table 2.

To solve the original problem, we just need to complete the system of cosine products. By Lemma 7 it suffices to complete the cosines on the interval, as in Lemma 4, and take the products of the completed one-dimensional system as the new trial functions, keeping the rest of the above setting intact. The relative errors for the Ritz-Lagrange solutions with the completed system against the known series solution [2, 8.2.4] are shown in Table 3 and demonstrate the validity of the method.

As a final demonstration, we apply the Ritz-Lagrange method to a boundary eigenvalue problem for square plates. The eigenmodes describe standing vibrations of a plate, and their zeros (nodal curves) are known as Chladni figures [27, 5.1]. The problem has attracted a lot of attention from both analytic and numerical viewpoints; indeed Ritz himself applied his method to it in his original paper.

Boundary eigenvalue problems are somewhat beyond the scope of the theory in Section 3, which deals with linear constraints only. Under the Rayleigh-Ritz approach to solve for the eigenmodes one needs to impose an additional normalization constraint [23, 18.5], [24, VI.1.1], and [27, 5.2], which is quadratic. However, the general approach of Section 3 remains valid, and one can justify applying the Ritz-Lagrange method to problems with nonlinear constraints along the same lines.

Consider an isotropic square plate of constant thickness and unit edge length, simply supported on all sides. The potential energy of the plate is given by (12) without the distributed load term at the end. The boundary eigenvalue problem can be interpreted as finding the extrema of this energy subject to the boundary condition of vanishing displacement on the boundary and the quadratic normalization constraint on the eigenmodes.

Compared to (12) the Lagrange functional acquires an additional term and an additional equation, which amounts to the normalization constraint on the eigenmodes. When solving for trial solutions one can ignore this equation and use standard methods for finding eigenvectors instead. We keep the same choices of trial functions and boundary weights as before. Collect the internal coefficients into one vector and the boundary (multiplier) coefficients into another. In terms of these vectors the Lagrange functional can be conveniently represented through a stiffness matrix and a mass matrix obtained by integrating the internal trial functions, see [2, 8.2.7], and a boundary matrix obtained by integrating the boundary weights against the trial functions. The boundary matrix can be obtained by multiplying the boundary equations by the corresponding Lagrange multiplier functions and extracting the coefficients of the two vectors after the integration; the boundary equations themselves state that the boundary matrix annihilates the vector of internal coefficients. Finally, differentiating the Lagrangian with respect to the two vectors, we are led to a generalized eigenvalue problem with a bordered (block) coefficient matrix. For this eigenvalue problem to be solvable the boundary matrix must have maximal rank, which is ensured by the nondegeneracy condition on the numbers of trial functions and boundary weights. The eigenvalues approximate the squares of the dimensionless natural frequencies of the plate's vibrations.
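Schematically, the resulting constrained eigenproblem can be solved as in the sketch below. The stiffness, mass, and boundary matrices are random placeholders standing in for the assembled ones, and the null-space reduction used here is a standard equivalent alternative to solving the bordered pencil directly (it avoids the infinite eigenvalues contributed by the multiplier block).

```python
# Constrained eigenproblem from the Ritz-Lagrange discretization:
#   K c + B^T mu = lam * Mmat c,   B c = 0,
# solved by restricting to ker B. K, Mmat, B are assumed assembled elsewhere;
# random SPD/full-rank placeholders are used so the snippet runs.
import numpy as np
from scipy.linalg import eigh, null_space

rng = np.random.default_rng(0)
N, M = 12, 4                                   # trial functions, boundary weights
S = rng.standard_normal((N, N))
K = S @ S.T + N * np.eye(N)                    # placeholder stiffness (SPD)
S = rng.standard_normal((N, N))
Mmat = S @ S.T + N * np.eye(N)                 # placeholder mass (SPD)
B = rng.standard_normal((M, N))                # placeholder boundary matrix

Z = null_space(B)                              # columns span {c : B c = 0}
lam, q = eigh(Z.T @ K @ Z, Z.T @ Mmat @ Z)     # constrained eigenvalues, ascending
modes = Z @ q                                  # internal coefficient vectors
print('smallest constrained eigenvalues:', lam[:4])
```

The finite eigenvalues of the bordered pencil coincide with the eigenvalues computed here, so either route yields the same approximate squared frequencies.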

For a particular choice of the numbers of internal trial functions and boundary weights we obtain a set of approximate natural frequencies, the first nine of which are shown below. Since the eigenmodes are known to be indexed by a pair of integers, we change the single-index notation to a double index and arrange the frequencies in a square pattern. The exact values, arranged in the same pattern, are taken from [2, 8.2.4]; the repeated frequencies correspond to multiple eigenvalues with eigenmodes symmetric along different axes. One can see that the estimated frequencies are slightly lower than the exact ones. This is in contrast with the classical Ritz method, where the estimated frequencies are always higher. From a physical viewpoint, the latter happens because replacing an infinite system with a finite one is equivalent to imposing additional constraints, which tend to raise the stiffness of the system and hence the frequencies. This assumes, however, that all the boundary constraints are enforced in both systems, that is, that the trial functions satisfy the essential boundary conditions.

In the Ritz-Lagrange method the trial functions do not satisfy the essential conditions, and even the trial solutions are forced to satisfy them only approximately. In other words, we are effectively relaxing the boundary constraints in addition to imposing additional ones through discretization. This relaxation lowers the frequencies (because a plate with fewer constraints is less stiff) and counteracts the effects of discretization. If we were able to impose the boundary conditions everywhere along the boundary the estimated frequencies would have been higher than the exact ones just as in the usual Ritz method.

From a mathematical viewpoint, this effect is also natural since the eigenvalues are the minima of a quadratic functional on subspaces of the original space [24, VI.1.1]. In the Ritz-Lagrange method we approximate them by using functions from a larger space (by relaxing the boundary conditions), thus lowering the minima that can be attained. In particular, one can see from the proof of Lemma 5 that the values of the functional at the approximating minimizers are potentially smaller than at the true minimizer.

5. Conclusions

We developed a general extension of the Ritz method to systems of trial functions that do not satisfy the essential boundary conditions and proved its convergence. Our approach is based on treating the essential conditions as variational constraints and removing them using Lagrange multipliers, closely following the intuition behind the naive use of Lagrange multipliers. It is also more elementary than the Babuška-Brezzi saddle point formulation and leads to more transparent convergence conditions in terms of completeness of trial functions and boundary weights in appropriate spaces. We list some general observations on the workings of the method.

(i) The variational functional has to be well-behaved not only on the energy space of the problem but also on its extension that contains the trial functions. Sufficient good behavior is provided by a strong form of convexity, which in the case of quadratic functionals means that the boundary value problem is strongly elliptic.

(ii) The systems of trial functions must be complete in the norm consistent with the functional, which is usually an extension of the energy norm of the problem to a larger space containing the trial functions. Although a similar requirement applies to the classical Ritz method, it is much easier to encounter systems that appear complete but are not, due to effects at the boundary.

(iii) The Lagrange multipliers have to be treated as additional variables in the approximating systems. They can not be eliminated by substituting the trial solutions into the variational formulas expressing the multipliers in terms of the exact solution; these formulas are discontinuous in the relevant norms.

(iv) In multidimensional problems the boundary conditions incorporate infinitely many constraints, and to obtain a finite-dimensional approximating system one has to select boundary weight functions in addition to the trial functions. The number of trial functions has to be significantly larger than the number of boundary weights; otherwise the approximating system may be inconsistent or may only have the trivial solution.

(v) In multidimensional problems the approximating values of the functional may approach the exact value from below rather than from above, in contrast to the classical Ritz method, because the minimization takes place on a larger space of functions not satisfying the boundary conditions exactly.

(vi) The method can be applied to boundary eigenvalue problems interpreted (similarly to the Rayleigh-Ritz approach) as minimization problems on subspaces of the original space with an additional normalization constraint. Due to the presence of the Lagrange multiplier variables the resulting finite-dimensional problem is a generalized eigenvalue problem rather than an ordinary one. In multidimensional vibrational problems the approximate eigenfrequencies obtained in this way may be lower than the exact ones (for the Ritz method they are always higher), due to the relaxation of the boundary constraints.

As is well known [2, 4], the Ritz method leads to the same approximating systems as the Galerkin method, but the latter can also be applied to nonoptimization problems. It would be interesting to see whether one can develop a "Galerkin-Lagrange method" without resorting to the saddle point formulation of Babuška-Brezzi. We believe that analogs of Theorem 6 can be proved for nonvariational equations with monotone operators using the approach of [16, VII.23.6] or [21, 26.2].

Appendix

Proofs

Proof of Lemma 1. Let a complete system in the ambient space be given. Since the boundary operator has a finite-dimensional image, we can represent it by finitely many bounded linear functionals. Assume without loss of generality that they are linearly independent; otherwise some of them can be dropped without changing the kernel of the boundary operator. Consider the nested subspaces obtained by annihilating the first few of these functionals; we construct a complete system of linear combinations in each of them by induction on their number. Since there are finitely many functionals, the process concludes in finitely many steps.
In the base step we must produce a complete system of linear combinations annihilated by the first functional. Without loss of generality the first trial function is not annihilated by it, since the system is complete and the functional can not vanish on all of its members. We claim that subtracting from each remaining trial function a suitable multiple of the first one produces the desired system. Indeed, given an element annihilated by the functional and a linear combination of the trial functions approximating it, the corresponding combination of the corrected functions approximates it as well: the correction term is proportional to the value of the functional on the approximating combination, which is small because the functional is bounded and vanishes on the element being approximated. Since the element and the accuracy are arbitrary, completeness follows.
For the induction step, let a complete system in the current subspace be given from the preceding step. Linear independence of the functionals guarantees that the next functional does not vanish on some member of this system, which we may as well take to be the first one. Apply the process above with the trial functions replaced by the current system and the functional replaced by the next one. The resulting functions are linear combinations of the current system (and hence of the original trial functions), are annihilated by the next functional as well, and are complete in the smaller subspace by the same argument. This concludes the induction step.

Proof of Theorem 2. A standard argument from convex analysis shows that if the functional is weakly coercive on the constrained subspace, then it has minimizers on that subspace and on each of its finite-dimensional sections, and the finite-dimensional minimizers contain a weakly convergent subsequence [15, 6.2] and [16, III.10.3]; moreover, the corresponding values of the functional are bounded below by the constrained minimum. For a large enough section there is, by Lemma 1, an element arbitrarily close to a minimizer of the functional on the subspace. Since the functional is continuous, its value at that element is arbitrarily close to the minimal value; this value can not be smaller than the value at the minimizer of the section, so the values at the finite-dimensional minimizers converge to the minimum. Convex functionals are weakly lower semicontinuous [16, III.8.4], so after passing to the limit the weak limit of the subsequence attains the minimum; that is, it is a minimizer of the functional on the subspace.
If we assume additionally that the functional is strictly convex on the subspace, then the minimizer is unique and the entire sequence of finite-dimensional minimizers (which is now also uniquely defined) converges to it at least weakly [15, 6.2A].

The next two proofs use equivalent norms (inner products) on the first- and second-order Sobolev spaces of the interval, respectively. Two norms are equivalent if they define the same notion of convergence; for equivalent norms on Sobolev spaces see [20, I.8] and especially [22, 1.9].

Proof of Lemma 3. We use an inner product equivalent to the usual one on the first-order Sobolev space. To prove completeness it suffices to show that any function orthogonal to all cosines in this inner product must vanish. For such a function, orthogonality to the cosines of nonzero frequency implies that its derivative is orthogonal to all the corresponding sines. Since the sines form an orthogonal basis in the space of square integrable functions, the derivative must vanish almost everywhere; the Fundamental Theorem of Calculus, together with orthogonality to the constant cosine, then forces the function itself to vanish, establishing completeness. Being an orthogonal basis in the space of square integrable functions, the cosines must be minimal there, and therefore in any space with a stronger norm, which includes the Sobolev space.

Proof of Lemma 4. We use an inner product equivalent to the usual one on the second-order Sobolev space. Consider a function orthogonal to all cosines; because all sines vanish at the endpoints, the orthogonality relations reduce to relations for its second derivative. In particular, the second derivative is orthogonal to all cosines of nonzero frequency. But the orthogonal complement of the latter in the space of square integrable functions consists of constants, so the second derivative is constant and the function is a quadratic polynomial. Examining the remaining orthogonality relations shows that such a function must be a linear combination of the two added functions. Thus the orthogonal complement of the cosines is spanned by the two added functions, proving completeness.
For minimality notice that, by direct calculation, the two added functions are orthogonal to all cosines and to each other. This means that neither of them can be deleted without losing completeness. It also means that if a cosine could be approximated in the Sobolev norm by the other cosines combined with the added functions, then it could already be approximated by the other cosines alone. But the latter can not be done with arbitrary precision even in the space of square integrable functions, let alone in the Sobolev space.

Proof of Lemma 5. Since the constrained subspace is contained in each approximating subspace and the minimum over a larger space can not be bigger, the minimal values over the approximating subspaces are bounded above by the constrained minimum; thus this numerical sequence is bounded. Moreover, from some point on the approximate minimizers all lie in the subspace on which the functional is weakly coercive, so they form a bounded sequence. The derivative of the functional at each minimizer vanishes when paired with elements of the subspace over which that minimizer is taken; hence the monotonicity pairing of an approximate minimizer with the exact one reduces to the pairing of the derivative at the exact minimizer with their difference, and by uniform monotonicity (8) this pairing dominates the monotonicity bound evaluated at the distance between the two minimizers. We prove below that the pairing converges to zero, which implies convergence of the approximate minimizers by norm due to the assumptions on the function entering the bound.
Since the exact minimizer is taken over the constrained subspace, the derivative of the functional at it vanishes on any element of that subspace. The subspace of functionals that vanish on the entire constrained subspace is the closed linear span of the pullbacks of the boundary functionals. Indeed, if they did not span it, there would exist, by the Hahn-Banach theorem, an element annihilated by all the pullbacks but not by some functional vanishing on the constrained subspace; completeness of the boundary functionals would force that element into the constrained subspace, a contradiction. Thus the derivative at the exact minimizer can be approximated arbitrarily well by finite linear combinations of the pullbacks. But such combinations eventually vanish on the differences between the exact and the approximate minimizers, while the remaining term is controlled by the approximation accuracy and the boundedness of the minimizers. Since the accuracy is arbitrary, the pairing converges to zero.

Proof of Theorem 6. Fix one of the approximating subspaces; then the functional, the corresponding truncated boundary operator, and the trial functions satisfy the conditions of Theorem 2. Moreover, uniform monotonicity of the derivative implies its strict monotonicity and hence strict convexity of the functional [16, III.5.3]. Therefore, by Theorem 2, there exist unique minimizers on the approximating subspace and on its finite-dimensional sections, and the latter converge weakly to the former. The proof of strong convergence below relies on uniform monotonicity and more or less combines the arguments of Theorem 23.3 in [16, VII.23.6] and Theorem 26.A(b) in [21, 26.2].
Since both are minimizers, the corresponding derivatives vanish when paired with elements of the respective subspaces. Therefore, the monotonicity pairing of the two minimizers reduces to the pairing of the derivative at the finite-dimensional minimizer with their difference, and by uniform monotonicity (8) this pairing dominates the distance between them. We prove below that the pairing converges to zero for any fixed approximating subspace, which implies that the finite-dimensional minimizers converge to the subspace minimizer by norm.
First we show that the derivatives at the finite-dimensional minimizers form a uniformly bounded sequence of functionals. By monotonicity their values at any fixed element are controlled by the derivative at that element paired with the weakly convergent (hence bounded) sequence of minimizers; the right-hand side of the resulting estimate is uniformly bounded for each fixed element, so uniform boundedness of the derivatives follows from the principle of uniform boundedness. Next, for any prescribed accuracy, choose the number of trial functions so large that the section contains an element that close to the subspace minimizer; this is possible due to completeness of the trial functions, via Lemma 1. Since the finite-dimensional minimizer minimizes the functional over its section, its derivative vanishes on that element, so the pairing from the previous paragraph is bounded by the uniform bound on the derivatives times the small distance. Hence it converges to zero, as announced.
Convergence of the subspace minimizers to the exact minimizer follows from Lemma 5, and convergence of the corresponding values of the functional follows from its continuity.

Proof of Lemma 7. In multi-index notation an equivalent norm on the Sobolev space over the product domain is obtained by summing the norms of all partial derivatives up to the given order. If two functions depend on the two separate groups of variables, then it follows from the Fubini theorem that the norm of their product factors into the product of their norms over the respective domains. Using this, one estimates the norm of the difference of two products through the norms of the factors and of their differences. Since the two given systems are complete, any monomial in the first group of variables can be approximated to any precision by a linear combination of the first system, and analogously for the second group and the second system. The product of the two combinations is a linear combination of the products of the basis functions, while its difference from the product of the monomials can be made arbitrarily small by the above estimate. Hence any product of monomials, and therefore any polynomial, can be approximated in the Sobolev norm over the product domain by linear combinations of the product system. By the generalized Weierstrass theorem [24, II.4.3], polynomials are complete there, and hence so is the product system.

Competing Interests

The authors declare that they have no competing interests.