Abstract
An approach is developed to obtain solutions to lower Hessenberg linear systems with general entries. The approach involves developing solution vectors for an extended lower Hessenberg linear system (having an extra column and an extra introduced unknown) for each nonzero term on the right hand side. The overall solution is then found through superposition and determination of the extra introduced unknown. The approach supports parallel solution algorithms without communication between processors, since each solution vector is computed independently of the others. The number of parallel processors needed will be equal to the number of nonzero right hand side terms.
1. Introduction
A number of researchers have studied the inverses of lower Hessenberg matrices [1–4], that is, inverses of square matrices of the form $A = [a_{ij}]$ such that $a_{ij} = 0$ when $j > i + 1$. Most recently, a recursive algorithm has been developed for inverting Hessenberg matrices [1]. This paper proposes an alternate solution approach. It is shown here that lower Hessenberg linear systems in particular lend themselves to a solution via an extended system (that adds a column to $A$, as well as an additional unknown). The basic strategy involves generating $m$ solution vectors for such extended systems (where $m$ is the number of nonzero right hand side terms). In each such extended system, all of the right hand side terms are set to zero except for one entry. The first term of each solution vector is chosen arbitrarily, and then each subsequent term is found directly through a forward-substitution-like process similar to that used in LU decomposition, with a number of operations on the order of that required for forward substitution, that is, an order of magnitude smaller than that required for performing the LU decomposition itself. The overall solution is found through superposition and solution of the extra introduced variable (which is common to all of the extended systems). The process is highly parallelizable, since each solution vector can be computed independently of the others.
This approach is first illustrated for a three-dimensional system (i.e., $n = 3$) with general entries in Section 2. Section 3 provides a proof of the validity of the approach for systems of arbitrary dimension $n$.
2. Three-Dimensional Systems
This section details the approach for solving a lower Hessenberg linear system with $n = 3$ and general entries. Consider the following system:
This can be solved via an extended system developed by adding an additional column to the coefficient matrix, and an additional unknown : where for all . The coefficient is set to 1 to simplify the solution process. Note that the third equation in (2.2) is or
The strategy for solving (2.2) involves first solving three such systems, each with a different nonzero right hand side term to obtain the solution vectors , , and , and then using superposition. The first such system, is solved by choosing arbitrarily and then determining the remaining terms using the subsequent equations in (2.5). Setting leads to
This process can be continued to compute all of the terms. The next system to be considered is:
Setting leads to
The third system is:
Setting leads to
Superposition of the vectors , , and leads to a solution for (2.2), that is,
From (2.11), note that
Substituting into (2.12) for , , and (and performing some algebra) yields where
Finally, substituting and , , and into (2.11) leads to the solution to (2.1), after some algebra:
The next section extends the process to a system having an arbitrary dimension $n$. However, a few comments are first warranted. It can be inferred by inspection of (2.11) that computations are minimized when $m$, the number of nonzero right hand side terms, is small. For example, when $m = 2$, only two of the vectors need to be computed, regardless of $n$. (Only one vector needs to be computed if only one right hand side term is nonzero.) The number of operations to compute a single vector is on the order of that for the forward substitution process used in LU decomposition, or an order of magnitude smaller than performing the LU decomposition process itself. Thus, the approach can be useful when $m$ is small relative to $n$. However, even when it is not, the solution for the vectors can be performed on parallel processors, since each vector can be determined on its own processor without any communication with the other processors. Thus, the run time of such a parallel algorithm should approach that for computing just one of the vectors.
A considerable amount of recent work has involved attempts to parallelize the solutions of linear systems [5–7] (e.g., involving LU or QR decompositions). Typically, the degree of parallelism achieved is only partial, is highly dependent on the structure and sparseness of $A$, and requires varying degrees of communication between processors. In contrast, this approach provides a framework for full parallelization, with one processor per nonzero right hand side term and without any communication required between processors.
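The independence of the vectors can be illustrated with a short Python sketch. It is not taken from the paper: the helper names `extended_sweep` and `unit_vectors`, the choice of 0 for every arbitrary first term, the unit right hand sides, and the use of a thread pool are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction


def extended_sweep(A, rhs, first):
    """One extended-system vector (n + 1 terms) by a forward sweep.

    Assumes A[i][j] == 0 for j > i + 1, a nonzero superdiagonal, and an
    extra column whose single entry (in the last row) is 1.
    """
    n = len(A)
    x = [Fraction(first)]
    for i in range(n):
        known = sum(Fraction(A[i][j]) * x[j] for j in range(i + 1))
        pivot = Fraction(A[i][i + 1]) if i + 1 < n else Fraction(1)
        x.append((Fraction(rhs[i]) - known) / pivot)
    return x


def unit_vectors(A, b):
    """Map each index k with b[k] != 0 to its vector; tasks share no data."""
    n = len(A)
    nonzero = [k for k in range(n) if b[k] != 0]

    def task(k):
        unit = [1 if i == k else 0 for i in range(n)]
        return extended_sweep(A, unit, 0)  # first term chosen as 0 (assumption)

    if not nonzero:
        return {}
    with ThreadPoolExecutor(max_workers=len(nonzero)) as pool:
        return dict(zip(nonzero, pool.map(task, nonzero)))


A = [[2, 1, 0], [1, 3, 2], [4, 1, 5]]
vectors = unit_vectors(A, [1, 0, 7])  # two nonzero terms, so two tasks
```

Each task reads only the matrix and its own index, which is the property the text relies on. Threads are used here only to keep the sketch short; in CPython they would not speed up this arithmetic, so an actual implementation would more likely use separate processes or machines.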
3. Proof of the General Case
This section provides a proof that the algorithm from the previous section is valid for the general case, that is, a lower Hessenberg linear system having arbitrary dimension $n$. Consider the system of equations for each value of where , and when . Following the procedure in the last section, consider the extended system for each value of , where . Here when , is the Kronecker delta, , and for all . Consider further the related systems of the form for each , where , and for each , where .
For a given , each successive term of can thus be obtained by solving a single equation of the system (3.3) for one unknown (similar to the forward substitution process used in LU decomposition).
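The one-equation-per-unknown sweep described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's own formulation: the function name `extended_sweep` is hypothetical, the matrix is stored as a list of rows, the superdiagonal entries are assumed nonzero, and the extra column is assumed to hold a single 1 in the last row.

```python
from fractions import Fraction


def extended_sweep(A, rhs, first):
    """Return the n + 1 terms of one extended-system vector.

    A     : n x n lower Hessenberg matrix (A[i][j] == 0 for j > i + 1)
    rhs   : right hand side of the extended system (length n)
    first : the freely chosen first term
    """
    n = len(A)
    x = [Fraction(first)]
    for i in range(n):
        # In row i every term except the superdiagonal one is already known.
        known = sum(Fraction(A[i][j]) * x[j] for j in range(i + 1))
        # The last row uses the extra column, assumed here to equal 1.
        pivot = Fraction(A[i][i + 1]) if i + 1 < n else Fraction(1)
        if pivot == 0:
            raise ZeroDivisionError("superdiagonal entry must be nonzero")
        x.append((Fraction(rhs[i]) - known) / pivot)
    return x


A = [[2, 1, 0], [1, 3, 2], [4, 1, 5]]
vector = extended_sweep(A, [1, 0, 0], 0)
```

For this example matrix, a unit right hand side in the first position, and a first term of 0, the sweep gives the terms 0, 1, -3/2, and 13/2. Row $i$ costs about $i$ multiplications, so the whole sweep is on the order of $n^2$ operations, the same order as forward substitution.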
The solutions for , once obtained, can then be superimposed as follows for :
Equation (3.4) satisfies (3.2) and provides an equation to determine , that is,
Since (3.4) satisfies (3.2) and are arbitrary, it follows that (3.5) leads to a solution for . Substituting obtained from (3.5) into (3.4) then determines all of the terms. Finally, the $n$th equation of (3.2) is or which is identical to the $n$th equation in (3.1). Thus, this approach solves (3.2) and also solves (3.1), since (3.1) is a subset of (3.2) and is further isolated from the additional variable .
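The superposition step can be checked numerically with a small Python sketch. It rests on assumptions that are not taken from the paper: each per-term vector uses a unit right hand side and a first term of 0, the freedom in the first term is carried by one extra homogeneous vector `h`, and the system is closed by requiring the extra unknown to vanish so that the last extended equation reduces to the last original one. The function names are hypothetical.

```python
from fractions import Fraction


def extended_sweep(A, rhs, first):
    """Forward sweep for one extended-system vector (extra column set to 1)."""
    n = len(A)
    x = [Fraction(first)]
    for i in range(n):
        known = sum(Fraction(A[i][j]) * x[j] for j in range(i + 1))
        pivot = Fraction(A[i][i + 1]) if i + 1 < n else Fraction(1)
        x.append((Fraction(rhs[i]) - known) / pivot)
    return x


def solve_lower_hessenberg(A, b):
    """Solve A x = b by superposing independently computed vectors."""
    n = len(A)
    # One vector per nonzero right hand side term (unit rhs, first term 0).
    vectors = {}
    for k in range(n):
        if b[k] != 0:
            unit = [1 if i == k else 0 for i in range(n)]
            vectors[k] = extended_sweep(A, unit, 0)
    # Homogeneous vector: zero rhs, first term 1 (an addition of this sketch).
    h = extended_sweep(A, [0] * n, 1)
    if h[n] == 0:
        raise ValueError("matrix is singular")
    # Choose the scale t so that the superposed extra unknown is zero.
    extra = sum(Fraction(b[k]) * v[n] for k, v in vectors.items())
    t = -extra / h[n]
    return [
        sum(Fraction(b[k]) * v[j] for k, v in vectors.items()) + t * h[j]
        for j in range(n)
    ]


A = [[2, 1, 0], [1, 3, 2], [4, 1, 5]]
b = [1, 2, 0]
x = solve_lower_hessenberg(A, b)
residual = [sum(A[i][j] * x[j] for j in range(3)) - b[i] for i in range(3)]
```

Exact rational arithmetic is used so that the residual is exactly zero for a nonsingular matrix. A floating-point version would follow the same steps but may lose accuracy when the sweep amplifies rounding errors.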
4. Concluding Remarks
It has been shown that lower Hessenberg linear systems can be solved by considering related extended systems, first by directly solving a system with $n = 3$, and then extending the procedure to arbitrary $n$. The approach, which develops solution vectors from an extended system (having an additional column in the coefficient matrix and an additional introduced unknown), is highly parallelizable, with one processor per nonzero right hand side term.
It should be noted that this approach also lends itself to lower $k$-Hessenberg linear systems, that is, systems involving square matrices $A = [a_{ij}]$ such that $a_{ij} = 0$ when $j > i + k$. For example, $k = 2$ leads to two additional columns and two additional introduced variables, which are solved with two additional equations. The proof is similar.
There is no limit in principle to the number of diagonals that can be added. Each will lead to an additional introduced unknown and an additional column in $A$. The process can thus be extended to many systems of practical interest, providing a framework for parallelization of such systems.
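The case of one added diagonal can be sketched in Python along the same lines. Everything specific here is an assumption made for illustration and not the paper's derivation: the names `sweep2` and `solve_two_band`, the requirement that the entries two places right of the diagonal are nonzero, the two extra columns each holding a single 1, the use of one particular vector for the whole right hand side, and the closing condition that both extra unknowns vanish.

```python
from fractions import Fraction


def sweep2(A, rhs, x1, x2):
    """Forward sweep when A[i][j] == 0 for j > i + 2; returns n + 2 terms."""
    n = len(A)
    x = [Fraction(x1), Fraction(x2)]  # two freely chosen leading terms
    for i in range(n):
        known = sum(Fraction(A[i][j]) * x[j] for j in range(min(i + 2, n)))
        # The last two rows use the extra columns, assumed here to equal 1.
        pivot = Fraction(A[i][i + 2]) if i + 2 < n else Fraction(1)
        x.append((Fraction(rhs[i]) - known) / pivot)
    return x


def solve_two_band(A, b):
    """Solve A x = b using two introduced unknowns and two extra equations."""
    n = len(A)
    zero = [0] * n
    p = sweep2(A, b, 0, 0)       # particular vector
    h1 = sweep2(A, zero, 1, 0)   # homogeneous vectors
    h2 = sweep2(A, zero, 0, 1)
    # Two extra equations: both introduced unknowns must be zero.
    det = h1[n] * h2[n + 1] - h2[n] * h1[n + 1]
    if det == 0:
        raise ValueError("matrix is singular")
    s = (-p[n] * h2[n + 1] + p[n + 1] * h2[n]) / det
    t = (-h1[n] * p[n + 1] + h1[n + 1] * p[n]) / det
    return [p[j] + s * h1[j] + t * h2[j] for j in range(n)]


A = [[1, 2, 1, 0], [3, 1, 2, 1], [1, 0, 2, 3], [2, 1, 1, 4]]
b = [1, 0, 2, 1]
x = solve_two_band(A, b)
residual = [sum(A[i][j] * x[j] for j in range(4)) - b[i] for i in range(4)]
```

The three sweeps are again independent of one another, and the only coupled work is the 2 by 2 system for the two scale factors. Each further diagonal would add one more leading term, one more homogeneous vector, and one more row and column to that small system.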