Abstract

The least squares problem appears, among others, in linear models, and it refers to an inconsistent system of linear equations. A crucial question is how to reduce the least squares solution of such a system to the usual solution of a consistent one. Traditionally, this is achieved by differential calculus. We present a purely algebraic approach to this problem, based on some identities for nonhomogeneous quadratic forms.

1. Introduction and Notation

The least squares problem appears, among others, in linear models, and it refers to an inconsistent system of linear equations $Ax = b$. Formally, it reduces to minimizing the nonhomogeneous quadratic form $\|Ax - b\|^2 = x^T A^T A x - 2 b^T A x + b^T b$. The classical approach to the problem, presented in such well-known books as Scheffé [1, Chapter 1], Rao [2, pages 222 and 223], and Rao and Toutenburg [3, pages 20–23], uses differential calculus and leads to the so-called normal equation $A^T A x = A^T b$, which is consistent. The aim of this note is to present some useful algebraic identities for nonhomogeneous quadratic forms leading directly to the normal equation.

Traditional vector-matrix notation will be used. Among others, if $A$ is a matrix, then $A^T$, $\mathcal{R}(A)$, and $\operatorname{rank}(A)$ stand for its transpose, range (column space), and rank. Moreover, by $\mathbb{R}^n$ we denote the $n$-dimensional Euclidean space, represented by column vectors.

2. Background

Any system of linear equations may be presented in the vector-matrix form
\[
Ax = b, \tag{2.1}
\]
where $A$ is a given $n \times p$ matrix, $b$ is a given vector in $\mathbb{R}^n$, while $x \in \mathbb{R}^p$ is an unknown vector. It is well known that (2.1) is consistent, if and only if, $b$ belongs to the range $\mathcal{R}(A)$.
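For a quick numerical illustration (not part of the original note; NumPy, with purely illustrative data), the criterion $b \in \mathcal{R}(A)$ can be checked by comparing $\operatorname{rank}(A)$ with the rank of the augmented matrix $[A \mid b]$, which is an equivalent condition:

```python
# Illustrative sketch: checking consistency of A x = b numerically.
# b lies in R(A) exactly when appending b to A does not increase the rank.
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                      # n = 3, p = 2
b_in  = A @ np.array([2.0, -1.0])               # in R(A) by construction
b_out = np.array([1.0, 1.0, 3.0])               # not in R(A)

def is_consistent(A, b):
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(np.column_stack([A, b]))

print(is_consistent(A, b_in))    # True
print(is_consistent(A, b_out))   # False
```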

If (2.1) is inconsistent, one can seek a vector $x$ minimizing the norm $\|Ax - b\|$ or, equivalently, its square $\|Ax - b\|^2$. The Least Squares Solution (LSS) of (2.1) is defined as a vector $x_0$ such that
\[
\|Ax_0 - b\|^2 \le \|Ax - b\|^2 \quad \text{for all } x \in \mathbb{R}^p. \tag{2.2}
\]

A crucial problem is how to reduce the LSS of the inconsistent equation (2.1) to the usual solution of a consistent one. Formally, the least squares problem deals with minimizing the nonhomogeneous quadratic form $\|Ax - b\|^2 = x^T A^T A x - 2 b^T A x + b^T b$. Traditionally, this problem is solved by differential calculus and leads to the normal equation $A^T A x = A^T b$.
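As a sketch of how this works numerically (illustrative only, not part of the note), the LSS of an inconsistent system can be obtained by solving the normal equation; it agrees with NumPy's built-in least squares routine and satisfies the defining inequality (2.2):

```python
# Illustrative sketch: least squares solution via the normal equation A^T A x = A^T b.
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])                  # A x = b is inconsistent

x0 = np.linalg.solve(A.T @ A, A.T @ b)         # usual solution of the normal equation
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)   # NumPy's least squares solver
print(np.allclose(x0, x_ls))                   # True

# inequality (2.2): no trial vector x beats x0
rng = np.random.default_rng(0)
print(all(np.linalg.norm(A @ x0 - b) <= np.linalg.norm(A @ x - b) + 1e-12
          for x in rng.normal(size=(1000, 2))))   # True
```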

In the next section, we will present some useful algebraic identities for nonhomogeneous quadratic forms. They yield the inequality (2.2) directly.

3. Identities and Inequalities for Nonhomogeneous Quadratic Forms

The usual, that is, homogeneous quadratic form is a real function of the type
\[
f(x) = x^T B x, \tag{3.1}
\]
defined on $\mathbb{R}^n$. In this note, we shall also consider nonhomogeneous quadratic forms of the type
\[
f(x) = x^T B x + 2 a^T x, \tag{3.2}
\]
where $B$ is a symmetric matrix of order $n$ and $a$ is a vector in $\mathbb{R}^n$.
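For instance (a check that is illustrative only and not part of the note), $\|Ax - b\|^2$ is such a nonhomogeneous form in $x$, with $B = A^T A$, $a = -A^T b$, and the additive constant $b^T b$:

```python
# Illustrative check: ||A x - b||^2 = x^T (A^T A) x - 2 (A^T b)^T x + b^T b.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))
b = rng.normal(size=5)
x = rng.normal(size=3)

lhs = np.linalg.norm(A @ x - b) ** 2
rhs = x @ (A.T @ A) @ x - 2 * (A.T @ b) @ x + b @ b
print(np.isclose(lhs, rhs))   # True
```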

Some inequalities for nonhomogeneous quadratic forms may be found in Stępniak [4]. Let us recall one of these results, which is very useful in nonhomogeneous linear estimation.

Lemma 3.1. For any symmetric nonnegative definite matrices $B$ and $C$ of order $n$, the condition $x^T B x + 2 a^T x + \alpha \ge x^T C x$ for some $a \in \mathbb{R}^n$, some $\alpha \in \mathbb{R}$, and all $x \in \mathbb{R}^n$ implies that $B - C$ is nonnegative definite and $a \in \mathcal{R}(B - C)$.

Now we will present an identity which may serve as a convenient tool in finding the LSS of (2.1). For convenience, we will start from the case $\operatorname{rank}(A) = p$, in which $A^T A$ is nonsingular, leaving the singular case to Section 5.

Proposition 3.2. For an arbitrary $n \times p$ matrix $A$ of rank $p$ and an arbitrary vector $b \in \mathbb{R}^n$,
\[
\|Ax - b\|^2 = \bigl(x - (A^TA)^{-1}A^Tb\bigr)^T A^TA \bigl(x - (A^TA)^{-1}A^Tb\bigr) + b^T\bigl[I_n - A(A^TA)^{-1}A^T\bigr]b \quad \text{for all } x \in \mathbb{R}^p. \tag{3.3}
\]

Proof. Let us start from the trivial identity
\[
I_n = A(A^TA)^{-1}A^T + \bigl[I_n - A(A^TA)^{-1}A^T\bigr].
\]
We only need to premultiply this identity by $(Ax - b)^T$, postmultiply it by $Ax - b$, and then collect the terms to get (3.3), using $A^T(Ax - b) = A^TA\bigl(x - (A^TA)^{-1}A^Tb\bigr)$ and $\bigl[I_n - A(A^TA)^{-1}A^T\bigr]A = 0$.
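A direct numerical verification of (3.3) (illustrative only, not part of the note), with a randomly generated matrix of full column rank:

```python
# Illustrative verification of identity (3.3).
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 3))                 # full column rank with probability one
b = rng.normal(size=6)
x = rng.normal(size=3)

AtA_inv = np.linalg.inv(A.T @ A)
x0 = AtA_inv @ A.T @ b
P  = A @ AtA_inv @ A.T                      # A (A^T A)^{-1} A^T

lhs = np.linalg.norm(A @ x - b) ** 2
rhs = (x - x0) @ (A.T @ A) @ (x - x0) + b @ (np.eye(6) - P) @ b
print(np.isclose(lhs, rhs))                 # True
```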

4. Least Squares and Usual Solutions: Nonsingular Case

As above, we consider an inconsistent equation $Ax = b$, where $A$ is an $n \times p$ matrix of rank $p$. We are interested in the LSS of this equation.

Theorem 4.1. A vector $x_0$ is a Least Squares Solution of the inconsistent equation $Ax = b$, if and only if, it is a usual solution of the consistent equation
\[
A^TAx = A^Tb. \tag{4.1}
\]

Proof. Consistency of (4.1) follows from the fact that $\mathcal{R}(A^TA) = \mathcal{R}(A^T)$, so that $A^Tb \in \mathcal{R}(A^TA)$.
We note that the second component on the right side of the identity (3.3) does not depend on $x$. Thus, we only need to minimize the first one. Since $\operatorname{rank}(A) = p$, and in consequence $A^TA$ is positive definite, this component is minimal, if and only if, $x = (A^TA)^{-1}A^Tb$, that is, if and only if $x$ satisfies (4.1). This completes the proof.
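Numerically (an illustrative sketch, not part of the note), the solution of (4.1) attains the value $b^T\bigl[I_n - A(A^TA)^{-1}A^T\bigr]b$, and random competitors never do better:

```python
# Illustrative check of Theorem 4.1 for a full column rank A.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 3))
b = rng.normal(size=6)

x0 = np.linalg.solve(A.T @ A, A.T @ b)                 # usual solution of (4.1)
P  = A @ np.linalg.inv(A.T @ A) @ A.T
min_value = b @ (np.eye(6) - P) @ b                    # second term of (3.3)

print(np.isclose(np.linalg.norm(A @ x0 - b) ** 2, min_value))   # True
print(all(np.linalg.norm(A @ x - b) ** 2 >= min_value - 1e-12
          for x in rng.normal(size=(1000, 3))))                 # True
```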

5. General Case

If $\operatorname{rank}(A) < p$, then the matrix $A^TA$ is singular, and, therefore, the identity (3.3) is not applicable. However, as we will show, it remains true if we replace $(A^TA)^{-1}$ by an arbitrary generalized inverse $(A^TA)^-$.

There are many papers on generalized inverses and several books, among others Bapat [5], Ben-Israel and Greville [6], Campbell and Meyer [7], Pringle and Rayner [8], and Rao and Mitra [9]. A recent paper by Stępniak [10] may serve as a brief and self-contained introduction to this field.

Let us recall that, for a given matrix $B$, its generalized inverse is defined as an arbitrary matrix $B^-$ satisfying the condition $BB^-B = B$.
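A small NumPy illustration (not from the note) of this definition: the Moore-Penrose inverse is one admissible choice of $B^-$, and adding a matrix annihilated by $B$ produces another, so a generalized inverse need not be unique.

```python
# Illustrative sketch: any G with B G B = B is a generalized inverse of B.
import numpy as np

B = np.array([[1.0, 1.0],
              [1.0, 1.0]])                 # singular, rank 1

G1 = np.linalg.pinv(B)                     # Moore-Penrose inverse, one valid choice
G2 = G1 + np.array([[ 1.0, -1.0],
                    [-1.0,  1.0]])         # B annihilates the added term, so G2 also works

print(np.allclose(B @ G1 @ B, B))          # True
print(np.allclose(B @ G2 @ B, B))          # True
```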

A key result in this section is stated as follows.

Proposition 5.1. For an arbitrary $n \times p$ matrix $A$ and an arbitrary vector $b \in \mathbb{R}^n$,
\[
\|Ax - b\|^2 = \bigl(x - (A^TA)^-A^Tb\bigr)^T A^TA \bigl(x - (A^TA)^-A^Tb\bigr) + b^T\bigl[I_n - A(A^TA)^-A^T\bigr]b \quad \text{for all } x \in \mathbb{R}^p, \tag{5.1}
\]
where $(A^TA)^-$ means an arbitrary generalized inverse of $A^TA$.

Proof. The idea of the proof is the same as in Proposition 3.2. Since $\mathcal{R}(A^T) = \mathcal{R}(A^TA)$, one can replace the vector $A^Tb$ by $A^TAc$ for some $c \in \mathbb{R}^p$; the defining condition $A^TA(A^TA)^-A^TA = A^TA$ then plays the role of $(A^TA)(A^TA)^{-1} = I_p$ in the previous argument.
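A numerical verification of (5.1) (illustrative only, not part of the note) with a rank-deficient matrix and two different generalized inverses of $A^TA$:

```python
# Illustrative verification of identity (5.1) for a singular A^T A.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])            # rank 2, so A^T A (3 x 3) is singular
b = np.array([1.0, 0.0, 2.0])
x = np.array([0.3, -0.7, 1.1])
AtA = A.T @ A

def both_sides_agree(G):                   # evaluates identity (5.1) for a given (A^T A)^-
    x0  = G @ A.T @ b
    rhs = (x - x0) @ AtA @ (x - x0) + b @ (np.eye(3) - A @ G @ A.T) @ b
    return np.isclose(np.linalg.norm(A @ x - b) ** 2, rhs)

G1 = np.linalg.pinv(AtA)                                   # one generalized inverse
v  = np.array([-1.0, -1.0, 1.0])                           # A v = 0, hence A^T A v = 0
G2 = G1 + np.outer(v, [0.5, -1.0, 2.0])                    # another one: AtA @ G2 @ AtA = AtA

print(np.allclose(AtA @ G1 @ AtA, AtA), both_sides_agree(G1))   # True True
print(np.allclose(AtA @ G2 @ AtA, AtA), both_sides_agree(G2))   # True True
```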

Now we will apply the identity (5.1) to the least squares problem.

Theorem 5.2. For an arbitrary $n \times p$ matrix $A$ and an arbitrary vector $b \in \mathbb{R}^n$:
(i) a vector $x_0$ is a Least Squares Solution of the equation $Ax = b$, if and only if, it is a usual solution of the (consistent) equation $A^TAx = A^Tb$;
(ii) the lower bound of $\|Ax - b\|^2$ is equal to $b^T\bigl[I_n - A(A^TA)^-A^T\bigr]b$, and it does not depend on the choice of the generalized inverse $(A^TA)^-$.

Proof. By setting $x_0 = (A^TA)^-A^Tb$, the first component on the right side of (5.1) reduces to $(x - x_0)^T A^TA (x - x_0) = \|A(x - x_0)\|^2$, which is nonnegative and takes the value zero, if and only if, $Ax = Ax_0$ or, equivalently, if $A^TAx = A^Tb$. Since the second component does not depend on $x$, this is just the total minimum. Finally, since the left side of (5.1) does not involve $(A^TA)^-$ at all, this minimum, and hence the lower bound $b^T\bigl[I_n - A(A^TA)^-A^T\bigr]b$, does not depend on the choice of the generalized inverse.
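Finally, an illustrative numerical check of Theorem 5.2 (same matrices as in the previous sketch, not part of the note): the lower bound is attained by $x_0 = (A^TA)^-A^Tb$ and is identical for the two generalized inverses.

```python
# Illustrative check of Theorem 5.2 (ii): the minimum of ||A x - b||^2 equals
# b^T [I - A (A^T A)^- A^T] b, independently of the generalized inverse chosen.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
b = np.array([1.0, 0.0, 2.0])
AtA = A.T @ A

G1 = np.linalg.pinv(AtA)
G2 = G1 + np.outer([-1.0, -1.0, 1.0], [0.5, -1.0, 2.0])    # second generalized inverse

def lower_bound(G):
    return b @ (np.eye(3) - A @ G @ A.T) @ b

x0 = G1 @ A.T @ b                                          # one least squares solution
print(np.isclose(np.linalg.norm(A @ x0 - b) ** 2, lower_bound(G1)))   # True
print(np.isclose(lower_bound(G1), lower_bound(G2)))                   # True
```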

Acknowledgment

The author thanks a referee for his (or her) useful suggestions concerning the presentation of this paper.