Abstract

The least-squares method is the most popular method for fitting a polynomial curve to data. It is based on minimizing the total squared error between a polynomial model and the data. In this paper we develop a different approach that exploits the autocorrelation function. In particular, we use the nonzero-lag autocorrelation terms to produce a system of quadratic equations that can be solved together with a linear equation derived from summing the data. There is a maximum of $2^n$ solutions when the polynomial is of degree $n$. For the linear case, there are generally two solutions. Each solution is consistent with a total error of zero. Either visual examination or measurement of the total squared error is required to determine which solution fits the data. A comparison between the corresponding autocorrelation term solution and the linear least-squares fit shows negligible difference.

1. Introduction

Curve fitting is the process of fitting a curve, in this case a polynomial, to a set of data points. There are different types of curve fitting, but we will discuss only the most popular method, the method of least squares [1].

It is assumed that the data consist of fluctuations about an ideal curve. These fluctuations create an error between the polynomial model and the actual data. After computing the total squared error, we can apply calculus to minimize it with respect to the coefficients of the polynomial. This produces a set of $n + 1$ linear equations, called the normal equations, where $n$ is the degree of the polynomial. The coefficients are obtained by solving the normal equations [2].
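For reference, minimizing the total squared error $\sum_{i=1}^{N} \big( d_i - \sum_{j=0}^{n} a_j x_i^j \big)^2$ with respect to each coefficient $a_m$ gives the normal equations in the familiar form (standard least-squares theory, stated here only as background):

$$\sum_{j=0}^{n} a_j \sum_{i=1}^{N} x_i^{j+m} = \sum_{i=1}^{N} d_i \, x_i^{m}, \qquad m = 0, 1, \ldots, n.$$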

A different approach to the problem is as follows.

We propose that the data under consideration consist of a deterministic component and a random component with zero mean. The random component does not correlate with the deterministic component, and it does not correlate with itself at nonzero lag. We want to extract only the portion of the data that is correlated at nonzero lag. This requires the computation of nonzero-lag autocorrelation terms and produces a system of quadratic equations [3]. After substitution based on a linear equation derived from summing the data, the system consists of $n$ quadratic equations with $n$ unknowns, where $n$ is the degree of the polynomial. This results in a maximum of $2^n$ solutions. Each solution is consistent with a total error between the polynomial and the data that is equal to zero. Either visual examination or measurement of the total squared error is required to determine which solution fits the data.

In this paper, we derive the autocorrelation term method and compare it to linear least squares.

2. The Autocorrelation Term Method

Consider a set of data $\{d_i\}$, $i = 1, \ldots, N$. Characterize the data as the sum of a polynomial function $f(x_i)$ and a discrete, zero-mean random variable $r_i$ [4]:

$$d_i = f(x_i) + r_i. \tag{1}$$

Consequently, we can write

$$d_i d_{i+k} = f(x_i) f(x_{i+k}) + f(x_i) r_{i+k} + r_i f(x_{i+k}) + r_i r_{i+k}. \tag{2}$$

Let $k \neq 0$.
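As a concrete illustration of the model (1), the following sketch generates synthetic data with a hypothetical linear $f$ and an assumed noise level (neither value comes from this paper):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100
x = np.linspace(0.0, 10.0, N)                # sample points x_i

f = 2.0 + 0.5 * x                            # hypothetical deterministic part f(x_i)
r = rng.normal(loc=0.0, scale=1.0, size=N)   # zero-mean random part r_i
d = f + r                                    # data d_i = f(x_i) + r_i, as in (1)
```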

Summation yields

$$\sum_i d_i d_{i+k} = \sum_i f(x_i) f(x_{i+k}), \qquad k \neq 0, \tag{3}$$

since the cross-correlation [5] and autocorrelation terms with $k \neq 0$ must be zero. The polynomial function of degree $n$ can be written as

$$f(x) = \sum_{j=0}^{n} a_j x^j. \tag{4}$$

Therefore, (3) becomes the system of equations

$$\sum_i d_i d_{i+k} = \sum_i \left( \sum_{j=0}^{n} a_j x_i^j \right) \left( \sum_{l=0}^{n} a_l x_{i+k}^l \right), \qquad k = 1, \ldots, n. \tag{5}$$

Using (1), we can write the average

$$\bar{d} = \frac{1}{N} \sum_{i=1}^{N} d_i = \frac{1}{N} \sum_{i=1}^{N} f(x_i) + \frac{1}{N} \sum_{i=1}^{N} r_i. \tag{6}$$

We know that $\sum_{i=1}^{N} r_i = 0$, so

$$\bar{d} = \frac{1}{N} \sum_{i=1}^{N} f(x_i), \tag{7}$$

which implies

$$\bar{d} = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=0}^{n} a_j x_i^j, \tag{8}$$

or

$$\bar{d} = \sum_{j=0}^{n} a_j \overline{x^j}, \qquad \overline{x^j} \equiv \frac{1}{N} \sum_{i=1}^{N} x_i^j. \tag{9}$$

There are a maximum of $2^n$ solutions to the system of equations created by (5) and (9) for the coefficients $a_0, a_1, \ldots, a_n$. Observe that the total error, $E$, is presumed to be zero:

$$E = \sum_{i=1}^{N} \left[ d_i - f(x_i) \right] = \sum_{i=1}^{N} r_i = 0. \tag{10}$$
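A minimal sketch of the quantities entering (5) and (9), assuming the reconstruction above (the function name is ours):

```python
import numpy as np

def autocorr_term(d, k):
    """Nonzero-lag autocorrelation term sum_i d_i d_{i+k}, the left side of (5)."""
    d = np.asarray(d, dtype=float)
    assert k > 0, "the method uses positive nonzero lags"
    return float(np.sum(d[:-k] * d[k:]))

d = np.array([1.0, 2.0, 2.5, 4.1, 5.0])
print(autocorr_term(d, 1))   # lag-1 term; a degree-n fit needs lags k = 1, ..., n
print(d.mean())              # d_bar, which supplies the linear constraint (9)
```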

3. The Least-Squares Straight Line

Before examining the autocorrelation method in detail, it is worth reviewing the special case of fitting a straight line in the least-squares sense. We fit the line $y = a + bx$ to a set of data $\{(x_i, y_i)\}$, $i = 1, \ldots, N$. We have two parameters, $a$ and $b$, and want to minimize the function of two variables

$$E(a, b) = \sum_{i=1}^{N} \left[ y_i - y(x_i) \right]^2. \tag{11}$$

This becomes

$$E(a, b) = \sum_{i=1}^{N} \left[ y_i - (a + b x_i) \right]^2. \tag{12}$$

Apply the calculus method:

$$\frac{\partial E}{\partial a} = -2 \sum_{i=1}^{N} \left[ y_i - (a + b x_i) \right] = 0, \tag{13}$$

$$\frac{\partial E}{\partial b} = -2 \sum_{i=1}^{N} x_i \left[ y_i - (a + b x_i) \right] = 0. \tag{14}$$

Dropping the factor of 2 and rewriting, we get

$$a N + b \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} y_i, \qquad a \sum_{i=1}^{N} x_i + b \sum_{i=1}^{N} x_i^2 = \sum_{i=1}^{N} x_i y_i. \tag{15}$$

Solving these equations for $a$ and $b$ yields

$$b = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - \left( \sum x_i \right)^2}, \qquad a = \bar{y} - b \bar{x}. \tag{16}$$
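A direct transcription of (15)-(16), as a sketch (the function name is ours):

```python
import numpy as np

def least_squares_line(x, y):
    """Closed-form least-squares line y = a + b*x, per the normal equations (15)-(16)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    N = len(x)
    b = (N * np.sum(x * y) - np.sum(x) * np.sum(y)) / (N * np.sum(x**2) - np.sum(x)**2)
    a = y.mean() - b * x.mean()
    return a, b

x = np.linspace(0.0, 9.0, 10)
y = 3.0 + 2.0 * x + np.random.default_rng(1).normal(0.0, 0.1, 10)
print(least_squares_line(x, y))   # approximately (3.0, 2.0)
```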

4. The Autocorrelation Term Method for $n = 1$

For $n = 1$, system (5) becomes

$$\sum_{i=1}^{N-1} d_i d_{i+1} = \sum_{i=1}^{N-1} (a_0 + a_1 x_i)(a_0 + a_1 x_{i+1}). \tag{17}$$

Suppressing the summation limits, we can write

$$\sum d_i d_{i+1} = a_0^2 (N - 1) + a_0 a_1 \sum (x_i + x_{i+1}) + a_1^2 \sum x_i x_{i+1}. \tag{18}$$

Equation (9) becomes

$$\bar{d} = a_0 + a_1 \bar{x}. \tag{19}$$

Solving (19) for $a_0$ yields

$$a_0 = \bar{d} - a_1 \bar{x}. \tag{20}$$

Squaring yields

$$a_0^2 = \bar{d}^2 - 2 a_1 \bar{x} \bar{d} + a_1^2 \bar{x}^2. \tag{21}$$

Now substitute (20) and (21) into (18) and organize the terms:

$$a_1^2 \left[ \bar{x}^2 (N - 1) - \bar{x} \sum (x_i + x_{i+1}) + \sum x_i x_{i+1} \right] + a_1 \bar{d} \left[ \sum (x_i + x_{i+1}) - 2 \bar{x} (N - 1) \right] + \left[ \bar{d}^2 (N - 1) - \sum d_i d_{i+1} \right] = 0. \tag{22}$$

This is in the form of a quadratic equation that we can write as

$$\alpha a_1^2 + \beta a_1 + \gamma = 0, \tag{23}$$

where, with reference to (22), $\alpha$ is the coefficient of $a_1^2$, $\beta$ is the coefficient of $a_1$, and $\gamma$ is the remaining constant term.

The solutions are the roots

$$a_1 = \frac{-\beta \pm \sqrt{\beta^2 - 4 \alpha \gamma}}{2 \alpha}; \tag{24}$$

$a_0$ is found using (20).
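The following sketch implements (17)-(24) as reconstructed above; the function name and the ordering of the returned pairs are ours:

```python
import numpy as np

def autocorr_line_fit(x, d):
    """Degree-1 autocorrelation term fit, following (18)-(24).
    Returns both candidate (a0, a1) pairs; per the paper, the better pair is
    chosen by inspection or by comparing total squared errors."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    N = len(d)
    x_bar, d_bar = x.mean(), d.mean()

    s_xpx = np.sum(x[:-1] + x[1:])          # sum (x_i + x_{i+1})
    s_xx1 = np.sum(x[:-1] * x[1:])          # sum x_i x_{i+1}
    s_dd1 = np.sum(d[:-1] * d[1:])          # sum d_i d_{i+1}

    alpha = x_bar**2 * (N - 1) - x_bar * s_xpx + s_xx1   # coefficient of a1^2 in (22)
    beta = d_bar * (s_xpx - 2.0 * x_bar * (N - 1))       # coefficient of a1 in (22)
    gamma = d_bar**2 * (N - 1) - s_dd1                   # constant term in (22)

    sq = np.sqrt(beta**2 - 4.0 * alpha * gamma)          # discriminant, per (24)
    return [(d_bar - a1 * x_bar, a1)                     # a0 from (20)
            for a1 in ((-beta + sq) / (2 * alpha), (-beta - sq) / (2 * alpha))]
```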

5. The Autocorrelation Term Method for Higher Values of $n$

The autocorrelation term method for curve fitting with an $n$th-degree polynomial requires the solution of one linear equation and $n$ quadratic equations with $n + 1$ variables, which reduces to $n$ quadratic equations with $n$ variables. This can produce $2^n$ solutions [6]. For example, consider the $n = 2$ case. System (5) can be written as

$$\sum_i d_i d_{i+k} = \sum_i (a_0 + a_1 x_i + a_2 x_i^2)(a_0 + a_1 x_{i+k} + a_2 x_{i+k}^2), \qquad k = 1, 2. \tag{25}$$

Equation (9) can be written as

$$\bar{d} = a_0 + a_1 \bar{x} + a_2 \overline{x^2}. \tag{26}$$

From (26) we can write

$$a_0 = \bar{d} - a_1 \bar{x} - a_2 \overline{x^2}. \tag{27}$$

Substitute (27) into (25):

$$\sum_i d_i d_{i+k} = \sum_i \left[ \bar{d} + a_1 (x_i - \bar{x}) + a_2 (x_i^2 - \overline{x^2}) \right] \left[ \bar{d} + a_1 (x_{i+k} - \bar{x}) + a_2 (x_{i+k}^2 - \overline{x^2}) \right], \qquad k = 1, 2. \tag{28}$$

Equations (28) are two quadratic equations in $a_1$ and $a_2$. This can produce four solutions. In general, these types of systems can be solved with a Gröbner basis computed by Buchberger's algorithm [7].
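A sketch of the $n = 2$ case using sympy to solve the polynomial system (28); the function name is ours, and sympy's general solver (which relies on Gröbner-basis techniques for polynomial systems) stands in for a hand-rolled Buchberger implementation:

```python
import numpy as np
import sympy as sp

def autocorr_quadratic_fit(x, d):
    """Degree-2 autocorrelation term fit: builds the two quadratics (28)
    in a1, a2 and solves them symbolically (up to four solutions)."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    d_bar, x_bar, x2_bar = d.mean(), x.mean(), (x**2).mean()

    a1, a2 = sp.symbols('a1 a2')
    # f(x_i) with (27) already substituted for a0.
    g = [d_bar + a1 * (xi - x_bar) + a2 * (xi**2 - x2_bar) for xi in x]

    eqs = []
    for k in (1, 2):                        # lags k = 1, 2 give the pair (28)
        lhs = float(np.sum(d[:-k] * d[k:]))
        rhs = sp.expand(sum(g[i] * g[i + k] for i in range(len(x) - k)))
        eqs.append(sp.Eq(rhs, lhs))

    sols = sp.solve(eqs, [a1, a2])          # may include complex-valued pairs
    return [(d_bar - s1 * x_bar - s2 * x2_bar, s1, s2) for s1, s2 in sols]
```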

6. A Comparison of the Autocorrelation Term Method to Linear Least Squares

We have constructed synthetic data consisting of the sum of a line and a random variable. The line has an intercept of fifteen and a slope of five. The random variable is Gaussian with a zero mean. The amplitude of the random variable is twenty. There are a total of one hundred equally spaced data points. Figure 1 shows the data, the least-squares fit, and the autocorrelation term fit for the positive root of (23). Figure 2 shows the data, the least-squares fit, and the autocorrelation term fit for the negative root of (23). In this example, we find that the quadratic roots are negatives of one another. This occurs because $\beta = 0$, which implies that $a_1 = \pm \sqrt{-\gamma / \alpha}$. This follows from (22), which gives

$$\beta = \bar{d} \left[ \sum_{i=1}^{N-1} (x_i + x_{i+1}) - 2 \bar{x} (N - 1) \right]. \tag{29}$$

This becomes

$$\beta = \bar{d} \left[ 2 N \bar{x} - x_1 - x_N - 2 \bar{x} (N - 1) \right], \tag{30}$$

which reduces to

$$\beta = \bar{d} \left[ 2 \bar{x} - (x_1 + x_N) \right]. \tag{31}$$

For equally spaced points, $\bar{x} = (x_1 + x_N)/2$, so $\beta = 0$.
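A sketch of this experiment (taking the stated amplitude of twenty to be the standard deviation of the Gaussian noise, and choosing an arbitrary x-range and seed, since neither is specified here):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: intercept 15, slope 5, zero-mean Gaussian noise,
# one hundred equally spaced points.
N = 100
x = np.linspace(0.0, 10.0, N)
d = 15.0 + 5.0 * x + rng.normal(0.0, 20.0, N)

# Least-squares line via numpy (coefficients returned highest degree first).
b_ls, a_ls = np.polyfit(x, d, 1)

# Autocorrelation term method: the coefficients of (22)-(23).
x_bar, d_bar = x.mean(), d.mean()
alpha = x_bar**2 * (N - 1) - x_bar * np.sum(x[:-1] + x[1:]) + np.sum(x[:-1] * x[1:])
beta = d_bar * (np.sum(x[:-1] + x[1:]) - 2.0 * x_bar * (N - 1))   # ~0 for equal spacing
gamma = d_bar**2 * (N - 1) - np.sum(d[:-1] * d[1:])

roots = np.roots([alpha, beta, gamma]).real   # the two candidate slopes, per (24)

# Select the root with the smaller total squared error.
best = min(roots, key=lambda a1: np.sum((d - (d_bar - a1 * x_bar) - a1 * x) ** 2))
print("least squares:   a =", a_ls, " b =", b_ls)
print("autocorrelation: a0 =", d_bar - best * x_bar, " a1 =", best)
```

Because the points are equally spaced, $\beta$ vanishes (up to rounding) and the two candidate slopes are negatives of one another, matching the discussion above.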

7. Conclusions

We have derived a method for computing polynomial curve fits to data based on terms from the autocorrelation function. The method produces $n$ quadratic equations with $n$ variables for a polynomial of degree $n$. The solutions of this system produce a maximum of $2^n$ curves. Each solution is consistent with a total error between the polynomial and the data that is equal to zero. The proper curve can be selected by either visual examination or measurement of the total squared error. We tested this method with a linear curve fit and compared it to linear least squares. There is negligible difference between the comparable solutions.

Acknowledgment

Discussions with Gwen Houston and Dominique Lueckenhoff are greatly appreciated.