Research Article  Open Access
Robust Modeling of Low-Cost MEMS Sensor Errors in Mobile Devices Using Fast Orthogonal Search
Abstract
Accessibility to inertial navigation systems (INS) has historically been severely limited by cost. The introduction of low-cost, microelectromechanical-system-based INS, integrated with GPS to provide a reliable positioning solution, has enabled more widespread use in mobile devices. The random errors of MEMS inertial sensors may deteriorate the overall system accuracy in mobile devices. These errors are modeled stochastically and are included in the error model of the estimation technique used, such as the Kalman filter or the Particle filter. A first-order Gauss-Markov model is usually used to describe the stochastic nature of these errors. However, examination of the autocorrelation sequences of these random components shows that a first-order Gauss-Markov model is not adequate to describe such stochastic behavior. A robust modeling technique based on fast orthogonal search is introduced to remove the errors of MEMS-based inertial sensors inside mobile devices used for several location-based services. The proposed method is applied to MEMS-based gyroscopes and accelerometers. Results show that the proposed method models low-cost MEMS sensor errors with no need for denoising techniques, using a smaller model order and less computation, and outperforms traditional methods by two orders of magnitude.
1. Introduction
Presently, GPS-enabled mobile devices offer various positioning capabilities to pedestrians, drivers, and cyclists. GPS provides absolute positioning information, but when signal reception is attenuated and becomes unreliable due to multipath, interference, and signal blockage, augmentation of GPS with inertial navigation systems (INS) or the like is needed. INS is inherently immune to the jamming, spoofing, and blockage vulnerabilities of GPS, but the accuracy of INS is significantly affected by the error characteristics of the inertial sensors it employs [1].
GPS/INS integrated navigation systems are extensively used [2], for example, in mobile devices, which favor low-cost microelectromechanical-system (MEMS) inertial sensors (gyroscopes and accelerometers) for their low cost, low power consumption, small size, and portability. The inadequate long-term performance of most commercially available MEMS-based INS limits their usefulness in providing reliable navigation solutions. MEMS sensors are challenging in any consumer navigation system because of their large errors, extreme stochastic variance, and quickly changing error characteristics.
According to [3], the inertial sensor errors of a low-cost INS consist of two parts: a deterministic part and a random part. The deterministic part includes biases and scale factors, which are determined by calibration and then removed from the raw measurements. The random part is correlated over time and is mainly due to variations in the INS sensor bias terms. These errors are mathematically integrated during the INS mechanization process, which results in increasingly inaccurate position and attitude over time. Therefore, these errors must be modeled.
The fusion of INS and GPS data is a highly synergistic coupling: INS can provide reliable short-term positioning information during GPS outages, while GPS can correct for longer-term INS errors [1]. INS and GPS integration (i.e., data fusion) is typically achieved through an optimal estimation technique, such as the Kalman filter or the Particle filter [4].
Despite having an INS/GPS integration algorithm to correct for INS errors, it is still advantageous to have an accurate INS solution before the data fusion process. This requires preprocessing (i.e., prefiltering or denoising) each of the inertial sensor (gyroscope and accelerometer) signals before they are used to compute position, velocity, and attitude. This paper offers a robust method based on fast orthogonal search (FOS) to model the stochastic errors of low-cost MEMS sensors for smart mobile phones.
Orthogonal search [5] is a technique developed for identifying difference-equation and functional-expansion models by orthogonalizing over the actual data record. It mainly utilizes Gram-Schmidt orthogonalization to create a series of orthogonal functions from a given set of arbitrary functions. This enables signal representation by a functional expansion of arbitrary functions and therefore provides a wider selection of candidate functions that can be used to represent the signal. FOS is a variant of orthogonal search [6]; one major difference is that FOS achieves orthogonal identification without creating orthogonal functions at any stage of the process. As a result, FOS is many times faster and far less memory-intensive than the earlier technique, while equally accurate and robust [5–7].
Many techniques have been used previously to denoise and stochastically model the inertial sensor errors [3, 8, 9]. For example, several levels of wavelet decomposition have been used to denoise the raw INS data and eliminate high-frequency disturbances [3, 8, 9]. Modeling inertial sensor errors using autoregressive (AR) models was performed in [3], where the Yule-Walker, covariance, and Burg AR methods were used. The AR model parameters were estimated after reducing the INS sensor measurement noise using wavelet denoising techniques.
FOS has been applied before in several applications [5–7, 10–12]. In [13], FOS was used to augment a Kalman filter (KF) to enhance the accuracy of a low-cost 2D MEMS-based navigation system by modeling only the azimuth error. FOS is used in this paper to model the raw MEMS gyroscope and accelerometer measurement errors in the time domain, and its performance is compared to that of linear modeling techniques, namely the Yule-Walker, covariance, and Burg AR methods, in terms of mean-square error (MSE) and computational time.
2. Problem Statement
It is generally accepted that the long-term errors are modeled as correlated random noise, which is typically characterized by an exponentially decaying autocorrelation function with a finite correlation time. When the autocorrelation function of some of the noise sequences of MEMS measurements is studied, it can be shown that a first-order Gauss-Markov (GM) process may not be adequate in all cases to model such noise behavior. The shape of the autocorrelation sequence is often different from that of a first-order GM process, which is represented by a decaying exponential as shown in Figure 1. The GM process is characterized by an autocorrelation function of the form $R(\tau) = \sigma^2 e^{-\beta|\tau|}$, where $\sigma^2$ is the variance of the process and the correlation time (the $1/e$ point) is given by $1/\beta$. The autocorrelation function approaches zero as $\tau \to \infty$, as depicted in Figure 1, indicating that the process gradually becomes less and less correlated as the time separation between samples increases [1].
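The exponential autocorrelation of a first-order GM process is easy to verify numerically. The following sketch (assuming numpy is available; the parameter values are arbitrary illustrations, not sensor data) simulates a discrete first-order GM process and estimates its autocorrelation, which should follow $\sigma^2 e^{-\beta\tau}$:

```python
import numpy as np

def gauss_markov(n, beta, sigma, dt=1.0, seed=0):
    """Simulate a first-order Gauss-Markov process x(k) = e^{-beta*dt} x(k-1) + w(k),
    with the white noise w scaled so that the stationary variance is sigma^2."""
    rng = np.random.default_rng(seed)
    phi = np.exp(-beta * dt)
    w = rng.normal(0.0, sigma * np.sqrt(1.0 - phi**2), n)
    x = np.zeros(n)
    for k in range(1, n):
        x[k] = phi * x[k - 1] + w[k - 1]
    return x

def autocorr(x, max_lag):
    """Biased sample autocorrelation R(tau) for tau = 0..max_lag."""
    x = x - x.mean()
    n = len(x)
    return np.array([np.dot(x[:n - t], x[t:]) / n for t in range(max_lag + 1)])

x = gauss_markov(200_000, beta=0.05, sigma=1.0)
R = autocorr(x, 100)
# Empirically R(tau) ≈ sigma^2 * exp(-beta*tau); at tau = 1/beta = 20, R/R(0) ≈ 1/e.
```

A measured MEMS noise sequence whose sample autocorrelation does not follow this single decaying exponential is the indication, discussed next, that a higher-order model is needed.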
Most of the computed autocorrelation sequences follow higher-order GM processes. An example of such a computed autocorrelation sequence, for one hour of static data of a MEMS accelerometer, is shown in Figure 2; it clearly represents a higher-order GM process. These higher-order GM processes can be modeled using an autoregressive (AR) process of an appropriate order. In [3], it was decided to model the randomness of the inertial sensor measurements using an AR process of order higher than one. With the computational efficiency of present microprocessor systems, efficient modeling of MEMS residual biases can be realized, and thus accurate prediction and estimation of such errors can be provided.
3. Modeling Methods of AR Processes
Autoregressive moving average (ARMA) modeling is based on the mathematical modeling of a time series of measurements, assuming that each value of the series depends on (a) a weighted sum of the "previous" values of the same series (AR part) and (b) a weighted sum of the "present and previous" values of a different time series (MA part) [14]. The ARMA process can be described using a pole-zero (ARMA) transfer function of the form
$$H(z) = \frac{Y(z)}{X(z)} = \frac{\sum_{k=0}^{q} b_k z^{-k}}{1 + \sum_{k=1}^{p} a_k z^{-k}}, \tag{1}$$
where $X(z)$ is the $z$-transform of the input $x(n)$, $Y(z)$ is the $z$-transform of the output $y(n)$, $p$ is the order of the AR process, $q$ is the order of the MA process, and $a_k$ and $b_k$ are the AR and MA process parameters (weights), respectively. The AR process is a special case of an ARMA process, where $q$ in (1) is zero, so that $H(z)$ reduces to the all-pole transfer function
$$H(z) = \frac{b_0}{1 + \sum_{k=1}^{p} a_k z^{-k}}. \tag{2}$$
The name "autoregressive" comes from the fact that each signal sample is regressed on (or predicted from) the previous values of itself [3]. In the time domain, the corresponding AR input-output relationship is obtained by applying the inverse $z$-transform to (2).
The resultant equation is written as [14]
$$y(n) = -\sum_{k=1}^{p} a_k\, y(n-k) + b_0\, x(n). \tag{3}$$
This input-output relationship in both the frequency and time domains is shown in Figure 3.
The problem in this case is to determine the values of the AR model parameters (predictor coefficients) $a_k$ that optimally represent the random part of the inertial sensor biases. This is performed by minimizing the error $e(n)$ between the original signal $y(n)$, represented by the "AR process" of (3), and the estimated signal $\hat{y}(n)$, which is estimated by an "AR model" of the form [8]
$$\hat{y}(n) = -\sum_{k=1}^{p} \hat{a}_k\, y(n-k). \tag{4}$$
The cost function for this minimization problem is the energy of $e(n) = y(n) - \hat{y}(n)$, which is given as
$$E = \sum_{n=1}^{N} e^2(n), \tag{5}$$
where $N$ is the total number of data samples. In this case, $x(n)$ is a stationary uncorrelated sequence (white noise) with zero mean and unit variance.
Therefore, the resultant energy from (5) will be $E \approx N\sigma^2$, so that $\sigma^2 = E/N$ represents the estimated variance of the white noise input to the AR model or, more generally, the prediction mean-square error. This approximation holds because the AR model order $p$ is completely negligible with respect to the MEMS data sample size $N$.
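The time-domain AR relation (3) is straightforward to simulate. The following minimal sketch (the coefficients and sample count are arbitrary illustrations chosen to give a stable process, not values from the paper) generates an AR(2) process driven by zero-mean, unit-variance white noise, in the same sign convention as above:

```python
import numpy as np

def ar_process(a, b0, n, seed=1):
    """Generate y(n) = -sum_k a_k * y(n-k) + b0 * x(n), where x(n) is
    zero-mean, unit-variance white noise, as in the AR process equation."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)
    p = len(a)
    y = np.zeros(n)
    for i in range(n):
        acc = b0 * x[i]
        for k in range(1, min(i, p) + 1):
            acc -= a[k - 1] * y[i - k]
        y[i] = acc
    return y

# Stable AR(2) illustration: y(n) = 0.75 y(n-1) - 0.5 y(n-2) + x(n).
y = ar_process(a=[-0.75, 0.5], b0=1.0, n=5000)
```

This synthetic process (output variance larger than the unit-variance driving noise, reflecting the correlation introduced by the poles) is the kind of signal the estimation methods below must invert: given only `y`, recover the coefficients `a`.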
Several methods have been reported to estimate the parameter values by fitting an AR model to the input data. Three AR methods are considered in this paper, namely, the YuleWalker method, the covariance method, and Burg’s method. In principle, all of these estimation techniques should lead to approximately the same parameter values if fairly large data samples are used [3].
3.1. The Yule-Walker Method
The Yule-Walker method, which is also known as the autocorrelation method, first determines the autocorrelation sequence of the input signal (the inertial sensor residual bias in our case). Then, the AR model parameters are optimally computed by solving a set of linear normal equations. These normal equations are obtained using the formula [15]
$$\sum_{k=1}^{p} a_k\, R(i-k) = -R(i), \quad i = 1, 2, \ldots, p, \tag{6}$$
which leads to the following set of normal equations:
$$\begin{bmatrix} R(0) & R(1) & \cdots & R(p-1) \\ R(1) & R(0) & \cdots & R(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ R(p-1) & R(p-2) & \cdots & R(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = -\begin{bmatrix} R(1) \\ R(2) \\ \vdots \\ R(p) \end{bmatrix}, \tag{7}$$
where $R(i)$ is the autocorrelation of $y(n)$ at lag $i$.
If the mean-square error is also required, it can be determined by
$$\sigma^2 = R(0) + \sum_{k=1}^{p} a_k\, R(k). \tag{8}$$
Combining the normal equations with the mean-square error expression gives the augmented form
$$\begin{bmatrix} R(0) & R(1) & \cdots & R(p) \\ R(1) & R(0) & \cdots & R(p-1) \\ \vdots & \vdots & \ddots & \vdots \\ R(p) & R(p-1) & \cdots & R(0) \end{bmatrix} \begin{bmatrix} 1 \\ a_1 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} \sigma^2 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \tag{9a}$$
Equations (7) and (9a) are known as the Yule-Walker equations [7–9, 13]. Instead of solving (9a) directly (i.e., by first inverting the autocorrelation matrix), it can efficiently be solved using the Levinson-Durbin (LD) algorithm, which proceeds recursively to compute the reflection coefficients $k_m$, the coefficients $a_m(i)$, and the error energies $E_m$. The LD algorithm is an iterative technique that computes the next prediction coefficient (AR parameter) from the previous ones. This LD recursive procedure can be summarized as follows [9]:
$$E_0 = R(0), \tag{9b}$$
$$k_m = -\frac{R(m) + \sum_{i=1}^{m-1} a_{m-1}(i)\, R(m-i)}{E_{m-1}}, \tag{9c}$$
$$a_m(m) = k_m, \tag{9d}$$
$$a_m(i) = a_{m-1}(i) + k_m\, a_{m-1}(m-i), \quad i = 1, \ldots, m-1, \tag{9e}$$
$$E_m = \left(1 - k_m^2\right) E_{m-1}. \tag{9f}$$
Equations (9b)–(9f) are solved recursively for $m = 1, 2, \ldots, p$, and the final solution for the AR parameters is provided by $a_k = a_p(k)$, $k = 1, \ldots, p$.
Therefore, the values of the AR prediction coefficients in the Yule-Walker method are provided directly based on minimizing the forward prediction error in the least-squares sense. The intermediate quantities $k_m$ given by (9c) are known as the reflection coefficients. In (9f), both energies $E_m$ and $E_{m-1}$ are positive, and thus the magnitude of $k_m$ must be less than one to guarantee the stability of the all-pole filter.
However, the Yule-Walker method performs adequately only for long data records [15]. The inadequate performance for short data records is usually due to the data windowing applied by the Yule-Walker algorithm. Moreover, the Yule-Walker method may introduce a large bias in the AR-estimated coefficients, since it does not guarantee a stable solution of the model [16].
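The Levinson-Durbin recursion described above can be sketched compactly. The following is a plain numpy translation (the autocorrelation values at the end are an arbitrary illustrative sequence, not sensor data); it returns the same coefficients as directly solving the Toeplitz normal equations:

```python
import numpy as np

def levinson_durbin(R, p):
    """Solve the Yule-Walker normal equations for an AR(p) model from the
    autocorrelation sequence R(0..p) by the Levinson-Durbin recursion.
    Returns the coefficients a_1..a_p and the prediction error energy E_p."""
    a = np.zeros(p + 1)
    a[0] = 1.0
    E = R[0]                                      # E_0 = R(0)
    for m in range(1, p + 1):
        # Reflection coefficient k_m.
        k = -(R[m] + np.dot(a[1:m], R[m - 1:0:-1])) / E
        a_new = a.copy()
        a_new[m] = k                              # a_m(m) = k_m
        a_new[1:m] = a[1:m] + k * a[m - 1:0:-1]   # order update of a_m(i)
        a = a_new
        E *= 1.0 - k * k                          # E_m = (1 - k_m^2) E_{m-1}
    return a[1:], E

# Illustrative autocorrelation sequence (not from real sensor data).
R = np.array([2.0, 1.2, 0.6, 0.2])
a_hat, E = levinson_durbin(R, 3)
```

Each iteration costs $O(m)$ operations, so the whole recursion is $O(p^2)$ rather than the $O(p^3)$ of a general linear solve, which is exactly the efficiency advantage the text attributes to the LD algorithm.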
3.2. The Covariance Method
The covariance method is similar to the Yule-Walker method in that it minimizes the forward prediction error in the least-squares sense, but it does not apply any windowing to the data. Instead, the windowing is performed with respect to the prediction error to be minimized. Therefore, the AR model obtained by this method is typically more accurate than the one obtained from the Yule-Walker method [17].
Furthermore, it uses the covariance $c(i,k)$ of the data instead of the autocorrelation $R(i-k)$. In this case, the Toeplitz structure of the normal equations used in the autocorrelation method is lost, and hence the LD algorithm cannot be used for the computations. To achieve an efficient solution in this case, Cholesky factorization is usually utilized [15].
The method provides more accurate estimates than the Yule-Walker method, especially for short data records. However, the covariance method may lead to unstable AR models, since the LD algorithm is not used for solving the covariance normal equations [18].
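A minimal sketch of the covariance method's core idea: the forward prediction error is minimized over the unwindowed data range by least squares. Here numpy's general least-squares solver stands in for the Cholesky factorization mentioned above (the simulated AR(2) signal and its coefficients are illustrative only):

```python
import numpy as np

def ar_covariance(y, p):
    """Covariance-method AR(p) fit: minimize the unwindowed forward prediction
    error e(n) = y(n) + sum_k a_k y(n-k) over n = p..N-1 in the least-squares
    sense (via lstsq here, rather than Cholesky factorization)."""
    N = len(y)
    # Design matrix of past samples; no zero-padding, hence no data windowing.
    A = np.column_stack([y[p - k:N - k] for k in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(A, -y[p:], rcond=None)
    mse = np.mean((y[p:] + A @ a) ** 2)
    return a, mse

# Illustrative data: AR(2) process y(n) = 0.75 y(n-1) - 0.5 y(n-2) + x(n).
rng = np.random.default_rng(2)
x = rng.normal(size=20_000)
y = np.zeros_like(x)
for n in range(2, len(x)):
    y[n] = 0.75 * y[n - 1] - 0.5 * y[n - 2] + x[n]
a_hat, mse = ar_covariance(y, p=2)
# a_hat should be close to [-0.75, 0.5] in the sign convention of (3),
# and mse close to the unit variance of the driving white noise.
```

Because only observed samples enter the design matrix, no implicit zeros are prepended to the record; this is precisely the absence of data windowing that makes the covariance method more accurate than Yule-Walker on short records.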
3.3. Burg’s Method
Burg’s method was introduced in 1967 to overcome most of the drawbacks of the other AR modeling techniques by providing both stable models and high resolution (i.e., more accurate estimates) for short data records [19]. Burg’s method tries to make maximum use of the data by defining both forward and backward prediction error terms, $f_m(n)$ and $b_m(n)$. The energy to be minimized in this case is the sum of both the forward and backward prediction error energies; that is,
$$E_m = \sum_{n} f_m^2(n) + \sum_{n} b_m^2(n),$$
where $f_m(n)$ and $b_m(n)$ are defined as
$$f_m(n) = y(n) + \sum_{k=1}^{m} a_m(k)\, y(n-k),$$
$$b_m(n) = y(n-m) + \sum_{k=1}^{m} a_m(k)\, y(n-m+k).$$
The forward and backward prediction error criteria are the same, and hence they have the same optimal solution for the model coefficients [20]. Considering the energies in (9f) to be $E_m$, the forward and backward prediction errors can therefore be expressed recursively as
$$f_m(n) = f_{m-1}(n) + k_m\, b_{m-1}(n-1), \tag{12a}$$
$$b_m(n) = b_{m-1}(n-1) + k_m\, f_{m-1}(n). \tag{12b}$$
These recursion formulas form the basis of what is called the lattice (or ladder) realization of prediction error filtering (see Figure 4).
As has been shown for the Yule-Walker method, the accuracy of the estimated parameters $k_m$, $a_m(i)$, and $E_m$ depends mainly on accurate estimates of the autocorrelation sequence $R(i)$. However, this can rarely be achieved due to the prewindowing of the data [17] or the existence of large measurement noise [21]. To avoid the difficulties of computing the autocorrelation sequences, Burg in his method first estimates the reflection coefficients $k_m$ using another formula instead of (9c). This formula is derived by substituting (12a) and (12b) into the energy expression and setting the derivative of $E_m$ with respect to $k_m$ (instead of $a_m(i)$ in the Yule-Walker and covariance methods) to zero.
This leads to the form
$$k_m = \frac{-2\sum_{n} f_{m-1}(n)\, b_{m-1}(n-1)}{\sum_{n} \left[f_{m-1}^2(n) + b_{m-1}^2(n-1)\right]}, \tag{13}$$
which shows clearly that the magnitude of $k_m$ is forced (guaranteed) to be less than one, and thus the obtained model is guaranteed to be stable. Equations (12a), (12b), and (13) form the recursive structure of Burg’s lattice filter, shown in Figure 4, with the initial conditions $f_0(n) = b_0(n) = y(n)$. Finally, the prediction coefficients $a_m(i)$ are obtained by constraining them to satisfy (9e) of the LD algorithm.
Therefore, the utilization of (9e) and (13) together will always ensure the stability of Burg’s method solution [22]. Moreover, minimizing both the forward and backward prediction errors usually yields better estimates than the forward-only prediction approach used in the previous two methods. Finally, it has been reported in [23] that Burg’s method generally provides better residual estimates than the Yule-Walker method [19].
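Burg's lattice recursion translates almost line-for-line into code. The sketch below is a plain implementation of (12a), (12b), and (13) with the LD order update, not an optimized library routine; the AR(2) test signal is the same kind of arbitrary illustration used earlier:

```python
import numpy as np

def ar_burg(y, p):
    """Burg's method: estimate each reflection coefficient k_m from the forward
    and backward prediction errors (13), update the errors with the lattice
    recursions (12a)-(12b), and build the AR coefficients with the
    Levinson-Durbin order update.  Initial conditions: f_0 = b_0 = y."""
    f = np.asarray(y, dtype=float)
    b = f.copy()
    a = np.array([1.0])
    E = np.mean(f ** 2)                       # prediction error energy E_0
    for _ in range(p):
        fp, bp = f[1:], b[:-1]                # align f_{m-1}(n) with b_{m-1}(n-1)
        k = -2.0 * np.dot(fp, bp) / (np.dot(fp, fp) + np.dot(bp, bp))
        f, b = fp + k * bp, bp + k * fp       # lattice recursions (12a)-(12b)
        a = np.concatenate([a, [0.0]])
        a = a + k * a[::-1]                   # LD order update; |k| < 1 by (13)
        E *= 1.0 - k * k                      # E_m = (1 - k_m^2) E_{m-1}
    return a[1:], E

# Illustrative data: AR(2) process y(n) = 0.75 y(n-1) - 0.5 y(n-2) + x(n).
rng = np.random.default_rng(5)
x = rng.normal(size=20_000)
y = np.zeros_like(x)
for n in range(2, len(x)):
    y[n] = 0.75 * y[n - 1] - 0.5 * y[n - 2] + x[n]
a_hat, E = ar_burg(y, p=2)
```

Note that no autocorrelation sequence is ever formed: each $k_m$ comes directly from the error sequences, which is how Burg's method sidesteps the windowing and noise-sensitivity issues of estimating $R(i)$.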
4. Fast Orthogonal Search (FOS) Method
FOS [5–7, 10–12] is a general-purpose modeling technique, which can be applied to spectral estimation and time-frequency analysis. The algorithm uses an arbitrary set of nonorthogonal candidate functions and finds a functional expansion of an input in order to minimize the mean-square error (MSE) between the input and the functional expansion.
The functional expansion of the input $y(n)$ in terms of the arbitrary candidate functions is given by
$$y(n) = \sum_{m=0}^{M} a_m\, p_m(n) + e(n), \tag{14}$$
where $a_m$ is the set of weights of the functional expansion, $p_m(n)$, $m = 0, 1, \ldots, M$, are the model terms selected from the set of candidate functions, and $e(n)$ is the modeling error.
These model terms can involve the system input and output and cross-products and powers thereof, for example,
$$p_m(n) \in \left\{1,\; y(n-1), \ldots, y(n-K),\; x(n), \ldots, x(n-L),\; y(n-1)\,y(n-2),\; x(n-1)\,y(n-1),\; \ldots\right\}.$$
Because the candidate functions are nonorthogonal, there is no unique solution for (14). However, FOS may model the input with fewer model terms than an orthogonal functional expansion [11]. For the FFT to model a frequency that does not have an integral number of periods in the record length, energy is spread into all the other frequencies, a phenomenon known as spectral leakage [24]. By using candidate functions that are nonorthogonal, FOS may be able to model such a frequency lying between two FFT bins with a single term, resulting in many fewer weighting terms in the model [5, 25].
FOS begins by creating a functional expansion using orthogonal basis functions such that
$$y(n) = \sum_{m=0}^{M} g_m\, w_m(n) + e(n), \tag{15}$$
where $w_m(n)$ is a set of orthogonal functions derived from the candidate functions $p_m(n)$, $g_m$ is the corresponding weight, and $e(n)$ is an error term. The orthogonal functions $w_m(n)$ are derived from the candidate functions using the Gram-Schmidt (GS) orthogonalization algorithm; they are implicitly defined by the Gram-Schmidt coefficients and do not need to be computed point-by-point.
The Gram-Schmidt coefficients $\alpha_{mr}$ and the orthogonal weights $g_m$ can be found recursively using the equations [11]
$$\alpha_{mr} = \frac{\overline{p_m(n)\, w_r(n)}}{\overline{w_r^2(n)}}, \qquad g_m = \frac{\overline{y(n)\, w_m(n)}}{\overline{w_m^2(n)}}, \tag{16}$$
where the overbar denotes the time average over the data record. In its last stage, FOS calculates the weights $a_m$ of the original functional expansion (14) from the weights $g_m$ of the orthogonal series expansion and the Gram-Schmidt coefficients $\alpha_{mr}$. The value of $a_m$ can be found recursively using
$$a_m = \sum_{i=m}^{M} g_i\, v_i, \tag{17}$$
where $v_m = 1$ and
$$v_i = -\sum_{r=m}^{i-1} \alpha_{ir}\, v_r, \quad i = m+1, \ldots, M.$$
FOS requires the calculation of the correlations between the candidate functions and of the correlations between the input and the candidate functions. The correlation between the input and each candidate function is typically calculated point-by-point once at the start of the algorithm and then stored for later quick retrieval.
The MSE of the orthogonal function expansion has been shown to be [5, 6, 11]
$$\overline{e^2(n)} = \overline{y^2(n)} - \sum_{m=0}^{M} g_m^2\, \overline{w_m^2(n)}. \tag{18}$$
It then follows that the MSE reduction given by the $m$th candidate function is
$$Q_m = g_m^2\, \overline{w_m^2(n)}. \tag{19}$$
The candidate with the greatest value of $Q_m$ is selected as the next model term, but optionally its addition to the model may be made subject to its value exceeding a threshold level [5, 6, 11]. The residual MSE after the addition of each term can be computed by
$$\text{MSE} = \overline{y^2(n)} - \sum_{m=0}^{M} Q_m. \tag{20}$$
The search algorithm may be stopped when an acceptably small residual MSE has been achieved (e.g., a given ratio of the MSE to the mean-squared value of the input [12], or an acceptably small percentage of the variance of the time series being modeled [5]). The search may also stop when a given number of terms have been fitted. Another stopping criterion is that none of the remaining candidates can yield a sufficient MSE reduction, that is, none would reduce the MSE by more than the addition of a white Gaussian noise series would.
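The greedy term selection at the heart of FOS can be illustrated with a compact sketch. For clarity it orthogonalizes each candidate explicitly, whereas FOS proper computes the same weights and MSE reductions from stored correlations without ever forming the orthogonal functions point-by-point; the candidate signals below are arbitrary illustrations, not sensor model terms:

```python
import numpy as np

def orthogonal_search(y, candidates, n_terms):
    """Greedy orthogonal search: at each step select the candidate p_m(n) whose
    orthogonalized version w_m(n) gives the largest MSE reduction
    Q_m = g_m^2 * mean(w_m^2).  Explicit Gram-Schmidt is used here for clarity;
    FOS obtains the same selections from correlations alone."""
    N = len(y)
    resid = np.asarray(y, dtype=float).copy()
    basis, chosen = [], []
    for _ in range(n_terms):
        best = None
        for j, cand in enumerate(candidates):
            if j in chosen:
                continue
            w = np.asarray(cand, dtype=float).copy()
            for wb in basis:                       # orthogonalize vs. chosen terms
                w -= (np.dot(w, wb) / np.dot(wb, wb)) * wb
            energy = np.dot(w, w)
            if energy < 1e-12 * N:                 # candidate nearly dependent
                continue
            g = np.dot(y, w) / energy              # orthogonal weight g_m
            Q = g * g * energy / N                 # MSE reduction Q_m
            if best is None or Q > best[0]:
                best = (Q, j, w, g)
        Q, j, w, g = best
        chosen.append(j)
        basis.append(w)
        resid -= g * w                             # remove the fitted term
    return chosen, resid

# Illustrative candidates: two are truly in the signal, two are distractors.
t = np.arange(500)
rng = np.random.default_rng(3)
cands = [np.sin(0.05 * t), np.cos(0.05 * t), np.sin(0.11 * t), t / 500.0]
y = 2.0 * cands[0] - 1.5 * cands[3] + 0.01 * rng.normal(size=500)
chosen, resid = orthogonal_search(y, cands, n_terms=2)
# chosen should contain indices 0 and 3; resid is near the noise floor.
```

Note that the candidates need not be mutually orthogonal (a sine and a ramp are not), yet the search still picks out the two true terms and leaves a residual near the noise level, which is the advantage over a fixed orthogonal basis described above.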
5. Experimental Results
The data were collected by a low-cost MEMS-based inertial measurement unit (IMU CC300, Crossbow) during a one-hour experiment to obtain stochastic error models of both gyroscopes and accelerometers. To illustrate the performance, two sensors were selected as examples (Accelerometer-Y and Gyro-Y); the other inertial sensors gave similar results. Figure 5 shows one hour of Accelerometer-Y and Gyro-Y data sampled at 200 Hz.
FOS is applied directly to the raw 200 Hz inertial sensor data without any preprocessing or denoising. Traditional methods such as Yule-Walker, covariance, and Burg perform poorly on the raw data, so wavelet denoising of up to 4 levels of decomposition was first applied, band-limiting the spectrum of the raw inertial sensor data to 12.5 Hz. Therefore, unlike FOS, the other three methods operate on the denoised version of the same data. After denoising, the AR model parameters and the corresponding prediction MSE were estimated for all sensors using the Yule-Walker, covariance, and Burg methods.
For FOS, the raw INS data were divided into three datasets for the model training, evaluation, and prediction stages. The first 3 minutes of the raw INS data were utilized for model training, in which the FOS algorithm identifies several possibly nonlinear AR equations. Different models can be obtained by changing the maximum delay in the output and the degree of the output cross-products (CP). The next 3 minutes of the data were used for the evaluation stage, in which the candidate models are compared and the best one, fitting the real output with minimum MSE, is chosen (e.g., the FOS model selected for Accelerometer-Y). In the prediction stage, the output and MSE of the chosen model are computed over the remaining (novel) raw INS data. Figure 6 shows the prediction MSE of Accelerometer-Y samples for the Yule-Walker, covariance, Burg, and FOS methods. For FOS, an AR model of order 3 or 4 suffices, and the MSE decreases when the degree of the cross-product terms is raised to 2 (nonlinear model). An AR model of order 7 or 8 is required for the Burg or covariance method and of order 9 or 10 for Yule-Walker. A large AR model order complicates the estimation technique (e.g., the KF) used for INS/GPS integration.
Table 1 summarizes the performance of the three conventional stochastic modeling methods (Yule-Walker, covariance, and Burg) and the proposed FOS-based method with the cross-product degree set to 1 (linear model) and 2 (nonlinear model), for model orders from 1 to 10, over one hour of MEMS Accelerometer-Y measurements. The FOS model is capable of denoising the Accelerometer-Y measurements without appreciable degradation of the original signal. FOS achieves lower MSE and less computational time than the traditional methods, with no need for any denoising techniques. Increasing the cross-product degree to 2 for FOS improves model accuracy and lessens the position error by an order of magnitude, but increases computation time.

A similar procedure was performed for the Gyro-Y sensor measurements. Figure 7 shows the prediction MSE of Gyro-Y samples for the Yule-Walker, covariance, Burg, and FOS methods. The FOS method clearly achieves the minimum MSE with a smaller model order than the other AR methods. Table 2 summarizes the performance of the three conventional stochastic modeling methods (Yule-Walker, covariance, and Burg) and the proposed FOS-based method with the cross-product degree set to 1 (linear model) and 2 (nonlinear model), for model orders from 1 to 10, over one hour of Gyro-Y measurements.

As in the accelerometer case, the stochastic model obtained for Gyro-Y using FOS surpasses the models obtained by the other methods in terms of the MSE, the corresponding position error that would result from the residual errors, and the computation time.
6. Conclusions
Inertial sensor errors are the most significant contributors to INS errors; thus, techniques to model these sensor errors are of interest to researchers. The current state of the art in modeling inertial sensor signals includes low-pass filtering and wavelet denoising techniques, which have had limited success in removing long-term inertial sensor errors.
This paper suggested using FOS to model MEMS sensor errors in the time domain. The FOS MSE and computational time were compared with those of the Yule-Walker, covariance, and Burg methods. FOS was applied directly to the one-hour raw 200 Hz inertial sensor data without any preprocessing or denoising, while the three traditional methods operated on a denoised version of the same data obtained by wavelet denoising with up to 4 levels of decomposition.
For both the gyroscope and the accelerometer, the FOS model surpasses those obtained by the traditional methods. The results demonstrate the advantages of the proposed FOS-based method: no need for preprocessing or denoising, lower computation time and MSE, and better performance with a smaller model order. Increasing the cross-product degree for FOS improves model accuracy and lessens the position error, but increases computation time.
References
[1] A. Noureldin, T. Karamat, and J. Georgy, Fundamentals of Inertial Navigation, Satellite-Based Positioning and Their Integration, Springer, New York, NY, USA, 2012.
[2] M. Grewal, L. Weil, and A. Andrews, Global Positioning Systems, Inertial Navigation and Integration, John Wiley & Sons, New York, NY, USA, 2nd edition, 2007.
[3] S. Nassar, K. Schwarz, N. El-Sheimy, and A. Noureldin, “Modeling inertial sensor errors using autoregressive (AR) models,” Navigation, Journal of the Institute of Navigation, vol. 51, no. 4, pp. 259–268, 2004.
[4] J. Georgy and A. Noureldin, “Vehicle navigator using a mixture particle filter for inertial sensors/odometer/map data/GPS integration,” IEEE Transactions on Consumer Electronics, vol. 58, no. 2, pp. 544–522, 2012.
[5] M. J. Korenberg, “A robust orthogonal algorithm for system identification and time-series analysis,” Biological Cybernetics, vol. 60, no. 4, pp. 267–276, 1989.
[6] M. J. Korenberg, “Fast orthogonal identification of nonlinear difference equation models,” in Proceedings of the 30th Midwest Symposium on Circuits and Systems, vol. 1, August 1987.
[7] J. Armstrong, A. Noureldin, and D. McGaughey, “Application of fast orthogonal search techniques to accuracy enhancement of inertial sensors for land vehicle navigation,” in Proceedings of the Institute of Navigation, National Technical Meeting (NTM '06), pp. 604–614, Monterey, Calif, USA, January 2006.
[8] A. Noureldin, A. Osman, and N. El-Sheimy, “A neuro-wavelet method for multi-sensor system integration for vehicular navigation,” Measurement Science and Technology, vol. 15, no. 2, pp. 404–412, 2004.
[9] S. Nassar, A. Noureldin, and N. El-Sheimy, “Improving positioning accuracy during kinematic DGPS outage periods using SINS/DGPS integration and SINS data denoising,” Survey Review, vol. 37, no. 292, pp. 426–438, 2004.
[10] K. M. Adeney and M. J. Korenberg, “Fast orthogonal search for array processing and spectrum estimation,” IEE Proceedings: Vision, Image and Signal Processing, vol. 141, no. 1, pp. 13–18, 1994.
[11] M. J. Korenberg, “Fast orthogonal algorithms for nonlinear system identification and time-series analysis,” in Advanced Methods of Physiological System Modeling, V. Z. Marmarelis, Ed., vol. 2, pp. 165–177, Plenum Press, New York, NY, USA, 1989.
[12] D. R. McGaughey, M. J. Korenberg, K. M. Adeney, S. D. Collins, and G. J. M. Aitken, “Using the fast orthogonal search with first term reselection to find subharmonic terms in spectral analysis,” Annals of Biomedical Engineering, vol. 31, no. 6, pp. 741–751, 2003.
[13] Z. Shen, J. Georgy, M. J. Korenberg, and A. Noureldin, “FOS-based modelling of reduced inertial sensor system errors for 2D vehicular navigation,” Electronics Letters, vol. 46, no. 4, pp. 298–299, 2010.
[14] S. Haykin, Adaptive Filter Theory, Prentice Hall, Upper Saddle River, NJ, USA, 2002.
[15] L. Jackson, Digital Filters and Signal Processing, Kluwer Academic, Norwell, Mass, USA, 1996.
[16] R. Kless and P. Broersen, How to Handle Colored Noise in Large Least-Squares Problems, Delft University of Technology, Delft, The Netherlands, 2002.
[17] M. Hayes, Statistical Digital Signal Processing and Modeling, John Wiley & Sons, New York, NY, USA, 1996.
[18] M. J. L. de Hoon, T. H. J. J. van der Hagen, H. Schoonewelle, and H. van Dam, “Why Yule-Walker should not be used for autoregressive modelling,” Annals of Nuclear Energy, vol. 23, no. 15, pp. 1219–1228, 1996.
[19] J. Burg, Maximum entropy spectral analysis [Ph.D. thesis], Department of Geophysics, Stanford University, Stanford, Calif, USA, 1975.
[20] S. Orfanidis, Optimum Signal Processing: An Introduction, Macmillan, New York, NY, USA, 1988.
[21] J. M. Pimbley, “Recursive autoregressive spectral estimation by minimization of the free energy,” IEEE Transactions on Signal Processing, vol. 40, no. 6, pp. 1518–1527, 1992.
[22] S. Marple, Digital Spectral Analysis with Applications, Prentice-Hall, Englewood Cliffs, NJ, USA, 1987.
[23] I. A. Rezek and S. J. Roberts, “Parametric model order estimation: a brief review,” in Proceedings of the IEE Colloquium on the Use of Model Based Digital Signal Processing Techniques in the Analysis of Biomedical Signals, London, UK, April 1997.
[24] E. Ifeachor and B. Jervis, Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, USA, 4th edition, 2001.
[25] K. H. Chon, “Accurate identification of periodic oscillations buried in white or colored noise using fast orthogonal search,” IEEE Transactions on Biomedical Engineering, vol. 48, no. 6, pp. 622–629, 2001.
Copyright
Copyright © 2013 M. Tamazin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.