Abstract

Robust estimators are often lacking a closed-form expression for the computation of their residual covariance matrix. In fact, it is also a prerequisite to obtain critical values for normalized residuals. We present an approach based on Monte Carlo simulation to compute the residual covariance matrix and critical values for robust estimators. Although initially designed for robust estimators, the new approach can be extended for other adjustment procedures. In this sense, the proposal was applied to both well-known minimum L1-norm and least squares into three different leveling network geometries. The results show that (1) the covariance matrix of residuals changes along with the estimator; (2) critical values for minimum L1-norm based on a false positive rate cannot be derived from well-known test distributions; (3) in contrast to critical values for extreme normalized residuals in least squares, critical values for minimum L1-norm do not necessarily tend to be higher as network redundancy increases.

1. Introduction

The least-squares (LS) estimator, also known as L2-norm minimization, is the standard method in surveying data processing. In case of absence of outliers, it is the best linear unbiased estimator for the unknown parameters [1]. It also provides the maximum likelihood solution, if observational errors are normally distributed. The LS minimizes the sum of the squares of the residuals v (weighted by the weight matrix of observations P), that is,

However, if there are outliers in the sample, the LS will provide biased parameters [2, 3]. Here, we follow the definition of Lehmann [4]: “an outlier is an observation that is so probably caused by a gross error that it is better not used or not used as it is.” In surveying engineering, statistical testing procedures are commonly applied to deal with data possibly contaminated by outliers. This idea goes back to the pioneering work of Baarda [5, 6]. He introduced the Data Snooping (DS) procedure in order to detect outliers in a geodetic network. The issue of outlier detection in surveying engineering has been widely explored in the literature (see, e.g., [713]). A conceptual analysis of measurement errors and outliers in geodetic networks can be found in [14].

The iterative approach of DS, also known as Iterative Data Snooping (IDS) [13], is the most well-established outlier identification method in geodetic networks. For a review of DS and its variations, we suggest [12]. In the IDS procedure, every observation i is tested against outliers by computing its normalized residual , i.e., the ratio between its LS residual and its LS residual standard deviation :

The ith observation with the extreme (highest) normalized LS residual (max |wi|) is compared to its critical value |Zα/2|, which is generally taken from the normal statistical table (α being the user-defined significance level of the test). If max |wi| > |Zα/2|, then the respective observation is flagged as an outlier and usually excluded from the observations set. The same procedure is repeated iteratively until no observation is flagged as an outlier. It is worth mentioning that (2) is a simplification of for scenarios of uncorrelated observations, an assumption that was adopted throughout this work.

However, IDS involves not only a single test but also multiple hypothesis testing. In that case, the “false positive rate” (type I error probability or significance level α) is the rate of experiments in which at least one observation is flagged as an outlier, when, in fact, there is none. In this context, Lehmann [9] showed that the critical values cannot be derived from well-known univariate test distributions (e.g., normal distribution), due to the correlations between residuals. Hence, in order to fully control false positive rates in geodetic networks, critical values for IDS started to be numerically computed by means of Monte Carlo simulation (MCS) (see, e.g., [15, 16]).

As mentioned, IDS, however, is based on LS residuals, which are very “sensitive” to outliers [2]. On the other hand, minimum L1-norm is one of the standard robust estimation methods in Geodesy [17]. The estimator that minimizes L1-norm is more “resistant” or “robust” to outliers, as they tend to be almost completely projected onto the corresponding residuals [18]. Numerical examples can be found in [19, 20]. The test against outliers by computing normalized residuals is also useful in minimum L1-norm [18, 21].

Minimum L1-norm seeks the minimization of the sum of weighted absolute residuals [19]:where p is the weight vector of uncorrelated observations and |v| is the vector of absolute residuals. In particular, Minimum L1-norm is likely to provide higher outlier identification success rate than IDS for low-redundancy networks [15, 22].

The minimum L1-norm solution may not be unique [23], and its vector of residuals in geodetic networks tends to be sparse, with many residuals being equal to zero (see, e.g., [20, 24]). This means that corresponding observations are accepted as “perfect” observations, without any measurement errors. Geodetic observations, however, always have (at least) random errors [25]. Hence, such assumption of “perfect” observations is disconnected to the physical reality of geodetic networks. Besides, the estimator that minimizes L1-norm is biased except for some particular cases [26]. Therefore, the final estimation of a network should always be performed by LS, even if minimum L1-norm was applied to identify outliers [8, 27].

Minimum L1-norm has no analytical direct solution and needs to be solved by numerical methods. This work focuses on the solution by the simplex method [28] of linear programming, the most widely used approach for solving minimum L1-norm [23]. In geodetic networks, it has already been applied by many authors (see, e.g., [19, 20, 2931]).

Another technique commonly employed to solve minimum L1-norm in geodetic networks is the iterative reweighted least squares (IRLS) (see, e.g., [24, 3234]). It has also been applied in deformation analysis of geodetic networks [3537]. However, it is worth mentioning that IRLS does not seem to be always reliable [38], as it is a “local” estimator and may produce unacceptable solutions if it gets stuck in a local optimum [39].

Other techniques that were performed to compute minimum L1-norm in geodetic networks include simulated annealing [24]; genetic algorithm [39, 40]; and linear programming by an interior point method [38]. Solutions of minimum L1-norm by these methods and by IRLS are outside the scope of this paper.

To actually identify outliers based on minimum L1-norm results, geodesists have already tried two different criteria: (1) the ratio between residuals and respective observation standard deviation σi as the test statistic (see, e.g., [15, 41]); and (2) the normalized residual (Equation 2, considering and from minimum L1-norm) as the test statistic (see, e.g., [21, 42]). For both criteria, the chosen critical values (from which the respective observation would be classified as an outlier) were, in general, common values taken from the univariate normal statistical table, such as |Zα/2| = 3.00 (α = 0.27%) and |Zα/2| = 3.29 (α = 0.1%), which, as mentioned, are not appropriate even for IDS. However, it remains unclear how to properly choose critical values in minimum L1-norm.

In this context, this paper has three main contributions: (1) it provides a new procedure to compute critical values for normalized residuals in robust estimation based on MCS control of false positive rate; (2) it serves as a method to compare different quality control procedures by preserving respective critical values with the same false positive rate; and (3) it provides a Monte Carlo approach to compute the covariance matrix of residuals Ʃv for any adjustment procedure, which is, indeed, a prerequisite to compute critical values for residuals.

The outline of this paper is as follows. First, we present the new approach to estimate Ʃv by means of MCS. We highlight that such technique can be widely applied to any adjustment procedure, including LS, the estimator that minimizes L1-norm and other robust estimators. Then, also by MCS, we present a procedure to compute critical values for normalized residuals in robust estimation, based on the control of false positive rate (unprecedented in geodesy). Experiments were conducted in three different leveling networks, focusing on the minimum L1-norm solution by the simplex method and on comparison with LS/IDS.

1.1. Covariance Matrix of Residuals by Means of MCS

Given a geodetic network with m observations and n unknowns parameters, its respective mathematical model may be defined by equations (4) and (5), with Amxn being the “design” matrix with the coefficients of the parameters vector xnx1 and lmx1, the vector of reduced observations [43]. In equation (5) (stochastic model), is the variance factor and is ƩLmxm the covariance matrix of observations. As mentioned, vmx1 is the residual vector and Pmxm is the (symmetric positive-definite) weight matrix of observations. The solution for x produced by the LS is given by equation (6).

Hence, applying the general law of propagation of variances [25], the covariance matrix of parameters Ʃx is expressed by

Assuming and being the vector of adjusted observations, its covariance matrix ƩLa is

Finally, Ʃv for the LS adjustment can be directly computed by equation (9). Note that Ʃx, ƩLa, and Ʃv in equations (7)–(9) refer to LS, as they were obtained by the LS solution.

In matrix Ʃv, elements in position (i, j) represent the covariance between residuals of observations i and j and elements (i, i) represent the variance of the ith observational residual. Hence, despite the adjustment procedure, once its Ʃv is obtained, of each observation (equation (2)) can be easily calculated as , with being the ith element of the main diagonal of Ʃv.

For other adjustment procedures, however, the computation of Ʃv is not so immediate. Actually, some robust estimators even lack a closed-form expression for the computation of its Ʃv in the literature. In order to fully address this issue, which is also a prerequisite to enable the calculation of critical values for normalized residuals in robust estimation, we propose the following approach (Figure 1) to compute Ʃv for any adjustment procedure, by means of MCS. The inputs are functional and stochastic models (matrices A and ƩL) of the geodetic network and the chosen adjustment procedure (e.g., LS or minimum L1-norm).(1)synthetically generate M = 200,000 vectors of (pseudo-) random normally distributed errors of observations , with expected mean μ = 0 and covariance matrix ƩL, i.e., ; this amount of MCS trials was suggested by Rofatto et al. [12](2)For each MCS trial, let , and we compute the respective residuals vector by performing the chosen adjustment procedure; then, each will have m elements , , with being the residual of the ith observation in the kth MCS trial(3)Considering the average of residuals of the ith observation in all MCS trials , we compute the variance (equation (10)) for each residual and compute the covariance for each pair of residuals (equation (11)):(4)fix the estimated Ʃv by placing variances of each ith residual in the ith element of the main diagonal and covariances in its corresponding position (i, j).

Regarding Ʃv in minimum L1-norm, some authors (see, e.g., [8, 21]) derived variances of the subset of nonzero residuals, and Junhuan [18] presented an expression valid for IRLS solution. On another hand, our MCS method computes all variances and covariances and is more general, not only for any minimum L1-norm solution but also for any adjustment procedure.

1.2. Critical Values Based on False Positive Rates

Recently, many works (see, e.g., [9, 15, 16]) have investigated critical values for normalized residuals in IDS based on MCS control of false positive rate. Motivated by mentioned works, we propose the following procedure for any robust estimator (Figure 2). In fact, the method could be applied to any adjustment procedure, but it is clearly more useful for robust estimators in the process of outlier identification. The inputs are matrices A and ƩL of the geodetic network, the desired α (false positive rate), and the robust estimator for which critical values of normalized residual will be estimated.(1)compute Ʃv of the geodetic network by the procedure from Figure 1, considering the robust estimator selected.(2)synthetically generate M = 200,000 vectors of (pseudo-) random normally (or with other distribution) distributed errors of observations , with expected mean μ = 0 and covariance matrix ƩL, i.e., ; in order to avoid any kind of bias, these new vectors should be different from the vectors of step (1). Random errors generally follow normal distributions in geodetic observations. However, an advantage of an MCS approach is that it may be applied to other error distributions, as can be seen in the work of Lehmann [9].(3)compute the max-|| (i.e., the maximum absolute ) of each MCS trial.(4)order the set of all max-|| values in the ascending order.(5)The critical value will be the one in position (1 − α) M of the ordered set.

2. Experiments and Results

Experiments were performed in three simulated leveling networks (Figure 3). Network A consists in one control station, m = 6 observations (height differences), and n = 3 unknowns (station heights). Network B consists in one control station, m = 10 observations, and n = 4 unknowns. Network C consists in one control station, m = 15 observations, and n = 5 unknowns. Therefore, the “mean redundancy number” [13] of network A is rA = (6–3)/6 = 0.50, network B is rB = (10–4)/10 = 0.60, and network C is rC = (15–5)/15 = 0.67.

For all networks, the standard deviation of the observations was given by , where (in km) is the length of the respective leveling line. In the ascending order of the observation index, the lengths (in km) of each leveling line were as follows: for network A, [42, 38, 27, 22, 23, 33]; for network B, [37, 28, 33, 26, 40, 32, 39, 29, 34, 41]; and for network C, [30, 34, 25, 37, 28, 38, 29, 35, 31, 26, 33, 36, 27, 32, 24]. Therefore, for example, σi of the 4th observation of network A (which is also the lowest σi of all networks) is .

Minimum L1-norm adjustments were performed by the simplex method of linear programming. Normally distributed pseudorandom numbers were generated by the Mersenne Twister algorithm [44] and transformed from uniform to normal distribution by the Ziggurat technique [45]. All experiments were performed in the software Octave.

At first, we computed Ʃv for the three networks in three different ways: (1) Ʃv(LS-A) for the LS adjustment by its (well-established) analytical formulation (equation (9)); (2) Ʃv(LS-MCS) for the LS adjustment, performing the new approach we presented by means of MCS; and (3) Ʃv(L1-MCS) for minimum L1-norm adjustment (simplex solution), performing the new approach we presented by means of MCS. Table 1 presents the elements (rounded to one decimal place) of these matrices computed for network A. Respective matrices for networks A, B, and C (rounded to three decimal places) can be found in the appendix.

Then, we applied the new procedure to compute critical values for normalized residuals by MCS and based on the false positive rate (significance level α) in minimum L1-norm (simplex solution). For comparison and further analysis, in Table 2, we also presented critical values for IDS with the same procedure and critical values from the normal distribution statistical table.

Although the use of MCS seems computationally expensive, today, this is no longer an obstacle even for personal computers [12]. As so, the whole process of computing minimum L1-norm critical values, which includes the estimation of respective Ʃv(L1-MCS) (Figure 2), took approximately 14, 21, and 29 minutes (for networks A, B, and C, respectively) using an Intel (R) Core (TM) i5 2.50 GHz processor with 4 Gb RAM.

3. Discussion

The first issue to be addressed is the covariance matrix of residuals for network A, B, and C. Table 3 presents the maximum, minimum, and average differences of the variances (elements of the main diagonal) and covariances (elements outside the main diagonal) between matrices Ʃv(LS-MCS) and Ʃv(LS-A) and between matrices Ʃv(LS-MCS) and Ʃv(L1-MCS).

Ʃv(LS-A) and Ʃv(LS-MCS) had very close values for the three networks. The difference between corresponding elements were always less than 0.300 mm2 (being less than 0.100 mm2 for more than 75% of all networks elements), and average differences computed were less than 0.060 mm2 for all networks. These differences are relatively very low, as the minimum variance of observations (of the 4th observation of network A) was . Hence, this result validates our strategy presented to compute Ʃv based on MCS.

On the other hand, we can clearly see that elements of Ʃv(L1-MCS) were very different from Ʃv(LS-MCS) (and Ʃv(LS-A)) ones. The maximum variance differences (in mm2) were 11.703, 24.525, and 6.606 for networks A, B, and C, respectively, and average variance differences were always higher than 4.900 mm2. Hence, as expected, using Ʃv(LS-A) is not appropriate in the context of minimum L1-norm.

As expected, critical values for normalized residuals in IDS and minimum L1-norm presented in Table 2 were always different from critical values obtained in the univariate normal distribution table. Moreover, minimum L1-norm critical values were always higher than IDS ones. This characteristic highlights the importance of controlling the false positive rate properly by MCS, as was proposed in this research.

Finally, we note also that critical values for both minimum L1-norm and IDS vary from different networks. Although IDS values tend to increase with network redundancy, as already shown by [16], the same cannot be claimed for minimum L1-norm. Hence, this issue may be a subject for further investigation.

4. Concluding Remarks

In this work, we successfully developed and presented an approach by means of MCS to compute the covariance matrix of residuals and critical values for normalized residuals in any adjustment procedure. Since the LS method has a well-established analytical expression for the covariance matrix of residuals, our MCS strategy to estimate it was first applied to LS. We found that differences in respective elements between our strategy and analytical formulation were negligible, which validates our approach.

Numerical results of the whole procedure of computing critical values, which includes the estimation of the respective residuals covariance matrix, were presented in three leveling networks for minimum L1-norm solved by the simplex method of linear programming and compared to LS (for the covariance matrix of residuals) and IDS based on LS results (for critical values). In this sense, we highlight that, as mentioned, the minimum L1-norm solution may not be unique. Hence, conclusions of this paper are associated with the simplex solution of minimum L1-norm.

We have shown that the covariance matrix of residuals may change along with the adjustment procedure (in our case, from LS to minimum L1-norm). Therefore, since robust estimators generally do not have a well-established solution to compute the covariance matrix of residuals, the approach presented for any adjustment procedure (including robust estimators) herein is a valuable strategy.

Surveyors cannot rely on critical values from univariate normal distribution either for IDS or minimum L1-norm. Moreover, critical values vary even among robust estimators. However, unlike IDS, the critical values in minimum L1-norm do not necessarily tend to increase with network redundancy. Hence, the main contribution of this work was the proposed Monte Carlo-based critical values to control the false positive rate for normalized residuals of robust estimators.

Future research should perform this proposal in order to provide a fair comparison among different quality control procedures with the same false positive rate. Furthermore, one can investigate effects of chosen false positive rates in probability levels of classes of errors in outlier identification, i.e., type II error, type III error, overidentification positive and negative, and statistical overlap (see [16] associated with robust estimators).

The proposed approach for the computing residuals covariance matrix can be extended to covariance matrices other than the residuals one in future works. One can, e.g., compute the network parameters in each MCS trial and then compute the parameter covariance matrix of the chosen adjustment procedure.

The relationship between network redundancy and critical values for normalized residuals in robust estimation also needs further investigation. Besides, the new approach for the computation of the covariance matrix of residuals and for the estimation of critical values for normalized residuals described here should be applied for other robust estimators and other types of geodetic networks, such as Global Navigation Satellite System (GNSS) networks.

Appendix

Tables 46 present matrices Ʃv(LS-A), Ʃv(LS-MCS), and Ʃv(L1-MCS) computed for network A, respectively.

Tables 79 present matrices Ʃv(LS-A) , Ʃv(LS-MCS), and Ʃv(L1-MCS) computed for network B, respectively.

Tables 1012 present matrices Ʃv(LS-A) , Ʃv(LS-MCS), and Ʃv(L1-MCS) computed for network C, respectively.

Data Availability

All codes that support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Department of Science and Technology of the Brazilian Army. The authors would like to thank the research group “Controle de Qualidade e Inteligência Computacional em Geodesia” (dgp.cnpq.br/dgp/espelhogrupo/0178611310347329).