Abstract

Because traditional multiple-error locating methods in software testing are difficult to apply and analyze information inaccurately, this paper proposes a software multiple-error locating method based on path clustering and failure weighting. In an environment with complete test cases, the program execution context information is captured dynamically by running the test cases, and a path matrix of execution trajectory information is constructed. Cluster analysis divides the execution trajectories into clusters, and the weight of failed executions is increased and incorporated into the suspicion evaluation to troubleshoot multiple errors. Experiments are conducted on four benchmark programs. The results show that, compared with five methods based on equivalent evaluation functions, the proposed method reduces the error location cost by 19.15% on average and effectively improves the efficiency of error location.

1. Introduction

Software testing is an important stage in software development and software quality assurance [1]. It is time-consuming and labor-intensive work that consumes almost 50% of software system development resources [2]. The problem of software error location has attracted wide attention from industry and academia [3]. Error location is used to detect the erroneous statements in software and to improve the efficiency of software testing. Error location techniques have been explored frequently in software research in recent years, and any optimization will decrease the cost of software development [4].

Many studies have been conducted in the field of software error location. Peng et al. [5] put forward the ABFL method, which uses spectrum-based fault localization (SBFL) technology to locate code errors accurately. Zheng et al. [6] used a genetic algorithm to build FSMFL, a highly flexible framework for software multi-fault location. Feyzi and Parsa [7] put forward FPA-FL, a statistical method of fault tendency based on elastic net regression. Zhu [8] used supervised and semi-supervised learning techniques for software testing. Gao et al. [9] defined the DStar method, whose error location results improve as its parameter value increases. Wong et al. [10] introduced a method to predict software error location based on the Hamming distance between tests and the K-means algorithm. Huang et al. [11] proposed FGAFL, a software error location method based on the function call path and the genetic algorithm. Li et al. [12] proposed a top-down software error location algorithm based on the weakest precondition. Wang and Sun [13] devised a software error location technique based on program variation analysis, which reduces the impact of accidentally successful test cases. Jiang et al. [14] proved the equivalence relationships among 30 suspicion formulas and analyzed the pros and cons of their positioning effect based on a genetic algorithm. Digiuseppe and Jones [15] proposed an error location method that combines dynamic program slicing with the Bayesian method.

The existing software multi-error locating method [16] and the multi-objective optimization algorithm [17] have a relatively high difficulty coefficient and analyze information inaccurately. When software testing is carried out in a real environment, the type, number, and distribution of errors cannot be known in advance. When there are multiple errors in the program, a single-error locating strategy, which ignores the interaction between them, becomes less applicable. At this stage, relatively few research works focus on the problem of multiple-error localization in software, and each method has its own scope of application. How to overcome the limitations of these methods from different angles and reduce their high cost is worth exploring.

This paper proposes a method to troubleshoot errors from the perspective of multiple errors in software. A cluster analysis model first processes the path matrix of the program execution context; the weight of failed executions is then increased and applied to the calculation of sentence suspicion to search for errors. On the one hand, this circumvents the limited positioning results and high calculation difficulty of the existing multiple-error positioning methods. On the other hand, it enriches the theories and methods of data mining technology in the field of error positioning.

2. Path Cluster

2.1. Principles of Cluster Analysis

Cluster analysis is an unsupervised learning process in which data objects are divided into clusters with different attribute characteristics, so that elements of the same cluster have a high similarity coefficient and elements of different clusters have a low one. Single-error location methods mostly rely on repeated testing in a one-bug-at-a-time way, which is inefficient. When the number and distribution of errors cannot be known in advance, the effectiveness of a single-error location method for multi-error location is limited. Therefore, cluster analysis is used here to locate multiple errors.

In this paper, a clustering algorithm uses the similarity between program execution trajectories to divide the huge set of execution trajectory information into several smaller sets, and suspicion is then calculated from the coverage information of the execution trajectories in each subset. Reducing redundant data improves the efficiency of the strategy. At the same time, the size and type of the sample set are not limited, and targeted constraints can be added, which benefits the selection of execution trajectories.

2.2. Path Clustering Generation

When setting parameters, a path matrix $M$ with dimension $m \times (n+1)$ is constructed from the program execution paths and execution results, where $m$ represents the number of test cases and $n$ represents the number of lines of sentences. The element $r_i$ in column $n+1$ of the matrix represents the actual output obtained by executing test case number $i$ and records whether the execution result is consistent with expectations or deviates from them. The matrix element $x_{ij}$ represents the statement coverage information of line number $j$ when the $i$th test case is executed. By the binary vector method, $x_{ij} = 1$ if the $j$th sentence is covered and $x_{ij} = 0$ otherwise. $M$ is formally represented as

\[
M = \begin{pmatrix}
x_{11} & \cdots & x_{1n} & r_1 \\
\vdots & \ddots & \vdots & \vdots \\
x_{m1} & \cdots & x_{mn} & r_m
\end{pmatrix}, \tag{1}
\]

where $M$ abstractly expresses the execution trajectory and coverage information of all test cases in the form of a path matrix. The operations that follow clustering can then be built on matrix operations from numerical analysis, which broadens the ways of solving the problem within a limited range and conveys the essence of the data objects clearly. The process of path clustering is illustrated in Figure 1.
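The construction of the path matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_path_matrix`, its parameters, and the encoding of the result column (1 for a failed test case, 0 for a successful one) are hypothetical choices and are not taken from the paper.

```python
from typing import List, Sequence, Set


def build_path_matrix(
    covered_lines: Sequence[Set[int]],
    failed: Sequence[bool],
    num_lines: int,
) -> List[List[int]]:
    """Build one row per test case: num_lines coverage bits plus a result column.

    covered_lines[i] holds the 1-based line numbers executed by test case i.
    The last column uses an assumed encoding: 1 = failed, 0 = successful.
    """
    if len(covered_lines) != len(failed):
        raise ValueError("need exactly one execution result per test case")
    matrix = []
    for lines, is_failed in zip(covered_lines, failed):
        # Coverage bit is 1 when the line was executed by this test case.
        row = [1 if line in lines else 0 for line in range(1, num_lines + 1)]
        row.append(1 if is_failed else 0)
        matrix.append(row)
    return matrix


# Three hypothetical test cases over a four-line program.
coverage = [{1, 2, 4}, {1, 3}, {1, 2, 3, 4}]
results = [False, True, False]
path_matrix = build_path_matrix(coverage, results, num_lines=4)
for row in path_matrix:
    print(row)
```

Each printed row is the coverage vector of one test case followed by its execution result, so the matrix has one row per test case and one more column than the program has lines.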

In the process of path clustering, the value of $k$ and the definition of the initial centroids determine the performance of the clustering strategy. The connection between data objects within a cluster and the proximity between clusters determine the effectiveness of the classification. The multi-dimensional Euclidean distance is selected to measure the distance between execution trajectories. The shorter the distance, the higher the similarity coefficient between the trajectories, and the greater the probability that they belong to the same cluster.

3. Software Error Positioning Model

3.1. Path Cluster Analysis

Assume that $P$ is a Java program containing multiple errors, $n$ is the number of executable code lines, and $T$ is a set of test cases with $T = T_f \cup T_s$ and $T_f \cap T_s = \varnothing$, where each $t_f \in T_f$ is a test case that fails to execute and each $t_s \in T_s$ is a test case that executes successfully.

K-means is used to reduce the computational overhead of the NP-hard optimization of intra-cluster variation; its complexity is $O(nkt)$, where $n$ here is the number of data objects, $k$ is the number of clusters, $t$ is the number of cumulative iterations, $k \ll n$, and $t \ll n$. The K-means clustering strategy uses the multi-dimensional Euclidean distance for cluster analysis. Suppose the execution profile vectors are $s = (s_1, \ldots, s_N)$ and $s' = (s'_1, \ldots, s'_N)$; then the $N$-dimensional Euclidean distance formula is

\[
d(s, s') = \sqrt{\sum_{i=1}^{N} (s_i - s'_i)^2}. \tag{2}
\]

Here, $s_i$ is the $i$th coordinate of the first point and $s'_i$ is the $i$th coordinate of the second point. Each coordinate is a coverage bit: if the $i$th statement is covered, then $s_i = 1$, otherwise $s_i = 0$.
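As a small worked example of the distance measure, the following Python sketch computes the Euclidean distance between two execution profile vectors. The function name `euclidean_distance` and the two sample vectors are hypothetical and serve only as an illustration.

```python
import math
from typing import Sequence


def euclidean_distance(s: Sequence[float], t: Sequence[float]) -> float:
    """Euclidean distance between two equally long execution profile vectors."""
    if len(s) != len(t):
        raise ValueError("profile vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


# Two hypothetical coverage vectors over four statements.
profile_a = [1, 1, 0, 1]
profile_b = [1, 0, 1, 0]
distance = euclidean_distance(profile_a, profile_b)
print(distance)
```

For 0/1 coverage vectors, the squared distance equals the number of statements on which the two executions differ. The two sample profiles differ on three statements, so the distance is the square root of 3.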

When clustering, the value of $k$ is determined according to the scale of the data objects and is empirically selected as 0.5%–2% of that scale.
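The clustering step can be illustrated with a minimal pure-Python version of the standard K-means (Lloyd) iteration. It is a sketch under stated assumptions and not the authors' implementation: the names `choose_k` and `k_means` are hypothetical, the default ratio of 1% is just one value inside the 0.5%–2% range, and the first k points are used as initial centroids purely for reproducibility.

```python
import math
from typing import List, Sequence


def choose_k(num_objects: int, ratio: float = 0.01) -> int:
    """Pick k as a fraction of the data scale; never return less than 1."""
    return max(1, round(num_objects * ratio))


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points: Sequence[Sequence[float]], k: int, max_iter: int = 100) -> List[int]:
    """Return one cluster label per point using Lloyd's iteration.

    Assumption: the first k points serve as initial centroids. The choice of
    initial centroids influences the result, so this is only one option.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [[float(v) for v in p] for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: _distance(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Four hypothetical coverage vectors that form two obvious groups.
rows = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
labels = k_means(rows, k=2)
print(labels)
```

With these inputs the first and third vectors end up in one cluster and the second and fourth in the other. For a test suite of 400 execution trajectories, `choose_k(400)` would suggest 4 clusters.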

3.2. Failure Weighting

When errors are located based on spectral characteristics, the number of test cases keeps changing. Successful test cases are negatively correlated with suspicion, while failed test cases are positively correlated with it. This paper therefore takes inspiration from Wong et al.’s claim that “when more and more test cases are run, the weight of successful test cases should be gradually reduced in stages” and proposes a technique that increases the weight of failures. For the sentence at a given line number, the suspicion calculation increases the proportion of failed executions instead of directly counting the frequency of failed test cases. This addresses two shortcomings of the Wong method: the weight intervals are difficult to select, and the weight reduction factor cannot adapt.

After the path matrix has been clustered, the ratio of failed to successful test cases in each cluster is far less than 1, so the probability that a statement is covered by a failed execution is much smaller than the probability that it is covered by a successful one. Changing only the failed execution weight weakens the influence of sentences with different coverage but the same proportion on the suspicion calculation and adapts the method to different test case sets.

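When the total number of test cases in a certain cluster is $N_c$, the number of failed executions that cover the statement with line number $i$ is

\[
N_f(i) = \sum_{j=1}^{N_c} x_{ji}\, f_j. \tag{3}
\]

The number of successful executions that cover it is

\[
N_s(i) = \sum_{j=1}^{N_c} x_{ji}\,(1 - f_j), \tag{4}
\]

where $j$ is the sequence identifier of the test case, and the statement coverage and the execution result are calculated as

\[
x_{ji} = \begin{cases} 1, & \text{test case } j \text{ covers line } i, \\ 0, & \text{otherwise,} \end{cases} \tag{5}
\]

\[
f_j = \begin{cases} 1, & \text{test case } j \text{ fails,} \\ 0, & \text{otherwise.} \end{cases} \tag{6}
\]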
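The two coverage counts can be computed directly from the rows of one cluster, as the following Python sketch shows. The function name `coverage_counts`, the 0/1 encoding of the failure flag, and the sample data are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def coverage_counts(
    coverage: Sequence[Sequence[int]], failed: Sequence[int]
) -> List[Tuple[int, int]]:
    """For each statement, return (failed_cover_count, successful_cover_count).

    coverage[j][i] is 1 when test case j covers statement i, otherwise 0.
    failed[j] is 1 when test case j failed, otherwise 0 (assumed encoding).
    """
    num_statements = len(coverage[0]) if coverage else 0
    counts = []
    for i in range(num_statements):
        n_fail = sum(row[i] * f for row, f in zip(coverage, failed))
        n_success = sum(row[i] * (1 - f) for row, f in zip(coverage, failed))
        counts.append((n_fail, n_success))
    return counts


# One hypothetical cluster: four test cases, three statements, one failure.
cluster_coverage = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
]
cluster_failed = [0, 1, 0, 0]
counts = coverage_counts(cluster_coverage, cluster_failed)
print(counts)
```

In this example the second statement is covered only by successful executions, while the third is covered by one failed and one successful execution, which makes it the more suspicious of the two.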

At this time, the failure weight factor of the statement with line number $i$ is defined as

Among them, is the failure weight factor of the statement with line number i.

The weight factor changes with the ratio of successful test cases to failed test cases. The reason for increasing the failure rather than the success weight factor is that adding a failed test case has a greater impact on the suspiciousness of the statement.

Furthermore, the total weight of the failed test cases that cover the sentence with line number $i$ is

For any sentence in a cluster, the frequency of failed coverage is positively correlated with suspicion, and the frequency of successful coverage is negatively correlated with it. The failure weight factor of a sentence is therefore increased, so that the weight of the failed test cases grows with the number of times the sentence is covered, which improves the efficiency of error location.

3.3. Suspicion Calculation

Under the setting of the weighting factor, a formula for calculating the suspicion of failure weight is proposed as follows:

In this formula, an adjustment factor of 0.001 is introduced to prevent the sentence suspicion degree from becoming 0, in which case the error location priority ranking could not be performed. In particular, the premises of establishing and are

When the sentence suspicion degree is calculated, if the $i$th sentence is not covered by the test cases, the sentence is deleted from the suspicious sentence set.

4. FCW Method

The implementation steps of the FCW method are as follows:
Input: A Java program $P$ with $n$ lines of executable code and multiple errors
Output: Check statement priority order
Step 1: Run the test case set $T$ to obtain the program execution trajectories and execution results, and extract feature elements to construct a coverage information matrix with dimension $m \times (n+1)$. The parameter $m$ represents the number of test cases.
Step 2: Use the $k$-means algorithm to divide the test case set into $k$ subsets, and use formula (2) to calculate the similarity between program entities.
Step 3: In each test case subset, calculate the suspicious degree of the program entities separately and increase the proportion of failed test cases at the same time. Add the weight factor in formula (7) to the calculation process to obtain the improved suspicious degree calculation of equations (9) and (10).
Step 4: Arrange the suspicious sentences in descending order of suspicion and generate a check sentence priority sequence. Following the principle that the sentence with the highest suspicion is checked first, the execution trajectories of the multiple errors in $P$ are tracked. Finally, the performance and the pros and cons of the method are compared.
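The ranking in Step 4 can be sketched as follows in Python. The sketch takes the suspicion scores as given and does not reproduce the paper's suspicion formula. The function name `check_order`, the tie-breaking rule (ascending line number), and the sample scores are hypothetical.

```python
from typing import Dict, List


def check_order(suspicion: Dict[int, float], covered: Dict[int, bool]) -> List[int]:
    """Return line numbers in the order in which they should be checked.

    Uncovered lines are removed from the suspicious set. The remaining lines
    are sorted by descending suspicion, and ties are broken by ascending line
    number (an assumption, since no tie-breaking rule is stated).
    """
    candidates = [line for line in suspicion if covered.get(line, False)]
    return sorted(candidates, key=lambda line: (-suspicion[line], line))


# Hypothetical suspicion scores for five lines; line 14 is never covered.
scores = {10: 0.40, 11: 0.90, 12: 0.40, 13: 0.75, 14: 0.0}
is_covered = {10: True, 11: True, 12: True, 13: True, 14: False}
order = check_order(scores, is_covered)
print(order)
```

The resulting list places line 11 first and omits line 14, which matches the rule that uncovered sentences are deleted from the suspicious sentence set.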

5. Experiment and Evaluation

5.1. Experimental Setup and Design

The FCW method uses EvoSuite to generate test cases and Cobertura to obtain the execution path and coverage information, and then applies path clustering and failure weighting to troubleshoot errors.

To further verify the performance of the FCW method, four Java benchmark programs with different numbers of faulty versions were selected for the experiment. The benchmark programs are described in Table 1.

5.2. Comparison and Analysis of Error Positioning Methods

When verifying whether the basic idea of the method is correct, several representative equivalence classes are selected to avoid repetitive operations and steps.

It can be proved that the risk evaluation functions used in software error location can be divided into six equivalence classes (ER1∼ER6); two different methods from each class are shown in Table 2.

The evaluation functions are built from six quantities: the number of test cases that failed to execute, the number of test cases that executed successfully, the number of failed test cases that cover the program entity, the number of failed test cases that do not cover it, the number of successful test cases that cover it, and the number of successful test cases that do not cover it.
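To show how such quantities enter an evaluation function, the following Python sketch implements Tarantula and Jaccard as they are commonly defined in the spectrum-based fault localization literature. The parameter names are hypothetical, returning 0.0 when a denominator is zero is an assumption of this sketch, and the sample counts are invented for illustration.

```python
def tarantula(
    failed_cover: int, passed_cover: int, total_failed: int, total_passed: int
) -> float:
    """Tarantula: failing coverage ratio over failing plus passing coverage ratio."""
    fail_ratio = failed_cover / total_failed if total_failed else 0.0
    pass_ratio = passed_cover / total_passed if total_passed else 0.0
    denominator = fail_ratio + pass_ratio
    # Assumption: an entity covered by no test case gets suspicion 0.0.
    return fail_ratio / denominator if denominator else 0.0


def jaccard(failed_cover: int, passed_cover: int, total_failed: int) -> float:
    """Jaccard: failing tests covering the entity over all failing tests
    plus the passing tests that cover it."""
    denominator = total_failed + passed_cover
    return failed_cover / denominator if denominator else 0.0


# Hypothetical entity: covered by both failing tests and by 2 of 8 passing tests.
t_score = tarantula(failed_cover=2, passed_cover=2, total_failed=2, total_passed=8)
j_score = jaccard(failed_cover=2, passed_cover=2, total_failed=2)
print(t_score, j_score)
```

Both functions grow with the number of failed test cases that cover the entity and shrink with the number of successful ones, which is the correlation that the failure weighting builds on.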

By performing the path clustering operation on the benchmark programs, the experimental data shown in Table 3 were obtained. Columns 2∼5 of the table indicate the number of errors in the program, the time required to execute the test cases, the number of clusters of the path matrix, and the time spent in clustering, respectively. The experimental results show that, although the number of errors differs between programs, the number of path clusters may be identical. A reasonable number of clusters should be set according to the program’s scale.

After cluster analysis, each cluster was taken as a basic unit, and the failure weight factor was used as a parameter in the suspicion calculation for the statements in the cluster, so that the trajectories produced by the multiple errors in the program were traced in descending check order. All methods were applied to all benchmark programs, and the comparison results are given in Table 4. Columns 1∼2 of Table 4 give the program name together with the average number of sentences that need to be checked and the cost of error location. Columns 3∼7 give the cost of the commonly used error location methods. Columns 8∼9 give the cost of the FCW method presented in this paper under two different weighting factors, denoted QWo1 and QWo2. Data displayed in bold indicate that the positioning cost of that method is higher than that of the FCW method, that is, its performance is not as good. As the scale of the program gradually increases, the positioning effect of the FCW method is better than that of Wong1 and Naish2, but its cost is higher than that of Tarantula, Jaccard, and Wong2.

The error location costs of the different methods are compared in Figure 2. The FCW method performs better: on arithmetic average, it locates the errors by checking about 25% of the code, which reduces the error location cost by 7.68%, 15.85%, 34.23%, 13.33%, and 25.83% compared with the Tarantula, Jaccard, Wong1, Wong2, and Naish2 methods, respectively. In contrast, those five methods need to check 31%–59% of the code to locate the errors; they require more time and material resources without delivering the expected efficiency.

To evaluate the effectiveness of the error location method proposed in this paper, the experiments use the error location accuracy (Acc) and the relative improvement value (RImp) as the evaluation criteria [18]. Acc is defined as the percentage of executable statements that must be checked before a true error statement is found. RImp is the total number of statements that need to be checked to find all errors using Context-FL divided by the total number of statements that need to be checked using the compared method. The lower the value of RImp, the better the positioning effect [19].
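The two criteria reduce to simple ratios, as the following Python sketch illustrates. The function names `acc` and `rimp`, their parameters, and all numbers in the example are hypothetical and are not experimental data from this paper.

```python
from typing import Sequence


def acc(statements_checked: int, executable_statements: int) -> float:
    """Percentage of executable statements examined before the error is found."""
    if executable_statements <= 0:
        raise ValueError("the program must have at least one executable statement")
    return 100.0 * statements_checked / executable_statements


def rimp(
    checked_by_method: Sequence[int], checked_by_compared: Sequence[int]
) -> float:
    """Total statements checked by the evaluated method divided by the total
    checked by the compared method; values below 1 favor the evaluated method."""
    total_compared = sum(checked_by_compared)
    if total_compared == 0:
        raise ValueError("the compared method must check at least one statement")
    return sum(checked_by_method) / total_compared


# Hypothetical numbers: 25 of 200 statements checked before the error is found,
# and per-error statement counts for the evaluated and the compared method.
acc_value = acc(25, 200)
rimp_value = rimp([10, 20, 30], [20, 30, 50])
print(acc_value, rimp_value)
```

In this invented example the evaluated method checks 60 statements where the compared method checks 100, so the ratio is below 1 and the evaluated method is the cheaper one.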

Figure 3 compares the localization efficiency of the five commonly used methods and the FCW method proposed in this paper. The horizontal coordinate represents the percentage of executable statements checked in all programs, and the vertical coordinate represents the percentage of errors found by the locating method; each point in the graph gives the percentage of errors located when the corresponding percentage of executable statements has been checked.
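The points of such a curve can be derived from the per-error checking cost, as in the following Python sketch. The function name `located_curve`, the thresholds, and the sample values are hypothetical and do not reproduce the data behind Figure 3.

```python
from typing import List, Sequence, Tuple


def located_curve(
    acc_values: Sequence[float], thresholds: Sequence[float]
) -> List[Tuple[float, float]]:
    """For each threshold (percentage of statements checked), return the
    percentage of errors whose checking cost does not exceed that threshold."""
    if not acc_values:
        raise ValueError("at least one error is required")
    total = len(acc_values)
    return [
        (t, 100.0 * sum(1 for a in acc_values if a <= t) / total)
        for t in thresholds
    ]


# Hypothetical checking costs (in percent of code) for four errors.
costs = [5, 12, 18, 40]
curve = located_curve(costs, thresholds=[10, 20, 50])
print(curve)
```

A method whose curve rises faster locates a larger share of the errors for the same share of checked code.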

6. Conclusion

To tackle multiple errors in software testing, this paper adopts an error location strategy that combines path clustering and failure weighting. The FCW method was tested on four Java benchmark programs and compared with five equivalent evaluation functions. The results show that the cluster analysis algorithm can recognize the differences between multiple errors and weaken the interference between them, and that the failure-weighted suspicion formula can reduce the impact of the proportion of successful test cases after clustering. Within a certain range, this method can improve the positioning accuracy, reduce the complexity of implementation, and reduce the software testing cost.

Data Availability

The raw/processed data required to reproduce these findings cannot be shared as the data contains private data.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The work is supported by the National Natural Science Foundation of China (Grant no. 61876138), the Key R&D Project of Shaanxi Province (2020GY-010), the Key Industrial Chain Core Technology Research Project of Xi’an (2022JH-RGZN-0028), and the Special Fund for Key Discipline Construction of General Institutions of Higher Learning from Shaanxi Province.