Abstract

Reliability is one of the quantifiable software quality attributes. Software Reliability Growth Models (SRGMs) are used to assess the reliability achieved at different times during testing. Traditional time-based SRGMs may not be accurate enough in situations where test effort varies with time. To overcome this limitation, test effort has been used instead of time in SRGMs. The test effort functions proposed in the past are finite, which may not be realistic because, at infinite testing time, test effort will be infinite. Hence, in this paper, we propose an infinite test effort function in conjunction with a classical Nonhomogeneous Poisson Process (NHPP) model. We use an Artificial Neural Network (ANN) to train the proposed model with software failure data. With an ANN, many different sets of weights for the same model can describe the past failure data equally well. We use a machine learning approach to select the set of weights that describes both the past and the future data well. We compare the performance of the proposed model with that of an existing model using practical software failure data sets. The proposed SRGM, based on a log-power testing effort function (TEF), describes all types of failure data equally well, improves the accuracy of parameter estimation over the existing TEF, and can also be used for software release time determination.

1. Introduction

Early Software Reliability Growth Models (SRGMs) represent the relationship between the time to failure and the cumulative number of faults detected up to that time. Many such SRGMs have been proposed since 1972, as parametric [1–14] and nonparametric [15–18] models, to estimate future failure occurrence times and to assess the reliability growth of software systems during the testing phase. The traditional SRGMs are based on the premise that the mean value function of the model follows exponential growth [1, 3], S-shaped growth [2, 11], or both [4–8].

Some SRGMs have been proposed with a testing effort function (TEF) [11–13], since fault detection and correction depend on the effort consumed, such as test cases executed, man-days expended, computer utilization time, and other resources, rather than only on testing time or calendar time. The effort based SRGMs proposed in the past use exponential, Rayleigh, logistic, or Weibull distributions as the TEF that describes effort consumption during testing [11–13]. Although these functions fit well in some cases, it is a fallacy to assume that the total test effort is finite at infinite time. Xie and Zhao proposed a Nonhomogeneous Poisson Process (NHPP) reliability growth model based on the log-power distribution; it is a graphical model, in which whether the data fit can be checked on a graph before parameter estimation [19].

In this paper, we propose using the log-power distribution [19] to describe the TEF in the Goel and Okumoto SRGM [1], which yields an SRGM with an infinite TEF. We use an Artificial Neural Network (ANN) for parameter estimation and apply a machine learning technique to determine the most suitable weights for the proposed model, so that it fits the past and future data equally well. We study and compare the goodness of fit (GoF) performance of the proposed model with that of a popular test effort function based SRGM. We use the ANN for parameter estimation uniformly in all cases, since an ANN improves the parameter estimation accuracy and gives a better goodness of fit than traditional statistical parametric models [15–18].

This paper is organized in the following manner. Section 2 presents the proposed testing effort function. Section 3 presents the proposed Software Reliability Growth Model. Section 4 gives the approach to check the validity of the proposed model. Section 5 describes parameter estimation using ANN. Section 6 presents the machine learning technique used to select appropriate weights of the proposed model. Section 7 describes the performance analysis. Section 8 describes one application of the proposed model, namely, software release time determination. Summary and conclusions are given in Section 9.

2. Proposed Testing Effort Function

Since the resources consumed during software testing directly affect software reliability improvement, a few SRGMs with testing effort functions have been proposed in the past. To study the properties of testing effort functions, we compare the proposed log-power TEF with previously proposed test effort functions such as Weibull and logistic. The comparison is given in Table 1.

The exponential and Rayleigh TEFs are special cases of the Weibull TEF in which the shape parameter is 1 and 2, respectively. The Weibull TEF displays a peaked curve as its shape parameter increases. The exponential TEF is used when effort is consumed uniformly over the testing time, whereas the Rayleigh TEF is used when the testing effort first increases to a peak and then decreases. In the case of the logistic TEF, the effort is nonzero at time $t = 0$. This is unrealistic, because no testing effort can have been consumed at the initial stage, when time is zero.
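The contrast between these functions can be sketched numerically. The closed forms below are commonly used parametrizations of the cumulative Weibull and logistic TEFs, together with the log-power form of [19]; they are assumptions made for illustration rather than the exact entries of Table 1, and every parameter value is hypothetical.

```python
import math

def weibull_tef(t, total, scale, shape):
    """Cumulative Weibull effort in (0, t]; bounded above by `total`."""
    return total * (1.0 - math.exp(-scale * t ** shape))

def logistic_tef(t, total, a, rate):
    """Cumulative logistic effort up to time t; bounded above by `total`."""
    return total / (1.0 + a * math.exp(-rate * t))

def log_power_tef(t, alpha, beta):
    """Cumulative log-power effort; grows without bound as t increases."""
    return alpha * math.log(1.0 + t) ** beta

# Hypothetical parameter values, chosen only for illustration.
TOTAL = 100.0
exponential_at_5 = weibull_tef(5.0, TOTAL, 0.2, 1)  # shape 1: exponential TEF
rayleigh_at_5 = weibull_tef(5.0, TOTAL, 0.2, 2)     # shape 2: Rayleigh TEF
weibull_at_0 = weibull_tef(0.0, TOTAL, 0.2, 2)      # zero effort at t = 0
logistic_at_0 = logistic_tef(0.0, TOTAL, 9.0, 0.5)  # total / (1 + a), nonzero
```

Under these assumed forms, the Weibull family never exceeds `total`, while `log_power_tef` keeps increasing for arbitrarily large `t`, which is the property the text relies on.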

There are innumerable ways in which faults can creep into software systems. Therefore, one has to adopt a strategy for generating effective test cases in order to minimize the error content. It is believed that achieving zero defects in software is possible but impractical, because it would require infinite effort. At time “,” the effort is not zero since test cases and test plan are drawn before testing starts. Thereafter it grows with testing. We chose the log-power TEF because it is simple, with just two parameters, and because it was found to grow logarithmically with time and to represent real testing projects better.

3. Proposed Software Reliability Growth Model with Log-Power Testing Effort Function

Instead of proposing a brand new SRGM for its own sake, we build on the good work done by earlier researchers [1, 19]. We time-transform the G-O model using the log-power testing effort function. In the classical Goel-Okumoto SRGM, the independent variable, that is, time $t$, is replaced with the log-power testing effort function $W(t)$ by applying the time transformation applicable to NHPP models [20].

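If $W(t)$ is the log-power testing effort spent at time $t$, then the mean value function of the Goel-Okumoto model can be transformed as given below:

$$m(t) = a\left(1 - e^{-b\,W(t)}\right), \qquad W(t) = \alpha \ln^{\beta}(1 + t), \tag{1}$$

where $W(t)$ is the total testing effort consumed in the time interval $(0, t]$, $a$ is the expected number of software errors to be detected, and $b$, $\alpha$, and $\beta$ are constants.

Thus, the mean value function of the SRGM with log-power TEF is as follows:

$$m(t) = a\left(1 - e^{-b\,\alpha \ln^{\beta}(1 + t)}\right). \tag{2}$$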
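As a minimal sketch of this time transformation, the functions below substitute an assumed log-power TEF, $W(t) = \alpha \ln^{\beta}(1 + t)$ as in the model of [19], into the G-O mean value function $a(1 - e^{-bt})$. The symbol names and the parameter values are illustrative assumptions, not estimates from the paper's data sets.

```python
import math

def log_power_tef(t, alpha, beta):
    """Assumed log-power testing effort consumed in (0, t]."""
    return alpha * math.log(1.0 + t) ** beta

def mean_value(t, a, b, alpha, beta):
    """G-O mean value function with time replaced by log-power effort."""
    return a * (1.0 - math.exp(-b * log_power_tef(t, alpha, beta)))

# Hypothetical parameters: a = expected total faults, b = detection rate
# per unit effort, alpha and beta = TEF scale and shape.
A, B, ALPHA, BETA = 100.0, 0.05, 2.0, 1.5

expected_at_start = mean_value(0.0, A, B, ALPHA, BETA)   # no effort, no faults
expected_at_10 = mean_value(10.0, A, B, ALPHA, BETA)
expected_at_100 = mean_value(100.0, A, B, ALPHA, BETA)
```

Because the effort grows without bound, the expected number of detected faults approaches `a` only asymptotically, as in the original G-O model.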

4. Checking Validity of the Model

We evaluate the performance of the proposed model using four practical software failure data sets, which are available in the form $(t_i, W_i, m_i)$. Each data set needs to be normalized to a fixed range before it is fed to the ANNs. Table 2 provides the description of the software failure data sets.

We measure and compare the goodness of fit (GoF) performance of the proposed model using the Mean Square Error (MSE) [22]. The MSE is the average of the squared differences between the actual and estimated values. A smaller MSE indicates a smaller fitting error and better performance.
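The two preprocessing and scoring steps can be sketched as follows. The min-max scaling to $[0, 1]$ is an assumption about the normalization range, and the function names are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def mean_square_error(actual, estimated):
    """Average of squared differences between actual and estimated values."""
    if not actual or len(actual) != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, estimated)) / len(actual)

# Illustrative values only, not taken from the data sets of Table 2.
scaled = min_max_normalize([10.0, 20.0, 30.0])
error = mean_square_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Note that an MSE computed on normalized data is not directly comparable with one computed on raw failure counts, so the same scaling has to be used for every model being compared.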

5. Parameter Estimation Using Artificial Neural Network

We use a feed-forward ANN with the back-propagation algorithm to estimate the parameters of the proposed model. The mean value function of the proposed SRGM with log-power TEF (2) is given as follows:

$$m(t) = a\left(1 - e^{-b\,\alpha \ln^{\beta}(1 + t)}\right),$$

where $a$, $b$, $\alpha$, and $\beta$ are the weights of the software reliability model, and their values are determined using the ANN. Here, the activation functions of the ANN are developed according to the mean value function of the selected SRGM and the testing effort function [16]. To estimate the weight values, we use software failure data available in the form $(t_i, W_i, m_i)$, where $t_i$ is the cumulative testing time, measured in an appropriate unit such as months or hours, $W_i$ is the effort expended in terms of number of hours, and $m_i$ is the corresponding cumulative number of failures.

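First, we estimate the values of $\alpha$ and $\beta$ for the log-power TEF using the software failure data pairs $(t_i, W_i)$. Then, the values of $a$ and $b$ are estimated for the mean value function using the data pairs $(\hat{W}_i, m_i)$, where $\hat{W}_i$ is the estimated value of $W_i$. The activation functions for the hidden layers are derived from the log-power TEF and from the mean value function, respectively. A linear activation function is used for the output layers.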

The ANN feed-forward back-propagation procedure for parameter estimation is given in Box 1.
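The two-stage estimation can be approximated by the sketch below. It is not the procedure of Box 1: it replaces the ANN with plain gradient descent on the same squared-error loss, using numerical gradients and a step size that shrinks whenever a step fails to reduce the loss. The data are synthetic, generated from hypothetical parameter values, and the function names are assumptions.

```python
import math

def log_power_tef(t, alpha, beta):
    return alpha * math.log(1.0 + t) ** beta

def go_mean_value(effort, a, b):
    return a * (1.0 - math.exp(-b * effort))

def mse(model, params, xs, ys):
    return sum((model(x, *params) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit(model, start, xs, ys, rate=0.1, epochs=2000, h=1e-6):
    """Gradient descent that only accepts steps which lower the loss."""
    params, loss = list(start), mse(model, start, xs, ys)
    for _ in range(epochs):
        grad = []
        for i in range(len(params)):
            up, down = params[:], params[:]
            up[i] += h
            down[i] -= h
            # Central-difference estimate of the partial derivative.
            grad.append((mse(model, up, xs, ys) - mse(model, down, xs, ys)) / (2 * h))
        trial = [p - rate * g for p, g in zip(params, grad)]
        try:
            trial_loss = mse(model, trial, xs, ys)
        except ArithmeticError:  # overflow or division by zero in the model
            trial_loss = math.inf
        if trial_loss < loss:
            params, loss, rate = trial, trial_loss, rate * 1.2
        else:
            rate *= 0.5
    return params, loss

# Synthetic data from hypothetical parameters (alpha=2, beta=1.5, a=1, b=0.3).
ts = [float(t) for t in range(1, 13)]
efforts = [log_power_tef(t, 2.0, 1.5) for t in ts]
failures = [go_mean_value(w, 1.0, 0.3) for w in efforts]

# Stage 1: fit the TEF weights to (time, effort) pairs.
tef_start = (1.0, 1.0)
tef_params, tef_loss = fit(log_power_tef, tef_start, ts, efforts)
# Stage 2: fit the mean value weights to (estimated effort, failures) pairs.
fitted_efforts = [log_power_tef(t, *tef_params) for t in ts]
mvf_start = (0.5, 0.1)
mvf_params, mvf_loss = fit(go_mean_value, mvf_start, fitted_efforts, failures)
```

Different starting values generally lead to different final weights with similar training error, which is why a separate selection step on held-out data is useful.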

6. Machine Learning Technique to Select Appropriate Weights of the Proposed Model

The goodness of fit statistic indicates the quality of the fit to past data. The objective is not only to obtain a better fit for the past data, but also to ensure that the model will describe the future data equally well. Traditionally, the short-term and long-term predictive validity of software reliability models was measured to confirm that the model would describe the future data well. We apply the hold-out cross-validation approach, a conventional machine learning technique, to obtain both a good fit to the past data and the predictive validity needed to describe the future data [23].

Multiple sets of weights may lead to an equally good fit when we use an ANN. Different good fits are possible depending on the starting values assigned at random to the weights. Selecting weights based only on the minimum training error could be misleading, since the model may not describe future data as accurately. If the selected weights result in a low training error but a high validation error, the cause is high variance, that is, overfitting. Hence, after arriving at the minimum training error (on the 60% training data set) with the selected weights, we carry out validation (on a nonoverlapping 20% validation data set) to ensure that the model will fit new data adequately. Box 2 describes the cross-validation procedure used to select appropriate weights for the model.
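A hold-out selection of this kind can be sketched as follows. The chronological 60/20/20 split and the rule of choosing the candidate with the lowest validation error are assumptions standing in for Box 2; the candidate weight sets and the data are hypothetical.

```python
import math

def go_mean_value(effort, a, b):
    return a * (1.0 - math.exp(-b * effort))

def split_60_20_20(records):
    """Split records, in their given order, into train/validation/test."""
    n = len(records)
    train_end, val_end = n * 6 // 10, n * 8 // 10
    return records[:train_end], records[train_end:val_end], records[val_end:]

def mse(model, params, records):
    return sum((model(x, *params) - y) ** 2 for x, y in records) / len(records)

def select_weights(candidates, model, train, validation):
    """Return the candidate with the lowest validation error and its errors."""
    scored = [(mse(model, p, validation), mse(model, p, train), p)
              for p in candidates]
    val_err, train_err, best = min(scored, key=lambda s: s[0])
    return best, train_err, val_err

# Synthetic (effort, failures) records from hypothetical weights a=1, b=0.3.
records = [(float(w), go_mean_value(float(w), 1.0, 0.3)) for w in range(1, 11)]
train, validation, test = split_60_20_20(records)

# Two hypothetical weight sets, as if obtained from different random starts.
trial_1, trial_2 = (0.9, 0.4), (1.0, 0.3)
best, train_err, val_err = select_weights(
    [trial_1, trial_2], go_mean_value, train, validation)
```

The `test` portion is deliberately left unused during selection, so that it can later give an unbiased check of the chosen weights.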

Table 3 provides the Mean Squared Error values for both training and cross-validation for two trial weight sets of the proposed model.

It can be seen that, although the training error is more or less the same for Trial-1 and Trial-2, the validation error is significantly higher for Trial-1 on both data sets, so the Trial-1 weights will not describe future data as well. Since the training and validation errors are both low for the Trial-2 weights, the model with these weights will fit the future data equally well.

7. Performance Analysis

Once the appropriate weights of the proposed model have been determined as above, the model is tested on the remaining 20% test data to confirm the selected weights. The MSE calculated with the test data is given in Table 4.

To study the relative performance of the testing effort function, we compare the proposed log-power TEF with the previously proposed Weibull test effort function [11], both used in the G-O model [1]. The results confirm the suitability of the log-power test effort function, which appears to be the logical choice for a TEF.

8. Determining When to Stop Testing: Use of Proposed SRGM

Deciding when to stop testing and release the software for operational use is one of the applications of Software Reliability Growth Models [22, 24]. Since the estimation of optimum release time based on conditional reliability does not converge [19], Subburaj and Gopal carried out release time determination using a minimum target failure intensity as the criterion instead of reliability [5], which converged after a few phases of testing. We adopt the same approach to determine when to stop testing using the proposed model. Box 3 describes the procedure for software release time determination, which uses failure intensity to decide when to stop testing.

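The failure intensity function of the proposed log-power TEF based SRGM is the time derivative of its mean value function $m(t) = a\left(1 - e^{-b\,\alpha \ln^{\beta}(1 + t)}\right)$ and is given as follows:

$$\lambda(t) = \frac{dm(t)}{dt} = \frac{a\,b\,\alpha\,\beta\,\ln^{\beta - 1}(1 + t)}{1 + t}\; e^{-b\,\alpha \ln^{\beta}(1 + t)}.$$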

A target failure intensity of 1.663 failures per week is set for software failure data set DS-4. The target failure intensity is achieved, and testing can be stopped, at 25 weeks, by which time 1166 failures had been observed, as given in Table 5. With an effort based SRGM we can not only find the optimum testing time but also determine the effort needed to achieve the target reliability, as illustrated in Table 5.
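A stopping rule of this kind can be sketched as below. The failure intensity is obtained by differentiating the assumed mean value function $a(1 - e^{-b\alpha\ln^{\beta}(1+t)})$, and the scan over whole weeks is a simplification, not the procedure of Box 3. The parameter values and the target are hypothetical and are not the DS-4 estimates.

```python
import math

def failure_intensity(t, a, b, alpha, beta):
    """Time derivative of a * (1 - exp(-b * alpha * ln(1 + t) ** beta))."""
    log_term = math.log(1.0 + t)
    return (a * b * alpha * beta * log_term ** (beta - 1.0) / (1.0 + t)
            * math.exp(-b * alpha * log_term ** beta))

def release_time(target, params, max_weeks=520):
    """First week at which intensity is at or below target and still falling."""
    for week in range(1, max_weeks + 1):
        now = failure_intensity(week, *params)
        nxt = failure_intensity(week + 1, *params)
        # Require a falling intensity so an early low value is not mistaken
        # for the end of testing.
        if now <= target and nxt < now:
            return week
    return None

# Hypothetical weights (a, b, alpha, beta) and target failures per week.
PARAMS = (100.0, 0.05, 2.0, 1.5)
TARGET = 1.0

stop_week = release_time(TARGET, PARAMS)
# Effort implied by the assumed log-power TEF at the stopping time.
effort_needed = PARAMS[2] * math.log(1.0 + stop_week) ** PARAMS[3]
```

Because the TEF maps time to effort, the same stopping time yields both a schedule estimate (`stop_week`) and a resource estimate (`effort_needed`).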

9. Summary and Conclusions

In time-based Software Reliability Growth Models (SRGMs), testing effort is assumed to be constant over time, which may be unrealistic. Effort based SRGMs are more realistic and result in a better goodness of fit. Hence, some SRGMs with testing effort functions have been proposed in the past. We propose the log-power TEF, which is an infinite test effort function, since logically the test effort will be infinite at infinite testing time. The proposed log-power TEF based SRGM describes all types of failure data equally well. The goodness of fit indicates the quality of the fit to past data; it does not assure that future data will be fitted equally well. Hence, we determine the appropriate weights using a machine learning technique, so as to select the SRGM that will describe both the past and the future failures equally well. Instead of conventional parameter estimation methods, we use an ANN for parameter estimation. The study confirms that the SRGM with log-power TEF improves the accuracy of parameter estimation over the existing TEF and can also be used for software release time determination. Although the previously proposed SRGM uses the Weibull distribution for the effort function, our study shows the log-power TEF to be simple and equally good, which makes it a natural choice for a TEF. The proposed log-power TEF based SRGM, selected using a machine learning technique, achieves a better goodness of fit than the previously proposed Weibull TEF based SRGM.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.