Optimizing Testing-Resource Allocation Using Architecture-Based Software Reliability Model
In the management of software testing, testing-resource allocation is one of the most important problems because of the tradeoff between development cost and the reliability of the released software. This paper presents a model-based approach to designing the testing-resource allocation. In particular, we employ an architecture-based software reliability model with an operational profile to estimate the quantitative software reliability in the operation phase, and we formulate multiobjective optimization problems with respect to cost, testing effort, and software reliability. In a numerical experiment, we investigate how the presented optimization problem differs from an existing testing-resource allocation model.
1. Introduction
Software testing is one of the most important phases in developing highly reliable software products. In software testing, developers, often called testers, try to find software bugs by executing test cases. As the number of executed test cases increases, the reliability of the software product also increases, because bugs introduced in the design and implementation phases are removed. However, increasing the number of executed test cases requires considerable effort. Thus, from both the cost and reliability points of view, it is important to plan the allocation of testing resources, such as the number of testers, before software testing begins.
For this purpose, several papers have addressed the testing-resource allocation problem with probabilistic models. Ohtera and Yamada first considered a simple software reliability model dependent on the testing effort and formulated a testing-resource allocation problem. The basic idea comes from the classical reliability allocation problems for component-based systems (e.g., see ). Zahedi and Ashrafi used the Analytic Hierarchy Process (AHP) to solve a software reliability allocation model and determined reliability goals at the planning and design stages of the software project. Ashrafi and Berman, Berman and Ashrafi, Yamada et al., and Nishiwaki et al. extended the original works in [1, 8] and gave nonlinear programming algorithms for more complex resource allocation problems with constraints. Leung [9–11] discussed different optimization problems with various objective functions such as worst-case failure probability, software development cost, and worst-case utility. Hou et al. considered a different testing-resource allocation problem based on the hypergeometric distribution software reliability model. Jha et al., Wadekar and Gokhale, Lyu et al., and Yang and Xie also formulated various optimization problems for software resource allocation and software reliability allocation. Helander et al. developed two problems, reliability-constrained cost minimization and budget-constrained reliability maximization, under a software development scenario. Though their approach is quite similar to the classical nonlinear programming in the earlier works, it gives a detailed and realistic procedure for applying the software resource allocation problem to real projects. Ngo-The and Ruhe formulated a somewhat different problem for software release planning by allocating the software development resources and gave an interesting case study. Recently, Pietrantuono et al. used an architecture-based software reliability model and considered a reliability and testing-time allocation problem. They also gave an empirical study of a program developed at the European Space Agency. In this way, software resource allocation problems have received considerable attention.
In this study, we focus on testing-resource allocation with an operational profile. The operational profile is a quantitative representation of how the system will be used in the user environment. In fact, there are several representations of the operational profile. Ukimoto et al. considered software testing-resource allocation where the operational profile is the time fraction of execution for each module, and they regarded the operational environment as the testing environment on a different time scale. This idea is based on the accelerated life testing model, in which the testing environment is assumed to differ from the operational environment only by an acceleration of the elapsed time. However, since software testing is an environment designed to detect software bugs, the testing environment is not necessarily a time-accelerated version of the operational environment. Thus, in this paper, we consider another representation of the operational profile by using an architecture-based software reliability model.
The architecture-based software reliability model is based on the architecture of the targeted software. In general, a software system consists of a number of modules and executes them according to a programmed logic; that is, the currently executed module changes with the passage of time in the operational phase, and this sequence of modules is called the execution path. If an execution path does not include any faulty module, the software does not fail on that path. That is, software failure essentially depends on the software architecture and its execution paths. This is the basic concept of the architecture-based software reliability model. Littlewood [22, 23] developed the earliest architecture-based software reliability models for the operational phase. In his models, the execution path in the operational phase is generated by a continuous-time Markov chain (CTMC) and a semi-Markov process. Laprie also provided a model similar to Littlewood's in a different way. Cheung modeled the execution path by a discrete-time Markov chain (DTMC). Ledoux and Rubino and Ledoux extended the original Littlewood models to represent failover operation. Goseva-Popstojanova et al. [28, 29] established a theoretical relationship among different architecture-based software reliability models and compared them through an empirical case study. Singh et al. provided an approach with UML to analyze component systems that consist of software modules. In this paper, we use the architecture-based software reliability model to estimate the software reliability in the operational phase. Compared to Ukimoto et al.’s method, our approach provides a more accurate estimate of the software reliability in the operational phase.
The rest of this paper is organized as follows. In Section 2, we first describe the models for testing cost and effort in the testing environment based on software reliability growth models (SRGMs). After that, the architecture-based software reliability model is introduced to formulate the quantitative software reliability measure. In particular, we assume two different situations for the system usage. In Section 3, we formulate the software testing-resource allocation problems: reliability-constrained cost minimization and budget-constrained reliability maximization. Section 4 is devoted to a numerical illustration of our models, in which we compare the optimal resource allocations obtained from Ukimoto et al.’s model and our model and discuss the effect of the representation of operational profiles on the testing-resource allocation. Finally, in Section 5, we conclude this paper with some remarks.
2. Model Description
2.1. Cost Model in Testing Environment
First we describe the testing cost model in the testing environment, which is essentially the same as that of Ukimoto et al. The system consists of $n$ components. Software testing starts at time $t = 0$, and the system should be released at time $t = T$. Let $N_i(t)$ be the cumulative number of faults detected in component $i$ before testing time $t$. Consider the following model assumptions:
(i) There are a finite number of faults in each component before testing.
(ii) The fault detection rate for a component is proportional to the amount of testing effort for the component.
Let $a_i$ and $W_i(t)$ be the expected number of faults before testing and the amount of testing effort for component $i$ at testing time $t$, respectively. Then the probability mass function (p.m.f.) of the cumulative number of detected faults $N_i(t)$ is given by
$$P(N_i(t) = k) = \frac{\Lambda_i(t)^k}{k!} e^{-\Lambda_i(t)}, \qquad \Lambda_i(t) = a_i \left(1 - e^{-b_i W_i(t)}\right),$$
where $b_i$ is the fault detection rate per unit testing effort. The above equations are essentially the same as the nonhomogeneous Poisson process (NHPP) based software reliability growth model (SRGM). By choosing the testing effort function $W_i(t)$, we can represent a variety of fault detection processes. For instance, the (cumulative) Rayleigh curve is typically used as a testing effort function. For the sake of simplicity, this paper assumes the following linear testing effort function for all the modules:
$$W_i(t) = w_i t + w_{0,i},$$
where $w_i$ is the testing effort per unit testing time and $w_{0,i}$ is a fixed effort for component $i$.
Define the cost structure in the testing environment:
(i) $c_1$: fixing cost of a software fault detected in the testing phase.
(ii) $c_2$: testing cost per unit software testing effort.
Then the expected total cost for component $i$ in software testing is given by
$$C_i = c_1 \Lambda_i(T) + c_2 W_i(T).$$
Thus the total cost for software testing becomes
$$C = \sum_{i=1}^{n} C_i.$$
Also, the total amount of testing effort is given by
$$W = \sum_{i=1}^{n} W_i(T).$$
2.2. Reliability Model in Operational Environment
Ukimoto et al.  assumed the expected cumulative number of faults at time , that is, the time after the release: This equation can be rewritten by This implies that the expected number of detected faults in the operational phase is accelerated or decelerated by a parameter relative to that in the testing environment, because means the expected number of faults detected in . In , the parameter is given by the time fraction of execution of component in the operational phase. Also, they assumed that faults detected after release incur maintenance costs for fixing them. However, in general, the operational environment is quite different from the testing environment. Moreover, from the user's perspective, the reliability of the software product is more significant than maintenance costs. Thus, in this paper, we use the quantitative software reliability in the operational phase derived from an architecture-based software reliability model.
The architecture-based software reliability model represents a sequence of component executions in the operational phase . In most architecture-based software reliability models, the execution sequence is defined by a discrete-time or continuous-time Markov chain. In this paper, we focus on the continuous-time Markov chain (CTMC) based model.
A CTMC is a stochastic process with a discrete state space on the continuous-time domain. In general, a CTMC is characterized by its infinitesimal generator, a square matrix whose dimension equals the number of states. The off-diagonal entries of the infinitesimal generator are the transition rates between the respective states, and each diagonal entry is the negative of the exit rate from the corresponding state, so that every row sums to zero. Let $Q$ be the infinitesimal generator of a CTMC $\{X(t); t \ge 0\}$. The state probability row vector $\pi(t)$ of $X(t)$ satisfies
$$\frac{d}{dt}\pi(t) = \pi(t) Q.$$
By using the matrix exponential, the probability vector is also given by
$$\pi(t) = \pi(0) \exp(Qt).$$
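The transient probability vector of a CTMC can be evaluated numerically without a linear-algebra library. The sketch below uses uniformization, a standard technique swapped in here for illustration (the paper does not state which numerical method it uses); the function name `transient_probs` and the two-state generator are assumptions.

```python
import math

def transient_probs(pi0, Q, t, tol=1e-12):
    """Compute pi(t) = pi0 * exp(Q t) by uniformization (pure Python).

    Q is a generator (rows sum to zero) given as a list of lists.
    Suitable for small lam * t; exp(-lam * t) underflows for very large values.
    """
    n = len(Q)
    # Uniformization rate: the largest exit rate (fall back to 1.0 if all are zero).
    lam = max(-Q[i][i] for i in range(n)) or 1.0
    # Transition matrix of the uniformized DTMC: P = I + Q / lam.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)] for i in range(n)]
    weight = math.exp(-lam * t)      # Poisson probability of 0 jumps
    vec = list(pi0)                  # pi0 * P^k, starting with k = 0
    result = [weight * v for v in vec]
    acc = weight                     # accumulated Poisson mass
    k = 0
    while 1.0 - acc > tol and k < 100000:
        k += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k        # Poisson probability of k jumps
        acc += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result

# Hypothetical two-state chain: rate 2 from state 0 to 1, rate 1 back.
Q = [[-2.0, 2.0], [1.0, -1.0]]
p = transient_probs([1.0, 0.0], Q, t=0.5)
print(p)
```

For this two-state chain the closed form is known, which makes it a convenient check: the probability of being in state 0 at time $t$ is $1/3 + (2/3)e^{-3t}$.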
In particular, we consider two cases: (i) execution of the system has an end; i.e., the system is an application such as a command-line application; (ii) execution continues indefinitely; i.e., the system continuously provides a service, such as a server application. For convenience, we call the first and second cases the discrete and continuous cases, respectively.
(i) Discrete Case. Let be the transition probability to the execution of component after finishing the execution of component . Also, is the probability that the execution is finished after component . Furthermore, we assume that the execution time of component follows an exponential distribution with rate . Then the sequence of component executions can be described by an absorbing CTMC with infinitesimal generator where for , is a -by- matrix for transient states, and is the exit rate vector from the transient states to the absorbing state.
To represent failures in the operational phase, we define as the failure probability at the execution of component . In this paper, we suppose that the failure probability is given by In the equation, means the expected number of residual faults in component at the release time, which is given by Also, is the probability that a remaining fault does not cause a failure of component ; i.e., means the probability that at least one remaining fault causes a failure of component . Then the underlying infinitesimal generator can be rewritten as where is the matrix generated by replacing with and is a column vector whose -th entry is . Note that has two absorbing states corresponding to success and failure of execution, respectively.
The quantitative software reliability is defined by the probability that an execution is successfully finished. From the mathematical argument of CTMC, we have the software reliability in the discrete case:where is a probability vector to decide the initial component of execution.
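The discrete-case reliability can be illustrated as an absorption probability. The sketch below makes several assumptions that are not confirmed by the text: each execution of component $i$ fails with probability $f_i = 1 - \phi^{m_i}$, where $m_i = a_i e^{-b_i W_i}$ is the expected number of residual faults and $\phi$ is the probability that a single fault causes no failure; otherwise control moves on according to the transition probabilities. Under these assumptions the probability of reaching the success state does not depend on the execution rates, so only the embedded jump chain is needed. All names and values are hypothetical.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def failure_prob(a, b, effort, phi):
    """Assumed form: 1 - phi ** (expected residual faults at release)."""
    residual = a * math.exp(-b * effort)
    return 1.0 - phi ** residual

def discrete_reliability(alpha, P, p_end, f):
    """Probability that an execution finishes successfully.

    alpha: initial component distribution, P[i][j]: transition probabilities,
    p_end[i]: probability that execution finishes after component i,
    f[i]: failure probability of one execution of component i.
    """
    n = len(P)
    # x_i = success probability starting from component i solves
    # x_i = (1 - f_i) * (p_end_i + sum_j P_ij x_j).
    A = [[(1.0 if i == j else 0.0) - (1.0 - f[i]) * P[i][j] for j in range(n)]
         for i in range(n)]
    rhs = [(1.0 - f[i]) * p_end[i] for i in range(n)]
    x = solve(A, rhs)
    return sum(alpha[i] * x[i] for i in range(n))

# Hypothetical two-component series system: component 0, then component 1, then finish.
f = [failure_prob(10.0, 0.1, 30.0, 0.9), failure_prob(5.0, 0.2, 20.0, 0.9)]
R = discrete_reliability([1.0, 0.0], [[0.0, 1.0], [0.0, 0.0]], [0.0, 1.0], f)
print(R)
```

For a series system the result reduces to the product of the per-component success probabilities, which gives a simple consistency check on the linear solve.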
(ii) Continuous Case. In the continuous case, the sequence of executions can be described by a CTMC with infinitesimal generator: Note that for . Similar to the discrete case, denotes the failure probability at component . Then we have a CTMC with one absorbing state corresponding to the failure state. In this case, the software reliability is defined as the probability that the system does not fail during the mission time . From the mathematical argument of CTMC, the quantitative software reliability can be formulated by where is a column vector whose entries are 1.
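One way to write out the continuous-case construction is sketched below. The notation is an assumption made for illustration: $\mu_i$ is the execution rate of component $i$, $p_{i,j}$ the transition probability from component $i$ to component $j$ (with $\sum_j p_{i,j} = 1$), $f_i$ the failure probability of one execution of component $i$, $\alpha$ the initial probability vector, and $T'$ the generator restricted to the non-failed states.

```latex
\begin{align*}
[T']_{i,j} &= \mu_i (1 - f_i)\, p_{i,j} \quad (i \neq j), \\
[T']_{i,i} &= -\mu_i + \mu_i (1 - f_i)\, p_{i,i}, \\
R(t) &= \alpha \exp(T' t)\, \mathbf{1}.
\end{align*}
```

Under these assumptions each row of $T'$ sums to $-\mu_i f_i$, the rate at which component $i$ leads to failure. For a single component that is re-executed forever, this gives $R(t) = e^{-\mu f t}$, so the reliability decays exponentially in the mission time.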
3. Software Testing-Resource Allocation Problems
Based on the models described in Section 2, we formulate the software testing-resource allocation problems. The problem is to decide the testing efforts for modules that minimize the testing cost or maximize the software reliability in the operational phase. Let , , and be the upper limits of cost and effort and the lower limit of reliability, respectively. The reliability-constrained cost minimization (RCCM) and budget-constrained reliability maximization (BCRM) problems can be formulated as follows.
(i) RCCM in Discrete Case
(ii) RCCM in Continuous Case
(iii) BCRM in Discrete Case
(iv) BCRM in Continuous Case
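For orientation, the four problems above share a common shape. The following is a sketch under assumed notation rather than the paper's exact statement: $w = (w_1, \ldots, w_n)$ denotes the testing efforts to be decided, $C(w)$ the total testing cost, $W(w)$ the total testing effort, $R(w)$ the discrete-case or continuous-case reliability, and $C_{\max}$, $W_{\max}$, $R_{\min}$ the limits on cost, effort, and reliability. Whether each problem carries the effort constraint is also an assumption.

```latex
\begin{align*}
\text{RCCM:}\quad & \min_{w \ge 0}\; C(w)
  \quad \text{subject to} \quad R(w) \ge R_{\min},\; W(w) \le W_{\max}, \\
\text{BCRM:}\quad & \max_{w \ge 0}\; R(w)
  \quad \text{subject to} \quad C(w) \le C_{\max},\; W(w) \le W_{\max}.
\end{align*}
```

The discrete and continuous variants differ only in which reliability measure is substituted for $R(w)$.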
They are nonlinear optimization problems and can be solved by numerical approaches such as the Nelder-Mead method .
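Since the Nelder-Mead method is an unconstrained search, the constraints have to be handled separately. The sketch below shows one common option, a quadratic penalty on the reliability shortfall; this choice, the simplified Nelder-Mead variant, the series-system reliability, and all parameter values are assumptions for illustration, not the procedure used in the paper.

```python
import math

def nelder_mead(f, x0, step=1.0, max_iter=2000, tol=1e-10):
    """Simplified Nelder-Mead: reflection, expansion, inside contraction, shrink."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x + (step if j == i else 0.0) for j, x in enumerate(x0)] for i in range(n)
    ]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        # Sort vertices from best (lowest f) to worst.
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        f_reflect = f(reflect)
        if f_reflect < values[0]:
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_expand = f(expand)
            if f_expand < f_reflect:
                simplex[-1], values[-1] = expand, f_expand
            else:
                simplex[-1], values[-1] = reflect, f_reflect
        elif f_reflect < values[-2]:
            simplex[-1], values[-1] = reflect, f_reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_contract = f(contract)
            if f_contract < values[-1]:
                simplex[-1], values[-1] = contract, f_contract
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    best_index = min(range(n + 1), key=lambda k: values[k])
    return simplex[best_index], values[best_index]

# Hypothetical two-component series system (all values are assumptions).
A, B = [10.0, 10.0], [0.1, 0.1]          # initial faults, detection rates
PHI, R_MIN, C_FIX, C_TEST = 0.9, 0.9, 1.0, 1.0

def cost(w):
    return sum(C_FIX * a * (1.0 - math.exp(-b * x)) + C_TEST * x
               for a, b, x in zip(A, B, w))

def reliability(w):
    residual = sum(a * math.exp(-b * x) for a, b, x in zip(A, B, w))
    return PHI ** residual

def penalized(w, weight=1e6):
    """RCCM objective: cost plus a quadratic penalty on the reliability shortfall."""
    if any(x < 0.0 for x in w):
        return float("inf")              # reject negative efforts
    shortfall = max(0.0, R_MIN - reliability(w))
    return cost(w) + weight * shortfall ** 2

start = [10.0, 10.0]
w_best, f_best = nelder_mead(penalized, start, step=5.0)
# Sanity check of the optimizer on a convex quadratic with minimizer (1, -2).
x_check, _ = nelder_mead(lambda p: (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 2.0) ** 2, [0.0, 0.0])
print(w_best, cost(w_best), reliability(w_best))
```

A penalty formulation only satisfies the constraint approximately, so the reported reliability should be compared against the limit before the allocation is accepted; a larger penalty weight tightens the match at the price of a harder search.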
4. Numerical Illustration
In this section, we investigate the difference in the optimal testing-resource allocation between Ukimoto et al.’s model and our model. Suppose that the software consists of 10 modules and that its architecture (module transitions) is given in Figure 1, which is a reference architecture model introduced in . The number on each arrow is the transition probability . As seen in the figure, the system has an absorbing state as an output, and thus this is the discrete case. However, to compare our model with Ukimoto et al.’s model, we assume that the execution restarts from INPUT just after it reaches OUTPUT. In this situation, the system becomes the continuous case.
Table 1 shows the expected number of initial faults , the fault detection rate , the fixed effort , and the mean execution time used in this example. Also, release time, mission time, fixing cost, and testing cost are set as , , , and , respectively. Moreover, in our model, we set the failure probability per fault as for all .
Ukimoto et al.’s model considers a maintenance cost that depends on the expected number of faults detected in the operational phase (warranty period). Concretely, when is the fixing cost per fault in the operational phase, the maintenance cost is formulated by Note that in the above equation is defined by (7); i.e., it requires the time fraction of execution . In this case, the time fraction is obtained from the steady-state probability of the CTMC. Define the row vector . Then the time fraction can be computed by finding the vector satisfying and . The last column of Table 1 shows the time fraction of execution. By using the maintenance cost, one of the testing-resource allocation problems described in  is given by Note that Ukimoto et al.’s model uses the expected number of residual faults instead of the quantitative software reliability. In the experiment, the fixing cost per fault is set as .
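The time fractions are the steady-state probabilities of the CTMC, i.e., the row vector $\pi$ with $\pi Q = 0$ whose entries sum to 1. The sketch below computes them by power iteration on the uniformized chain, a technique chosen here for simplicity rather than taken from the paper; it assumes an irreducible chain, and the three-module cyclic generator is a hypothetical example.

```python
def steady_state(Q, tol=1e-13, max_iter=100000):
    """Steady-state vector of an irreducible CTMC with generator Q (pure Python)."""
    n = len(Q)
    # Uniformization rate strictly above the largest exit rate, so that every
    # state of the uniformized DTMC has a self-loop (the chain is aperiodic).
    lam = 1.1 * max(-Q[i][i] for i in range(n))
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)] for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)
        nxt = [v / total for v in nxt]   # renormalize against rounding drift
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical cycle M1 -> M2 -> M3 -> M1 with execution rates 1, 2, and 4.
Q = [
    [-1.0, 1.0, 0.0],
    [0.0, -2.0, 2.0],
    [4.0, 0.0, -4.0],
]
fractions = steady_state(Q)
print(fractions)
```

In this cyclic example every module is visited equally often, so the time fractions are proportional to the mean execution times 1, 1/2, and 1/4, that is, 4/7, 2/7, and 1/7.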
Table 2 presents the optimal testing efforts obtained from the RCCM problem in both models under , , and . The column ‘Residual’ indicates the expected number of residual faults at release time. From the table, we find that the optimal testing efforts in our model are much greater than those in Ukimoto et al.’s model. Since more effort is spent in our model, the expected number of residual faults becomes smaller than that in Ukimoto et al.’s model. The amount of testing effort depends on the number of initial faults and the detection rate of the respective components. For instance, the numbers of initial faults in components M5 and M6 are 7.1 and 6.9, which are relatively higher than the others. Thus much effort is spent on these components. Also, M5 is the most frequently executed among them in terms of . Therefore, the testing effort for M5 is the greatest in Ukimoto et al.’s model. However, in our model, the module with the greatest testing effort is M3. In Figure 1, M3 is the module that is executed before M5. That is, this result is affected by considering the detailed transition probabilities of the operational profile.
Table 3 indicates the minimum costs (testing cost and maintenance cost), the total amount of testing effort (total effort), the total number of residual faults at release time (residual), and the quantitative software reliability in the operational phase (reliability). From the table, residual reaches the upper limit in Ukimoto et al.’s model, and reliability reaches the lower limit in our model. Also, there is a remarkable difference in testing cost between Ukimoto et al.’s model and our model. The strategy obtained from Ukimoto et al.’s model is to spend much of the cost on maintenance without regard to the quality (reliability) of the software product. In contrast, the strategy of our model is to spend much of the cost in the testing phase to guarantee the quality of the software.
Next we show an example of BCRM. In BCRM, we set and . Note that the upper limit of cost applies to the development cost, which does not include the maintenance cost. Tables 4 and 5 present the optimal testing efforts and their associated criteria. Unlike in RCCM, Ukimoto et al.’s model also provides high reliability. In this example, since the upper limit of cost is sufficiently large, both models provide high reliability. However, the effort allocation is slightly different between them.
5. Conclusions
In this paper, we have presented testing-resource allocation problems that consider software reliability in the operational phase. Concretely, by using an architecture-based software reliability model, we have formulated the quantitative software reliability in the operational phase and incorporated it into the optimization problems that determine the optimal testing-resource allocation. In the numerical example, we have compared the optimal testing-resource allocations in Ukimoto et al.’s model and our model. As a result, the decision derived from our model is stricter about the quality of the software product than the decision from Ukimoto et al.’s model. In other words, from the reliability point of view, Ukimoto et al.’s model involves the risk that the released software fails, and the reliability of the released software might be lower than expected. Safety-critical and mission-critical systems require high reliability. For such systems, a strict evaluation of operational reliability based on the software architecture is needed.
In the future, we will investigate the tendency of the BCRM problem in our model by comparing it with existing problems. Furthermore, by combining it with empirical software reliability engineering [32–34], we will discuss how to determine the model parameters in testing-resource allocation problems.
The model parameters in the experiment have been shown in the paper.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
References
H. Ohtera and S. Yamada, “Optimal allocation and control problems for software testing-resources,” IEEE Transactions on Reliability, vol. R-39, no. 2, pp. 171–176, 1990.
D. W. Coit, “Economic allocation of test times for subsystem-level reliability growth testing,” IIE Transactions, vol. 30, no. 12, pp. 1143–1151, 1998.
N. Ashrafi and O. Berman, “Optimal design of large software systems considering reliability and cost,” IEEE Transactions on Reliability, vol. 41, no. 2, pp. 281–287, 1992.
M. Nishiwaki, S. Yamada, and T. Ichimori, “Testing-resource allocation policies based on an optimal software release problem,” Mathematica Japonica, vol. 43, no. 1, pp. 91–97, 1996.
Y.-W. Leung, “Optimal reliability allocation for modular software system designed for multiple customers,” IEICE Transactions on Information and Systems, vol. E79-D, no. 12, pp. 1655–1662, 1996.
S. A. Wadekar and S. S. Gokhale, “Exploring cost and reliability tradeoffs in architectural alternatives using a genetic algorithm,” in Proceedings of the 10th IEEE International Symposium on Software Reliability Engineering (ISSRE-1999), pp. 104–113, IEEE, Boca Raton, FL, USA, 1999.
B. Yang and M. Xie, “Optimal testing-time allocation for modular systems,” International Journal of Quality and Reliability Management, vol. 18, no. 8, pp. 854–863, 2001.
K. Goševa-Popstojanova, A. P. Mathur, and K. S. Trivedi, “Comparison of architecture-based software reliability models,” in Proceedings of the 12th International Symposium on Software Reliability Engineering (ISSRE'01), pp. 22–31, IEEE, 2001.
H. Singh, V. Cortellessa, B. Cukic, E. Gunel, and V. Bharadwaj, “A Bayesian approach to reliability prediction and assessment of component based systems,” in Proceedings of the 12th International Symposium on Software Reliability Engineering (ISSRE'01), pp. 12–21, IEEE, China, November 2001.
K. Shibata, K. Rinsaka, and T. Dohi, “Metrics-based software reliability models using non-homogeneous Poisson processes,” in Proceedings of the 2006 17th International Symposium on Software Reliability Engineering, pp. 52–61, IEEE, Raleigh, NC, USA, November 2006.
H. Okamura, Y. Etani, and T. Dohi, “A multi-factor software reliability model based on logistic regression,” in Proceedings of the 2010 IEEE 21st International Symposium on Software Reliability Engineering (ISSRE '10), pp. 31–40, IEEE, San Jose, CA, USA, November 2010.
H. Okamura and T. Dohi, “A novel framework of software reliability evaluation with software reliability growth models and software metrics,” in Proceedings of the 15th IEEE International Symposium on High Assurance Systems Engineering (HASE '14), pp. 97–104, IEEE, Miami Beach, FL, USA, January 2014.