Mutation Testing Approach to Negative Testing
Negative testing addresses the important problem of assessing a system's ability to handle unexpected situations. Such situations, if unhandled, may lead to system failures that in some cases can have catastrophic consequences. This paper presents a mutation testing-based approach to generating test cases that support negative testing. The approach provides, in a systematic and human-unbiased way, test cases that effectively exercise a wide range of unexpected situations, and it can thus contribute to improving the tested system. The paper formally defines the mutation operators used to control the generation process, describes a generic framework for generating and executing the test cases, and explains how to interpret the results.
Testing plays an important role in developing dependable software systems that work to the satisfaction of their users [1, 2]. Thorough testing should include both positive and negative testing. Positive testing [2, 3] comprises various activities related to checking whether a system fulfills all requirements of its stakeholders. A number of testing approaches offer ways to accomplish this task. Although the approaches vary in detail, they usually share the concept of using some form of specification that captures the requirements to select test cases, and then executing the test cases against the system to see whether its responses match the specified ones. The results of thoroughly performed positive testing can usually be considered a clear and dependable indication of the degree to which a system is correct in terms of fulfilling the specified requirements.
However, a system working correctly under normal conditions may still be unable to handle some untypical, unexpected, and basically undesired situations in an adequate way. Lack of adequate handling of such situations can cause the system to crash or to fail to provide proper outcomes, which in turn may lead to serious consequences, including damage to property or loss of human life. Hence, there is also a need to perform negative testing [2–4]. Negative testing focuses on assessing the behaviour of a system while subjecting it to conditions outside its normal scope of operation. A typical requirements specification states what a system is expected to do in certain situations rather than describing unexpected situations the system may encounter and their adequate handling, so it cannot be used directly to support negative testing. Thus, it is usually left to the expertise and creativity of a tester to determine all the untypical and unexpected situations that may be of interest and to provide test cases checking how the system will react to them.
In this paper, a mutation testing-based approach to providing negative test cases for software systems is presented. Originally, mutation testing was used to assess the quality of suites of positive test cases by checking their ability to detect faulty versions of a tested program (so-called mutants), generated from the source code of the program by changing it in small ways [5, 6]. However, as it provides general guidance for producing “faulty” artifacts, it can also be applied to generate negative test cases by modifying the positive ones. Test cases obtained in this way can reflect a wide variety of unexpected usage scenarios for a system and thus contribute to better testing. The concept of mutating test cases rather than a program was outlined in  as a support for model evaluation. This paper builds on that concept, formalizes it, and focuses on object-oriented software systems. In particular, it formally describes the rules (so-called mutation operators) controlling the generation of negative test cases for such systems and outlines a generic framework for providing and using the test cases.
The paper is organized as follows. In Section 2, essential background information and related research concerning mutation testing and negative testing are presented. Section 3 states the problems related to the generation of negative test cases and briefly explains the reasons behind the proposed solution. Section 4 provides key information about positive and negative test cases and introduces an example system used in the subsequent section to illustrate the approach. In Section 5, mutation operators are formally defined and discussed, and Section 6 presents the generic framework for applying mutation testing to generate negative test cases and briefly discusses a possible way of assessing a system within the framework. Section 7 states the main problem related to the approach, and the last section concludes the work and indicates directions for future work.
2. Background and Related Research
The approach presented in this paper deals with applying mutation testing for generating negative tests; hence, these areas are of particular interest for the research.
2.1. Negative Testing
Negative testing aims at revealing weak points of a system by checking whether it is able to handle unexpected situations adequately or at least recover gracefully in case of a failure [2–4]. Not much work is dedicated solely to this problem, but some general testing techniques provide advice on dealing with it. The most prominent example of such techniques is equivalence partitioning [2, 8, 9]. The idea behind this technique is to analyze the input domain of a system and partition it into several equivalence classes, including classes representing invalid values, and then to select one value from each class to define test cases. Test cases defined on the basis of invalid classes can be considered negative ones. Nevertheless, such approaches target only input values; thus, they deal with only one way of invalidating the normal usage scenarios of a system. Another testing technique that can be related directly to negative testing is stress testing [10–12]. Stress testing focuses on checking how a system behaves under extreme conditions; thus, typical test cases provided for this purpose try to overwhelm the system with resources or take the resources away. However, such approaches do not cover less extreme situations that may result from more subtle changes in using a system.
2.2. Mutation Testing
Mutation testing is a fault-based software testing technique introduced originally to assess the quality of a suite of test cases with respect to its ability to detect faults in programs [5, 6]. Application of this technique involves generating a number of faulty versions of the program, so that each of them differs from the original one by only one small change. The faulty versions, called mutants, are then run with the assessed suite of test cases. Once all test cases are executed, the so-called mutation score for the suite is calculated as the ratio of the number of mutants detected by the test cases to the total number of nonequivalent  mutants generated from the original program. The mutation score expresses, in a quantitative way, the quality of the suite of test cases being assessed.
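The mutation score described above can be sketched as a small function. This is only an illustration: the function name and its parameters are assumptions introduced here, and the number of equivalent mutants is taken as given, since identifying them usually requires manual analysis.

```python
def mutation_score(detected: int, generated: int, equivalent: int = 0) -> float:
    """Ratio of detected mutants to nonequivalent mutants (names are illustrative)."""
    nonequivalent = generated - equivalent
    if nonequivalent <= 0:
        raise ValueError("there are no nonequivalent mutants to score against")
    if not 0 <= detected <= nonequivalent:
        raise ValueError("detected must lie between 0 and the nonequivalent count")
    return detected / nonequivalent


# 60 mutants generated, 10 judged equivalent, 45 detected by the suite.
print(mutation_score(detected=45, generated=60, equivalent=10))  # prints 0.9
```

A score of 1.0 would mean that every nonequivalent mutant was detected by at least one test case in the suite.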
Nowadays, the application area of mutation testing is not limited to assessing test cases at the implementation level. A number of works have shown examples of applying it to different formalisms, at various levels of software development, and for the assessment as well as the generation of test cases [14–24]. Works dedicated to mutation testing-based generation of test cases usually share the idea of mutating a system, its model, or its specification and then selecting test cases that are able to detect the mutants [15, 18, 19, 23–26]. Different methods are used to select the test cases based on the mutants. In the research presented in , constraint solving was used to derive positive test cases from mutated constraints imposed on functions. A conversion of a graph-based model to a formal grammar-based description was proposed in . The grammar was further processed to obtain test cases. An interesting approach was presented also in . Its authors applied mutations to a model of a system or to specified properties and used a model-checking technique to generate counterexamples showing violation of certain desired or mutated properties. In both cases, the counterexamples were seen as test cases, either negative or positive. Such an approach can apparently provide both positive and negative test cases. However, as mutants can behave in ways that are incorrect with respect to a system specification or in an entirely unpredictable way, some of the negative test cases may actually detect specification-related inconsistencies that can also be detected by positive test cases.
The concept of providing negative test cases by applying mutation testing to test cases directly is not widely explored. To the best of the author’s knowledge, so far only a few other researchers have followed the idea. The approach described in  is the closest to the approach outlined in  and the one further presented in this paper. The approach in  deals with measuring specification-implementation concordance, but it also aims at modifying test cases directly. However, the changes introduced by its authors applied only to data processed by a program and were random. The approach presented in this paper provides a wider range of changes, introduced in a controlled way.
Several studies have demonstrated that mutation testing is an effective and trustworthy technique supporting test generation and assessment at various levels of software development [28–31]. The strength of mutation testing lies in its systematic and human-unbiased way of generating mutants; thus, it seems well suited also to the task of providing negative test cases that are able to detect various undesired situations.
3. Problems and Solution
The main purpose of negative testing is to check whether a tested system is able to handle properly any unexpected and undesired situations. Thus, selection of negative test cases serving this purpose poses a challenge. In particular, two issues need to be resolved. The first one is to decide what situations are unexpected and how to trigger them, and the second issue is to determine what their proper handling should be.
Two assumptions help to set a basis for dealing with the above issues:
(1) Any situation that is not given by a requirements specification is regarded as an unexpected and undesired one.
(2) The proper system response to any of these situations is to follow an error handling procedure.
In the context of negative testing, both assumptions are justified, because a system should only work, and be used, in the ways described by the specification, and it should not do anything else. An attempt to use the system in any other way should be considered incorrect and be forbidden.
Taking into account the first assumption, it can be further assumed that situations of interest for negative testing can be defined upon the specified ones by contradicting expected ways of using a system. The contradiction can be achieved by invalidating either inputs triggering the expected behaviours of the system or the conditions that are expected to hold for it.
A typical test case defines conditions and inputs needed to force the tested system to behave in a given way and usually expected outcomes the system should produce . Thus, to define negative test cases, one needs to select representative combinations of invalid conditions and inputs. Accurate outcomes for negative test cases cannot be determined. However, the second assumption helps to decide if actual outcomes resulting from executing a negative test case indicate whether the system is able to prevent unwanted behaviour or not.
A majority of unexpected situations are caused by slightly incorrect use of a system (by human users or external systems); therefore, a suite of negative test cases that vary only slightly from the positive ones should be adequate for triggering a wide range of unexpected situations. The approach presented in this paper proposes to use mutation testing to generate such a suite. Application of this technique supports a systematic and human-unbiased way of introducing small modifications of various kinds.
4. Positive and Negative Test Cases for Object-Oriented Systems
The approach presented in this paper targets object-oriented software systems. An object-oriented system consists of interacting objects, but from a user’s perspective, such a system can be seen as a single entity (an object) having its own properties (attributes) and providing specific services (operations) to the users. A simple model of an ATM, used in this paper to illustrate the approach, is shown in Box 1. The system is presented in the form of a class defining the attributes and operations, each accompanied by a short description of its meaning.
To use the system, the user is required to follow certain scenarios. For example, to successfully withdraw money, one needs to insert a card, enter a valid PIN, select the withdrawal option, enter a correct amount of money to withdraw, confirm it, take the dispensed money (assuming that the requested amount is currently available), and take the ejected card. So, to test whether the system is able to perform according to such a scenario, a positive test case reflecting the scenario is required. Hence, a positive test case should describe the events triggering certain system operations (i.e., inputs), the expected system responses (i.e., outputs), and the initial settings ensuring that the scenario can be successfully accomplished (i.e., conditions). Formally, a test case for an object-oriented system is defined as follows (Definition 1).
Definition 1. A test case for object-oriented system is a triple (, , and ) where
(i) is a vector () of conditions defining the state of where
(a) is a call of a constructor instantiating and initializing an object representing the system where
(1) obj is an instantiated object of ,
(2) is the constructor defined for , and and are sequences of values and types, respectively, of the same length (e.g., ),
(b) is the length of ,
(ii) is a vector of inputs, where
(a) is a single input triggering some behaviour of , where
(1) obj is an object of targeted by ,
(2) is a call of method (a suite of operations defined for ), and and are sequences of values and types, respectively, of the same length (e.g., ),
(b) is the length of ,
(iii) is a vector of outputs produced by where
(a) is an output provided for ( may be empty, denoted as 0, when no output is expected for ),
(b) is the length of , and .
An example of a positive test case representing the scenario for successful withdrawal of money, as generally described earlier in this section, is shown in Box 2. While the format of the test case in Box 2 does not imply the use of a specific formalism, it follows the object-oriented principles and it can be easily adapted for any object-oriented approach.
As given by Definition 1, the test case from Box 2 is denoted as , where(i), such that , where obj = atm, = ATM, , and ,(ii) such that(a), where obj = atm, = getCard, and and are empty,(b) = , where obj = atm, = getPIN, = (1234), and = (int),(c)The remaining inputs are defined in the same way as shown in (a) and (b) above.
such that is “check balance, withdraw cash, and quit,” = “card ejected,” , and the remaining outputs are empty.
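The structure given by Definition 1 can be sketched as a pair of small data classes. This is a minimal illustration, not the notation of the paper: the class names `Call` and `TCase` are assumptions, the scenario is abbreviated to four inputs, the parameter types of the ATM constructor are assumed to be `int` and `float`, and the position of the menu output is a guess made only to show that outputs are aligned with inputs.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class Call:
    """A constructor or operation call: target object, name, values, types."""
    obj: str
    name: str
    values: Tuple[Any, ...] = ()
    types: Tuple[type, ...] = ()

    def __post_init__(self) -> None:
        # Definition 1 requires values and types to have the same length.
        if len(self.values) != len(self.types):
            raise ValueError("values and types must have the same length")


@dataclass(frozen=True)
class TCase:
    """A test case as a triple of conditions, inputs, and outputs."""
    conditions: Tuple[Call, ...]
    inputs: Tuple[Call, ...]
    outputs: Tuple[Optional[str], ...]  # None stands for an empty output

    def __post_init__(self) -> None:
        # Assumption: one output slot (possibly empty) per input.
        if len(self.outputs) != len(self.inputs):
            raise ValueError("one output slot is required per input")


# Abbreviated withdrawal scenario; the output placement is illustrative only.
positive = TCase(
    conditions=(Call("atm", "ATM", (1, 100000.00), (int, float)),),
    inputs=(
        Call("atm", "getCard"),
        Call("atm", "getPIN", (1234,), (int,)),
        Call("atm", "enterAmount", (500,), (int,)),
        Call("atm", "confirm"),
    ),
    outputs=(None, "check balance, withdraw cash, and quit", None, None),
)
print(len(positive.conditions), len(positive.inputs))  # prints 1 4
```

Making the classes immutable fits the later use of positive test cases as templates: a mutation produces a new test case and leaves the original untouched.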
However, positive test cases can check only scenarios that are defined by specification. A small user’s mistake or lack of resources needed to successfully run a certain scenario may cause the system to behave in an unknown and most likely unacceptable way, to crash or to hang. Some typical deviations from a given scenario, such as entering an invalid PIN or unavailability of requested amount of money while running the withdrawal scenario, are usually defined in a requirements specification; thus, they are well defined and not unexpected and therefore should be tested by positive test cases.
However, numerous other situations resulting from small deviations from known scenarios could be truly unexpected and hard to predict. Although such situations may occur rarely, a high-quality system should be prepared to handle them by following some error handling procedure. To check a system’s “readiness” to handle unexpected situations, negative test cases have to be used. As stated in the previous section, a negative test case can be obtained by changing a part of a positive test case. For example, to check what will happen when a user does not confirm the withdrawal after entering the amount, a negative test case can be obtained by simply removing the call for the confirmation from a positive test case defined for the withdrawal scenario. A suite of negative test cases, generated by means of mutation testing, can check a system against a wide range of unexpected situations, including ones that even an experienced test developer would not be able to design.
A negative test case has the same structure as a positive one, so it will not be defined separately. Examples of negative test cases are given in Section 5.
5. Mutation Operators
A mutation operator is a transformation rule that defines how to modify certain features of the artifact undergoing mutations . There is a subset of the so-called traditional mutation operators that are fairly universal and can be easily adapted for different programming or modeling languages, but in general mutation operators are formalism-specific [7, 14, 16, 19, 32–34]. Therefore, application of mutation testing in a different context, as in this approach, should always include defining an adequate suite of mutation operators.
An adequate suite of mutation operators should at least
(i) cover all features of the targeted formalism,
(ii) generate syntactically correct (feasible) mutants.
The syntax of test cases is rather simple compared with the syntax of programming languages, but it is also significantly different from it. Therefore, the mutation operators defined for programming languages are not applicable in the context of mutating test cases. A suite of 9 mutation operators targeting test cases was introduced and informally described in . One new mutation operator (Condition Part Swap) is introduced here, and the operator Operation Target Replacement mentioned in  is discarded, because this approach assumes that only one object representing the system is created, so the operator would be of no use. All the operators target conditions and inputs. Their formal definitions are given by Definitions 2–10.
Let be a mutation operator of type and let be a test case, such that as given by Definition 1. Formally, for a given test case , a mutation operator produces a mutated test case what is denoted by : , where is a mutated test case.
Definition 2. Operation Call Deletion operator () is a mutation operator that deletes one input from the sequence of inputs given by , what is formally defined: : , where , , , , and ; that is, is obtained by removing from .
An example of a mutated test case obtained by applying this operator to the test case given in Box 2 is shown in Box 3. The mutant was generated by removing the call for operation confirm(). The mutant in Box 3 should be able to check what would happen when a user does not confirm the entered amount of money to withdraw.
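The deletion described above can be sketched in a few lines. This is only an illustration under stated assumptions: a call is represented as a `(name, arguments)` tuple, the function name `apply_ocd` borrows the naming used in Section 6, and the input sequence is an abbreviated version of the withdrawal scenario rather than the full content of Box 2.

```python
from typing import Any, List, Tuple

Call = Tuple[str, Tuple[Any, ...]]  # (operation name, argument values)


def apply_ocd(inputs: List[Call], index: int) -> List[Call]:
    """Return a copy of `inputs` with the call at `index` removed."""
    if not 0 <= index < len(inputs):
        raise IndexError("index does not point at an input")
    return inputs[:index] + inputs[index + 1:]


# Abbreviated withdrawal scenario (illustrative).
positive_inputs: List[Call] = [
    ("getCard", ()),
    ("getPIN", (1234,)),
    ("enterAmount", (500,)),
    ("confirm", ()),
]

mutant = apply_ocd(positive_inputs, 3)  # drop the call to confirm()
print([name for name, _ in mutant])  # prints ['getCard', 'getPIN', 'enterAmount']
```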
Definition 3. Operation Call Replacement operator () is a mutation operator that replaces one input in the sequence of inputs given by by another input, what is formally defined: : , where , , , , and and ; that is, is obtained by replacing with which is also a call of operation of .
An example of a mutated test case obtained by applying this operator to the test case given in Box 2 is shown in Box 4. The mutant was generated by replacing operation confirm() with operation quit(). Such a mutated test case should be able to check how the system proceeds when a different activity is requested (in particular, whether it continues the withdrawal scenario or provides the outputs expected for a scenario with an aborted withdrawal).
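A possible sketch of the replacement follows. The `(name, arguments)` representation of a call, the function name `apply_ocr`, and the idea of passing ready-made candidate calls are assumptions made for illustration; the sketch yields one mutant per candidate operation that differs from the replaced one.

```python
from typing import Any, List, Tuple

Call = Tuple[str, Tuple[Any, ...]]  # (operation name, argument values)


def apply_ocr(inputs: List[Call], index: int, candidates: List[Call]) -> List[List[Call]]:
    """Return one mutant per candidate call whose operation differs from inputs[index]."""
    if not 0 <= index < len(inputs):
        raise IndexError("index does not point at an input")
    replaced_name = inputs[index][0]
    return [
        inputs[:index] + [candidate] + inputs[index + 1:]
        for candidate in candidates
        if candidate[0] != replaced_name
    ]


positive_inputs: List[Call] = [
    ("getCard", ()),
    ("getPIN", (1234,)),
    ("enterAmount", (500,)),
    ("confirm", ()),
]

# Only two parameterless operations are offered as replacements here.
mutants = apply_ocr(positive_inputs, 3, [("confirm", ()), ("quit", ())])
print([name for name, _ in mutants[0]])  # prints ['getCard', 'getPIN', 'enterAmount', 'quit']
```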
Definition 4. Operation Call Insertion operator () is a mutation operator that inserts additional input into the sequence of inputs given by , what is formally defined: : , where , , , , and and ; that is, is obtained by inserting into it which is th operation of .
An example of a mutated test case obtained by applying this operator to the test case given in Box 2 is shown in Box 5. The mutant was generated by adding the call for operation quit() after the call for operation confirm(). Such a mutated test case should be able to check what would happen when a user tries to abort the withdrawal of money after confirming the entered amount.
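The insertion can be sketched as follows, again assuming a `(name, arguments)` representation of calls and a hypothetical function name `apply_oci`; the new call is placed directly after the call at the given position.

```python
from typing import Any, List, Tuple

Call = Tuple[str, Tuple[Any, ...]]  # (operation name, argument values)


def apply_oci(inputs: List[Call], index: int, new_call: Call) -> List[Call]:
    """Return a copy of `inputs` with `new_call` inserted after position `index`."""
    if not 0 <= index < len(inputs):
        raise IndexError("index does not point at an input")
    return inputs[:index + 1] + [new_call] + inputs[index + 1:]


positive_inputs: List[Call] = [
    ("getCard", ()),
    ("getPIN", (1234,)),
    ("enterAmount", (500,)),
    ("confirm", ()),
]

mutant = apply_oci(positive_inputs, 3, ("quit", ()))  # quit() right after confirm()
print([name for name, _ in mutant][-2:])  # prints ['confirm', 'quit']
```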
Definition 5. Operation Call Swap operator () is a mutation operator that changes the order of two subsequent inputs of the sequence of inputs given by , what is formally defined: : , where , , , , and ; that is, is obtained by swapping inputs and .
An example of a mutated test case obtained by applying this operator to the test case given in Box 2 is shown in Box 6. The mutant was generated by swapping the calls for operations enterAmount(500) and confirm(). Such a mutated test case should be able to check what would happen when users confirm before entering the amount of money to withdraw and then try to enter the amount afterwards (in particular, whether the system continues the withdrawal scenario).
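A minimal sketch of the swap, under the same assumptions as before (calls as `(name, arguments)` tuples, hypothetical function name `apply_ocs`): the call at a given position changes places with the call that immediately follows it.

```python
from typing import Any, List, Tuple

Call = Tuple[str, Tuple[Any, ...]]  # (operation name, argument values)


def apply_ocs(inputs: List[Call], index: int) -> List[Call]:
    """Return a copy of `inputs` with the calls at `index` and `index + 1` swapped."""
    if not 0 <= index < len(inputs) - 1:
        raise IndexError("index must point at a call that has a successor")
    mutant = list(inputs)
    mutant[index], mutant[index + 1] = mutant[index + 1], mutant[index]
    return mutant


positive_inputs: List[Call] = [
    ("getCard", ()),
    ("getPIN", (1234,)),
    ("enterAmount", (500,)),
    ("confirm", ()),
]

mutant = apply_ocs(positive_inputs, 2)  # confirm() now precedes enterAmount(500)
print([name for name, _ in mutant])  # prints ['getCard', 'getPIN', 'confirm', 'enterAmount']
```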
Definition 6. Operation Parameter Replacement operator () is a mutation operator that replaces a value of one parameter of an operation call with another value of the same type, what is formally defined: : , where , , , , and , where and and is a value of type .
An example of a mutated test case obtained by applying this operator to the test case given in Box 2 is shown in Box 7. The mutant was generated by replacing the value 500 of the parameter of the call for operation enterAmount() with the value 0. Such a mutated test case should be able to check what the system will do when a user does not enter a valid value.
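The parameter replacement can be sketched as shown below. The representation of calls and the function name `apply_opr` are assumptions; the type check reflects the requirement of Definition 6 that the new value has the same type as the one it replaces.

```python
from typing import Any, List, Tuple

Call = Tuple[str, Tuple[Any, ...]]  # (operation name, argument values)


def apply_opr(inputs: List[Call], call_index: int, param_index: int, new_value: Any) -> List[Call]:
    """Return a copy of `inputs` with one argument replaced by a same-typed value."""
    name, args = inputs[call_index]
    if not 0 <= param_index < len(args):
        raise IndexError("param_index does not point at an argument")
    old_value = args[param_index]
    if type(new_value) is not type(old_value):
        raise TypeError("the replacement must have the same type as the original value")
    if new_value == old_value:
        raise ValueError("the replacement must differ from the original value")
    new_args = args[:param_index] + (new_value,) + args[param_index + 1:]
    return inputs[:call_index] + [(name, new_args)] + inputs[call_index + 1:]


positive_inputs: List[Call] = [
    ("getCard", ()),
    ("getPIN", (1234,)),
    ("enterAmount", (500,)),
    ("confirm", ()),
]

mutant = apply_opr(positive_inputs, 2, 0, 0)  # enterAmount(500) becomes enterAmount(0)
print(mutant[2])  # prints ('enterAmount', (0,))
```

In practice the replacement values would likely come from a predefined set per type, as Section 6 suggests; here a single value is passed explicitly to keep the sketch short.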
Definition 7. Operation Parameter Swap operator () is a mutation operator that changes the order of two values of parameters of an operation call, what is formally defined: : , where , , , , and , where and for and .
The principles behind applying this operator are the same as those for the Condition Part Swap operator given by Definition 10. The example test case used here to illustrate the application of mutation operators will not undergo the modifications defined by , but the example in Box 10 illustrates the idea of swapping parameters.
Definition 8. Condition Part Deletion operator () is a mutation operator that deletes one element of a condition in the condition sequence, what is formally defined: : , where , , , , and , where and and .
An example of a mutated test case obtained by applying this operator to the test case given in Box 2 is shown in Box 8. The mutant was generated by removing one value from the call of constructor ATM(1, 100000.00). Deletion of a value reflects the concept of a missing setting that defines a state of the system required to run some scenario. Such a mutant should be able to check how the scenario for withdrawal of money will run when the amount of money available is not set.
Definition 9. Condition Part Replacement operator () is a mutation operator that replaces a value of one parameter of a constructor call with another value of the same type, what is formally defined: : , where , , , , and , where and and is a value of type .
An example of a mutated test case obtained by applying this operator to the test case given in Box 2 is shown in Box 9. The mutant was generated by replacing the value 100000.00 in the constructor ATM(1, 100000.00) with 0. Such a mutant should be able to check how the scenario for withdrawal of money will run when there is no money available in the ATM.
Definition 10. Condition Part Swap operator () is a mutation operator that changes the order of two parameters values of a constructor call, what is formally defined: : , where , , , , and , where and for and .
An example of mutated test cases obtained by applying this operator to the test case given in Box 2 is shown in Box 10. The mutant was generated by swapping the values in the constructor ATM(1, 100000.00). Such mutant should be able to check what will happen when certain conditions setting the correct state of ATM do not hold.
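The three condition-related operators can be sketched together on the argument list of the constructor call ATM(1, 100000.00). The function names and the use of a plain tuple for the constructor arguments are assumptions made for illustration.

```python
from typing import Any, Tuple

Values = Tuple[Any, ...]


def apply_cpd(values: Values, index: int) -> Values:
    """Condition Part Deletion: drop one constructor argument."""
    if not 0 <= index < len(values):
        raise IndexError("index does not point at a value")
    return values[:index] + values[index + 1:]


def apply_cpr(values: Values, index: int, new_value: Any) -> Values:
    """Condition Part Replacement: replace one argument with a same-typed value."""
    if not 0 <= index < len(values):
        raise IndexError("index does not point at a value")
    if type(new_value) is not type(values[index]):
        raise TypeError("the replacement must have the same type as the original value")
    return values[:index] + (new_value,) + values[index + 1:]


def apply_cps(values: Values, first: int, second: int) -> Values:
    """Condition Part Swap: exchange two arguments."""
    if first == second:
        raise ValueError("two different positions are required")
    swapped = list(values)
    swapped[first], swapped[second] = swapped[second], swapped[first]
    return tuple(swapped)


atm_args: Values = (1, 100000.00)
print(apply_cpd(atm_args, 1))       # prints (1,)
print(apply_cpr(atm_args, 1, 0.0))  # prints (1, 0.0)
print(apply_cps(atm_args, 0, 1))    # prints (100000.0, 1)
```

Note that a deletion or a swap may yield a constructor call that the system cannot accept at all; whether such a mutant counts as feasible depends on the formalism used for the test cases.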
Negative test cases generated by applying operators or may have a counterpart in test cases obtained by using methods based on equivalence partitioning. Other kinds of mutants rather do not have their equivalents in test cases obtained by means of other test generation methods.
The examples in Boxes 3–10 show that some mutants generated by the mutation operators reflect unexpected, but still viable, scenarios, while others seem to represent quite unrealistic scenarios. This is a general characteristic of mutation testing; mutants rarely represent errors that can happen in reality. However, it was shown by several researchers (see references in ) that mutated programs and models imitate real-life errors well enough to be considered a reliable basis for the assessment of test cases. Thus, a similar tendency can be expected in this context. The seemingly impossible scenarios represented by mutated test cases may be able to detect serious problems that would never come into the focus of a test designer, even an experienced one.
The suite of mutation operators covers all features of a typical test case that are of interest in the context of negative testing. Thus, the suite should be sufficient to provide negative test cases that are able to trigger a wide range of unexpected situations.
6. A Generic Framework for Generating Negative Test Cases
An outline of the generic framework for generating and using negative test cases is presented in Figure 1. The main steps performed within this framework follow the general principles of applying mutation testing: generation of mutants (i.e., the negative test cases) and their execution.
Let us for the rest of this section suppose that denotes a suite of positive test cases provided for a tested system , denotes a suite of negative test cases, where is a negative test case generated for a positive test case by applying a mutation operator, and denotes a suite of mutation operators defined for test cases.
6.1. Generation of Mutants
The first step, generation of mutated (negative) test cases, requires the suite of positive test cases () to be provided. It is further required that(i)the suite is complete with respect to a requirements specification for system ,(ii)all test cases from were executed against and has passed them all.
The expected outcome of this step is a suite of negative test cases (). A generic procedure for generating mutants from a given suite of positive test cases is given in Box 11. It gives the key steps of generating negative test cases, but details concerning their actual implementation will depend on the actual implementation of the theoretical model.
To generate the suite of negative test cases, each positive test case () has to be parsed to recognize and return, one by one, all occurrences of conditions and inputs, the elements that are to be modified by applying adequate mutation operators to them. So, each call of the operation nextElement() returns one element (denoted in Box 11 by e.el) and its type (denoted in Box 11 by e.tp). Depending on the type e.tp returned for an element e.el, some of the “apply an operator” operations are called. Independent of the specific functionality of the “apply an operator” operation, each of them takes as arguments the element e.el returned by current call to nextElement() and currently analyzed positive test case and returns a subset of newly generated negative test cases. A generation of one negative test case always consists in copying and modifying the element e.el of the copy according to the change defined by the given “apply an operator” operation. The cost, in terms of computational complexity, of generating one negative test case is a sum of the costs of copying the positive test case ( (the notations used to express the complexity of all operations in Box 11 are the same as those in Definition 1)) and of introducing one modification . Considering the fact that is comparable to and should not be greater than , the cost of generating one negative test case can be approximated by .
There are nine “apply an operator” operations, one for each mutation operator. The cost of one call of a given operation depends on the number of negative test cases it generates and returns. The operations are listed below with brief descriptions, and for each of them the cost of one call is given:
(i) applyOCD() takes, as the e.el argument, a system operation call and returns one negative test case obtained by deleting the operation call (e.g., by omitting it while copying the test case given as its second argument into one new negative test case). The cost of one call of applyOCD() is equal to the cost of generating one negative test case – .
(ii) applyOCR() takes, as the e.el argument, a system operation call and returns a subset of negative test cases. Each negative test case is obtained by replacing the received call to the system operation by a call to a system operation (). The number of negative test cases generated in this way equals ; hence, the cost of one call of applyOCR() is .
(iii) applyOCI() takes, as the e.el argument, a system operation call and returns a subset of negative test cases. Each negative test case is obtained by inserting a call to a system operation () after the received call to the system operation . The number of negative test cases generated in this way equals again , for all but the first call of applyOCI(). In the first call of applyOCI(), that is, for the first input of , the calls to the remaining system operations are inserted also before the call to . The cost of one call of applyOCI() is .
(iv) applyOCS() takes, as the e.el argument, a system operation call and returns one negative test case obtained by reversing the order of two subsequent system operation calls given by : the received call to system operation and the subsequent call to a system operation . The cost of one call of applyOCS() is .
(v) applyOPR() takes, as the e.el argument, a system operation call and returns a subset of negative test cases. Each negative test case is obtained by replacing the value passed in the received call to operation with a value from a set of predefined values of type . The number of negative test cases generated by replacing the value with all the values defined for it equals ; thus, the cost of replacing all values given by the received call to system operation equals , and the cost of one call of applyOPR() is .
(vi) applyOPS() takes, as the e.el argument, a system operation call and returns a subset of negative test cases. Each test case is obtained by swapping two values and passed in the received call to a system operation . The number of negative test cases generated in this way equals the number of swaps that have to be done; that is, for values passed in the operation call. So, the cost of one call of applyOPS() is .
(vii) applyCPD() takes, as the e.el argument, a condition and returns a subset of negative test cases. Each negative test case is obtained by removing a value from the list of values passed in the received call to the system constructor . The number of negative test cases generated in this way equals the number of values passed in the call; thus, the cost of one call of applyCPD() is .
(viii) applyCPR() takes, as the e.el argument, a condition and returns a subset of negative test cases. The negative test cases are generated in basically the same way as by the applyOPR() operation, and the cost of one call of applyCPR() is also .
(ix) applyCPS() takes, as the e.el argument, a condition and returns a subset of negative test cases. The negative test cases are generated in basically the same way as by the applyOPS() operation, and the cost of one call of applyCPS() is also .
The cost of mutating one positive test case is proportional to the number of negative test cases generated on its base and is and the total cost of applying the procedure in Box 11 is .
6.2. Execution of Mutants
The second step uses the negative test cases () generated in the previous step and additionally requires the system to be provided. The system should be correct with respect to its specification; that is, it should have passed all positive test cases from , as stated in the previous subsection. The purpose of this step is to gather information that helps to assess the system's ability to handle unexpected situations in a safe way. A generic procedure for executing mutated test cases follows a typical test execution procedure. Its outline is presented in Box 12.
Each iteration of this procedure consists in resetting the system to its initial state and then running the system (e.g., a program or an executable model) while feeding it with inputs from one negative test case and saving a corresponding outcome . An outcome should include actual outputs produced by the system for inputs given by test case and a verdict linked to the test case (rejected or accepted). The cost of executing the negative test cases depends on their number and size and on the cost of running the system (the cost of running a system cannot be given here as it depends on the system itself).
A more detailed procedure for executing negative test cases and the actual format of outcomes are not defined here, as they depend on the testing environment used to run the system and execute the negative test cases.
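The loop described above (reset, run, save the outcome) can be sketched as follows. This is only an illustration under several assumptions: the system is a Python object created anew for every test case, a test case is a list of (operation name, argument tuple) pairs, and an uncaught exception stands in for a crash. The names Outcome, execute_negative_tests, and ToyAtm are hypothetical. Detecting a hanging system would additionally require a timeout, which is omitted here.

```python
from dataclasses import dataclass, field

@dataclass
class Outcome:
    test_case: list
    outputs: list = field(default_factory=list)
    verdict: str = "accepted"  # "accepted" or "rejected"

def execute_negative_tests(system_factory, negative_tests):
    """Run every negative test case against a freshly created system."""
    outcomes = []
    for test_case in negative_tests:
        system = system_factory()  # reset: a new instance is in its initial state
        outcome = Outcome(test_case)
        try:
            for operation, args in test_case:
                outcome.outputs.append(getattr(system, operation)(*args))
        except Exception as error:  # assumption: an exception models a crash
            outcome.outputs.append(repr(error))
            outcome.verdict = "rejected"
        outcomes.append(outcome)
    return outcomes

class ToyAtm:
    """Hypothetical stand-in system used only to exercise the loop."""
    def __init__(self):
        self.card_inserted = False

    def insertCard(self):
        self.card_inserted = True
        return "card accepted"

    def enterAmount(self, amount):
        if not self.card_inserted:
            raise RuntimeError("no card")
        return f"dispensed {amount}"

outcomes = execute_negative_tests(
    ToyAtm,
    [[("enterAmount", (500,))],
     [("insertCard", ()), ("enterAmount", (500,))]],
)
```

Counting the verdicts in the returned list gives the numbers of rejected and accepted test cases, while the saved outputs remain available for manual analysis.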
6.3. Analysis of Negative Testing Results
After finishing both steps performed within the framework, the outcomes have to be analyzed to draw some conclusions regarding the ability of the assessed system to detect and adequately handle unexpected situations.
When mutation testing is applied in order to assess the quality of a suite of test cases, the assessment is simply based on calculating the number of detected (i.e., providing outputs different from the original program for at least one test case from the suite) and undetected (i.e., providing the same outputs as the original program for all test cases from the suite) mutants. Only the undetected mutants need to be further manually analyzed to decide if they are equivalent or were not detected due to the insufficiency of the suite.
In the context of applying mutation testing to assess a system, the conclusions cannot be drawn in such a straightforward way. All results of running a system with the mutated test cases have to be analyzed, and most of the work needs to be performed manually. The system may respond to a mutated test case by crashing or hanging, or by running the unexpected scenario without breaking and providing erroneous outcomes.
A crash or a hang is here an obvious case of “detecting” (rejecting) a mutated test case. However, in this context, it indicates that the system provides no handling of the unexpected situation checked by the test case. Therefore, it is recommended to study the test cases and execution traces of the system to identify the problem behind the crash and to propose adequate means of fixing it.
When a system does not crash while executing a particular mutated test case, the test case is considered “undetected” (accepted). Acceptance of a mutated test case means either that the system has provided some error handling, or that it was unable to recognize the unexpected situation and actually ran the scenario given by the mutant, providing some (erroneous) outputs, or that the mutant was an equivalent one. Each case of accepting a mutant should be carefully analyzed to see what has caused the acceptance.
Let us consider the mutated test case presented in Box 2 as an example. When the test case is executed, the ATM can wait for some period of time and then either abort the withdrawal and eject or withhold the card, or first issue a message pointing to the lack of the expected input and abort the withdrawal only if the input is still not provided. Both responses seem to be acceptable ways of handling this unexpected situation; thus, they could be considered adequate error handling. The system may also crash (and switch off) or hang (and be unable to abort the withdrawal or undertake any other action), which points to its inability to deal with the situation. Another course of action that may be taken by the system is to continue the scenario and dispense some money, which should be considered a case of not recognizing the situation as an unexpected one and providing invalid outcomes. In general, cases like the last one could be dangerous, because they may deceive the system's users into thinking that the outcomes they obtained are correct and into using them as such.
A mutated test case should be considered equivalent if it reflects a scenario represented by some of the positive test cases and forces the system to work in a way expected for the positive scenario. It is, however, not required that the outcomes be identical to the outcomes provided by executing the positive test case. For example, for the positive test case presented in Box 1, a mutant obtained by replacing the value 500 in the call of operation enterAmount(500) with another valid value (e.g., 100) will be an equivalent mutant. In general, there are no satisfactory solutions to the problem of identifying equivalent mutants. It is possible to apply some approaches that help to reduce the number of equivalent mutants (e.g., by rejecting replacement of values with values belonging to the same equivalence class), but in general they cannot be avoided and they cannot be identified without human assistance.
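The idea of rejecting replacement values that belong to the same equivalence class as the original value can be sketched as below. The partition of withdrawal amounts, the balance of 1000, and the function names are assumptions chosen only for illustration; an actual partition would have to be derived from the specification of the tested system.

```python
def equivalence_class(amount, balance=1000):
    """Hypothetical partition of the values passed to enterAmount()."""
    if not isinstance(amount, int):
        return "non-integer"
    if amount <= 0:
        return "non-positive"
    if amount > balance:
        return "exceeds-balance"
    return "valid"

def useful_replacements(original, candidates, classify=equivalence_class):
    """Keep only candidates whose class differs from the class of the original."""
    original_class = classify(original)
    return [value for value in candidates if classify(value) != original_class]

kept = useful_replacements(500, [100, 0, -20, 5000, "abc"])
```

Here the value 100 is dropped because it falls into the same class as 500, so the equivalent mutant mentioned in the example above would not be generated. Such filtering reduces the number of equivalent mutants but does not guarantee that none remain.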
Results of analyzing the outcomes of executing negative test cases should provide valuable information pointing out weaknesses of the system and indicating possible ways of improving the tested system to make it more dependable and trustworthy.
7. Discussion of the Approach
The mutation testing-based approach to generation of negative test cases presented in this paper contributes to the area of software testing. Application of the approach can help to improve a system's ability to handle unexpected situations that, if not handled adequately, could cause serious damage. However, there are certain problems that should be addressed in order to make the approach more attractive for practitioners.
The main problem concerning the approach, and mutation testing in general, is the high cost of generating and executing mutants. Several cost reduction techniques have been proposed so far (e.g., [35–42]), and they seem to be quite efficient in the context of mutating systems. However, none of them has been applied in the context of mutating test cases. As the number of mutated test cases can clearly be quite large, even for small systems, adapting such techniques to this context or developing new techniques better suited to it seems to be one of the main issues that need to be addressed.
Another problem that may affect the practical use of the approach is the selection of an adequate set of replacement values used by operators OPR and CPR. Such a set can be generated by taking all values assigned to a given parameter in all positive test cases and adding values that are out of the scope of valid values for the parameter. The invalid values can be determined on the basis of equivalence partitioning. Unfortunately, such a set could be large and thus contribute significantly to the high cost of generating mutants. A similar problem, encountered in the context of mutating systems, was tackled by entirely rejecting operators that replace values of operands in expressions. However, in this context, this approach does not seem to be applicable, as the values passed in operation calls are vital elements of a test case. Future work on this approach should include a study on the possibility of minimizing the size of such a set or on working out a way of using only a subset of these values each time the operators are applied.
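The construction of the replacement set described above can be sketched as follows, assuming again that a test case is a list of (operation name, argument tuple) pairs. The second function, which draws a random subset of bounded size, is only one conceivable way of limiting the number of values used per application of an operator. It is an assumption of this sketch and not a solution proposed or evaluated in this paper. All function names are hypothetical.

```python
import random

def replacement_values(positive_tests, operation, position, invalid_values):
    """Collect values seen for one parameter in positive tests, then add invalid ones."""
    seen = []
    for test_case in positive_tests:
        for name, args in test_case:
            if name == operation and position < len(args) and args[position] not in seen:
                seen.append(args[position])
    return seen + [value for value in invalid_values if value not in seen]

def sample_replacements(values, limit, seed=0):
    """Return at most `limit` values; a fixed seed keeps the choice repeatable."""
    if len(values) <= limit:
        return list(values)
    return random.Random(seed).sample(values, limit)

positive_tests = [
    [("enterAmount", (500,))],
    [("enterAmount", (100,))],
    [("enterAmount", (500,))],
]
values = replacement_values(positive_tests, "enterAmount", 0, [-1, 0])
subset = sample_replacements(values, 2)
```

The invalid values passed in the last argument (here -1 and 0) would in practice come from equivalence partitioning of the parameter's domain.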
One more mutation testing-related issue is the identification of equivalent mutants. A solution to this problem has not yet been found in any context. As stated in Section 6.3, the identification of equivalent mutants has to be done manually, and it may require quite a large amount of time and effort. In a practical implementation of the approach, it could be possible to add some procedures that, for example, compare the accepted mutants with positive test cases to calculate their similarity. Although such a comparison would probably not clearly identify the equivalent mutants, it may point out some candidates and thus facilitate the human work.
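One possible form of such a similarity procedure is sketched below. It compares only the sequences of operation names, using the ratio computed by SequenceMatcher from Python's standard difflib module, and flags accepted mutants whose call sequence matches some positive test case. The choice of this measure, the threshold, and the function names are assumptions of the sketch. The list of positive test cases is assumed to be non-empty.

```python
from difflib import SequenceMatcher

def call_names(test_case):
    """Reduce a test case to its sequence of operation names."""
    return [name for name, _ in test_case]

def closest_positive(mutant, positive_tests):
    """Return (ratio, positive test) for the most similar positive test case."""
    scored = [
        (SequenceMatcher(None, call_names(mutant), call_names(p)).ratio(), p)
        for p in positive_tests
    ]
    return max(scored, key=lambda pair: pair[0])

def equivalence_candidates(accepted_mutants, positive_tests, threshold=1.0):
    """Flag accepted mutants that are at least `threshold` similar to a positive test."""
    return [m for m in accepted_mutants
            if closest_positive(m, positive_tests)[0] >= threshold]

positive_tests = [[("insertCard", ()), ("enterPIN", (1234,)), ("enterAmount", (500,))]]
value_mutant = [("insertCard", ()), ("enterPIN", (1234,)), ("enterAmount", (100,))]
deletion_mutant = [("insertCard", ()), ("enterAmount", (500,))]
candidates = equivalence_candidates([value_mutant, deletion_mutant], positive_tests)
```

With the default threshold, only the mutant that differs in a parameter value is flagged. The flagged mutants are merely candidates, and the final decision on their equivalence still has to be made by a human.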
The possibility of automating an approach plays an important role in its acceptance for practical use. In this case, the generation and execution of mutated test cases can be fully automated, but the analysis of results needs to be done mostly manually. Some preliminary assessment of the tested system can be made based only on the number of rejected and accepted test cases. However, to identify the causes of inadequate handling of unexpected situations, a more detailed analysis of the outcomes of executing mutants is required. Though a detailed analysis needs to be manual and could be rather time consuming, it seems that the benefits of developing a highly dependable system able to handle various undesired situations are worth the effort, especially in the case of safety-critical systems.
8. Conclusions and Future Works
It is expected that a software system will work flawlessly in any situation. However, most testing methods focus only on checking if the system fulfills its specification, leaving the problem of assessing the system's behaviour in unexpected situations unaddressed. The approach presented in this paper targets the problem by providing a mutation testing-based method for generating negative test cases that support an assessment of a system's ability to handle a wide range of unexpected situations. The main advantages offered by this approach include a procedural (thus suitable for automation), systematic, and human-unbiased way of defining the negative test cases and no need for any formal or informal description of unexpected situations. The approach seems promising, but as stated in Section 7, there are still problems that need to be addressed by further work.
Future work concerning the application of mutation testing to test cases should also include development of tools supporting the generation and execution of mutants and experimental evaluation of the approach. Availability of such tools would significantly increase the possibility of adopting the approach in practice.
The author declares that there are no competing interests regarding the publication of this paper.
A. Roman, Testing and Software Quality, PWN, 2015 (Polish).
F. Belli and M. Linschulte, “On ‘Negative’ tests of web applications,” Annals of Mathematics, Computing & Teleinformatics, vol. 1, no. 5, pp. 44–56, 2008.
F. Belli, “Finite state testing and analysis of graphical user interfaces,” in Proceedings of the 12th International Symposium on Software Reliability Engineering (ISSRE '01), pp. 34–43, IEEE CS Press, November 2001.
A. P. Mathur, Mutation Testing, Encyclopedia of Software Engineering, Taylor & Francis, Abingdon, UK, 1994.
S. C. Reid, “Empirical analysis of equivalence partitioning, boundary value analysis and random testing,” in Proceedings of the 4th International Software Metrics Symposium, pp. 64–73, November 1997.
G. J. Myers, C. Sandler, and T. Badgett, The Art of Software Testing, John Wiley & Sons, New York, NY, USA, 2004.
H. Agrawal, R. Demillo, R. Hathaway et al., “Design of mutant operators for the C programming language,” Tech. Rep. SERC-TR-41-P, Software Engineering Research Centre, Hyderabad, India, 1989.
S. C. Pinto Ferraz Fabbri, J. C. Maldonado, T. Sugeta, and P. C. Masiero, “Mutation testing applied to validate specifications based on Statecharts,” in Proceedings of the 10th International Symposium on Software Reliability Engineering (ISSRE '99), pp. 210–219, November 1999.
R. Schlick, W. Herzner, and E. Jöbstl, “Fault-based generation of test cases from UML-models: approach and some experiences,” in Computer Safety, Reliability, and Security: 30th International Conference, SAFECOMP 2011, Naples, Italy, September 19–22, 2011. Proceedings, vol. 6894 of Lecture Notes in Computer Science, pp. 270–283, Springer, Berlin, Germany, 2011.
G. Fraser and F. Wotawa, “Using model-checkers for mutation-based test-case generation, coverage analysis and specification analysis,” in Proceedings of the International Conference on Software Engineering Advances, pp. 16–21, 2006.
K. Bolazar and J. W. Fawcett, “Measuring component specification-implementation concordance with semantic mutation testing,” in Proceedings of the 26th International Conference on Computers and Their Applications (CATA '11), pp. 102–107, New Orleans, La, USA, March 2011.
J. H. Andrews, L. C. Briand, and Y. Labiche, “Is mutation an appropriate tool for testing experiments?” in Proceedings of the 27th International Conference on Software Engineering (ICSE '05), pp. 402–411, May 2005.
R. Just, D. Jalali, L. Inozemtseva, M. D. Ernst, R. Holmes, and G. Fraser, “Are mutants a valid substitute for real faults in software testing?” in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE '14), pp. 654–665, 2014.
L. Zhang, T. Xie, L. Zhang, N. Tillmann, J. de Halleux, and H. Mei, “Test generation via dynamic symbolic execution for mutation testing,” in Proceedings of the IEEE International Conference on Software Maintenance (ICSM '10), pp. 1–10, IEEE, Timisoara, Romania, September 2010.