Review Article

Predictive Analytics and Software Defect Severity: A Systematic Review and Future Directions

Table 1

Research questions guiding the SLR.

Research questions | Objectives

R-Q1: Which is the most widespread data sampling state? | To determine which dataset sampling state has been most widely deployed so far
R-Q2: Which public datasets are often deployed? | To identify the public datasets most frequently used in the literature
R-Q3: Which machine learning approach is popular in the literature? | To identify the type of machine learning variant most used
R-Q4: Does the choice of learner algorithm/ensemble impact the performance of defect severity prediction? | To establish the consensus learner algorithm recommended in the literature
R-Q5: Does training strategy impact prediction performance? | To study the various fold-validation options chosen
R-Q6: Is parameter tuning optimization popularly factored into predictive analytics? | To determine the extent to which results in the literature are enhanced by tuning options
R-Q7: Which training tool is mostly adopted? | To identify the utilitarian value of tools for ML
R-Q8: What feature selection algorithm is mostly deployed? | To identify the most deployed dimensionality reduction technique in the literature
R-Q9: What is the course of action between “within-project” and “cross-project” adoption? | To understand the road map of SDP as implemented in previous studies
R-Q10: What are the prominent threats to the validity of proposed models? | To identify germane threats reported in the literature to inspire future studies
R-Q11: What is the future direction of software defect prediction studies with respect to the threats to validity reported? | To map the threats reported one-to-one to future work directions