|
Research questions | Objectives |
|
R-Q1: which data sampling state is the most widespread? | To determine the dataset sampling state most commonly deployed so far |
R-Q2: which public datasets are often deployed? | To identify the public datasets most frequently used in the literature |
R-Q3: which machine learning approach is popular in the literature? | To identify the machine learning variant most often used |
R-Q4: does the choice of learner algorithm/ensemble impact the performance of defect severity prediction? | To identify the learner algorithm most consistently recommended in the literature |
R-Q5: does training strategy impact prediction performance? | To study the various fold-validation options chosen |
R-Q6: is parameter tuning optimization commonly factored into predictive analytics? | To determine the extent to which results in the literature are enhanced by tuning options |
R-Q7: which training tool is most widely adopted? | To identify the practical value of tools for ML |
R-Q8: which feature selection algorithm is most often deployed? | To identify the most widely deployed dimensionality reduction technique in the literature |
R-Q9: what is the course of action between “within-project” and “cross-project” adoption? | To understand the road map of SDP as implemented in previous studies |
R-Q10: what are the prominent threats to the validity of proposed models? | To identify germane threats from the literature to inspire future studies |
R-Q11: what is the future direction of software defect prediction studies with respect to the reported threats to validity? | To map the reported threats one-to-one to future work directions |
|