Research Article

An Improved Method for Cross-Project Defect Prediction by Simplifying Training Data

Table 7

The best prediction results obtained by the CPDP approach based on TDSelector with Manhattan distance.

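| Manhattan distance | | Ant | Xalan | Camel | Ivy | Jedit | Lucene | Poi | Synapse | Velocity | Xerces | Eclipse | Equinox | Lucene2 | Mylyn | Pde | Mean ± St.d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Linear | α | 0.8 | 0.9 | 0.9 | 1.0 | 0.9 | 0.9 | 1.0 | 1.0 | 0.8 | 1.0 | 0 | 0.8 | 0.9 | 1.0 | 1.0 | 0.187 |
| | AUC | 0.804 | 0.753 | 0.599 | 0.816 | 0.689 | 0.626 | 0.695 | 0.748 | 0.500 | 0.749 | 0.773 | 0.633 | 0.692 | 0.695 | 0.668 | 0.696 ± 0.084 |
| | + (%) | 1.3% | 7.0% | 0.3% | - | 7.3% | 6.3% | - | - | 7.8% | - | 11.6% | 19.0% | 39.7% | - | - | 5.6% |
| Logistic | α | 0.7 | 0.7 | 0.8 | 0.8 | 0.8 | 0.7 | 0.7 | 0.9 | 0.6 | 0.7 | 0 | 0.9 | 0.9 | 1.0 | 1.0 | 0.249 |
| | AUC | 0.799 | 0.760 | 0.607 | 0.830 | 0.674 | 0.621 | 0.735 | 0.794 | 0.520 | 0.756 | 0.773 | 0.680 | 0.559 | 0.695 | 0.668 | 0.705 ± 0.084 |
| | + (%) | 0.6% | 8.0% | 1.7% | 1.7% | 5.0% | 5.4% | 5.8% | 6.1% | 12.1% | 0.9% | 11.6% | 27.9% | 12.7% | - | - | 6.9% |
| Square root | α | 0.9 | 0.9 | 0.9 | 1.0 | 0.8 | 0.8 | 0.9 | 0.8 | 0.9 | 1.0 | 0 | 1.0 | 0 | 1.0 | 1.0 | 0.164 |
| | AUC | 0.795 | 0.755 | 0.604 | 0.816 | 0.693 | 0.627 | 0.704 | 0.750 | 0.510 | 0.749 | 0.773 | 0.532 | 0.523 | 0.695 | 0.668 | 0.680 ± 0.1 |
| | + (%) | 0.1% | 7.2% | 1.2% | - | 7.9% | 6.5% | 1.3% | 0.3% | 9.9% | - | 11.6% | - | 4.6% | - | - | 3.1% |
| Logarithmic | α | 1.0 | 0.9 | 0.9 | 1.0 | 0.9 | 1.0 | 1.0 | 0.8 | 0.9 | 0.9 | 0 | 1.0 | 0 | 1.0 | 1.0 | 0.116 |
| | AUC | 0.794 | 0.755 | 0.603 | 0.816 | 0.664 | 0.589 | 0.695 | 0.763 | 0.524 | 0.756 | 0.773 | 0.532 | 0.523 | 0.695 | 0.668 | 0.677 ± 0.102 |
| | + (%) | - | 7.2% | 1.0% | - | 3.4% | - | - | 2.0% | 12.9% | 0.9% | 11.6% | - | 4.6% | - | - | 2.7% |
| Inverse cotangent | α | 1.0 | 0.9 | 0.9 | 0.9 | 0.9 | 0.8 | 0.9 | 1.0 | 0.7 | 0.8 | 0 | 1.0 | 0 | 1.0 | 1.0 | 0.133 |
| | AUC | 0.794 | 0.749 | 0.608 | 0.821 | 0.667 | 0.609 | 0.710 | 0.748 | 0.500 | 0.758 | 0.773 | 0.532 | 0.523 | 0.695 | 0.668 | 0.677 ± 0.103 |
| | + (%) | - | 6.4% | 1.8% | 0.6% | 3.9% | 3.4% | 2.2% | - | 7.8% | 1.2% | 11.6% | - | 4.6% | - | - | 2.7% |
| NoD (α = 1) | | 0.794 | 0.704 | 0.597 | 0.816 | 0.642 | 0.589 | 0.695 | 0.748 | 0.464 | 0.749 | 0.693 | 0.532 | 0.500 | 0.695 | 0.668 | 0.659 ± 0.105 |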
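
The "+ (%)" rows appear to report the relative AUC gain of each normalization method over the NoD (α = 1) baseline for the same project. The table does not state this formula, so the sketch below is an assumption inferred from the reported numbers. The function name `relative_improvement` is hypothetical, and the AUC values are copied from the Linear and NoD rows of Table 7.

```python
def relative_improvement(auc: float, baseline_auc: float) -> float:
    """Percentage gain of `auc` over `baseline_auc` (assumed formula)."""
    return (auc - baseline_auc) / baseline_auc * 100.0


# AUC values copied from Table 7: Linear row and NoD (alpha = 1) row.
linear_auc = {"Ant": 0.804, "Xalan": 0.753, "Velocity": 0.500}
nod_auc = {"Ant": 0.794, "Xalan": 0.704, "Velocity": 0.464}

for project, auc in linear_auc.items():
    gain = relative_improvement(auc, nod_auc[project])
    # Table 7 lists 1.3%, 7.0% and 7.8% for these three projects.
    print(f"{project}: {gain:.1f}%")
```

Because the published AUC values are rounded to three decimals, gains recomputed this way can differ from the printed cells by a fraction of a percentage point for some projects.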