Research Article

Deep Neural Embedding for Software Vulnerability Discovery: Comparison and Optimization

Table 4

Code-length statistics for the full real-world dataset and for the test set used in the experiments. The functions are divided into five categories by length.

Length of functions | <128 | ≥128 and <256 | ≥256 and <384 | ≥384 and <512 | ≥512
No. of samples (% of total set) | 58,556 (44.4%) | 32,140 (24.3%) | 15,052 (11.4%) | 8,065 (6.1%) | 18,205 (13.8%)
No. of samples (% of test set) | 11,735 (44.4%) | 6,446 (24.4%) | 2,944 (11.1%) | 1,630 (6.2%) | 3,944 (13.9%)
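
The five length categories in Table 4 can be reproduced with a simple binning step. The following is a minimal sketch, not the authors' code: the function names `length_bin` and `bin_statistics` are hypothetical, and it assumes each function's length is already available as an integer (the table does not state the unit, so a count such as tokens per function is an assumption here).

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds of bins 2-5, taken from the Table 4 header.
# The last bin (>= 512) is open-ended.
BOUNDARIES = [128, 256, 384, 512]
LABELS = ["<128", ">=128 and <256", ">=256 and <384", ">=384 and <512", ">=512"]


def length_bin(length):
    """Return the Table 4 category label for one function length."""
    # bisect_right counts how many boundaries are <= length,
    # which is exactly the index of the matching bin.
    return LABELS[bisect_right(BOUNDARIES, length)]


def bin_statistics(lengths):
    """Map each category to (sample count, percentage of all samples)."""
    counts = Counter(length_bin(n) for n in lengths)
    total = len(lengths)
    if total == 0:
        # Avoid division by zero for an empty input.
        return {label: (0, 0.0) for label in LABELS}
    return {
        label: (counts[label], round(100 * counts[label] / total, 1))
        for label in LABELS
    }
```

Running `bin_statistics` separately on the lengths of the full dataset and of the test set would yield the two rows of the table, which makes it easy to check that the test split follows the same length distribution as the whole dataset.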