Identifying Incident Causal Factors to Improve Aviation Transportation Safety: Proposing a Deep Learning Approach
Table 6
Summary of the incident reports and their label distribution in the training set before and after oversampling, and in the validation and test sets.
| Label | Train (original) | Train (oversampled) | Validation | Test |
|---|---|---|---|---|
| Human factors (HF) | 87356 (62.8%) | 87356 (25.4%) | 10941 (64.0%) | 16145 (63.4%) |
| Aircraft (AC) | 32690 (23.5%) | 65380 (19.0%) | 3823 (22.4%) | 6620 (26.0%) |
| Company policy (CP) | 5335 (3.8%) | 53350 (15.5%) | 635 (3.7%) | 1047 (4.1%) |
| Procedure (PR) | 5321 (3.8%) | 53210 (15.4%) | 645 (3.7%) | 1004 (4.0%) |
| Weather (WE) | 4979 (3.6%) | 49790 (14.5%) | 623 (3.7%) | 952 (3.7%) |
| Airport (AP) | 3424 (2.5%) | 34240 (10.0%) | 428 (2.4%) | 643 (2.5%) |
| Total | 139105 (100%) | 343326 (100%) | 17095 (100%) | 25451 (100%) |
The validation and test sets are left as imbalanced as the original training set so that they reflect the true label distribution of the data.
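The oversampled training counts in Table 6 are consistent with repeating each class an integer number of times: once for HF, twice for AC, and ten times for CP, PR, WE, and AP. The sketch below illustrates that arithmetic. The replication factors are an inference from the reported counts, not something the table states, and the helper names `oversample` and `distribution` are hypothetical. The authors may have used a different oversampling procedure that yields the same totals.

```python
# Original training-set label counts, copied from Table 6.
original_counts = {
    "HF": 87356, "AC": 32690, "CP": 5335,
    "PR": 5321, "WE": 4979, "AP": 3424,
}

# ASSUMPTION: integer replication factors inferred from the ratio of
# oversampled to original counts in Table 6 (not stated in the source).
factors = {"HF": 1, "AC": 2, "CP": 10, "PR": 10, "WE": 10, "AP": 10}


def oversample(records, factors):
    """Repeat each (text, label) record factors[label] times."""
    out = []
    for text, label in records:
        out.extend([(text, label)] * factors[label])
    return out


def distribution(counts):
    """Return each label's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Class sizes after replication, computed from the counts alone.
oversampled_counts = {
    label: n * factors[label] for label, n in original_counts.items()
}

print(sum(original_counts.values()))     # 139105
print(sum(oversampled_counts.values()))  # 343326
print(distribution(oversampled_counts))
```

On a toy input, `oversample([("report a", "AC"), ("report b", "HF")], factors)` returns three records: the AC report twice and the HF report once. Only the training split would be passed through such a step; the validation and test splits keep their original distribution.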