Research Article
Sound Classification Based on Multihead Attention and Support Vector Machine
Table 5
Classification accuracy on GTZAN for different numbers of attention heads and layers, using Feature 1 and Feature 2 individually.
| Feature | Heads (#) | Layers (#) | MhaNN acc. (%) | MhaNN-SVM acc. (%) | MhaNN-LR acc. (%) | MhaNN-KNN acc. (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Feature 1 | 2 | 1 | 81.8 | 82.9 | 81.6 | 82.2 |
| Feature 1 | 2 | 2 | 82.9 | 84.0 | 81.3 | 83.4 |
| Feature 1 | 2 | 3 | 81.2 | 81.7 | 79.2 | 82.0 |
| Feature 1 | 4 | 1 | 82.3 | 83.1 | 82.0 | 83.3 |
| Feature 1 | 4 | 2 | 85.4 | 88.4 | 84.7 | 86.7 |
| Feature 1 | 4 | 3 | 84.2 | 86.1 | 85.5 | 84.8 |
| Feature 1 | 8 | 1 | 82.7 | 84.8 | 82.7 | 83.1 |
| Feature 1 | 8 | 2 | 83.6 | 85.1 | 83.3 | 84.1 |
| Feature 1 | 8 | 3 | 81.2 | 83.2 | 78.7 | 80.1 |
| Feature 2 | 2 | 1 | 70.1 | 72.2 | 70.3 | 72.0 |
| Feature 2 | 2 | 2 | 76.5 | 78.7 | 77.0 | 74.8 |
| Feature 2 | 2 | 3 | 72.5 | 74.6 | 70.8 | 72.2 |
| Feature 2 | 4 | 1 | 71.0 | 73.3 | 72.7 | 70.9 |
| Feature 2 | 4 | 2 | 75.1 | 78.0 | 76.2 | 75.8 |
| Feature 2 | 4 | 3 | 73.7 | 75.3 | 73.6 | 72.1 |