Research Article
Identification of Weakly Pitch-Shifted Voice Based on Convolutional Neural Network
Table 1
Architecture and parameters of the proposed network.
| No. | Layer | Kernel size / number of neurons | Strides | Input channels | Parameters |
| --- | --- | --- | --- | --- | --- |
| 1 | Convolutional 1 | (5,5) | (1,1) | 1 | 1664 |
| 2 | Convolutional 2 | (5,5) | (1,1) | 64 | 102464 |
| 3 | Pooling 1 | (2,2) | (2,2) | 64 | — |
| 4 | Convolutional 3 | (5,5) | (1,1) | 64 | 102464 |
| 5 | Convolutional 4 | (5,5) | (1,1) | 64 | 102464 |
| 6 | Pooling 2 | (2,2) | (2,2) | 64 | — |
| 7 | Convolutional 5 | (5,5) | (1,1) | 64 | 102464 |
| 8 | Convolutional 6 | (5,5) | (1,1) | 64 | 102464 |
| 9 | Pooling 3 | (2,2) | (2,2) | 64 | — |
| 10 | Flatten | 2496 | — | — | — |
| 11 | Fully connected | 4096 | — | — | 1.02 × 10⁷ |
| 12 | Softmax | N¹ | — | — | 4096 × N¹ |

¹ N depends on the number of classes.
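The parameter counts in Table 1 can be checked with a few lines of arithmetic. The sketch below is not the authors' code. It assumes that every convolutional layer has 64 output channels (inferred from the input-channel column of the following layers) and one bias per output channel, and that the fully connected layer has one bias per neuron. The helper names and the `n_classes` argument are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights of a 2D convolution plus one bias per output channel."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def dense_params(in_features, out_features):
    """Weights of a fully connected layer plus one bias per neuron."""
    return in_features * out_features + out_features


def softmax_weights(n_classes, in_features=4096):
    """Weight count of the output layer as Table 1 lists it (no bias term).

    n_classes is a hypothetical argument; Table 1 leaves it open.
    """
    return in_features * n_classes


# Assumption: 64 output channels per convolutional layer.
OUT_CHANNELS = 64

conv1 = conv2d_params(5, 5, 1, OUT_CHANNELS)        # Convolutional 1
conv_other = conv2d_params(5, 5, 64, OUT_CHANNELS)  # Convolutional 2 to 6
fully_connected = dense_params(2496, 4096)          # Flatten (2496) -> 4096
conv_total = conv1 + 5 * conv_other                 # all six conv layers

print(conv1)            # 1664
print(conv_other)       # 102464
print(fully_connected)  # 10227712, i.e. about 1.02e7
print(conv_total)       # 513984
```

Under these assumptions the computed values match the table: 1664 for the first convolutional layer, 102464 for each of the remaining five, and roughly 1.02 × 10⁷ for the fully connected layer, which therefore holds most of the network's parameters.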