| Type | Configuration |
| --- | --- |
| Input | |
| Conv | #kernels: 64, : , : 1, : 1 |
| Max pooling | Windows: , : 2 |
| Conv | #kernels: 128, : , : 1, : 1 |
| Max pooling | Windows: , : 2 |
| Conv | #kernels: 256, : , : 1, : 1 |
| Conv | #kernels: 256, : 33, : 1, : 1 |
| Max pooling | Windows: , : 2 |
| Conv | #kernels: 512, : , : 1, : 1 |
| Batch normalization | |
| Conv | #kernels: 512, : , : 1, : 1 |
| Batch normalization | |
| Max pooling | Windows: , : 2 |
| Conv | #kernels: 512, , : 1, : 0 |
| Map to sequence | |
| Bidirectional LSTM | #hidden unit: 256 |
| Bidirectional LSTM | #hidden unit: 256 |
| Bidirectional LSTM | #hidden unit: 256 |
| Transcription | |