Research Article
Indonesian Lip-Reading Detection and Recognition Based on Lip Shape Using Face Mesh and Long-Term Recurrent Convolutional Network
Table 1
Multilingual datasets compared with IndoLR.
| Dataset | Language | Year | Isolated | Form segment | Speakers | Classes | Total data | Resolution | Pose |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MIRACL-VC1 [7] | English | 2014 | v | Words | 15 | 10 | 1500 | 640 × 480 | Frontal |
| MIRACL-VC1 [7] | English | 2014 | v | Sentences | 15 | 10 | 1500 | 640 × 480 | Frontal |
| OuluVS2 [12] | English | 2015 | v | Sentences | 20 | 10 | 1000 | 720 × 576 | Frontal |
| LRW [10] | English | 2017 | x | Words | >1000 | 500 | 400000 | 256 × 256 | −30∼30 |
| LRS2 [33] | English | 2017 | v | Sentences | >1000 | 17428 | 118116 | 160 × 160 | −30∼30 |
| LRS3-TED [11] | English | 2018 | v | Sentences | >1000 | 70000 | 165000 | 224 × 224 | −90∼90 |
| GLips [13] | German | 2022 | x | Words | 100 | 500 | 250000 | 256 × 256 | Frontal |
| Turkish [29] | Turkish | 2022 | v | Words | Unspecified | 111 | 39960 | 60 × 35 (30–60 FPS) | Frontal (10 rot) |
| Turkish [29] | Turkish | 2022 | v | Sentences | Unspecified | 113 | 27120 | 60 × 35 (30–60 FPS) | Frontal (10 rot) |
| CMLR [34] | Mandarin | 2020 | v | Sentences | 11 | 9 | 102076 | 64 × 128 | Frontal |
| CN-CVS/Speech [35] | Mandarin | 2023 | x | Sentences | 2529 | ∼75 | 193329 | 640 × 480 | Natural |
| OLKAVS [14] | Korean | 2023 | v | Sentences | 1107 | >100 | 250000 | 1920 × 1080 | 0,45,90 |
| Indo [30] | Indonesia | 2020 | v | Sentences | 10 | 5 | 50 | Unspecified | Frontal |
| IndoLR | Indonesia | 2023 | v | Words | 8 | 10 | 2400 | 640 × 480 (30 FPS) | Frontal |
| IndoLR | Indonesia | 2023 | v | Sentences | 8 | 4 | 1600 | 640 × 480 (30 FPS) | Frontal |
In the Isolated column: v, isolated; x, continuous.