Applied Computational Intelligence and Soft Computing

Research Article

Indonesian Lip-Reading Detection and Recognition Based on Lip Shape Using Face Mesh and Long-Term Recurrent Convolutional Network

Table 1

Multilingual lip-reading datasets compared with IndoLR.


Dataset	Language	Year	Isolated	Form segment	Speakers	Classes	Total data	Resolution	Pose
MIRACL-VC1 [7]	English	2014	v	Words	15	10	1500	640 × 480	Frontal
MIRACL-VC1 [7]	English	2014	v	Sentences	15	10	1500	640 × 480	Frontal
OuluVS2 [12]	English	2015	v	Sentences	20	10	1000	720 × 576	Frontal
LRW [10]	English	2017	x	Words	>1000	500	400000	256 × 256	−30∼30
LRS2 [33]	English	2017	v	Sentences	>1000	17428	118116	160 × 160	−30∼30
LRS3-TED [11]	English	2018	v	Sentences	>1000	70000	165000	224 × 224	−90∼90
GLips [13]	German	2022	x	Words	100	500	250000	256 × 256	Frontal
Turkish [29]	Turkish	2022	v	Words	Unspecified	111	39960	60 × 35 (30–60 FPS)	Frontal (10 rot)
Turkish [29]	Turkish	2022	v	Sentences	Unspecified	113	27120	60 × 35 (30–60 FPS)	Frontal (10 rot)
CMLR [34]	Mandarin	2020	v	Sentences	11	9	102076	64 × 128	Frontal
CN-CVS/Speech [35]	Mandarin	2023	x	Sentences	2529	∼75	193329	640 × 480	Natural
OLKAVS [14]	Korean	2023	v	Sentences	1107	>100	250000	1920 × 1080	0, 45, 90
Indo [30]	Indonesian	2020	v	Sentences	10	5	50	Unspecified	Frontal
IndoLR	Indonesian	2023	v	Words	8	10	2400	640 × 480 (30 FPS)	Frontal
IndoLR	Indonesian	2023	v	Sentences	8	4	1600	640 × 480 (30 FPS)	Frontal

Isolated: v, isolated; x, continuous.