Review Article

Exploring Sign Language Detection on Smartphones: A Systematic Review of Machine and Deep Learning Approaches

Table 7

(a) Datasets used in sign language recognition. (b) Links to publicly available datasets.

Study | Year | Dataset | Remarks

(a)

[141] | 2021 | PSL dataset | 37 alphabet letters

[165] | 2021 | ISLAN (Indian Sign Language) | Collection of 700 sign images and 24 sign videos

[139] | 2021 | SIBI dataset | 8 static word signs; 19200 images in total

[140] | 2021 | Custom-made | Numbers from 1 to 5

[142] | 2021 | RKS-PERSIANSIGN, first-person, ASVID, isoGD | (i) RKS-PERSIANSIGN: this dataset comprises 10,000 RGB videos of 100 Persian sign words, contributed by 10 individuals (5 women and 5 men), with 100 video samples for each Persian sign word
(ii) First-person: this dataset consists of 100,000 RGB-D frames depicting 45 hand action categories performed with 26 distinct objects, capturing various hand configurations. Only the RGB sequences from the ASVID dataset are used in this context
(iii) isoGD: this dataset contains 47,933 RGB and depth video samples across 249 class labels; only the RGB samples are used. It is divided into three subsets, with 35,878 samples for training, 5,784 for validation, and 6,271 for testing

[137] | 2020 | HamNoSys database | 3000 words

[138] | 2020 | Chinese Sign Language | The dataset consists of 51 common word signs, from which 60 sentences were created. There are 20400 sentence instances from 34 volunteers

[127] | 2020 | Korean Sign Language | 17 words used for training

[128] | 2020 | Chinese Sign Language | Data augmentation is used to obtain a benchmark dataset based on Chinese Sign Language (CSL). One dataset is obtained from Kaggle and the other is built from frames of 30-second videos

[120] | 2020 | American Sign Language (ASL) and Bengali Sign Language (BdSL) | A dataset was generated that contains 1000 data points for each letter of ASL and BdSL

[132] | 2020 | MS-ASL dataset | This dataset has 25000 clips from 222 signers and covers the 1000 most frequently used ASL gestures

[133] | 2020 | Bangla Sign Language | This dataset has 30 consonants and 6 vowels of BSL characters, with 50 samples for each sign, giving 36 × 50 = 1800 images in total

[129] | 2020 | German Sign Language | The dataset has 301 videos with an average duration of 9 minutes

[125] | 2020 | American Sign Language | A dataset of 80 video clips that focus on finger movement: 32 were extracted from publicly available videos and the remaining 48 were recorded manually. There are 20 instances for each of the four letters D, I, J, and Z

[131] | 2020 | Croatian Sign Language | The generated dataset consists of 25 languages and their signs. 40 volunteers performed each gesture twice, which resulted in 2000 sign videos

[122] | 2020 | Hong Kong Sign Language (HKSL) | The dataset was created by the authors. It consists of the 45 most common words. For each word, 30 videos from different signers were recorded. The total number of videos is 1500

[134] | 2020 | Indian Sign Language | Custom-made. The dataset includes 100 static signs, that is, 23 English letters, digits 0–10, and 67 commonly used words. There are 300 images of each instance, totaling 35000 images

[135] | 2020 | Custom-made | The dataset has four unique word signs. Each sign has 50 images with different positions and light levels. The total number of images is 1000

[123] | 2020 | German Sign Language | Public dataset: RWTH-PHOENIX-weather 2014

[136] | 2020 | RKS-PERSIANSIGN, first-person dataset, NYU hand pose dataset | (i) RKS-PERSIANSIGN:
(1) Contains: 10,000 RGB videos
(2) Content: 100 Persian words
(3) Contributors: 10 individuals
(4) Purpose: likely used for Persian sign language recognition. This dataset provides video samples for training and evaluating models for recognizing Persian sign language gestures
(ii) First-person dataset:
(1) Contains: 100,000 RGB-D frames
(2) Content: 45 hand action categories for 26 different objects
(3) Purpose: this dataset seems focused on recognizing hand actions related to interactions with various objects. The RGB-D frames can be used for training and evaluating models capable of understanding hand-object interactions
(iii) NYU hand pose dataset:
(1) Contains: 81,009 image sequences
(2) Content: 36 joints
(3) Purpose: likely used for hand pose estimation. This dataset provides a large number of image sequences capturing various hand poses, which can be used to train and test models for hand pose estimation tasks

[126] | 2020 | Flemish Sign Language | Public dataset. There are 18730 samples in total from 67 native signers, covering 100 classes

[102] | 2019 | The dataset contains three gestures | The three gestures are feeling uncomfortable, seeing a doctor, and taking medicine

[110] | 2019 | ASL alphabet dataset; sign language and static gesture recognition dataset | (i) The ASL alphabet dataset contains 87,000 images. The sign language and static gesture recognition dataset contains 1,687 images
(ii) The authors created their own dataset from these two datasets; it contains 73,488 images

[105] | 2019 | American Sign Language | 10 samples of each letter were taken to measure accuracy

[103] | 2019 | Arabic Sign Language | 10 letters: Alif, Ba, Ta, Kha, Dal, Dhad, Thah, Ghayn, Lam, and La. 2000 images used for training

[104] | 2019 | British Sign Language | 26 letters, A to Z. Training performed on 520 samples (26 classes with 20 samples per class)

[109] | 2019 | Indonesian language inflectional words | Custom-made
(i) Word count: the dataset consists of a total of 1,440 inflectional words
(1) Training data: 954 inflectional words
(2) Testing data: 486 inflectional words
(ii) Data sources: the data were recorded by three teachers from Santi Rama school for the hearing impaired in Jakarta

[91] | 2019 | ASL dataset | Two datasets: one is word-level (70 ASL words) and the other is sentence-level (100 sentences)

[101] | 2019 | Arabic Sign Language | Only 5 letters were used for the experiment

[90] | 2019 | Custom-made | 5 volunteers performed 26 alphabet signs with 30 repetitions each, that is, 26 × 30 × 5 = 3,900 alphabet signs in the dataset

[87] | 2019 | Swedish Sign Language signs dataset | Swedish keyword signing, targeted at children with communicative disorders

[115] | 2019 | Custom-made | 40 signs, five times each, totaling 200 for testing

[116] | 2019 | Custom-made PSL | (i) Dataset generation: the dataset was generated by capturing videos of sign language gestures. Afterward, frames were extracted from these videos using the MATLAB Image Processing Toolbox
(ii) Signs: the dataset includes various sign language gestures, with each sign represented by a substantial number of pictures
(iii) Number of signs: not specified, but there are multiple signs
(iv) Pictures per sign: each sign is represented by approximately 1,500 to 2,000 pictures
(v) Total pictures: the dataset contains a total of around 21,000 pictures

[99] | 2019 | German Sign Language weather forecast program | RWTH-PHOENIX-weather-2014
(i) Training set:
(1) Number of videos: 5,672
(2) Use: typically used to train machine learning or deep learning models
(ii) Validation set:
(1) Number of videos: 540
(2) Use: used during the model development process to fine-tune hyperparameters and assess model performance
(iii) Test set:
(1) Number of videos: 629
(2) Use: reserved for evaluating the final model’s performance and assessing its generalization to unseen data

[97] | 2019 | Ghanaian Sign Language | Custom-made. The dataset consists of 66000 RGB images with 33 classes of static gestures, comprising 24 letters and 9 digits

[94] | 2019 | Korean Sign Language | Custom-made. Ten words were selected. A different number of videos was collected from the Internet for each word. The total number of videos is 421

[93] | 2019 | CSL | The authors selected 100 kinds of sign language words. The training set has 2964 videos, the validation set has 1044, and the testing set has 1005

[119] | 2019 | ASL | Custom-made. 26 letters

[86] | 2019 | ASL, Russian Sign Language (RSL) | ASL dataset: from Massey University researchers. This dataset consists of 2425 images from 5 individuals
RSL: custom-made. The data for RSL were collected from five YouTube videos. The total number of gestures in RSL is 33. Only the 26 static gestures are used; the remaining dynamic gestures are not included in this work

[95] | 2019 | Custom-made | The dataset has 24 static letter signs; J and Z are excluded because they are dynamic. The signs were captured from seven native and nonnative signers under similar lighting

[106] | 2019 | ASL | There are 6000 words in the ASL dictionary

[117] | 2019 | ASL | Public dataset. The dataset, collected from Kaggle, contains pictures of static ASL hand gestures in 24 classes. It consists of 47475 pictures, of which 33000 (70%) were used for training and 1445 (30%) for testing

[114] | 2019 | LSA64 dataset, Argentinian Sign Language | Public dataset. The authors selected 30 gestures and 50 video streams for each gesture. After video processing, 90,000 images were created representing the sequence of dynamic gestures. The number of images for each category is 3000

[6, 118] | 2019 | ASL | A collection of American Sign Language (ASL) gestures representing 24 English letters (excluding “Y” and “Z”). The gestures were captured with Kinect technology, with contributions from 5 different individuals

[100] | 2019 | ASL | Public dataset. ASLLVD, the American Sign Language lexicon video dataset, features nearly 10,000 ASL signs by 6 native signers. The study focuses on 50 hand-picked ASL signs, each signed by 6 different individuals, totaling 300 videos. The videos include various angles, but the analysis concentrated on front-view recordings

[96] | 2019 | ASL | Custom-made. The authors collected video data for 25 ASL signs from 100 users, with each sign executed three times. The total number of instances was 7500

[107] | 2019 | ISL | Custom-made. The authors selected 26 common signs. Each sign sample comprises 50 consecutive readings, representing 50 time points of gesture motion. A single sample is structured as a 50 × 11 matrix, forming 2D data stored in a CSV file

[108] | 2019 | SIBI | Custom-made. The dataset has 2275 videos covering 28 common sentences

[98] | 2019 | ASL | Custom-made. The 26 letters of the ASL alphabet are included. Each of 3 signers performed every letter 10 times, giving 30 samples per letter and 780 instances in total

[113] | 2019 | ISL | The dataset consists of 2500 images for letters and dynamic words. The authors augmented this dataset and produced 5157 images

[88] | 2019 | CSL, ASL | The authors created a database of four tables to store symbols with important descriptions. They used HamNoSys, which consists of 200 symbols covering hand shapes, hand orientation, location, and movements

[112] | 2019 | ASL | Custom-made. The study concentrates on static ASL gestures from A to Y, omitting J and Z due to their dynamic nature. The dataset comprises 24 gestures captured with a smartphone camera. Each gesture is represented by 200 images taken by two users, for a total of 4800 images

[92] | 2019 | Thai Sign Language (TSL) | Custom-made. The authors used Microsoft Kinect to record the video stream dataset. It consists of 64 isolated vocabulary words. Each word was performed by 8 nonnative TSL signers, and each signer performed each word 5 times. Thus, there are 64 × 8 × 5 = 2560 video samples in total

[89] | 2019 | Brazilian Sign Language | Custom-made. The authors recorded videos for the 26 letters of the alphabet in Libras with 13 users. The total number of videos was 338

[74] | 2018 | Indonesian Sign Language | Letters A to Z and numbers 1 to 10 were used in this experiment

[70] | 2018 | Indonesian Sign Language | Letters A to Z

[84] | 2018 | Sign Language MNIST (open dataset on Kaggle) | A set of 28 × 28 images representing the standard American Sign Language (ASL) alphabet, excluding J and Z

[82] | 2018 | French Sign Language | 22 of the 26 gestures of French Sign Language were used. 4 gestures, that is, J, P, Y, and Z, were left out because of their nonstatic nature. Each gesture was performed by 57 participants. The total dataset contains 1.25 million samples

[75] | 2018 | Indian Sign Language (ISL) | Digits 0 to 9 and letters a to z were used for the experiment

[79] | 2018 | Indian Sign Language (ISL) | Digits 0 to 9 and letters a to z were used for the experiment

[83] | 2018 | Custom-built, Indian Sign Language | 18 signs, each recorded by 10 different signers

[71] | 2018 | Indian Sign Language, American Sign Language, British Sign Language, Turkish Sign Language | (i) ISL dataset (SVM used): contains 4 signs, that is, A, B, C, and the word “Hello”
(ii) ASL dataset (KNN used): contains 10 ASL fingerspelling letters, a to i and k; the letter j is not included. The total number of samples was 5254
(iii) ISL dataset (CNN used): 5000 samples in total for 200 signs performed by five Indian Sign Language users
(iv) The authors used ANN for the following three datasets
(v) ASL: letters A to Z
(vi) British Sign Language: letters A to Z
(vii) Turkish Sign Language: letters A to Z, excluding Q, W, and X

[72] | 2018 | Argentinian Sign Language | LSA64 dataset: 10 subjects, 5 repetitions, 64 sign types, 3200 videos
RWTH-PHOENIX-weather database: 50 classes, 1297 training videos, 238 testing videos

[73] | 2018 | Public dataset | There are 900 pictures, with 25 samples for each of 36 characters (26 letters and 10 digits)

[77] | 2018 | ISL | Custom-made. 200 sign language words. Each sign is performed by 5 different signers

[80] | 2018 | ISL | Custom-made. A dataset of 5000 images was created, with 100 images for each of the 50 most commonly used words

[85] | 2018 | ISL | Custom-made. The dataset consists of 200 words to form sentences

[81] | 2018 | ASL | Massey University gesture dataset 2012: 36 classes with 2524 images
ASL fingerspelling A dataset: 24 classes with 131,000 images
NYU: 36 classes with 81,009 images
ASL fingerspelling dataset of the Surrey University: 24 classes with 130,000 images

[78] | 2018 | ASL | ASL alphabet dataset: public dataset. There are 24 static gestures from letters A–Y. J is excluded as it is dynamic. There are 100 images for each class

[69] | 2018 | Korean Sign Language | Custom-made. The dataset consists of 10,480 videos collected from ten Korean professional signers

[76] | 2018 | SIBI (Sistem Isyarat Bahasa Indonesia) | Custom-made. The dataset consists of SIBI words performed by 2 teachers fluent in this language. It consists of 21 root words and 155 inflectional words. Each word is recorded 5 times by each teacher, resulting in a total of 1760 signs

[60] | 2017 | Custom-made | Static gestures for the English letters A to Z and digits 0 to 9

[62] | 2017 | Custom-made | Static gestures for the English letters A to Z and digits 0 to 9

[2] | 2017 | Custom-made | Static gestures for the English letters A to Z and digits 0 to 9

[67] | 2017 | Indonesian Sign Language | Dataset: 1000 samples, 50 Indonesian sign words, 20 samples per sign, 500 for training, 500 for testing

[61] | 2017 | ISL | Custom-made. 26 letters from A to Z and 12 basic words

[66] | 2017 | ISL | Custom-made. 18 different words were included in the dataset

[56] | 2017 | Ubicomp.eti.uni | 18 different words were included in the dataset

[57] | 2017 | ASL | 103 signs

[68] | 2017 | ASL | The dataset has a total of 720 images (30 for every ASL sign). The dataset consists of letters from A to Y. The letters J and Z are excluded

[64] | 2017 | Sinhala Sign Language (SSL) | Custom-made. The dataset consists of 61 SSL fingerspelling signs (words) and 40 SSL number signs

[58] | 2017 | Greek Sign Language | Custom-made. 5 participants (2 male, 3 female) learned and performed 15 signs, four times each, totaling 300 evaluation samples

[63] | 2017 | Korean Sign Language | Custom-made. 30 different gestures are included in this dataset. 72% of the data are used for training and 28% for testing

[59] | 2017 | Thai Sign Language (TSL) | The dataset consists of 10 words. Each word has 10 samples

[55] | 2017 | NGT (Nederlandse Gebarentaal), sign language of the Netherlands | Public dataset. The dataset consists of 40 glosses (words) taken from the NGT dataset

[65] | 2017 | ASL | Custom-made. The dataset consists of 25 images from 5 people for each letter and for the digits 0–9

[50] | 2016 | ASL | 16 letters used for training and testing

[49] | 2016 | Indonesian Sign Language | 24 gestures from A to Y excluding J and Z

[45] | 2016 | ASL | Custom-made. The dataset consists of 20 ASL signs

[51] | 2016 | Arabic Sign Language (ArSL) | Public dataset. This dataset consists of 588 signs, which include 10 numbers from 0 to 9, 28 letters, and different categories such as family, job, colors, and sports

[3] | 2016 | Pakistan Sign Language | 6 letters from A to F, with 20 samples collected for each letter

[52] | 2016 | Continuous sign language | 18 signs, each recorded by 10 different signers

[48] | 2016 | Danish Sign Language, New Zealand Sign Language, RWTH-PHOENIX-weather 2014 | (i) Danish Sign Language: this dataset consists of 2,149 signs
(ii) New Zealand Sign Language: this dataset consists of 4,155 signs
(iii) RWTH-PHOENIX-weather 2014: this dataset consists of 65,227 signs

[47] | 2016 | German Sign Language | RWTH-PHOENIX-weather 2012
RWTH-PHOENIX-weather multisigner 2014: this dataset consists of 65,227 signs
SIGNUM single signer: this dataset consists of 450 basic signs. Isolated signs are 450 and continuous sentences are 780. The total number of images is 5,970,450

[44] | 2016 | American Sign Language image dataset (ASLID), American Sign Language lexicon video dataset (ASLLVD) | Public datasets. Training set: 808 ASLID images from six signers. Test set: 479 ASLID images from two signers

[46] | 2016 | Greek Sign Language | Custom-made. 24 Greek Sign Language letters, 10 samples each, 6 subjects, totaling 1440 samples

[54] | 2016 | Korean Sign Language | Custom-made. Experiment: 5 subjects (3 males and 2 females); the numbers 1–9 were each repeated 5 times

[41] | 2015 | South African Sign Language (SASL) | Only three letters (A, B, and C) and three digits (1, 2, and 3) were used

[42] | 2015 | Malaysian Sign Language | Only three letters (A, B, and C) and three digits (1, 2, and 3) were used

[39] | 2015 | Taiwan Sign Language | 51 fundamental postures in Taiwan Sign Language

[35] | 2015 | ASL | Custom-built (real-time hand gesture recognition system)

[37] | 2015 | Indonesian Sign Language | Letters A to Z

[40] | 2015 | ASL | Only the letters A to Z are included for testing

[43] | 2015 | Greek Sign Language | Greek Sign Language letters

[36] | 2015 | German Sign Language (DGS) | Public dataset. RWTH-PHOENIX-weather corpus: 2137 sentence segments, 14717 gloss annotations, 189,363 frames

[28] | 2014 | Custom-built | Hand gesture image database. The test dataset was prepared by four persons, each of whom showed 19 signs with three rotation variations

[33] | 2014 | PSL | 300 samples taken from 30 individuals with 10 signs each

[34] | 2014 | PSL | Custom-made. This dataset consists of 500 images of 37 letters; 426 images were used for training and 74 for testing

[31] | 2014 | Datasets DS1, DS2, and DS3 | Dataset DS1: 42 one-handed videos (902 frames) and 48 two-handed videos (1337 frames)
Dataset DS2: 42 one-handed videos (1276 frames) and 48 two-handed videos (1945 frames)
Dataset DS3: 42 one-handed videos (1197 frames) and 48 two-handed videos (1735 frames)

[29] | 2014 | PSL | Custom-made. The dataset consists of 37 letters, with 6 samples recorded for each letter

[30] | 2014 | ASL | Custom-made dataset: 24 static letters signed by 5 individuals, 60,000 images

[32] | 2014 | ArSL | Sign-to-letter translation using a hand glove, microcontroller, and display unit

[27] | 2013 | Thai Sign Language (TSL) | Custom-made. The dataset consists of 42 TSL letters. Several videos were taken for each letter

[26] | 2012 | Custom-built | A word is input to the smartphone and converted to a video animation

[25] | 2012 | Brazilian Sign Language (Libras) | Custom-made. The dataset consists of two sets: a vowel set (A, E, I, O, and U) and a consonant set (B, C, F, L, and V)
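
Several Remarks entries in part (a) state a dataset total as a product of counts (for example, signs × signers × repetitions) or as a train/validation/test split. The following Python sketch, using only figures transcribed from the table, shows one way such entries could be cross-checked. The names `PRODUCT_ROWS`, `SPLIT_ROWS`, `check_product`, and `split_summary` are illustrative and do not come from the reviewed studies. A mismatch may simply mean that the Remarks text does not list every factor behind the reported total.

```python
from math import prod

# Counts transcribed from the Remarks column of Table 7(a).
# Each entry: (label, factors named in the remark, total stated in the remark).
PRODUCT_ROWS = [
    ("[133] Bangla Sign Language", (36, 50), 1800),
    ("[90] custom-made alphabet signs", (26, 30, 5), 3900),
    ("[92] Thai Sign Language", (64, 8, 5), 2560),
    ("[98] ASL letters", (26, 3, 10), 780),
    ("[96] ASL signs", (25, 100, 3), 7500),
    ("[46] Greek Sign Language", (24, 10, 6), 1440),
    ("[122] Hong Kong Sign Language", (45, 30), 1500),
]

# Train/validation/test sizes as listed in the table.
SPLIT_ROWS = {
    "[142] isoGD": {"train": 35878, "validation": 5784, "test": 6271},
    "[99] RWTH-PHOENIX-weather-2014": {"train": 5672, "validation": 540, "test": 629},
    "[93] CSL": {"train": 2964, "validation": 1044, "test": 1005},
}


def check_product(factors, reported):
    """Return (product of the factors, whether it equals the reported total)."""
    expected = prod(factors)
    return expected, expected == reported


def split_summary(split):
    """Return (total samples, share of each subset rounded to 3 decimals)."""
    total = sum(split.values())
    shares = {name: round(count / total, 3) for name, count in split.items()}
    return total, shares


if __name__ == "__main__":
    for label, factors, reported in PRODUCT_ROWS:
        expected, ok = check_product(factors, reported)
        status = "matches" if ok else "differs from"
        print(f"{label}: product {expected} {status} reported {reported}")
    for label, split in SPLIT_ROWS.items():
        total, shares = split_summary(split)
        print(f"{label}: total {total}, shares {shares}")
```

For isoGD, the three subsets sum to the 47,933 samples stated in the table. For the Hong Kong Sign Language row, the two listed factors give 1350 rather than the reported 1500.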

Name | Link (access date 25-August-2023)

(b)

PSL | https://data.mendeley.com/datasets/y9svrbh27n/1

First-person | https://guiggh.github.io/publications/first-person-hands/

Purdue RVL-SLLL | https://engineering.purdue.edu/RVL/Database/ASL/asl-database-front.htm

Corpus NGT | https://www.ru.nl/en/cls/research

isoGD | http://www.cbsr.ia.ac.cn/users/jwan/database/isogd.html

SIGNUM | https://www.phonetik.uni-muenchen.de/forschung/Bas/SIGNUM/

WLASL | https://dxli94.github.io/WLASL/

ASLID | http://vlm1.uta.edu/~srujana/ASLID/ASL_Image_Dataset.html

German Sign Language | https://www-i6.informatik.rwth-aachen.de/~koller/RWTH-PHOENIX/

Danish Sign Language | https://www.tegnsprog.dk/

ArSL | https://menasy.com/

How2Sign | https://how2sign.github.io/

GSL dataset | https://vcl.iti.gr/dataset/gsl/

AUTSL | https://chalearnlap.cvc.uab.cat/dataset/40/description/

LSA64 | https://facundoq.github.io/datasets/lsa64/

Ubicomp | https://ubicomp.eti.uni-siegen.de/home/datasets/

ASL finger spelling | https://www.kaggle.com/datasets/mrgeislinger/asl-rgb-depth-fingerspelling-spelling-it-out

Sign language MNIST | https://www.kaggle.com/datasets/datamunge/sign-language-mnist

Indian Sign Language | https://data.mendeley.com/datasets/rc349j45m5/1 (doi: 10.17632/rc349j45m5.1)
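
Many static-alphabet datasets in part (a), including Sign Language MNIST, cover 24 letters because J and Z involve motion. The Python sketch below illustrates how one record of such a dataset might be decoded. It assumes, as a convention not stated in the table, that labels are alphabet positions (0 for A through 25 for Z, with 9 and 25 unused) and that each 28 × 28 image is stored as a flat list of 784 grey values. The names `STATIC_LETTERS`, `label_to_letter`, and `row_to_image` are hypothetical.

```python
import string

SIDE = 28  # image side length given in Table 7(a) for Sign Language MNIST

# The 24 static letters: A to Z without the dynamic letters J and Z.
STATIC_LETTERS = [c for c in string.ascii_uppercase if c not in "JZ"]


def label_to_letter(label):
    """Map an alphabet-position label (assumed 0 = A ... 25 = Z) to a letter.

    Raises ValueError for labels outside 0-25 and for the positions of
    J (9) and Z (25), which are not static signs.
    """
    if not 0 <= label < len(string.ascii_uppercase):
        raise ValueError(f"label {label} is outside 0-25")
    letter = string.ascii_uppercase[label]
    if letter not in STATIC_LETTERS:
        raise ValueError(f"label {label} ({letter}) is a dynamic sign")
    return letter


def row_to_image(pixels):
    """Reshape a flat sequence of 784 grey values into 28 rows of 28."""
    if len(pixels) != SIDE * SIDE:
        raise ValueError(f"expected {SIDE * SIDE} pixels, got {len(pixels)}")
    return [list(pixels[r * SIDE:(r + 1) * SIDE]) for r in range(SIDE)]


if __name__ == "__main__":
    image = row_to_image(list(range(SIDE * SIDE)))
    print(len(STATIC_LETTERS), label_to_letter(24), len(image), len(image[0]))
```

Before using a particular download, check its label convention against the dataset card, since other sources may number the 24 classes consecutively instead.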