Abstract

Revolutions in technology are changing the way visually impaired people read and write Braille. Learning Braille in one's native language can be more convenient for its users. This study proposes an improved backend processing algorithm for a previously developed touchscreen-based Braille text entry application. The application is used to collect Urdu Braille data, which are then converted to Urdu text. Braille to text conversion has previously been performed for Hindi, Arabic, Bangla, Chinese, English, and other languages. For this study, Urdu Braille Grade 1 data were collected for a multiclass problem (39 characters of Urdu, represented by class 1, Alif (ﺍ), to class 39, Bri Yay (ے)). A total of N = 144 cases were collected for each class. The dataset was collected from visually impaired students of the National Special Education School, who entered the Urdu Braille alphabet using touchscreen devices. The final dataset contained N = 5638 cases. A Reconstruction Independent Component Analysis (RICA)-based feature extraction model was created for Braille to Urdu text classification. For better visualization and understanding, the multiclass problem was divided into three categories of 13 classes each, i.e., category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), category-2 (14–26), Ray-Fay (ﻒ - ﺮ), and category-3 (27–39), Qaaf-Bri Yay (ے - ﻕ). For comparison, robust machine learning techniques, namely support vector machine, decision tree, and K-nearest neighbors, were used. The performance was evaluated in terms of true positive rate, true negative rate, positive predictive value, negative predictive value, false positive rate, total accuracy, and area under the receiver operating characteristic curve. Among all the classifiers, the support vector machine achieved the highest performance, with 99.73% accuracy. Currently, this work covers only Grade 1 Urdu Braille. In the future, we plan to extend it to Grade 2 Urdu Braille with text and speech feedback on touchscreen-based Android phones.

1. Introduction

Smart devices are among the most powerful tools for improving the living standards of people with visual disabilities [1]. Recent trends predict a drastic increase in smartphone users, with an expected increase to up to 9 billion by 2021 [2, 3]. There are various applications meant to assist visually impaired users, such as screen readers, sound and speech output devices, location finders, wearable devices for mobility, stereo vision-based systems, and virtual assistants [4–7]. Rapid growth in smartphone usage has changed people's learning attitudes [8]. People are increasingly turning to technology to explore new ideas and learn by watching videos, tutorials, and online courses on their smart devices [9]. Braille is a commonly used language for visually impaired people; Louis Braille designed it in 1821. Braille is composed of six dots arranged in two columns and three rows [10]. Visually impaired people write on sheets with the help of a stylus and read by gliding their fingers over the raised dots, but writing Braille with these tools is difficult. Some interfaces convert textbooks into Braille books, but this facility is limited to specific languages, and there is no procedure for converting Urdu text into Braille [11]. Previously, visually impaired people could only use their phones to make phone calls and send and receive text messages [12]. Now, however, people with visual impairments can read with the help of screen reader software such as Apple's VoiceOver [13]. Different applications such as NavTap, Braille Play [14], Braille Tap [15], Braille Key, TypeIn Braille [16], Perkinput [17], Braille Easy [18], Eyedroid [19], and DRISHYAM [20] were developed to facilitate Braille text entry on smart devices. Although these applications provide audio feedback for user assistance, they also rely on numerous gestures that are difficult to memorize and require more time to perform a specific task. Due to these usability issues, visually impaired users are often unable to use such applications, and research is in progress on applications that are less time-consuming and more usable. For Braille to natural language conversion, image processing techniques have been applied to scanned Braille sheets; with these techniques, Braille has been converted into Arabic [21], English [22], Bengali [23], Hindi [24], Tamil, mathematics [25–27], and Odia [28]. Braille has also been converted into Urdu and Hindi using deterministic Turing machines [29] and image segmentation algorithms [30].

Previously, Braille was translated into other languages using scanned sheets as input. These conversions are cumbersome because users must first write extensively on the sheets. Several touchscreen-based applications provide text-to-speech conversion methods that assist visually impaired people with reading and writing, but most of these applications burden the user [31, 32], for example, by requiring many gestures to be memorized, forcing the user to find the positions of dots on the screen, and offering no editing options. A position-free Braille text entry method was proposed to address these problems. That application was designed to put the least possible burden on users while entering the English Braille alphabet: visually impaired users can enter Braille characters by tapping anywhere on the screen, and the input is subsequently saved in an image format. For character recognition, deep learning techniques with the GoogLeNet inception model achieved more than 95% accuracy [33]. To the best of our knowledge, there are very few studies on Urdu Braille data, and none of them takes user input directly from the touchscreen. There is therefore a strong need for an application that takes run-time Braille data and converts it into natural languages; currently, there is no mechanism available for Braille to Urdu conversion using touchscreen-based devices. In this study, the front-end interface proposed by Sana et al. was used to collect the Urdu Braille dataset. In the previous version of this application, Braille input was saved in an image format; here, with some backend processing algorithm improvements, Urdu Braille input is saved as numerical data. Machine learning techniques such as DT, SVM, and KNN, with the RICA-based feature extraction method, are used for Braille to Urdu conversion on the new Urdu Braille dataset (see Figure 1).

The main contributions of this research are as follows:
(a) Collection of an Urdu Braille Grade 1 dataset, using the application developed by Sana et al. [33], from visually impaired students of the Special Education School, Manak Payyan, Pakistan. No existing Urdu Braille dataset had been collected directly from touchscreen devices.
(b) Enhancement of the backend processing mechanism of Sana et al.
(c) Prediction of Urdu characters from Braille input with robust machine learning techniques, namely decision tree (DT), support vector machine (SVM), and K-nearest neighbor (KNN), using the RICA-based feature extraction method.
(d) Evaluation of the proposed mechanism on the collected Urdu dataset in terms of true positive rate (TPR), true negative rate (TNR), positive predictive value (PPV), negative predictive value (NPV), false positive rate (FPR), total accuracy (TA), and area under the curve (AUC).
(e) A comparative analysis with previous studies that used scanned-input-based data of different national and regional languages.

This study comprises the following sections: Section 2 provides materials and methods, information regarding the dataset, and its collection procedure. In Section 3, the results are presented in detail. Discussion is provided in Section 4, and finally, conclusions and future work are presented in Section 5.

2. Materials and Methods

2.1. Dataset

The previously proposed front-end Android-based application was used to collect the Urdu dataset for this study [33]. Visually impaired users entered Urdu Braille characters on an Android-based touchscreen. The dataset was collected from the National Special Education Center (NSEC), "Manak Payyan." The participants were between 12 and 19 years of age, and these students were either completely or partially blind. At this level, Urdu Grade 1 Braille data, comprising 39 distinct characters, were collected.

The final dataset comprises 5637 Urdu Braille characters collected using Android-based touchscreen devices. Machine learning techniques are applied to this dataset to convert Braille to Urdu text. All ambiguous entries were eliminated after manually checking the values recorded for each alphabet.

2.2. Backend Processing Mechanism

An improved backend processing mechanism is proposed in this study. In earlier work, a dataset consisting of an image for each entered character was used.

In the current study, using the position-free interface, the "x" and "y" coordinates of each dot are stored in a database. Braille is composed of six-dot patterns, and each Braille character is represented by the activation and deactivation of these dots. For example, if a Braille character has two active dots, the proposed system saves the "x" and "y" coordinates of the active dots, and the remaining four inactive dots are assigned a value of "0." The initial data were stored as comma-separated values in a .txt file. To avoid ambiguity in the dataset, the researchers manually checked the data, removed the commas, and stored each extracted dot in a .csv file. Figures 2(a) and 2(b) show a visually impaired user entering Dal "ﺪ" and Toyn "ﻄ" using the touchscreen-based Braille interface.

The algorithm for Braille input coordinate extraction is shown in Figure 3.
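For illustration, a minimal Python sketch of this coordinate-extraction step is given below. The dot indices, coordinate values, and file name are hypothetical, and the actual application runs on Android rather than in Python.

import csv

def braille_row(active_dots):
    """Build one dataset row for a Braille character.

    active_dots maps a dot index (1-6) to the (x, y) touch coordinates
    recorded for that dot; inactive dots contribute 0 for both values.
    Returns a flat list [x1, y1, x2, y2, ..., x6, y6].
    """
    row = []
    for dot in range(1, 7):
        x, y = active_dots.get(dot, (0, 0))
        row.extend([x, y])
    return row

# Hypothetical entry: a character with dots 1, 4, and 5 active.
sample = braille_row({1: (112, 340), 4: (598, 352), 5: (602, 710)})

# Append the extracted row to the .csv dataset, one row per entered character.
with open("urdu_braille_grade1.csv", "a", newline="") as f:
    csv.writer(f).writerow(sample)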

Since only the coordinate locations of the active dots are saved in the dataset for each character, instead of a whole image, this approach reduces the storage requirements. In the previous scheme, a single image saved in the database was 4 to 8 KB in size, and this requirement was multiplied by the number of cases, whereas with the new approach, a text file containing 144 different samples of a single character takes only 9 to 10 KB.

2.2.1. Feature Extraction

Different feature extraction techniques have previously been applied for Braille to text conversion in other languages. Jha and Parvathi extracted HOG features for Braille to Hindi text conversion [24]. Similarly, Li et al. used traditional feature extraction methods with KNN, Naïve Bayes, etc., for recognizing Braille characters [34]. Moreover, Li et al. used the histogram of oriented gradients (HOG) with SVM to convert Braille characters into English, Sinhala, and Odia [35, 36]. This study extracts RICA-based features and uses DT, SVM, and KNN classifiers for converting Braille to Urdu text. This feature extraction method can extract more features than the original dimensionality of the input dataset, and it also allows faster execution of the preprocessing steps.

2.2.2. RICA Feature Extraction Method

Reconstruction independent component analysis (RICA) is an unsupervised learning technique, so it does not utilize class label information. The RICA algorithm was introduced to address the limitations of the ICA algorithm and has delivered more promising results than ICA. Many algorithms have been proposed in recent years to learn sparse features.

A sparse filter can differentiate many artificial and natural signals, and this property plays a vital role in different machine learning techniques.

The unlabeled data are given as input z_i ∈ R^N, for i = 1, 2, ..., M.

For calculating the independent components, the optimization problem of standard ICA [37] can be defined mathematically as

minimize_X  (1/M) Σ_{i=1}^{M} Σ_{j=1}^{L} h(X_j z_i),  subject to X X^T = I,

where h represents a nonlinear penalty function, X is the filter matrix, L represents the vectors count, and I defines the identity matrix. Additionally, the constraint X X^T = I is employed to prevent the vectors in X from being degenerated. For this purpose, a smooth penalty function can be used, i.e., h(·) = log(cosh(·)) [27].

However, the orthonormality constraint blocks standard independent component analysis from learning an overcomplete basis. Consequently, this defect prevents ICA from scaling to high-dimensional data. Hence, as a replacement for the orthonormality constraint of ICA, a soft reconstruction cost is used in RICA. After this substitution, the following unconstrained problem can be used to represent RICA filtering:

minimize_X  (λ/M) Σ_{i=1}^{M} ‖X^T X z_i − z_i‖_2^2 + Σ_{i=1}^{M} Σ_{j=1}^{L} h(X_j z_i).

In the above-stated equation, λ > 0 controls the tradeoff between sparsity and the reconstruction error. After swapping the orthonormality constraint for the reconstruction cost, RICA can learn sparse representations even on unwhitened data when X is overcomplete. However, the penalty h, while yielding sparse representations, is not invariant [38]. Therefore, RICA [39] replaces it with an additional L2 pooling penalty, which simultaneously encourages pooling features to cluster correlated features and promotes sparsity in feature learning. L2 pooling [40, 41] represents a two-layer network, with a square nonlinearity in the first layer and a square-root nonlinearity in the second layer:

h(X z_i) = Σ_{k=1}^{L} sqrt( ε + H_k ((X z_i) ⊙ (X z_i)) ),

where the pooling matrix H ∈ R^(L×L), H_k denotes a row of that pooling matrix whose entries are set to constant weights (i.e., 1 for every element of H), ⊙ denotes element-wise multiplication, and ε > 0 is a small constant. RICA is a linear method that investigates the sparse representation only in the original data space, and it is unable to use the association between class label information and training samples.
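The study does not state which RICA implementation was used (MATLAB's rica function is a common choice), so the following NumPy/SciPy sketch only illustrates the unconstrained RICA objective described above, i.e., the soft reconstruction cost plus the smooth penalty h(·) = log(cosh(·)), minimized with L-BFGS; the L2 pooling term is omitted for brevity, and the filter count, λ value, and data are illustrative.

import numpy as np
from scipy.optimize import minimize

def rica_features(Z, L=24, lam=0.1, max_iter=200, seed=0):
    # Learn L RICA filters from unlabeled data Z (M samples x N dims)
    # and return the learned feature responses X z_i for every sample.
    M, N = Z.shape
    rng = np.random.default_rng(seed)
    X0 = 0.01 * rng.standard_normal((L, N))

    def objective(x_flat):
        X = x_flat.reshape(L, N)
        S = Z @ X.T                            # responses X z_i, shape (M, L)
        R = S @ X                              # reconstructions X^T X z_i
        recon = np.sum((R - Z) ** 2) / M       # soft reconstruction cost
        sparsity = np.sum(np.log(np.cosh(S)))  # smooth penalty h(.)
        return lam * recon + sparsity

    res = minimize(objective, X0.ravel(), method="L-BFGS-B",
                   options={"maxiter": max_iter})
    X = res.x.reshape(L, N)
    return Z @ X.T                             # RICA feature matrix (M x L)

# Illustrative use with 12-dimensional Braille coordinate rows.
Z = np.random.rand(100, 12)
features = rica_features(Z, L=24)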

2.3. Classification

The process of categorizing samples into classes according to the extracted features is known as classification. There are different families of machine learning techniques, such as supervised, unsupervised, reinforcement, ensemble, neural network, and deep learning methods [42]. In this study, machine learning approaches such as DT, KNN, and SVM are applied to the RICA-based extracted features for character prediction. The data are split 70%–30% for training and validation purposes [43].
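As a sketch, the 70%–30% split can be expressed as follows; the feature matrix and labels here are placeholders standing in for the RICA features and the 39 Urdu character classes.

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data; in practice, these would be the RICA feature matrix
# and the class labels (1-39) of the collected Urdu Braille dataset.
features = np.random.rand(5637, 24)
labels = np.random.randint(1, 40, size=5637)

# 70% of the cases for training, 30% held out for validation.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.30, random_state=0)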

2.3.1. Decision Tree

A DT is a machine learning technique used for prediction. DT is popular because it does not require many computations [44]. DT classifiers have a tree-like structure that divides the dataset into several subsets. The classifier trains the model by applying simple decision rules to the training data [45]. The model is then used to forecast the desired values, read the dataset, and categorize the samples into classes [8].

DT algorithms can be designed mathematically using standard impurity-based splitting criteria, such as entropy and information gain:

Entropy(S) = − Σ_{c=1}^{C} p_c log2(p_c),

Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) Entropy(S_v),

where p_c is the proportion of samples of class c in node S and S_v is the subset of S for which attribute A takes the value v.

In this study, the data are divided into training and test sets with a 70%–30% ratio. The training data are used to build the model, and the test data are used to check the model's validity. A multiclass approach is used to predict Braille to Urdu text using DT, and the DT is tuned using the default parameters.
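A minimal, self-contained sketch of this DT configuration is shown below, using scikit-learn defaults as a stand-in for the toolbox actually used in the study and synthetic data in place of the collected Braille features.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic multiclass data standing in for the 39-class Urdu Braille features.
X, y = make_classification(n_samples=5637, n_features=24, n_informative=12,
                           n_classes=39, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30,
                                                    random_state=0)

# Decision tree with default parameters, as reported in the study.
dt = DecisionTreeClassifier(random_state=0)
dt.fit(X_train, y_train)
print("DT validation accuracy:", dt.score(X_test, y_test))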

2.3.2. KNN

KNN is a common and simple nonparametric technique used for regression and classification in machine learning. The Euclidean distance [46] is used to calculate the distance between two samples a and b:

d(a, b) = sqrt( Σ_{i=1}^{n} (a_i − b_i)^2 ),

where a_i and b_i are the ith feature dimensions of samples a and b, and n represents the total number of feature dimensions.

In KNN, the number of nearest neighbors determines the output value. If K = 1, the object is simply assigned to the class of its single nearest neighbor [45].

Here, KNN is used to classify Braille to Urdu text. K = 3 is selected, with Euclidean distance as the distance metric and equal weights as the distance weighting.
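A possible scikit-learn equivalent of this KNN configuration (K = 3, Euclidean distance, equal weights) is sketched below with placeholder data.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Placeholder samples standing in for Braille coordinate features.
X_train = np.random.rand(200, 12)
y_train = np.random.randint(1, 40, size=200)

# K = 3 neighbors, Euclidean distance, equal (uniform) weights.
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean", weights="uniform")
knn.fit(X_train, y_train)
predictions = knn.predict(np.random.rand(5, 12))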

2.3.3. SVM

SVM is one of the most well-known machine learning classification techniques for pattern and character recognition. It is a supervised machine learning technique used in biomedical image processing, computer vision, speech recognition, etc. [47]. SVM builds a hyperplane in a high-dimensional space to obtain better classification. If the obtained hyperplane has the largest functional margin, the classifier gives good performance [48]; the greater the margin, the lower the generalization error. SVM finds the hyperplane that provides the largest minimum distance to the training data and can therefore produce more generalized outcomes. SVM is fundamentally a twofold (binary) classifier that separates the data with a hyperplane in a high-dimensional space.

Let us consider a hyperplane w · x + b = 0, where w is the normal vector to the hyperplane and b is the bias term.

Linearly separable data are represented as follows:

y_i (w · x_i + b) ≥ 1, for i = 1, 2, ..., N,

where y_i ∈ {−1, +1} is the twofold (binary) class label.

The maximum margin, 2/‖w‖, is achieved by minimizing the objective function

E = ‖w‖^2 / 2.

Combining this objective with the separability constraints above gives the primal problem

minimize (1/2)‖w‖^2, subject to y_i (w · x_i + b) ≥ 1.

If the data cannot be linearly separated, then a slack variable ξ_i is used to account for misclassifications. Thus, in this scenario, the objective function is defined as

minimize (1/2)‖w‖^2 + C Σ_{i=1}^{N} L(ξ_i), subject to y_i (w · x_i + b) ≥ 1 − ξ_i, ξ_i ≥ 0.

Here, C is a hyperparameter and L is the cost function; the cost term penalizes misclassified points and outliers. The dual formulation with L(ξ_i) = ξ_i is

maximize_α Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j (x_i · x_j), subject to 0 ≤ α_i ≤ C and Σ_{i=1}^{N} α_i y_i = 0.

Here, the weight vector is recovered as w = Σ_{i=1}^{N} α_i y_i x_i.

The kernel trick is used to handle nonlinearly separable data [49]: a nonlinear mapping function transforms the input space into a higher-dimensional feature space. Polynomial, Gaussian, and radial basis function kernels are the most popular choices.

SVM polynomial kernel:

K(x_i, x_j) = (x_i · x_j + 1)^d.

SVM Gaussian kernel:

K(x_i, x_j) = exp(−‖x_i − x_j‖^2 / (2σ^2)).

SVM fine Gaussian kernel: the same Gaussian form with a smaller kernel width σ, which yields more finely detailed decision boundaries.

The dual formulation of the nonlinear case is

maximize_α Σ_{i=1}^{N} α_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} α_i α_j y_i y_j K(x_i, x_j), subject to 0 ≤ α_i ≤ C and Σ_{i=1}^{N} α_i y_i = 0.

Grid search is a well-known method for tuning SVM hyperparameters. Optimal parameters are selected by setting the grid range and step size. Only one parameter, "c," the soft-margin constant, is used with a linear kernel, whereas the SVM Gaussian and fine Gaussian kernels have two training parameters, the cost "c" and the kernel width sigma, which control the degree of nonlinearity. The RICA feature extraction method was employed in this study, with 70% of the data used for training and 30% for testing. A polynomial kernel with default parameters is used.
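A scikit-learn sketch of such a grid search over the soft-margin constant c and the Gaussian kernel width (exposed as gamma, roughly 1/(2σ²)) is given below; the grid values and synthetic data are illustrative, and the final line reflects the polynomial kernel with default parameters used in this study.

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=24, n_informative=12,
                           n_classes=39, random_state=0)

# Grid over the cost C and the Gaussian kernel parameter gamma (illustrative ranges).
grid = GridSearchCV(SVC(kernel="rbf"),
                    param_grid={"C": [0.1, 1, 10, 100],
                                "gamma": [0.01, 0.1, 1]},
                    cv=5)
grid.fit(X, y)
print("Best parameters:", grid.best_params_)

# The configuration reported in this study: a polynomial kernel with default parameters.
poly_svm = SVC(kernel="poly").fit(X, y)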

2.4. Performance Evaluation Metrics

True positive rate (TPR), true negative rate (TNR), positive predictive value (PPV), negative predictive value (NPV), false positive rate (FPR), total accuracy (TA), and area under the ROC curve (AUC) are used to evaluate the prediction of Urdu Braille characters.

2.4.1. True Positive Rate

TPR is also called sensitivity or recall. TPR indicates the proportion of actual positive cases (correct alphabets) that are classified as positive. Mathematically, it is written as

TPR = TP / (TP + FN).

2.4.2. True Negative Rate

TNR is also called specificity. This metric defines the proportion of negative cases that are correctly identified. It can be expressed mathematically as

TNR = TN / (TN + FP).

2.4.3. Positive Predictive Value

PPV, also known as precision, is the proportion of positive predictions that are actually true positives. Mathematically, it can be presented as

PPV = TP / (TP + FP).

2.4.4. Negative Predictive Value

NPV is the proportion of negative predictions for which the subject is actually negative. Its mathematical representation is given as follows:

NPV = TN / (TN + FN).

2.4.5. Total Accuracy

Total accuracy is defined as the sum of all true positives and true negatives divided by the total number of cases (true positives, true negatives, false positives, and false negatives):

TA = (TP + TN) / (TP + TN + FP + FN).

2.4.6. ROC

The receiver operating characteristic (ROC) curve measures the tradeoff between the true positive and false positive rates. It is obtained by plotting FPR along the x-axis against TPR along the y-axis. The area under the curve (AUC) lies between 0 and 1, and a value greater than 0.5 indicates separation between classes. In this study, Urdu Braille characters that are predicted correctly when they belong to the true class have AUC values of 1 or approaching 1.
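For reference, the metrics defined above can be computed per class in a one-vs-rest fashion from the confusion matrix, as sketched below; the labels and scores are dummy values.

import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def per_class_metrics(y_true, y_pred, positive_class):
    # TPR, TNR, PPV, NPV, FPR, and TA for one class in a one-vs-rest setting.
    y_t = (np.asarray(y_true) == positive_class).astype(int)
    y_p = (np.asarray(y_pred) == positive_class).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_t, y_p, labels=[0, 1]).ravel()
    return {"TPR": tp / (tp + fn), "TNR": tn / (tn + fp),
            "PPV": tp / (tp + fp), "NPV": tn / (tn + fn),
            "FPR": fp / (fp + tn), "TA": (tp + tn) / (tp + tn + fp + fn)}

# Dummy example for a 3-class problem, with class 2 as the positive class.
y_true = [1, 2, 2, 3, 2, 1, 3, 2]
y_pred = [1, 2, 2, 3, 1, 1, 3, 2]
print(per_class_metrics(y_true, y_pred, positive_class=2))

# AUC for that class from decision scores (placeholder scores).
scores = [0.1, 0.9, 0.8, 0.2, 0.4, 0.3, 0.1, 0.7]
print(roc_auc_score([t == 2 for t in y_true], scores))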

3. Results

Urdu Braille characters are predicted from a newly collected dataset from touchscreen devices. The performance is computed using RICA feature extraction methods and machine learning algorithms such as DT, KNN, and SVM. TPR, TNR, PPV, NPV, TA, FPR, and AUC were the performance metrics employed in the evaluation.

Figures 4(a)–4(c) show AUC values using DT for category-1 (class 1–class 13), Alif-Zaal (ﺫ - ﺍ), category-2 (class 14–class 26), Ray-Fay (ﻒ - ﺮ), and category-3 (class 27–class 39), Qaaf-Bri Yay (ے - ﻕ), respectively. With the extracted RICA features, the highest performance in category-1 (class 1–class 13), Alif-Zaal (ﺫ - ﺍ), is attained for Braille class 6, with 99.9% TA and an AUC of 0.9979, as shown in Figure 4(a). This was followed by other classes such as Alif, Bay, Hay, and Khay (ﺍ, ﺏ, ﺡ, ﺥ), yielding accuracies of 99.90%, with AUC values of 0.9995, 0.9995, 0.9914, and 0.9895, respectively. Other classes, such as Chay and Zaal (ذ ،چ), also achieved accuracies of 99.85% and 97.76%, with AUC values of 0.9884 and 0.9887, respectively. Other performance measures, such as TPR and TNR, yield TPR >94% and TNR >98%. Category-2 (14–26), Ray-Fay (ﻒ - ﺮ), shows the highest performance for Seen, Zuuad, Ray, Rray, Fay, Toyn, and Ghaen (ﺲ, ﺾ, ﺮ, ڑ, ﻒ, ﻃ, ﻍ), with accuracies of 99.95%, 99.95%, 99.85%, 99.85%, 99.80%, 99.64%, and 99.64% and AUC values of 0.9899, 0.9997, 0.9992, 0.9495, 0.9615, 0.9495, and 0.9683, respectively, with TPR >90% and TNR >99%, as shown in Figure 4(b). For category-3 (27–39), Qaaf-Bri Yay (ے - ﻕ), the maximum TA of 100% is achieved for class 35 (Gol Hy), 37 (Hamza), and 28 (Kaaf) (ﻩ, ء, ﻙ), with AUC values of 0.9553, 1, and 0.9995. These results are followed by Gaaf, Meem, and Hy (گ, ﻡ, ﻫ), with TA of 99.75%, 99.85%, and 99.75% and AUC values of 0.0732, 0.8125, and 0.9553, with TPR >94% and TNR >99%, except for class 31, i.e., TPR = 62.50%, as shown in Figure 4(c). Among all, the highest separation (AUC = 1) was seen for the Urdu Braille characters Gol Hy and Hamza (ﻩ, ء). The AUC values of other Urdu Braille characters, such as Alif, Bay, Pay, and Tay (ﺍ, ﺐ, پ, and ﺖ), are above 0.99, indicating good classification. Detailed results are presented in Table 1.

The maximum accuracy obtained using KNN for category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), is for class 2, 4, 5, and 13, i.e., Bay, Tay, Ttay, and Zaal (ﺏ, ﺖ, ٹ, ﺬ), with TA of 99.95%, 99.85%, 99.80%, and 99.75% and AUC values of 0.9997, 0.9992, 0.9990, and 0.9896, respectively, as shown in Figure 5(a). For category-2 (14–26), Ray-Fay (ﻒ - ﺮ), the highest TA of 100% is achieved for class 15, 18, and 23, i.e., Rray, Seen, and Zoyn (ڑ, ﺲ, ﻅ), followed by class 26, 21, 14, and 19: Fay and Zuad (ﻒ, ﺾ) achieved a TA of 99.95%, while Ray and Sheen (ﺮ, ﺵ) had TAs of 99.90% and 99.75%, respectively, with AUC values of 1, 1, 1, 0.9884, 0.9922, 0.9995, and 0.9800, with TPR >96% and TNR >99%, as shown in Figure 5(b). Similarly, the highest TA achieved for category-3 (27–39), Qaaf-Bri Yeh (ے - ﻕ), is for class 35, 37, 31, 30, 27, and 28, i.e., Gol Hy, Hamza, Meem, Laam, Qaaf, and Kaaf (ﻩ, ء, ﻡ, ﻞ, ﻕ, ﻙ), with TA of 99.75%, 99.80%, 100%, 100%, 99.69%, and 99.69% and AUC values of 1, 1, 0.7143, 0.9457, 0.9873, and 0.9578, respectively, along with TPR >42% for Meem (ﻡ), TPR = 100% for Gol Hy (ﻩ) and Hamza (ء), and TNR = 100% for Gol Hy, Hamza, and Meem (ﻩ, ء, ﻡ), as shown in Figure 5(c). Overall, the findings indicate that Braille characters such as ڑ, ﺲ, ﻅ, ﻩ, and ء obtained the highest AUC value of 1, which shows 100% separation among all classes.

We also achieved AUC values above 0.99 for several Urdu Braille characters, such as ﺍ, ﺏ, پ, and ﺖ, using KNN on the extracted feature set. KNN thus also exhibits promising results when RICA features are extracted from Braille characters entered on a touchscreen device. Table 2 shows detailed KNN results for all Grade-1 Urdu Braille characters.

Furthermore, SVM was used to measure the performance. Category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), shows the highest accuracies for class 10, 6, 9, 8, 3, 4, and 7, i.e., Khay, Ssay, Hay, Chay, Pay, Tay, and Jeem (ﺥ, ﺙ, ﺡ, چ, پ, ﺖ, ﺝ), with TA of 100%, 99.95%, 99.95%, 99.90%, 99.60%, 99.60%, and 99.80% and AUC values of 1, 0.9997, 0.9997, 0.9995, 0.9995, 0.9995, and 0.9990, respectively, with TPR = 100% and TNR >99%, as shown in Figure 6(a). For category-2 (14–26), Ray-Fay (ﻒ - ﺮ), the highest performance was achieved for class 26, 15, 23, 25, 14, 19, and 21, i.e., Fay, Rray, Zoyn, Ghaen, Ray, Sheen, and Zuad (ﻍ, ﺮ, ﻒ, ڑ, ﻅ, ﺵ, ﺾ), with TA of 100%, 99.95%, 99.90%, 99.90%, 99.85%, 99.85%, and 99.85% and AUC values of 1, 0.9444, 0.9828, 0.9905, 0.9992, 0.9900, and 0.9776, respectively, as shown in Figure 6(b). Similarly, for category-3, Qaaf-Bri Yeh (ے - ﻕ), the highest TA is achieved for class 34, 35, 36, 37, and 38, i.e., Waao, Hy, Gol Hy, Hamza, and Choti Yeh (ﻮ, ﻫ, ﻩ, ء, ﻯ), which all achieved the highest TA of 100%, with an AUC of 1, TPR = 100%, and TNR = 100%.

However, Kaaf, Laam, and Noon Gunnah (ﻙ, ﻞ, ں) obtained 99.95% TA; the AUC is 0.9997 for Kaaf and Laam and 0.9889 for Noon Gunnah, with TPR of 97% and TNR >99%, as shown in Figure 6(c). SVM outperforms the other classifiers in terms of AUC value. Braille characters such as ﺥ, ﻒ, ﻮ, ﻩ, ﻫ, ء, and ﻯ obtained the highest AUC value of 1. SVM shows AUC values above 0.99 for ﺍ, ﺏ, پ, ﺖ, ٹ, ﺙ, چ, ﺫ, ﺮ, ﺰ, ﻕ, ﻙ, and ﻞ. Figures 7(a)–7(c) illustrate the overall performance on all Urdu alphabets from Alif to Bri Yay in terms of TPR, TNR, PPV, NPV, FPR, TA, and AUC.

Shokat et al. [33] reported that Naive Bayes, DT, KNN, SVM, a sequential model, and the GoogLeNet inception model achieved TA of 96.38%, 97.20%, 97.04%, 83.00%, 92.21%, and 95.8%, respectively. For Naive Bayes, DT, and KNN, TPR and PPV showed no significant value, whereas better results were seen with SVM, the sequential model, and the GoogLeNet inception model. In the current study, the maximum performance with the lowest error rate was achieved by SVM with the RICA-based feature extraction method, with TPR (93.96%), TNR (99.85%), PPV (94.51%), NPV (99.87%), TA (99.73%), and FPR (0.14%). After SVM, DT showed the second-best performance, with TPR (90.98%), TNR (99.78%), PPV (92.21%), NPV (99.78%), TA (99.57%), and FPR (0.21%). Finally, the performance achieved using KNN is TPR (90.2%), TNR (99.72%), PPV (89.75%), NPV (99.77%), TA (99.5%), and FPR (0.28%). Table 3 compares the findings of Sana et al. [33] with the current study. With the improved backend processing mechanism for dataset collection, more promising results have been achieved in terms of both total accuracy and storage space used.

4. Discussion

To improve the quality of life of visually challenged persons, Braille must be converted to natural language. Braille to natural language conversion has been addressed in many studies, most of which translated scanned Braille documents into standard English or vice versa. Jha and Parvathi used an SVM classifier trained on features extracted with the histogram of oriented gradients (HOG) method to translate handwritten Odia [36] and Hindi [25] text into Braille, achieving accuracies of 99% and 94.5%, respectively. Using the same technique, 99% and 80% accuracies were achieved for converting handwritten English and Sinhala documents into Braille text [35]. An SVM classifier trained on HAAR features has been used to convert handwritten English sheets, with a classification error of less than 10% [50]. English to Braille conversion has also been performed by taking input from a gesture-based touchscreen and using KNN for classification; the distance between two dots was computed using the Bayesian touch distance, which yielded a 97.4% accuracy [51]. Another study recognized scanned Braille characters using KNN, Naïve Bayes, random forest, and SVM, achieving 63%, 53%, 65%, and 69.6% accuracies, respectively [34], as shown in Table 4.

For Braille to Urdu character recognition, RICA-based features are extracted, and robust machine learning algorithms such as DT, SVM, and KNN are used. A new Urdu Braille dataset was collected from visually impaired students using the newly built touchscreen-based position-free Braille input application.

For the DT algorithm with category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), the highest detection performance was obtained for Braille class 6 (ﺙ), with TA (99.95%) and AUC (0.9979). These results were followed by other classes such as (ﺍ, ﺏ, ﺡ, ﺥ), yielding accuracies of 99.90%, with AUC (0.9995, 0.9995, 0.9914, and 0.9895), respectively. Other performance indicators, such as TPR and TNR, show that TPR >94% and TNR >98% are achieved. Similarly, category-2 (14–26), Ray-Fay (ﻒ - ﺮ), for DT shows the best results for class 18, 21, 14, and 15, i.e., Seen, Zuuad, Ray, and Rray (ﺲ, ﺾ, ﺮ, ڑ), with accuracies of 99.95%, 99.95%, 99.85%, and 99.85%, with AUC (0.9899, 0.9997, 0.9992, and 0.9495, respectively), with TPR = 100% and TNR >99%. For category-3 (27–39), Qaaf-Bri Yeh (ے - ﻕ), the highest TA is achieved for class 35, 37, and 28, i.e., Gol Hy and Hamza (ﻩ, ء) with 100% TA, while the TA of Kaaf is 99.90%, with AUC (1, 1, and 0.9995), respectively, with TPR >94% and TNR >99%.

Employing KNN, the highest TA achieved for category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), is for class 13, 2, and 4, i.e., Bay and Tay (ﺐ, ﺖ) (99.85% and 99.85%), with AUC (0.9997 and 0.9992, respectively). For category-2 (14–26), Ray-Fay (ﻒ - ﺮ), the highest TA is achieved for class 15, 18, 23, 26, and 21, i.e., Rray, Seen, and Zoyn (ڑ, ﺲ, ﻅ) with TA = 100%, followed by Fay and Zuad (ﻒ, ﺾ) with TA (99.95%), with AUC (1, 1, 1, 0.9884, and 0.9922), TPR = 100%, and TNR >99%. Similarly, for category-3 (27–39), Qaaf-Bri Yeh (ے - ﻕ), the highest TA was achieved for class 35, 37, and 31, i.e., Gol Hy, Hamza, and Meem (ﻩ, ء, ﻡ), with TA (100%, 100%, and 99.80%) and AUC (1, 1, and 0.7143), with TPR >42% for Meem (ﻡ), TPR = 100% for Gol Hy (ﻩ) and Hamza (ء), and TNR = 100% for Gol Hy, Hamza, and Meem (ﻩ, ء, ﻡ).

Moreover, the performance evaluation for SVM on category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), shows the highest accuracies for class 10, 6, 9, and 8, i.e., Khay, Ssay, Hay, and Chay (ﺥ, ﺙ, ﺡ, چ), with TA (100%, 99.95%, 99.95%, and 99.90%), with AUC (1, 0.9997, 0.9997, and 0.9995, respectively), with TPR = 100% and TNR >99%. For category-2 (14–26), Ray-Fay (ﻒ - ﺮ), the highest performance was achieved for class 26, 15, 23, and 25, i.e., Fay, Rray, Zoyn, and Ghaen (ﻒ, ڑ, ﻅ, ﻍ), with TA achieved (100%, 99.95%, 99.90%, and 99.90%) and AUC (1, 0.9444, 0.9828, and 0.9905). Similarly, for category-3, Qaaf-Bri Yeh (ے - ﻕ), the highest TA is achieved for class 34, 35, 36, 37, and 38, i.e., Waao, Hy, Gol Hy, Hamza, and Choti Yeh (ﻮ, ﻫ, ﻩ, ء, ﻯ), which all achieved the highest TA of 100%, with AUC (1), TPR = 100%, and TNR = 100%.

5. Conclusions and Future Work

Braille is a growing means of communication for people with visual impairments. More than 150 million people around the world continue to use Braille for several reasons; literacy is one of the strongest cases demonstrating Braille's importance for learning how to read and write. With the advent of technology, Braille has become more accessible to visually impaired users. Various studies have been carried out to convert Braille into natural language. However, most studies have used handwritten scanned sheets to translate Braille into natural languages such as Arabic, Hindi, Tamil, Odia, Chinese, Korean, Bangla, English, and Gujarati, and vice versa [21, 28, 36, 52–57]. To the best of our knowledge, no such study has been conducted for Urdu Braille recognition. In this research, the Urdu Braille dataset was collected using a touchscreen-based Android application [33], which is easy to use and less burdensome for visually impaired people. Robust machine learning techniques, namely SVM with a polynomial kernel, DT with default parameters, and KNN with K = 3, were used for Urdu Braille character recognition. The dataset was divided 70%–30% for training and validation. The performance metrics used to evaluate these classifiers are TPR, TNR, PPV, NPV, FPR, TA, and AUC. The RICA-based feature extraction method is used for Urdu Braille character recognition. The highest performance using the DT classifier shows that class 35, Gol Hy (ﻩ), and class 37, Hamza (ء), achieved maximum separation (AUC = 1) with TA, TPR, TNR = 100% and FPR = 0%. Similarly, the highest performance using KNN shows that class 15, 18, 23, 35, and 37, i.e., Rray, Seen, Zoyn, Gol Hy, and Hamza (ڑ, ﺲ, ﻅ, ﻩ, ء), attained the highest separation (AUC = 1) with TPR, TNR, TA = 100% and FPR = 0%. Furthermore, using SVM, the highest performance was achieved by class 10, 26, 34, 35, 36, 37, and 38, i.e., Khay, Fay, Waao, Hy, Gol Hy, Hamza, and Choti Yeh (ﺥ, ﻒ, ﻮ, ﺡ, ﻩ, ء, ﻯ), with the highest separation (AUC = 1), TPR, TNR, TA = 100%, and FPR = 0%. SVM has the highest TA of all classifiers, with TA (99.73%), TPR (93.96%), TNR (99.85%), PPV (94.51%), NPV (99.87%), and FPR (0.14%).

RICA features are extracted, and robust machine learning techniques are used to recognize Urdu Braille characters. We intend to expand this dataset in the future to include other languages, such as English, and mathematics with advanced Braille grades. The performance will then be evaluated using convolutional neural networks (CNNs) and transfer learning techniques such as the GoogLeNet inception model and LSTM, which are expected to further improve the system's performance. Along with this, we will focus on providing error detection and voice feedback services for visually impaired users.

Data Availability

The dataset is not publicly available but can be provided on request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korean government under reference number (2020R1A2C1012196).