Advances in technology are changing the way visually impaired people read and write Braille. Learning Braille in one's native language can be more convenient for users. This study proposes an improved backend processing algorithm for an earlier developed touchscreen-based Braille text entry application. This application is used to collect Urdu Braille data, which are then converted to Urdu text. Braille to text conversion has been done for Hindi, Arabic, Bangla, Chinese, English, and other languages. For this study, Urdu Braille Grade 1 data were collected as a multiclass dataset (39 Urdu characters, represented by class 1, Alif (ﺍ), to class 39, Bri Yay (ے)). A total of N = 144 cases were collected for each class. The dataset was collected from visually impaired students of The National Special Education School. Visually impaired users entered the Urdu Braille alphabet using touchscreen devices. The final dataset contained N = 5638 cases. A reconstruction independent component analysis (RICA)-based feature extraction model was created for Braille to Urdu text classification. The classes were divided into three groups (13 each), i.e., category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), category-2 (14–26), Ray-Fay (ﻒ - ﺮ), and category-3 (27–39), Qaaf-Bri Yay (ے - ﻕ), for clearer presentation and understanding. The performance was evaluated in terms of true positive rate, true negative rate, positive predictive value, negative predictive value, false positive rate, total accuracy, and area under the receiver operating curve. For comparison, robust machine learning techniques, such as support vector machine, decision tree, and K-nearest neighbors, were used. Among all the classifiers, the support vector machine achieved the highest performance with 99.73% accuracy. Currently, this work covers only Grade 1 Urdu Braille. In the future, we plan to extend this work to Grade 2 Urdu Braille with text and speech feedback on touchscreen-based Android phones.

1. Introduction

Smart devices are among the most powerful tools for improving the living standards of people with visual disabilities [1]. Recent trends predict a drastic increase in smartphone users, with an expected increase of up to 9 billion by 2021 [2, 3]. There are various applications meant to assist visually impaired users, such as screen readers, sound and speech output devices, location finders, wearable devices for mobility, stereo vision-based systems, and virtual assistants [4–7]. Rapid growth in smartphone usage has changed the learning attitude of people [8]. People are increasingly turning to technology to explore new ideas. People learn by watching videos, tutorials, and online courses on their smart devices [9]. Braille is a writing system commonly used by visually impaired people. Louis Braille designed it in 1821. A Braille cell comprises six dots arranged in two columns and three rows [10]. Visually impaired people write on sheets with the help of a stylus and read by gliding their fingers over the raised dots. It is difficult for visually impaired people to write Braille using these devices. Some interfaces convert textbooks into Braille books, but this facility is limited to specific languages; there is no procedure for converting Urdu text into Braille [11]. Previously, visually impaired people could only use their phones to make phone calls and send and receive text messages [12]. Now, people with visual impairments can read Braille with the help of screen reader software such as Apple’s VoiceOver [13]. Different applications such as NavTap, Braille Play [14], Braille Tap [15], Braille Key, TypeIn Braille [16], Perkinput [17], Braille Easy [18], Eyedroid [19], and DRISHYAM [20] were developed to facilitate Braille text entry using smart devices. Although audio feedback was provided for user assistance in these applications, they relied on numerous gestures that were difficult to memorize, and specific tasks took more time to perform.
Due to usability issues, visually impaired users are unable to access such applications. Research is in progress on making applications that are less time-consuming and more usable. For Braille to natural language conversion, image processing techniques were applied to scanned Braille sheets. Using these techniques, Braille has been converted into Arabic [21], English [22], Bengali [23], Hindi [24], Tamil, maths [25–27], and Odia [28]. Braille has also been converted into Urdu and Hindi using deterministic Turing machines [29] and image segmentation algorithms [30].

Previously, Braille was translated into other languages using scanned sheets as input. Such conversion is laborious because users must write extensively on those sheets. Several touchscreen-based applications provide text-to-speech conversion methods that assist visually impaired people with reading and writing. Most of these applications burden the user [31, 32]: they require memorizing many gestures and finding the position of dots on the screen, and they offer no editing options. A position-free Braille text entry method was proposed to address these problems. That application was designed to put the least burden on the users while entering the English Braille alphabet. Visually impaired users can enter Braille characters by tapping anywhere on the screen, and the input is subsequently saved in an image format. For character recognition, deep learning techniques with the GoogLeNet inception model achieved more than 95% accuracy [33]. To our knowledge, there are very few studies on Urdu Braille data, and none of them takes user input directly from the touchscreen. There is therefore a strong need for an application that takes run-time Braille data and converts it into natural languages. Currently, there is no mechanism available for Braille to Urdu conversion using touchscreen-based devices. In this study, the front-end interface proposed by Sana et al. was used to collect the Urdu Braille dataset. Braille input was saved in an image format in the previous version of this application. Here, with some improvements to the backend processing algorithm, Urdu Braille input is saved as numerical data. Machine learning techniques such as DT, SVM, and KNN with RICA-based feature extraction methods are used for Braille to Urdu conversion on the new Urdu Braille dataset (see Figure 1).

The main contributions of this research are as follows:
(a) Collection of an Urdu Braille Grade 1 dataset from visually impaired students of the Special Education School, Manak Payyan, Pakistan, using the application developed by Sana et al. [33]. There was no existing Urdu Braille dataset that was taken directly from touchscreen devices.
(b) An enhancement of the backend processing mechanism of Sana et al. is proposed.
(c) Prediction of Urdu characters from Braille input is made with robust machine learning techniques, such as decision tree (DT), support vector machine (SVM), and K-nearest neighbor (KNN), using the RICA-based feature extraction method.
(d) Evaluation of the proposed mechanism on the collected Urdu dataset is based on true positive rate (TPR), true negative rate (TNR), positive predictive value (PPV), negative predictive value (NPV), false positive rate (FPR), total accuracy (TA), and area under the curve (AUC).
(e) A comparative analysis with previous studies that used scanned input-based data of different national and regional languages is performed.

This study comprises the following sections: Section 2 provides materials and methods, information regarding the dataset, and its collection procedure. In Section 3, the results are presented in detail. Discussion is provided in Section 4, and finally, conclusions and future work are presented in Section 5.

2. Materials and Methods

2.1. Dataset

The front-end Android-based application proposed in [33] was used to collect the Urdu dataset for this study. Visually impaired users enter Urdu Braille characters on the Android-based touchscreen. The dataset was collected from the National Special Education School Center (NSEC) “Manak Payyan.” The participants were between 12 and 19 years old, and these students were either completely or partially blind. At this level, Urdu Grade 1 Braille data were collected, which included 39 distinct characters.

The final dataset comprises 5637 Urdu Braille characters collected using an android-based touchscreen device. Machine learning techniques are utilized to convert Braille to Urdu text using this dataset. All the ambiguous data were eliminated after checking the values against each alphabet manually.

2.2. Backend Processing Mechanism

An improved backend processing mechanism is proposed in this study. In the earlier work, the dataset consisted of images corresponding to each character.

In the current study, using the position-free interface, the values of the “x” and “y” coordinates of each dot are stored in a database. Braille is composed of six-dot patterns, and each Braille character is represented by the activation and deactivation of these dots. For example, if a Braille character has two active dots, the proposed system will save the values of the “x” and “y” coordinates of the active dots, and the remaining four inactive dots will be assigned a “0” value. Initial data were stored in the form of a comma-separated .txt file. To avoid ambiguity in the dataset, the researcher manually checked the dataset, removed the commas, and stored each extracted dot in a .csv file. Figures 2(a) and 2(b) show a visually impaired user entering Dal “ﺪ” and Toyn “ﻄ” using a touchscreen-based Braille interface.

The algorithm for Braille input coordinate extraction is shown in Figure 3.
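The storage scheme described above can be illustrated with a minimal Python sketch. It is not the application's actual code: the function names, the 12-value row layout (x and y for each of six dots), the order in which taps fill the row, and the leading class-label column are all assumptions made for illustration.

```python
import csv
import io

NUM_DOTS = 6  # a Braille cell has six dot positions


def encode_character(active_dots):
    """Flatten up to six (x, y) touch points into a 12-value row.

    Assumption: taps are stored in the order given, and every
    inactive dot contributes a pair of zeros, as the text describes.
    """
    if len(active_dots) > NUM_DOTS:
        raise ValueError("a Braille cell has at most six dots")
    row = []
    for x, y in active_dots:
        row.extend([x, y])
    # Pad the remaining (inactive) dots with zeros.
    row.extend([0] * (2 * (NUM_DOTS - len(active_dots))))
    return row


def to_csv(samples):
    """Serialize (class_label, active_dots) pairs as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, active_dots in samples:
        writer.writerow([label] + encode_character(active_dots))
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical character of class 8 entered with two taps.
    print(to_csv([(8, [(120, 340), (410, 355)])]), end="")
```

A character entered with two taps therefore yields four nonzero coordinate values followed by eight zeros, which keeps every row the same length regardless of how many dots are active.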

Since only the coordinate locations of active dots are saved in the dataset for each character instead of the whole image, this approach reduces the storage requirements. In previous schemes, a single image saved in the database was 4 to 8 KB in size. For multiple instances, these requirements are multiplied by the number of cases, whereas with the new approach, a text file containing 144 different samples of a single character takes only 9 to 10 KB of space.
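As a rough worked comparison using only the figures quoted above, and assuming one image per sample with 144 samples per character, the image-based scheme would need

$$144 \times 4\ \text{KB} = 576\ \text{KB} \quad \text{to} \quad 144 \times 8\ \text{KB} = 1152\ \text{KB}$$

per character, against 9 to 10 KB for the coordinate file. Under these assumptions the reduction is roughly

$$\frac{576}{10} \approx 58 \quad \text{to} \quad \frac{1152}{9} = 128$$

times per character.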

2.2.1. Feature Extraction

Different feature extraction techniques were applied earlier to Braille to text conversion for other languages. Jha and Parvathi extracted HOG features for Braille to Hindi text conversion [24]. Similarly, Li et al. used traditional feature extraction methods with KNN, Naïve Bayes, etc., for recognizing Braille characters [34]. Moreover, Li et al. used a histogram of oriented gradients (HOG) with SVM to convert Braille characters into English, Sinhala, and Odia [35, 36]. This study extracts RICA-based features and classifies them using DT, SVM, and KNN to convert Braille to Urdu text. This feature extraction method can extract more features than the input dataset has dimensions. The algorithm also allows faster execution of preprocessing steps.

2.2.2. RICA Feature Extraction Method

Reconstruction independent component analysis (RICA) is an unsupervised learning technique, so it does not utilize the class label information. The RICA algorithm was introduced to address the limitations of the ICA algorithm, and it has delivered more promising results than ICA. Many algorithms have been presented in recent years to learn sparse features.

A sparse filter can differentiate many artificial and natural signals, and this feature plays a vital role in different machine learning techniques.

The unlabeled data are given as input $\{y_i\}_{i=1}^{n}$, $y_i \in \mathbb{R}^{m}$.

For calculating independent components, the optimization problem of standard ICA [37] can be defined mathematically as

$$\min_{X} \; \frac{1}{n} \sum_{i=1}^{n} h(X y_i) \quad \text{subject to} \quad X X^{T} = I,$$

where $h(\cdot)$ represents a nonlinear penalty function, $X \in \mathbb{R}^{L \times m}$ is a matrix, L represents the vectors count, and I defines the identity matrix. Additionally, the constraint $X X^{T} = I$ is employed to prevent the vectors in X from being degenerated. For this purpose, a smooth penalty function can be used, i.e., $h(\cdot) = \log(\cosh(\cdot))$ [27].

However, the orthonormality constraint blocks standard independent component analysis from learning an overcomplete basis. Consequently, this defect prevents ICA from scaling to high-dimensional data. Hence, to replace the orthonormality constraint in ICA, a soft reconstruction cost is used in RICA. After this substitution, the following unconstrained problem can be used to represent RICA filtering:

$$\min_{X} \; \frac{1}{n} \sum_{i=1}^{n} \left( \lVert X^{T} X y_i - y_i \rVert_2^{2} + \lambda \, h(X y_i) \right).$$

In the above-stated equation, λ > 0 exhibits the tradeoff between sparsity and reconstruction error. By swapping the orthonormality constraint for the reconstruction cost, RICA can learn sparse representations even on unwhitened data when X is overcomplete. However, the penalty h can yield sparse representations but is not invariant [38]. Therefore, RICA [39] replaced it with an additional L2 pooling penalty, which simultaneously promotes pooling features to cluster correlated features. Moreover, for feature learning, L2 pooling also encourages sparsity. L2 pooling [40, 41] represents a two-layered network; the 1st layer has a square nonlinearity, and the 2nd layer has a square root nonlinearity:

$$h(X y_i) = \sum_{k=1}^{L} \sqrt{\varepsilon + H_k \cdot \big( (X y_i) \odot (X y_i) \big)},$$

where $H \in \mathbb{R}^{L \times L}$ is the pooling matrix, $H_k$ denotes a row of that pooling matrix set to constant weights, i.e., 1 for every element in matrix H, element-wise multiplication is defined by $\odot$, and ε > 0 is a small constant. RICA is a linear method that investigates the sparse representation only in the actual data space. The RICA method is unable to use the association between class label information and training samples.
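To make the RICA cost concrete, the following pure-Python sketch evaluates one common formulation of the objective: a reconstruction term $\lVert X^{T} X y - y \rVert^{2}$ plus λ times an L2 pooling penalty, averaged over the samples. It only evaluates the cost for a given weight matrix; it does not perform the minimization, and it is not the implementation used in this study. The function names and the default values of `lam` and `eps` are illustrative assumptions.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def l2_pooling_penalty(z, H, eps):
    """Square layer followed by a square-root layer, pooled by the rows of H."""
    squared = [value * value for value in z]
    return sum(
        math.sqrt(eps + sum(h * s for h, s in zip(row, squared)))
        for row in H
    )


def rica_objective(X, samples, lam=0.1, eps=1e-8, H=None):
    """Average RICA cost for weight matrix X (L rows, m columns).

    Assumption: when H is not given, every entry of the L x L pooling
    matrix is 1, following the description in the text.
    """
    num_features = len(X)
    if H is None:
        H = [[1.0] * num_features for _ in range(num_features)]
    X_t = [list(column) for column in zip(*X)]  # transpose of X
    total = 0.0
    for y in samples:
        z = matvec(X, y)                  # features X y
        reconstruction = matvec(X_t, z)   # X^T X y
        recon_cost = sum((r - t) ** 2 for r, t in zip(reconstruction, y))
        total += recon_cost + lam * l2_pooling_penalty(z, H, eps)
    return total / len(samples)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # With an orthonormal X the reconstruction term vanishes,
    # so only the pooling penalty contributes.
    print(rica_objective(identity, [[3.0, 4.0]], lam=1.0, eps=0.0))
```

Because the reconstruction term replaces the hard orthonormality constraint, X may have more rows than columns (an overcomplete basis), which is why the number of extracted features can exceed the input dimension.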

2.3. Classification

Classification is the process of assigning samples to classes according to the extracted features. There are different machine learning techniques, such as supervised, unsupervised, reinforcement, ensemble, neural networks, and deep learning [42]. Machine learning approaches such as DT, KNN, and SVM, applied to RICA-based features, are used for character prediction. The data are split 70%–30% for training and validation [43].
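A 70%–30% split of this kind can be sketched as follows. The source does not state whether the split was random or stratified by class, so this sketch assumes a simple seeded random shuffle; the function name and the seed are illustrative.

```python
import random


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and test parts.

    Assumption: a plain random split; no per-class stratification.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


if __name__ == "__main__":
    train, test = train_test_split(list(range(10)))
    print(len(train), len(test))
```

With 39 classes and a fixed number of samples per class, a stratified split would keep the class proportions equal in both parts; the plain shuffle above only approximates that.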

2.3.1. Decision Tree

A DT is a machine learning technique that is used for prediction. DT is popular because it does not require too many computations [44]. DT classifiers have a tree-like structure that divides the dataset into several subsets. This classifier trains the model by applying simple decision rules on training data [45]. The model is then used to forecast the desired values, read the dataset, and categorize them into classes [8].

The following equations can be used to design DT algorithms mathematically.

In this study, train and test data are divided, with a 70%–30% ratio. Training data are used to build a model, and test data are used to check the model’s validity. A multiclass approach is used to predict Braille to Urdu text using DT. The DT is tuned using the default parameters.
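The "simple decision rules" a tree learns are threshold tests on single features. The sketch below picks the best threshold for one feature, assuming Gini impurity as the split criterion; the source does not name the criterion or the default parameters used, so this choice and all function names are assumptions for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best rule 'value <= threshold'.

    Returns (None, impurity_of_all_labels) when no split improves on it.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Hypothetical x-coordinate of one dot slot for two characters.
    print(best_threshold([0, 0, 120, 130], ["a", "a", "b", "b"]))
```

A full tree repeats this search over every feature and recurses on the two resulting subsets until the subsets are pure or a stopping rule applies.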

2.3.2. KNN

KNN is one of the most common and simple nonparametric techniques used for regression and classification in machine learning. The Euclidean distance formula [46] is used to calculate the distance between the samples:

$$d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^{2}},$$

where a and b represent the samples, $a_i$ and $b_i$ are the ith feature dimensions of the samples, and n represents the total number of feature dimensions.

The number of nearest neighbors determines the output value when using KNN. If K = 1, the object is simply assigned to the class of its single nearest neighbor [45].

Here, KNN is used to classify Braille to Urdu text. K = 3 is selected, with Euclidean distance as the distance metric and equal distance weights.
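The configuration described above (K = 3, Euclidean distance, equal weights) corresponds to a plain majority vote among the three closest training samples. A minimal pure-Python sketch follows; the function names and the toy data are illustrative and not taken from the study.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by an equal-weight vote of the k nearest samples."""
    distances = sorted(
        (euclidean(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train_x = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11]]
    train_y = [1, 1, 1, 2, 2]
    print(knn_predict(train_x, train_y, [0.2, 0.2]))
```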

2.3.3. SVM

For pattern and character recognition, SVM is one of the most well-known machine learning classification techniques. SVM is a supervised machine learning technique used in biomedical image processing, computer vision, speech recognition, etc. [47]. SVM builds a hyperplane in a high-dimensional space to obtain a better classification. If the achieved hyperplane has the highest functional margin, the classifier will give good performance [48]. The greater the margin, the lower the risk of generalization error. SVM finds the hyperplane that provides the largest minimum distance to the training data, and it can therefore produce more generalized outcomes. SVM is a two-class classifier that separates the data with a hyperplane in a high-dimensional space.

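Let us consider a hyperplane $x \cdot w + b = 0$, where $w$ is its normal.

Linearly separable data are represented as follows:

$$\{x_i, y_i\}, \quad x_i \in \mathbb{R}^{d}, \quad y_i \in \{-1, +1\}, \quad i = 1, 2, \ldots, N,$$

where $y_i$ is the twofold class label.

The maximum margin is achieved by minimizing the objective function $E = \lVert w \rVert^{2}$, which gives

$$\min_{w, b} \; \lVert w \rVert^{2} \quad \text{subject to} \quad x_i \cdot w + b \ge +1 \ \text{for } y_i = +1, \quad x_i \cdot w + b \le -1 \ \text{for } y_i = -1.$$

By combining the two constraints above into a single inequality, we now have

$$y_i (x_i \cdot w + b) \ge 1 \quad \text{for all } i.$$

If data cannot be linearly separated, then a slack variable $\xi_i$ is used to identify misclassifications.

Thus, in this scenario, an objective function is defined as

$$E = \frac{1}{2} \lVert w \rVert^{2} + C \sum_{i} L(\xi_i)$$

subject to

$$y_i (x_i \cdot w + b) \ge 1 - \xi_i, \quad \xi_i \ge 0 \quad \text{for all } i.$$

Here, C and L represent the hyperparameter and the cost function, respectively. Cost functions are used to detect outliers. The dual formulation with $L(\xi_i) = \xi_i$ is

$$\alpha^{*} = \arg\max_{\alpha} \left( \sum_{i} \alpha_i - \frac{1}{2} \sum_{i, j} \alpha_i \alpha_j y_i y_j \, x_i \cdot x_j \right)$$

subject to

$$0 \le \alpha_i \le C, \quad \sum_{i} \alpha_i y_i = 0,$$

where $\alpha_i$ are the Lagrange multipliers.

The kernel trick is used to handle nonlinearly separable data [49]. A nonlinear mapping function transforms the input space into a higher dimensional feature space. Polynomial, Gaussian, and radial basis functions are the most popular kernels.

SVM polynomial kernel, where n is the polynomial degree:

$$K(x_i, x_j) = (x_i \cdot x_j + 1)^{n}$$

SVM Gaussian (radial basis function) kernel, where σ is the kernel width:

$$K(x_i, x_j) = \exp\left( -\frac{\lVert x_i - x_j \rVert^{2}}{2 \sigma^{2}} \right)$$

SVM fine Gaussian kernel (the same Gaussian form with a smaller kernel width σ):

$$K(x_i, x_j) = \exp\left( -\frac{\lVert x_i - x_j \rVert^{2}}{2 \sigma^{2}} \right)$$

The dual formulation of the nonlinear case is

$$\alpha^{*} = \arg\max_{\alpha} \left( \sum_{i} \alpha_i - \frac{1}{2} \sum_{i, j} \alpha_i \alpha_j y_i y_j \, K(x_i, x_j) \right)$$

subject to

$$0 \le \alpha_i \le C, \quad \sum_{i} \alpha_i y_i = 0.$$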
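The kernels named above can be sketched in a few lines of Python, using the standard textbook forms: a polynomial kernel $(x_i \cdot x_j + 1)^{n}$ and a Gaussian kernel $\exp(-\lVert x_i - x_j \rVert^{2} / 2\sigma^{2})$. The degree and `sigma` defaults below are illustrative assumptions; the study reports using default parameters without listing their values.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def polynomial_kernel(a, b, degree=2):
    """Polynomial kernel (a . b + 1) ** degree."""
    return (dot(a, b) + 1) ** degree


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||a - b||^2 / (2 sigma^2))."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def gram_matrix(samples, kernel):
    """Kernel values K(x_i, x_j) for every pair of samples."""
    return [[kernel(xi, xj) for xj in samples] for xi in samples]


if __name__ == "__main__":
    samples = [[1, 2], [3, 4]]
    print(gram_matrix(samples, polynomial_kernel))
    print(gaussian_kernel([0, 0], [1, 1]))
```

The Gram matrix is what the dual problem consumes: the inner products between samples are replaced by kernel values, so the feature mapping itself never has to be computed. A smaller `sigma` makes the Gaussian kernel more local and the decision boundary more flexible.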

Grid search is a widely used method for selecting SVM parameters. Optimal parameters are carefully selected by setting the grid range and step size. Only one parameter, “c,” a soft margin constant, is used in a linear kernel, whereas the SVM Gaussian kernel and SVM fine Gaussian kernel contain two training parameters, cost “c” and sigma, which can be used to control the degree of nonlinearity. The RICA feature extraction method was employed in the study with 70% of the data for training and 30% for testing. In this study, a polynomial kernel with default parameters is used.

2.4. Performance Evaluation Metrics

True positive rate (TPR), true negative rate (TNR), positive predictive value (PPV), negative predictive value (NPV), false positive rate (FPR), and total accuracy (TA) are used to evaluate Urdu Braille character prediction.

2.4.1. True Positive Rate

TPR is also called sensitivity or recall. TPR indicates the proportion of actual positive characters that are correctly classified. Mathematically, it is written as

$$\mathrm{TPR} = \frac{TP}{TP + FN},$$

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.

2.4.2. True Negative Rate

TNR is also called specificity. This metric defines the proportion of actual negatives that are correctly identified. It can be expressed mathematically as

$$\mathrm{TNR} = \frac{TN}{TN + FP}.$$

2.4.3. Positive Predictive Value

PPV is the proportion of positive predictions that are true positives. Mathematically, it can be presented as

$$\mathrm{PPV} = \frac{TP}{TP + FP}.$$

2.4.4. Negative Predictive Value

NPV is the proportion of negative predictions for which the subject is actually negative. The mathematical representation is given as follows:

$$\mathrm{NPV} = \frac{TN}{TN + FN}.$$

2.4.5. Total Accuracy

Total accuracy is defined by adding all true positives and all true negatives and dividing the sum by all true negatives, false positives, true positives, and false negatives:

$$\mathrm{TA} = \frac{TP + TN}{TP + FP + TN + FN}.$$
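For a multiclass problem, these metrics are typically computed per class in a one-vs-rest fashion: the class of interest is treated as positive and all other classes as negative. The sketch below assumes that convention (the source does not spell it out); the function name and the toy labels are illustrative.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Compute TPR, TNR, PPV, NPV, FPR, and TA for one class against the rest."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive and truth == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "TPR": ratio(tp, tp + fn),
        "TNR": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "FPR": ratio(fp, fp + tn),
        "TA": ratio(tp + tn, tp + fp + tn + fn),
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
    y_pred = [1, 1, 2, 2, 2, 3, 3, 3, 3, 1]
    print(one_vs_rest_metrics(y_true, y_pred, positive=1))
```

Under this convention, true negatives dominate the counts when there are many classes, so TA and TNR for a single class can be very high even when its TPR is noticeably lower.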

2.4.6. ROC

The receiver operating characteristic (ROC) curve summarizes how well true positives are separated from false positives. It is obtained by plotting the true positive rate against the false positive rate: FPR is plotted along the x-axis, whereas TPR is plotted along the y-axis. The area under the curve (AUC) value lies between 0 and 1. A value greater than 0.5 indicates separation. In this study, Urdu Braille characters that are predicted as true when they belong to the true class have AUC values of 1 or approaching 1.
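The AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties counted as one half). The sketch below uses that pairwise definition; the function name and the toy scores are illustrative, and the study's own AUC computation may differ.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for the positive class and 0 for the negative class.
    """
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
    print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

An AUC of 1 means every positive sample is scored above every negative one, and 0.5 corresponds to chance-level ranking.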

3. Results

Urdu Braille characters are predicted from a newly collected dataset from touchscreen devices. The performance is computed using RICA feature extraction methods and machine learning algorithms such as DT, KNN, and SVM. TPR, TNR, PPV, NPV, TA, FPR, and AUC were the performance metrics employed in the evaluation.

Figures 4(a)–4(c) show AUC values using DT for category-1 (class 1–class 13), Alif-Zaal (ﺫ - ﺍ), category-2 (class 14–class 26), Ray-Fay (ﻒ - ﺮ), and category-3 (class 27–class 39), Qaaf-Bri Yay (ے - ﻕ), respectively. By extracting RICA features, the highest performance attained from category-1 (class 1–class 13), Alif-Zaal (ﺫ - ﺍ), is for Braille class 6 with 99.95% TA and an AUC value of 0.9979, as shown in Figure 4(a). This was followed by other classes such as Alif, Bay, Hay, and Khay (ﺍ, ﺏ, ﺡ, ﺥ) yielding accuracies of (99.90%), with AUC (0.9995, 0.9995, 0.9914, and 0.9895), respectively. Other classes such as Chay and Zaal (ذ ،چ) also achieved high accuracies of 99.85% and 97.76% with AUC (0.9884 and 0.9887, respectively). Other performance measures, such as TPR and TNR, yield performance (TPR >94%) and (TNR >98%). For category-2 (14–26), Ray-Fay (ﻒ - ﺮ), the highest performance is shown for class 18, 21, 14, and 15, i.e., Seen, Zuuad, Ray, Rray, Fay, Toyn, and Ghaen (ﺲ, ﺾ, ﺮ, ڑ, ﻒ, ﻃ, ﻍ) with accuracies of (99.95%, 99.95%, 99.85%, 99.85%, 99.80%, 99.64%, and 99.64%), with AUC (0.9899, 0.9997, 0.9992, 0.9495, 0.9615, 0.9495, and 0.9683, respectively) with TPR >90% and TNR >99%, as shown in Figure 4(b). For category-3 (27–39), Qaaf-Bri Yay (ے - ﻕ), maximum TA of 100% is achieved for class 35 (Gol Hy), 37 (Hamza), and 28 (Kaaf) (ﻩ, ء, ﻙ) with AUC values of (1, 1, and 0.9995). These results are followed by Gaaf, Meem, and Hy (گ, ﻡ, ﻫ) with TA of (99.75%, 99.85%, and 99.75%), AUC values of (0.0732, 0.8125, and 0.9553), TPR >94%, and TNR >99%, except for class 31, for which TPR = 62.50%, as shown in Figure 4(c). Amongst all, the highest separation (AUC = 1) was seen for the Urdu Braille characters Gol Hy and Hamza (ﻩ, ء). AUC values of other Urdu Braille characters such as Alif, Bay, Pay, and Tay (ﺍ, ﺐ, پ, and ﺖ) are above 0.99, indicating good classification. Detailed results are presented in Table 1.

The maximum accuracy obtained by using KNN for category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), is for class 2, 4, 5, and 13, i.e., Bay, Tay, Ttay, and Zaal (ﺏ, ﺖ, ٹ, ﺬ) with TA (99.95%, 99.85%, 99.80%, and 99.75%) and AUC (0.9997, 0.9992, 0.9990, and 0.9896, respectively), as shown in Figure 5(a). For category-2 (14–26), (Ray-Fay) (ﻒ - ﺮ), the highest TA of 100% is achieved for class 15, 18, and 23, followed by 26, 21, 14, and 19, i.e., Rray, Seen, and Zoyn (ڑ, ﺲ, ﻅ). Fay and Zuad (ﻒ, ﺾ) achieved a TA of 99.95%, while Ray and Sheen (ﺮ, ﺵ) had TAs of 99.90% and 99.75%, respectively, with AUC (1, 1, 1, 0.9884, 0.9922, 0.9995, and 0.9800) with TPR >96% and TNR >99%, as shown in Figure 5(b). Similarly, the highest TA achieved for category-3 (27–39), (Qaaf-Bri Yeh) (ے - ﻕ), is for class 35, 37, 31, 30, 27, and 28, i.e., Gol Hy, Hamza, Meem, Laam, Qaaf, and Kaaf (ﻩ, ء, ﻡ, ﻞ, ﻕ, ﻙ) (99.75%, 99.80%, 100%, 100%, 99.69%, and 99.69%), with AUC (1, 1, 0.7143, 0.9457, 0.9873, and 0.9578, respectively) along with TPR >42% for Meem (ﻡ), TPR = 100% for Gol Hy (ﻩ) and Hamza (ء), and TNR = 100% for Gol Hy, Hamza, and Meem (ﻩ, ء, ﻡ), as shown in Figure 5(c). Overall findings indicate that Braille characters such as ڑ, ﺲ, ﻅ, ﻩ, and ء obtained the highest AUC value of 1, which shows 100% separation among all classes.

We also achieved AUC above 0.99 for several Urdu Braille characters, such as ﺍ, ﺏ, پ, and ﺖ, by using KNN on the extracted feature set. KNN also exhibits promising results when RICA features are used to recognize Braille characters entered on a touchscreen device. Table 2 shows detailed DT results for all Grade-1 Urdu Braille characters.

Furthermore, SVM was used to measure the performance. Category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), shows the highest accuracies for class 10, 6, 9, 8, 3, 4, and 7, i.e., Jeem, Tay, Pay, Chy, Hay, Ssay, and Khay (ﺥ,ﺙ,ﺡ,چ,پ,ﺖ,ﺝ) with TA (100%, 99.95%, 99.95%, 99.90%, 99.60%, 99.60%, and 99.80%), with AUC (1, 0.9997, 0.9997, 0.9995, 0.9995, 0.9995, and 0.9990, respectively) with TPR = 100% and TNR >99%, as shown in Figure 6(a). For category-2 (14–26), (Ray-Fay) (ﻒ - ﺮ), the highest performance was achieved for class 26, 15, 23, 25, 14, 19, and 21, i.e., Ghaen, Ray, Fay, Rray, Zoyn, Sheen, and Zuad (ﻍ, ﺮ, ﻒ, ڑ, ﻅ, ﺵ, ﺾ) with TA (100%, 99.95%, 99.90%, 99.90%, 99.85%, 99.85%, and 99.85%) and AUC (1, 0.9444, 0.9828, 0.9905, 0.9992, 0.9900, and 0.9776, respectively), as shown in Figure 6(b). Similarly, for category-3, (Qaaf-Bri Yeh) (ے - ﻕ), the highest TA is achieved for class 34, 35, 36, 37, 38, 28, 30, and 33; Waao, Hy, Gol Ha, Hamza, and Choti Yeh (ﻮ, ﻫ, ﻩ, ء, ﻯ) all achieved the highest TA of (100%), with AUC (1), TPR = 100%, and TNR = 100%.

However, Kaaf, Laam, and Noon Gunnah (ﻙ, ﻞ, ں) obtained 99.95% TA; the AUC is 0.9997 for Kaaf and Laam and 0.9889 for Noon Gunnah, with TPR 97% and TNR >99%, as shown in Figure 6(c). SVM outperforms the other classifiers mentioned above in terms of AUC value. Braille characters such as ﺥ, ﻒ, ﻮ, ﻩ, ﻫ, ء, and ﻯ obtained the highest AUC value of 1. SVM shows 99% separation for ﺍ, ﺏ, پ, ﺖ, ٹ, ﺙ, چ, ﺫ, ﺮ, ﺰ, ﻕ, ﻙ, and ﻞ. Figures 7(a)–7(c) illustrate overall performance on all Urdu alphabets from Alif to Bri Yay by measuring TPR, TNR, PPV, NPV, FPR, TA, and AUC.

Shokat et al. [33] reported that Naive Bayes, DT, KNN, SVM, the sequential model, and the GoogLeNet inception model have TA of (96.38%), (97.20%), (97.04%), (83.00%), (92.21%), and (95.8%), respectively. For Naive Bayes, DT, and KNN, TPR and PPV showed no significant value. Better results were seen with SVM, the sequential model, and the GoogLeNet inception model. In the current study, the maximum performance with the lowest error rate was achieved by SVM with the RICA-based feature extraction method, with TPR (93.96%), TNR (99.85%), PPV (94.51%), NPV (99.87%), TA (99.73%), and FPR (0.14%). After SVM, DT showed the second best performance, with TPR (90.98%), TNR (99.78%), PPV (92.21%), NPV (99.78%), TA (99.57%), and FPR (0.21%). Finally, the performance achieved using KNN is TPR (90.2%), TNR (99.72%), PPV (89.75%), NPV (99.77%), TA (99.5%), and FPR (0.28%). Table 3 compares the findings of Sana et al. with the current study [33]. With the improved backend processing system for dataset collection, more promising results have been achieved in both total accuracy and storage space used.

4. Discussion

To improve the quality of life of visually challenged persons, Braille must be converted to natural language. Braille to natural language conversion has been done in many studies. Most of the studies translated scanned Braille documents into standard English or the other way around. Jha and Parvathi conducted a study that used an SVM classifier trained on features extracted with the histogram of oriented gradients (HOG) method to translate handwritten Odia and Hindi text into Braille. Using this classification technique, accuracies of 99% and 94.5% were achieved for converting Odia [36] and Hindi [25] text into Braille, respectively. Using the same technique, 99% and 80% accuracies were achieved for converting handwritten English and Sinhala documents into Braille text [35]. An SVM classifier trained on HAAR features was used to convert handwritten English sheets, with a classification error of less than 10% [50]. English to Braille conversion was performed by taking input from a gesture-based touchscreen and using KNN for classification. The distance between two dots was computed using Bayesian touch distance, which yielded 97.4% accuracy [51]. Another study recognized scanned Braille characters using KNN, Naïve Bayes, random forest, and SVM, achieving accuracies of 63%, 53%, 65%, and 69.6%, respectively [34], as shown in Table 4.

For Braille to Urdu character recognition, RICA-based features are extracted, and robust machine learning algorithms such as DT, SVM, and KNN are used. A new Urdu Braille dataset was collected from visually impaired students using a newly built touchscreen-based position-free Braille input application. As the main finding for the DT algorithm with category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), the highest detection performance was obtained for Braille class 6 (ﺙ) with TA (99.95%) and AUC (0.9979). These results were followed by other classes such as (ﺍ, ﺏ, ﺡ, ﺥ) yielding accuracies of (99.90%), with AUC (0.9995, 0.9995, 0.9914, and 0.9895), respectively. Other performance indicators, such as TPR and TNR, show that performance (TPR >94%) and (TNR >98%) is achieved. Similarly, category-2 (14–26), Ray to Fay (ﻒ - ﺮ), for DT shows the best results for class 18, 21, 14, and 15, i.e., Seen, Zuuad, Ray, and Rray (ﺲ, ﺾ, ﺮ, ڑ) with accuracies of 99.95%, 99.95%, 99.85%, and 99.85%, with AUC (0.9899, 0.9997, 0.9992, and 0.9495, respectively) with TPR = 100% and TNR >99%. For category-3 (27–39), Qaaf-Bri Yeh (ے - ﻕ), the highest TA is achieved for class 35, 37, and 28, i.e., Gol Hy and Hamza (ﻩ, ء) with (100%) TA. The TA of Kaaf is 99.90%, with AUC (1, 1, and 0.9995), respectively, with TPR >94% and TNR >99%. By employing KNN, the highest TA achieved for category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), is for class 13, 2, and 4, i.e., Bay and Tay (ﺐ, ﺖ) (99.85% and 99.85%), with AUC (0.9997 and 0.9992, respectively). For category-2 (14–26), (Ry-Fy) (ﻒ - ﺮ), the highest TA is achieved for class 15, 18, 23, 26, and 21, i.e., Rray, Seen, and Zoyn (ڑ, ﺲ, ﻅ) with TA = 100%, followed by Fay and Zuad (ﻒ, ﺾ) with TA (99.95%), with AUC (1, 1, 1, 0.9884, and 0.9922) with TPR = 100% and TNR >99%.
Similarly, for category-3 (27–39) (Qaaf-Bri Yeh) (ے - ﻕ), the highest TA was achieved for class 35, 37, and 31, i.e., Gol Ha, Hamza, and Meem (ﻩ, ء, ﻡ) with TA (100%, 100%, and 99.80%), with AUC (1, 1, and 0.7143) with TPR >42% for Meem (ﻡ), TPR = 100% for Gol Hy (ﻩ) and Hamza (ء), and TNR = 100% for Hay, Hamza, and Meem (ﻩ, ء, ﻡ). Moreover, performance evaluation for SVM in category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), shows the highest accuracies for class 10, 6, 9, and 8, i.e., Khay, Ssay, Hay, and Chy (ﺥ, ﺙ, ﺡ, چ) with TA (100%, 99.95%, 99.95%, and 99.90%), with AUC (1, 0.9997, 0.9997, and 0.9995, respectively) with TPR = 100% and TNR >99%. For category-2 (14–26) (Ray-Fay) (ﻒ - ﺮ), the highest performance was achieved for class 26, 15, 23, and 25, i.e., Fay, Rray, Zoyn, and Ghaen (ﻒ, ڑ, ﻅ, ﻍ) with TA (100%, 99.95%, 99.90%, and 99.90%) and AUC (1, 0.9444, 0.9828, and 0.9905). Similarly, for category-3 (Qaaf-Bri Yeh) (ے - ﻕ), the highest TA is achieved for class 34, 35, 36, 37, and 38, i.e., Waao, Hy, Gol Hay, Hamza, and Choti Yeh (ﻮ, ﻫ, ﻩ, ء, ﻯ), which all achieved the highest TA of (100%), with AUC (1), TPR = 100%, and TNR = 100%.

5. Conclusions and Future Work

Braille is a growing means of communication for people with visual impairments. More than 150 million people continue to use Braille around the world for several reasons. Literacy is one of the strongest cases for Braille’s importance in learning how to read and write. With the advent of technology, Braille is more accessible to visually impaired users. Various studies have been carried out to convert Braille into a natural language. However, most studies have used handwritten scanned sheets to translate Braille into natural languages such as Arabic, Hindi, Tamil, Odia, Chinese, Korean, Bangla, English, and Gujarati, and vice versa [21, 28, 36, 52–57]. To our knowledge, no such study has been found so far for Urdu Braille recognition. In this research, the Urdu Braille dataset is collected using a touchscreen-based Android application [33]. This application is easy to use and less burdensome for visually impaired people. Robust machine learning techniques, namely SVM with a polynomial kernel, DT with default parameters, and KNN with K = 3, are used for Urdu Braille character recognition. The dataset was divided 70%–30% for training and validation. The performance metrics used to evaluate these classifiers are TPR, TNR, PPV, NPV, FPR, TA, and AUC. The RICA-based feature extraction method is used for Urdu Braille character recognition. The highest performance using the DT classifier shows that class 35, Gol Hy (ﻩ), and class 37, Hamza (ء), achieved a maximum separation AUC (1) with TA, TPR, TNR = 100%, and FPR = 0%. Similarly, the highest performance using KNN shows that class 15, 18, 23, 35, and 37, i.e., Rray, Seen, Zoyn, Gol Hy, and Hamza (ڑ, ﺲ, ﻅ, ﻩ, ء) attained the highest separation AUC (1) with TPR, TNR, TA = 100%, and FPR = 0%.
Furthermore, using SVM, the highest performance was achieved by class 10, 26, 34, 35, 36, 37, and 38, i.e., Khay, Fay, Waao, Hy, Gol Hy, Hamza, and Choti Yeh (ﺥ, ﻒ, ﻮ, ﻫ, ﻩ, ء, ﻯ) with the highest separation AUC (1) with TPR, TNR, TA = 100%, and FPR = 0%. SVM has the highest TA of all classifiers, with TA (99.73%), TPR (93.96%), TNR (99.85%), PPV (94.51%), NPV (99.87%), and FPR (0.14%).

RICA features are extracted, and robust machine learning techniques are used to recognize Urdu Braille characters. We intend to expand this dataset in the future to include other languages such as English and maths with advanced Braille levels. The performance will then be evaluated using convolutional neural network (CNN) and transfer learning techniques such as the GoogLeNet inception model and LSTM. We expect that implementing these models will further improve the performance of the system. Along with this, we will focus on providing error detection and voice feedback services for visually impaired users.

Data Availability

This dataset is not publicly available, but it can be provided on request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


This research work was supported by the National Research Foundation of Korea (NRF) grants funded by the Korean government under reference number (2020R1A2C1012196).