Abstract

Graphology has increasingly moved towards computerization. The fundamental problem in automated graphology is how to determine personality traits from digital handwriting using the principles of graphology. Although various models and approaches have been developed in research on automated graphology, obstacles remain, such as selecting the preprocessing techniques and image processing algorithms used to extract handwriting features and choosing classification techniques that maximize accuracy. Therefore, this study aims to design a reliable framework that uses image processing steps such as filtering, thresholding, and normalization together with machine learning to determine personality traits from handwriting features. The handwriting features are then classified according to the Big Five model. Experiments using the decision tree, SVM (RBF kernel), and KNN produced an accuracy above 99%. These results indicate that the proposed framework can be applied to predict Big Five personality traits from handwriting analysis features.

1. Introduction

Handwriting is a means of communication between humans, and it expresses the ideas formed in the human brain. Handwriting generally has a unique pattern, much like human fingerprints. This is the fundamental reason why handwriting can be analysed to determine human behaviour and personality. Handwriting analysis can be used as a means of introspection to find out the strengths and weaknesses of a person. The study of human personality through handwriting is called handwriting analysis, better known as graphology. Graphology identifies and predicts human personality by finding patterns in the handwriting that provide essential information about the writer’s mental, physical, and emotional state and behaviour.

Graphology has moved towards computerization and has become a separate field of research. The fundamental problem in computerized graphology is how to determine human personality from digital handwriting using the principles of graphology. The first research on computerized graphology, called computer-aided graphology, used the principles of pattern recognition and consisted of three main stages, namely, preprocessing, feature extraction, and classification [1]. These three stages have since become an integral part of any computerized graphology system. The field then developed rapidly into a research area of its own for determining human personality through handwriting.

The Five Factor Model (FFM) of personality is a set of five broad personality trait dimensions, often referred to as the “Big Five Model,” which consists of openness to experience, conscientiousness, extraversion, agreeableness, and neuroticism [2]. The Big Five model has been consistently applied to career guidance and job performance [3], the analysis of financial behaviour [4], employee recruitment [5], and marital relations [6]. The study in [7] found that the Big Five model performs better than other psychometric instruments such as the MBTI.

Manual classification of personality traits based on handwriting analysis by a graphologist is time-consuming and costly. Machine learning involves the use and development of computer systems that are able to learn and adapt without following explicit instructions by using algorithms and statistical models to analyse and draw inferences from patterns in data [8]. Several studies apply personality measurement techniques based on mapping and combining several handwriting features. Handwriting analysis features such as baseline, slope, pen pressure, connecting stroke, letter “t,” letter “f,” and spacing between lines are combined to determine human personality and behaviour based on the five-factor model [9]. In another study, the FFM was used to determine personality traits from several features such as baseline, letter “t,” line spacing, word spacing, and pen pressure, which were classified using the PersonaNet algorithm based on the CNN model [10]. Other measurement techniques such as the Myers–Briggs Type Indicator (MBTI) are also used to determine the personality traits of a person in combination with classification techniques such as ANN, SVM, template matching, and KNN [11, 12]. In addition, the Enneagram model, another psychological measurement, has been combined with the C-mean technique to group personalities into nine types, namely, the reformer, helper, achiever, individualist, investigator, loyalist, enthusiast, challenger, and peacemaker [13].

In this study, we present a model that classifies personality traits from handwriting according to the Big Five model using image processing and machine learning approaches. The model architecture starts with the preprocessing stage, which includes noise removal, thresholding, segmentation, and normalization. At the feature extraction stage, features such as baseline, top margin, line spacing, word spacing, letter size, slant, and pen pressure are extracted with image processing techniques from the OpenCV library [14]. The classification stage then groups human personality based on the extracted handwriting features. This stage consists of three steps: the first step is to determine the decision rules for each class based on the features of handwriting analysis, the second step is to map the features for psychological identification by applying the Big Five personality psychology method, and the third step is to classify the Big Five personality from the handwriting images with a machine learning approach based on the psychological identification mapping.

The main contributions of this paper are as follows:
(1) We propose a framework to determine Big Five personality traits from handwriting images using machine learning classification.
(2) The experiments show that the proposed framework is very effective and outperforms state-of-the-art classification methods for determining the Big Five personality traits from handwriting images.

The organization of this paper is as follows: Section 2 provides the materials and methods, Section 3 provides the related works, Section 4 describes the methodology, Section 5 gives the results, Section 6 describes the discussion, and Section 7 gives the conclusion.

2. Materials and Methods

In this study, we use a public handwriting database from the IAM handwriting database [15]. It contains English handwriting text forms that can be used to train and test handwriting text recognition and perform author identification and verification experiments. The database contains unlined handwriting text forms, which were scanned at a resolution of 300 dpi and saved as a 256 grey-level PNG image format. The IAM handwriting database consists of 657 participants who have contributed to creating the database, 1539 handwriting text pages, 5685 labelled sentences, 13353 labelled text lines, and 115320 labelled words.

3. Related Works

Many researchers have published papers on handwriting analysis classification. Table 1 presents a brief overview of the authors’ contributions to automated handwriting analysis.

As Table 1 shows, existing studies still lack a well-performing framework for handwriting analysis, as indicated by reported accuracies that remain below 90% [4, 9, 10, 12, 19, 21]. Joshi et al. [18] developed a classification framework based on the support vector machine (SVM) that achieved 97% classification accuracy. The template-matching technique can be useful for extracting individual letters, but it needs a larger template database to give better results, and a larger template database naturally takes more time to train [9, 18]. Deep learning architectures show impressive results [20, 22]. Pathak et al. [22] developed a deep neural network architecture that obtained 97.7% accuracy. The disadvantages of this technique are that it requires more computational resources and is prone to overfitting [24].

Related to those studies described above, this study aims to build a framework for predicting personality traits based on the Big Five personality model in terms of graphology using machine learning approaches. This research is expected to be an alternative in terms of assessing a human personality through handwriting.

In the next section, we discuss the theoretical models of each part of the proposed framework.

4. Methodology

As mentioned in the previous section, our research aims to build a framework for predicting personality traits based on the Big Five personality model in terms of graphology using machine learning approaches. Figure 1 shows the framework of our proposed research. An explanation of each process is described in the following subsections.

4.1. Preparing the Dataset

The system design begins by cropping the handwriting image from the IAM database [15]. The image cropping process is intended to remove unnecessary parts from the image in the feature extraction process. Each cropped image is stored in the PNG format with the entire image width measuring 850 pixels and the image height adjusting to the existing handwriting text content. Figure 2 describes a handwriting image from the IAM database before and after the cropping process.

4.2. Preprocessing

Some noise introduced during scanning is still present in the handwriting image. This noise must be removed to obtain reliable feature extraction. In this study, bilateral filtering from the OpenCV library is used for this purpose [25, 26]. After filtering, the next step is to binarize the handwriting image using the thresholding technique in the OpenCV library [27]. The thresholding technique was selected because two colour intensities (ink and paper) dominate the handwriting image. The third preprocessing stage normalizes the handwriting image using dilation, contour, and affine transformation techniques, also from the OpenCV library [28]. This stage aims to separate each text line and each word, which are later used to measure the spacing between lines and between words.
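The text does not name the thresholding variant. As a minimal sketch, assuming Otsu's method (a common choice when two intensities dominate, and an assumption here rather than the confirmed method), the threshold can be chosen by maximising the between-class variance of the grey-level histogram. The sketch is written in plain Python so that it runs without OpenCV; the function names and the toy pixel values are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the grey level t (0-255) that maximises between-class variance.

    Assumption: Otsu's method; the paper only says "thresholding".
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_dark, sum_dark = 0, 0  # pixels with value <= t, and the sum of their levels
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_bright = total - n_dark
        if n_dark == 0 or n_bright == 0:
            continue  # one class is empty, the variance is undefined
        mean_dark = sum_dark / n_dark
        mean_bright = (sum_all - sum_dark) / n_bright
        between = n_dark * n_bright * (mean_dark - mean_bright) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, t):
    """Map dark ink (<= t) to 255 and bright paper (> t) to 0."""
    return [255 if p <= t else 0 for p in pixels]


# Toy "page": 20 dark ink pixels and 80 bright paper pixels (hypothetical values).
page = [30, 35, 40, 32] * 5 + [220, 225, 230, 228] * 20
t = otsu_threshold(page)
binary = binarize(page, t)
```

Mapping ink to 255 and paper to 0 is a design choice that makes later steps such as dilation and contour detection treat the text as foreground.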

4.3. Extraction of Handwriting Analysis Features

After preprocessing, the handwriting analysis features are extracted from the handwriting samples. Based on [29], the features used are baseline, top margin, line spacing, word spacing, letter size, slant, and pen pressure. All features are extracted using the OpenCV library.

4.3.1. Baseline

The baseline feature of handwriting is an invisible line on which the bottom of the middle zone letters aligns [29]. To determine the classification of the baseline angle value, if the baseline angle is positive, then it is categorized as descending (baseline > 0°), and if the baseline angle is negative, then it is categorized as ascending (baseline < 0°). Table 2 gives the details of the baseline feature and its characteristics.
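The sign convention above is consistent with image coordinates, in which y grows downwards, so a line that drops across the page has a positive slope. The following sketch illustrates this under stated assumptions: the baseline is estimated by a least-squares fit through the bottom points of the middle-zone letters, and an angle of exactly 0° is labelled "level". Both the fitting step and the "level" label are illustrative assumptions, not details taken from the text.

```python
import math


def baseline_angle(points):
    """Angle in degrees of the least-squares line through (x, y) image points.

    y grows downwards, as in image coordinates.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))


def classify_baseline(angle_deg):
    """Apply the rule from the text: positive is descending, negative is ascending."""
    if angle_deg > 0:
        return "descending"
    if angle_deg < 0:
        return "ascending"
    return "level"  # assumption: the text does not name the zero-angle case


# Hypothetical bottom points of a text line that rises towards the right
# (y decreases in image coordinates).
rising_line = [(0, 100), (100, 95), (200, 90)]
angle = baseline_angle(rising_line)
category = classify_baseline(angle)
```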

4.3.2. Letter Size

Letter size is determined by calculating all the text lines in the middle zone. The average letter size of all lines will be the letter size value. The size of the middle zone estimates the letter size without considering upper and lower zones. To determine the letter-size classification of the handwriting sample, the middle zone portion of the line of the text is calculated. The letter size in the normal category is about 1/8 inch (3.175 mm) [29]. The letter size that is more than 1/8 inch is categorized as larger than normal and less than 1/8 inch is categorized as smaller than normal size. Table 3 gives the details of the letter size feature and its characteristics.

4.3.3. Line Spacing

The amount of space in each line of the text is said to be line spacing [29]. To determine the classification of line spacing, the normal spacing is around 2-3x the size of the letter (in the middle zone, excluding the upper and lower zones). Line spacing less than 2x the letter size is categorized as narrow line spacing, while line spacing more than 3x the letter size is categorized as wide line spacing. Table 4 gives the details of the line spacing feature and its characteristics.

4.3.4. Word Spacing

The amount of space between the words of the text is called word spacing [29]. To classify word spacing, the normal spacing is 1x the size of the letter (in the middle zone, excluding the upper and lower zones). Word spacing less than 1x the letter size is categorized as narrow word spacing, while word spacing more than 1x the letter size is categorized as wide word spacing. Table 5 gives the details of the word spacing feature and its characteristics.

4.3.5. Top Margin

The top margin is classified in the same way as line spacing: the normal top margin is 2x the size of the letter (in the middle zone, excluding the upper and lower zones) [29]. A top margin less than 2x the letter size is categorized as a narrow top margin, while a top margin more than 2x the letter size is categorized as a wide top margin. Table 6 gives the details of the top margin feature and its characteristics.
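The line spacing, word spacing, and top margin rules share one pattern: a measured distance is compared with a multiple of the letter size. A minimal sketch of these decision rules follows; the function name and the sample pixel measurements are hypothetical, and only the multipliers (2x to 3x, 1x, and 2x) come from the text.

```python
def classify_relative(distance, letter_size, low, high):
    """Classify a distance by its ratio to the middle-zone letter size.

    A ratio below `low` is narrow, above `high` is wide, otherwise normal.
    """
    ratio = distance / letter_size
    if ratio < low:
        return "narrow"
    if ratio > high:
        return "wide"
    return "normal"


def classify_line_spacing(gap, letter_size):
    return classify_relative(gap, letter_size, low=2, high=3)


def classify_word_spacing(gap, letter_size):
    return classify_relative(gap, letter_size, low=1, high=1)


def classify_top_margin(margin, letter_size):
    return classify_relative(margin, letter_size, low=2, high=2)


# Hypothetical measurements in pixels for a letter size of 20 px.
line_label = classify_line_spacing(50, 20)    # ratio 2.5
word_label = classify_word_spacing(30, 20)    # ratio 1.5
margin_label = classify_top_margin(30, 20)    # ratio 1.5
```

Because the word spacing and top margin rules define "normal" as a single value, measured pixel distances will rarely land in that class; a tolerance band around the value would be one possible refinement, although the text does not describe one.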

4.3.6. Pen Pressure

Pen pressure is computed as the average intensity of the handwriting text pixels, that is, the sum of all nonzero pixel values divided by the number of nonzero pixels counted after the binarization process. An average intensity above 180 is categorized as heavy, an average intensity below 140 is categorized as light, and the rest is normal. Table 7 gives the details of the pen pressure feature and its characteristics.
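The pen pressure rule can be sketched as follows. The sketch assumes an inverted grey-level image in which background pixels are 0 and ink pixels keep a nonzero intensity, so that darker strokes give larger values; this representation is our assumption, since the text does not specify it. The function names and sample values are hypothetical.

```python
def pen_pressure(inverted_pixels):
    """Mean intensity of the nonzero (ink) pixels.

    Assumption: background is 0 and ink keeps its inverted grey level.
    """
    ink = [p for p in inverted_pixels if p > 0]
    if not ink:
        raise ValueError("image contains no ink pixels")
    return sum(ink) / len(ink)


def classify_pressure(mean_intensity):
    """Thresholds from the text: above 180 heavy, below 140 light, else normal."""
    if mean_intensity > 180:
        return "heavy"
    if mean_intensity < 140:
        return "light"
    return "normal"


# Hypothetical flattened image: three background pixels and two ink pixels.
sample = [0, 0, 200, 190, 0]
pressure = pen_pressure(sample)
label = classify_pressure(pressure)
```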

4.3.7. Slant

The slant of writing refers to the direction of the letter slope and is determined by the angle formed between the downstroke and the baseline [29]. To find the slant angle, the deslanting technique proposed by Luettin and Luettin [30] was used. The technique is based on the hypothesis that a word is deslanted when the number of columns containing a continuous stroke is maximum [30]. Accordingly, a shear transformation is applied for each angle in a suitable range, and the angle that best satisfies this criterion is taken as the slant. Table 8 gives the details of the slant feature and its characteristics.

4.4. Mapping the Big Five Model

Before mapping the extracted handwriting features onto the Big Five model, we first review the literature on the model. The Big Five model is one of the models used to describe individual personality traits [2, 31, 32]. It is based on 5 groups of personality traits, which are as follows:
(1) Neuroticism: It refers to people who lack emotional stability, tend to experience negative emotions easily, such as anger and anxiety, and are vulnerable to depression. On this scale, people are judged on the dichotomy: nervous vs. confident. The characteristics that represent neuroticism include awkwardness, pessimism, moodiness, jealousy, patience, fright, nervous, anxiety, fear, vigilance, and self-criticism, lack of confidence, insecurity, instability, and oversensitivity.
(2) Openness to experience: It refers to people who can easily express their emotions and have a desire for adventure, appreciation of art, and bright ideas. Typically, on this scale, people are judged based on the dichotomy: consistent vs. curiosity. The characteristics that represent openness to experience include imagination, insightful, varied interests, originality, bravery, preference for variety, cleverness, creativity, curiosity, perceptive, intellect, and complexity/depth.
(3) Extroversion: It refers to people who easily express positive emotions, such as making friends with others, being assertive, and being talkative. On this scale, people are judged based on the dichotomy: extroversion vs. solitary. The traits that represent extroversion include sociable, firmness, excitement, friendly nature, energized, talkative, articulation skills, cheerful, affectionate tendencies, friendliness, and social beliefs.
(4) Agreeableness: It refers to people who have a tendency to be affectionate rather than suspicious, also helpful, and short-tempered. On this scale, people are judged based on the dichotomy: compassion vs. separated. The characteristics that represent agreeableness include altruism (putting the interests of others first), modesty, patience, moderate, wisdom, courtesy, kind, loyalty, selflessness, helpful, sensitive, friendly, excitement, and consideration.
(5) Conscientiousness: It refers to a person who is reliable, has a penchant for carefully planned behaviour, and is oriented towards results and achievements. On this scale, people are judged based on the dichotomy: organized vs. careless. The traits that represent conscientiousness include persistence, ambition, accuracy, self-discipline, consistency, predictability, control, reliability, sense, hard work, energy, perseverance, and planning.

From the explanation above, the next step is to map the features of graphology with the types of Big Five personality. The correlation between these features is presented in Table 9.

4.5. Personality Trait Classification

After mapping, the next step is to classify the personality using several machine learning approaches. The five factors of the Big Five model are predicted with the mapping that has been performed. Therefore, there are 5 separate labels, one for each personality trait, and 5 classifications, one for each dimension of the Big Five (FFM) model. The classification process uses 3 different machine learning algorithms: the SVM, KNN, and decision tree.

SVM is a supervised learning method with the concept of building a hyperplane or a collection of hyperplanes in high- or infinite-dimensional spaces, which can be used for classification, regression, or other tasks [33, 34]. A hyperplane is said to be optimum or has the best level of generalization of data if it has the largest margin; in other words, the resulting error depends on the size of the margin. In SVM, there are 4 kernels that can be used, namely, the linear kernel, polynomial kernel, radial basis function (RBF) kernel, and sigmoid kernel.
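For reference, the four kernels named above can be written as plain functions. The sketch below uses the standard textbook forms, with parameter names (`gamma`, `coef0`, `degree`) chosen to mirror the scikit-learn naming convention; the default values are illustrative and are not the settings used in the experiments.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = <x, y>
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=0.0, degree=3):
    # K(x, y) = (gamma * <x, y> + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


# Two hypothetical feature vectors.
a, b = [1.0, 2.0], [3.0, 4.0]
```

The RBF kernel depends only on the distance between two samples, which is one reason it can separate classes that are not linearly separable in the original feature space.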

KNN is an instance-based learning classifier that finds the k training patterns (among all training patterns in all classes) closest to the input pattern and then assigns the class that occurs most often among those k patterns [35].
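The voting rule can be sketched in a few lines. The sketch assumes Euclidean distance and k = 3, neither of which is stated in the text, and the two-dimensional feature vectors are hypothetical; only the two label names come from the experiments.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the majority label among the k nearest training samples."""
    # Sort training samples by Euclidean distance to the query (assumed metric).
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors with the two labels used in the paper.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["not identified"] * 3 + ["identified"] * 3

near_origin = knn_predict(train_X, train_y, (0.5, 0.5))
near_cluster = knn_predict(train_X, train_y, (5.0, 5.5))
```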

A decision tree (DT) is a nonparametric supervised learning method used for classification and regression with a tree structure [36, 37]. The goal is to create a model that predicts the value of the target variable by learning simple decision rules inferred from the data features. A DT takes a set of input data to classify and outputs a tree that resembles a flow diagram, where each leaf is a decision (a class) and each internal (nonfinal) node represents a test. During classification, only the features tested along the path are considered, so feature selection is implicit. The most commonly used decision tree classifiers are binary and use a single feature at each node, resulting in decision boundaries that are parallel to the feature axes. As a result, such decision trees are intrinsically less than optimal for most applications. However, the main advantage of tree classifiers, apart from their speed, is that their decision rules can be interpreted in terms of individual features. This makes decision trees attractive for researchers to use interactively.

To implement some of the machine learning approaches above, the Scikit-Learn Library module in Python is used [38]; then, performance testing is carried out on each personality in the Big Five model.

5. Experiment Results

The experiment used all the handwriting images from the IAM handwriting database, a total of 1539 images. Performance measurement was carried out using the Python programming language [39], the OpenCV library, and the Scikit-Learn library. The test was run on a PC with a 9th-generation Intel Core i7 processor, an NVIDIA GeForce GTX 1660 Ti GPU, and 16 GB of DDR4 memory. The handwriting features extracted from all images are stored in one file, which serves as the labelled data file with one record per handwriting image document. There are two labels for each model, identified and not identified. Five classification scenarios were carried out: SVM with three kernel variations (linear, RBF, and polynomial), KNN, and decision tree, with a split ratio of 20 : 80 for testing and training data. Performance testing was performed on each dimension of the Big Five model, namely, neuroticism, openness to experience, extraversion, agreeableness, and conscientiousness.

5.1. Performance Measures

The classification performance measures used for the comparison are accuracy, precision, recall, F1 score, true positive (TP), true negative (TN), false positive (FP), and false negative (FN). The performance measures are calculated using the following equations, as shown in Table 10.
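The four summary measures follow from the confusion-matrix counts by their standard definitions, which the sketch below writes out. The counts in the example are hypothetical (they merely sum to a 308-sample test set) and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for one binary trait label.
metrics = classification_metrics(tp=150, tn=150, fp=4, fn=4)
```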

Table 11 presents the data from the classification process for the neuroticism model. The parameters used in the classification report are accuracy, precision, recall, and F1 score. From these data, it can be seen that the SVM with the RBF kernel, the KNN, and the decision tree classifiers produce the maximum performance for the model.

Table 12 shows the data from the classification process for the openness to experience model. We still use the same parameters in this classification report, with maximum accuracy results using the SVM (RBF kernel), KNN, and decision trees. The difference is that SVM with a linear kernel is able to produce an accuracy of the model above 90%.

Table 13 shows the data from the classification process for the extroversion model. From these data, it can be seen that SVM with a linear kernel does not show maximum results with accuracy below 90%.

Table 14 shows the data from the classification process for the agreeableness model. From these data, like the previous model, it can be seen that SVM with a linear kernel does not show maximum results with accuracy below 90%.

Table 15 shows the data from the classification process for the conscientiousness model. From these data, SVM with an RBF kernel and a decision tree achieved the highest accuracy with 100%, KNN and SVM with a polynomial kernel obtained 99%, and SVM with a linear kernel achieved the lowest accuracy with 88%.

Figure 3 describes the confusion matrix for each model of the Big Five. It can be seen that the amount of data used for testing is 308 or 20 percent of the 1539 handwriting data.

5.2. K-Fold Cross-Validation

Evaluating machine learning models can be difficult. Typically, the data set is divided into training and test sets; the training set is used to train the model and the test set to test it. A single split can be unreliable because the accuracy obtained for one test set can differ considerably from the accuracy obtained for another. K-fold cross-validation (CV) addresses this problem by dividing the data set into K parts (folds) and using each fold as the test set once while the remaining folds are used for training [40]. In this study, 10-fold cross-validation (k = 10) is used to validate the accuracy results (Figure 4). The performance of the classifier model is assessed with two performance metrics: the mean absolute error (MAE) and the root mean square error (RMSE).
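The fold construction can be sketched without any library. The generator below splits sample indices into k consecutive folds without shuffling and gives the first n mod k folds one extra sample, which is the convention scikit-learn's `KFold` documents; whether the experiments shuffled the data is not stated, so the ordering here is an assumption.

```python
def kfold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k consecutive, unshuffled folds."""
    # The first n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# The 1539 IAM pages split into 10 folds.
folds = list(kfold_indices(1539, 10))
test_sizes = [len(test) for _, test in folds]
```

With 1539 samples and k = 10, nine folds hold 154 samples and one holds 153, so every sample is used for testing once.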

Table 16 shows the classifier output for each model of the Big Five using 10-fold cross-validation. In the neuroticism model, the decision tree has the lowest MAE score with a value of 0, while the SVM RBF kernel has 0.00064, the KNN has 0.01039, the SVM polynomial kernel has 0.10328, and the SVM linear kernel has 0.15850. For the cross-validated accuracy shown in Figure 5, the average CV score is 100% for the decision tree, 99.935% for SVM RBF, 98.96% for the KNN, 89.67% for the SVM polynomial, and 84.149% for SVM linear. All classifiers show lower accuracy under the 10-fold CV score than under the single split, except for the decision tree, which is relatively stable. The most significant decrease in accuracy is in SVM with a polynomial kernel, from 94% to 89%.

In the openness to experience model, the decision tree has the lowest MAE score with a value of 0.00064, while the SVM RBF kernel has 0.01756, the KNN has 0.02338, the SVM polynomial kernel has 0.05459, and the SVM linear kernel has 0.08705. For the cross-validated accuracy shown in Figure 6, the average CV score is 99.93% for the decision tree, 98.24% for SVM RBF, 96.48% for the KNN, 94.54% for the SVM polynomial, and 91.29% for SVM linear. The decision tree and SVM RBF classifiers show lower accuracy under the 10-fold CV score, but the decrease is not significant, as the relatively small MAE values indicate. The most significant decrease in accuracy is in the KNN, from 100% to 97.76%.

In the extroversion model, the decision tree has the lowest MAE score with a value of 0, while the SVM RBF kernel has 0.01756, the KNN has 0.03511, the SVM polynomial kernel has 0.06502, and the SVM linear kernel has 0.13455. For the cross-validated accuracy shown in Figure 7, the average CV score is 100% for the decision tree, 98.50% for SVM RBF, 96.48% for the KNN, 93.49% for the SVM polynomial, and 86.54% for SVM linear. SVM RBF, SVM linear, and SVM polynomial show lower accuracy under the 10-fold CV score, but the decrease is not significant, as the relatively small MAE values indicate. The decision tree has a stable value for the 10-fold CV score. The most significant decrease in accuracy is in the KNN, from 100% to 96.48%.

In the agreeableness model, the decision tree has the lowest MAE score with a value of 0, while the SVM RBF kernel has 0.00454, the KNN has 0.04677, the SVM polynomial kernel has 0.01818, and the SVM linear kernel has 0.15268. For the cross-validated accuracy shown in Figure 8, the average CV score is 100% for the decision tree, 99.54% for SVM RBF, 95.32% for the KNN, 98.18% for the SVM polynomial, and 84.73% for SVM linear. The most significant decrease in accuracy is in the KNN, from 100% to 95.32%.

In the conscientiousness model, the decision tree has the lowest MAE score with a value of 0.00129, while the SVM RBF kernel has 0.00194, the KNN has 0.02857, the SVM polynomial kernel has 0.02143, and the SVM linear kernel has 0.09163. For the cross-validated accuracy shown in Figure 9, the average CV score is 99.87% for the decision tree, 99.80% for SVM RBF, 97.14% for the KNN, 97.85% for the SVM polynomial, and 90.83% for SVM linear. The most significant decrease in accuracy is in the KNN, from 99% to 97.14%.
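The MAE values and CV accuracies reported in this subsection are closely linked. If the two labels are encoded as 0 and 1 (an encoding we assume here, since the text names the labels only as identified and not identified), every absolute error is either 0 or 1, so the MAE equals the misclassification rate, that is, 1 minus the accuracy, and the RMSE equals the square root of the MAE. For example, an MAE of 0.00064 corresponds to an accuracy of about 99.94%. The sketch below checks both identities on hypothetical labels.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical 0/1 labels with one misclassified sample out of ten.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

error = mae(y_true, y_pred)
acc = accuracy(y_true, y_pred)
root_error = rmse(y_true, y_pred)
```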

6. Discussion

From all the data presented, SVM with an RBF kernel and decision tree classifiers show very promising results, as indicated by a maximum accuracy above 99% across the five models. Selecting an image processing algorithm that suits the characteristics of the handwriting dataset is very important. Selecting the right parameters in the classification process is equally essential for good accuracy.

Several previous studies also obtained strong results using SVM as a classifier, such as the study by Joshi et al. [18], which achieved an accuracy of 97%. SVM is well suited to separating two classes, and the choice of kernel affects the results of the classification process. KNN and decision trees also show promising results. Among other studies, Gavrilescu [12] used the KNN as the classifier with an accuracy of 88.6%, and Topaloglu and Ekmekci [17] used decision trees and obtained an accuracy of 93.75%. With deep neural network architectures, Pathak et al. [22] achieved 97.7% accuracy and Bernardo et al. [23] achieved 91.26%. The results are described in Table 17.

Although our model has performed well on the IAM data set, it is important to examine its results on other handwriting image datasets, such as the CVL database [41]. We believe that our model is applicable to other handwriting images, and this will be one of our future research directions.

7. Conclusions

We presented a framework for determining the Big Five personality traits from handwriting analysis features and classified them using machine learning algorithms. Automated handwriting analysis helps the graphologist determine human personality traits more easily. The framework has three main stages: preprocessing, handwriting feature extraction, and personality classification based on a mapping to the Big Five model. The classification can be performed with different machine learning algorithms on the handwriting image database. The framework was further evaluated through 10-fold cross-validation to see the impact on accuracy, and other performance metrics such as the mean absolute error and root mean square error were discussed. All the metrics show good results, which suggests that the decision tree and SVM with an RBF kernel are suitable classifiers. Overall, the classification accuracy of the framework is higher than that of previous work.

The authors acknowledge the current limitations of this research. For example, our model has not yet been developed for real systems. In addition, owing to the limitations of the handwriting database, our model does not account for the classification of handwriting samples with different background colours.

In future research, a novel framework will be designed with different psychological measurements such as the MBTI and Enneagram model. The authors will also evaluate the model on more complex handwriting databases and apply it to a real system.

Data Availability

The dataset is available in a public repository, Computer Vision and Artificial Intelligence, and can be accessed on the URL: https://fki.tic.heia-fr.ch/databases/iam-handwriting-database.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors thank the rectors for funding the competitive research and paper publication based on this research. This research was funded by the DIPA of the Public Service Agency of Universitas Sriwijaya, on April 28, 2021.