Abstract

Alcohol use disorder (AUD) is an important brain disease that can damage and alter brain structure. At present, AUD is mainly diagnosed manually by radiologists. This study proposes a novel computer-vision-based method for automatic detection of AUD from MRI scans, based on wavelet Renyi entropy and a three-segment encoded Jaya algorithm. The wavelet Renyi entropy is proposed to provide multiresolution and multiscale analysis of features, describe the complexity of the brain structure, and extract distinctive features. A grid search was used to select the optimal wavelet decomposition level and Renyi order. The classifier was constructed from a feedforward neural network and a three-segment encoded (TSE) Jaya algorithm, which provides parameter-free training of the weights, biases, and number of hidden neurons. We conducted the experimental evaluation on 235 subjects (114 with AUD and 121 healthy). $K$-fold cross validation was used to avoid overfitting and to report out-of-sample errors. The results showed that the proposed method outperforms four state-of-the-art approaches in terms of accuracy. The proposed TSE-Jaya also performs better than conventional training approaches, including plain Jaya, multiobjective genetic algorithm, particle swarm optimization, bee colony optimization, modified ant colony system, and real-coded biogeography-based optimization.

1. Introduction

Alcohol use disorder (AUD) affected 208 million people worldwide in 2010. It can cause severe adverse effects on the brain, liver, heart, and pancreas. Long-term misuse can lead to increased tolerance to alcohol, making it difficult to control consumption. Short-term misuse raises the blood alcohol concentration (BAC); a BAC from 0.35% to 0.80% can cause fatal respiratory depression and life-threatening alcohol poisoning.

This paper studies the effect of long-term alcohol misuse on the brain. Alcohol misuse can damage brain neurons; hence, patients with long-term AUD have smaller volumes of white matter and gray matter than age-matched controls. Alcohol also adversely affects the prefrontal cortex and cerebellum. The current diagnosis of AUD mainly relies on manual observation of brain images. However, because the symptoms are mild, radiologists may miss the slight shrinkage of AUD brains and fail to identify the disorder at an early stage. It is therefore necessary to create an efficient approach that can monitor the patient's brain via magnetic resonance imaging (MRI) and provide automatic, early diagnosis.

Over the last decades, computer-vision-based techniques have been proposed for automatically detecting changes on brain structure for brain related disease diagnosis based on MRI scans. Nayak et al. (2016) [1] presented a brain image classification algorithm based on random forest. Alweshah and Abdullah (2015) [2] hybridized firefly algorithm (FA) and probabilistic neural network (PNN). They used the proposed method for detecting changes in the brain. Lv and Hou [3] proposed an improved particle swarm optimization (IPSO) to detect alcoholism in MRI scanning. Monnig (2012) [4] suggested detecting white matter atrophy in neuroimaging of AUD. Yang (2017) [5] combined Hu moment invariant (HMI) and support vector machine (SVM) for pathological brain detection. Jiang and Zhu (2017) [6] explored the method using pseudo Zernike moment (PZM). Lv and Sui (2017) [7] used data augmentation technique for alcoholism detection.

Although several of the above methods were developed for pathological brain detection, they can be easily transferred and applied to alcoholism detection. Nevertheless, these methods suffer from two common problems. First, approaches that do not take into account the complexity of the brain structure do not perform well on AUD. Second, the training algorithms of existing classifiers may fall into local optima, and it is difficult to optimize the hyperparameters of the classifiers (e.g., the number of hidden neurons in a feedforward neural network) [8].

To address the above problems, we propose in this study a novel method for identifying alcohol use disorder. Our contributions are twofold: (i) a novel feature extraction method, wavelet Renyi entropy, which can describe the complexity of brain structure at multiple scales; and (ii) an improved Jaya algorithm for training a feedforward neural network, which optimizes the weights, biases, and number of hidden neurons simultaneously. Our training algorithm does not require any algorithm-specific parameters to be set.

The rest of this paper is organized as follows: Section 2 describes the subjects, scan protocol, and slice selection method. Section 3 presents the proposed feature extraction method—wavelet Renyi entropy. Section 4 describes the classifier construction and the proposed training algorithm: three-segment encoded Jaya algorithm. Section 5 provides the implementation procedure and the evaluation method. Besides, we show how to use grid search to optimize the parameters of wavelet Renyi entropy. The results and discussions are presented in Section 6. Section 7 concludes the work.

2. Materials

2.1. Subjects

The subjects went through a medical history interview to guarantee they met the inclusion criteria. Qualified applicants received the computerized diagnostic interview schedule version IV, which ascertains the presence or absence of major psychiatric disorders. Applicants were excluded if Mandarin was not their first language, if they were left-handed, or if they had any of the following: HIV; epilepsy; stroke; Wernicke–Korsakoff syndrome; bipolar disorder; cirrhosis or liver failure; seizures unrelated to alcoholism; head injury with loss of consciousness for more than 15 minutes unrelated to alcoholism; depression; schizophrenia; or other psychotic disorders.

Finally, we enrolled 114 abstinent long-term chronic alcoholic participants (58 men and 56 women) and 121 nonalcoholic control participants (59 men and 62 women). Participants were enrolled through flyers posted in Jiangsu Province Hospital, Nanjing Children’s Hospital, and Nanjing Brain Hospital, as well as the Internet-based advertisements. The data collection lasted for a total of three years. The research was approved by the Institutional Review Board of the participating hospitals. Informed consent was obtained from each participant.

The 235 participants were tested by the “Alcohol Use Disorder Identification Test (AUDIT)” [9]. The unit “ounce” was converted to “gram,” since the former is not widely used in China. Their demographic characteristics are shown in Table 1. In this study, we only focus on the structural imaging data.

2.2. Scan Protocol

All 235 subjects lay as still as possible, with their eyes closed while remaining conscious. Scanning was performed with a Siemens Verio Tim 3.0T MR scanner (Siemens Medical Solutions, Erlangen, Germany). In total, 216 sagittal slices covering the whole brain were acquired, using an MP-RAGE sequence. The imaging parameters were as follows: slice thickness = 0.8 mm, TE = 2.50 ms, TR = 2000 ms, TI = 900 ms, FA = 9°, matrix = 256 × 256, and FOV = 256 mm × 256 mm. The acquired images had a 16-bit gray-level depth, which we reduced to 8 bits, since alcoholism alters the structure of the healthy brain rather than the gray levels of brain images. Besides, 8-bit gray levels provide enough information, so it is unnecessary to use 16-bit images.

2.3. Slice Selection

We used the FMRIB software library (FSL) v5.0 [16, 17] to extract the brain and remove the skull for each scanned 3D image. All the volumetric images were normalized to a standard MNI template. Afterwards, we resampled each image to 2 mm isotropic voxels. The slice at $Z = 80$ (8 mm) in MNI_152 coordinates, a template that averages 152 T1-weighted MRI scans linearly transformed to Talairach space, was chosen for each subject. The reason for selecting the 80th slice is that it contains the two distinguishing features of alcoholic patients: (i) the enlarged ventricle and (ii) the shrunk gray matter, for example, in the precentral gyrus [18], inferior frontal gyrus [19], and middle temporal gyrus [20]. Figure 1 shows the clear difference between the alcoholic and healthy samples.

Afterwards, the background was cropped, leaving a rectangle matrix with size of 176 × 176 for the subsequent classifier training. The datasets used in this study are available upon request.

3. The Proposed Feature Extraction Method

To extract distinctive features, this study proposed a new wavelet Renyi entropy (WRE), which combines discrete wavelet transform and Renyi entropy in order to describe the complexity of the brain structure. The wavelet decomposition provides multiresolution and multiscale analysis, while the Renyi entropy provides the complexity description of the wavelet subbands of brain structure.

3.1. Wavelet Decomposition

For a specific signal/image, the discrete wavelet transform (DWT) transforms the signal/image to the wavelet domain. It performs the transformation at multiple levels, by delivering the previous approximation subband to the quadrature mirror filters (abbreviated as QMF) [21]. Compared to traditional Fourier transform, DWT has the key advantage of temporal/spatial resolution.

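Let $x(t)$ be a given one-dimensional signal. The continuous wavelet transform of $x(t)$ is depicted as

$$W_\psi(a, b) = \int_{-\infty}^{\infty} x(t)\, \psi_{a,b}(t)\, dt, \tag{1}$$

where $W_\psi$ represents the coefficients and $\psi$ the mother wavelet. $\psi_{a,b}$ is defined as

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t - b}{a}\right). \tag{2}$$

Here, $a$ represents the scale factor and $b$ the translation factor (both $a > 0$ and $b > 0$). Formula (1) can be discretized by replacing $a$ and $b$ with the discrete variables $j$ and $k$:

$$a = 2^{j}, \qquad b = k\, 2^{j}, \tag{3}$$

where the parameters $j$ and $k$ represent the values of the scale and translation factors, respectively. By this means, we can produce the DWT as

$$A_{j,k}(n) = \mathrm{DS}\Big[\sum_{n} x(n)\, l_j^{*}\big(n - 2^{j} k\big)\Big], \qquad D_{j,k}(n) = \mathrm{DS}\Big[\sum_{n} x(n)\, h_j^{*}\big(n - 2^{j} k\big)\Big]. \tag{4}$$

Here, $n$ means the discrete version of variable $t$, and $\mathrm{DS}$ means downsampling. The functions $l(n)$ and $h(n)$ represent the low-pass filter and high-pass filter, respectively. $A$ and $D$ represent the approximation subband and the detail subband, respectively.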
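To make the filter-then-downsample structure of the DWT concrete, the following is a minimal sketch in Python. It assumes the Haar wavelet, which this paper does not specify, and the helper names `haar_dwt_1d` and `haar_dwt_multilevel` are hypothetical. It only illustrates how each level splits the previous approximation subband into a new approximation and a detail subband.

```python
import math


def haar_dwt_1d(signal):
    """One level of the Haar DWT: filter, then keep every second sample."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    # Low-pass filter (s, s) and high-pass filter (s, -s), evaluated only at
    # even positions, which equals filtering followed by downsampling by 2.
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt_multilevel(signal, levels):
    """Feed the approximation subband back into the filters at each level."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt_1d(approx)
        details.append(d)
    return approx, details


# Arbitrary illustrative signal (not data from this study).
x = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]
cA, cDs = haar_dwt_multilevel(x, levels=2)
```

Because the Haar filters are orthonormal, the energy of the signal is preserved across the subbands, which gives a convenient sanity check for any implementation.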

For a two-dimensional DWT (abbreviated as 2D-DWT) [22], suppose the image is symbolized as $I$; each decomposition produces four subbands ($LL$, $LH$, $HL$, and $HH$), as shown in Figure 2. The subband $LL$ is the approximation component of the original image. Subbands $LH$, $HL$, and $HH$ represent the horizontal, vertical, and diagonal components, respectively. $LL$ is decomposed into four new subbands at the next level, producing the corresponding higher-level subbands.

3.2. Renyi Entropy

Each subband of the wavelet decomposition can be regarded as a discrete variable $X$. Suppose $X$ has $n$ possible outcomes as

$$X = \{x_1, x_2, \ldots, x_n\}. \tag{5}$$

Suppose the corresponding probability is defined as

$$p_i = \Pr(X = x_i), \qquad i = 1, 2, \ldots, n. \tag{6}$$

The $\alpha$-order Renyi entropy is defined as [23]

$$H_\alpha(X) = \frac{1}{1 - \alpha} \log\Big(\sum_{i=1}^{n} p_i^{\alpha}\Big). \tag{7}$$

The Renyi entropy is Schur-concave and is a nonincreasing function of $\alpha$. In some special cases, the Renyi entropy reduces to other types of entropies. For instance, $H_0$ is called the Hartley entropy, $H_1$ (taken as the limit $\alpha \to 1$) is the Shannon entropy, and $H_\infty$ is the min-entropy [24].

Suppose we have a binary random variable $X$ with $p = (p_1, 1 - p_1)$, where $0 \le p_1 \le 1$. The Renyi entropies with different $\alpha$-values against $p_1$ are plotted in Figure 3. The concaveness and the nonincreasing behavior against $\alpha$ are obvious from this picture. Zero values are included, since this does not affect the calculation of Renyi entropy.
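The definition and its special cases can be checked numerically. The sketch below is illustrative only: the function name `renyi_entropy` is hypothetical, the natural logarithm is an assumption (the base is not stated here), and zero-probability outcomes are skipped under the usual convention that they contribute nothing.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (natural log) of a discrete distribution."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    # Assumption: outcomes with zero probability are dropped before summation.
    p = [q for q in probs if q > 0]
    if math.isinf(alpha):
        # Min-entropy: determined by the most probable outcome.
        return -math.log(max(p))
    if abs(alpha - 1.0) < 1e-12:
        # Shannon entropy, the limit of the general formula as alpha -> 1.
        return -sum(q * math.log(q) for q in p)
    return math.log(sum(q ** alpha for q in p)) / (1.0 - alpha)


# Binary variable with p1 = 0.2, an arbitrary illustrative value.
binary = [0.2, 0.8]
orders = [0, 0.5, 1, 2, math.inf]
entropies = [renyi_entropy(binary, a) for a in orders]
```

For this binary example the entropies decrease as the order grows, in line with the nonincreasing property, while a uniform distribution gives the same value for every order.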

3.3. Wavelet Renyi Entropy

In the past, scholars have proposed the so-called “wavelet Renyi entropy.” Nevertheless, our proposed WREs are different from traditional WREs. First, traditional WREs are mainly for one-dimensional signal, while ours are for two-dimensional image. Second, traditional WREs calculate entropies over the approximation subbands, while our method calculates entropies over both approximation and detail subbands of wavelet coefficients.

The pseudocode of WRE is depicted in Pseudocode 1. The bin number of wavelet coefficient histogram is set to 256 in this study.

Step 1 Input a given brain image $I$.
Step 2 Perform a $j$-level wavelet decomposition, and obtain ($3j + 1$) subbands.
Step 3 For $i = 1, \ldots, 3j + 1$:
     Get the 256-bin histogram of the $i$th subband wavelet coefficients;
     Obtain the Renyi entropy of order $\alpha$ over the histogram.
    End
Step 4 Output the concatenation of the Renyi entropy values of all subbands.

For a given image, our proposed WRE produces a $(3j + 1)$-element feature vector. Here, we choose the optimal values of the decomposition level $j$ and the Renyi order $\alpha$ by a grid searching approach. The detailed implementation is explained in Section 5.3.
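The steps of Pseudocode 1 can be sketched in plain Python as follows. This is not the in-house MATLAB implementation: the Haar wavelet, the natural logarithm, the equal-width histogram over each subband's own value range, and all function names are assumptions made for illustration. For brevity, a trailing row or column is dropped when a dimension is odd.

```python
import math


def haar_step(vec):
    """Haar low-pass and high-pass outputs, downsampled by 2."""
    s = 1.0 / math.sqrt(2.0)
    lo = [s * (vec[i] + vec[i + 1]) for i in range(0, len(vec) - 1, 2)]
    hi = [s * (vec[i] - vec[i + 1]) for i in range(0, len(vec) - 1, 2)]
    return lo, hi


def transpose(m):
    return [list(r) for r in zip(*m)]


def split_columns(m):
    """Apply haar_step along every column of matrix m."""
    lo_cols, hi_cols = [], []
    for col in transpose(m):
        lo, hi = haar_step(col)
        lo_cols.append(lo)
        hi_cols.append(hi)
    return transpose(lo_cols), transpose(hi_cols)


def haar_dwt_2d(img):
    """One 2D level: filter rows, then columns; returns (LL, LH, HL, HH)."""
    lo_rows, hi_rows = [], []
    for row in img:
        lo, hi = haar_step(row)
        lo_rows.append(lo)
        hi_rows.append(hi)
    ll, lh = split_columns(lo_rows)
    hl, hh = split_columns(hi_rows)
    return ll, lh, hl, hh


def histogram_probs(values, bins=256):
    """Normalized equal-width histogram over [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        counts[0] = len(values)  # constant subband: all mass in one bin
    else:
        width = (hi - lo) / bins
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]


def renyi_entropy(probs, alpha):
    p = [q for q in probs if q > 0]  # empty bins contribute nothing
    if abs(alpha - 1.0) < 1e-12:
        return -sum(q * math.log(q) for q in p)
    return math.log(sum(q ** alpha for q in p)) / (1.0 - alpha)


def wavelet_renyi_entropy(img, level, alpha, bins=256):
    """Return the (3 * level + 1)-element WRE feature vector of an image."""
    features = []
    approx = img
    for _ in range(level):
        approx, lh, hl, hh = haar_dwt_2d(approx)
        for band in (lh, hl, hh):  # three detail subbands per level
            flat = [v for row in band for v in row]
            features.append(renyi_entropy(histogram_probs(flat, bins), alpha))
    flat = [v for row in approx for v in row]  # final approximation subband
    features.append(renyi_entropy(histogram_probs(flat, bins), alpha))
    return features


# Synthetic 16 x 16 test pattern (not a brain image).
image = [[(r * 7 + c * 13) % 32 for c in range(16)] for r in range(16)]
feats = wavelet_renyi_entropy(image, level=2, alpha=1.2)
```

With the settings selected later in the paper (level 4), the same function would return 13 features per image.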

4. The Classifier Construction Based on a Feedforward Neural Network (FNN) and Three-Segment Jaya

To build the classifier, we propose using a feedforward neural network (FNN) trained by a three-segment Jaya algorithm. Scholars have used various classifiers in medical brain image analysis, such as decision tree, support vector machine [25], and naive Bayesian classifier. Nevertheless, the FNN has achieved remarkable success because of the universal approximation theorem [26], which states the following.

Suppose $\varphi(\cdot)$ is a bounded and nonconstant continuous function. Given any continuous function $f$ on a compact domain and any small number $\varepsilon > 0$, there exist an integer $N$, real vectors $w_i$, and real constants $v_i$ and $b_i$ ($i = 1, \ldots, N$), such that we have

$$G(x) = \sum_{i=1}^{N} v_i\, \varphi\!\left(w_i^{T} x + b_i\right), \tag{8}$$

where $G$ can be used as an approximation realization of function $f$, which satisfies

$$|G(x) - f(x)| < \varepsilon. \tag{9}$$

However, the traditional FNN training algorithm is a backpropagation (BP) gradient descent algorithm. The BP and its variants often converge to local optimal points. Three-segment encoded Jaya is introduced to address the problem. In the following sections, we will detail each of the methods.

4.1. Structure of FNN

Structurally, the FNN includes three layers: (i) an input layer that accepts the features; (ii) a hidden layer that contains the hidden neurons; and (iii) an output layer that outputs the score of each class. Finally, the “argmax” function predicts the class associated with the largest score. Figure 4 presents the diagram of the FNN. The number of input neurons is the same as the number of features extracted from brain images, the number of output neurons is the same as the number of classes, and the number of hidden neurons is commonly obtained by hyperparameter optimization.
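A forward pass through such a three-layer network can be sketched as follows. The tanh hidden activation and the linear output layer are assumptions, since the activation functions are not stated here, and the names `fnn_forward` and `predict` are hypothetical.

```python
import math


def fnn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Return one score per class for the feature vector x."""
    # Hidden layer: one tanh neuron per row of w_hidden (assumed activation).
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output layer: a linear combination of the hidden activations per class.
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]


def predict(x, w_hidden, b_hidden, w_out, b_out):
    """Argmax over the class scores."""
    scores = fnn_forward(x, w_hidden, b_hidden, w_out, b_out)
    return max(range(len(scores)), key=scores.__getitem__)


# Tiny hand-made network: 2 inputs, 2 hidden neurons, 2 classes.
w_h = [[1.0, 0.0], [0.0, 1.0]]
b_h = [0.0, 0.0]
w_o = [[1.0, 0.0], [0.0, 1.0]]
b_o = [0.0, 0.0]
label = predict([1.0, -1.0], w_h, b_h, w_o, b_o)
```

In the setting of this paper, the input length would equal the number of WRE features and the two output scores would correspond to the alcoholic and control classes.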

4.2. Jaya Algorithm

As mentioned earlier, the traditional FNN training has an issue with global optimization; to address this problem, a massive number of global optimization algorithms were proposed and employed to train FNN, particularly in the field of brain image classification. For example, Hajimani et al. (2017) [11] designed a multiobjective genetic algorithm (MOGA) to detect cerebral vascular accidents. Chen et al. (2017) [12] used particle swarm optimization (PSO) to classify MRI brain tissues. Subramaniam and Radhakrishnan (2016) [13] used bee colony optimization (BCO) to classify brain cancer image. Raghtate and Salankar (2015) [14] proposed a modified ant colony system (MACS) to realize automatic brain MRI classification. Chen and Du (2017) [15] proposed a real-coded biogeography-based optimization (RCBBO) method for pathological brain detection.

Those algorithms make the classifier more robust than BP; nevertheless, their own parameters need to be fine-tuned, which itself poses a hyperparameter optimization problem. To overcome this limitation, Rao (2016) [27] introduced Jaya, a global optimization algorithm for solving constrained and unconstrained optimization problems. It requires no algorithm-specific parameters, has been reported to outperform several state-of-the-art optimization algorithms, and has been successfully applied in thermal performance optimization [28], photovoltaic model identification [29], cooling tower design [30], sensing period adaptation [31], heat change optimization [32], and so forth.

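Figure 5 shows the diagram of the Jaya algorithm. Assume $i$, $j$, and $k$ are the indices of iteration, variable, and candidate, respectively. Assume $X_{i,j,k}$ means the $j$th variable of the $k$th solution candidate at the $i$th step. Assume $r_1(i,j)$ and $r_2(i,j)$ are two positive numbers in the range of $[0, 1]$ generated at random. The modified candidate $X'_{i,j,k}$ is defined as

$$X'_{i,j,k} = X_{i,j,k} + r_1(i,j)\big(X_{i,j,b} - |X_{i,j,k}|\big) - r_2(i,j)\big(X_{i,j,w} - |X_{i,j,k}|\big), \tag{10}$$

where $w$ and $b$ denote the indices of the worst and best candidates within the population, written here for a fitness function $f$ that is minimized:

$$w = \arg\max_{k} f(X_{i,\cdot,k}), \qquad b = \arg\min_{k} f(X_{i,\cdot,k}). \tag{11}$$

Hence, $X_{i,j,w}$ and $X_{i,j,b}$ denote the worst and best values of the $j$th variable at the $i$th iteration.

The 2nd term “$r_1(i,j)(X_{i,j,b} - |X_{i,j,k}|)$” in (10) represents that the candidate needs to move closer to the best one. The 3rd term “$-r_2(i,j)(X_{i,j,w} - |X_{i,j,k}|)$” in (10) represents that the candidate needs to move away from the worst candidate, noting the “−” symbol before $r_2$ [33]. The updated candidate at iteration ($i + 1$) is written as

$$X_{i+1,j,k} = \begin{cases} X'_{i,j,k}, & \text{if } f(X'_{i,\cdot,k}) < f(X_{i,\cdot,k}), \\ X_{i,j,k}, & \text{otherwise}, \end{cases} \tag{12}$$

where $f$ represents the fitness function.

Equation (12) indicates that $X_{i+1,j,k}$ is assigned $X'_{i,j,k}$ if the modified candidate is better in terms of fitness than $X_{i,j,k}$; otherwise it is assigned $X_{i,j,k}$. The algorithm iterates until the termination criterion is satisfied. We set the termination criterion as follows: either the algorithm reaches the maximum iteration epoch or the error does not reduce for five epochs.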
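As an illustration of the update and acceptance rules described above, the following sketch applies plain Jaya to a toy minimization problem rather than to FNN training. The sphere fitness function, the box bounds with clipping, and the drawing of fresh random numbers for every candidate and variable are assumptions made for this example, and `jaya_minimize` is a hypothetical name.

```python
import random


def jaya_minimize(fitness, dim, bounds, pop_size=20, max_epochs=1000,
                  patience=5, seed=0):
    """Minimize `fitness` with plain Jaya; returns (best, best_fit, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(c) for c in pop]
    history = [min(fit)]
    stall = 0
    for _ in range(max_epochs):
        # Best and worst candidates are fixed at the start of the iteration.
        best = pop[fit.index(min(fit))]
        worst = pop[fit.index(max(fit))]
        for k in range(pop_size):
            cand = pop[k]
            trial = []
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Move toward the best and away from the worst candidate.
                v = (cand[j]
                     + r1 * (best[j] - abs(cand[j]))
                     - r2 * (worst[j] - abs(cand[j])))
                trial.append(min(max(v, lo), hi))  # assumption: clip to bounds
            f_trial = fitness(trial)
            if f_trial < fit[k]:  # greedy acceptance of improvements only
                pop[k], fit[k] = trial, f_trial
        history.append(min(fit))
        # Stop when the best error has not reduced for `patience` epochs.
        stall = stall + 1 if history[-1] >= history[-2] else 0
        if stall >= patience:
            break
    i_best = fit.index(min(fit))
    return pop[i_best], fit[i_best], history


def sphere(x):
    """Toy fitness: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


best, best_fit, history = jaya_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

Apart from the population size and the maximum number of epochs, which every population-based method shares, the loop contains no tunable algorithm-specific parameter. Because a candidate is replaced only when it improves, the best fitness in `history` never increases.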

4.3. Three-Segment Encoded Jaya

The existing Jaya algorithm is mainly used to train the weights and biases of an FNN, as described in Phillips (2017) [10]. However, we believe the number of hidden neurons is also an important hyperparameter that influences the classification performance of the FNN. Hence, we propose a three-segment encoded Jaya algorithm (TSE-Jaya), which aims to optimize the weights, biases, and number of hidden neurons simultaneously. The candidate $X$ now contains three parts as

$$X = \big[S_1(X),\ S_2(X),\ S_3(X)\big],$$

where $S_1$, $S_2$, and $S_3$ represent extracting the first part, second part, and third part of the solution candidate representation. The first part encodes the weights, the second part encodes the biases, and the third part encodes the number of hidden neurons (NHN). The modified candidate $X'$ is defined consequently as

$$X' = \big[S_1(X'),\ S_2(X'),\ S_3(X')\big].$$

The modification rule does not obey (10), and the new modification rule is three-fold as follows:where and are two random positive numbers, similar to variable . Other procedures are the same as those in Jaya algorithm.
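One possible way to lay out a three-segment candidate is sketched below. The layout is entirely hypothetical and is not taken from the paper: it assumes a fixed maximum number of hidden neurons `h_max`, stores weights and biases for all `h_max` neurons, and keeps a single real value in the third segment that is rounded and clipped to obtain the NHN. Only the first NHN hidden neurons are then used to build the network.

```python
def segment_lengths(n_in, n_out, h_max):
    """Lengths of the weight, bias, and NHN segments (hypothetical layout)."""
    n_weights = h_max * n_in + n_out * h_max
    n_biases = h_max + n_out
    return n_weights, n_biases, 1


def decode_candidate(candidate, n_in, n_out, h_max):
    """Split a flat candidate into FNN parameters for the encoded NHN."""
    n_w, n_b, n_h = segment_lengths(n_in, n_out, h_max)
    if len(candidate) != n_w + n_b + n_h:
        raise ValueError("candidate length does not match the layout")
    weights = candidate[:n_w]            # first segment
    biases = candidate[n_w:n_w + n_b]    # second segment
    # Third segment: a real number mapped to an integer in [1, h_max].
    nhn = max(1, min(h_max, int(round(candidate[-1]))))
    # Keep only the parameters of the first `nhn` hidden neurons.
    w_hidden = [weights[h * n_in:(h + 1) * n_in] for h in range(nhn)]
    offset = h_max * n_in
    w_out = [weights[offset + o * h_max:offset + o * h_max + nhn]
             for o in range(n_out)]
    b_hidden = biases[:nhn]
    b_out = biases[h_max:h_max + n_out]
    return w_hidden, b_hidden, w_out, b_out, nhn


# Example layout: 3 features, 2 classes, at most 4 hidden neurons.
cand = [float(i) for i in range(26)] + [2.6]
w_hidden, b_hidden, w_out, b_out, nhn = decode_candidate(cand, 3, 2, 4)
```

With a fixed-length encoding of this kind, the candidates stay the same size across the population, so an optimizer that works on real-valued vectors can change the NHN without resizing anything.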

5. Implementation and Evaluation

5.1. Cross Validation Based Implementation

Figure 6 presents the flowchart of our method. Here, the $K$-fold cross validation method [34] was used in order to avoid overfitting and report out-of-sample errors. We divide the whole dataset into 10 folds. In each trial, one fold is used for validation, one fold is used for test, and the other folds are used for training. The training iterates until the error over the validation set increases for five consecutive epochs. For a clear understanding, we plotted a toy example in Figure 7. Here, at epoch 6, the validation error reaches its minimum. Then, from the 6th to the 11th epoch, the validation error increases although the training error decreases, which indicates that overfitting occurs. Hence, we should select the weights corresponding to the 6th epoch.

After the training terminates, the measure over the test set is recorded. Finally, the measures over all test sets of all trials are averaged, and the final classification performance is calculated. The ideal confusion matrix of one run of 10-fold cross validation is

$$C(K = 10, R = 1) = \begin{bmatrix} 114 & 0 \\ 0 & 121 \end{bmatrix},$$

where $C$ means the ideal confusion matrix, $K$ the number of folds, and $R$ the number of repetitions. In this study, we run the 10-fold cross validation 10 times, and the ideal confusion matrix is

$$C(K = 10, R = 10) = \begin{bmatrix} 1140 & 0 \\ 0 & 1210 \end{bmatrix}.$$
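The fold bookkeeping can be sketched as follows. Two choices here are assumptions made for illustration and are not stated in the text: the validation fold is taken to be the fold after the test fold (cyclically), and the folds are formed by a simple seeded shuffle without stratification. The function names are hypothetical.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def trial_split(folds, t):
    """Return (train, validation, test) index lists for trial t."""
    k = len(folds)
    v = (t + 1) % k  # assumption: validation fold follows the test fold
    train = [i for f in range(k) if f not in (t, v) for i in folds[f]]
    return train, folds[v], folds[t]


def ideal_confusion_matrix(n_pos, n_neg, repetitions):
    """Confusion matrix of a perfect classifier summed over all repetitions."""
    return [[n_pos * repetitions, 0], [0, n_neg * repetitions]]


# A new seed per repetition gives a different division in each run.
folds = kfold_indices(235, k=10, seed=0)
train, val, test = trial_split(folds, t=0)
ideal = ideal_confusion_matrix(114, 121, repetitions=10)
```

Every subject appears in exactly one test fold per run, so summing the test-set confusion matrices over the 10 trials of a run accounts for all 235 subjects once.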

5.2. Evaluation

The evaluation was performed on the realistic confusion matrix of 10 × 10-fold cross validation. Suppose the positive class is alcoholism, and the negative class is the control. We can define true positive (TP) as alcoholism correctly identified, true negative (TN) as control correctly identified, false positive (FP) as control mispredicted as alcoholism, and false negative (FN) as alcoholism mispredicted as control. Finally, we define three measures: sensitivity (Sen), specificity (Spc), and accuracy (Acc).
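Using the standard definitions Sen = TP / (TP + FN), Spc = TN / (TN + FP), and Acc = (TP + TN) / (TP + TN + FP + FN), the three measures can be computed from a confusion matrix as in the short sketch below. The function name is hypothetical, and the second set of counts is arbitrary illustrative input, not a result of this study.

```python
def classification_measures(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sen = tp / (tp + fn)                   # alcoholism correctly identified
    spc = tn / (tn + fp)                   # controls correctly identified
    acc = (tp + tn) / (tp + tn + fp + fn)  # all correct predictions
    return sen, spc, acc


# Ideal 10 x 10-fold confusion matrix: every measure equals 1.
ideal = classification_measures(tp=1140, fn=0, fp=0, tn=1210)

# Arbitrary counts chosen only to show the arithmetic.
toy = classification_measures(tp=90, fn=10, fp=20, tn=80)
```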

5.3. Grid Searching

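In the grid searching, the criterion uses the “accuracy (Acc)” measure defined above. The implementation is explained in Table 2. For the wavelet decomposition level $j$, a simple grid search from 1 to 5 with an increment of 1 was used, since $j$ should be an integer.

For the order $\alpha$ of Renyi entropy, a coarse-to-fine searching strategy was used. First, a coarse grid was set from 0 to 6 with an increment of 1, and we obtained the coarse candidate $\alpha_c$:

$$\alpha_c = \arg\max_{\alpha \in \{0, 1, \ldots, 6\}} \mathrm{Acc}(\alpha).$$

Then, a fine grid was set from $\alpha_c - 0.5$ to $\alpha_c + 0.5$ with an increment of 0.1, and the optimal order $\alpha^{*}$ is obtained as

$$\alpha^{*} = \arg\max_{\alpha \in \{\alpha_c - 0.5,\ \alpha_c - 0.4,\ \ldots,\ \alpha_c + 0.5\}} \mathrm{Acc}(\alpha).$$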
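The coarse-to-fine strategy can be sketched as below. In the real pipeline the score would be the cross-validated accuracy obtained with a given Renyi order; here a toy function with a known peak stands in for it, so `toy_accuracy` and `coarse_to_fine_search` are purely illustrative names and the toy values are not experimental results.

```python
def coarse_to_fine_search(score, coarse_grid, half_width=0.5, step=0.1):
    """Pick the best coarse value, then refine on a finer grid around it."""
    coarse_best = max(coarse_grid, key=score)
    n_steps = int(round(2 * half_width / step))
    # Rounding removes floating-point drift from repeated additions.
    fine_grid = [round(coarse_best - half_width + i * step, 10)
                 for i in range(n_steps + 1)]
    fine_best = max(fine_grid, key=score)
    return coarse_best, fine_best


def toy_accuracy(alpha):
    """Stand-in score with a single peak at alpha = 1.2."""
    return 1.0 - (alpha - 1.2) ** 2


coarse, fine = coarse_to_fine_search(toy_accuracy, range(0, 7))
```

The fine stage evaluates only 11 additional values, whereas a single grid from 0 to 6 with an increment of 0.1 would require 61 evaluations of the full cross validation.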

6. Results and Discussions

Our programs were developed in-house. The experiments ran on a Dell laptop with a 2.20 GHz Intel Core i7-4702HQ CPU and 16 GB RAM. The operating system was Windows 10. MATLAB 2017a was the programming development environment.

6.1. Statistical Analysis

The cross validation divides the dataset into 10 sets. In each run, the division into sets is different. We set the wavelet decomposition level as 4 and the Renyi order $\alpha$ as 1.2. The maximum iterative epoch is set as 1000, and the population in the Jaya algorithm is set to 20. The sensitivity, specificity, and accuracy of our method are listed in Tables 3, 4, and 5, respectively. We can observe our proposed method achieved a sensitivity of %, a specificity of %, and an accuracy of %.

6.2. Comparison to State-of-the-Art Approaches

We compared the proposed method “WRE + FNN + TSE-Jaya” with four state-of-the-art approaches: FA + PNN [2], IPSO method [3], HMI + SVM [5], and PZM [6]. All the algorithms were run with a 10 × 10-fold cross validation over our dataset. The results of 10 × 10-fold cross validation of the four state-of-the-art methods are shown in Table 6. Finally, the comparison result is presented in Table 7.

6.3. Optimal Wavelet Decomposition Level

In this experiment, we fixed the Renyi order to 1.2 and let the wavelet decomposition level change from 1 to 5 with increase of 1. The corresponding accuracy varied as shown in Figure 8. Here, the accuracy is 85.45%, 91.45%, 93.11%, 93.66%, and 87.96% when decomposition level is 1, 2, 3, 4, and 5, respectively. Obviously, the 4th-level decomposition yields the greatest accuracy; hence, we chose the optimal decomposition level as 4. The Renyi entropy was then calculated over all the subbands of this 4-level wavelet decomposition.

Note that a 3D-DWT over the whole volume is more straightforward than a 2D-DWT over one particular slice. Nevertheless, our aim in this study is to select a distinguishing slice, which is related to brain regions affected by alcoholism, in order to reduce the computation burden. In the future, we shall test the results of 3D-DWT.

6.4. Optimal Renyi Order

In this experiment, we illustrate why we set the Renyi order as 1.2 by a coarse-to-fine grid search. First, we searched the coarse grid from 0 to 6 with an increment of 1. The result is shown in Figure 9(a), and the value of 1 was selected as the initial point for the fine grid search. Second, the fine grid from 0.5 to 1.5 with an increment of 0.1 was established, and the result is shown in Figure 9(b). We can observe that $\alpha = 1.2$ yields the greatest accuracy.

6.5. Effectiveness of Three-Segment Encoding

Our proposed TSE-Jaya can train the weights, biases, and number of hidden neurons (NHN) simultaneously. In this experiment, we compare TSE-Jaya to plain Jaya algorithm [10], which can only train the weights and biases of feedforward neural network [10]; hence, we have to fix the number of hidden neurons by experience. Here, we set NHN as 10 for plain Jaya algorithm. The settings of other parameters were the same as previous experiments. The results of 10 × 10-fold cross validation of plain Jaya [10] are shown in Table 8, and the comparison between Jaya [10] and our proposed TSE-Jaya is shown in Table 9.

The superiority of proposed TSE-Jaya to plain Jaya [10] is clear. This demonstrates the importance of choosing the optimal number of hidden neurons, that is, that the variable number of hidden neurons gives a better performance than fixed number of hidden neurons, which is also validated by Carleo and Troyer (2017) [35].

6.6. Training Algorithm Comparison

To demonstrate the efficiency of the proposed algorithm, we compared TSE-Jaya with several global optimization algorithms, including MOGA [11], PSO [12], BCO [13], MACS [14], and RCBBO [15]. All the settings of common controlling parameters are the same: the maximum iterative epoch is set as 1000, and the population in all algorithms is set to 20. The algorithm-specific parameters of those five comparison algorithms are assigned by experience. The results of those five training algorithms over 10 × 10-fold cross validation are shown in Table 10, and the final comparison with our proposed TSE-Jaya is shown in Table 11.

Table 11 shows that the proposed TSE-Jaya performed the best among all six algorithms. PSO [12] and RCBBO [15] ranked second and third, with accuracies over 90%. BCO [13] ranked fourth, MACS [14] ranked fifth, and MOGA [11] performed the worst. The efficiency of the proposed approach can be explained from two aspects: (i) Jaya does not need algorithm-specific parameters to be set, making it more reliable than the other algorithms. (ii) The TSE allows the number of hidden neurons to vary at each run.

6.7. Validation of the Selected Slice

In this experiment, we validated the selection of the 80th slice in terms of classification performance. We set a range of $Z$ increasing from 30 to 150 with an increment of 10, as shown in Figure 10.

Other settings were the same as in the previous experiments. Again, 10 repetitions of 10-fold cross validation were utilized. The curve of accuracy is drawn in Figure 11. It is observed that the 80th slice gives the highest accuracy among all candidate slices. The reason is that this slice contains the enlarged ventricle and the shrunk gray matter caused by alcoholism. Admittedly, the hippocampus [36] and striatum [37] are also related to alcoholism. Nevertheless, their altered volumes are relatively small and hence do not provide an excellent performance in this task.

In this case, the optimal slice could lie in a plane perpendicular to the $x$- or $y$-axis, or it could even be a plane oblique to all three axes. Here, we choose a slice perpendicular to the $z$-axis for the convenience of radiologists, since they usually read axial slices. In the future, we shall develop techniques to handle multiple slices, and we may develop surface analysis techniques [38].

7. Conclusions

In this study, we proposed a novel computer-vision-based method for distinguishing alcoholism from healthy controls. Our method was based on three components: the proposed wavelet Renyi entropy, a feedforward neural network, and the proposed three-segment encoded Jaya algorithm. The experiments showed that our method achieved a sensitivity of %, a specificity of , and an accuracy of over a 10 × 10-fold cross validation. The performance is superior to four state-of-the-art alcoholism detection algorithms. We validated the optimal wavelet decomposition level to be 4 and the optimal Renyi order to be 1.2. Besides, compared with existing global optimization approaches, the proposed three-segment encoded Jaya was shown to perform better than plain Jaya and another five training algorithms. Finally, we validated the reason why we chose the 80th slice.

The shortcomings of our method lie in two aspects. First, our method needs to scan the whole brain and select the 80th slice along the $z$-axis. Second, only the wavelet Renyi entropy was extracted; in the future we shall try to find more efficient features.

In future work, we will first develop classifiers based on multisource data, such as facial images, EEG, and spectrum data, and explore the changes of functional connectivity in the alcoholic brain. Second, other image features, such as histogram of oriented gradient, will be tested. Third, deep learning methods will be tested.

Conflicts of Interest

The authors have no conflicts of interest to disclose with regard to the subject matter of this paper.

Acknowledgments

This study was supported by National Natural Science Foundation of China (61602250), Natural Science Foundation of Jiangsu Province (BK20150983), Project of Science and Technology of Henan Province (172102210272), Program of Natural Science Research of Jiangsu Higher Education Institutions (16KJB520025), and Open Fund of Key Laboratory of Guangxi High Schools Complex System and Computational Intelligence (2016CSCI01).