Abstract

Because treatment strategies differ, it is extremely important to differentiate between glioblastoma multiforme (GBM) and brain metastases (MET). It is often difficult to distinguish GBM from MET on MRI because of their similar imaging appearance. Despite the importance of magnetic resonance imaging in detecting, characterizing, and monitoring brain tumors, surgical methods are still necessary for definitive diagnosis. We introduce an accurate, convenient, and user-friendly method to differentiate between GBM and MET through routine MRI sequences and radiomics analysis. We collected data from 91 patients at one institution, including 50 with GBM and 41 with MET, all proven pathologically. The tumors were segmented separately on all MRI images (T1-weighted imaging (T1WI), contrast-enhanced T1-weighted imaging (T1C), T2-weighted imaging (T2WI), and fluid-attenuated inversion recovery (FLAIR)) to form the volume of interest (VOI). Eight ML models and feature reduction strategies were evaluated on these routine MRI sequences using two approaches: radiomics without wavelet transform (first model) and radiomics with wavelet transform (second model). The optimal model was selected based on each model’s accuracy, AUC-ROC, and F1-score. The second model achieved 0.98, 0.99, and 0.98 for accuracy, AUC-ROC, and F1-score, respectively, a better result than the first model. In most of the investigated models, the multidimensional wavelet model improved significantly on the model without multidimensional wavelets. The multidimensional discrete wavelet transform can analyze hidden features of the MRI from a different perspective and generate accurate features that are highly correlated with model accuracy.

1. Introduction

Glioblastoma multiforme (GBM) and brain metastases (MET) are the most common malignant brain tumors in adults [1, 2]. The distinction between these two types of tumors is crucial to subsequent diagnostic and therapeutic planning [3, 4], as is an accurate diagnosis of the tumor’s source and extent [5, 6]. Treatment strategies for these tumors differ: total en bloc resection is preferred for MET, and stereotactic radiosurgery can be used for MET smaller than 3 to 4 cm, whereas GBM should be treated with maximal resection followed by molecular classification and simultaneous chemoradiotherapy [7–9]. Definitive diagnosis of GBM and MET is based on histopathological biopsy, an invasive procedure that is particularly dangerous for older adults and for tumors near eloquent areas [9, 10]. In routine magnetic resonance imaging (MRI), both GBM and MET show peripheral edema and ring enhancement [11, 12]. Although the two lesions have different treatment strategies, their similar radiological appearance makes them difficult to differentiate. Many years of radiological research have focused on accurately distinguishing these two lesions [11, 13, 14].

A radiomics study may provide pathophysiological insights otherwise hidden in quantitative imaging data [12]. Radiomics research uses features such as size, shape, image intensity, and voxel relationships [4–8]. A wide range of problems can be solved using multidimensional wavelets [15, 16]. The discrete wavelet transform (DWT) is an effective way to reveal hidden characteristics of signals [15, 16]. The wavelet transform is a robust tool in signal processing and can provide deep and precise insight into the structure of a signal [15, 16].

With new machine learning (ML) algorithms, radiomics analysis could be more precise, accurate, and convenient for clinical reports [14, 6]. These algorithms can create a predictive model based on the unique patterns found in the data [6–10]. After training, the machine can accurately identify the tumor type in a new sample and significantly support clinical decisions. In MRI images, brain tumors can be categorized in many ways [10–14]. Among the most prominent methods are fuzzy c-means (FCM) clustering, support vector machines (SVM), artificial neural networks (ANN), knowledge-based techniques, and the expectation-maximization (EM) algorithm [14, 17–19].

It has been reported that advanced MRI techniques, such as perfusion-weighted imaging (PWI), diffusion tensor imaging (DTI), MR spectroscopy (MRS), and amide proton transfer-weighted imaging, play vital roles in distinguishing GBM from MET; however, these advanced techniques may not be included in all standard MRI protocols [3, 12, 20]. In some cases, no single finding can guide clinical practice because of diagnostic uncertainty. Previous studies combined conventional MRI (cMRI), diffusion-weighted imaging (DWI), and 18F-FDG-PET images to establish different radiomics models for differentiating MET from GBM and found that the integrated model based on cMRI, DWI, and 18F-FDG-PET had the best discriminatory power. However, advanced sequences like DWI are not as widely available in the clinic as cMRI [3, 12, 20]. Moreover, the radiomics literature shows that different classifiers produce different outputs [16], so choosing the best model is complex [36]. It is therefore necessary to build and compare various models to achieve the best result [3, 12, 20].

In this study, we have used standard MRI sequences (T1-weighted imaging (T1WI), contrast-enhanced T1-weighted imaging (T1C), T2-weighted imaging (T2WI), and fluid-attenuated inversion recovery (FLAIR)) to extract the features from the dataset. By combining these simple features with DWT, we developed suitable feature vectors that can be used in machine learning algorithms to differentiate between GBM and MET patients. The objective of our study was to develop a convenient, accurate, and stable predictive model to aid clinical investigations in differentiating MET from GBM without surgery.

2. Materials and Methods

2.1. Dataset and Patient Population

This retrospective study was approved by the Tarbiat Modares University Institutional Review Board, and informed consent from the patients was waived (IR.MODARES.REC.1400.076).

In all cases, the pathology diagnosis was based on WHO standards and was obtained from Hazrate Rasool Akram Hospital, affiliated with Tarbiat Modares University [21]. We have collected MRI data from 91 patients (GBM: 51, MET: 40) based on their pathological confirmation.

Patients with biopsy confirmation of GBM or MET were excluded if they had any of the following conditions:

(1) Strokes and infections of the intracranial space
(2) The use of antitumor treatments prior to MR scanning, such as brain surgery, chemotherapy, or radiation
(3) Inadequate electronic medical records

2.2. Image Acquisition

The MR scans were performed using the 1.5T Siemens Trio Scanners in the MR Research Center. In this work, we have concentrated on conventional MR sequences, including T1-weighted imaging (T1WI), contrast-enhanced T1-weighted imaging (T1C), T2-weighted imaging (T2WI), and fluid-attenuated inversion recovery (FLAIR). Patients with intracranial tumors undergo these examinations regularly.

2.3. Segmentation and Feature Extraction

All T1WI, T2WI, FLAIR, and T1C images (matrix size:, slice mm, and slice mm) were transferred from the picture archiving and communication system (PACS) to 3D Slicer [16]. On these images, two radiologists (Reader 1 and Reader 2; 10 years of experience), blinded to the grouping, manually delineated regions of interest (ROIs) along the edge of the tumor. The tumors were segmented separately on all MRI images (T1-weighted imaging (T1WI), contrast-enhanced T1-weighted imaging (T1C), T2-weighted imaging (T2WI), and fluid-attenuated inversion recovery (FLAIR)) to form the volume of interest (VOI) [7–12]. In each slice, ROIs were drawn along the tumor margin to encompass the entire tumor area [14, 17, 22–24].

The preprocessing and feature extraction of the images were performed using Pyradiomics (http://pyradiomics.readthedocs.io/en/latest/index.html) [20, 24–28]. Voxel size resampling (1 × 1 × 1) and a bin width of 64 were applied to the images. Pyradiomics was used to extract radiomic features from the three-dimensional region of interest (3D ROI) of each lesion [29–34]. From each sequence, 107 features were extracted (Table 1) and grouped into three categories: first-order statistics, shape-based features, and textural features. The textural feature category includes GLCM, GLRLM, GLSZM, GLDM, and NGTDM features [34–37]. Both the training (70% of the data) and validation (30% of the data) groups were normalized using z-scores.

Intraobserver and interobserver intraclass correlation coefficients (ICCs) were used to measure the reproducibility of each feature [18, 19]. To assess intraobserver reliability, Reader 1 and Reader 2 independently performed image segmentation twice weekly. Significant radiomic features were then selected in the following steps [18, 19]. First, features with intraobserver and interobserver ICCs over 0.75 were retained. Next, LASSO logistic regression was performed with 10-fold cross-validation. Finally, to generate the machine learning inputs, all selected features from all image series were arranged in a single row after dimensionality reduction.
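The reproducibility filter and the z-score normalization described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the ICC form (two-way random effects, absolute agreement, single rater, often written ICC(2,1)), the use of training-set statistics for normalizing the validation set, the function names, and the synthetic reader data are all assumptions made for the example.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1) for a (subjects x raters) matrix (assumed ICC form)."""
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape
    grand = y.mean()
    row_means = y.mean(axis=1)
    col_means = y.mean(axis=0)
    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)   # between subjects
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)   # between raters
    resid = y - row_means[:, None] - col_means[None, :] + grand
    mse = (resid ** 2).sum() / ((n - 1) * (k - 1))         # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def reproducible_features(reader1, reader2, threshold=0.75):
    """Indices of features whose two-reader ICC exceeds the threshold."""
    keep = []
    for j in range(reader1.shape[1]):
        pair = np.column_stack([reader1[:, j], reader2[:, j]])
        if icc_2_1(pair) > threshold:
            keep.append(j)
    return keep

def zscore_with_train_stats(train, valid):
    """Z-score both sets using the training mean and standard deviation."""
    mu = train.mean(axis=0)
    sd = train.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant features
    return (train - mu) / sd, (valid - mu) / sd

# Synthetic example: 30 lesions, 3 features, two readers.
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 3))
reader1 = truth + rng.normal(scale=0.05, size=truth.shape)
reader2 = truth + rng.normal(scale=0.05, size=truth.shape)
reader2[:, 2] = rng.normal(size=30)  # readers disagree on feature 2

kept = reproducible_features(reader1, reader2)
train_z, valid_z = zscore_with_train_stats(reader1[:21, kept], reader1[21:, kept])
print("kept feature indices:", kept)
```

Computing the normalization statistics on the training group only is a common way to keep information from the validation group out of model fitting; the text does not state which statistics were used.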

2.4. First Radiomics Model Establishment

Eight ML algorithms were imported from the scikit-learn library in Python to establish the models [15, 16, 25, 26, 38]. These algorithms were Support Vector Machine (SVM), Naïve Bayes (NB), Multilayer Perceptron (MLP), Decision Tree (DT), AdaBoost (ADA), K-Nearest Neighbor (KNN), Logistic Regression (LR), and Random Forest (RF) [15, 16, 25, 26, 38].

The features selected by LASSO were used as input to these ML algorithms, and the predictive ability of each algorithm was primarily assessed using the area under the receiver operating characteristic (ROC) curve (AUC) [15, 16, 25, 26, 38].
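As a rough sketch of this step, the eight classifier families can be instantiated from scikit-learn and compared by AUC on a held-out split. The hyperparameters, the standardization step, the 70/30 split seed, and the synthetic data from `make_classification` are assumptions made for illustration; they are not the settings or data of the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def build_models(seed=0):
    """One default-configured instance per classifier family (assumed settings)."""
    return {
        "SVM": SVC(random_state=seed),
        "NB": GaussianNB(),
        "MLP": MLPClassifier(max_iter=2000, random_state=seed),
        "DT": DecisionTreeClassifier(random_state=seed),
        "ADA": AdaBoostClassifier(random_state=seed),
        "KNN": KNeighborsClassifier(),
        "LR": LogisticRegression(max_iter=1000),
        "RF": RandomForestClassifier(random_state=seed),
    }

def continuous_scores(model, X):
    """Class-1 probability when available, otherwise the decision function."""
    if hasattr(model, "predict_proba"):
        return model.predict_proba(X)[:, 1]
    return model.decision_function(X)

# Synthetic stand-in for the selected radiomic features (91 samples).
X, y = make_classification(n_samples=91, n_features=12, n_informative=6,
                           random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)

auc_by_model = {}
for name, clf in build_models().items():
    model = make_pipeline(StandardScaler(), clf)
    model.fit(X_tr, y_tr)
    auc_by_model[name] = roc_auc_score(y_va, continuous_scores(model, X_va))

for name, auc in sorted(auc_by_model.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc:.3f}")
```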

2.5. Second Radiomics Model Establishment (Wavelet-Based Features)

In the second model, the features selected by LASSO were used as input to a multidimensional discrete wavelet transform. In this transform, the low-pass filter generates the approximation coefficients and the high-pass filter generates the detail coefficients. The approximation coefficients are the most similar signal to the original signal, and the detail coefficients consist of three matrices: horizontal, vertical, and diagonal (cH1, cV1, cD1) [15, 16]. To provide a wide range of feature vectors, we considered 31 wavelet filter banks from four distinct families: bior1.3, bior1.5, bior2.2, bior2.4, bior2.6, bior3.1, bior3.3, bior3.5, bior3.7, bior4.4, bior5.5, db2, db3, db4, db5, db6, db7, db8, db9, sym2, sym3, sym4, sym5, sym6, sym7, sym8, coif1, coif2, coif3, coif4, and coif5. These are among the most conventional filter banks available in Python. The features extracted from 3D Slicer were initially treated as 4-dimensional signals and used as input signals to the multidimensional discrete wavelet transform. The approximation and detail coefficients were subsequently generated and saved in a Python array.

We calculated eleven criteria from the approximation and detail coefficient matrices. Seven of them (maximum, minimum, average, median, standard deviation, Shannon entropy, and signal energy) were applied to each of the approximation and detail coefficient matrices, giving 28 features (7 × 4). Two more, the standard error and the slope between the approximation coefficient matrix and each detail coefficient matrix, generated six further features (2 × 3). Finally, two criteria, signal energy and wavelength, were computed for the whole original input signal, giving 2 more features and a total of 36 features for every 3D Slicer output sample.

These features were saved in a text file and used as the input file for the machine learning model. This procedure was repeated for each of the 31 wavelet filter banks, with a separate feature vector file for each filter bank. Each filter bank therefore had a unique profile, giving 31 different profiles for each sample. The eight ML algorithms used these profiles separately as input. Finally, we report the best result as our proposed model.
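The construction of the 36-element feature vector can be illustrated with a short sketch. Several choices here are assumptions, not details given in the text: a hand-written single-level 2D Haar transform stands in for the filter banks listed above (for the other banks, a wavelet library would be needed; in PyWavelets, for example, `pywt.dwt2(x, "db5")` returns the same four sub-bands), Shannon entropy is computed on normalized squared coefficients, the slope and standard error come from a least-squares line of each detail band against the approximation band, and "wavelength" is taken to be the number of samples in the input.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT (stand-in filter bank; even-sized input)."""
    x = np.asarray(x, dtype=float)
    s = np.sqrt(2.0)
    lo = (x[0::2, :] + x[1::2, :]) / s   # low-pass across row pairs
    hi = (x[0::2, :] - x[1::2, :]) / s   # high-pass across row pairs
    cA = (lo[:, 0::2] + lo[:, 1::2]) / s  # approximation
    cH = (hi[:, 0::2] + hi[:, 1::2]) / s  # detail band 1 (naming is a convention)
    cV = (lo[:, 0::2] - lo[:, 1::2]) / s  # detail band 2
    cD = (hi[:, 0::2] - hi[:, 1::2]) / s  # detail band 3
    return cA, (cH, cV, cD)

def shannon_entropy(c):
    """Entropy of the normalized squared coefficients (assumed definition)."""
    p = np.ravel(c) ** 2
    total = p.sum()
    if total == 0:
        return 0.0
    p = p[p > 0] / total
    return float(-(p * np.log2(p)).sum())

def band_stats(c):
    """The seven per-band criteria named in the text."""
    return [c.max(), c.min(), c.mean(), np.median(c), c.std(),
            shannon_entropy(c), float((c ** 2).sum())]

def slope_and_stderr(approx, detail):
    """Least-squares slope and residual standard error (assumed definition)."""
    a, d = approx.ravel(), detail.ravel()
    slope, intercept = np.polyfit(a, d, 1)
    resid = d - (slope * a + intercept)
    return [slope, np.sqrt((resid ** 2).sum() / (a.size - 2))]

def wavelet_feature_vector(x):
    x = np.asarray(x, dtype=float)
    cA, details = haar_dwt2(x)
    feats = []
    for band in (cA, *details):
        feats += band_stats(band)            # 7 x 4 = 28 features
    for d in details:
        feats += slope_and_stderr(cA, d)     # 2 x 3 = 6 features
    feats += [float((x ** 2).sum()), float(x.size)]  # whole-signal energy, length
    return np.array(feats, dtype=float)

# Synthetic input: 4 sequences x 8 selected features (shape is an assumption).
rng = np.random.default_rng(0)
signal = rng.normal(size=(4, 8))
features = wavelet_feature_vector(signal)
print(features.shape)  # (36,)
```

Because the Haar transform used here is orthonormal, the total energy of the four sub-bands equals the energy of the input, which gives a simple sanity check on the decomposition.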

2.6. Evaluation Method

Given the dataset size, we used 5-fold cross-validation. To avoid bias, we repeated the 5-fold cross-validation 100 times and took the average over the iterations as the model’s output. Classification performance was evaluated using three indicators: area under the receiver operating characteristic curve (AUC), accuracy (ACC), and F1-score. Finally, a pairwise test was applied to compare the ROC curves obtained from the different classifiers; a statistically significant result indicates that the two ROC curves differ [20, 24, 25, 38–40]. The workflows are shown in Figures 1 and 2.
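A repeated 5-fold evaluation of this kind can be sketched with scikit-learn as below. Stratified folds, the logistic regression pipeline, the random seed, and the synthetic data are assumptions chosen for the example; the text specifies only 5 folds, 100 repetitions, and averaging of accuracy, AUC, and F1-score.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def repeated_cv_scores(model, X, y, n_splits=5, n_repeats=100, seed=0):
    """Mean accuracy, AUC, and F1 over n_repeats runs of n_splits-fold CV."""
    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats,
                                 random_state=seed)
    scoring = {"accuracy": "accuracy", "auc": "roc_auc", "f1": "f1"}
    out = cross_validate(model, X, y, cv=cv, scoring=scoring)
    return {name: float(out[f"test_{name}"].mean()) for name in scoring}

# Synthetic stand-in for one feature profile (91 samples).
X, y = make_classification(n_samples=91, n_features=12, n_informative=6,
                           random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
mean_scores = repeated_cv_scores(model, X, y)
print(mean_scores)
```

Placing the scaler inside the pipeline means it is refitted on the training folds of every split, so the held-out fold does not influence the normalization.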

3. Result

3.1. First Radiomics Model Result

We report the results of our models using a set of conventional parameters: F1-score, accuracy, and AUC, which are commonly used to express model performance in machine learning research. In the first model, the selected features obtained from 3D Slicer were used with SVM, NB, MLP, DT, ADA, KNN, LR, and RF to establish the predictive models. The random forest algorithm achieved the highest accuracy among these models (Table 2).
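For reference, the three reported parameters can be computed from labels and classifier scores as in the following sketch. The label coding (1 for one tumor class, 0 for the other), the 0.5 decision threshold, and the toy values are assumptions for illustration only.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels coded 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (values are illustrative, not study data).
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(round(accuracy(y_true, y_pred), 3))  # 0.6
print(round(f1_score(y_true, y_pred), 3))  # 0.667
print(round(auc(y_true, scores), 3))       # 0.833
```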

3.2. Second Radiomics Model Result

For the second model, we used the selected feature vector from 3D Slicer as the multidimensional DWT input and extracted the approximation and detail coefficients. After concatenating the approximation and detail coefficient matrices, the criteria described in Section 2.5 were calculated. The same eight ML algorithms were applied to establish the predictive models. We calculated the accuracy, F1-score, and AUC of the ROC curve for the 31 filter banks, as presented in Tables 3–5.

Our results showed that the second model performed better than the first model (sig ≤ 0.05). In the second model, the db5 filter bank combined with logistic regression achieved the best result: 0.98, 0.99, and 0.98 for accuracy, AUC-ROC, and F1-score, respectively. We therefore propose the db5 wavelet with logistic regression as our model for distinguishing GBM from MET on MRI sequences. The differences between the two models are shown in Tables 6–8.

4. Discussion

We explored the diagnostic performance of radiomics using traditional machine learning classifiers for differentiating GBM from single MET. The wavelet radiomics performed better than the best-performing traditional radiomics and demonstrated good generalizability in the testing data. Four MRI sequences were used, 107 features were extracted from each sequence, and all 428 features were used as LASSO inputs to select the best features. The selected features were used in two ways: first, directly as ML input; second, transformed with 31 wavelet filter banks and then input to the ML algorithms. To find the best combination of classification procedures, extensive comparisons were performed across 31 types of wavelet features and eight classification algorithms (AdaBoost, KNN, Gaussian NB, DT, LR, SVM, RF, and MLP).
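The size of this comparison follows directly from the two lists: 31 filter banks times 8 classifiers gives 248 combinations. A small sketch of the enumeration is shown below; the `best_combination` helper and the toy scoring function are hypothetical placeholders and do not reproduce any result of the study.

```python
from itertools import product

# The 31 filter banks and eight classifiers named in the text.
FILTER_BANKS = (
    ["bior1.3", "bior1.5", "bior2.2", "bior2.4", "bior2.6", "bior3.1",
     "bior3.3", "bior3.5", "bior3.7", "bior4.4", "bior5.5"]
    + [f"db{i}" for i in range(2, 10)]
    + [f"sym{i}" for i in range(2, 9)]
    + [f"coif{i}" for i in range(1, 6)]
)
CLASSIFIERS = ["ADA", "KNN", "NB", "DT", "LR", "SVM", "RF", "MLP"]

def best_combination(score_fn):
    """Return the (filter bank, classifier) pair that maximizes score_fn."""
    return max(product(FILTER_BANKS, CLASSIFIERS),
               key=lambda pair: score_fn(*pair))

combos = list(product(FILTER_BANKS, CLASSIFIERS))
print(len(FILTER_BANKS), len(CLASSIFIERS), len(combos))  # 31 8 248

# Placeholder scorer for demonstration only; a real scorer would return
# a cross-validated metric for the given pair.
def toy_score(filter_bank, classifier):
    return 1.0 if (filter_bank, classifier) == ("sym4", "KNN") else 0.0

print(best_combination(toy_score))  # ('sym4', 'KNN')
```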

Moreover, the multidimensional discrete wavelet transform could reveal hidden features of the data. It provides approximation and detail coefficient matrices: the approximation coefficients are the most similar to the original signal, while the detail coefficients contain the horizontal, vertical, and diagonal details. We found that wavelet feature extraction, a critical component of the classification, enables us to distinguish different tumor characteristics using different feature types.

According to our findings, wavelet radiomics-based ML can successfully discriminate GBM and MET. LR was judged to be the most effective model. Another important finding was that the top models with wavelet features outperformed the models without wavelet features in diagnostic performance (Tables 6–8).

Our study’s most important outcome was identifying appropriate discriminative models for lesions in the brain. LR (AUC 0.99) and RF (AUC 0.97) are both high-performing models for GBM and MET classification.

A number of previous studies have used multiparametric MRI data to discriminate between GBM tumors and MET, including advanced imaging methods such as diffusion, perfusion, and MR spectroscopy [16]. It should be noted that advanced imaging is not incorporated into all MRI protocols across all sites and is highly dependent on the acquisition and analysis method [14, 34, 41]. Therefore, it is important to be able to classify brain tumors based on common sequences [14, 34, 41]. Some previous studies attempted to distinguish between different types of brain tumors based on a single contrast; however, these studies were carried out on relatively small populations and were limited to data obtained from a specific MRI system [14, 34].

Cho et al. [42] studied radiomics and ML in glioma grading classification and showed that RF and LR were high-performing models. In our investigation, the LR model with db5 wavelet features performed better (AUC 0.99) than the LR model in the study by Cho et al., which had an AUC of 0.95. These findings suggest that the wavelet features of the proposed model improve performance. Our results were also better than those of a previous study that examined various ML models. Priya et al. [22] used LASSO and Elastic Net models to classify brain tumors; our results were higher with wavelet-based features (0.99 vs. 0.95) and lower without wavelet-based features (0.95 vs. 0.93).

Ning et al. [43] examined seven ML classifiers and five feature reduction techniques using radiomics features derived from T1-CE and T2W images and obtained an AUC of 0.890 with an accuracy of 83 percent. The second part of their work used deep neural networks, which achieved an AUC of 0.95 and an accuracy of 89%. Although we did not analyze deep neural networks (DNN) because of their computational cost, our findings were comparable to those of Ning et al.

Su et al. [17] extracted features from the T1-CE sequence, evaluated 30 combinations of models and feature reduction methods, and obtained an AUC of 0.80 for their radiomics model. Using 248 combinations of wavelet feature series and classifiers, we achieved a higher AUC of 0.99 and an accuracy of 98 percent. Extracting wavelet features from the MRI sequences improved our results.

Wavelet features are an essential element of this model’s performance. The improved effectiveness of the 31 types of wavelet features compared with the original features (without wavelets) was a key conclusion of our research. The LR and RF models provide a greater generalizability benefit than the other ML models. The best-performing wavelets for ADA, KNN, NB, DT, LR, SVM, RF, and MLP were db8, coif2, bior2.6, bior1.5, db5, db2, bior1.5, and coif3, respectively.

Adding wavelets to the analysis distinguishes MET and GBM better than the radiomics model based solely on cMRI features. Our study therefore offers one way to address the problem of poor model performance in radiomics research. We propose a helpful prediction model based on routine MRI sequences (T1WI, T2WI, T1C, and FLAIR) that can be used in clinical practice.

4.1. Limitation

Our research has a few limitations that need to be addressed. First, due to the retrospective nature of the investigation, several patients’ clinical information and MRI sequences were missing, resulting in a smaller sample size. Second, although normalization was implemented during data preprocessing, variation across MRI scan protocols was unavoidable. Third, the image segmentation approach used in this work relied on manual delineation, which might be replaced in the future by automated delineation of ROIs using deep learning methods to enhance the model’s reliability. Fourth, since our conclusion that radiomics characteristics may represent changes in the tumor microenvironment was based on circumstantial evidence, further validation and experimental confirmation are required. Finally, our study did not include advanced sequences, such as DWI or perfusion MR imaging.

5. Conclusion

We found that radiomics-based ML can accurately classify GBM and MET. In this study, the best results were obtained with the LR algorithm and the db5 wavelet, which can be considered acceptable for data with few samples. The performance of a model may vary with the combination of classifier and feature types used, so a complete model selection method should be applied. In addition, models applied to the combined extracted features performed well compared with models applied to the features separately.

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Ethical Approval

The local ethics committee approved this study (IR.MODARES.REC.1400.076).

Informed consent was received from all participants.

Conflicts of Interest

The authors declare that they have no competing interests.

Authors’ Contributions

The authors confirm their contribution to the paper as follows: Salar Bijari, Amin Jahanbakhshi, Parham Hajishafiezahramini, and Parviz Abdolmaleki were responsible for study conception and design; Salar Bijari and Amin Jahanbakhshi were responsible for data collection; Parviz Abdolmaleki and Salar Bijari were responsible for analysis and interpretation of results; Salar Bijari and Amin Jahanbakhshi were responsible for draft manuscript preparation. All authors reviewed the results and approved the final version of the manuscript.

Acknowledgments

I would like to say a special thank you to S. Sayfollahi for her support, guidance, and overall insights in this field, which have made this an inspiring experience for me. I would also like to thank all of the patients who participated in the study. Finally, I would like to thank my family for supporting me during the compilation of this dissertation.