Abstract

Accurate tumor, node, and metastasis (TNM) staging, especially N staging in gastric cancer (that is, the diagnosis of lymph node metastasis), is an active topic in clinical medical image analysis, in which gemstone spectral imaging (GSI) can provide doctors with more information than conventional computed tomography (CT). In this paper, we apply machine learning methods to the GSI analysis of lymph node metastasis in gastric cancer. First, we use feature selection or metric learning methods to reduce the dimensionality of the feature space. We then employ the K-nearest neighbor classifier to distinguish lymph node metastasis from non-lymph node metastasis. The experiment involved 38 lymph node samples in gastric cancer and showed an overall accuracy of 96.33%. This diagnostic accuracy for lymph node metastasis is high compared with that of traditional diagnostic methods, such as helical CT (sensitivity 75.2% and specificity 41.8%) and multidetector computed tomography (accuracy 82.09%). GSI-CT can therefore be the optimal choice for the preoperative N staging of patients with gastric cancer.

1. Introduction

According to the global cancer statistics in 2011, an estimated 989,600 new stomach cancer cases and 738,000 deaths occurred in 2008, which account for 8% of the total cases and 10% of the total deaths. Over 70% of the new cases and deaths were recorded in developing countries [1, 2]. The most commonly used staging system is the American Joint Committee on Cancer Tumor, Node, and Metastasis (TNM) system [3–5]. The two most important factors that influence survival among patients with resectable gastric cancer are the depth of cancer invasion through the gastric wall and the number of lymph nodes present. In areas not screened for gastric cancer, late diagnosis reveals a high frequency of nodal involvement. Even in early gastric cancer, the incidence of lymph node metastasis exceeds 10%; the overall incidence was reported to be 14.1%, ranging from 4.8% to 23.6% depending on cancer depth [6]. The lymph node status must be evaluated preoperatively for proper treatment. However, the available modalities have not yet produced sufficiently accurate results. The lymph node status is one of the most important prognostic indicators of poor survival [7, 8].

In preoperative examinations, endoscopy and barium meal examinations are routinely used to evaluate cancerous lesions in the stomach. Abdominal ultrasound, computed tomography (CT) examination, and magnetic resonance imaging (MRI) are commonly used to examine invasion of other organs and metastatic lesions. However, their diagnostic accuracy is limited. Endoscopic ultrasound has been the most reliable nonsurgical method for evaluating the primary tumor, but its N staging accuracy is only 65% to 77% because of the limited penetration of ultrasound for distant lymph node metastasis. In spite of its higher image quality and dynamic contrast-enhanced imaging, MRI only has an N staging accuracy of 65% to 70%. The multidetector row computed tomography (MDCT) [9] scanner enables thinner collimation and faster scanning, which markedly improves imaging resolution and enables rapid image reconstruction. Moreover, intravenous bolus administration of contrast material permits precise evaluation of carcinoma enhancement, and the water-filling method provides negative contrast that enhances the gastric wall. Thus, MDCT has a higher N staging accuracy of up to 82% and has become a main examination method for preoperative staging of gastric cancer [10]. Fukuya et al. [11] showed in their study of lymph nodes of at least 5 mm that the sensitivity for detecting metastasis-positive nodes was 75.2% and the specificity for detecting metastasis-negative nodes was 41.8%. A large-scale Chinese study [10] conducted by Ruijin Hospital showed that the overall diagnostic sensitivity, specificity, and accuracy of MDCT for determining lymph node metastasis were 86.26%, 76.17%, and 82.09%, respectively. However, with the clinically valuable scanning protocols of spectral CT imaging technology, we can obtain more information with gemstone spectral imaging (GSI) than with any conventional CT (e.g., MDCT).

In conventional CT imaging, we measure the attenuation of the X-ray beam through an object. Because the X-ray beam comprises a mixture of photon energies, we commonly define the beam quality in terms of its kilovoltage peak (kVp), which denotes the maximum photon energy. With GSI [12] and spectral CT, conventional attenuation data may be transformed into effective material densities, which enhance the tissue characterization capabilities of CT. Furthermore, through the monochromatic representation of spectral CT, beam-hardening artifacts can be substantially reduced, which is a step toward quantitative imaging with more consistent image measurements across examinations, patients, and scanners.

In this paper, we use machine learning methods to handle the large amount of information provided by GSI and to improve the accuracy of determining lymph node metastasis in gastric cancer.

The paper is arranged as follows. Section 2 describes the details of the methods used in this paper, Section 3 presents the experimental framework and the results, and Section 4 concludes the present study and discusses potential future research.

2. Methodology

Figure 1 shows a flow chart of the whole framework for the classification of lymph node metastasis in gastric cancer.

2.1. Pre-Processing

GSI-CT examination was performed on patients using the GE Discovery CT750 HD (GE Healthcare) scanner [13]. Each patient received an intramuscular administration of 20 mg of anisodamine to decrease peristaltic bowel movement and drank 1,000 mL to 1,200 mL of tap water for gastric filling 5 min to 10 min before the scan. Patients were in a supine position. After obtaining the localizer CT radiographs (e.g., anterior-posterior and/or lateral), we captured the unenhanced scan of the upper abdomen and then performed the enhanced GSI scan in two phases. An 80 mL to 100 mL bolus of nonionic iodine contrast agent was administered to the antecubital vein at a flow rate of 2 mL/s to 3 mL/s through a 20-gauge needle using an automatic injector. CT acquisitions were performed in the arterial phase (start delay of 40 s) and in the portal venous phase (start delay of 70 s). The arterial phase covers the whole stomach, and the portal venous phase covers the region from the top of the stomach diaphragm to the abdominal aortic bifurcation plane. The GSI-CT scanning parameters are as follows: scan mode of spectral imaging with fast tube-voltage switching between 80 kVp and 140 kVp, tube currents of 220 mA to 640 mA, slice thickness of 5 mm, rotation speed of 1.6 s to 0.8 s, and pitch ratio of 0.984 : 1.

2.2. Feature Extraction

Lymph node regions of interest (ROIs) were delineated by experienced doctors. Not all the lymph nodes could be captured in the images because of the node size or location. Figure 2 shows a lymph node and the aorta in the arterial phase and the venous phase under 70 keV monochromatic energy. The lymph node in Figure 2(b) is difficult to find because of its small size. The monochromatic CT values (HU) and the means of the material basis pairs (μg/cm³) were calculated. The features used in this paper are monochromatic CT values (40 keV to 140 keV) and material basis pairs (Calcium-Iodine, Calcium-Water, Iodine-Calcium, Iodine-Water, Water-Calcium, Water-Iodine, Effective-Z).

During the image acquisition process, variations in the injection speed, the dose of the contrast agent, and its circulation inside the body of the patient can cause differences in the CT numerical values. To eliminate these discrepancies, the arterial CT value of the same slice was recorded at the same time, and normalization was then performed using the following formula:
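As an illustration of this kind of normalization, the sketch below divides each lymph node measurement by the arterial (aortic) measurement taken from the same slice. The ratio form is an assumption made here because it is a common convention in spectral CT studies; it is not confirmed by the text above, and the function name `normalize_to_aorta` and the sample values are hypothetical.

```python
def normalize_to_aorta(node_values, aorta_values):
    """Return node/aorta ratios, pairing measurements from the same slice and phase.

    ASSUMPTION: normalization is a simple ratio; the exact formula used in the
    study is not reproduced in the text.
    """
    if len(node_values) != len(aorta_values):
        raise ValueError("each node value needs a matching aortic value")
    normalized = []
    for node, aorta in zip(node_values, aorta_values):
        if aorta == 0:
            raise ValueError("aortic reference value must be nonzero")
        normalized.append(node / aorta)
    return normalized


# Hypothetical monochromatic CT values (HU) for three lymph nodes and the
# aorta measured on the same slices.
node_hu = [120.0, 95.0, 80.0]
aorta_hu = [300.0, 250.0, 200.0]
normalized_hu = normalize_to_aorta(node_hu, aorta_hu)
print(normalized_hu)
```

Dividing by a per-slice reference removes a multiplicative factor shared by the node and the vessel, which is the kind of patient-to-patient variation in contrast delivery described above.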

2.3. Feature Selection
2.3.1. mRMR Algorithm

Minimal redundancy maximal relevance (mRMR) is a feature-selection scheme proposed in [14]. mRMR uses information theory as its criterion and offers good generalization, efficiency, and accuracy for feature selection. Each feature can be ranked based on its relevance to the target variable, and the ranking process considers the redundancy of these features. An effective feature is defined as one that has the best trade-off between minimum redundancy within the features and maximum relevance to the target variable [15]. Mutual information (MI), which measures the mutual dependence of two variables, is used to quantify both relevance and redundancy in this method [16]. The two most used mRMR criteria are mutual information difference (MID) and mutual information quotient (MIQ):

$$\text{MID:}\quad \max_{i \notin S} \left[ I(i, h) - \frac{1}{|S|} \sum_{j \in S} I(i, j) \right],$$

$$\text{MIQ:}\quad \max_{i \notin S} \left[ \frac{I(i, h)}{\frac{1}{|S|} \sum_{j \in S} I(i, j)} \right],$$

where $I(i, h)$ is the MI between feature $i$ and classification $h$, $I(i, j)$ is the MI between features $i$ and $j$, $S$ is the current feature set, and $|S|$ is the length of the feature set.
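The greedy selection implied by the MID and MIQ criteria can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: it assumes the features have already been discretized (continuous CT values would need binning first), estimates MI from simple frequency counts, and uses hypothetical names (`mutual_information`, `mrmr`) and toy data.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """MI in bits between two equally long sequences of discrete values."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = 0.0
    for (va, vb), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
        mi += (c / n) * math.log2(c * n / (count_a[va] * count_b[vb]))
    return mi


def mrmr(features, labels, n_select, criterion="MID"):
    """Greedy mRMR. `features` maps a feature name to its list of discrete values."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    while len(selected) < min(n_select, len(features)):
        best_name, best_score = None, -math.inf
        for name, col in features.items():
            if name in selected:
                continue
            if not selected:
                score = relevance[name]  # first pick: relevance only
            else:
                redundancy = sum(mutual_information(col, features[s]) for s in selected) / len(selected)
                if criterion == "MID":
                    score = relevance[name] - redundancy
                else:  # MIQ; the small constant avoids division by zero
                    score = relevance[name] / (redundancy + 1e-12)
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data: "u_copy" duplicates "u", so it is relevant but fully redundant.
toy_features = {
    "u": [0, 0, 0, 0, 1, 1, 1, 1],
    "u_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "v": [0, 0, 1, 1, 0, 0, 1, 1],
}
toy_labels = [0, 0, 1, 1, 2, 2, 3, 3]
picked_mid = mrmr(toy_features, toy_labels, 2, "MID")
picked_miq = mrmr(toy_features, toy_labels, 2, "MIQ")
print(picked_mid, picked_miq)
```

In the toy data, all three features are equally relevant to the label, but the duplicate is skipped in the second step because its redundancy with the already selected feature cancels its relevance.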

2.3.2. SFS Algorithm

Sequential forward selection (SFS) is a traditional heuristic feature selection algorithm [17, 18]. SFS starts with an empty feature subset. In each iteration, only one feature is added to the feature subset. To determine which feature to add, the algorithm tentatively adds each unselected feature to the candidate feature subset and tests the accuracy of the classifier built on the tentative feature subset. The feature that yields the highest accuracy is then added to the feature subset. The process stops after an iteration in which no feature can be added that improves accuracy.
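The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: `evaluate` stands for any function that returns the accuracy of a classifier trained on a given feature subset (in this paper it would be the KNN accuracy), and `toy_accuracy` with its gain values is a hypothetical stand-in used only to make the example runnable.

```python
def sequential_forward_selection(feature_names, evaluate):
    """Greedy SFS: add the single best feature per iteration until accuracy stops improving."""
    selected = []
    best_accuracy = float("-inf")
    remaining = list(feature_names)
    while remaining:
        # Tentatively add each unselected feature and score the resulting subset.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        top_accuracy, top_feature = max(scored, key=lambda pair: pair[0])
        if top_accuracy <= best_accuracy:
            break  # no candidate improves accuracy, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_accuracy = top_accuracy
    return selected, best_accuracy


def toy_accuracy(subset):
    """Hypothetical accuracy model: two helpful features and one harmful one."""
    gains = {"effective_z": 0.25, "iodine_water": 0.15, "kev_140": -0.05}
    return 0.5 + sum(gains[f] for f in subset)


sfs_features, sfs_accuracy = sequential_forward_selection(
    ["kev_140", "iodine_water", "effective_z"], toy_accuracy
)
print(sfs_features, sfs_accuracy)
```

Because each step only looks one feature ahead, the result is a local optimum of the accuracy function, not necessarily the best subset overall.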

2.4. Metric Learning Algorithm

Learning good distance metrics in feature space is crucial to many machine learning tasks (e.g., classification). Many existing works have shown that properly designed distance metrics can greatly improve KNN classification accuracy compared with the standard Euclidean distance. Depending on the availability of labeled training samples, distance metric learning algorithms can be divided into two categories: supervised distance metric learning and unsupervised distance metric learning. Table 1 lists several distance metric learning algorithms. Among them, principal component analysis (PCA) is the most commonly used algorithm for dimensionality reduction of large datasets, for example in face recognition [19] and image retrieval [20].
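A short formula may make the idea concrete. Linear methods such as PCA, LDA, and RCA, which are used later in this paper, can be viewed as learning a linear map, after which KNN uses ordinary Euclidean distances in the transformed space. The symbols $L$, $M$, and $d_M$ below are introduced here only for illustration and are not notation from the original text.

```latex
d_M(x, x') = \sqrt{(x - x')^{\top} M \,(x - x')} = \lVert L\,(x - x') \rVert_2,
\qquad M = L^{\top} L \succeq 0
```

Setting $M$ to the identity matrix recovers the standard Euclidean distance, and choosing an $L$ with fewer rows than columns combines metric learning with dimensionality reduction.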

2.5. Classification

The K-nearest neighbor (KNN) [21, 26] algorithm is among the simplest of all machine learning algorithms. In this algorithm, an object is classified by a majority vote of its neighbors. The object is consequently assigned to the class that is most common among its $k$ nearest neighbors, where $k$ is a positive integer that is typically small. If $k = 1$, then the object is simply assigned to the class of its nearest neighbor.

To describe the KNN algorithm, we first introduce some notation. Let $T = \{(x_i, y_i)\}_{i=1}^{N}$ be the training set, where $x_i \in \mathbb{R}^d$ is the d-dimensional feature vector and $y_i \in \{-1, +1\}$ is the associated observed class label. For simplicity, we consider a binary classification. We generally suppose that all training data are iid samples of random variables $(X, Y)$ with unknown distribution.

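With previously labeled samples as the training set $T$, the KNN algorithm constructs a local subregion $R(x) \subseteq \mathbb{R}^d$ of the input space, which is situated at the estimation point $x$. The predicting region $R(x)$ contains the $k$ closest training points to $x$, which is written as follows:

$$R(x) = \{\hat{x} \mid d(x, \hat{x}) \le d_{(k)}\},$$

where $d_{(k)}$ is the $k$th order statistic of the distances $\{d(x, x_i)\}_{i=1}^{N}$ to the $N$ training points, and $d(x, \hat{x})$ is the distance metric. $k[y]$ denotes the number of samples in region $R(x)$ that are labeled $y$. The KNN algorithm is statistically designed for the estimation of the posterior probability $p(y \mid x)$ of the observation point $x$:

$$p(y \mid x) = \frac{p(x \mid y)\, p(y)}{p(x)} \approx \frac{k[y]}{k}.$$

For a given observation $x$, the decision $g(x)$ is formulated by evaluating the values of $k[y]$ and selecting the class that has the highest value:

$$g(x) = \arg\max_{y} k[y].$$

Thus, the decision that maximizes the associated posterior probability is employed in the KNN algorithm. For a binary classification problem in which $y_i \in \{-1, +1\}$, the KNN algorithm produces the following decision rule:

$$g(x) = \operatorname{sgn}\Big( \sum_{x_i \in R(x)} y_i \Big).$$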
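The voting rule above can be sketched in a few lines. This is a minimal illustration, not the code used in the experiments: it assumes Euclidean distance as the default metric, uses labels in {-1, +1}, and the function names and toy points are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Standard Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Return (decision, posterior), where posterior[y] estimates p(y | x) as k[y] / k."""
    # Indices of the k training points closest to the query (the region R(x)).
    order = sorted(range(len(train_x)), key=lambda i: distance(query, train_x[i]))
    neighbors = order[:k]
    votes = Counter(train_y[i] for i in neighbors)  # k[y] for each label y
    posterior = {label: count / k for label, count in votes.items()}
    decision = max(posterior, key=posterior.get)  # class with the most votes
    return decision, posterior


# Toy two-dimensional data: three negative and three positive samples.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.2, 0.9)]
train_y = [-1, -1, -1, 1, 1, 1]

label_k3, post_k3 = knn_predict(train_x, train_y, (0.95, 1.0), k=3)
label_k5, post_k5 = knn_predict(train_x, train_y, (0.95, 1.0), k=5)
print(label_k3, post_k3)
print(label_k5, post_k5)
```

Passing a different `distance` function is where a learned metric would plug in. With an odd $k$ and two classes there are no tied votes, which is one practical reason odd neighborhood sizes are often preferred.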

3. Experimental Results and Discussion

3.1. Experiments

The image data used in our work were acquired with GE Healthcare equipment in Ruijin Hospital in April 2010. We collected 38 gastric lymph node datasets, of which 27 were lymph node metastasis (positive) and 11 were non-lymph node metastasis (negative). All the lymph node data were confirmed by pathology results obtained after lymph node dissection (lymphadenectomy) in patients.

3.1.1. Univariate Analysis

In this study, we conduct univariate analysis by exploring variables (features) one by one. We analyze each feature by calculating its relevance to lymph node metastasis. Here, we use the following measurements:

(i) Two-tailed t-test: the two-tailed test is a statistical test used in inference, in which a given statistical hypothesis, H0 (the null hypothesis), is rejected when the value of the test statistic is either sufficiently small or sufficiently large.

(ii) Point biserial correlation coefficient ($r_{pb}$):

$$r_{pb} = \frac{M_1 - M_0}{s_n} \sqrt{pq},$$

where $M_1$ is the mean of the non-dichotomous values for the cases in which the dichotomous variable is coded 1, and $M_0$ is the mean of the non-dichotomous values for the cases coded 0. $s_n$ is the standard deviation of all non-dichotomous entries, and $p$ and $q$ are the proportions of the dichotomous variable coded 1 and 0, respectively.

(iii) Information gain (IG): IG is calculated as the entropy of the feature $X$, $H(X)$, minus the conditional entropy of $X$ given $Y$, $H(X \mid Y)$:

$$IG(X \mid Y) = H(X) - H(X \mid Y).$$

(iv) Area under the curve (AUC).

(v) Symmetrical uncertainty (SU): SU is the normalization of IG to the range $[0, 1]$,

$$SU(X, Y) = \frac{2\, IG(X \mid Y)}{H(X) + H(Y)},$$

where a higher value of SU indicates a higher relevance between feature $X$ and class $Y$ (as a measure of correlation between the features and the target concept).
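Three of these measures are simple enough to sketch directly. The code below is a minimal illustration under stated assumptions: the point biserial coefficient uses the population standard deviation (dividing by the number of samples), entropies are estimated from frequency counts of discrete values (a continuous feature would need to be binned first), and all function names and toy values are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def point_biserial(values, labels):
    """r_pb between a continuous feature and a 0/1 label (population std)."""
    ones = [v for v, l in zip(values, labels) if l == 1]
    zeros = [v for v, l in zip(values, labels) if l == 0]
    p = len(ones) / len(values)
    q = 1.0 - p
    return (mean(ones) - mean(zeros)) / pstdev(values) * math.sqrt(p * q)


def entropy(xs):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def conditional_entropy(xs, ys):
    """H(X | Y): entropy of xs within each group of ys, weighted by group size."""
    n = len(xs)
    total = 0.0
    for y, count in Counter(ys).items():
        group = [x for x, yy in zip(xs, ys) if yy == y]
        total += (count / n) * entropy(group)
    return total


def information_gain(xs, ys):
    return entropy(xs) - conditional_entropy(xs, ys)


def symmetrical_uncertainty(xs, ys):
    denom = entropy(xs) + entropy(ys)
    return 2.0 * information_gain(xs, ys) / denom if denom > 0 else 0.0


# Toy data: a continuous feature, its binarized version, and 0/1 labels.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_binned = [0, 0, 1, 1]
toy_unrelated = [0, 1, 0, 1]
toy_classes = [0, 0, 1, 1]

r_pb = point_biserial(toy_values, toy_classes)
su_related = symmetrical_uncertainty(toy_binned, toy_classes)
su_unrelated = symmetrical_uncertainty(toy_unrelated, toy_classes)
print(r_pb, su_related, su_unrelated)
```

With the population standard deviation, the point biserial coefficient coincides with the Pearson correlation between the feature and the 0/1 label, which is a convenient consistency check.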

The experimental results of the univariate analysis are shown in Tables 2 and 3. Based on the tables, the Iodine-Water, Iodine-Calcium, Calcium-Iodine, and Effective-Z features show high relevance to lymph node metastasis. Among these features, high relevance to lymph node metastasis was clinically confirmed for the Iodine-Water and Effective-Z features. Both the Iodine-Water and Iodine-Calcium features reflect the concentration of iodinated contrast media taken up by the surrounding tissue, and thus they are related to lymph node metastasis. The Calcium-Iodine feature indicates tissue calcification, which rarely exists in lymph nodes. However, the experimental results show that the Calcium-Iodine feature is highly related to lymph node metastasis, a finding that must be further verified clinically.

Based on the statistical results of the point biserial correlation coefficient, AUC, SU, and IG, low-energy features have higher relevance to lymph node metastasis than high-energy monochromatic features, in agreement with clinical results. As shown in Figure 3, low-energy images display a large difference between lymph node metastasis (positive) and non-lymph node metastasis (negative), because higher monochromatic energies yield less contrast between materials, whereas lower energies yield more contrast. However, the higher contrast of low-energy images comes with more noise. Therefore, doctors usually select 70 keV as a tradeoff for clinical diagnosis.

3.1.2. SFS-KNN Results

Figure 4 and Table 4 present the classification accuracy (ACC) of the KNN algorithm with different neighborhood sizes and of the SFS algorithm with increasing lengths of the feature set. ACC first increases with the length of the feature set and then decreases. After application of the SFS algorithm, the feature set becomes shorter, whereas accuracy becomes higher than with the original feature set, which demonstrates the effectiveness of SFS. From Table 4, we can examine ACC with different neighborhood sizes and selected features. When , the performance remains stable before and after data normalization, and ACC reaches 96.58% after normalization with the finally selected features 12 (Effective-Z in the arterial phase), 30 (Effective-Z in the venous phase), 31 (Calcium-Iodine in the venous phase), 33 (Iodine-Calcium in the venous phase), and 14 (Calcium-Water in the arterial phase). These selected features are highly related to the classification results (lymph node metastasis). Among them, features 12 (Effective-Z in the arterial phase), 30 (Effective-Z in the venous phase), and 33 (Iodine-Calcium in the venous phase) are consistent with pathology theory and the clinical experience of doctors. The effectiveness of the other features needs to be further verified by studies. However, the SFS-KNN algorithm does not find a globally optimal solution, and it may lead to overfitting, which explains the decrease in ACC. In our experiments, the number of samples is not sufficient, so a large neighborhood size fails to reflect the local characteristics of the KNN classifier. Therefore, is not selected as the optimal size.

3.1.3. mRMR-KNN Results

Figure 5 shows two feature selection procedures with different mRMR criteria. Tables 5 and 6 report the classification performance of mRMR-KNN (MIQ and MID) with different neighborhood sizes [27]. We can see from the two tables that the two criteria, MIQ and MID, achieve almost the same performance. After normalization, the accuracies for all values of $k$ increase considerably, which demonstrates the positive effect of data normalization. Among the feature sets, we can conclude from the tables that features 15 (Iodine-Calcium in the arterial phase), 21 (60 keV in the venous phase), 30 (Effective-Z in the venous phase), and 3 (60 keV in the arterial phase) are closely related to lymph node metastasis, which agrees well with pathology theory and the clinical experience of doctors. With , the classification performance remains stable before and after normalization, which further verifies the optimal (neighborhood size) value.

3.1.4. Metric Learning Results

Figure 6 shows 2D visualizations of the results of six different distance metric learning methods in one validation. In the two-dimensional projection space, the classes are better separated by the LDA transformation than by the other distance metrics. However, the result of KNN with a single distance metric is not very satisfactory, which is why we consider combinations of metrics.

Table 7 shows the classification accuracy of the KNN algorithm with different distance metric learning methods. These results show that data normalization substantially improves classification. Moreover, PCA is a popular algorithm for data dimensionality reduction and operates in an unsupervised setting, without using the class labels of the training data, to derive informative linear projections. Nevertheless, PCA can still have useful properties as linear preprocessing for KNN classification. By combining PCA with supervised distance metric learning methods (e.g., LDA, RCA), we can obtain greatly improved performance. The accuracy of KNN classification depends significantly on the metric used to compute distances between different samples.

3.2. Discussion

Based on the experimental results, the use of machine learning methods can improve the accuracy of the clinical diagnosis of lymph node metastasis in gastric cancer. In our study, we mainly used the KNN algorithm for classification, which shows high efficiency. To improve effectiveness and classification accuracy, we first employed several feature-selection algorithms, such as the mRMR and SFS methods, which both show an increase in accuracy. We obtained features highly related to lymph node metastasis, in accordance with the validated results of clinical pathology. Another way to improve accuracy is to learn a distance metric for the input space from a given collection of similar/dissimilar points, such that the distance relations among the training data are preserved, and then to apply the KNN algorithm to the transformed data. Some schemes used in our experiments attained an overall accuracy of 96.33%.

4. Conclusions

The main contribution of our study is to demonstrate the feasibility and the effectiveness of machine learning methods for computer-aided diagnosis (CAD) of lymph node metastasis in gastric cancer using clinical GSI data. In this paper, we employed a simple and classic algorithm, KNN, in combination with several feature selection algorithms and metric learning methods. The experimental results show that our scheme outperforms traditional diagnostic means (e.g., EUS and MDCT).

One limitation of our research is the insufficient number of clinical cases. Thus, in our future work, we will conduct more experiments on clinical data to further improve the efficiency of the proposed scheme and to explore more useful and powerful machine learning methods for CAD in clinical practice.

Acknowledgments

This work was supported by the National Basic Research Program of China (973 Program, no. 2010CB732506) and NSFC (no. 81272746).