Computational and Mathematical Methods in Medicine

Special Issue

Machine Learning and Network Methods for Biology and Medicine 2020


Research Article | Open Access

Volume 2020 |Article ID 1016284 | https://doi.org/10.1155/2020/1016284

Seyed Abbas Mahmoodi, Kamal Mirzaie, Maryam Sadat Mahmoodi, Seyed Mostafa Mahmoudi, "A Medical Decision Support System to Assess Risk Factors for Gastric Cancer Based on Fuzzy Cognitive Map", Computational and Mathematical Methods in Medicine, vol. 2020, Article ID 1016284, 13 pages, 2020. https://doi.org/10.1155/2020/1016284

A Medical Decision Support System to Assess Risk Factors for Gastric Cancer Based on Fuzzy Cognitive Map

Guest Editor: Tao Huang
Received 23 Mar 2020
Revised 19 Jun 2020
Accepted 14 Jul 2020
Published 05 Oct 2020

Abstract

Gastric cancer (GC), one of the most common cancers around the world, is a multifactorial disease with many risk factors. Assessing the risk of GC is essential for choosing an appropriate healthcare strategy. Very few studies have been conducted on the development of risk assessment systems for GC. This study is aimed at providing a medical decision support system based on soft computing using fuzzy cognitive maps (FCMs), which will help healthcare professionals to decide on an appropriate individual healthcare strategy based on the risk level of the disease. FCMs are considered one of the strongest artificial intelligence techniques for complex system modeling. In this system, an FCM based on the Nonlinear Hebbian Learning (NHL) algorithm is used. The data used in this study were collected from the medical records of 560 patients referred to Imam Reza Hospital in Tabriz City. Twenty-seven features relevant to gastric cancer were selected using the opinions of three experts. The prediction accuracy of the proposed method is 95.83%. The results show that the proposed method is more accurate than other decision-making algorithms, such as decision trees, Naïve Bayes, and ANN. From the perspective of healthcare professionals, the proposed medical decision support system is simple, comprehensive, and more effective than previous models for assessing the risk of GC and can help them to predict the risk factors for GC in the clinical setting.

1. Introduction

Gastric cancer (GC), one of the major cancers around the world with about one million new patients each year, is the third leading cause of cancer deaths [1, 2]. It is an important public health issue worldwide, especially in Central Asian countries, where the incidence of this disease is very high [2]. GC is a multifactorial disease, and its formation is related to various risk factors [3]. Various methods, such as photofluorography and esophagogastroduodenoscopy, are used to diagnose GC in the early stages and can help reduce the mortality rate of GC [3]. Given that these methods are invasive and expensive, it is necessary to provide a simple, inexpensive, and effective tool for identifying people at risk for GC, who can then be followed up with more accurate examinations. Moreover, appropriate prevention efforts can be made to reduce the incidence of this disease.

The initial definitions of the decision support system (DSS) describe it as a system that supports managerial decision-makers in semistructured and unstructured situations and decisions [4]. Accordingly, a DSS is meant to help decision-makers and increase their ability, not to replace their judgment [4]. Today, the use of DSSs has expanded to a variety of areas, such as management, industry, agriculture, information systems, medicine, and many others. The medical decision support system (MDSS) is a computer system designed to help physicians or other healthcare professionals in making clinical decisions. Some applications of the medical decision support system are outlined below [5]:
(i) Preventive care services, for example, screenings for blood pressure and cancer
(ii) Patient symptom checker
(iii) Care plan
(iv) Guide to reducing long hospital stays
(v) Intelligent health monitoring systems

An MDSS has numerous advantages, of which the most important are minimizing medical errors and providing a relatively stable structure for diagnosing and treating the disease, thereby reconciling the varied and conflicting opinions of specialists [5]. Therefore, it is vital to design and implement these models.

FCMs are regarded as soft computing methods that attempt to act like humans in decision-making and reasoning [6]. An FCM is an instrument for modeling multifaceted systems, obtained by integrating neural networks and fuzzy logic [7, 8], that describes the behavior of a complex system in terms of concepts. This technique creates a conceptual model in which each concept represents a characteristic or a state of the system and interacts dynamically with the other concepts [9]. An FCM is a graphical representation of a system structure [10]. From the artificial intelligence perspective, FCMs are dynamic learning networks; thus, more data for modeling the problem can help the system adapt itself and reach a solution. This conceptual model is not restricted to exact measurements and quantities. Hence, it is very appropriate for concepts without precise structures.

FCMs were presented by Kosko as a signed fuzzy directed graph with feedback loops to illustrate the computational complexity and dependence of a model symbolically and explicitly [11]. In other words, an FCM consists of a set of nodes that affect each other via causal relations. The details and mathematical formulation of this technique are described in the Supplementary Materials. Combining the benefits of fuzzy systems (if-then rules) and neural networks (teaching and learning), FCM quickly proved its effectiveness in various areas, with successful applications in politics, economics, engineering, medicine, etc. [12].

In recent years, MDSS using FCM has been developed as one of the main applications of this tool. FCM has emerged as a tool for representing and studying the behavior of systems, and it can deal with complex systems using an argumentative process. This study is aimed at providing an MDSS for assessing the risk of GC using FCM.

Some successful applications of FCM in decision support systems are described below. Papageorgiou et al. [13] utilized FCM for predicting infectious diseases and infection severity. Amirkhani et al. [14] presented a novel FCM-based technique to screen and isolate UDH from other intraductal breast lesions. To this end, they examined 86 patients in Shahid Beheshti Hospital in Isfahan City. The pathologist extracted the ten key properties needed to screen these lesions, and these were used as the key concepts of the FCM. The accuracy of the suggested technique was 95.35%. The results indicated that the suggested FCM not only achieved a high accuracy level but also presented an acceptable false-negative rate (FNR). Baena de Moraes Lopes et al. [7] proposed a decision support system to diagnose changes in urinary elimination, based on the nursing terminology of the North American Nursing Diagnosis Association International (NANDA-I). For 195 cases of urinary incontinence, an FCM model was applied following the NANDA-I classifications. The FCM model achieved a high specificity and sensitivity of 0.92 and 0.95, respectively; however, it gave a low specificity value for the diagnosis of urge urinary incontinence (0.43) and a low sensitivity value for overall urinary incontinence (0.42).

Recently, the use of FCM with Hebbian-based learning capabilities has increased. According to [15], a decision-making framework was proposed that can accurately assess the progression of depression symptoms in the elderly people and warn healthcare providers by providing useful information for regulating the patient’s treatment. According to [16], a risk management system for familial breast cancer was presented using the NHL-based FCM technique. Data needed for this study were extracted from 40 patients and 18 key features were selected. The results showed that the accuracy is 95%. According to [17], the first specialized diagnostic system for obesity was proposed based on psychological and social characteristics. In this study, a mathematical model based on FCM was presented. According to the proposed model, the effects of different weight-loss treatment methods can be studied.

No single cause of GC is known. The cause-effect relationships between the combined impact of multiple risk factors and the probability of developing GC have not yet been systematically investigated or understood. Even the opinions of radiologists and oncologists are highly subjective in this regard. In such cases, an FCM can serve as a human-friendly and transparent clinical support instrument that determines the cause-effect relationships between the factors, and quantifying the degree of each factor's effect on the risk level can substantially reduce this subjectivity. The present work focuses on developing an FCM-based clinical decision-making instrument to evaluate GC risk.

2. Methods

2.1. FCM Model for GC Risk Factors

Addressing GC is a complex process that requires understanding the various parameters, risk factors, and symptoms in order to make the right decision and assessment. This study assesses the risk of GC by providing a medical decision-making system. The design of this decision-making system is based on a proposed FCM model, which is presented below. Designing and developing a suitable FCM requires human knowledge to describe a decision support system. In this study, GC specialists contributed to the development of the FCM model. The development of the FCM model is divided into three main steps, which are briefly summarized as follows:
(1) Identify concepts
(2) Determine the relationships between concepts and initial weights
(3) Weighting

First, the experts individually identify the factors that contribute to GC. The concepts common to all specialists are then selected as model nodes. The second step is to identify the relationships between concepts. To this end, experts define the interactions between concepts in terms of fuzzy variables: for each pair of concepts, they determine whether a relationship exists and, if so, its direction. The strengths of these effects are expressed as very low, low, medium, high, and very high. Finally, the linguistic variables expressed by the experts are integrated. These values are aggregated using the SUM technique, and the overall linguistic weight is converted to a numerical value by the centroid defuzzification method. The corresponding weight matrix is then constructed. Choosing a learning algorithm to train the initial weights is the third step of this method. As in neural networks, the purpose of the learning algorithm is to adjust the initial weights so as to improve the FCM model.

To illustrate, these steps were applied one by one to develop an FCM model for GC, using the opinions of three specialists. In the first phase of the research presented in this article, information on GC risk factors was collected from medical sources, pathologists, and informal sources [18–48]. The collected knowledge was transformed into a well-structured questionnaire and presented to the three experts. The questionnaire includes risk factors associated with GC. According to the three experts, 27 common features were identified as the major risk factors for end-stage GC.

Risk factors for gastric cancer may be categorized into four groups (personal features, systemic conditions, stomach condition, and diet food), each of which includes several risk factors. The final features are presented in Figure 1, and their explanations are given in Table 1.


| Risk factors | Description |
|---|---|
| C1: sex | Studies show that men around the world are diagnosed with GC almost twice as often as women [18]. |
| C2: blood group | Scientific research shows that there is a significant relationship between blood type and GC. The blood groups A and O have the highest and lowest incidence of GC, respectively [19]. |
| C3: BMI | High BMI increases the risk of GC [20]. In 2016, the IARC formed a team of specialists. They reported that GC is one of the diseases caused by excessive fat gain and high BMI [21]. |
| C4: age | The risk of GC increases with age [18, 22, 23]. |
| C5: motility | People with any regular physical activity have a lower risk of GC than nonactive people. According to the US Physical Activity Guidelines Advisory Committee (2018), moderate evidence showed that physical activity reduces the risk of various cancers, including GC [21]. |
| C6: alcohol consumption | Regular alcohol consumption increases the risk of GC [24, 25]. |
| C7: exposed to chemicals | Some jobs involving exposure to chemicals, such as cement and chromium, increase the risk of GC [26]. |
| C8: smoking | Smoking increases the risk of GC [27, 28]. |
| C9: salt consumption | High salt intake increases the risk of GC [23, 29, 30]. |
| C10: consumption of vegetable | Daily consumption of 200-200 grams of vegetables may reduce the risk of GC [31]. |
| C11: consumption of smoked food | Smoked food is a major source of polycyclic aromatic hydrocarbons (PAHs). Scientific research has shown that this biopollutant is one of the factors involved in many cancers, including GC [32, 33]. |
| C12: milk consumption | Increasing consumption of dairy products, such as milk, is associated with a lower risk of GC [34]. |
| C13: fast food consumption | Fast food consumption is one of the factors affecting the incidence of GC [35]. |
| C14: consumption of fried foods | The results of scientific studies show that people who eat a lot of fried foods are at increased risk of GC [27, 28]. |
| C15: fruit consumption | Daily consumption of 120-150 grams of fruit may reduce the risk of GC [31]. |
| C16: food storage container | Today's food containers are often made of materials, such as plastics, that contain bisphenol A. Thus, they can be a source of various types of cancer and hormonal disorders [36]. |
| C17: baking dish | The use of metal containers, such as aluminum ones, for cooking can be a factor in the development of diseases because these types of metals, when exposed to heat, emit a small amount of lead [37]. |
| C18: history of allergy | Recent studies indicate that a history of allergic diseases is associated with a lower risk of GC [38]. |
| C19: family history of cancer | A family history of cancer in certain specific sites may be associated with a risk of GC [39]. |
| C20: family of GC | This risk factor is strongly associated with different types of GC [40, 41]. |
| C21: history of cardiovascular disease | People with cardiovascular disease are at a lower risk of GC because of the use of certain drugs [42]. |
| C22: general status of cancer | People with a good general health status are less likely to be at risk of GC [43]. |
| C23: history of gastric reflux | Gastric reflux causes a 3-10% increase in the risk of GC [44]. |
| C24: history of stomach surgery | Gastric surgeries, such as those for gastric ulcers, may increase the risk of cancer [45]. |
| C25: history of stomach infection | Helicobacter pylori is the most important risk factor for GC [46–48]. |
| C26: mucosa status | Gastric ulcers are considered a risk factor for GC [35]. |
| C27: history of gastric inflammation | A history of gastric inflammation is one of the most important factors in the incidence of GC [35]. |

In the second phase, first, the sign for the relationship between the two concepts is determined, and finally, the numerical values of the two concepts are calculated. Five membership functions were used for this purpose. Consider the following example.

1st specialist: C4 has a great impact on C27.

2nd specialist: C4 has a moderate impact on C27.

3rd specialist: C4 has a great impact on C27.

Using the SUM method, the above three linguistic weights (high, very high, and very high) are aggregated. Figure 2 represents the centroid defuzzification method that is implemented to calculate the numerical value of the weight in the range $[-1, 1]$.
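The aggregation and defuzzification step can be illustrated with a minimal Python sketch. The five triangular membership functions below are hypothetical (evenly spaced on [0, 1]) and stand in for the ones shown in Figure 2, which are not reproduced in the text. The sketch computes only the magnitude of the weight, on the assumption that the sign of the relationship is assigned separately by the experts.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical term set (left foot, peak, right foot); not taken from the paper.
TERMS = {
    "very low": (-0.25, 0.0, 0.25),
    "low": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
    "very high": (0.75, 1.0, 1.25),
}

def sum_centroid_weight(opinions, steps=1000):
    """Aggregate expert opinions with SUM and defuzzify with the centroid."""
    xs = [i / steps for i in range(steps + 1)]
    # SUM aggregation: add the membership degrees of all opinions pointwise.
    mu = [sum(triangular(x, *TERMS[term]) for term in opinions) for x in xs]
    # Centroid: membership-weighted mean of x (discrete approximation).
    return sum(x * m for x, m in zip(xs, mu)) / sum(mu)

weight = sum_centroid_weight(["high", "very high", "very high"])
print(round(weight, 2))  # about 0.83 with these assumed membership functions
```

With different membership shapes the numerical weight would change, so the value printed here is only illustrative.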

Using this method, the weights of all relationships between the concepts of the FCM for GC were calculated. The developed FCM is shown in Figure 3. In the third step, a learning algorithm was used to train the model by updating the relationship weights, and finally, a fuzzy cognitive map for GC risk factors was extracted. For this purpose, questionnaire data collected from 560 patients referred to Imam Reza Hospital in Tabriz were used after the preprocessing steps. Table 2 shows the features, values, and frequency of patients.


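| Features | Range | Number | Percent |
|---|---|---|---|
| Sex | Male | 256 | 45.7% |
| | Female | 304 | 54.3% |
| Age | <40 | 20 | 3.47% |
| | 41–60 | 210 | 37.5% |
| | ≥61 | 330 | 59.03% |
| Blood group | A | 123 | 21.96% |
| | B | 78 | 13.92% |
| | AB | 80 | 14.28% |
| | O | 279 | 49.82% |
| BMI | | 69 | 12.32% |
| | | 76 | 13.57% |
| | | 120 | 21.42% |
| | | 293 | 52.32% |
| Motility | Light | 156 | 27.85% |
| | Medium | 236 | 42.14% |
| | High | 168 | 30% |
| Alcohol consumption | Yes | 85 | 15.17% |
| | No | 475 | 84.82% |
| Exposed to chemicals | Yes | 54 | 9.64% |
| | No | 506 | 90.35% |
| Smoking | Yes | 198 | 35.35% |
| | No | 362 | 64.64% |
| Salt consumption | None | 10 | 1.78% |
| | Low | 175 | 31.25% |
| | High | 375 | 66.96% |
| Consumption of vegetable | Daily | 26 | 4.64% |
| | 1-3 times a week | 214 | 38.21% |
| | 1-3 times a month | 320 | 57.14% |
| Consumption of smoked food | None | 5 | 0.89% |
| | Daily | 0 | 0% |
| | 1-3 times a week | 149 | 26.60% |
| | 1-3 times a month | 406 | 72.5% |
| Milk consumption | Yes | 214 | 38.21% |
| | No | 346 | 61.78% |
| Fast food consumption | None | 4 | 0.71% |
| | 1-3 times a week | 315 | 56.25% |
| | 1-3 times a month | 241 | 43.03% |
| Consumption of fried foods | None | 0 | 0% |
| | 1-3 times a week | 191 | 34.10% |
| | 1-3 times a month | 369 | 65.89% |
| Fruit consumption | None | 6 | 1.07% |
| | 1-3 times a week | 185 | 33.03% |
| | 1-3 times a month | 369 | 65.89% |
| Food storage container | Aluminum | 216 | 38.57% |
| | Plastic | 301 | 53.75% |
| | Copper | 32 | 5.71% |
| | Style | 9 | 1.60% |
| | Chinese | 2 | 0.35% |
| Baking dish | Aluminum | 10 | 1.78% |
| | Teflon | 390 | 69.64% |
| | Copper | 21 | 3.75% |
| History of allergy | Yes | 89 | 15.89% |
| | No | 471 | 84.10% |
| Family history of cancer | Yes | 211 | 37.67% |
| | No | 349 | 62.32% |
| Family of GC | Yes | 123 | 21.96% |
| | No | 437 | 78.03% |
| History of cardiovascular disease | Yes | 185 | 33.03% |
| | No | 375 | 66.96% |
| General status | Good | 79 | 14.10% |
| | So-so | 190 | 33.92% |
| | Poor | 291 | 51.96% |
| History of gastric reflux | Yes | 234 | 41.78% |
| | No | 326 | 58.21% |
| History of stomach surgery | Yes | 48 | 8.57% |
| | No | 512 | 91.42% |
| History of stomach infection | Yes | 176 | 31.42% |
| | No | 384 | 68.57% |
| Mucosa status | Normal | 94 | 16.78% |
| | Swollen | 126 | 22.5% |
| | Red | 157 | 28.03% |
| | Sore | 183 | 32.67% |
| History of gastric inflammation | Yes | 163 | 29.10% |
| | No | 397 | 70.89% |
| Risk score | High | 300 | 53.57% |
| | Moderate | 186 | 33.21% |
| | Low | 74 | 8.39% |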

Figure 4 shows the proposed FCM model for risk factors of GC. This FCM has 28 concepts and 38 weighted edges. Of the 28 concept nodes, 27 are the final physician-selected features that influence the disease and are denoted C1 to C27. The central node is the concept of GC, which receives and collects interactions from all other nodes. A positive edge weight indicates a positive effect on the incidence of GC, and a negative weight indicates a deterrent role in the incidence of the disease. Yellow, purple, blue, and green were used to indicate the category of each feature or concept. Features C1 to C8, shown in yellow, are the personal features. Purple was used for features C9 to C17 of the diet food category. Blue and green were used for features C18 to C22 of the systemic condition category and features C23 to C27 of the stomach condition category, respectively.

2.2. Learning FCM Using NHL Algorithm

GC specialists were well positioned to create the FCM in our method. Nonlinear Hebbian Learning (NHL) is used to learn the weights because no relatively large data set is available and because it optimizes the causal weights and yields more accurate results [49]. Hebbian-based algorithms are used for FCM training to determine the best weight matrix on the basis of expert knowledge [50]. These algorithms adjust the FCM weights using the available data and an iterative learning formula based on the Hebbian rule [50]. The NHL algorithm is based on the assumption that all of the concepts of the FCM model are stimulated at each time step and their values change. The values $A_i$ and $A_j$ corresponding to the concepts $C_i$ and $C_j$ are updated, and the weight $w_{ji}$ is corrected in iteration $k$. The value of $A_i^{(k+1)}$ is determined in the $(k+1)$th iteration. The impact of the concepts with values $A_j^{(k)}$ and corrected weights $w_{ji}^{(k)}$ in iteration $k$ is determined by

$$A_i^{(k+1)} = f\left(A_i^{(k)} + \sum_{j \neq i} A_j^{(k)}\, w_{ji}^{(k)}\right), \tag{1}$$

where $f$ is the threshold function.

Each of the concepts in the FCM model may be an input or an output concept. A number of concepts are defined as output concepts (OCs). These concepts describe the state of the system whose value we want to estimate and which represents the final state of the system. The classification of concepts as input and output concepts is made by the group of experts according to the subject under consideration. The mathematical relations used in the NHL algorithm for learning FCM are shown in equations (1) and (2).

$$\Delta w_{ji} = \eta\, A_i \left(A_j - w_{ji} A_i\right), \tag{2}$$

where $\eta$ is a scaling parameter called the learning rate. $\eta$ is a very small positive scalar factor whose value is obtained through trial and error.

$$w_{ji}^{(k)} = \gamma\, w_{ji}^{(k-1)} + \eta\, A_i^{(k-1)} \left(A_j^{(k-1)} - \operatorname{sgn}(w_{ji})\, w_{ji}^{(k-1)} A_i^{(k-1)}\right). \tag{3}$$

Equation (3) is the main equation of the NHL algorithm. $\gamma$ is the weight decay parameter. The values of concepts and weights are calculated by equations (1) and (3), respectively. In fact, the NHL algorithm updates the nonzero elements of the initial matrix suggested by the experts in each iteration. The following criteria determine when the NHL algorithm ends [50].

(a) The terminating function is given as

$$F_1 = \lVert OC_i - T_i \rVert, \tag{4}$$

where $T_i$ is the mean target value of $OC_i$.

This kind of metric function is suitable for the NHL algorithm used in FCMs. In each step, $F_1$ calculates the Euclidean distance between $OC_i$ and $T_i$. Assuming that $OC_i \in [T_i^{\min}, T_i^{\max}]$, $T_i$ is calculated by

$$T_i = \frac{T_i^{\min} + T_i^{\max}}{2}. \tag{5}$$

Given that the FCM model has $m$ OCs, $F_1$ is calculated from the sum of the squared differences between the $m$ OCs and their targets $T$:

$$F_1 = \sqrt{\sum_{j=1}^{m} \left(OC_j - T_j\right)^2}. \tag{6}$$

After $F_1$ is minimized, the procedure ends.

(b) The second condition for completing the algorithm concerns the difference between two consecutive values of the OCs. This value should be less than $e$. Therefore, the change in the OC value at the $(k+1)$th iteration should satisfy

$$F_2 = \left| OC_i^{(k+1)} - OC_i^{(k)} \right| < e. \tag{7}$$

In this algorithm, the values of the parameters $\eta$ and $\gamma$ are determined through trial and error. After several tests, the values $\eta = 0.045$ and $\gamma = 0.98$ gave the best-performing algorithm. Finally, when the termination conditions are met, the final weight matrix $W$ is obtained.
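As a rough illustration of how such a training loop might be organized, the following Python sketch applies an NHL-style weight update followed by a concept update on a toy three-concept map. It assumes the commonly used NHL formulation with a sigmoid threshold function and a single output concept; the tolerance `e=0.002`, the iteration cap, the toy weights, and the target range are all hypothetical and are not taken from the study.

```python
import math

def sigmoid(x, lam=1.0):
    """Sigmoid threshold function keeping concept values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * x))

def sign(x):
    return (x > 0) - (x < 0)

def nhl_train(A, W, oc, t_min, t_max, eta=0.045, gamma=0.98, e=0.002, max_iter=200):
    """NHL-style training sketch.

    A: initial concept values; W[j][i]: weight of the edge C_j -> C_i;
    oc: index of the single output concept; [t_min, t_max]: its desired range.
    """
    n = len(A)
    target = (t_min + t_max) / 2.0  # mean of the desired range
    f1 = abs(A[oc] - target)
    for _ in range(max_iter):
        # Weight update: only nonzero (expert-defined) edges are adjusted.
        W = [[0.0 if W[j][i] == 0 else
              gamma * W[j][i] + eta * A[i] * (A[j] - sign(W[j][i]) * W[j][i] * A[i])
              for i in range(n)] for j in range(n)]
        # Concept update using the corrected weights.
        new_A = [sigmoid(A[i] + sum(A[j] * W[j][i] for j in range(n) if j != i))
                 for i in range(n)]
        f1 = abs(new_A[oc] - target)   # distance of the OC from its target
        f2 = abs(new_A[oc] - A[oc])    # change between consecutive iterations
        A = new_A
        if f2 < e:
            break
    return A, W, f1

# Toy map: two input concepts influencing one output concept (hypothetical values).
A0 = [0.8, 0.3, 0.5]
W0 = [[0.0, 0.0, 0.6],
      [0.0, 0.0, 0.4],
      [0.0, 0.0, 0.0]]
A_final, W_final, f1 = nhl_train(A0, W0, oc=2, t_min=0.7, t_max=0.9)
```

The sketch stops as soon as the change in the output concept falls below `e` and simply reports the distance to the target; a fuller implementation would also monitor that distance as a stopping criterion.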

For the convenience of end-users, a graphical interface is designed using the GUI in MATLAB for the proposed system. The user interface for the GC risk prediction software is shown in Figure 3.

For example, the user enters the requested information into the system. The system displays the risk assessment result after receiving information from the user and using the proposed NHL-FCM model.

To compare classification accuracy, the same data set was classified with other machine learning models. A backpropagation neural network, support vector machine, decision tree, and Bayesian classifier were run in the Weka toolkit V3.7 to test other learning algorithms. For this purpose, the Excel file containing the collected data was converted to .arff format so that it could be read by Weka. Then, the required data preprocessing steps were performed. In this software, cross-validation, which divides the labeled data set into several subsets, is one of the most common methods of evaluating the performance of classifiers. 10-fold cross-validation was used for all the studied algorithms. It divides the data set into 10 parts and performs the test 10 times. In each step, one part is used for testing and the other 9 parts are used for training. In this way, each record is used once for testing and 9 times for training. As a result, the entire data set is covered for training and testing.
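The fold bookkeeping described above can be sketched in a few lines of Python. This is a plain, unstratified split written for illustration only; it is not Weka's implementation, and the function name and seed are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint parts of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# 560 records, as in the data set used in this study.
splits = list(k_fold_indices(560, k=10))
```

Each of the 560 records lands in exactly one test fold of 56 records and in the training set of the other nine folds.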

A backpropagation neural network with 27 input neurons, 10 hidden neurons, and 3 output nodes was used as the multilayer perceptron. Also, to classify the assessed risk into three classes (high, medium, and low), the support vector machine, the C4.5 decision tree, and the Naïve Bayes classifier were used. Given that the data studied are not linearly separable, the kernel technique is needed to implement the SVM algorithm. The kernel technique is one of the most common techniques for solving problems that are not linearly separable. In this method, a suitable kernel function is selected and applied. In effect, the purpose of kernel functions is to make nonlinear problems linearly separable. There are several kernel functions in Weka. The RBF (radial basis function) kernel was used to run the SVM algorithm. Selecting and running the C4.5 algorithm produces the classification results, and the tree it creates, which is large, can be viewed graphically. The three categories of high risk, medium risk, and low risk were selected as the target variable and the other characteristics as predictive variables. The leaves of the tree are the target classes, and the model built by the tree can be read as a set of rules. Naïve Bayes was another classification algorithm implemented using Weka on the studied data, and its results were examined. This algorithm uses a probabilistic framework to solve classification problems.
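For readers unfamiliar with kernels, the usual Gaussian form of the RBF kernel can be written as a short Python function. The width parameter `width` and its default value are hypothetical; the text does not report the kernel settings used in Weka.

```python
import math

def rbf_kernel(x, z, width=0.1):
    """Gaussian RBF kernel: exp(-width * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-width * squared_distance)

# Two hypothetical feature vectors.
similarity = rbf_kernel([1.0, 0.0, 2.0], [0.0, 1.0, 2.0])
```

The kernel equals 1 for identical vectors and decays toward 0 as they move apart, which is what lets the SVM separate classes that are not linearly separable in the original feature space.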

3. Results

To analyze the performance of the proposed method, we divided the data into two categories. The proposed model was trained using 70% of the patient records (392 records) based on the NHL algorithm and tested using 30% of the records (168 records). Considering 168 patient records selected for testing randomly, there were 56 records in the high category, 64 records in the medium category, and 48 records in the low category.

Root mean square error (RMSE), mean absolute error (MAE), accuracy, recall, and precision are the key performance measures in the medical field [17] and are widely used in the literature. To determine accuracy, recall, and precision, the confusion matrix was used. A confusion matrix is a table that makes it possible to visualize the behavior of an algorithm. Table 3 represents the general scheme of a confusion matrix (with two groups C1 and C2).


| | Actual class C1 | Actual class C2 |
|---|---|---|
| Predicted class C1 | True positive (TP) | False positive (FP) |
| Predicted class C2 | False negative (FN) | True negative (TN) |

The matrix contains two columns and two rows specifying the numbers of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). TP is the number of class C1 samples classified correctly. FP is the number of class C2 samples classified incorrectly as C1. TN is the number of class C2 samples classified correctly. FN is the number of class C1 samples classified incorrectly as C2.

(i) Accuracy: the ratio of correctly classified samples to the total number of tested samples. It is determined by

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}. \tag{8}$$

(ii) Recall: the proportion of class C1 instances that are predicted correctly. It is calculated by

$$\text{Recall} = \frac{TP}{TP + FN}. \tag{9}$$

(iii) Precision: the classifier's ability not to label a C2 sample as C1. It is calculated by

$$\text{Precision} = \frac{TP}{TP + FP}. \tag{10}$$

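The MAE performance index is calculated by

$$\mathrm{MAE} = \frac{1}{N L} \sum_{n=1}^{N} \sum_{l=1}^{L} \left| OC_l^{n} - T_l^{n} \right|. \tag{11}$$

In equation (11), $N$ represents the number of training data, $L$ is the number of output concepts, and $OC_l^{n} - T_l^{n}$ denotes the difference between the $l$th decision output concept (OC) and its equivalent real value (target) when the $n$th set of input concepts is presented to the input of the tool.

The RMSE evaluation index is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N L} \sum_{n=1}^{N} \sum_{l=1}^{L} \left( OC_l^{n} - T_l^{n} \right)^2}, \tag{12}$$

where $N$ is the number of training sets and $L$ is the number of system outputs.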
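A minimal Python sketch of the two error indices follows, assuming both are averaged over the N records and the L output concepts. The outputs and targets used here are made-up numbers for illustration, not values from the study.

```python
import math

def mae(outputs, targets):
    """Mean absolute error over N records and L output concepts."""
    n, l = len(outputs), len(outputs[0])
    total = sum(abs(o - t)
                for row_o, row_t in zip(outputs, targets)
                for o, t in zip(row_o, row_t))
    return total / (n * l)

def rmse(outputs, targets):
    """Root mean square error over N records and L output concepts."""
    n, l = len(outputs), len(outputs[0])
    total = sum((o - t) ** 2
                for row_o, row_t in zip(outputs, targets)
                for o, t in zip(row_o, row_t))
    return math.sqrt(total / (n * l))

# Hypothetical example: three records, one output concept each.
outputs = [[0.9], [0.4], [0.1]]
targets = [[1.0], [0.5], [0.0]]
example_mae = mae(outputs, targets)
example_rmse = rmse(outputs, targets)
```

Because every record in this toy example is off by the same amount, the two indices coincide; with uneven errors the RMSE is always at least as large as the MAE.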

Table 4 shows the accuracy results obtained from the proposed method and other standard classifiers. The proposed method performs better than the other classifiers because NHL is efficient at correcting FCM weights when working with very small data sets. As a result, optimal decisions are made for output concepts.


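| Classifiers | Class | High | Medium | Low | Class recall (%) | Class precision (%) | Overall accuracy (%) | RMSE | MAE |
|---|---|---|---|---|---|---|---|---|---|
| Decision trees | High | 30 | 10 | 1 | 53.57 | 73.17 | 76.78 | 0.5120 | 0.721 |
| | Medium | 16 | 52 | 0 | 81.25 | 76.47 | | | |
| | Low | 10 | 2 | 47 | 97.91 | 79.66 | | | |
| Naïve Bayes | High | 40 | 8 | 5 | 71.42 | 75.47 | 80.35 | 0.334 | 0.645 |
| | Medium | 8 | 56 | 4 | 87.5 | 77.77 | | | |
| | Low | 8 | 0 | 39 | 81.25 | 82.97 | | | |
| SVM | High | 46 | 2 | 4 | 82.14 | 88.46 | 86.9 | 0.193 | 0.342 |
| | Medium | 0 | 60 | 4 | 93.75 | 93.75 | | | |
| | Low | 10 | 2 | 40 | 83.3 | 76.92 | | | |
| MLP-ANN | High | 49 | 2 | 7 | 87.5 | 84.48 | 90.47 | 0.248 | 0.097 |
| | Medium | 4 | 58 | 4 | 90.62 | 87.87 | | | |
| | Low | 3 | 4 | 45 | 93.75 | 86.53 | | | |
| Proposed model | High | 55 | 1 | 1 | 98.21 | 96.49 | 95.83 | 0.173 | 0.0471 |
| | Medium | 1 | 60 | 1 | 93.75 | 96.77 | | | |
| | Low | 0 | 3 | 46 | 95.83 | 93.87 | | | |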
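The per-class figures for the proposed model can be recomputed from its confusion matrix with a short Python sketch. It assumes that rows are predicted classes and columns are actual classes, which is the orientation under which the reported recall and precision values are reproduced; the function and variable names are illustrative.

```python
LABELS = ["High", "Medium", "Low"]

# Confusion matrix of the proposed model (rows: predicted, columns: actual; assumed).
CM = [[55, 1, 1],
      [1, 60, 1],
      [0, 3, 46]]

def class_metrics(cm):
    """Return per-class recall and precision plus overall accuracy (in %)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    recall, precision = [], []
    for c in range(n):
        actual_c = sum(cm[r][c] for r in range(n))  # column sum
        predicted_c = sum(cm[c])                    # row sum
        recall.append(100.0 * cm[c][c] / actual_c)
        precision.append(100.0 * cm[c][c] / predicted_c)
    accuracy = 100.0 * sum(cm[c][c] for c in range(n)) / total
    return recall, precision, accuracy

recall, precision, accuracy = class_metrics(CM)
for label, r, p in zip(LABELS, recall, precision):
    print(f"{label}: recall={r:.2f}%, precision={p:.2f}%")
print(f"Overall accuracy={accuracy:.2f}%")
```

Under this orientation the column sums are 56, 64, and 48, matching the class sizes of the 168 test records.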

The results show that the highest total accuracy is related to the proposed method (95.83%) which is about 5% higher than the accuracy of the MLP-ANN algorithm. The highest precision and recall are related to the proposed algorithm, which are, respectively, 96.77% (medium) and 98.21% (high). It also shows that the training error of the proposed method based on NHL is less than the other algorithms used in this study.

As stated, $\eta$ and $\gamma$ are the two learning parameters of the NHL algorithm. The upper and lower limits of these parameters are determined by trial and error in order to optimize the final solution. After several simulations with different values of $\eta$ and $\gamma$, it was observed that large values of $\eta$ cause significant changes in the weights and their signs. Simulation with small values of $\gamma$ also creates significant weight changes, thus preventing the weights from entering the desired range. For this reason, the values of $\eta$ and $\gamma$ were each restricted to a bounded range. In each experiment, a constant value is used for these parameters.

After several investigations, it was found that the best classifier performance is obtained with $\eta = 0.045$ and $\gamma = 0.98$. The classification results obtained for the different values of the learning parameters are presented in Table 5.


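| $\eta$ | $\gamma$ | Confusion matrix: High | Medium | Low | Classification accuracy (%) |
|---|---|---|---|---|---|
| 0.01 | 0.97 | 50 | 4 | 7 | 88.69 |
| | | 4 | 59 | 1 | |
| | | 2 | 1 | 40 | |
| 0.03 | 0.95 | 45 | 6 | 1 | 89.28 |
| | | 5 | 58 | 0 | |
| | | 6 | 0 | 47 | |
| 0.045 | 0.98 | 55 | 1 | 1 | 95.83 |
| | | 1 | 60 | 1 | |
| | | 0 | 3 | 46 | |
| 0.05 | 0.96 | 54 | 6 | 0 | 94.04 |
| | | 1 | 56 | 0 | |
| | | 1 | 2 | 48 | |
| 0.055 | 0.96 | 53 | 2 | 5 | 91.6 |
| | | 2 | 58 | 0 | |
| | | 1 | 4 | 43 | |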
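The parameter selection summarized in Table 5 amounts to picking the setting with the highest accuracy. The Python sketch below recomputes the accuracy from each confusion matrix and selects the best pair. Reading the first two columns of the table as the learning rate and the weight decay, in that order, is an assumption based on their magnitudes.

```python
# Confusion matrices from Table 5, keyed by (learning rate, weight decay).
# The pairing of the two parameter columns is assumed, not stated in the table.
RESULTS = {
    (0.01, 0.97): [[50, 4, 7], [4, 59, 1], [2, 1, 40]],
    (0.03, 0.95): [[45, 6, 1], [5, 58, 0], [6, 0, 47]],
    (0.045, 0.98): [[55, 1, 1], [1, 60, 1], [0, 3, 46]],
    (0.05, 0.96): [[54, 6, 0], [1, 56, 0], [1, 2, 48]],
    (0.055, 0.96): [[53, 2, 5], [2, 58, 0], [1, 4, 43]],
}

def accuracy(cm):
    """Share of test records on the diagonal of the confusion matrix (in %)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return 100.0 * correct / total

best_params = max(RESULTS, key=lambda params: accuracy(RESULTS[params]))
best_accuracy = accuracy(RESULTS[best_params])
print(best_params, round(best_accuracy, 2))
```

Each matrix sums to the 168 test records, so the accuracies are directly comparable across settings.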

4. Discussion

In this study, we designed a risk prediction model and a GC risk assessment tool using data from a population of patients referred to the gastroenterology unit of Imam Reza Hospital in Tabriz. The proposed model is intended to complement the analyses of clinical experts, strengthening their ability to make logical decisions in a clinical setting for patients with different levels of GC risk factors and helping them choose optimal preventive methods for patients.

The 95.8% overall classification accuracy obtained through the Hebbian-based FCM using 560 patients indicates a high level of agreement between the proposed system and medical decisions. The proposed decision support tool can therefore be trusted by clinical professionals and helps them in the process of GC risk assessment.

Specifically, our risk assessment tool is simple and inexpensive to use in the clinical environment, because many other methods to predict the risk of GC are invasive. Therefore, this is an effective instrument for estimating the population at risk of cancer in the future. The results show that this new model can predict the probability of developing GC concerning the characteristics specified in this study with a better accuracy than previous studies.

In recent years, several studies have been carried out on the development and validation of risk assessment tools for various cancers [51, 52]. Recent studies have shown that the combination of H. pylori antibody and serum pepsinogen can be a good predictor of GC [53, 54].

We believe that only two evaluation instruments for GC exist other than ours. Based on the Japan Public Health Center-based Prospective Study, a tool was designed to estimate the cumulative probability of GC incidence, with sex, age, smoking status, the combination of H. pylori antibody and serum pepsinogen, consumption of salty food, and family history of GC as the risk factors [55]. The model performed well in terms of calibration and discrimination. In [2], a risk evaluation instrument for GC was proposed for the general population of Japan. In that work, gender, age, the combination of Helicobacter pylori antibody and pepsinogen status, smoking status, and hemoglobin A1C level were the risk factors for GC.

The risk factors chosen in these two studies were limited to a few specific characteristics and had little overlap with the factors in our study. Risk factors such as consumption of fruits and vegetables, alcohol consumption, history of cardiovascular disease, blood type, milk consumption, history of allergy, gastric reflux, storage containers, food intake, and family history of cancer were absent from both studies despite their importance in previous studies. Salt intake and a history of GC, which are known causes of GC, were not included in [2]. Another notable point is that, given the nature of the proposed model, our method accounts for factors that are related to each other and for the mutual effects among them, which the two previous studies did not consider.

Another advantage of the proposed method over other algorithms is that those methods cannot provide any explicit causal relationship and work as a black box. This problem makes such algorithms less suited to medical decision support systems. Finally, the new system has the following benefits:
(i) It examines factors that have not been taken into account in previous models for assessing the risk of GC
(ii) Because of the use of new factors, this model can be more effective in predicting the risk of GC
(iii) The proposed model is delivered as software with a simple, convenient, and user-friendly interface
(iv) Physicians and other researchers can use this software to support individual healthcare decisions
(v) It helps healthcare professionals decide on individual risk management mechanisms

The system presented in this study has the following limitations: (1) the small sample of patients used to train the model and predict GC, (2) the heavy dependence of the model on the knowledge of domain specialists, (3) dependence on initial conditions and communication, and (4) the absence of external validation of the prediction system. Although the system gives good results owing to the use of an appropriate database and important and relevant GC factors, the generalizability of our results cannot be established without testing the system on another data set. As a result, it is necessary to use a larger statistical population to test the proposed model.

5. Conclusions

Assessing the level of risk for GC is very important and helps in making decisions about screening. Few GC risk assessment tools have been proposed so far, and none of them comprehensively covers the risk factors reported in scientific studies on GC. The proposed model based on soft computing covers all the factors influencing the incidence of GC. The classification accuracy of the proposed method is higher than that of other machine learning classification methods, such as the decision tree and SVM. This is because the FCM allows domain knowledge to be used to determine its initial structure and initial weights, after which the NHL algorithm is used to train the FCM model and adjust these weights. The FCM-based model is comprehensive, transparent, and more effective than previous models for assessing the risk of GC. As a result, this risk assessment tool can help identify people at high risk of GC and help both healthcare providers and patients with the decision-making process. In future work, we will use more features and variations, as well as other learning algorithms, to determine the weights of the edges in the FCM.
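
The two-stage procedure described above (expert-defined initial weights, then NHL adjustment) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a sigmoid threshold function and one commonly cited form of the NHL update rule; the three concepts, the initial weights, the learning rate `eta`, the decay `gamma`, and the stopping tolerance are hypothetical values chosen only for illustration and are not taken from this study.

```python
import numpy as np

def sigmoid(x, lam=1.0):
    """Squash activations into (0, 1); lam controls the steepness."""
    return 1.0 / (1.0 + np.exp(-lam * x))

def fcm_step(A, W):
    """One FCM inference step: A_i <- f(A_i + sum over j != i of A_j * W[j, i])."""
    W_no_self = W - np.diag(np.diag(W))  # ignore self-loops
    return sigmoid(A + A @ W_no_self)

def nhl_update(A, W, eta=0.01, gamma=0.98):
    """One NHL-style update, applied only to edges the experts defined (non-zero)."""
    mask = W != 0
    Ai = A[np.newaxis, :]  # target concept i along columns
    Aj = A[:, np.newaxis]  # source concept j along rows
    # delta[j, i] = A_i * (A_j - sgn(W_ji) * W_ji * A_i)
    delta = Ai * (Aj - np.sign(W) * W * Ai)
    W_new = np.clip(gamma * W + eta * delta, -1.0, 1.0)
    return np.where(mask, W_new, 0.0)

def train_nhl(A0, W0, n_iter=100, tol=1e-4, out_idx=-1):
    """Alternate weight updates and inference until the output concept settles."""
    A, W = A0.copy(), W0.copy()
    for _ in range(n_iter):
        W = nhl_update(A, W)
        A_next = fcm_step(A, W)
        converged = abs(A_next[out_idx] - A[out_idx]) < tol
        A = A_next
        if converged:
            break
    return A, W

# Hypothetical 3-concept map: [salt intake, H. pylori infection, GC risk].
# W0[j, i] is the illustrative expert weight of the edge from concept j to concept i.
W0 = np.array([[0.0, 0.0, 0.6],
               [0.0, 0.0, 0.8],
               [0.0, 0.0, 0.0]])
A0 = np.array([0.7, 0.9, 0.5])  # illustrative initial activations for one patient
A, W = train_nhl(A0, W0)
print("final activations:", A)
print("adjusted weights:\n", W)
```

In this sketch the mask keeps absent edges at zero, so learning only tunes the causal links that the experts specified. This is what keeps the trained map readable as a set of signed cause-effect relationships.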

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Supplementary Materials

A review of fuzzy cognitive maps [56]. (Supplementary Materials)

References

  1. J. Ferlay, I. Soerjomataram, R. Dikshit et al., “Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012,” International Journal of Cancer, vol. 136, no. 5, pp. E359–E386, 2015.
  2. H. Charvat, S. Sasazuki, M. Inoue et al., “Prediction of the 10-year probability of gastric cancer occurrence in the Japanese population: the JPHC study cohort II,” International Journal of Cancer, vol. 138, no. 2, pp. 320–331, 2016.
  3. M. Iida, F. Ikeda, J. Hata et al., “Development and validation of a risk assessment tool for gastric cancer in a general Japanese population,” Gastric Cancer, vol. 21, no. 3, pp. 383–390, 2018.
  4. R. Sharda, D. Delen, and E. Turban, Analytics, Data Science, & Artificial Intelligence: Systems for Decision Support, Pearson, 2019.
  5. A. Amirkhani, E. I. Papageorgiou, A. Mohseni, and M. R. Mosavi, “A review of fuzzy cognitive maps in medicine: taxonomy, methods, and applications,” Computer Methods and Programs in Biomedicine, vol. 142, pp. 129–145, 2017.
  6. E. I. Papageorgiou, P. P. Spyridonos, C. D. Stylios, P. Ravazoula, P. P. Groumpos, and G. N. Nikiforidis, “Advanced soft computing diagnosis method for tumour grading,” Artificial Intelligence in Medicine, vol. 36, no. 1, pp. 59–70, 2006.
  7. M. H. B. de Moraes Lopes, N. R. S. Ortega, P. S. P. Silveira, E. Massad, R. Higa, and H. de Fátima Marin, “Fuzzy cognitive map in differential diagnosis of alterations in urinary elimination: a nursing approach,” International Journal of Medical Informatics, vol. 82, no. 3, pp. 201–208, 2013.
  8. E. I. Papageorgiou, C. D. Stylios, and P. P. Groumpos, “Active Hebbian learning algorithm to train fuzzy cognitive maps,” International Journal of Approximate Reasoning, vol. 37, no. 3, pp. 219–249, 2004.
  9. V. K. Mago, R. Mehta, R. Woolrych, and E. I. Papageorgiou, “Supporting meningitis diagnosis amongst infants and children through the use of fuzzy cognitive mapping,” BMC Medical Informatics and Decision Making, vol. 12, no. 1, 2012.
  10. P. Beena and R. Ganguli, “Structural damage detection using fuzzy cognitive maps and Hebbian learning,” Applied Soft Computing, vol. 11, no. 1, pp. 1014–1020, 2011.
  11. B. Kosko, “Fuzzy cognitive maps,” International Journal of Man-Machine Studies, vol. 24, no. 1, pp. 65–75, 1986.
  12. E. I. Papageorgiou and J. L. Salmeron, “A review of fuzzy cognitive maps research during the last decade,” IEEE Transactions on Fuzzy Systems, vol. 21, no. 1, pp. 66–79, 2013.
  13. E. I. Papageorgiou, N. I. Papandrianos, G. Karagianni, and D. Sfyras, “Fuzzy cognitive map based approach for assessing pulmonary infections,” in Lecture Notes in Computer Science, pp. 109–118, Springer, Berlin, Heidelberg, 2009.
  14. A. Amirkhani, M. R. Mosavi, F. Mohammadizadeh, and S. B. Shokouhi, “Classification of intraductal breast lesions based on the fuzzy cognitive map,” Arabian Journal for Science and Engineering, vol. 39, no. 5, pp. 3723–3732, 2014.
  15. A. Mpillis, E. I. Papageorgiou, C. A. Frantzidis, M. S. Tsatali, A. C. Tsolaki, and P. D. Bamidis, “A decision-support framework for promoting independent living and ageing well,” IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 1, pp. 199–209, 2015.
  16. E. I. Papageorgiou, J. Subramanian, A. Karmegam, and N. Papandrianos, “A risk management model for familial breast cancer: a new application using fuzzy cognitive map method,” Computer Methods and Programs in Biomedicine, vol. 122, no. 2, pp. 123–135, 2015.
  17. P. J. Giabbanelli, T. Torsney-Weir, and V. K. Mago, “A fuzzy cognitive map of the psychosocial determinants of obesity,” Applied Soft Computing, vol. 12, no. 12, pp. 3711–3724, 2012.
  18. G. Murphy, R. Pfeiffer, M. C. Camargo, and C. S. Rabkin, “Meta-analysis shows that prevalence of Epstein–Barr virus-positive gastric cancer differs based on sex and anatomic location,” Gastroenterology, vol. 137, no. 3, pp. 824–833, 2009.
  19. B. L. Zhang, N. He, Y. B. Huang, F. J. Song, and K. X. Chen, “ABO blood groups and risk of cancer: a systematic review and meta-analysis,” Asian Pacific Journal of Cancer Prevention, vol. 15, no. 11, pp. 4643–4650, 2014.
  20. C. Q. Sun, Y. B. Chang, L. L. Cui et al., “A population-based case-control study on risk factors for gastric cardia cancer in rural areas of Linzhou,” Asian Pacific Journal of Cancer Prevention, vol. 14, no. 5, pp. 2897–2901, 2013.
  21. S. M. Gapstur, J. M. Drope, E. J. Jacobs et al., “A blueprint for the primary prevention of cancer: targeting established, modifiable risk factors,” CA: a Cancer Journal for Clinicians, vol. 68, no. 6, pp. 446–470, 2018.
  22. P. Karimi, F. Islami, S. Anandasabapathy, N. D. Freedman, and F. Kamangar, “Gastric cancer: descriptive epidemiology, risk factors, screening, and prevention,” Cancer Epidemiology, Biomarkers & Prevention, vol. 23, no. 5, pp. 700–713, 2014.
  23. D. Y. Graham, “Epidemiology of gastric cancer,” in Gastric Cancer, M. Rugge and M. Fassan, Eds., V. E. Strong, Ed., pp. 23–33, Springer International Publishing, Switzerland, 2015.
  24. J. Dong and A. P. Thrift, “Alcohol, smoking and risk of oesophago-gastric cancer,” Best Practice & Research. Clinical Gastroenterology, vol. 31, no. 5, pp. 509–517, 2017.
  25. K. A. Moy, Y. Fan, R. Wang, Y. T. Gao, M. C. Yu, and J. M. Yuan, “Alcohol and tobacco use in relation to gastric cancer: a prospective study of men in Shanghai, China,” Cancer Epidemiology Biomarkers & Prevention, vol. 19, no. 9, pp. 2287–2297, 2010.
  26. A. R. Yusefi, K. Bagheri Lankarani, P. Bastani, M. Radinmanesh, and Z. Kavosi, “Risk factors for gastric cancer: a systematic review,” Asian Pacific Journal of Cancer Prevention, vol. 19, no. 3, pp. 591–603, 2018.
  27. M. Pakseresht, D. Forman, R. Malekzadeh et al., “Dietary habits and gastric cancer risk in north-west Iran,” Cancer Causes & Control, vol. 22, no. 5, pp. 725–736, 2011.
  28. K. Jain, V. Sreenivas, T. Velpandian, U. Kapil, and P. K. Garg, “Risk factors for gallbladder cancer: a case–control study,” International Journal of Cancer, vol. 132, pp. 1660–1666, 2012.
  29. L. D’Elia, G. Rossi, R. Ippolito, F. P. Cappuccio, and P. Strazzullo, “Habitual salt intake and risk of gastric cancer: a meta-analysis of prospective studies,” Clinical Nutrition, vol. 31, no. 4, pp. 489–498, 2012.
  30. M. Verdalet-Olmedo, C. Sampieri, J. M. Romero, H. M. Guevara, Á. M. Machorro-Castaño, and K. L. Córdoba, “Omission of breakfast and risk of gastric cancer in Mexico,” World Journal of Gastrointestinal Oncology, vol. 4, no. 11, pp. 223–229, 2012.
  31. M. Ganjavi, B. Faraji, F. Kamangar, and C. Tucker, “Delayed effect of fruits and vegetables on gastric cancer,” Journal of The Academy of Nutrition and Dietetics, vol. 117, no. 9, p. A21, 2017.
  32. M. B. Braga-Neto, J. G. Carneiro, A. M. de Castro Barbosa et al., “Clinical characteristics of distal gastric cancer in young adults from Northeastern Brazil,” BMC Cancer, vol. 18, no. 1, p. 131, 2018.
  33. X. J. Cheng, J. C. Lin, and S. P. Tu, “Etiology and prevention of gastric cancer,” Gastrointest Tumors, vol. 3, no. 1, pp. 25–36, 2016.
  34. Y. Guo, Z. Shan, H. Ren, and W. Chen, “Dairy consumption and gastric cancer risk: a meta-analysis of epidemiological studies,” Nutrition and Cancer, vol. 76, no. 4, pp. 555–568, 2015.
  35. F. Habibzadeh, “Gastric cancer in the Middle East,” The International Journal of Occupational and Environmental Medicine, vol. 382, 2013.
  36. I. Husain, M. Alalyani, and A. H. Hanga, “Disposable plastic food container and its impacts on health,” The Journal of Energy and Environmental Science, vol. 130, pp. 618–623, 2015.
  37. J. D. Weidenhamer, M. P. Fitzpatrick, A. M. Biro et al., “Metal exposures from aluminum cookware: an unrecognized public health risk in developing countries,” Science of the Total Environment, vol. 579, pp. 805–813, 2017.
  38. S. Jo, T. J. Kim, H. Lee et al., “Associations between atopic dermatitis and risk of gastric cancer: a nationwide population-based study,” The Korean Journal of Gastroenterology, vol. 71, no. 1, pp. 38–44, 2018.
  39. X. Jiang, C. C. Tseng, L. Bernstein, and A. H. Wu, “Family history of cancer and gastroesophageal disorders and risk of esophageal and gastric adenocarcinomas: a case–control study,” BMC Cancer, vol. 14, no. 1, 2014.
  40. M. Song, M. C. Camargo, S. J. Weinstein et al., “Family history of cancer in first-degree relatives and risk of gastric cancer and its precursors in a Western population,” Gastric Cancer, vol. 21, no. 5, pp. 729–737, 2018.
  41. C. Y. Yun, N. Kim, J. Lee et al., “Usefulness of OLGA and OLGIM system not only for intestinal type but also for diffuse type of gastric cancer, and no interaction among the gastric cancer risk factors,” Helicobacter, vol. 23, no. 6, 2018.
  42. S. A. Mahmoodi, K. Mirzaie, and S. M. Mahmoudi, “A new algorithm to extract hidden rules of gastric cancer data based on ontology,” Springerplus, vol. 5, no. 1, 2016.
  43. M. S. Kwak, K. S. Choi, S. Park, and E. C. Park, “Perceived risk for gastric cancer among the general Korean population: a population-based survey,” Psychooncology, vol. 18, no. 7, pp. 708–715, 2009.
  44. M. Rugge, R. M. Genta, F. di Mario et al., “Gastric cancer as preventable disease,” Clinical Gastroenterology and Hepatology, vol. 15, no. 12, pp. 1833–1843, 2017.
  45. “Causes, risk factors, and prevention, stomach cancer risk factors,” 2017, https://www.cancer.org/cancer/stomach-cancer/causes-risks-prevention/risk-factors/.
  46. K. Sugano, “Effect of Helicobacter pylori eradication on the incidence of gastric cancer: a systematic review and meta-analysis,” Gastric Cancer, vol. 22, no. 3, pp. 435–445, 2019.
  47. V. E. Cokkinides, P. Bandi, R. L. Siegel, and A. Jemal, “Cancer-related risk factors and preventive measures in US Hispanics/Latinos,” CA: a Cancer Journal for Clinicians, vol. 62, no. 6, pp. 353–363, 2012.
  48. J. Jiang, Y. Chen, J. Shi, C. Song, J. Zhang, and K. Wang, “Population attributable burden of Helicobacter pylori-related gastric cancer, coronary heart disease, and ischemic stroke in China,” European Journal of Clinical Microbiology & Infectious Diseases, vol. 36, no. 2, pp. 199–212, 2017.
  49. E. I. Papageorgiou and W. Froelich, “Application of evolutionary fuzzy cognitive maps for prediction of pulmonary infections,” IEEE Transactions on Information Technology in Biomedicine, vol. 16, no. 1, pp. 143–149, 2012.
  50. A. Amirkhani, M. R. Mosavi, K. Mohammadi, and E. I. Papageorgiou, “A novel hybrid method based on fuzzy cognitive maps and fuzzy clustering algorithms for grading celiac disease,” Neural Computing and Applications, vol. 30, no. 5, pp. 1573–1588, 2018.
  51. K. G. Yeoh, K. Y. Ho, H. M. Chiu et al., “The Asia-Pacific Colorectal Screening score: a validated tool that stratifies risk for colorectal advanced neoplasia in asymptomatic Asian subjects,” Gut, vol. 60, no. 9, pp. 1236–1241, 2011.
  52. B. Rosner, G. A. Colditz, J. D. Iglehart, and S. E. Hankinson, “Risk prediction models with incomplete data with application to prediction of estrogen receptor-positive breast cancer: prospective data from the Nurse’s Health Study,” Breast Cancer Research, vol. 10, no. 4, p. R55, 2008.
  53. T. Terasawa, H. Nishida, K. Kato et al., “Prediction of gastric cancer development by serum pepsinogen test and Helicobacter pylori seropositivity in Eastern Asians: a systematic review and meta-analysis,” PLoS One, vol. 9, no. 10, p. e109783, 2014.
  54. H. Watabe, T. Mitsushima, Y. Yamaji et al., “Predicting the development of gastric cancer from combining Helicobacter pylori antibodies and serum pepsinogen status: a prospective endoscopic cohort study,” Gut, vol. 54, no. 6, pp. 764–768, 2005.
  55. C. De Martel, J. Ferlay, S. Franceschi et al., “Global burden of cancers attributable to infections in 2008: a review and synthetic analysis,” The Lancet Oncology, vol. 13, no. 6, 2012.
  56. E. I. Papageorgiou, “Learning algorithms for fuzzy cognitive maps–a review study,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 42, no. 2, pp. 150–163, 2012.

Copyright © 2020 Seyed Abbas Mahmoodi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

