Abstract

Health care in developing countries still needs substantial improvement. Breast cancer is a disease commonly found among women, and earlier research has shown that the chances of overcoming it are higher when it is detected at a very early stage than when it is detected or treated at a later stage. This article proposes a cloud-based intelligent system, BCP-T1F-SVM, with two variants: BCP-T1F and BCP-SVM. The proposed system employs two soft computing algorithms, type-1 fuzzy logic and the support vector machine. The BCP-T1F expert system identifies the stage and the type of cancer a person is suffering from and indicates how far the disease has progressed, which supports the diagnosis of breast cancer at an initial stage. BCP-SVM predicts whether breast cancer is present and gives the higher accuracy of the two models. The evaluation carried out in this research shows that BCP-T1F achieves an accuracy of 96.56%, whereas BCP-SVM achieves an accuracy of 97.06%, so BCP-SVM performs better than BCP-T1F. The expert opinions were provided by medical experts of Sheikh Zayed Hospital, Lahore, Pakistan, and Cavan General Hospital, Lisdaran, Cavan, Ireland.

1. Introduction

The diagnosis and examination of breast cancer have always been critical tasks in medicine. Cancerous lumps form in a particular area of the body when human cells begin to multiply rapidly beyond the expected limit. These lumps, also termed tumors, are of two kinds: benign and malignant. Breast cancer is a lump formed in the breast cells; when these cells begin to grow irregularly, flaking and redness of the breast can result. In various parts of the world, cancer still goes undiagnosed and untreated. This raises several questions. Why does cancer still have such strong roots in patients? Why does breast cancer still remain undiagnosed among women? Why is the mortality rate among women due to breast cancer still constant? Breast cancer should be diagnosed at an early stage so that the condition does not persist and many lives can be saved. If it is diagnosed early, the chances of overcoming the disease are certainly higher. Women are commonly affected by this disease, and if the cancer remains undiagnosed, it may lead to death [1, 2]. The risk factors for breast cancer include genetic causes, alcohol intake, dense breast tissue, radiation exposure, age, and so on. Since 1989, the survival rate has improved immensely owing to modern technologies introduced in screening and treatment. Recent research shows that, in 2017, 252,710 women were diagnosed with this disease, and according to the statistics approximately 40,610 were likely to die from breast cancer. A key step in reducing the risk is awareness of the causes of breast cancer; recognizing the early symptoms and screening can reduce the number of people affected [3].

The physical diagnosis of this deadly disease involves the breast exam, imaging tests, biopsy, blood tests, and so on. The initial blood marker tests, such as CA 15.3, TRU-QUANT, CA 125, and CEA, are done before treatment and serve as an initial indicator of the disease. Currently, four main methods are employed to differentiate between a malignant lump and a benign lump: biopsy, fine needle aspiration, mammography, and MRI [4, 5]. The proposed methodology employs two algorithms to detect the disease efficiently. The Mamdani inference system has been used in recent research to detect particular kinds of disease, and this work compares previously reported evaluations with the proposed methodology [6]. Mammography gives only a partial diagnosis: if the result is negative, the lump is taken to be benign. Mammography indicates whether there is no lump, a lump, or a cancerous lump. The severity level and the type of cancer are then measured with the help of the biopsy gold test. The biopsy gold test (type) distinguishes benign, ductal carcinoma, invasive lobular carcinoma, inflammatory breast disease, and lobular carcinoma [7, 8].

Recent research shows that, in 2017, 252,710 women were diagnosed with this disease, of whom approximately 40,610 were likely to die from breast cancer [1, 9, 10]. The physical diagnosis involves the breast exam, imaging tests, biopsy, blood tests, and so on. Currently, four main methods are used to differentiate between a malignant lump and a benign lump: biopsy, fine needle aspiration, mammography, and MRI [2, 3, 11].

In this research, fuzzy logic and SVM are used to detect breast cancer, and they are evaluated using different statistical measures such as accuracy, miss rate, specificity, sensitivity, false-positive value, false-negative value, likelihood ratio positive, likelihood ratio negative, positive prediction value, and negative prediction value. Measured by these metrics, breast cancer can be detected more accurately than in the previous literature.

2. Literature Review

To avoid faulty reasoning, errors, repeated failures, lack of knowledge, and failures in the rules of logic in the detection of tumors, researchers have started to develop new methodologies, models, and tools. With this aim, various systems and models have been put forward for the successful identification of breast cancer. Breast cancer is the most frequently diagnosed cancer among women in America, accounting for approximately one in three cancers [6, 12, 13]. Surgical biopsies confirm malignancy with a high level of sensitivity but are considered costly and can affect the patient's psychology as well. One study demonstrates a novel approach that uses morphological operators and the fuzzy c-means clustering algorithm to identify malignant lumps in mammograms automatically [6, 14–17]. Other articles propose the initial identification of tumors and various applications and algorithms of fuzzy systems [18, 19]. Some previous studies on fuzzy decision trees (FDTs) focus on modifying the decision tree pruning algorithm and require fuzzy parameters to be set by domain experts. An alternative is to fuzzify already generated decision tree nodes to relax the sharp decision boundaries; a similar kind of approach is employed in [20, 21].

Fuzzy logic has rarely been used in cancer prognosis. Being noncrisp, it can act as a natural ally of a physician in the prognostic decision-making process [22]. We have surveyed various types of recent research scenarios [21–23]; in the prognosis of cancer, applications of machine learning techniques are contributing to advanced research. Some of the basic trends that motivate the experiments are the following. Fuzzy logic has rarely been used in the diagnosis of cancer. Although physicians need a clear interpretation of a particular type of disease, approximately 70% of the reported studies make use of neural networks, which are "black box" models. The majority of the manuscripts use machine learning techniques independently, without considering their potential to complement each other in a hybrid model. Little attention is paid to data size. Victor proposes a computerized tool for the diagnosis of breast cancer. The fact that fuzzy logic can substantially assist in the diagnosis of breast cancer is put forward in [24–26].

Breast cancer has been diagnosed through fuzzy clustering with partial supervision [27]. The ARTMAP approach, which represents a one-way approach in neural networks for the diagnosis of breast cancer, gives 97.2% accuracy [28]. A classification accuracy of over 95% was claimed using the mini-MIAS database; the proposed framework diagnoses breast cancer on the basis of the extreme learning machine [29–32].

A technique for the classification of mammograms comprising four phases was presented. The preprocessing stage used a median filter to enhance the quality of the image and to remove noise from it. To check for abnormalities in the mammograms, an ANN classifier was used to assign each image to the appropriate class. The sensitivity, specificity, and accuracy claimed in the work were 72.72%, 93.6%, and 88.66% [18, 33].

A framework for the diagnosis of breast cancer based on feedforward networks was proposed. Preprocessing was done in two phases: the first handled the background and the second removed the pectoral muscle. The Hough transform was used to extract the ROI. A total of 32 gray-level and texture features were extracted from the mammograms. The accuracy claimed using the mini-MIAS database was 94.06% [34].

Several techniques have been deployed to predict and recognize significant patterns for breast cancer diagnosis. Data mining methods such as the decision tree, ANN, RIPPER classifier, and Support Vector Machine (SVM) have been used to give a quick explanation and survey of a dataset regarding cardiovascular disease. The study compared the performance of the techniques in terms of accuracy, sensitivity, specificity, error rate, true positive rate, and false positive rate [35, 36].

Computational intelligence approaches such as fuzzy systems [37–39], neural networks [40], swarm intelligence [41], and evolutionary computing [42], including the genetic algorithm [43, 44], DE, Island GA [45], Island DE [46, 47], classifiers [48], and SVM [49], are strong candidate solutions in fields such as smart cities [50], wireless communication, and so on.

3. Proposed System Model Methodology

The proposed methodology is shown in Figure 1. The first layer is the data acquisition layer, which collects the breast cancer data. The raw data is then fed into the preprocessing layer, which handles missing values and applies a moving average and normalization; the moving average reduces omissions and errors in the data. After preprocessing, the data passes to the application layer, which comprises the prediction layer and the performance evaluation layer. The prediction layer uses two algorithms: type-1 fuzzy logic determines the type of breast cancer, whereas SVM only indicates whether or not a person is suffering from the disease. Both algorithms are shown in Figure 1. The performance evaluation layer calculates the accuracy and miss rate. Type-1 fuzzy logic consists of logical rules, which can be defined easily with the help of a medical expert. The rules are applied to fuzzy input sets and produce a fuzzy output. In this research, the input variables are used in a fuzzy logic model to diagnose breast cancer. The Support Vector Machine (SVM) is a supervised learning model that classifies a sample according to the side of a separating hyperplane on which it falls. In the prediction layer, SVM is used to detect breast cancer, and the performance layer evaluates the results produced by the prediction layer.

The whole system process is shown in Figure 1, in which the data acquisition layer comprises the input parameters. A trained SVM model is used to predict breast cancer; SVM is widely used in industry and gives accurate results. An SVM is defined by its support vectors and the weights attached to them, which together determine the separating hyperplane. The prediction layer examines breast cancer on the basis of thirty input parameters using machine learning, that is, statistical models and algorithms that computer systems employ to perform a certain type of task with high precision. In the performance evaluation layer, accuracy and miss rate are determined. In the decision step, the conclusion is made whether breast cancer is identified or not.

3.1. Fuzzy System Methodology

Our proposed breast cancer prediction (BCP) model, a multilayered Mamdani fuzzy type-1 inference system- (MFIS-) based expert system (BCP-T1F), is explained in this section. The BCP-T1F expert system consists of four layers, as shown in Figure 2. Layer 1, named symptoms, checks the initial symptoms of breast cancer: swelling, breast pain, redness, nipple retraction, Family Inheritance Breast Cancer (FBIC), and skin irritation. This layer determines whether the finding is a lump or cancer.

If the system finds the symptoms in the patient, then layer 2 diagnoses breast cancer (no/yes) using two input variables, ultrasound and mammography. If layer 2 diagnoses breast cancer, then layer 3 is activated. Layer 3 predicts the type and severity of the breast cancer based on two input variables, biopsy gold severity and biopsy gold type. Layer 4 then determines the stage of the cancer from three input variables, MRI, CT, and PET, as shown in Figure 2.
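The gating between the four layers can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function name `run_bcp_layers`, the callables passed to it, and the result keys are all hypothetical, and each layer is reduced to a callable so that only the control flow (each layer runs only if the previous one is positive) is shown.

```python
from typing import Callable, Dict, Optional


def run_bcp_layers(
    has_symptoms: Callable[[], bool],
    cancer_detected: Callable[[], bool],
    type_and_severity: Callable[[], Dict[str, str]],
    stage: Callable[[], str],
) -> Dict[str, Optional[str]]:
    """Evaluate the layers in order and stop at the first negative gate."""
    result: Dict[str, Optional[str]] = {
        "symptoms": "no", "cancer": None, "type": None,
        "severity": None, "stage": None,
    }
    if not has_symptoms():              # layer 1: symptoms
        return result
    result["symptoms"] = "yes"
    if not cancer_detected():           # layer 2: ultrasound and mammography
        result["cancer"] = "no"
        return result
    result["cancer"] = "yes"
    result.update(type_and_severity())  # layer 3: biopsy gold type and severity
    result["stage"] = stage()           # layer 4: MRI, CT, and PET
    return result


if __name__ == "__main__":
    # Hypothetical layer outputs, for illustration only.
    outcome = run_bcp_layers(
        lambda: True,
        lambda: True,
        lambda: {"type": "ductal carcinoma", "severity": "high"},
        lambda: "Stage 2",
    )
    print(outcome)
```

In a full system each callable would wrap the fuzzy inference of the corresponding layer; here they are stubs so that the early-exit behavior is easy to follow.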

Mathematically, the layers of the proposed BCP-T1F can be written as follows.

This layer 1 can be written mathematically as

The layer 2 can be written as

Then layer 3 can be written as

Then layer 4 can be written as

3.1.1. Membership Functions

The membership functions of the proposed BCP-T1F expert system yield curve values ranging between 0 and 1 and give a mathematical form to the fuzzy logic for both the input and output variables. The membership functions of layers 1–4 are shown in Tables 1–4. These membership functions were defined after consultation with the medical experts from Cavan General Hospital, Lisdaran, Cavan, Ireland.
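As an illustration of how a membership function maps a crisp input to a degree between 0 and 1, the following Python sketch uses a triangular shape. The triangular form, the function names, and the breakpoints of the "no" and "yes" sets are assumptions made for the example; the actual shapes and ranges are those given in the paper's tables.

```python
from typing import Dict, Tuple


def triangular(v: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if v <= a or v >= c:
        return 0.0
    if v <= b:
        return (v - a) / (b - a)
    return (c - v) / (c - b)


def fuzzify(v: float, sets: Dict[str, Tuple[float, float, float]]) -> Dict[str, float]:
    """Return the membership degree of v in every labeled fuzzy set."""
    return {label: triangular(v, *abc) for label, abc in sets.items()}


if __name__ == "__main__":
    # Hypothetical fuzzy sets for a symptom score in [0, 1].
    symptom_sets = {"no": (-0.5, 0.0, 0.5), "yes": (0.5, 1.0, 1.5)}
    print(fuzzify(0.25, symptom_sets))
```

A score of 0.25 belongs to "no" with degree 0.5 and to "yes" with degree 0, which is the kind of graded value the inference engine then combines through the rules.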

3.1.2. Rules Table

The rules table of the proposed BCP-T1F expert system consists of sixty-four input/output rules for layer 1, fifteen input/output rules for layer 2, eight input/output rules for layer 3, and thirty input/output rules for layer 4. The rules (Table 5) were obtained with the assistance of the medical experts from Cavan General Hospital, Lisdaran, Cavan, Ireland.
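The count of sixty-four rules for layer 1 is what one would expect if each of the six symptoms takes two levels, since 2^6 = 64. The sketch below enumerates those antecedent combinations; the two-level ("no"/"yes") assumption and the helper name `enumerate_antecedents` are illustrative and not taken from the paper's tables.

```python
from itertools import product
from typing import Dict, List, Sequence

# The six layer-1 symptoms named in the text.
SYMPTOMS = [
    "swelling", "breast pain", "redness",
    "nipple retraction", "FBIC", "skin irritation",
]


def enumerate_antecedents(
    variables: Sequence[str], levels: Sequence[str] = ("no", "yes")
) -> List[Dict[str, str]]:
    """List every combination of levels over the given input variables."""
    return [
        dict(zip(variables, combo))
        for combo in product(levels, repeat=len(variables))
    ]


if __name__ == "__main__":
    antecedents = enumerate_antecedents(SYMPTOMS)
    print(len(antecedents))  # 64 combinations for six two-level inputs
```

Each combination would then be paired by the medical experts with an output label to form one rule.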

3.1.3. Rule Based

Rules relate the input variables to the output variables, and the performance of an expert system depends on them. Some of the rules are shown in Table 4.

3.1.4. Inference Engine

The inference engine is the central component of any decision-based expert system. In this work, the inference engine of the BCP-T1F expert system is employed in each of the four layers.

3.1.5. Defuzzification

Defuzzification is the process of producing a measurable result in crisp logic, given fuzzy sets and the corresponding membership degrees. It is the process that maps a fuzzy set to a crisp value, and it is typically needed in fuzzy control systems. Figures 3(a)–3(d) show the graphical illustrations of the defuzzifier of the BCP-T1F expert system.
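A minimal Python sketch of Mamdani-style aggregation followed by defuzzification is given below. It assumes min clipping, max aggregation, and the centroid method on a discretized output universe; the text does not state which defuzzification method was used, so the centroid choice, the function names, and the "low" and "high" output sets are assumptions for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def triangular(v: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if v <= a or v >= c:
        return 0.0
    return (v - a) / (b - a) if v <= b else (c - v) / (c - b)


def aggregate(
    universe: Sequence[float],
    fired_rules: Sequence[Tuple[float, Callable[[float], float]]],
) -> List[float]:
    """Clip each rule's output set at its firing strength (min), combine with max."""
    return [
        max(min(strength, mf(u)) for strength, mf in fired_rules)
        for u in universe
    ]


def centroid(universe: Sequence[float], membership: Sequence[float]) -> float:
    """Centroid of a discretized fuzzy set: membership-weighted mean of the universe."""
    total = sum(membership)
    if total == 0:
        raise ValueError("no rule fired; the aggregated set is empty")
    return sum(u * m for u, m in zip(universe, membership)) / total


if __name__ == "__main__":
    universe = [i * 0.5 for i in range(21)]  # output axis from 0.0 to 10.0
    low = lambda u: triangular(u, 0.0, 2.5, 5.0)    # hypothetical output set
    high = lambda u: triangular(u, 5.0, 7.5, 10.0)  # hypothetical output set
    crisp = centroid(universe, aggregate(universe, [(0.2, low), (0.8, high)]))
    print(crisp)
```

Because the rule pointing to "high" fires more strongly (0.8 versus 0.2), the crisp value lands in the upper half of the output range.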

3.1.6. Lookup Diagram

The MATLAB R2019a tool is used for modeling, simulation, algorithm development, prototyping, and various other purposes. This tool is well suited to software design, data analysis, visualization, and computation. For the simulation of results, the three inputs and one output of layer 4 are used, as shown in Figure 4.

Figure 4 shows that when the first input is node 2, the second input is low size, and the third input is spread in whole body, the output is Stage 3.

Similarly, Figure 4 also demonstrates the rule-based knowledge; a few of the rules are as follows:

If the first input is node 3, the second input is very high size, and the third input is spread in whole body, then the output is Stage 4.

If the first input is node 2, the second input is low size, and the third input is spread in whole body, then the output is Stage 2.

If the first input is node 1, the second input is no tumor, and the third input is benign, then the output is Stage 0.

If the first input is node 2, the second input is low size, and the third input is benign, then the output is Stage 1.

3.2. SVM-Based System Model
3.2.1. Sensor Data

Heterogeneous sensors continuously collect environmental data, transforming physical quantities into measurements. Multiple sensors are connected to the sensor board in a network topology. Each sensor node acquires a subset of the samples collected from the random signal and compresses and summarizes it locally.

3.2.2. Preprocessing

Data preprocessing is a data mining technique that transforms the raw data collected from patients into an understandable format. Real-world data is often incomplete, inconsistent, and likely to contain many errors. In this step, we handle the missing values using the mean, mode, and so on. We also mitigate noise in the data using the moving average method with a filter size of five. Data preprocessing prepares the raw data for further processing.
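The preprocessing steps described here (filling missing values with the mean, smoothing with a moving average of size five, and normalizing) can be sketched in plain Python as follows. The centered window that shrinks at the edges and the min-max form of normalization are assumptions, since the text does not specify either, and all function names are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Sequence


def impute_mean(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed entries."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def moving_average(values: Sequence[float], window: int = 5) -> List[float]:
    """Centered moving average; the window shrinks near the edges (an assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def min_max(values: Sequence[float]) -> List[float]:
    """Scale values linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Hypothetical readings of one input parameter, with one missing value.
    raw = [1.0, 2.0, None, 4.0, 5.0, 6.0, 7.0]
    print(min_max(moving_average(impute_mean(raw), window=5)))
```

In practice each of the thirty input parameters would be processed column by column in this way before being passed to the prediction layer.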

Figure 5 presents the proposed system model for breast cancer prediction using the support vector machine in ML [48], BCP-SVM, and depicts the whole process. In this model, the data gathered from the Internet of Medical Things is used in the sensory layer, and this data can be updated with the help of sensors. The sensory layer holds all the parameters that are employed to predict cancer, and its output is raw data. The raw data is fed into the preprocessing layer, which prepares it for further processing by handling missing values and applying the moving average and normalization; the moving average is employed to eliminate inconsistencies from the data. The preprocessed data then passes to the application layer, which uses these parameters to detect breast cancer malignancy. The application layer is divided into two parts, the prediction layer and the performance layer. In the prediction layer, the Support Vector Machine is used to detect breast cancer, and the performance layer evaluates the results produced by the prediction layer, as shown in Figure 5. The application layer thus determines whether the required accuracy is achieved or not.

The proposed model is organized into five layers. If the required accuracy is achieved in training, the trained model is passed on to the cloud for the validation process. The cloud stores the data for both the training process and the testing process.

For the validation process, data is received from the cloud. The trained model or input is fed into the cloud for evaluation during testing. It is then forwarded to the preprocessing layer [51], where the data is improved by handling missing values and errors, and finally it is passed on for diagnosis.

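As we know, the equation of a line is

$$s_2 = x s_1 + y,$$

where "x" is the slope of the line and "y" is the intercept; therefore,

$$x s_1 - s_2 + y = 0.$$

Let $\mathbf{s} = (s_1, s_2)^T$ and $\mathbf{w} = (x, -1)^T$; then the above equation can be written as

$$\mathbf{w} \cdot \mathbf{s} + y = 0. \tag{6}$$

This equation is obtained from 2-dimensional vectors, but it also holds for any number of dimensions; (6) is the general equation of a hyperplane.

The direction of a vector $\mathbf{s} = (s_1, s_2)^T$ is written as $\mathbf{u}$ and is defined as

$$\mathbf{u} = \left( \frac{s_1}{\|\mathbf{s}\|}, \frac{s_2}{\|\mathbf{s}\|} \right),$$

where

$$\|\mathbf{s}\| = \sqrt{s_1^2 + s_2^2}.$$

As we know,

$$\cos\theta = \frac{s_1}{\|\mathbf{s}\|}, \qquad \cos\beta = \frac{s_2}{\|\mathbf{s}\|},$$

where $\theta$ and $\beta$ are the angles that $\mathbf{s}$ makes with the two coordinate axes. The direction can therefore also be written as

$$\mathbf{u} = (\cos\theta, \cos\beta).$$

For two n-dimensional vectors, the dot product is

$$\mathbf{w} \cdot \mathbf{s} = \sum_{k=1}^{n} w_k s_k.$$

The dot product can be computed with the above equation for n-dimensional vectors.

Let

$$z = m(\mathbf{w} \cdot \mathbf{s} + y),$$

where $m \in \{-1, +1\}$ is the class label of $\mathbf{s}$. If sign(z) > 0, then the point is correctly classified, and if sign(z) < 0, then it is incorrectly classified.

Given a dataset D of N labeled samples $(\mathbf{s}_i, m_i)$, we compute z for each training sample:

$$z_i = m_i(\mathbf{w} \cdot \mathbf{s}_i + y), \qquad i = 1, \dots, N.$$

Then Z, which is called the functional margin of the dataset, is as follows:

$$Z = \min_{i = 1, \dots, N} z_i.$$

Among the candidate hyperplanes, the one with the largest Z is preferred. When $\mathbf{w}$ is scaled to unit length, Z is the geometric margin of the dataset. The main goal is to find the optimal hyperplane, which means finding the values of $\mathbf{w}$ and y of the optimal hyperplane. With the functional margin fixed at 1, this amounts to minimizing $\frac{1}{2}\,\mathbf{w} \cdot \mathbf{w}$ subject to $m_i(\mathbf{w} \cdot \mathbf{s}_i + y) \ge 1$ for every i.

The Lagrangian function, with multipliers $\alpha_i \ge 0$, is

$$L(\mathbf{w}, y, \alpha) = \frac{1}{2}\,\mathbf{w} \cdot \mathbf{w} - \sum_{i=1}^{N} \alpha_i \left[ m_i(\mathbf{w} \cdot \mathbf{s}_i + y) - 1 \right],$$

and its gradients are set to zero:

$$\nabla_{\mathbf{w}} L = \mathbf{w} - \sum_{i=1}^{N} \alpha_i m_i \mathbf{s}_i = 0, \tag{16}$$

$$\nabla_{y} L = -\sum_{i=1}^{N} \alpha_i m_i = 0. \tag{17}$$

From (16) and (17), we get

$$\mathbf{w} = \sum_{i=1}^{N} \alpha_i m_i \mathbf{s}_i, \qquad \sum_{i=1}^{N} \alpha_i m_i = 0.$$

After substituting these into the Lagrangian function L, we get

$$W(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j m_i m_j \, \mathbf{s}_i \cdot \mathbf{s}_j,$$

and

$$\max_{\alpha} W(\alpha) \quad \text{subject to} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{N} \alpha_i m_i = 0.$$

Because the constraints are inequalities, the method of Lagrange multipliers is extended to the Karush–Kuhn–Tucker conditions. The Karush–Kuhn–Tucker complementary condition is expressed as

$$\alpha_i \left[ m_i(\mathbf{w} \cdot \mathbf{s}_i + y) - 1 \right] = 0, \qquad i = 1, \dots, N,$$

where $\mathbf{w}$ and y are taken at the optimal point, $\alpha_i$ is positive for the points that lie on the margin, and $\alpha_i = 0$ for the other points.

So, for every point with $\alpha_i > 0$,

$$m_i(\mathbf{w} \cdot \mathbf{s}_i + y) - 1 = 0.$$

These points are called support vectors, which are the closest points to the hyperplane. According to the stationarity condition (16),

$$\mathbf{w} = \sum_{i=1}^{N} \alpha_i m_i \mathbf{s}_i.$$

To compute the value of y, we take a support vector and get

$$m_i(\mathbf{w} \cdot \mathbf{s}_i + y) - 1 = 0. \tag{24}$$

Multiplying both sides by $m_i$ in (24), we get

$$m_i^2(\mathbf{w} \cdot \mathbf{s}_i + y) - m_i = 0,$$

where $m_i^2 = 1$; hence,

$$y = m_i - \mathbf{w} \cdot \mathbf{s}_i.$$

Then, averaging over the support vectors,

$$y = \frac{1}{S} \sum_{i=1}^{S} \left( m_i - \mathbf{w} \cdot \mathbf{s}_i \right),$$

where S is the number of support vectors and the sum runs over the support vectors. With $\mathbf{w}$ and y, we have the hyperplane, which is used to make predictions. The hypothesis function is as follows:

$$h(\mathbf{s}) = \begin{cases} +1, & \mathbf{w} \cdot \mathbf{s} + y \ge 0, \\ -1, & \mathbf{w} \cdot \mathbf{s} + y < 0. \end{cases}$$

A point that lies on or above the hyperplane is classified as class +1 (breast cancer found), and a point that lies below the hyperplane is classified as −1 (breast cancer not found).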

So, basically, the objective of the SVM algorithm is to find a hyperplane that separates the data accurately. Among all such hyperplanes we need to find the best one, which is often referred to as the optimal hyperplane.
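To make the role of the support vectors concrete, the following Python sketch builds the hyperplane from a given set of support vectors and then classifies new points. It follows the standard hard-margin formulation: the weight vector is the sum of multiplier times label times support vector, and the intercept (called y in the text) is the average of label minus the dot product over the support vectors. The function names and the two-point toy dataset are hypothetical, and the multipliers are supplied by hand rather than obtained from an optimizer.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def weight_vector(
    alphas: Sequence[float], labels: Sequence[int], vectors: Sequence[Sequence[float]]
) -> List[float]:
    """w = sum_i alpha_i * m_i * s_i over the support vectors."""
    dim = len(vectors[0])
    return [
        sum(a * m * s[k] for a, m, s in zip(alphas, labels, vectors))
        for k in range(dim)
    ]


def intercept(
    w: Sequence[float], labels: Sequence[int], vectors: Sequence[Sequence[float]]
) -> float:
    """y = (1/S) * sum_i (m_i - w . s_i), averaged over the S support vectors."""
    return sum(m - dot(w, s) for m, s in zip(labels, vectors)) / len(vectors)


def predict(w: Sequence[float], y: float, s: Sequence[float]) -> int:
    """Hypothesis function: +1 if w . s + y >= 0, otherwise -1."""
    return 1 if dot(w, s) + y >= 0 else -1


if __name__ == "__main__":
    # Toy example: one support vector per class, multipliers chosen so that
    # both points lie exactly on the margin (m_i * (w . s_i + y) = 1).
    support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
    labels = [1, -1]
    alphas = [0.25, 0.25]
    w = weight_vector(alphas, labels, support_vectors)
    y = intercept(w, labels, support_vectors)
    print(w, y, predict(w, y, (2.0, 0.0)), predict(w, y, (-3.0, 1.0)))
```

For this toy data the weight vector is (0.5, 0.5) and the intercept is 0, so the separating hyperplane is the line through the origin perpendicular to (1, 1).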

4. Simulation and Results

MATLAB R2019a is used for simulation. Section 4.1 contains the results of the proposed fuzzy-based model, and Section 4.2 contains the results of the proposed SVM-based model.

4.1. Fuzzy Results

MATLAB R2019a is used for the simulation, algorithm development, and prototyping. The results are obtained using a total of 12 input variables and 4 output variables for fuzzy logic. When layer 1 finds symptoms in a person, the system moves to the second layer, in which mammography and ultrasound are used for the initial examination to confirm whether something is wrong. The proposed BCP-T1F system not only diagnoses the disease but also indicates its different levels. Layer 3 uses the biopsy gold test to determine the type and severity of the lump and whether it is cancerous or not. The fourth layer involves PET, MRI, and CT, which in turn give the size of the tumor and the extent to which it is cancerous. Figure 6 shows the accuracy of the proposed system. The proposed BCP-T1F system achieves an accuracy of 96.56 percent with a miss rate of 3.44 percent, and it provides accurate results for the corresponding type and severity level.

4.2. SVM Results

MATLAB R2019a is used to simulate the prediction of breast cancer. Tables 6 and 7 summarize the training and validation results in terms of accuracy and miss rate. The SVM algorithm is applied to a dataset [15] of 569 records, which is divided into 70% (399 samples) for training and 30% (170 samples) for validation. Performance is compared using several statistical measures, namely sensitivity, specificity, and accuracy, where sensitivity expresses the true-positive rate and specificity expresses the true-negative rate. With TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, these measures are derived by the following formulas:

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Miss rate} = 1 - \text{Accuracy}.$$

The proposed BCP-SVM system model produces a predicted output of negative (−1) or positive (1). An output of negative (−1) indicates that the lump is benign, and an output of positive (1) indicates the existence of malignancy.

Table 6 shows the predictions of the proposed BCP-SVM system model during the training phase. A total of 399 samples are used during training, of which 250 are positive and 149 are negative. Of the 250 positive samples (no breast cancer, benign), 248 are correctly predicted as positive, whereas 2 records are incorrectly predicted as negative, meaning that breast cancer (malignancy) is reported although it does not exist. Of the 149 negative samples (breast cancer found), 144 are correctly predicted as negative, and 5 are incorrectly predicted as positive, meaning that no breast cancer is reported although breast cancer actually exists.

Table 7 shows the predictions of the proposed BCP-SVM system model during the validation phase. A total of 170 samples are used during validation, of which 107 are positive and 63 are negative. Of the 107 positive samples (no breast cancer), 106 are correctly predicted as positive, whereas 1 record is incorrectly predicted as negative, meaning that breast cancer is reported although it does not exist. Of the 63 negative samples (breast cancer found), 59 are correctly predicted as negative, and 4 are incorrectly predicted as positive, meaning that no breast cancer is reported although breast cancer actually exists.

Table 8 shows the performance of the proposed BCP-T1F-SVM system model in terms of sensitivity, specificity, accuracy, and miss rate during the training and testing phases. During training, the proposed BCP-T1F-SVM system gives 98.63%, 98.02%, 98.25%, and 1.75% sensitivity, specificity, accuracy, and miss rate, respectively. During testing, it gives 98.33%, 96.36%, 97.06%, and 2.94% sensitivity, specificity, accuracy, and miss rate, respectively. In addition, the false-positive rate, false-negative rate, likelihood ratio negative, likelihood ratio positive, positive prediction value, and negative prediction value are 1.98%, 1.37%, 0.0139, 49.81, 96.64%, and 99.2% during training and 3.64%, 1.67%, 0.0173, 27.01, 93.65%, and 99.06% during testing, respectively.
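The measures reported in Table 8 can be recomputed from four confusion-matrix counts, as in the Python sketch below. The function name and the mapping of the counts are assumptions: treating the malignant class as the clinical positive with TP = 144, FN = 2, TN = 248, FP = 5 for training (and 59, 1, 106, 4 for testing) is the reading of the reported counts that reproduces the reported sensitivity, specificity, accuracy, and prediction values. The likelihood ratios computed from unrounded rates come out slightly different from the reported figures (for example, about 49.91 instead of 49.81 for training), which is consistent with the reported values having been derived from rounded percentages.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard diagnostic measures from confusion-matrix counts (as fractions)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "miss_rate": 1 - accuracy,
        "false_positive_rate": 1 - specificity,
        "false_negative_rate": 1 - sensitivity,
        # The ratios below require fp > 0 and tn > 0, respectively.
        "likelihood_ratio_positive": sensitivity / (1 - specificity),
        "likelihood_ratio_negative": (1 - sensitivity) / specificity,
        "positive_prediction_value": tp / (tp + fp),
        "negative_prediction_value": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Count mapping assumed as described in the lead-in.
    training = diagnostic_metrics(tp=144, fn=2, tn=248, fp=5)
    testing = diagnostic_metrics(tp=59, fn=1, tn=106, fp=4)
    for name, value in training.items():
        print(f"training {name}: {value:.4f}")
    for name, value in testing.items():
        print(f"testing {name}: {value:.4f}")
```

Keeping the values as fractions and rounding only for display avoids the small discrepancies that appear when ratios are formed from already rounded percentages.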

Table 9 and Figure 7 compare the performance of the proposed BCP-T1F-SVM system model, which uses fuzzy logic and SVM, with previous approaches given in the literature: BCP-ANN [8], ANN-ELM [8], and ANN [11, 17].

5. Conclusion

MATLAB R2019a is used for the simulation, algorithm development, and prototyping. The results are obtained using 12 input variables and 4 output variables for fuzzy logic and 30 input variables and 1 output variable for SVM. The main goal is to analyze critically the different dimensions of breast cancer or any type of cancerous disease. The proposed BCP-T1F-SVM is an expert system for diagnosing breast cancer and its stages. The reports on which the expert system analysis is based were provided by Cavan General Hospital, Lisdaran, Cavan, Ireland. The expert system can be used by medical specialists and nonspecialists alike. In this article, the proposed BCP-T1F expert system achieves an accuracy of 96.56 percent with a miss rate of 3.44 percent, and the proposed BCP-SVM achieves an accuracy of 97.06 percent with a miss rate of 2.94 percent. Both the proposed BCP-T1F and BCP-SVM systems are more accurate than previously published approaches.

Data Availability

The simulation data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.