Research Article | Open Access
Yuntao Wei, Xiaojuan Wang, Meishan Li, "Intelligent Medical Auxiliary Diagnosis Algorithm Based on Improved Decision Tree", Journal of Electrical and Computer Engineering, vol. 2020, Article ID 1473736, 9 pages, 2020. https://doi.org/10.1155/2020/1473736
Intelligent Medical Auxiliary Diagnosis Algorithm Based on Improved Decision Tree
To address the limited capability of intelligent medical auxiliary diagnosis (IMAD), an IMAD algorithm based on an improved decision tree is proposed. First, the constraint parameter model of IMAD is constructed. Second, the independent and dependent variables of auxiliary diagnosis are constructed from the physiological indexes of IMAD, quantitative regression analysis is carried out, the data analysis model of IMAD is constructed, and the IMAD data are adaptively classified and recognized. Finally, the attribute feature quantities of IMAD that reflect pathological characteristics are extracted, and the improved decision tree model is used to support the optimal diagnostic decision and to classify and recognize pathological characteristics effectively. The results show that this method has better decision-making ability and classification performance for IMAD, which improves the intelligence and accuracy of intelligent medical auxiliary diagnosis.
1. Introduction
At present, medical diagnosis relies mostly on information obtained from diagnostic equipment and instruments, combined with the medical knowledge and accumulated experience of physicians. With the gradual introduction of medical auxiliary equipment, the use of intelligent medical aids has become an important means of pathological diagnosis. Traditional intelligent medical auxiliary diagnosis usually extracts the knowledge and experience of medical experts and establishes rules on that basis to assist doctors in diagnosis. However, because diagnosis is complex, medical data redundancy and uncertainty are prone to occur in the process. Therefore, it is necessary to use computer and information technology to develop a new IMAD system. Patient pathological monitoring data, patient demographic data, physiological characteristics data, environmental data, disease data, medical record data, mortality rates, and similar sources are taken as medical auxiliary diagnosis data to optimize IMAD and to improve the timeliness, accuracy, and completeness of medical information. Research on related intelligent medical auxiliary diagnostic methods has received great attention.
The algorithm design of IMAD is based on analyzing the attribute characteristics of intelligent medical auxiliary diagnosis, for which an irregular spatial clustering model of the IMAD distribution is constructed. The feature space recombination method is also used for IMAD data feature analysis and structure reorganization. Traditional IMAD algorithms mainly use fuzzy medical decision-making methods, PID methods, statistical feature analysis methods, and C-means clustering methods to construct a statistical analysis model of intelligent medical auxiliary diagnosis. Big data mining and adaptive feature extraction are then performed based on the attribute distribution of the IMAD data.
Chen et al. [4] studied computer-aided cancer diagnosis based on medical images and found that deep learning gave better results in tumor segmentation and classification than traditional shallow learning methods. Xu et al. [5] studied the diagnosis of lung cancer in the elderly based on high-definition optical intelligent electronic staining endoscopy. A total of 91 elderly people with suspected lung cancer were selected as subjects, and intelligent electronic staining optical images were collected to analyze the sensitivity and specificity of lung cancer diagnosis. The method was found to be more effective in detecting lung cancer in the elderly. Zhang et al. [6] proposed a diagnosis and prediction model for hepatocellular carcinoma (HCC) based on gene expression programming (GEP). Clinical data show that six serum markers, including γ-glutamyltransferase, alpha-fetoprotein, carbohydrate antigen 153, and carbohydrate antigen 199, are related to the characteristics of HCC. The study predicts HCC by establishing the best joint model with GEP. Compared with support vector machines, artificial neural networks, and multilayer perceptrons, GEP shows better results, which suggests that the GEP model is a promising diagnostic tool for HCC that can be used in the auxiliary diagnosis of liver cancer. Qu [7] presented the FLIM method, which can be used to measure intracellular viscosity, analyze cell differentiation and apoptosis, monitor the dynamic changes of macromolecules in the nucleus, and help diagnose pathological sections. Xia et al. [8] proposed an improved method for storing medical records as a knowledge map in order to improve the efficiency of querying medical records and to assist the medical diagnosis of cases. According to the multivariate relationships in the original case data, the attributes of each case are converted to IDs, and an entity type table is designed to improve the query accuracy of the medical records. Experiments show that the designed storage method has higher query efficiency.
To address the abovementioned problems, this paper proposes an intelligent medical-aided diagnosis algorithm based on an improved decision tree. First, a constraint parameter model for IMAD is constructed, and the independent and dependent variables of auxiliary medical diagnosis are constructed from the physiological indicators of IMAD. Next, an IMAD data analysis model is constructed, the IMAD data are adaptively classified and recognized, and the attribute feature quantities that reflect pathological characteristics are extracted. The improved decision tree model is then used to realize the optimal decision making of IMAD and to classify and identify pathological characteristics effectively. Finally, simulation experiments are performed to demonstrate the performance of this method in improving the ability of intelligent medical auxiliary diagnosis.
2. Design of the Basic Decision Tree Model for Intelligent Medical Auxiliary Diagnosis
2.1. Node Distribution of Intelligent Medical Auxiliary Diagnosis
To realize IMAD based on the improved decision tree, sensors are first arranged, and the sensor nodes collect the patient’s pathological data and other auxiliary data. The collected data are summarized and preprocessed to ensure that they are valid and are then uploaded to the data service center, which completes the monitoring for medical auxiliary diagnosis. After the auxiliary data are obtained, the distribution weights of the IMAD constraint parameters are analyzed, and the IMAD data are adaptively classified. Based on the classification results, the model structure of the basic decision tree is designed.
The adaptive weights of the constrained parameter distribution for intelligent medical-aided diagnosis are defined in terms of the relevant sequence of the physiological indicators of the diagnostic information, which in turn depends on the gray spatial distribution weight of IMAD. According to the relevant sequence of physiological indicators used in medical diagnosis, the optimal detection of the IMAD data is realized. The detection statistics are computed from the sequences related to the physiological indicators used in medical diagnosis, the composition data of those sequences, and the constraint parameters j for medical-assisted diagnosis. A limited data set model for the IMAD data distribution is then constructed.
The associated features of IMAD are expressed in terms of three quantities describing the distribution of the medical auxiliary diagnostic data, the auxiliary diagnostic parameter values, and the associated feature parameter. The IMAD data are scheduled in real time in a decentralized subspace, and a statistical distribution sequence feature matrix of the IMAD data is constructed.
According to the abovementioned analysis, a decision tree model is used to construct a spatial clustering model of the IMAD data [11, 12]. The spatial grid clustering analysis method and the decision tree model are used to optimize the classification of the IMAD data. The node distribution set of the decision tree fusion is obtained from five constraint parameter matching sets for IMAD. In this way, a constraint parameter model for IMAD is constructed, and regression analysis is performed according to the constraint parameters to improve the medical auxiliary diagnosis ability.
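As an illustration of the regression analysis mentioned above, the following is a minimal sketch of an ordinary least-squares line fit between one independent variable and one dependent variable. It is a generic textbook technique, not the authors' model; the function name `fit_line` and the sample values are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of the independent variable.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("independent variable is constant")
    # Sum of cross deviations between independent and dependent variable.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: a physiological index (x) against a diagnostic score (y).
index_values = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(index_values, scores)
```

With several physiological indexes, the same idea extends to multiple regression, where each index receives its own coefficient.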
2.2. Design of the Basic Decision Tree Model
A real-time statistical analysis model of the IMAD data is constructed. The IMAD data are assumed to contain a finite number of samples. Quantitative regression analysis is performed on the feature quantities of the association rules of intelligent medical diagnosis, based on the correlation feature detection result of IMAD.
The feature quantity of the association rule spectrum of IMAD is extracted, and the information is fused to obtain the distribution of the big data association rules of IMAD. This distribution depends on the mutual information of the IMAD data, the cross-detection statistics of IMAD, the fuzzy tomographic feature distribution set, the ambiguity functions for feature weight detection of the IMAD data, and the data fusion coefficient.
The IMAD method is used to analyze the pathological characteristics, the decision tree scheduling model is constructed, and the optimal classification in the pathological diagnosis process is performed [13, 14]. The associated main feature detection of the feature weights of the IMAD data is output as a function of the gray spatial distribution weight of IMAD, the associated principal feature detection for the feature weights of the diagnostic data, and the number of samples in the IMAD data. In the fuzzy hierarchical distribution structure model, the energy spectral density of the large IMAD data set at each node is obtained. A principal component analysis method is used to construct a regression analysis model for the feature weights of the IMAD data. Linear structure reorganization is performed according to the feature distribution of the IMAD data [15, 16]. The IMAD data to be classified are classified according to 5-tuples, and the statistical distribution probability density characteristics of the feature weight classification are obtained, with an update period between successive classifications. A fuzzy hierarchical classification method is then used to construct the basic decision tree model for intelligent medical-aided diagnosis from the statistical distribution probability density feature and the implementation cycle difference t. An association rule method is used to construct the IMAD data analysis model, and the basic decision tree model provides the basis for the improved decision tree design.
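To make the idea of a basic decision tree concrete, the sketch below shows how a classical ID3-style tree chooses a split attribute by information gain. This is a standard technique used here for illustration and is an assumption, not the fuzzy hierarchical model of this paper; the attribute names and records are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the label subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the highest information gain; it becomes the split node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical patient records and diagnostic labels.
rows = [
    {"tremor": "yes", "age_group": "old"},
    {"tremor": "yes", "age_group": "young"},
    {"tremor": "no", "age_group": "old"},
    {"tremor": "no", "age_group": "young"},
]
labels = ["positive", "positive", "negative", "negative"]
root = best_attribute(rows, labels, ["tremor", "age_group"])
```

In this toy data the `tremor` attribute separates the two classes perfectly, so it is selected as the root; a full tree would repeat the selection recursively on each subset.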
3. Intelligent Medical-Assisted Diagnosis Optimization
3.1. Improved Decision Tree Model
With the basic decision tree model of IMAD constructed, the IMAD is optimized. This paper proposes an IMAD algorithm based on an improved decision tree. Attribute feature quantities of IMAD that reflect pathological characteristics are extracted, and an improved decision tree model is used to realize the optimal decision of IMAD [17, 18]. Based on formula (13), an improved decision tree model is designed. The model is shown in Figure 1.
According to the abovementioned model, the feature extraction method of a finite set is used to classify and recognize the IMAD data, and an improved decision tree model is constructed to obtain the reliability evaluation function of the decision tree.
A multiqueue decision tree scheduling method is used to establish a distributed training set of decision features for intelligent medical-aided diagnosis, from which the decision tree scheduling feature weights for medical-aided diagnosis are derived.
The kernel function is established in the fuzzy hierarchical distribution node of the decision tree. The weighting coefficient is adjusted to obtain the corresponding geometric neighborhood, and intelligent medical-assisted diagnosis and decision making are then performed.
3.2. Intelligent Medical Auxiliary Diagnosis Decision Output
After the auxiliary data are obtained, the attribute feature quantities of the IMAD data that reflect the pathological features are extracted. The original waveform of the data sequence is shown in Figure 2.
Autocorrelation feature decomposition is used for the regression analysis of the IMAD data sequence, the feature weight decision tree model for intelligent medical diagnosis is constructed, and the phase space reconstruction method is used for fuzzy feature reconstruction. A quadruple represents the statistical distribution of the feature weights of the IMAD data. Its first two elements are the entity sets of feature weights (i.e., the two nodes involved), the third is the interactive statistics of the feature weights, and the fourth is the time delay of the feature weight classification. The rough set feature reconstruction method is used to analyze the pathological characteristics of IMAD, and the quantitative feature set of the decision tree distribution of IMAD is obtained.
Narrow time-domain windows TLX and TLY are created, and the fuzzy feature extraction model for IMAD is built on them.
Let the distribution of the feature weights of the IMAD data be m, and use the decision tree construction method to perform quantitative regression analysis of intelligent medical auxiliary diagnosis. According to the decision tree model of IMAD, the association distribution sets usd and usq of intelligent medical auxiliary diagnosis are constructed, and the statistical characteristic values are obtained from the fuzzy correlation fusion set of the IMAD data and the statistical frequency of the IMAD data.
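The autocorrelation analysis and phase space reconstruction referred to in this section can be sketched as follows. The code uses the standard sample autocorrelation and a time-delay embedding; the function names, the embedding dimension, and the delay are illustrative assumptions and are not values given in the paper.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at a given non-negative lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mu = sum(series) / n
    var = sum((x - mu) ** 2 for x in series)
    if var == 0:
        raise ValueError("series is constant")
    cov = sum((series[i] - mu) * (series[i + lag] - mu) for i in range(n - lag))
    return cov / var


def delay_embed(series, dim, delay):
    """Time-delay embedding: each vector is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    count = len(series) - (dim - 1) * delay
    if count <= 0:
        raise ValueError("series too short for this dimension and delay")
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(count)]


# Hypothetical data sequence standing in for a collected physiological signal.
signal = [0, 1, 2, 3, 4, 5]
vectors = delay_embed(signal, dim=3, delay=2)
```

In practice the delay is often chosen where the autocorrelation first drops below a threshold, which is one reason the two tools are commonly used together.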
3.3. The Proposed Algorithm
Based on the abovementioned analysis, the feature quantities of the IMAD data that reflect the pathological characteristics are extracted, the improved decision tree model is used to obtain the statistics of the intelligent medical pathological characteristics, and medical diagnosis is assisted based on the results. The specific steps are as follows:
(1) Extract the attribute feature data of IMAD that reflect pathological characteristics.
(2) Input: the finite set combination method for the attribute feature quantities of the IMAD data and the improved decision tree model based on fuzzy clustering. Output: the decision tree reliability evaluation function.
(3) Establish a decision tree distribution feature training set for intelligent medical-aided diagnosis.
(4) Input: the distributed feature training set. Output, in the fuzzy hierarchical distribution node of the decision tree: the kernel function for IMAD and decision making.
(5) Input: the analysis of the pathological characteristics of IMAD. Output: the quantified set of decision tree distribution characteristics of IMAD.
(6) Construct the association distribution set of IMAD and obtain the statistical eigenvalues.
(7) Realize the statistics of the intelligent medical pathological characteristics.
(8) End.
4. Experiment and Result Analysis
To verify the performance of the model in realizing IMAD and in improving the ability of IMAD, simulation experiments are conducted and analyzed.
4.1. Experimental Environment and Data Set
The main software environment is built on the model structure of the basic decision tree. The sample length of the IMAD data is 1200, and the number of training data is 100. The IMAD data are collected between an initial frequency and a termination frequency, and the fuzzy coefficient is 1.35. The main feature distribution coefficient is 0.24, and the weight coefficient of the decision tree distribution is 0.25. The data sets selected in this experiment are as follows:
(1) WHO: more than 1000 indicators, including mortality, child nutrition, vaccines, and infectious and non-communicable diseases.
(2) CDC WONDER: public health data on the United States, including environmental data, death data, and demographic data.
(3) MIMIC Critical Care Database: demographic data and physiological characteristics data of patients, with more than 60,000 data records.
The specific experimental data set is described in Table 1.
4.2. Experimental Steps
To verify the proposed algorithm, the experimental steps are designed as follows. Two million data records from the WHO, CDC WONDER, and MIMIC Critical Care Database are selected as test data, and the remaining data are used as sample data to train the model. The model input, algorithm application, and result output are described in Figure 3.
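The division into test data and training data described above can be sketched as a simple holdout split. This is a generic illustration under stated assumptions: the function name `holdout_split`, the seed, and the toy record list are hypothetical, and the paper does not say how its split was drawn.

```python
import random


def holdout_split(records, test_size, seed=0):
    """Shuffle a copy of the records and set aside `test_size` of them for testing."""
    if not 0 < test_size < len(records):
        raise ValueError("test_size must be between 1 and len(records) - 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical records; in the experiment these would be rows from the data sets.
records = list(range(10))
train, test = holdout_split(records, test_size=3)
```

Fixing the seed means every compared method sees the same training and test records, which keeps the comparison fair.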
4.3. Experimental Indicator
4.3.1. Feature Extraction Accuracy
The improved decision tree model is used to realize the statistics of intelligent medical pathological characteristics, and the medical pathological characteristics are extracted according to a formula in which T represents the fuzzy correlation fusion set of the data, t is the statistical frequency, and a further term is the word repetition decision number.
4.3.2. Diagnostic Feature Classification Performance
In auxiliary diagnosis, the diagnostic characteristics should be classified according to the patient’s age of onset, family history, comorbidities, onset time, triggers, specificity, history of allergies, routine tests at the point of onset, and routine blood tests. In the face of complex pathological characteristics, classification performance is very important. The elderly are the main group of patients, and their decline in physical function can be clearly observed as a clinical phenotype. Therefore, the clinical manifestations of Parkinson’s disease, a neurodegenerative disease common among middle-aged and elderly people, are taken as the representative test case.
4.3.3. Fusion of the Pathological Characteristic
To verify how well the classified sample features fuse with the pathological characteristics in IMAD, the fusion of pathological characteristics needs to be tested. The fusion rate is calculated from the pathological fusion features, the sample fusion feature t, and the fusion grade n. The stronger the fusion, the better the decision making of the proposed method.
4.4. Result and Discussion
4.4.1. Comparison of Feature Extraction Accuracy
Taking the collected data as the research object, the feature quantities of the intelligent medical auxiliary diagnosis reflecting pathological characteristics are extracted, and the improved decision tree model is used to realize the optimal decision of the IMAD, and the feature extraction results are obtained.
The feature extraction accuracy of the method in this paper is compared with that of three intelligent medical auxiliary diagnosis methods from the literature, and the comparative test results are shown in Table 2.
According to Table 2, the accuracy of the proposed method is higher than that of the three comparison methods across the tested numbers of iterations. This shows that the algorithm in this paper can improve the ability of IMAD and that intelligent medical auxiliary diagnosis using this method is highly accurate.
4.4.2. Comparison of the Diagnostic Feature Classification Performance
According to Figure 4, Parkinson’s disease has three major clinical phenotypes. Each clinical phenotype is composed of 4 factors, and the age of onset can be divided into five types. The abovementioned patient characteristics are classified, and the results are shown in Figure 5.
According to Figure 5, as the patient’s age increases, the clinical phenotypes also increase. The classification accuracy of the method in this paper is higher than that of the three comparison methods from the literature. One of the comparison methods always has the lowest classification accuracy, while the other two are relatively high, and the classification accuracy tends to be stable at about 80%. This shows that the method in this paper has a good classification effect for most patient characteristics and is suitable for IMAD.
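A comparison of classification accuracy across age-of-onset groups, as discussed above, could be computed with a routine like the following. It is a minimal sketch: the group labels, class labels, and predictions are hypothetical and are not data from this study.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Fraction of correct predictions within each group (e.g., age of onset)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in zip(groups, y_true, y_pred):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical age groups, true phenotypes, and predicted phenotypes.
groups = ["<50", "<50", "50-60", "50-60", "50-60"]
y_true = ["A", "B", "A", "A", "C"]
y_pred = ["A", "A", "A", "A", "C"]
result = accuracy_by_group(groups, y_true, y_pred)
```

Reporting accuracy per group, rather than a single overall figure, shows whether a method degrades for particular age ranges.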
4.4.3. Comparison of the Fusion of Pathological Characteristics
The fusion test of pathological characteristics is shown in Figure 6.
According to Figure 6, feature fusion is divided into three steps: data fusion, feature fusion, and decision-level fusion. The method in this paper is compared with three methods from the literature, and the results are shown in Figure 7.
According to Figure 7, for data fusion, feature fusion, and decision-level fusion, the three-level fusion rates of the first comparison method are 49%, 47%, and 80%, respectively; those of the second are 40%, 47%, and 90%; and those of the third are 49%, 60%, and 75%. The fusion rate of the method in this paper is higher than that of all three comparison methods, which shows that the method has good fusion performance.
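Of the three fusion levels named above, decision-level fusion is the easiest to illustrate in isolation. The sketch below combines the outputs of several classifiers by majority vote; this is one common fusion rule, chosen here as an assumption, since the paper does not state which rule it uses. The labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Most frequent label; ties go to the label that appears first in the input."""
    if not predictions:
        raise ValueError("no predictions to fuse")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def fuse_decisions(per_classifier):
    """Fuse aligned prediction lists (one list per classifier) sample by sample."""
    return [majority_vote(list(votes)) for votes in zip(*per_classifier)]


# Hypothetical outputs of three classifiers on the same three samples.
fused = fuse_decisions([
    ["pd", "healthy", "pd"],
    ["pd", "pd", "healthy"],
    ["healthy", "healthy", "pd"],
])
```

Data-level and feature-level fusion would instead combine raw records or feature vectors before any classifier is applied.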
5. Conclusions
The intelligent medical auxiliary diagnosis method is studied, and an IMAD algorithm based on an improved decision tree is proposed. A constraint parameter model of IMAD is constructed. By analyzing the characteristics of IMAD, the feature quantity of the association rule spectrum of IMAD is extracted to realize the effective extraction of pathological characteristics and to provide the basis for medical auxiliary diagnosis.
The analysis shows that the method performs well: the accuracy of feature extraction is high, the classification accuracy is as high as 90%, the fusion analysis ability is high, and the IMAD ability is improved. However, the method in this paper has not been verified and needs to be studied further. It is therefore necessary to refine the algorithm steps, continue to study big data analysis methods, identify auxiliary diagnosis data attributes not covered in the current algorithm research, and broaden the research perspective in order to contribute to intelligent medical auxiliary diagnosis.
Data Availability
The data used to support the findings of this study are included within the article. Readers can access the data supporting the conclusions of the study from the WHO, CDC WONDER, and MIMIC Critical Care Database.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Heilongjiang Provincial Department of Education Natural Science Research Project (NO. 2016-KYYWF-0560), the surface scientific and research projects of Jiamusi University (NO. L2012-075 and NO. 13Z1201576), and the basic research projects of Jiamusi University (NO. JMSUJCMS2016-009).
References
[1] J. Yang and C.-h. Wei, “Testing serial correlation in partially linear additive models,” Acta Mathematicae Applicatae Sinica, English Series, vol. 35, no. 2, pp. 401–411, 2019.
[2] S. M. Kamali, A. Arbabi, E. Arbabi et al., “Decoupling optical function and geometrical form using conformal flexible dielectric metasurfaces,” Nature Communications, vol. 7, no. 5, Article ID 11618, 2016.
[3] M. Liu and Y. Shi, “Model predictive control for thermostatically controlled appliances providing balancing service,” IEEE Transactions on Control Systems Technology, vol. 24, no. 6, pp. 2082–2093, 2016.
[4] S. Chen, W. Liu, J. Qin et al., “Research progress of cancer computer-aided diagnosis based on deep learning and medical image,” Journal of Biomedical Engineering, no. 2, pp. 160–165, 2017.
[5] J. Xu, C. Wang, H. Jinxiang et al., “Diagnostic value of high-definition optical intelligent electronic staining endoscopy technology for elderly lung cancer,” Chinese Journal of Gerontology, vol. 38, no. 21, pp. 67–69, 2018.
[6] L. Zhang, J. Chen, C. Gao et al., “An efficient model for auxiliary diagnosis of hepatocellular carcinoma based on gene expression programming,” Medical & Biological Engineering & Computing, no. 2, pp. 1–9, 2018.
[7] J. Qu, “Fluorescence lifetime imaging and its applications in cellular microenvironment measurement and auxiliary diagnosis,” in Proceedings of the International Conference on Photonics and Imaging in Biology and Medicine, Suzhou, China, 2017.
[8] Y. Xia, D. Gao, R. Tong et al., “Research on medical record data storage based on knowledge map,” Computer Engineering, vol. 45, no. 1, pp. 9–16, 2019.
[9] Y. Xu, S. Tong, and Y. Li, “Prescribed performance fuzzy adaptive fault-tolerant control of non-linear systems with actuator faults,” IET Control Theory & Applications, vol. 8, no. 6, pp. 420–431, 2014.
[10] B. Sun, J. Wang, H. Chen et al., “Measurement of diversity in integrated learning,” Control and Policy, vol. 29, no. 3, pp. 385–395, 2014.
[11] X. Li, H. Wang, H. He et al., “Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks,” BMC Bioinformatics, vol. 20, no. 1, pp. 62–68, 2019.
[12] H. Hu, B. Tang, X. Gong, W. Wei, and H. Wang, “Intelligent fault diagnosis of the high-speed train with big data based on deep neural networks,” IEEE Transactions on Industrial Informatics, vol. 13, no. 4, pp. 2106–2116, 2017.
[13] T. Han, D. Jiang, X. Zhang, and Y. Sun, “Intelligent diagnosis method for rotating machinery using dictionary learning and singular value decomposition,” Sensors, vol. 17, no. 4, pp. 689–694, 2017.
[14] M. A. Zhonghai, S. Wang, J. Shi et al., “Fault diagnosis of an intelligent hydraulic pump based on a nonlinear unknown input observer,” Chinese Journal of Aeronautics, vol. 31, no. 2, pp. 385–394, 2018.
[15] G. Q. Jiang, P. Xie, X. Wang et al., “Intelligent fault diagnosis of rotary machinery based on unsupervised multiscale representation learning,” Chinese Journal of Mechanical Engineering, vol. 30, no. 6, pp. 1–11, 2017.
[16] S. Bertlein, G. Brown, K. S. Lim et al., “Thiol-ene clickable gelatin: a platform bioink for multiple 3D biofabrication technologies,” Advanced Materials, vol. 29, no. 44, pp. 170–174, 2017.
[17] H. Leonards, S. Engelhardt, A. Hoffmann et al., “Advantages and drawbacks of thiol-ene based resins for 3D-printing,” in Proceedings of the Laser 3D Manufacturing II, vol. 9353, pp. 353–360, 2015.
[18] S. Fedala, D. Rémond, R. Zegadi, and A. Felkaoui, “Contribution of angular measurements to intelligent gear faults diagnosis,” Journal of Intelligent Manufacturing, vol. 29, no. 5, pp. 1115–1131, 2018.
[19] C. E. Pulmano and M. R. J. E. Estuar, “A multi-model approach in developing an intelligent assistant for diagnosis recommendation in clinical health systems,” Procedia Computer Science, vol. 121, pp. 534–541, 2017.
[20] S. B. Zhou and W. X. Xu, “A novel clustering algorithm based on relative density and decision graph,” Control and Decision, vol. 33, no. 11, pp. 1921–1930, 2018.
Copyright © 2020 Yuntao Wei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.