#### Abstract

This study evaluated the efficacy of statins in the treatment of coronary artery plaque using deep learning-based dual-source spiral computed tomography (DSCT) images. A convolutional neural network (CNN) algorithm was used as the basis. The model was then improved: the Res-Net network was applied to reconstruct the computed tomography (CT) images, and the deep learning network model Mask R-CNN was constructed to enhance image reconstruction. Eighty patients with coronary artery disease who were treated in hospital were selected as the research objects and divided into a control group (*n* = 40) and an observation group (*n* = 40). There were 21 male patients and 19 female patients in the control group, with an average age of 52 ± 3.2 years; there were 24 male patients and 16 female patients in the observation group, with an average age of 51 ± 2.4 years. The images of the observation group were reconstructed with the constructed model, and patients in the control group received traditional CT. The interval between the two examinations was 6–12 months, with an average interval of 8 ± 1.78 months. During the interval, all patients received conservative treatment based mainly on atorvastatin. The general data of the two groups showed no statistically significant difference (*P* > 0.05) and were therefore comparable. A network model was constructed to measure the coronary plaque and vascular volume of the patients, and the images were reconstructed on the Res-Net network. The loss value of the Res-Net network stabilized at its lowest level, around 0.02, and converged quickly during training. After statin treatment, the vascular volume and coronary plaque volume of the patients decreased significantly (*P* < 0.05). The network model took an average of 1.20 seconds per measurement, whereas doctors A, B, and C took an average of 186 seconds, 158 seconds, and 142 seconds per image, respectively. The network model therefore markedly improved the speed of CT image diagnosis and treatment. In conclusion, the Res-Net network model proposed in this study was feasible and effective for DSCT image segmentation and could effectively improve the clinical evaluation of CT images from patients with coronary artery disease (CAD), which has important reference value for the development of intelligent medical equipment. It could provide a new diagnostic method for the clinical prediction and diagnosis of CAD.

#### 1. Introduction

The fatality rate of coronary atherosclerosis is rising and affects human health worldwide, with about 19 million people experiencing sudden heart problems every year [1]. Rupture of coronary atherosclerotic plaque is an initiating factor of acute coronary syndrome with stable angina pectoris, so identifying indicators of plaque properties is vital for disease prevention [2, 3]. Statins, which are commonly used in clinical practice, can reverse plaque formation by reducing the lipid core area of plaque, decreasing the uptake rate of oxidized low-density lipoprotein, improving cardiac endothelial cells, reducing the number of macrophages in plaque, and weakening the chemotactic effect of monocytes [4]. With few side effects and high safety, statins can effectively treat cardiovascular diseases. Long-term intensive treatment can effectively improve the biological characteristics of plaques and reverse unstable plaques, which is of great significance for improving the prognosis of patients [5]. At present, coronary CT imaging is an important means of screening for coronary heart disease in clinical practice. CT coronary angiography has shown good results in the clinical diagnosis of coronary atherosclerosis: it can reveal the main coronary arteries and the main atherosclerotic plaques and is also of great value for assessing plaque stability. The main methods for assessing atherosclerosis include biomarker detection, ultrasonic evaluation, and magnetic resonance imaging (MRI). Biomarker detection has no significant predictive power. Ultrasonic evaluation requires high image quality. The safety and sensitivity of MRI are still at an early stage of investigation, and further study is required. DSCT is effective, safe, and reliable and is known as the “gold standard”. It can quickly acquire coronary artery images synchronized with the electrocardiogram, is very safe, and can accurately assess the degree of stenosis.

Dual-source CT (DSCT) is a new device based on mature 64-slice CT technology and represents a breakthrough in temporal resolution [6, 7]. The temporal resolution of the system is 83 milliseconds, which is below the 0.1 seconds required for cardiac imaging. Because the scan is faster than a single heartbeat, CT has become the main method for the noninvasive diagnosis of coronary heart disease. DSCT is characterized by thin slice thickness, small pitch, and narrow collimation, and scanning with exposure over the whole cardiac cycle delivers a large radiation dose to the blood vessels of children. Prospective electrocardiograph (ECG) gating can effectively reduce the radiation dose by exposing only during the set R-R interval. The CNN algorithm has several advantages in CT image processing. First, CNN can improve the generalization of the network when it outputs processed information. Second, weight sharing reduces the number of weight parameters in the network model. Third, pooling in the sampling stage effectively reduces the data volume and computational complexity passed on from the preceding hidden layer and enlarges the local receptive field, so that more structural information can be retained.

Deep learning networks are often applied in medicine to learn from original images and are extensively used in image segmentation, image classification, and target localization [8, 9]. Some scholars have proposed that combining pixel information at different scales can extract the best size information [10, 11]; others believe that reducing the size of the convolution kernel can improve the running speed of the neural network [12, 13]. Building on this research, the convolutional neural network (CNN) was further optimized in this study: the model was improved, the Res-Net network was adopted to reconstruct the CT images, and a deep learning network model was constructed to enhance image reconstruction. The aim was to provide a new diagnostic method for the clinical prediction and diagnosis of the occurrence and development of coronary artery disease.

#### 2. Materials and Methods

##### 2.1. Research Objects

In this study, 80 patients with coronary heart disease who had undergone two multislice spiral CT (MSCT) coronary artery imaging examinations and were treated in hospital from December 2017 to December 2020 were selected as the research objects. This study was approved by the Medical Ethics Committee of the hospital, and the patients and their family members understood the study and signed informed consent forms.

Patients were included if they were suspected of having coronary heart disease, with symptoms such as chest pain, palpitation, and chest tightness; had no severe cardiopulmonary insufficiency; did not suffer from hyperthyroidism; could communicate normally without language barriers; and successfully completed the DSCT examinations.

Patients were excluded if they had vascular dementia, Lewy body dementia, or other mental illnesses; had undergone coronary artery stenting or bypass grafting; had major functional insufficiency; or had diabetes and needed to stop biguanide drugs for 48 hours before the examination.

All the research objects were randomly divided into the control group (*n* = 40) and the observation group (*n* = 40). In the control group, there were 21 male patients and 19 female patients, with an average age of 52 ± 3.2 years; in the observation group, there were 24 male patients and 16 female patients, with an average age of 51 ± 2.4 years. The interval between the two examinations of each patient was 6–12 months. During this interval, all patients received conservative treatment based on atorvastatin, taking 20 mg 30 hours after dinner once a day. The general data of the two groups showed no statistically significant difference (*P* > 0.05) and were therefore comparable. The images of the observation group were analyzed with the constructed model, and patients in the control group were assessed with traditional CT images. In addition, the Boston ultrasound diagnostic instrument and the ultrasound imaging system were used for detection, and the vascular and plaque volumes were recorded.

##### 2.2. CT Examination

Before the scan, the examination procedure was described in detail to each patient, and the patient was placed in a supine position. A second-generation Siemens scanner was used. The scanning parameters and image acquisition mode were set, the tube current was adjusted according to body weight, the pitch was set automatically by the machine, and the periodic exposure was fully automatic. The tube voltage was 120 or 140 kV for patients with body mass >90 kg, 100 or 120 kV for patients with 60 kg < body mass ≤90 kg, and 80 or 100 kV for patients with body mass ≤60 kg. The tube current was 165 mA or 140 mA, the rotation time was 0.28 seconds per rotation, the collimation was 64 × 0.6 mm, the coverage per rotation was 38.4 mm, and the temporal resolution was 75 ms. Retrospective ECG-gated spiral scanning was used, with a slice thickness of 0.6 mm and a slice interval of 0.75 mm. A MEDRED double-barrelled high-pressure syringe was used for intravenous injection through the lower extremities at a flow rate of 0.12 mL/s/kg for 20 seconds, and normal saline was injected at the same flow rate for 10 seconds. The scanning range was located accurately, and the CT images with the best time phase were selected and sent to the workstation for image reconstruction. All the images were evaluated by three chief physicians with more than 5 years of experience in the CT room using the double-blind method. In the case of disagreement, the reconstructed images were reevaluated until consensus was reached.
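The weight-based scanning rules above can be written as a small lookup. This is a minimal sketch rather than the scanner's own logic: the function names are hypothetical, the choice between the two voltages in each band is left to the operator as in the text, and the injected volume is simply flow rate × body mass × duration.

```python
def tube_voltage_options_kv(body_mass_kg):
    """Return the two candidate tube voltages (kV) for a given body mass."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    if body_mass_kg > 90:
        return (120, 140)
    if body_mass_kg > 60:
        return (100, 120)
    return (80, 100)


def injected_volume_ml(body_mass_kg, rate_ml_per_s_per_kg=0.12, duration_s=20.0):
    """Volume delivered at a weight-scaled flow rate over the injection time."""
    return rate_ml_per_s_per_kg * body_mass_kg * duration_s


if __name__ == "__main__":
    for mass in (55, 75, 95):
        print(mass, tube_voltage_options_kv(mass), injected_volume_ml(mass))
```

The band edges follow the text exactly: 90 kg falls in the middle band and 60 kg in the lowest band.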

##### 2.3. Physical Basis of CT Image

In X-ray CT imaging, the projection data collected by the detector undergo a series of transformations to form a tomographic image of the scanned object. The projection data can be collected by rotating the scanned object while the X-ray source and detector are kept still, or by holding the scanned object stationary and rotating the detector and X-ray source. In Figure 1, the initial radiation intensity of the X-rays under CT scanning is *R*_{0}, *F* represents the attenuation coefficient of the scanned object, and *R* is the radiation intensity of the X-rays collected by the detector. In this figure, the detector and X-ray source do not move, and the scanned object rotates around the center.

*R*(*x*) and *F*(*x*) stand for the radiation intensity and attenuation coefficient at position *x*, respectively. After the X-ray travels a length *ΔS* through the scanned object, the radiation intensity *R*(*x* + *ΔS*) satisfies the following equation:

$$R(x + \Delta S) = R(x) - F(x)\,R(x)\,\Delta S$$

When *ΔS* approaches 0,

$$\frac{dR(x)}{dx} = -F(x)\,R(x), \qquad R = R_0 \exp\left(-\int F(x)\,dx\right)$$

The projection data collected by the detector are denoted by *Wb*:

$$Wb = -\ln\frac{R}{R_0} = \int F(x)\,dx$$

During X-ray CT image reconstruction, the projection data are collected at different rotation angles, and the attenuation coefficient of the internal structure of the scanned object is determined from these data.
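The attenuation relation above can be checked numerically. The sketch below steps the intensity through thin layers of length *ΔS* and compares the log-ratio of intensities with the line integral of the attenuation coefficient. The function names and the numbers (a uniform coefficient of 0.2 per cm over 10 cm) are illustrative assumptions, not values from this study.

```python
import math


def attenuate(r0, attenuation, step):
    """Apply R(x + dS) = R(x) - F(x) * R(x) * dS layer by layer."""
    r = r0
    for f in attenuation:
        r = r - f * r * step
    return r


def projection(r0, r):
    """Projection value: negative log of the transmitted fraction."""
    return -math.log(r / r0)


r0 = 1000.0                  # initial intensity (arbitrary units)
attenuation = [0.2] * 1000   # F(x) per cm, uniform object (assumed)
step = 0.01                  # dS in cm, so the path length is 10 cm

r = attenuate(r0, attenuation, step)
wb = projection(r0, r)
line_integral = sum(f * step for f in attenuation)

if __name__ == "__main__":
    print(r, wb, line_integral)
```

As the step shrinks, the projection value approaches the line integral, which is why the detector reading can be treated as a sum of attenuation coefficients along the ray.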

##### 2.4. Neurons

An artificial neural network is composed of many interconnected neuron nodes. Each neuron node has an activation function, and the output value of each node is determined by its weighted input and the activation function. Each input is multiplied by a weighting factor, called the weight, before the node produces its output. The model of the neuron is shown in Figure 2. *A*_{1}, *A*_{2}, *A*_{3}, and *A*_{n} represent the inputs of the neuron, *T*_{1}, *T*_{2}, *T*_{3}, and *T*_{n} indicate the weights of the neuron, *F*(*x*) is the activation function, *out* is the final output signal of the neuron, *ω* is the bias signal, and *q* is the threshold value; the bias signal takes effect when the set signal level is input.

After the weighted summation of the neuron inputs, the final output is obtained through the following equation:

$$out = F\left(\sum_{i=1}^{n} T_i A_i + \omega\right)$$

*A*_{1}, *A*_{2}, *A*_{3}, and *A*_{n} stand for the input values in different dimensions, *T*_{1}, *T*_{2}, *T*_{3}, and *T*_{n} represent the weight values in different dimensions, and *T*_{A} expresses the weighted input of the neuron (Figure 3). In network training, the activation function gives the network a strong ability to learn and express, which is needed because some problems are not linearly separable. When the data are mapped, the interval is 0–1. The ReLU function is the activation function in the CNN, and it can be expressed as

$$F(x) = \max(0, x)$$
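The neuron described above (weighted sum of inputs, plus bias, passed through ReLU) can be sketched in a few lines. The function names and numbers are illustrative assumptions, and the threshold *q* is not modeled here.

```python
def relu(x):
    """ReLU activation: pass positive values, clamp negatives to zero."""
    return max(0.0, x)


def neuron_output(inputs, weights, bias, activation=relu):
    """Weighted sum of the inputs plus the bias, then the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(t * a for a, t in zip(inputs, weights))
    return activation(weighted_sum + bias)


if __name__ == "__main__":
    # 0.5 - 2.0 + 0.75 + 0.5 = -0.25, which ReLU clamps to 0.0
    print(neuron_output([1.0, 2.0, 3.0], [0.5, -1.0, 0.25], 0.5))
    # 1.0 + 2.0 + 0.5 = 3.5, which ReLU passes through unchanged
    print(neuron_output([1.0, 2.0], [1.0, 1.0], 0.5))
```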

##### 2.5. Reconstruction Algorithm for Deep Learning

There are many algorithms for CT image reconstruction using deep learning. The filtered back projection (FBP) algorithm can reconstruct CT images with better quality. It includes two steps: first, the projection data collected by the detector are filtered; second, the filtered projection data are back-projected. The FBP algorithm commonly used in commercial CT can be expressed by the following equation:

As shown in Figure 4, the FBP ConvNet algorithm mainly processes sampled projection data: the image reconstructed by the FBP algorithm is input into the trained deep learning network to obtain a better-quality image that is closer to the true value. The combination of the FBP algorithm and CNN with residual learning not only reduces limited-angle artifacts but also improves the image resolution.
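The two FBP steps (filter, then back-project) can be illustrated with a small NumPy sketch. It assumes an idealized parallel-beam geometry and an analytic disk phantom, both chosen here for illustration; it is not the scanner's implementation or the network input pipeline used in this study, and all function names are hypothetical.

```python
import numpy as np


def disk_sinogram(angles, t, center, radius):
    """Analytic parallel-beam projections of a unit-density disk (assumed phantom)."""
    x0, y0 = center
    # Detector coordinate of the disk centre at each angle.
    shift = x0 * np.cos(angles) + y0 * np.sin(angles)
    half_chord_sq = radius ** 2 - (t[None, :] - shift[:, None]) ** 2
    return 2.0 * np.sqrt(np.clip(half_chord_sq, 0.0, None))


def ramp_filter(sinogram):
    """Step 1: filter every projection with the ramp filter |frequency|."""
    n_det = sinogram.shape[1]
    n_pad = 2 * n_det  # zero padding limits wrap-around from the circular FFT
    ramp = np.abs(np.fft.fftfreq(n_pad))
    spectrum = np.fft.fft(sinogram, n=n_pad, axis=1)
    return np.fft.ifft(spectrum * ramp[None, :], axis=1).real[:, :n_det]


def back_project(filtered, angles, t, grid):
    """Step 2: smear each filtered projection back across the image grid."""
    xs, ys = np.meshgrid(grid, grid)
    image = np.zeros(xs.shape)
    for row, theta in zip(filtered, angles):
        pos = xs * np.cos(theta) + ys * np.sin(theta)
        image += np.interp(pos.ravel(), t, row).reshape(xs.shape)
    return image * np.pi / len(angles)


angles = np.linspace(0.0, np.pi, 180, endpoint=False)
t = np.arange(128) - 64.0        # detector positions, unit spacing
grid = np.arange(-32.0, 33.0)    # image coordinates, unit spacing
sinogram = disk_sinogram(angles, t, center=(10.0, -5.0), radius=8.0)
reconstruction = back_project(ramp_filter(sinogram), angles, t, grid)
```

In this sketch, the reconstructed value is close to the disk density inside the disk and close to zero far from it. In an FBP ConvNet pipeline, an image such as `reconstruction` is what would be passed to the trained network for refinement.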

##### 2.6. Res-Net Network

The network structure of Res-Net is shown in Figure 5. Each layer learns a residual function with reference to its input (*x*), instead of learning an unreferenced function. This residual function is easier to optimize and allows the number of network layers to be increased greatly. The residual block contains two layers, as shown in the following expression:

$$F = W_2\,\delta(W_1 x)$$

where *δ* represents the nonlinear ReLU function and *W*_{1} and *W*_{2} are the weights of the two layers.

Then, the output value *y* is obtained through a shortcut and the second ReLU:

$$y = F(x, \{W_i\}) + x$$

When the input and output dimensions need to change (for example, when the number of channels changes), a linear transformation *W*_{c} can be applied to *x* in the shortcut, as shown in the following equation:

$$y = F(x, \{W_i\}) + W_c x$$

The number of channels can be doubled when an output of a specific dimension is required. Figure 6 shows the structure of two residual blocks, in which the curved arc represents the “shortcut connection”. This also shows that the Res-Net module is not limited to a single form.

The goal in deep learning is a network with better performance, not merely a network that does not degrade. The residual function learned in the Res-Net network is *F* (*x*) = *H* (*x*) – *x*. If *F* (*x*) = 0, the block reduces to the identity mapping mentioned above (*H* (*x*) = *x*). In fact, Res-Net is a special case of “shortcut connections” under the identity mapping, which introduces neither additional parameters nor computational complexity. If the optimal function is closer to an identity mapping than to a zero mapping, then learning the perturbation relative to the identity mapping is easier than learning the mapping function anew.
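A residual block with an identity or projected shortcut can be sketched as follows. For brevity, the sketch uses fully connected layers on a vector instead of convolutions on an image, which is an assumption made here; the function names are hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2, w_c=None):
    """Two-layer residual block: second ReLU applied after adding the shortcut."""
    residual = w2 @ relu(w1 @ x)               # F(x) = W2 * ReLU(W1 x)
    shortcut = x if w_c is None else w_c @ x   # identity or projection shortcut
    return relu(residual + shortcut)


x = np.array([1.0, 2.0, 0.5])

# With zero residual weights, F(x) = 0 and the block is an identity mapping.
identity_out = residual_block(x, np.zeros((3, 3)), np.zeros((3, 3)))

# A projection shortcut changes the output dimension (3 -> 6 here).
projected_out = residual_block(
    x, np.eye(3), np.ones((6, 3)), w_c=np.zeros((6, 3))
)
```

The first call shows why a deeper network built from such blocks should not perform worse than a shallower one: driving the residual weights to zero recovers the input unchanged (for non-negative inputs, since the final ReLU is still applied).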

##### 2.7. Design of Deep Learning Network Model

The segmentation of images using Mask R-CNN in the deep learning network mainly focuses on target classification, target detection, and pixel-level target segmentation.

The framework diagram is shown in Figure 7. Mask R-CNN added a mask prediction branch to Faster R-CNN and replaced ROI Pooling with region of interest (ROI) Align. Because the segmentation branch was added to the Faster R-CNN architecture, the network structure changed, and the final loss function changed accordingly. With the mask branch, the loss function of each ROI was as follows:

$$L = L_{cls} + L_{box} + L_{mask}$$

where *L*_{cls} is the classification loss, *L*_{box} is the bounding box loss, and *L*_{mask} represents the segmentation loss. In Mask R-CNN, the newly added mask branch outputs a *K* × *m* × *m* result for each ROI, that is, one *m* × *m* binary mask for each of the *K* classes. After the predicted mask was obtained, the sigmoid function was applied to each pixel of the mask (the so-called per-pixel sigmoid), and the result was used as one of the inputs of *L*_{mask} (a cross-entropy loss function). It should be noted that only positive-sample ROIs were used to calculate *L*_{mask}. A positive sample was defined as in target detection: an ROI whose IoU with the real box was greater than 0.5.
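The per-ROI loss described above can be sketched with NumPy. The classification and box losses are passed in as plain numbers because their computation is not the point here; the function names, the mask size, and the example values are assumptions for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def mask_loss(mask_logits, target_mask):
    """Per-pixel sigmoid followed by mean binary cross-entropy."""
    p = sigmoid(mask_logits)
    eps = 1e-12  # guards against log(0)
    bce = -(target_mask * np.log(p + eps)
            + (1.0 - target_mask) * np.log(1.0 - p + eps))
    return float(bce.mean())


def roi_loss(l_cls, l_box, mask_logits, target_mask, iou, iou_threshold=0.5):
    """L = L_cls + L_box + L_mask, with L_mask only for positive ROIs."""
    is_positive = iou > iou_threshold
    l_mask = mask_loss(mask_logits, target_mask) if is_positive else 0.0
    return l_cls + l_box + l_mask


logits = np.zeros((28, 28))   # untrained mask output: every pixel at p = 0.5
target = np.zeros((28, 28))
target[8:20, 8:20] = 1.0      # hypothetical ground-truth lesion mask

positive_total = roi_loss(0.2, 0.1, logits, target, iou=0.7)
negative_total = roi_loss(0.2, 0.1, logits, target, iou=0.3)
```

With all logits at zero, every pixel contributes ln 2 to the mask loss, so the positive ROI adds about 0.693 to the total while the negative ROI adds nothing.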

##### 2.8. Model Improvement

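The RPN network used a convolutional network to generate the location of the lesion in the image. During the sliding process, target frames with preset aspect ratios and areas, called anchors, were used. The correction of an anchor consisted of *q*_{x}, *q*_{y}, *q*_{w}, and *q*_{h}: *q*_{x} and *q*_{y} were the translations in the *X* and *Y* directions of the anchor, while *q*_{w} and *q*_{h} were the scaling factors of the width and height. The predicted translation and scaling parameters are given as follows:

$$q_x = \frac{x - x_a}{w_a}, \quad q_y = \frac{y - y_a}{h_a}, \quad q_w = \log\frac{w}{w_a}, \quad q_h = \log\frac{h}{h_a}$$

where (*x*, *y*), *w*, and *h* are the center coordinates, width, and height of the predicted box, and the subscript *a* denotes the anchor.

The true translation and scaling parameters are expressed as follows:

$$q_x^{*} = \frac{x^{*} - x_a}{w_a}, \quad q_y^{*} = \frac{y^{*} - y_a}{h_a}, \quad q_w^{*} = \log\frac{w^{*}}{w_a}, \quad q_h^{*} = \log\frac{h^{*}}{h_a}$$

where the superscript * denotes the real box.

The difference between the predicted parameters and the real parameters is described as follows:

$$L_{reg}(q_i, q_i^{*}) = \sum_{j \in \{x, y, w, h\}} \mathrm{smooth}_{L1}\left(q_{i,j} - q_{i,j}^{*}\right), \qquad \mathrm{smooth}_{L1}(d) = \begin{cases} 0.5\,d^{2}, & |d| < 1 \\ |d| - 0.5, & \text{otherwise} \end{cases}$$

The loss function can be expressed as follows:

$$L(\{p_i\}, \{q_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^{*}) + \lambda \frac{1}{N_{reg}} \sum_i p_i^{*} L_{reg}(q_i, q_i^{*})$$

where *p*_{i} is the predicted probability that anchor *i* contains a target and *p*_{i}^{*} is 1 for a positive anchor and 0 otherwise. *λ* was set to 10, *N*_{cls} was 256, the size of the training batch, and *N*_{reg} was 2400, the number of anchors.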
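The anchor correction and the weighted loss can be sketched in plain Python. The sketch assumes the standard Faster R-CNN parameterization (center offsets normalized by the anchor size, log-scaled width and height, and a smooth L1 regression term), which is an assumption about the form used here; boxes are given as (center x, center y, width, height), and all function names are hypothetical.

```python
import math


def encode_box(box, anchor):
    """Translation and scaling parameters of a box relative to an anchor."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


def smooth_l1(d):
    """Quadratic near zero, linear for large differences."""
    d = abs(d)
    return 0.5 * d * d if d < 1.0 else d - 0.5


def regression_loss(predicted, true):
    """Sum of smooth L1 terms over the four box parameters."""
    return sum(smooth_l1(p - t) for p, t in zip(predicted, true))


def rpn_loss(cls_losses, reg_losses, positive_flags,
             lam=10.0, n_cls=256, n_reg=2400):
    """Normalized classification term plus weighted regression term."""
    cls_term = sum(cls_losses) / n_cls
    reg_term = sum(f * l for f, l in zip(positive_flags, reg_losses)) / n_reg
    return cls_term + lam * reg_term


anchor = (50.0, 50.0, 20.0, 20.0)
true_params = encode_box((60.0, 50.0, 40.0, 20.0), anchor)
pred_params = encode_box((55.0, 50.0, 40.0, 20.0), anchor)
loss_one_anchor = regression_loss(pred_params, true_params)
```

Only anchors flagged as positive contribute to the regression term, which mirrors the role of the positive-sample label in the loss.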

The sigmoid function was applied to each pixel of ROI Align layer.

Each class was allowed to generate an independent mask to avoid interclass competition, which decoupled mask prediction from class prediction.

The Intersection over Union (IoU) between the predicted box and the real box was an important indicator of the difference between them. The larger the value of IoU, the smaller the value of 1 − IoU, indicating a smaller difference between the two. The objective function for clustering by “closer distance” is shown as follows:

$$\min \sum_{N} \sum_{M} \left(1 - \mathrm{IoU}\left(\mathrm{Box}[N], \mathrm{Truth}[M]\right)\right)$$

where *M* represents the sample set of the cluster, *N* represents the category of the cluster, Box [*N*] refers to the width and height of the predicted box obtained by the cluster, and Truth [*M*] represents the width and height of the real box.
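The 1 − IoU distance used for clustering box shapes can be sketched as follows. The sketch assumes boxes are compared by width and height only, aligned at a shared corner, and that each real box is assigned to its nearest cluster box; the function names and example sizes are hypothetical.

```python
def iou_wh(box, truth):
    """IoU of two boxes given as (width, height), aligned at one corner."""
    inter = min(box[0], truth[0]) * min(box[1], truth[1])
    union = box[0] * box[1] + truth[0] * truth[1] - inter
    return inter / union


def clustering_objective(cluster_boxes, truth_boxes):
    """Sum of 1 - IoU between each real box and its nearest cluster box."""
    total = 0.0
    for truth in truth_boxes:
        total += min(1.0 - iou_wh(box, truth) for box in cluster_boxes)
    return total


clusters = [(2.0, 2.0), (4.0, 4.0)]
truths = [(2.0, 2.0), (4.0, 4.0), (4.0, 2.0)]
objective = clustering_objective(clusters, truths)
```

Here the first two real boxes match a cluster box exactly and contribute nothing, while the 4 × 2 box has an IoU of 0.5 with either cluster box and contributes 0.5.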

##### 2.9. Statistical Methods

The data in this study were analyzed with SPSS 19.0 statistical software. Measurement data conforming to the normal distribution were expressed as mean ± standard deviation (*x̄* ± *s*), and count data were expressed as frequency (%). The *t*-test was adopted to compare measurement data, and the chi-square test was used to compare count data. Differences with *P* < 0.05 were considered statistically significant.
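As an illustration of the chi-square comparison of count data, the sketch below applies a Pearson chi-square test to the sex distribution of the two groups (21 male and 19 female versus 24 male and 16 female). It is a hand-written calculation without continuity correction, not SPSS output, and the function name is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and P value for a 2 x 2 table (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = chi_square_2x2(((21, 19), (24, 16)))

if __name__ == "__main__":
    print(round(chi2, 3), round(p_value, 3))
```

The statistic is about 0.457 and the P value is well above 0.05, which is consistent with the two groups being comparable in sex distribution.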

#### 3. Results

##### 3.1. Training and Verification of the Mask R-CNN Model

The dataset was used to train and analyze the network model, and the results are shown in Figure 8. The other parts of the network were frozen first, and the network was trained for 60 epochs; the initial learning rate was then divided by 10, the network layers were unfrozen, and training continued for 100 epochs. The ratio of the training dataset to the verification dataset was 4 : 1, the images kept their original size, the initial learning rate was 0.002, and the weight decay was 0.0001.
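The two-stage schedule and the 4 : 1 split can be written down as a small configuration sketch. The function names are hypothetical and the dataset size in the example is a placeholder, since the number of training images is not stated.

```python
def learning_rate(epoch, base_lr=0.002, stage_one_epochs=60, stage_two_epochs=100):
    """Stage one uses the initial rate; stage two uses one tenth of it."""
    if epoch < 0 or epoch >= stage_one_epochs + stage_two_epochs:
        raise ValueError("epoch outside the training schedule")
    return base_lr if epoch < stage_one_epochs else base_lr / 10.0


def split_dataset(items, train_parts=4, val_parts=1):
    """Split a list into training and verification subsets by ratio."""
    n_train = len(items) * train_parts // (train_parts + val_parts)
    return items[:n_train], items[n_train:]


WEIGHT_DECAY = 0.0001

train_items, val_items = split_dataset(list(range(100)))  # placeholder size

if __name__ == "__main__":
    print(learning_rate(0), learning_rate(60), len(train_items), len(val_items))
```

In practice the items would be shuffled before splitting; that step is omitted here to keep the example deterministic.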

##### 3.2. Coronary Artery Plaque Images

Quantification of coronary plaque supports better cardiovascular risk stratification. Panels (a)–(d) of Figure 9 are coronary CT angiograms. The purple box in Figure 9(a) shows a noncalcified coronary artery plaque. Figure 9(b) shows severe stenosis of approximately 70%–80%. Figure 9(c) shows moderate stenosis of approximately 50%–60%. Figure 9(d) shows severe stenosis of approximately 70%–80%. The coronary artery plaque volume was 130 mm^{3}.

**Figure 9:** Coronary CT angiograms, panels (a)–(d).

##### 3.3. DSCT Diagnosis of Coronary Heart Disease

A stenosis of 50% was used as the CT threshold for plaque-induced stenosis: <50% indicated stenosis that was not clinically significant, and ≥50% indicated clinically significant stenosis. The degree of coronary stenosis caused by mixed plaques and calcified plaques was consistent between the two groups. Agreement was moderate in the diagnosis of coronary stenosis due to noncalcified plaques, and the difference was not statistically significant (*P* > 0.05). Three patients in the observation group developed acute coronary syndrome during the follow-up because of coronary stenosis <50% with noncalcified plaques. Table 1 shows the degree of stenosis caused by different types of plaques in the two groups.

##### 3.4. Comparison on Intravascular Ultrasound Parameters of Patients from the Two Groups

The *y*-axis in Figure 10(a) indicates the plaque volume of the two groups before and after treatment. After atorvastatin treatment, the plaque volume of patients in both groups was significantly reduced (*P* < 0.05). The *y*-axis in Figure 10(b) shows the blood vessel volume of the two groups before and after treatment. The blood vessel volume of the two groups also decreased significantly after treatment compared with before treatment (*P* < 0.05).

##### 3.5. Comparison on Measurements of the Three Doctors

The average time taken by the model and by the three doctors to measure each image is shown in Figure 11. The network model took an average of 1.20 seconds, whereas doctors A, B, and C took an average of 186 seconds, 158 seconds, and 142 seconds per image, respectively. Therefore, the deep learning reconstruction network measured CT images in much less time than the doctors.
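The size of the time saving follows directly from the reported averages. The short calculation below divides each doctor's average time by the model's average time; the function name is hypothetical, and the inputs are the figures reported above.

```python
def speedup(manual_seconds, model_seconds):
    """How many times faster the model is than a manual measurement."""
    if model_seconds <= 0:
        raise ValueError("model time must be positive")
    return manual_seconds / model_seconds


MODEL_SECONDS = 1.20
DOCTOR_SECONDS = {"A": 186.0, "B": 158.0, "C": 142.0}

ratios = {name: speedup(t, MODEL_SECONDS) for name, t in DOCTOR_SECONDS.items()}
mean_doctor_seconds = sum(DOCTOR_SECONDS.values()) / len(DOCTOR_SECONDS)
mean_ratio = speedup(mean_doctor_seconds, MODEL_SECONDS)

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(name, round(ratio, 1))
    print("mean", round(mean_ratio, 1))
```

On these figures the model is roughly 118 to 155 times faster than an individual doctor, and about 135 times faster than the doctors' mean time of 162 seconds.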

#### 4. Discussion

Plaques in arteries are classified as calcified and noncalcified. Noncalcified plaque is unstable and easily detaches from the vascular wall. Once detached, it may lead to distal vascular blockage, cerebrovascular events, and eventually myocardial infarction [14]. CT imaging can clearly show the morphological features of the vascular wall and atherosclerotic plaques. Park et al. (2010) [15] studied the effect of statins on epicardial fat thickness in patients undergoing coronary artery intervention. Gan et al. (2010) [16] pointed out that statins could provide protection in coronary artery disease and in bypass grafting for left main stenosis. Lee et al. (2018) [17] reported that statins were associated with slower progression of overall coronary atherosclerotic volume, increased plaque calcification, and reduced high-risk plaque features; statins did not affect the progression of the percentage of coronary artery stenosis but induced phenotypic plaque transformation. In this study, the vascular volume and plaque volume of patients were measured before and after treatment, and the results showed that statins markedly reduced the plaque volume and vascular volume, indicating that the drugs had a reliable curative effect. Relevant studies in recent years have shown that statins inhibit platelet aggregation well, thereby reducing the body’s inflammatory response, effectively maintaining the integrity of the vascular endothelium, enhancing plaque stability in patients with coronary heart disease, and improving clinical efficacy. Other studies have shown that more than half of patients with dilated arteries have coronary artery stenosis. If not treated in time, thrombosis and coronary artery stenosis can lead to myocardial infarction and other adverse conditions in children. CT imaging causes no trauma in the diagnosis of patients with coronary artery disease and also allows the occurrence of tumor lesions to be observed in a timely manner according to the coronary artery lesions. In many cases, coronary artery dilatation involves the left and right main coronary arteries, and CT can detect coronary lesions in multiple locations, which helps to improve the accuracy of diagnosis.

With the rapid development of deep learning, many researchers used to write their own programs repeatedly to implement effective deep learning algorithms. With the emergence of deep learning frameworks, researchers only need to select an appropriate model and optimize its weight parameters through training. Some researchers simulated the evacuation design of buildings, introduced an auxiliary image data prediction training algorithm and a tracking sequence prediction training algorithm, and verified the accuracy and training speed of the CNN model for dataset prediction. After images are classified by deep learning, the location of lesions can be presented more clearly and the features of lesions can be extracted accurately.

Image segmentation based on deep learning in medicine includes CNN image segmentation and fully convolutional network image segmentation. Qin et al. (2021) [18] used CT images processed by a deep learning algorithm to evaluate the airway function of children and showed that the Res-Net model could effectively segment CT images and improve the information obtained from the patient’s CT images. The built model had an accuracy of 96.7% for image detection, and the RIU network model could reliably segment DSCT images. Some researchers used microlocal analysis to characterize the artifacts that appear in CT reconstruction images, showing that the FBP algorithm could reconstruct the structural information of scanned targets and could eliminate extra singularities to a certain extent.

#### 5. Conclusion

In this study, a CNN-based network model was constructed to measure the coronary artery plaque and vascular volume. Image reconstruction on the Res-Net network showed that the Res-Net network trained quickly. Deep learning-based CT imaging could improve the temporal and spatial resolution and guarantee image quality, and the deep learning-based network model performed well in DSCT image reconstruction. The study had some limitations: the sample size was small, and the segmentation of image feature information could not reach 100%. Follow-up work needs to expand the sample size to support wide use of the system. In conclusion, the algorithm model in this study could provide a theoretical basis for CT image reconstruction in medical systems and has important reference value for the development of intelligent medical devices.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare no conflicts of interest.