Abstract

Motion artifacts are a major challenge in the in vivo application of catheter-based cardiac imaging modalities. Gating is a critical tool for suppressing motion artifacts. Electrocardiogram (ECG) gating requires a trigger device or synchronous ECG recordings for retrospective analysis. Existing retrospective software gating methods extract gating signals through separate steps based on changes in vessel morphology or image features, which require a high computational cost and are prone to error accumulation. In this paper, we report on an end-to-end unsupervised learning framework for retrospective image-based gating (IBG) of catheter-based intracoronary images, named IBG Network. It establishes a direct mapping from a continuously acquired image sequence to a gated subsequence. The network was trained on clinical data sets in an unsupervised manner, addressing the difficulty of obtaining the gold standard in deep learning-based motion suppression techniques. Experimental results of in vivo intravascular ultrasound and optical coherence tomography sequences show that the proposed method has better performance in terms of motion artifact suppression and processing efficiency compared with the state-of-the-art nonlearning signal-based and IBG methods.

1. Introduction

Catheter-based cardiac imaging modalities, such as intravascular ultrasound (IVUS) and intravascular optical coherence tomography (IVOCT) [1], are important for the clinical diagnosis of coronary atherosclerotic disease (CAD). They have similar imaging principles in that a guide wire is inserted into the target vascular lumen and secured at the distal end under the guidance of X-ray angiography. Then, a special catheter with a probe at the tip is inserted into the lumen along the guide wire and pushed to the distal end. While the catheter is pulled back from the distal end to the proximal end, the probe emits an energy beam toward the surrounding tissue (IVOCT uses a low-coherence broadband infrared laser source, and IVUS uses ultrasound at 20–50 MHz). A detector at the tip of the catheter collects ultrasound echoes or backscattered light signals. Finally, sequential cross-sectional images (short-axis view or B mode) that show cross sections of the vessels, including lumen, adventitia/media, intima, and plaque load [2] are formed. Longitudinal view (L-view) slices along the long axis of the vessel segment (i.e., time-axis view) are also obtained for volumetric analysis.

A major challenge in the clinical application of intracoronary imaging is motion artifacts. Coronary arteries attach to the epicardial surface and move rhythmically with the heartbeat. When the catheter is continuously pulled back to collect signals, the cardiac movement and pulsating blood flow in the vascular lumen can cause jitter at the tip of the catheter and lateral displacement relative to the lumen. The pullback path is not always parallel to the long axis of the lumen. The deflection of the probe causes the target tissue area to deviate from the detector's receiving focus, so the received echo signal cannot accurately reflect the morphology of the target, resulting in deformation and distortion of the vascular structure displayed in the image [3]. Furthermore, cardiac dynamics causes a longitudinal oscillation of the catheter relative to the lumen [4], resulting in repeated sampling of the same anatomical area. Motion artifacts are the result of the combined effects of these factors. Unlike single-frame artifacts, such as blurring artifacts, gain artifacts, acoustic artifacts, reverberation artifacts, and guide wire artifacts, motion artifacts are related to the analysis of multiframe images. They appear as misalignments (rotations and shifts) and distortions of vessel structures between successive B-mode images, as well as sawtooth-shaped longitudinal appearances in the L-view of pullback sequences [5–7]. They affect the subsequent processing of image-derived quantities for diagnosis or for input of computational models, such as 3D vessel reconstruction [8], volumetric measurements [9], and estimation of biomechanical parameters (wall strain and elasticity) [10].
Although the acquisition speed of the frequency domain OCT can be increased to more than 100 frames per second (fps) [11], suppressing motion to some extent, relative motion between the catheter and the wall of the vessel may still affect the visualization of the entire pullback sequence, thus decreasing precision in subsequent postprocessing procedures.

Gating is currently the main tool for suppressing motion artifacts in in vivo cardiac imaging applications; it comprises prospective gating and retrospective gating, as shown in Figure 1. Prospective gating uses electrocardiogram (ECG) triggering devices to capture images in specific cardiac phases (typically R-waves) [12–14]. It is not widely used in the clinic because not all commercially available intravascular imaging catheters include an ECG triggering option. Because only one frame per cardiac cycle is captured, this image acquisition method considerably extends catheterization duration and increases radiation dose and surgical risk compared to continuous catheter pullback.

Retrospective gating is implemented by hardware or software. Retrospective hardware gating involves continuously withdrawing the catheter to obtain sequential images covering several cardiac cycles, while simultaneously recording ECG signals. After cardiac catheterization, images are retrospectively analyzed against the ECG signal, and those images acquired in the same phase (usually R-wave) for each cycle are selected to form a gating subsequence [15]. Unlike prospective ECG triggering, retrospective hardware gating does not prolong intervention time. However, it is difficult to ensure full synchronization between ECG recording and interventional image acquisition. The ECG records the overall physiological electrical activity of the heart, while motion artifacts present in the intracoronary image sequences depend on the local movement of the catheter relative to the vascular lumen. In addition, a more difficult problem is choosing the most effective R–R fraction to achieve maximum interframe stability, especially in the case of arrhythmias.

Retrospective software gating utilizes signal processing techniques (signal-based gating, SBG) or image processing techniques (image-based gating, IBG) to extract implicit cardiac phases from signals or images collected through continuous catheter pullback. It does not require ECG trigger devices or ECG records for retrospective analysis. Therefore, it can overcome the shortcomings of ECG gating and achieve gating without synchronous ECG recording. The state-of-the-art SBG method presented in [16] uses an affinity propagation (AP) algorithm to cluster the correlation matrix of raw imaging signals, extract static signal frames, and then reconstruct the gating images from the static frames of signals. Because not all commercially available intravascular imaging systems allow raw signal acquisition, SBG is not as commonly used as IBG. IBG is implemented in two ways: morphological feature-based and grayscale feature-based. The former extracts approximately periodic gating signals by tracking the changes in vessel wall contour or lumen centroid position over time in B-mode images [17, 18]. It requires precise segmentation of all B-mode images to extract vessel wall contours. Manual segmentation is time-consuming, while the accuracy and robustness of automatic segmentation are difficult to ensure. The latter analyzes the changes in the intensity features of B-mode images throughout the entire pullback sequence. It eliminates the need for prior segmentation and can achieve fully automatic gating. For example, it can be implemented by constructing a dissimilarity matrix based on the normalized correlation of the intensity features between successive frames. The path with the lowest cumulative dissimilarity is then found in the matrix using dynamic programming to extract the gating signal [19, 20].
In addition, gating signals can be extracted from local changes in the local average of pixel intensity [21], the variation of the motion blur [22], changes in local grayscale features [23], combinations of image edges and pixel intensity [24], phase changes of different frequency components of images [25], or linear combinations of several features of images [26]. These methods require traversing all pixels in each B-mode image, resulting in a computationally heavy load. Addressing this issue, manifold learning has been employed to reduce high-dimensional image sequences to low-dimensional manifolds. The low-dimensional feature vectors that can describe cardiac motion are used to construct a distance function [27, 28].

Traditional retrospective software gating methods extract the gating signal based on image features and signal processing through separate steps. They are sensitive to image noise and artifacts, making it difficult to ensure the robustness of the gating results, and are prone to error accumulation. Over the past decade, deep learning has emerged as a potential preferred solution for quickly and accurately analyzing large cardiac imaging data sets [29]. Addressing the issue of IBG for near-infrared spectroscopy IVUS, Bajaj et al. [30] built a neural network model using bidirectional gated recurrent units to detect end-diastolic frames. This work was the first attempt to use deep learning for IBG; it detects end-diastolic frames by analyzing the absolute intensity difference between corresponding pixels in successive frames. The process is computationally intensive, and the results are sensitive to image noise and artifacts. Recently, they improved this work by exploring forward and backward motion features of IVUS sequences, integrating dedicated motion encoders and a bidirectional attention recurrent network [31]. In both works, neural network models were trained in a supervised manner on ECG-defined gold-standard end-diastolic frames. Labeling samples is a laborious task and requires rigorously synchronized intravascular image acquisition and ECG recording.

In this work, we propose an unsupervised deep learning framework to achieve retrospective software gating of intracoronary image sequences. The main contributions are summarized as follows:
(1) We enable end-to-end mapping from a continuous pullback intracoronary image sequence to a gated sequence covering several cardiac cycles. In particular, a CNN framework named IBG Network (IBG-Net) is developed to detect gating frames from the original image sequence. It has clinical significance in suppressing motion artifacts associated with the cardiac cycle in intracoronary imaging.
(2) We train IBG-Net on clinical data sets in an unsupervised manner, addressing the difficulty of obtaining a gold standard in motion suppression and IBG techniques.
(3) We validate the feasibility and superiority of this method on clinical data sets. The experimental results show that compared to traditional SBG and IBG methods, the proposed method performs better in terms of the visual effects of the L-view and the quantitative evaluation metrics of vascular wall boundary smoothness, interframe dissimilarity, and vascular geometry measurement.

The remainder of this article is organized as follows: Section 2 describes the proposed method in detail. Section 3 provides relevant results from clinical image experiments and an analysis of quantitative evaluation metrics. Section 4 provides a relevant discussion of the factors that affect the performance of this method and its limitations. Section 5 concludes the article with a summary.

2. Materials and Methods

An overview of our method is illustrated in Figure 2. Intracoronary image sequences are acquired by continuously withdrawing the catheter at a constant speed during routine cardiac catheterization. For each pullback sequence, corrupted frames are manually selected and discarded prior to subsequent analysis. The remaining images, which cover several cardiac cycles and fully show the morphology of the lumen, wall, and plaques of the vessel, are fed into IBG-Net. The network finally outputs a gated subsequence.

2.1. IBG-Net Architecture

IBG-Net consists of two modules: the underlying feature learning (UFL) module and the gating frame extraction (GFE) module. The UFL module is used to extract feature vectors from the input successive B-mode images frame by frame. Its output is a feature vector library in which the feature vectors extracted from each image are stored. As illustrated in Figure 3, the UFL module consists of a 3 × 3 convolutional layer with 64 feature channels, four residual blocks (ResBlocks), an average pooling (avgpool) layer, and two fully connected (FC) layers. Each ResBlock consists of two residual basic blocks (BasicBlocks), each of which contains two 3 × 3 convolutional layers. Furthermore, the third, fifth, and seventh BasicBlocks contain a 1 × 1 convolutional layer to reduce the number of feature channels to 1. Each convolutional layer is followed by a batch normalization layer. The FC layers project feature vectors into a 128-dimensional space and perform L2 normalization. The first FC layer has 512 output neurons, and the second one has 128 output neurons, outputting the 128-dimensional feature vector of the input image. The rectified linear unit (ReLU) is used as the activation function.

The GFE module detects the gating frames by comparing the dissimilarity between the feature vectors of different frames output by the UFL module. Specific processing details are as follows. First, as illustrated in Figure 4, the dissimilarity between the feature vectors extracted from two images separated by k frames is calculated as

d(i, i+k) = 1 - S(\mathbf{v}_i, \mathbf{v}_{i+k}), \quad S(\mathbf{v}_i, \mathbf{v}_{i+k}) = \frac{\mathbf{v}_i^{T} \mathbf{v}_{i+k}}{\|\mathbf{v}_i\| \, \|\mathbf{v}_{i+k}\|},   (1)

where d(i, i+k) represents the dissimilarity between frame i and frame i + k; \mathbf{v}_i and \mathbf{v}_{i+k} are the feature vectors extracted from frames i and i + k, respectively; S(\mathbf{v}_i, \mathbf{v}_{i+k}) is the cosine similarity between \mathbf{v}_i and \mathbf{v}_{i+k}; the superscript "T" represents the transpose of a vector; and \|\cdot\| represents the vector modulus. The average dissimilarity of two images separated by k frames over the entire image sequence is obtained by

\bar{d}(k) = \frac{1}{N-k} \sum_{i=1}^{N-k} d(i, i+k),   (2)

where k ranges from 0 to N − 1, and N is the total number of frames in the input image sequence.
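To make the computation concrete, the frame-pair dissimilarity and its average over the sequence can be sketched in a few lines of NumPy (a minimal illustration; the function names and the feature-array layout are our assumptions, not the authors' implementation):

```python
import numpy as np

def cosine_dissimilarity(v1, v2):
    """1 minus the cosine similarity between two feature vectors."""
    return 1.0 - float(v1 @ v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

def average_dissimilarity(features, k):
    """Average dissimilarity of all frame pairs separated by k frames.

    features: (N, D) array, one feature vector per frame (assumed layout).
    """
    n = len(features)
    pairs = [cosine_dissimilarity(features[i], features[i + k])
             for i in range(n - k)]
    return float(np.mean(pairs))
```

Evaluating `average_dissimilarity` for every lag k = 0, …, N − 1 yields the average dissimilarity signal analyzed in the next step.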

Then, the average heart rate, R, in beats per minute (bpm), is estimated from the local peak of the frequency spectrum of the average dissimilarity signal. Given that the human heart rate is between 45 and 200 bpm, we search for this peak in the frequency range of 45–200 bpm. The length of the cardiac cycle in frames is subsequently obtained by T = 60f/R, where f is the frame rate of the image acquisition in fps.
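This spectral step can be sketched as follows, assuming the average dissimilarity signal is treated as a time series sampled at the acquisition frame rate (function and parameter names are ours):

```python
import numpy as np

def estimate_heart_rate(avg_dissim, fps, lo_bpm=45, hi_bpm=200):
    """Estimate heart rate R (bpm) from the spectrum of the average
    dissimilarity signal, searching only the physiological 45-200 bpm band,
    then convert to the cardiac cycle length in frames via T = 60 f / R."""
    n = len(avg_dissim)
    spectrum = np.abs(np.fft.rfft(avg_dissim - np.mean(avg_dissim)))
    freqs_bpm = np.fft.rfftfreq(n, d=1.0 / fps) * 60.0  # Hz -> bpm
    band = (freqs_bpm >= lo_bpm) & (freqs_bpm <= hi_bpm)
    peak_bpm = freqs_bpm[band][np.argmax(spectrum[band])]
    cycle_frames = int(round(60.0 * fps / peak_bpm))
    return peak_bpm, cycle_frames
```

For example, a clean 1.3 Hz (78 bpm) oscillation sampled at 100 fps yields a cycle length of 77 frames, matching the IVOCT figures reported in Section 3.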

Assuming that the first frame in the input image sequence is a gating frame, the similarity between the current gating frame m and a frame m + r in the search interval around m + T is calculated as

S(\mathbf{v}_m, \mathbf{v}_{m+r}) = \frac{\mathbf{v}_m^{T} \mathbf{v}_{m+r}}{\|\mathbf{v}_m\| \, \|\mathbf{v}_{m+r}\|},   (3)

where S(\mathbf{v}_m, \mathbf{v}_{m+r}) is the similarity between frame m and frame m + r, and r ranges over the interval around the cycle length T. The frame with the greatest similarity is detected as the next gating frame. This process is repeated until all gating frames are detected.
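The greedy search described above can be sketched as follows (a simplified illustration; the search half-width `delta` is an assumed parameter, as the text does not specify the exact width of the interval around T):

```python
import numpy as np

def detect_gating_frames(features, cycle_frames, delta=2):
    """Greedy gating-frame detection: starting from frame 0, repeatedly pick
    the most similar frame within [m + T - delta, m + T + delta] of the
    current gating frame m, where T is the cardiac cycle length in frames.
    """
    norms = np.linalg.norm(features, axis=1)
    gating = [0]
    m = 0
    n = len(features)
    while m + cycle_frames - delta < n:
        lo = m + cycle_frames - delta
        hi = min(m + cycle_frames + delta, n - 1)
        cand = np.arange(lo, hi + 1)
        # cosine similarity of each candidate to the current gating frame
        sims = features[cand] @ features[m] / (norms[cand] * norms[m])
        m = int(cand[np.argmax(sims)])
        gating.append(m)
    return gating
```

On synthetic features that rotate through one cycle every 10 frames, the sketch recovers one frame per cycle at the same phase, as intended.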

2.2. Training Data Preparation

We trained the proposed learning framework on clinically acquired image data sets in an unsupervised learning manner. Our data sets consist of IVUS and IVOCT image sequences collected from patients prior to stent implantation during routine cardiac catheterization. The IVUS studies were acquired using a commercially available Jomed Endosonic (Beringen, Switzerland) imaging system with a 2.9 F 30 MHz mechanically driven catheter; its pullback speed is 0.5 mm/s. The B-mode images were acquired at a frame rate of 30 fps, with a size of 256 × 256 pixels and a grayscale range of 0–255. The IVOCT studies were acquired using a spectral domain OCT system (C7XRTM OCT system, LightLab Imaging/St. Jude Medical Inc., St. Paul, MN, USA) equipped with a 2.7 F C7 Dragonfly OCT catheter; its pullback speed is 20 mm/s. B-mode images were acquired at an A-line rate of 120 kHz and a frame rate of 100 fps, with a size of 240 × 240 pixels. The images were resized to 256 × 256 pixels to build the data sets.

2.3. Network Training

In IBG-Net, the UFL module must be trained with a training set, following the steps shown in Figure 5. First, the ith sample x_i from the training set is input into a random augmenter to generate two correlated images, denoted x_i' and x_i'', through random rotation, random shift, random color dithering, and random Gaussian blur. Then, x_i' and x_i'' are input into the twin UFL subnetwork composed of two UFL modules F_1(\cdot) and F_2(\cdot) with shared weights to obtain the feature vectors \mathbf{v}_i' and \mathbf{v}_i'' of the two correlated images as follows:

\mathbf{v}_i' = F_1(x_i'), \quad \mathbf{v}_i'' = F_2(x_i'').   (4)

The similarity between \mathbf{v}_i' and \mathbf{v}_i'' is obtained by

S(\mathbf{v}_i', \mathbf{v}_i'') = \frac{(\mathbf{v}_i')^{T} \mathbf{v}_i''}{\|\mathbf{v}_i'\| \, \|\mathbf{v}_i''\|}.   (5)

The adaptive moment estimation algorithm (Adam) [32] is used to optimize the parameters of the twin UFL subnetwork to determine the optimal network model with the least loss. For each batch in the training set, the loss function is defined as

\mathcal{L}(\theta) = \frac{1}{2Q} \sum_{i=1}^{Q} \left[ \ell(\mathbf{v}_i', \mathbf{v}_i'') + \ell(\mathbf{v}_i'', \mathbf{v}_i') \right],   (6)

where \theta denotes the set of all learnable parameters in the network, Q denotes the number of samples in each batch, \ell(\mathbf{v}_i', \mathbf{v}_i'') is the loss between \mathbf{v}_i' and \mathbf{v}_i'', and \ell(\mathbf{v}_i'', \mathbf{v}_i') is the loss between \mathbf{v}_i'' and \mathbf{v}_i', with

\ell(\mathbf{v}_a, \mathbf{v}_b) = -\log \frac{\exp\left(S(\mathbf{v}_a, \mathbf{v}_b)/\tau\right)}{\sum_{u=1}^{2Q} \mathbb{1}_{[u \neq a]} \exp\left(S(\mathbf{v}_a, \mathbf{v}_u)/\tau\right)},   (7)

where \tau is an adjustable temperature parameter, and \mathbb{1}_{[u \neq a]} is an indicator function evaluating to 1 if u \neq a and 0 otherwise.
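The described loss (temperature τ, indicator excluding self-pairs, 2Q augmented views per batch) matches the form of a temperature-scaled contrastive objective; a NumPy sketch under that assumption, with rows 2q and 2q + 1 of `z` holding the two augmented views of sample q (the layout and function name are ours):

```python
import numpy as np

def contrastive_loss(z, tau=0.5):
    """Batch contrastive loss over 2Q L2-normalized embeddings z.

    For each view, the positive is its partner view; the denominator sums
    over all other views in the batch (the indicator excludes self-pairs).
    """
    n = len(z)  # n = 2Q
    sim = (z @ z.T) / tau                      # pairwise similarities / tau
    np.fill_diagonal(sim, -np.inf)             # exclude u == a terms
    log_den = np.log(np.exp(sim).sum(axis=1))  # log of the denominator
    pos = np.arange(n) ^ 1                     # partner index: 0<->1, 2<->3, ...
    log_num = sim[np.arange(n), pos]           # log of the numerator
    return float(np.mean(log_den - log_num))   # mean of -log(num / den)
```

A production implementation would compute this on the GPU with PyTorch, but the arithmetic is identical.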

In addition, the network parameters are tuned through five-fold cross-validation to make full use of all data and to avoid local optima caused by data distribution bias due to improper data set partitioning.

Figure 6 shows the training loss versus epoch, with batch size = 16, learning rate = 0.01, and weight decay = 10^−4. As training progresses, the loss decreases rapidly; beyond 600 epochs, the change in the loss is no longer significant. Therefore, we stop training at a maximum of 600 epochs.

2.4. Network Implementation

After training, IBG-Net can be used for offline gating of intravascular image sequences. Because the trained network achieves end-to-end mapping from the original image sequence to the gated subsequence, the sequential images, after corrupted frames are manually removed, are directly input into the network, and the network outputs a gated image sequence. Figure 7 shows the flow chart of the specific implementation of the network.

3. Results

3.1. Experimental Design
3.1.1. Data Sets

We collected 36 IVUS pullbacks from 25 patients with identifying information hidden (such as name, sex, and age), with a total of 43,100 frames, including 13 left anterior descending, 9 left circumflex, and 14 right coronary arteries. In addition, we collected 21 IVOCT pullbacks from 14 patients with identifying information hidden, with a total of 22,500 frames, including 9 left anterior descending, 5 left circumflex, and 7 right coronary arteries. The samples in both data sets are shuffled and randomly partitioned into training and test sets at a ratio of 8:2. To avoid overfitting, the training sets are augmented by random rotation (clockwise and counterclockwise, at angles up to 180°), random shift (horizontal or vertical), and random shear, given that the object of interest in intravascular B-mode images is the cross section of vessels with a quasisymmetric structure. The augmented IVUS training set contains 49,680 samples, and the IVOCT training set contains 32,750 samples.

3.1.2. Baseline Methods

To test the performance of the proposed learning framework and demonstrate its superiority, the state-of-the-art SBG [16] and traditional IBG [19] methods are used as baselines. The SBG method utilizes AP clustering to achieve the classification of signal frames, as introduced in Section 1; for brevity, it is called the AP method hereafter. The traditional IBG method is based on a comprehensive search for the optimal path in the grayscale dissimilarity matrix.

3.1.3. Evaluation Metrics

The effectiveness of gating methods in suppressing motion artifacts is evaluated from four aspects: (1) the visual effect of the L-view of nongated and corresponding gated image sequences, (2) quantitative measures of vessel wall boundary smoothness, (3) the interframe dissimilarity of image sequences before and after gating, and (4) volume measurement of vessel segments.

To visually evaluate the improved quality of the gated image sequence, vascular lumen boundaries are detected from L-view slices. Specific steps are illustrated in Figure 8. In the L-view slice, starting from the central axis of the catheter, all pixels are traversed to the left and right, respectively. According to the preset threshold of intensity difference between adjacent pixels, it is determined whether the current pixel is a lumen contour point. If so, the coordinates of that point are recorded until all lumen contour points are obtained.
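The traversal described above can be sketched as follows (an illustrative simplification; the threshold value and function names are assumptions, not the authors' exact procedure):

```python
import numpy as np

def lumen_border_from_lview(lview, center_col, thresh=30):
    """Detect left/right lumen border columns in each row of an L-view slice
    by walking outward from the catheter axis until the intensity jump
    between adjacent pixels exceeds a preset threshold (assumed value).

    lview: 2D grayscale array (rows = positions along the pullback).
    Returns per-row column indices of the left and right borders, or -1
    where no jump is found.
    """
    rows, cols = lview.shape
    left = np.full(rows, -1)
    right = np.full(rows, -1)
    for r in range(rows):
        for c in range(center_col, 0, -1):          # walk left
            if abs(int(lview[r, c - 1]) - int(lview[r, c])) > thresh:
                left[r] = c - 1
                break
        for c in range(center_col, cols - 1):       # walk right
            if abs(int(lview[r, c + 1]) - int(lview[r, c])) > thresh:
                right[r] = c + 1
                break
    return left, right
```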

The curvature and standard deviation (SD) of the vessel wall border are used as quantitative metrics to evaluate the smoothness of the vessel wall in the L-view and the cross-sectional view, respectively. The curvature of the vessel wall border in the L-view is defined as

\kappa(x) = \frac{|y''(x)|}{\left(1 + [y'(x)]^2\right)^{3/2}},   (8)

where y(x) is a parametric curve representing the boundary of the vessel wall, y'(x) and y''(x) are the first derivative and the second derivative of y(x), respectively, and \kappa(x) is the curvature of y(x) at the point x. SD measures the dispersion of the distance from the center of the image (i.e., the center of the catheter) to the upper border of the vascular intima/lumen in each B-mode image. It is defined as

\mathrm{SD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (d_i - \bar{d})^2},   (9)

where d_i (i = 1, 2, \ldots, N) is the distance (in pixels) from the center of the image to the upper border of the vascular lumen in the ith frame, as shown in Figure 9, and \bar{d} is the average of d_i over the entire image sequence.
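Both metrics are straightforward to compute from sampled borders; a finite-difference sketch (function names are ours):

```python
import numpy as np

def border_curvature(y):
    """Curvature |y''| / (1 + y'^2)^(3/2) of a sampled border y(x)
    (one value per L-view column), via finite differences."""
    dy = np.gradient(y)
    d2y = np.gradient(dy)
    return np.abs(d2y) / (1.0 + dy ** 2) ** 1.5

def border_sd(d):
    """Standard deviation of the catheter-center-to-lumen distances d_i
    across all frames."""
    d = np.asarray(d, dtype=float)
    return float(np.sqrt(np.mean((d - d.mean()) ** 2)))
```

A perfectly smooth (straight) border gives zero curvature everywhere, and constant distances give zero SD, so lower values indicate better motion suppression.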

Average interframe dissimilarity (AIFD) [20] and gating frame number difference (GFND) [4, 20] are two metrics commonly used to measure the performance of gating methods. AIFD is calculated as the average of the dissimilarity between two frames in a pullback sequence consisting of N frames:

\mathrm{AIFD} = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d(i, j),   (10)

where d(i, j) is the dissimilarity between frame i and frame j, calculated from Equation (1). GFND is the absolute value of the difference between the theoretical gating frame number N_t and the actual gating frame number N_a:

\mathrm{GFND} = |N_t - N_a|, \quad N_t = \mathrm{round}(N/T),   (11)

where N is the total number of frames contained in the nongated image sequence, and T denotes the cardiac cycle length in frames.
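A sketch of both metrics, assuming the all-pairs form of AIFD and cosine dissimilarity as in Equation (1) (function names are ours):

```python
import numpy as np

def aifd(features):
    """Average interframe dissimilarity over all frame pairs of a sequence,
    with cosine dissimilarity between per-frame feature vectors."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    d = 1.0 - f @ f.T                       # pairwise cosine dissimilarities
    iu = np.triu_indices(len(f), k=1)       # each unordered pair once
    return float(d[iu].mean())

def gfnd(n_frames, cycle_frames, n_actual):
    """|theoretical - actual| number of gating frames, where the theoretical
    count is the sequence length divided by the cardiac cycle length."""
    return abs(round(n_frames / cycle_frames) - n_actual)
```

For example, a 100-frame pullback with a 21-frame cardiac cycle has a theoretical gating count of 5, so detecting 5 gating frames gives GFND = 0.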

3.1.4. Implementation Details

In our experiments, we implemented the baseline approaches and built, trained, and tested IBG-Net on an NVIDIA Tesla P100-PCIE GPU (1.3285 GHz, 16 GB of video memory) with an Intel Core i7-12700H CPU. The operating system is Ubuntu 18.04, the programming language is Python 3.8, and the deep learning framework is PyTorch 1.7.

3.2. Results of IBG-Net Gating

Figure 10 shows heatmaps of the image features extracted from clinically collected IVOCT and IVUS images using the trained IBG-Net. It can be seen that the extracted features are concentrated around the vascular lumen and wall. Figure 11 shows the average dissimilarity signals and their amplitude spectra; the average dissimilarity plots exhibit approximate periodicity, with local minima within each cycle corresponding to roughly equally spaced frame intervals, associated with periodic cardiac movements. The amplitude spectra exhibit an obvious peak in the frequency range of 45–200 bpm, corresponding to average heart rates of about 87 and 78 bpm, respectively. Given that the IVUS and IVOCT images were captured at frame rates of 30 and 100 fps, respectively, the approximate lengths of the cardiac cycle are 21 frames and 77 frames, respectively. Figure 12 shows the L-view of the nongated and corresponding image-gated IVUS/IVOCT image sequences, as well as the 3D appearance of the vessel segments shown in the pullback sequences. Apparently, the vessel walls in the nongated image sequences have a sawtooth appearance, whereas in the IBG-Net-gated image sequences, the vessel walls become visually smooth. This improvement is even more obvious in 3D views.

Figure 13 shows the results of the quantitative evaluation of the vessel wall smoothness. It can be seen from Figure 13 that after gating, the curvature and its variation along the boundary of the vessel wall decrease significantly, indicating that the continuity and smoothness of the vessel wall are significantly improved. Table 1 provides the quantitative results of AIFD, GFND, and SD for the image sequences before and after gating. Obviously, the GFND of IBG-Net-gated sequences is less than 2, and compared to nongated sequences, the AIFD and SD are significantly reduced. These results suggest that IBG-Net can effectively suppress motion artifacts associated with cardiac cycles.

3.3. Results of the Comparison Experiment

Figure 14 shows the AIFD and SD metrics of gated image sequences obtained using IBG-Net and baseline methods, respectively. Both the AIFD and SD of the IBG-Net-gated sequences are significantly lower than those obtained using the traditional IBG method and slightly lower than those obtained using the AP method, indicating the effectiveness of IBG-Net in reducing motion artifacts.

To quantitatively demonstrate that the proposed method can quantify the volume change of vessel segments more accurately than existing methods, the volume of vessel segments in gated image sequences obtained using ECG gating, IBG-Net, AP, and IBG, respectively, is recorded in Table 2. Figure 15 shows the difference in vessel volumes between different methods using ECG gating as the gold standard. It can be seen that the volume obtained by IBG-Net is closest to that obtained by ECG gating, followed by AP, and the difference between the results obtained by IBG and ECG gating is the largest.

In addition, to evaluate the implementation efficiency of the methods, the analysis time of the three methods is recorded in Table 3, where the time of IBG-Net does not include training time. The IBG method needs to preprocess each frame in the image sequence and design the filter. The AP method needs to reconstruct the image frame by frame after selecting the gating signal frames. IBG-Net can directly obtain gated image sequences in an end-to-end manner, which saves running time and improves processing efficiency.

3.4. Influence of Stenting

Stent implantation is an interventional treatment for CAD, which can effectively improve the blood supply of narrowed arteries. The commonly used cardiac stents in clinical practice are typically bendable metal meshes used to expand and remodel stenotic vascular lumens. Figure 16 shows an IVUS image and an IVOCT image captured from post-stent subjects, as well as an L-view of the two image sequences in which sawtooth-shaped vessel walls can be observed. Figure 17 shows the average dissimilarity signals obtained by IBG-Net and the traditional IBG method based on changes in image intensity [19], respectively, and the lumen centroid offset signals obtained by the method based on changes in vascular morphology [20]. It can be seen from Figure 17 that neither the average dissimilarity signal nor the lumen centroid offset signal presents obvious periodicity. Their magnitude spectra have multiple peaks with lower amplitudes within the frequency range of 45–200 bpm, making it impossible to determine the average heart rate. This is because IBG-Net, vessel morphology-based methods, and pixel intensity-based methods all detect gating frames based on the periodic changes in vascular cross-sectional features. However, there is a certain elasticity gap between the metal mesh and the arterial wall. The impact of cardiac motion on metal stents is not as significant as on arterial walls. Therefore, the temporal variation of cross-sectional features of arterial vessels in sequential intravascular images cannot accurately reflect periodic cardiac dynamics. In this case, none of these three methods can accurately detect gating frames. One possible solution is to suppress motion artifacts directly through image registration without discarding any frames, despite being computationally intensive.

4. Discussion

4.1. Clinical Significance

Motion artifacts exist in in vivo intracoronary image sequences collected by continuously withdrawing the catheter, which are caused by the elasticity of the vessel wall, cardiac and respiratory motion, and rapid movement of the catheter within the lumen. Respiratory motion can be suppressed by acquiring images while the patient holds their breath. Given the rhythmic and continuous nature of cardiac motion, gating techniques are usually used to synchronize image acquisition with the cardiac cycle. Furthermore, in the interpretation of L-views, analysis of images obtained at the same cardiac phase gives a more accurate quantification of the volume changes of vessel segments with plaque load than using a complete nongated pullback sequence [33]. Online ECG gating requires a dedicated trigger acquisition device and takes approximately seven times the acquisition time of continuous pullback, which limits its application in vivo. Although offline ECG gating does not extend the image acquisition time, complete synchronization between interventional image acquisition and ECG recording cannot be guaranteed. Determining optimal sampling points is also a difficult task. This technique cannot be used for image sequences without ECG recordings.

The method proposed in this article overcomes these limitations and realizes end-to-end fast retrospective software gating. It can significantly suppress motion artifacts and facilitate integration with subsequent procedures, such as 3D vessel reconstruction and quantitative measurement of morphological parameters, which are of importance for clinical applications.

4.2. Necessity of Unsupervised Learning

The purpose of IBG is to detect frames acquired in the same phase of each cardiac cycle based on periodic changes in cross-sectional image features. As introduced in Section 1, traditional nonlearning approaches typically employ morphological feature-based or intensity feature-based strategies. The former involves extracting the contours of the vascular lumen or wall in each image, calculating the centroid of the lumen, analyzing the changes in lumen shape, extracting and filtering the gating signal, and detecting the cardiac phase. The latter involves traversing all pixels in each cross-sectional image, constructing an intensity dissimilarity matrix or analyzing local changes in pixel intensity, searching for the optimal path in the dissimilarity matrix or analyzing the signal reflecting the pixel intensity changes, and detecting the local extremes of the signal. Performing these steps separately in sequence is time-consuming, and errors from each step may accumulate to the next step, affecting the accuracy of the gating results.

This study aims to construct a convolutional neural network to achieve end-to-end mapping from a continuous pullback sequence to a gated subsequence. Compared with traditional methods, it avoids the accumulation of errors across steps and shortens the overall analysis time. The network is trained in an unsupervised manner using clinical in vivo data sets, where the network can improve the accuracy of feature extraction by learning prior knowledge of image features, thereby improving the gating accuracy. Unlike supervised training, there is no need for gold-standard data such as ECG-defined labels, thus reducing the difficulty of constructing a data set and avoiding the synchronization challenges and susceptibility to arrhythmias of ECG gating.

4.3. Limitations and Future Work

A common limitation of gating techniques is the retention of only one frame per cardiac cycle, which may result in the loss of some clinically valuable information. Single-phase-gated image sequences cannot be used for hemodynamic assessment, vessel elasticity analysis, or biomechanical characterization, which typically require the use of complete pullback data. Additionally, as described in Section 3.3, the proposed method can only achieve satisfactory gating results for image sequences acquired from subjects without stent implantation. However, in the image sequences collected from poststent subjects, vascular features do not change significantly enough between successive frames for effective gating. In future work, we plan to develop a deep learning framework to estimate the motion field between successive frames, based on which images are elastically registered, to achieve direct suppression of motion artifacts.

5. Conclusions

This work proposes an unsupervised deep learning framework for retrospective gating of continuous pullback intracoronary image sequences. The constructed neural network extracts feature vectors from each cross-sectional image and obtains the signal associated with the cardiac cycle by analyzing changes in the feature vectors between successive frames. The network was trained and tested using clinically acquired IVUS and IVOCT image sequences. The results show that the visual effect of the L-view slices of the gated image sequences is significantly improved, and the quantitative evaluation metrics (AIFD, GFND, vessel wall boundary curvature, SD, and vessel volume) also improve significantly. Furthermore, the proposed method is superior to the traditional SBG and IBG methods in terms of accuracy and processing time. It does not require any image preprocessing procedures such as segmentation or contour extraction. The results verify the feasibility of using deep learning to perform retrospective gating.

Data Availability

The data sets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was financially supported by the National Natural Science Foundation of China (no. 62071181).