Abstract

Cardiovascular diseases have become one of the most prevalent threats to human health throughout the world. As a noninvasive auxiliary diagnostic tool, heart sound detection plays an important role in the prediction of cardiovascular diseases. This paper reviews the development of computer-aided heart sound detection techniques over the last five years. It covers the following aspects: the theory of heart sounds and the relationship between heart sounds and cardiovascular diseases; the key technologies used in the processing and analysis of heart sound signals, including denoising, segmentation, feature extraction and classification; and, with emphasis, the applications of deep learning algorithms in heart sound processing. Finally, some areas for future research in computer-aided heart sound detection are explored, in the hope of providing a reference for the prediction of cardiovascular diseases.

1. Introduction

With the prevalence of unhealthy living habits, cardiovascular disease (CVD) has become one of the major threats to human health. According to the latest statistics of the World Health Organization (WHO) [1], the number of deaths from CVDs reached 17.9 million in 2016; CVD is the leading cause of mortality throughout the world. At present, there are about 290 million people suffering from cardiovascular diseases in China alone, so the prevention and treatment of cardiovascular diseases have become an urgent issue for health-conscious people.

Heart sounds, the sounds produced during cardiac systole and diastole, can be recorded as heart sound signals, also known as phonocardiograms (PCG), whose acquisition is noninvasive and easy. Once processed and analyzed, PCG data can serve as an auxiliary diagnostic tool for the prediction of cardiovascular diseases. However, owing to the characteristics of heart sound signals and the influence of environmental noise, heart sound detection faces great challenges. On the one hand, the randomness and variability of cardiovascular disease symptoms make the signal manifestations complex and diverse. On the other hand, heart sound signals are relatively weak, and the acquisition of the original signals can be affected by various noises and interferences. The resulting noisy data reduce the accuracy of parameter extraction and increase the uncertainty of diagnosis.

Computer-aided detection technology is a fast, efficient and economical tool [2] that can be applied to the quantitative acquisition and analysis of heart sound signals. By extracting the key parameters of the PCG and comparing the patient’s monitoring sequence with a labeled database, more intuitive diagnostic results can be obtained automatically, and experts can further infer potential cardiovascular disease using their clinical knowledge. In recent years, computer-aided processing and analysis of heart sound signals has made remarkable progress and aroused wide interest [3, 4].

At present, intelligent auscultation technology has not been widely used in clinical diagnosis, and the main method used for heart sound detection is manual auscultation. Therefore, the research and application of computer-aided techniques for heart sound detection will greatly promote development in the field of cardiovascular disease diagnosis.

The purpose of this paper is to provide an overview of recent computer-aided heart sound detection techniques. The clinical characteristics of heart sound signals are introduced first. Then, some promising processing and analysis techniques for heart sound detection developed over the last five years are reviewed. Next, deep learning algorithms that can be applied to PCG processing and analysis are discussed. Finally, some promising research areas in computer-aided heart sound detection are recommended.

2. Heart Sounds and Cardiovascular Diseases

Vibrations caused by cardiac activities such as myocardial contraction, heart valve closure, and occlusion of the ventricular wall are transmitted through the tissue to the surface of the chest wall, forming heart sound signals that can be perceived by the human ear and recorded with electronic instruments. Figure 1 shows the location of the heart valves and arteries associated with auscultation. According to their order of occurrence in a cardiac cycle, heart sounds are divided into four components: the first heart sound (S1), the second heart sound (S2), the third heart sound (S3) and the fourth heart sound (S4). Each of the four components corresponds to a different physiological state of the heart. Figure 2 shows the blood flow changes in the heart for some of the heart sound components. The intensity, frequency and correlation of the heart sounds reflect the heart valve condition, myocardial function and intracardiac blood flow. Table 1 summarizes how heart sounds are generated, including their causes, features and significance [5].

The fundamental heart sounds (FHS) [6] used in clinical diagnosis include S1 and S2 (S3 appears only in the cardiac cycles of some healthy young people, and S4 does not appear in normal cardiac cycles). The period between S1 and S2 in the same cardiac cycle is called systole, and the one between S2 and S1 in the next cycle is called diastole. The normal duration of systole is about 0.35 sec and that of diastole is about 0.45 sec, for a total of 0.8 sec in a complete cycle. These values are closely related to the occurrence of cardiovascular diseases. Figure 3 shows two normal cardiac cycles.
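The timing figures above can be checked with a few lines of arithmetic. The sketch below only restates the nominal durations quoted in the text; the conversion to beats per minute (60 divided by the cycle length in seconds) is a standard relation added here for illustration, and the function names are hypothetical.

```python
# Nominal durations quoted in the text, in seconds.
SYSTOLE_S = 0.35
DIASTOLE_S = 0.45


def cycle_duration(systole_s: float, diastole_s: float) -> float:
    """Length of one cardiac cycle: systole plus diastole."""
    return systole_s + diastole_s


def heart_rate_bpm(cycle_s: float) -> float:
    """Beats per minute implied by a cycle length in seconds."""
    return 60.0 / cycle_s


cycle = cycle_duration(SYSTOLE_S, DIASTOLE_S)
print(f"cycle = {cycle:.2f} s, heart rate = {heart_rate_bpm(cycle):.0f} bpm")
print(f"systole share of the cycle = {SYSTOLE_S / cycle:.1%}")
```

A nominal 0.8 sec cycle therefore corresponds to about 75 beats per minute, with systole taking somewhat less than half of the cycle.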

Heart sound diagnosis by manual auscultation is a qualitative method that relies entirely on the experience of the expert in judging the tone and intensity of the heart sounds. Computer-aided detection techniques can instead yield quantified characteristic parameters, which help to establish the relationship between heart sounds and the related diseases. They also support the subsequent traceability of data and the construction of databases. Research on computer-aided heart sound analysis is therefore significant for the noninvasive diagnosis of cardiovascular disease.

3. Computer-Aided Heart Sound Detection Techniques

The computer-aided processing of heart sounds includes denoising [7], segmentation [8], feature extraction and classification [9].

3.1. Denoising

Due to the influence of the external environment, heart sound signals are usually coupled with electromagnetic interference, power frequency interference, random noise, interference from the human body, breath sounds, and lung sounds [10]. The diagnostic accuracy of the detection is directly affected by the quality of the signals and the features extracted subsequently. Consequently, denoising is the first essential step to improve the automatic detection accuracy of heart sounds. The techniques used for heart sound denoising include discrete wavelet transform (DWT), adaptive filtering denoising, singular value decomposition (SVD), etc. In addition, combined methods are applied for better effects, which help to improve the signal quality and detection accuracy.

Jain et al. [11] proposed a DWT-based PCG signal denoising algorithm that uses the “Coif-5” wavelet as the mother wavelet and combines an adaptive threshold estimation method, a nonlinear intermediate function method and a genetic algorithm to optimize the traditional DWT algorithm. The improved algorithm eliminated the out-of-band noise and removed the lower detail level coefficients, further improving the denoising performance. Mondal et al. [12] introduced a novel heart sound denoising method based on a combined framework of wavelet packet transform and SVD. The most information-rich nodes in the wavelet tree were selected according to a mutual information criterion, and the noise component of the heart sound signals was suppressed by applying SVD to the coefficients of the selected nodes. Ali et al. [13] selected different DWT families, threshold types, and signal decomposition levels to denoise the heart sound signals, and evaluated the influence of different wavelet functions and wavelet decomposition levels on the efficiency of the denoising algorithm. They concluded that the Db10 wavelet and the discrete Meyer wavelet with a fourth-level decomposition obtain the maximum SNR (signal-to-noise ratio) and the minimum RMSE (root mean square error) for the standard heart sounds. Zheng et al. [14] proposed an innovative denoising framework based on a combination of modified SVD and Compressed Sensing (CS), which maintains the original morphological characteristics of heart sounds well. Compared with traditional techniques such as DWT and empirical mode decomposition (EMD), this framework obtained a larger SNR, and the denoised signals retained the highest correlation with the original heart sound signals. Deng and Han [15] proposed an adaptive denoising algorithm that achieved a better denoising effect than the conventional wavelet method.
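The general idea behind wavelet threshold denoising, and the SNR and RMSE figures used to evaluate it, can be illustrated with a minimal sketch. It is not the method of any of the cited works: it assumes the Haar wavelet (chosen only because it can be written in a few lines, in place of Coif-5 or Db10), a four-level decomposition, soft thresholding with the universal threshold, and a synthetic two-burst signal standing in for S1 and S2. All function names, the sampling rate and the noise level are hypothetical.

```python
import math
import random


def haar_dwt(x):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(x, levels=4):
    """Decompose, threshold the detail coefficients, reconstruct.
    len(x) must be divisible by 2 ** levels."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)  # details[0] is the finest scale
    # Noise level estimated from the median absolute finest-scale detail.
    finest = sorted(abs(c) for c in details[0])
    sigma = finest[len(finest) // 2] / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))  # universal threshold
    for d in reversed(details):
        approx = haar_idwt(approx, soft_threshold(d, t))
    return approx


def rmse(ref, est):
    """Root mean square error between a reference and an estimate."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref))


def snr_db(ref, est):
    """Signal-to-noise ratio in dB, treating (ref - est) as the noise."""
    signal = sum(r * r for r in ref)
    noise = sum((r - e) ** 2 for r, e in zip(ref, est))
    return 10.0 * math.log10(signal / noise)


# Synthetic stand-in for a PCG (an assumption, not recorded data):
# two 50 Hz bursts 0.35 s apart, sampled at 4 kHz, plus white noise.
FS, N = 4000, 4096


def burst(n, center_s, amp):
    t = n / FS - center_s
    return amp * math.exp(-t * t / (2 * 0.02 ** 2)) * math.sin(2 * math.pi * 50 * t)


clean = [burst(n, 0.20, 1.0) + burst(n, 0.55, 0.7) for n in range(N)]
rng = random.Random(0)
noisy = [c + rng.gauss(0.0, 0.2) for c in clean]
denoised = denoise(noisy, levels=4)

print(f"SNR : {snr_db(clean, noisy):.1f} dB -> {snr_db(clean, denoised):.1f} dB")
print(f"RMSE: {rmse(clean, noisy):.3f} -> {rmse(clean, denoised):.3f}")
```

On real recordings the clean reference is not available, which is why the cited studies evaluate on standard heart sounds with added noise. A smoother wavelet than Haar would normally be preferred for oscillatory signals of this kind.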

3.2. Segmentation

Segmentation is often performed on the raw signal or the denoised signal. Its purpose is to find the beginning and end of the heart sounds and to segment S1, S2, systole, and diastole for the subsequent feature extraction. To date, the methods used for heart sound segmentation mainly include hidden Markov models (HMM), wavelet transform (WT), and correlation coefficient matrices. Table 2 summarizes some of the heart sound segmentation literature of the past five years.
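To make the segmentation task concrete, the following sketch locates heart sounds with a simple energy envelope and a threshold, then labels them using the fact stated in Section 2 that systole is shorter than diastole. This is a deliberately simplified stand-in, not an HMM, WT or correlation-based method from the literature in Table 2. The synthetic signal, the sampling rate, the window length, the threshold ratio and all function names are assumptions made for illustration.

```python
import math


def envelope(x, win):
    """Moving average of the squared signal (a simple energy envelope)."""
    sq = [v * v for v in x]
    out, acc = [], 0.0
    for i, v in enumerate(sq):
        acc += v
        if i >= win:
            acc -= sq[i - win]
        out.append(acc / win)
    return out


def find_sounds(env, threshold, min_gap):
    """(start, end) sample indices of runs above threshold.
    Runs separated by fewer than min_gap samples are merged."""
    runs, start = [], None
    for i, v in enumerate(env):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append([start, i])
            start = None
    if start is not None:
        runs.append([start, len(env)])
    merged = []
    for a, b in runs:
        if merged and a - merged[-1][1] < min_gap:
            merged[-1][1] = b
        else:
            merged.append([a, b])
    return [(a, b) for a, b in merged]


def label_sounds(runs):
    """Label alternating sounds S1/S2 (needs at least three runs).
    The sound followed by the shorter interval (systole) is S1."""
    mids = [(a + b) / 2 for a, b in runs]
    gaps = [m2 - m1 for m1, m2 in zip(mids, mids[1:])]
    even = sum(gaps[0::2]) / len(gaps[0::2])
    odd = sum(gaps[1::2]) / len(gaps[1::2])
    first, second = ("S1", "S2") if even < odd else ("S2", "S1")
    return [first if i % 2 == 0 else second for i in range(len(runs))]


# Synthetic two-cycle signal (an assumption, not recorded data):
# 0.35 s systole, 0.45 s diastole, sampled at 1 kHz.
FS = 1000


def burst(n, center, amp):
    t = n - center
    return amp * math.exp(-t * t / (2 * 15.0 ** 2)) * math.sin(2 * math.pi * 50 * t / FS)


events = [(100, 1.0), (450, 0.8), (900, 1.0), (1250, 0.8)]
signal = [sum(burst(n, c, a) for c, a in events) for n in range(1600)]

env = envelope(signal, win=40)
runs = find_sounds(env, threshold=0.2 * max(env), min_gap=20)
labels = label_sounds(runs)
systole_s = ((runs[1][0] + runs[1][1]) - (runs[0][0] + runs[0][1])) / 2 / FS

for (a, b), name in zip(runs, labels):
    print(f"{name}: {a / FS:.3f} s to {b / FS:.3f} s")
print(f"estimated systole: {systole_s:.3f} s")
```

A fixed threshold of this kind is fragile in the presence of murmurs or noise, which is one reason probabilistic models such as HMM are used for segmentation.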

3.3. Feature Extraction and Classification

The goal of feature extraction is to find a small number of representative features to replace the high-dimensional raw signals. In general, a classification model trained on features is more efficient and accurate than one trained on raw signals. Feature extraction is often performed on the segmented signal; without segmentation, it can be conducted on the raw signal or the denoised signal. DWT, continuous wavelet transform (CWT), short-time Fourier transform (STFT) and Mel-frequency cepstral coefficients (MFCC) are commonly used methods for heart sound feature extraction.
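As an illustration of how a time-frequency method reduces a signal to a few features, the sketch below computes a magnitude STFT with a Hann window and sums it into a handful of frequency bands. The direct DFT is used only to keep the example free of dependencies. The band edges, frame length, hop size and function names are assumptions, and the band energies are a simple stand-in for richer features such as MFCC.

```python
import cmath
import math


def stft_magnitude(x, frame_len, hop):
    """Magnitude STFT with a Hann window.
    Returns one list of frame_len // 2 + 1 bin magnitudes per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        seg = [x[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def band_energies(mags, fs, frame_len, edges):
    """Sum squared bin magnitudes into bands [edges[i], edges[i + 1]) in Hz."""
    feats = []
    for lo, hi in zip(edges, edges[1:]):
        feats.append(sum(m * m for k, m in enumerate(mags)
                         if lo <= k * fs / frame_len < hi))
    return feats


# A 50 Hz test tone sampled at 1 kHz (an assumption, not recorded data).
FS = 1000
tone = [math.sin(2 * math.pi * 50 * n / FS) for n in range(512)]

frames = stft_magnitude(tone, frame_len=100, hop=50)
peak_bins = [m.index(max(m)) for m in frames]
features = band_energies(frames[0], FS, 100, edges=[0, 25, 75, 150, 500])

print("peak frequency per frame (Hz):", [b * FS / 100 for b in peak_bins])
print("band energies of first frame:", [round(f, 2) for f in features])
```

Each 100-sample frame is reduced from 100 values to four band energies, which is the kind of dimensionality reduction the paragraph above describes.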

Classification can be performed on the features, the raw signals or the denoised signals. The goal of classification is to present the qualitative result of the detection by dividing the heart sound signals into normal and abnormal. The classification techniques for heart sounds include HMM, Support Vector Machine (SVM), Artificial Neural Networks (ANN), k-Nearest Neighbor (kNN), Euclidean distance, etc. Table 3 lists the representative literature on the feature extraction and classification of heart sound signals over the past five years.

These techniques (SVM, kNN, BP neural networks, and logistic regression) are all forms of machine learning, a class of algorithms that allows computer systems to analyze data and improve their performance based on patterns and experience, without explicit programming. In recent years, machine learning has been widely used in heart sound classification. As the incidence of cardiovascular disease increases, the amount of heart sound data to be processed is also increasing. Deep learning algorithms have emerged to maintain classification accuracy while processing such large amounts of data.
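Of the classifiers listed above, kNN with Euclidean distance is the simplest to write out in full. The sketch below is a minimal version. The two features (a systole duration in seconds and a murmur-band energy ratio) and every training value are invented purely for illustration and are not taken from any dataset or from the literature in Table 3.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training vectors closest to query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors: (systole duration in s, murmur energy ratio).
train_x = [(0.35, 0.05), (0.34, 0.08), (0.36, 0.04),
           (0.30, 0.40), (0.33, 0.55), (0.28, 0.35)]
train_y = ["normal", "normal", "normal",
           "abnormal", "abnormal", "abnormal"]

for query in [(0.35, 0.06), (0.31, 0.45)]:
    print(query, "->", knn_predict(train_x, train_y, query, k=3))
```

In practice the features would be scaled to comparable ranges first, since Euclidean distance is otherwise dominated by the feature with the largest numeric spread.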

4. Application of Deep Learning in Heart Sound Classification

Deep learning is a branch of machine learning that imitates the workings of the human brain through artificial neural networks, complex algorithms inspired by the brain itself. It can therefore extract the characteristics of the original signals automatically and discover regularities in the data through deeper learning than traditional machine learning, thereby improving the accuracy and efficiency of classification. The concept of deep learning was proposed by Hinton et al. [49] in 2006. Deep learning exploits spatial relationships in the data and combines low-level models to form more complex high-level models, which greatly improves the training performance of the system. In recent years, it has shown good practicality and reliability in speech recognition [50], image recognition [51], biomedical data analysis [52, 53], signal processing [54], autonomous driving [55] and other areas. Deep learning models have been applied to classify heart sound signals; they mainly include Deep Neural Networks (DNN), Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Table 4 lists the representative literature on deep learning applied to the classification of heart sound signals over the past five years.
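How a CNN builds higher-level features from low-level ones can be seen in the forward pass of a very small network: a convolution detects local patterns, a ReLU keeps the positive responses, pooling summarizes them, and a logistic output turns the pooled features into a probability. The sketch below is illustrative only. The kernels and output weights are set by hand rather than learned, the input is a toy sequence, and none of it reproduces a model from Table 4.

```python
import math


def conv1d(x, kernel, bias=0.0):
    """Valid 1-D cross-correlation, as used in CNN layers."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(v):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, a) for a in v]


def max_pool(v, size):
    """Non-overlapping max pooling."""
    return [max(v[i:i + size]) for i in range(0, len(v) - size + 1, size)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tiny_cnn(x, kernels, out_weights, out_bias):
    """conv -> ReLU -> global max pool per kernel, then a logistic output."""
    feats = [max(relu(conv1d(x, k))) for k in kernels]
    z = sum(w * f for w, f in zip(out_weights, feats)) + out_bias
    return feats, sigmoid(z)


# Toy input and hand-set (untrained, hypothetical) parameters.
x = [0, 0, 1, 0, -1, 0, 0, 0]
kernels = [[1, 0, -1], [0.5, 0.5, 0.5]]

response = conv1d(x, kernels[0])
pooled = max_pool(relu(response), 2)
feats, prob = tiny_cnn(x, kernels, out_weights=[1.0, -0.5], out_bias=-1.0)

print("conv response:", response)
print("after ReLU and pooling:", pooled)
print("pooled features:", feats, "output probability:", round(prob, 3))
```

In a trained network the kernel values are learned from labeled heart sound data, which is what removes the need for hand-designed features.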

Deep learning has shown clear advantages in the computer-aided classification of heart sound signals, but it also faces some challenges. First, deep learning models have a very large number of parameters to optimize, which entails long execution times and large training data sets. Second, deep learning modelling calls for high-end computers with powerful CPUs and GPUs, so the experimental cost is high and the models are unsuitable for home computers and microcomputers. Nevertheless, portable heart sound devices have great development potential and good application prospects.
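The scale of the parameter problem is easy to estimate. A fully connected layer with n_in inputs and n_out outputs has n_in × n_out weights plus n_out biases. The sketch below applies this to a small network whose layer sizes are hypothetical (a 2000-sample input, two hidden layers and two output classes) and are not taken from any cited model.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical layer sizes: input, two hidden layers, two output classes.
sizes = [2000, 512, 128, 2]
params = dense_param_count(sizes)

print(f"parameters: {params:,}")
print(f"memory at 32-bit floats: {params * 4 / 2 ** 20:.2f} MiB")
print(f"memory at 8-bit weights: {params * 1 / 2 ** 20:.2f} MiB")
```

Even this modest network has more than a million parameters, almost all of them in the first layer, which illustrates why reducing the input dimension or storing weights at lower precision matters for small devices.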

5. Conclusions

With the increasing incidence of cardiovascular diseases in recent years, greater attention has been drawn to noninvasive heart sound detection technology. This study has reviewed the latest research on computer-aided heart sound detection techniques over the last five years, with emphasis on the applications of deep learning to heart sound classification.

Regarding the potential contributions of the technology to human health promotion, the following areas for future research are recommended. First, a large amount of heart sound data is needed to supplement the heart sound database. Heart sound data are a reliable source of information for discovering the hidden features of cardiovascular diseases. It is therefore necessary to complete and improve the heart sound database and its corresponding expert annotations, for better model training and more accurate assisted diagnosis. Second, since large-scale computer systems are already available in hospitals, it has become feasible to build complex deep learning models that can process the heart sound data. The data processing and parameter optimization techniques therefore need more in-depth study. Third, deep learning modelling requires high-end computer configurations with powerful GPU support, but compressed deep learning algorithms can run on PCs or microcomputers. Since heart sound classification models based on compressed deep learning algorithms are more accurate than those based on traditional algorithms, further study of such models will help the popularization and application of portable heart sound detection.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Funding

This study was supported by the National Key Technologies R & D Program of China (No. 2016YFC0303101) and National Natural Science Foundation of Jilin Province, China (No. 20180101049JC). The authors thank the members of the projects committee for their help and thank anonymous reviewers and editors for their helpful comments.