Research Article  Open Access
JianJun Yan, YiQin Wang, Rui Guo, JinZhuan Zhou, HaiXia Yan, ChunMing Xia, Yong Shen, "Nonlinear Analysis of Auscultation Signals in TCM Using the Combination of Wavelet Packet Transform and Sample Entropy", Evidence-Based Complementary and Alternative Medicine, vol. 2012, Article ID 247012, 9 pages, 2012. https://doi.org/10.1155/2012/247012
Nonlinear Analysis of Auscultation Signals in TCM Using the Combination of Wavelet Packet Transform and Sample Entropy
Abstract
Auscultation signals are nonstationary in nature. The wavelet packet transform (WPT) has become a very useful tool for analyzing nonstationary signals. Sample entropy (SampEn) has recently been proposed as a measure for quantifying the regularity and complexity of time series data. WPT and SampEn were combined in this paper to analyze auscultation signals in traditional Chinese medicine (TCM). SampEns for WPT coefficients were computed to quantify the signals from qi- and yin-deficient, as well as healthy, subjects. The complexity of the signal can be evaluated with this scheme at different time-frequency resolutions. First, the voice signals were decomposed into approximation and detail WPT coefficients. Then, SampEn values for the approximation and detail coefficients were calculated. Finally, SampEn values with significant differences among the three kinds of samples were chosen as the feature parameters for the support vector machine to identify the three types of auscultation signals. The recognition accuracy rates were higher than 90%.
1. Introduction
TCM is considered a unique medical system because of its basic theories describing the physiology and pathology of the human body, disease etiology, diagnosis, and differentiation of symptom complexes. According to TCM theories, the zang-fu organs comprise the core of the human body as an organic entity in which tissues and sense organs are connected through a network of channels and collaterals (blood vessels). In TCM, the zang and fu organs are not simply anatomical substances; more importantly, they represent a generalization of the physiology and pathology of certain systems of the human body. The zang-fu comprise the five zang and six fu organs. The five zang are the heart, liver, spleen, lung, and kidney. The six fu are the gallbladder, stomach, large intestine, small intestine, bladder, and triple burner. When one falls ill, a dysfunction in the zang-fu organs may be reflected on the body's surface through the channels and their collaterals. At the same time, diseases involving body surface tissues may also affect their related zang or fu organs. Furthermore, the affected zang or fu organs may influence each other through internal connections [1]. Auscultation, one of the auscultation and olfaction methods in TCM diagnosis, is used to detect vocal changes reflecting the functional activities of the zang-fu organs and the abundance or decline of the qi, blood, and body fluid.
Auscultation was described as early as the Internal Classic of Huang Di [2], which provided the theoretical basis for clinical diagnosis by listening to vocal changes. However, complete acoustic diagnostic methods were not formulated at that time. After the Ming and Qing Dynasties, auscultation gradually attracted the attention of the medical field, and both its theoretical content and its clinical application developed considerably. Thus, a distinctive step-by-step diagnostic method was formed. In recent years, with the development of computer and signal processing technology, substantial progress has been made worldwide in the objective research of auscultation.
Mo performed a frequency spectral analysis of the voices of cough patients using a digital sonograph [3]. Wang and Yan performed a number of studies on the nonlinearity of the vowel /a/ signals of healthy persons and patients with deficiency syndrome by applying delay vector variance [4, 5]. These studies were effective attempts at objective auscultation research. Chiu et al. proposed four novel acoustic parameters, namely, the average number of zero crossings, variations in local peaks and valleys, variations in first and second formant frequencies, and the spectral energy ratio, to analyze and identify the characteristics among non-, qi-, and yin-deficient subjects [6].
There are several other studies on auscultation around the world [7–11]. These methods have provided a good basis for objective auscultation in clinical diagnosis. However, auscultation signal analysis and recognition are still at an initial stage. The experiments have been conducted on small sample databases, and the recognition results are not yet satisfactory, so further investigation based on these studies is necessary.
Changes between normal and abnormal voice signals correspond to changes in the spatial distribution of the voice signal energy, so variations in energy imply corresponding changes in signal characteristics. In other words, the different signal frequency components can represent the different physical properties of the measured signal [12, 13]. Compared with the traditional Fourier transform time-frequency analytical method, the wavelet transform (WT) can reveal more information on signals based on multiscale and multiresolution decomposition. Wavelet packets have recently been applied to analyse auscultation signals because of their capability of partitioning both low- and high-frequency bands, unlike the WT, which often fails to capture high-frequency information accurately [14–16].
Both approximate entropy (ApEn) and sample entropy (SampEn) can represent signal complexity and can be used in many biomedical fields. ApEn was proposed by Pincus and Goldberger [17] to compute quantitative information for experimental data. However, there are some weak points in the ApEn computation process: it is affected by a bias, and ApEn is inconsistent in some cases. Compared with ApEn, SampEn does not count self-matches and shows better relative consistency and less dependence on data length.
The Daubechies 4 (db4) wavelet is selected in this paper as the wavelet packet function to decompose the auscultation signals into 5-level wavelet packet coefficients. Then, SampEn is proposed as a feature parameter extracted from these coefficients to analyze the auscultation signals quantitatively. Furthermore, statistical analysis is conducted to obtain the effective feature parameters with significant differences for the recognition of the voice signals. Finally, these feature values are used as input vectors of the support vector machine (SVM) classifier for automatic identification of qi- and yin-deficient, as well as healthy, subjects.
2. Materials and Methods
Feature parameters of auscultation signals were extracted using a combination of WPT and SampEn (Figure 1). Traditional signal processing methods, including the Fourier transform (FT), fast Fourier transform (FFT), and short-time Fourier transform (STFT), cannot reveal the nonlinear information contained in a nonstationary signal. With this scheme, the nonlinear information of the auscultation signal can be extracted at different time-frequency resolutions.
2.1. WPT
Wavelets are generally crafted to have specific properties that make them useful for signal processing. The WT is capable of time-frequency analysis and can extract different frequency bands of a signal. However, as the scale increases, the higher the spatial resolution of the wavelet functions, the lower their frequency resolution. This is a drawback of the wavelet function. WPT was developed to adapt the underlying wavelet bases to the contents of a signal. The basic idea is to allow subband decomposition to select adaptively the best basis for a particular signal. By narrowing the wide frequency windows as the scale increases, the WPT overcomes this shortcoming of the WT.
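Given a finite energy signal whose scaling space is assumed as $U_0^0$, WPT can decompose $U_0^0$ into small subspaces $U_j^n$ in a dichotomous way (Figure 2).
$U_j^n$ denotes the $n$th subspace at the $j$th resolution level.
The dichotomous way is realised by the following recursive scheme:
$$U_j^n = U_{j+1}^{2n} \oplus U_{j+1}^{2n+1}, \quad j = 0, 1, 2, \ldots, \quad n = 0, 1, \ldots, 2^j - 1,$$
where $j$ is the resolution level and $\oplus$ denotes orthogonal decomposition. $U_j^n$, $U_{j+1}^{2n}$, and $U_{j+1}^{2n+1}$ are three closed spaces corresponding to $u_n(t)$, $u_{2n}(t)$, and $u_{2n+1}(t)$, respectively. $u_n(t)$ satisfies the following equations:
$$u_{2n}(t) = \sqrt{2} \sum_{k} h(k)\, u_n(2t - k), \qquad u_{2n+1}(t) = \sqrt{2} \sum_{k} g(k)\, u_n(2t - k),$$
where $h(k)$ and $g(k)$ are the coefficients of the low- and the high-pass filters, respectively. The sequence of functions $\{u_n(t)\}$ generated from a given function $u_0(t)$ is called the wavelet packet basis function.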
The voice signal is a kind of transient, nonstationary, and random signal. Therefore, Daubechies (db) wavelets have been widely used because of their advantage in matching the transient components in voice signals. Another main issue in wavelet analysis is the number of vanishing moments, which is determined by trial-and-error methods. As the number of vanishing moments increases, more points in the high frequencies can be neglected. Therefore, db wavelets with vanishing moments of 4, 6, 8, and 10 were chosen to decompose and reconstruct the voice signals in this study. The db4 wavelet function was selected after comparing the effects of the different wavelet functions in decomposing and reconstructing the voice signals, on the basis of its rate of decay and the smaller number of points that can be neglected.
The signal is decomposed into two subbands at the first level, namely, the low- and high-frequency subbands. Then, the low-frequency subband is further decomposed into lower- and higher-frequency parts at the following level, and the same is done for the high-frequency subband. The same decomposition is repeated at each subsequent level. In this way, the frequency subbands can be partitioned to be consistent with the signal features.
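The recursive splitting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the Haar filter pair is swapped in for db4 so that the sketch needs no external library, the input is a synthetic two-tone signal, and the function names `haar_split` and `wavelet_packet_tree` are hypothetical.

```python
import math


def haar_split(x):
    """Split a sequence into low-pass (approximation) and high-pass (detail) halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_tree(signal, max_level):
    """Return {level: [node coefficients, ...]} for levels 0..max_level.

    Unlike the plain wavelet transform, BOTH the approximation and the
    detail node are split again at every level.
    """
    tree = {0: [list(signal)]}
    for level in range(1, max_level + 1):
        nodes = []
        for parent in tree[level - 1]:
            approx, detail = haar_split(parent)
            nodes.extend([approx, detail])
        tree[level] = nodes
    return tree


# Toy input: 64 samples of a two-tone signal (assumed data, not the clinical recordings).
signal = [math.sin(0.2 * n) + 0.5 * math.sin(2.5 * n) for n in range(64)]
tree = wavelet_packet_tree(signal, max_level=5)
for level, nodes in tree.items():
    # level, number of subbands, coefficients per subband
    print(level, len(nodes), len(nodes[0]))
```

Each level doubles the number of nodes and halves the number of coefficients per node. The nodes here are stored in natural filter-bank order, which is not the same as strictly increasing frequency order; a wavelet packet library would normally offer a frequency-ordered view.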
2.2. SampEn
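SampEn examines time series for similar epochs and assigns a nonnegative number to the sequence, with larger values corresponding to greater complexity or irregularity in the data [18]. In contrast to the ApEn algorithm, self-matches are not included in calculating the probability in the SampEn algorithm. The length $m$ of the patterns to be compared and the tolerance window $r$ are the two input parameters, which must be set before computation. For a time series $x(1), x(2), \ldots, x(N)$, $N$ is the length of the time series. SampEn($m$, $r$, $N$) is computed as follows [18].
(1) The vectors $X_m(i) = [x(i), x(i+1), \ldots, x(i+m-1)]$, for $1 \le i \le N - m + 1$, are formed. These vectors represent $m$ consecutive values starting with the $i$th point.
(2) The distance between vectors $X_m(i)$ and $X_m(j)$, $d[X_m(i), X_m(j)]$, is defined as the absolute maximum difference between their components:
$$d[X_m(i), X_m(j)] = \max_{k = 0, \ldots, m-1} |x(i+k) - x(j+k)|.$$
(3) For a given $X_m(i)$, the number of $j$ ($1 \le j \le N - m$, $j \ne i$), denoted as $B_i$, is counted such that the distance between $X_m(i)$ and $X_m(j)$ is less than or equal to $r$. Then, for $1 \le i \le N - m$,
$$B_i^m(r) = \frac{1}{N - m - 1} B_i.$$
(4) $B^m(r)$ is defined as
$$B^m(r) = \frac{1}{N - m} \sum_{i=1}^{N-m} B_i^m(r).$$
(5) The dimension is increased to $m + 1$, and $A^m(r)$ is calculated in the same way, with $A_i$ counting the vectors $X_{m+1}(j)$ ($1 \le j \le N - m$, $j \ne i$) within $r$ of $X_{m+1}(i)$:
$$A_i^m(r) = \frac{1}{N - m - 1} A_i, \qquad A^m(r) = \frac{1}{N - m} \sum_{i=1}^{N-m} A_i^m(r).$$
Thus, $B^m(r)$ is the probability that two sequences will match for $m$ points, whereas $A^m(r)$ is the probability that two sequences will match for $m + 1$ points. Finally, SampEn can be defined as
$$\mathrm{SampEn}(m, r) = \lim_{N \to \infty} \left\{ -\ln \left[ \frac{A^m(r)}{B^m(r)} \right] \right\}.$$
This value is estimated by the statistic
$$\mathrm{SampEn}(m, r, N) = -\ln \left[ \frac{A^m(r)}{B^m(r)} \right].$$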
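The steps above can be sketched directly in Python. This is a minimal, unoptimized illustration rather than the code used in the study: the function name `sample_entropy` is hypothetical, the tolerance `r` is passed in absolute units, and the two test series (a sine wave and seeded Gaussian noise) are assumed toy data. Because both template lengths use the same number of templates, the normalization factors cancel and only the raw match counts are needed.

```python
import math
import random
import statistics


def sample_entropy(x, m, r):
    """SampEn(m, r, N) = -ln(A / B) for a sequence x with absolute tolerance r.

    B counts pairs of length-m templates within distance r; A does the same
    for length m + 1. Self-matches are excluded (only pairs i < j are counted).
    """
    n = len(x)
    n_templates = n - m  # same number of templates for both lengths

    def count_matches(length):
        count = 0
        for i in range(n_templates):
            for j in range(i + 1, n_templates):
                # Chebyshev (maximum absolute) distance between two templates
                dist = max(abs(x[i + k] - x[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined: no matches at this tolerance
    return -math.log(a / b)


# Toy data (assumed): a regular series and an irregular one.
rng = random.Random(0)
sine = [math.sin(0.3 * i) for i in range(300)]
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]
for name, series in (("sine", sine), ("noise", noise)):
    tolerance = 0.2 * statistics.pstdev(series)  # r expressed as a fraction of the SD
    print(name, sample_entropy(series, m=2, r=tolerance))
```

The regular sine series should yield a noticeably lower value than the noise series, which matches the interpretation of SampEn as a measure of irregularity. The double loop is quadratic in the series length, so a vectorized or tree-based variant would be preferable for long recordings.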
2.3. SVM
SVM is a useful machine learning technique that has been successfully applied to classification. Classifying data is a common task in machine learning. In most cases, the data to be classified are not linearly separable but are nonlinearly separable, in which case a nonlinear support vector classifier can be used. The main idea is to transform the original data into a high-dimensional feature space. Thus, even though the classifier is a hyperplane in the high-dimensional feature space, it may be nonlinear in the original input space [19].
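The inner product $x_i \cdot x_j$ is replaced by a kernel function $K(x_i, x_j)$ to construct a nonlinear support vector classifier. The following are some commonly used kernel functions:
polynomial (homogeneous): $K(x_i, x_j) = (x_i \cdot x_j)^d$;
polynomial (inhomogeneous): $K(x_i, x_j) = (x_i \cdot x_j + 1)^d$;
radial basis function: $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, for $\gamma > 0$;
Gaussian radial basis function: $K(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$;
hyperbolic tangent: $K(x_i, x_j) = \tanh(\kappa\, x_i \cdot x_j + c)$, for some $\kappa > 0$ and $c < 0$.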
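The goal of SVM is to produce a model that predicts the target values of data instances in the test set for which only the attributes are given. The following decision function is applied to determine which class a sample $x$ belongs to:
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i^{*} y_i K(x_i, x) + b^{*} \right),$$
where $x_i$ are the training samples with class labels $y_i$. The parameters $\alpha_i^{*}$ and $b^{*}$ are the optimum solutions of the SVM training problem.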
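As a rough illustration of how a kernel decision function of this kind is evaluated, the sketch below implements an RBF kernel and a two-class decision rule in plain Python. The support vectors, labels, multipliers, bias, and kernel parameter are hand-picked, hypothetical values (not a trained model), and the function names are assumptions made for this example. The study itself distinguishes three classes; multiclass SVM tools typically combine several such binary decisions.

```python
import math


def rbf_kernel(u, v, gamma):
    """K(u, v) = exp(-gamma * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def decision(x, support_vectors, labels, alphas, bias, gamma):
    """Sign of sum_i alpha_i * y_i * K(x_i, x) + b, returned as +1 or -1."""
    score = sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    score += bias
    return 1 if score >= 0 else -1


# Hypothetical, hand-picked values for illustration only.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

print(decision((0.1, 0.2), support_vectors, labels, alphas, bias, gamma=0.5))
print(decision((1.9, 2.1), support_vectors, labels, alphas, bias, gamma=0.5))
```

A point close to the negative support vector is assigned to class -1 and a point close to the positive one to class +1, because the RBF kernel decays with squared distance.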
2.4. Clinical Data
Based on TCM theory and clinical practice, qi-deficient patients exhibit the following characteristics: dispirited spirit, lack of qi and no desire to speak, low spirits, a low voice, dizziness, palpitations, sweating, a weak and tender tongue, and a feeble pulse. By contrast, yin-deficient patients are characterised as follows: emaciation, feverish sensation over the five centres, hot flushes, night sweats, and dry stool, among others. The voice signals were collected from subjects of different ages and sexes. The detailed information is listed in Tables 1 and 2.


All these data were collected by our research partner, the TCM Syndrome Laboratory of the Shanghai University of Traditional Chinese Medicine, in its affiliated hospitals, including the Longhua Hospital and the Shuguang Hospital. The voice was recorded using a high-performance microphone (brand AKG, model HSD171) and a 16-bit A/D converter connected to a computer. The frequency response range of the microphone is 60 Hz to 17 kHz. Its sensitivity is 1 mV/Pa (−60 dBV) with an impedance of 600 ohms. The sampling frequency is 16 kHz. All the voice samples were collected by an acquisition system developed in Visual C++ 6.0. An endpoint detection algorithm was applied to remove the nonvoice portions at the beginning and end of each utterance.
The vowel /a/ was chosen as the utterance. Each subject produced a stable phonation of a sustained English vowel /a/ lasting about one second. This vowel was chosen because both patients and healthy subjects can pronounce it easily. In addition, the vocal organs are not in contact, and there is no obstruction in the cavity when this vowel is pronounced [20]. The airflow is unblocked, and a periodic waveform can be produced. The time-domain plot and spectrum of the vowel /a/ are shown in Figure 3.
2.5. Processing of Voice Signal Using WPT
In the first stage of sample identification, the voice signals of the three kinds of samples were analyzed using WPT. Five levels of wavelet packet decomposition were applied as the preprocessing step for all subjects. With a sampling frequency of 16 kHz, the maximum frequency of the original signal is 8 kHz, so the frequency interval of each band at the fifth level is 8 kHz / 2^5 = 250 Hz.
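The band widths quoted above follow from halving the 0 to 8 kHz range once per decomposition level. A minimal sketch of that arithmetic is given below; the helper name `subband_edges` is hypothetical, and the sketch assumes ideal, equal-width subbands listed in increasing frequency order.

```python
def subband_edges(sampling_rate_hz, level):
    """Return (low, high) edges in Hz of the 2**level equal-width subbands."""
    nyquist = sampling_rate_hz / 2.0
    width = nyquist / (2 ** level)
    return [(k * width, (k + 1) * width) for k in range(2 ** level)]


for level in range(1, 6):
    edges = subband_edges(16000, level)
    # level, number of subbands, width of each subband in Hz
    print(level, len(edges), edges[0][1] - edges[0][0])
```

At the fifth level this gives 32 subbands, each 250 Hz wide, covering 0 Hz to 8 kHz.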
2.6. The SampEn Computation
In the second stage, SampEn values of the approximation and detail coefficients at each level of the wavelet packet decomposition were computed for the voice signals of the healthy subjects, as well as the yin- and qi-deficient patients. In choosing the optimum parameters $m$ and $r$, Pincus suggested $m = 2$ and $r = 0.1$ to $0.25\,\mathrm{SD}$, where $\mathrm{SD}$ is the standard deviation of the original signal $x(i)$, $i = 1, \ldots, N$. One of the original signals was chosen and analysed using different $m$ and $r$ values to better illustrate the advantages of the choice. The results are shown in Figures 4 and 5. The difference in the SampEn values among the signals of the three kinds of samples was the largest in Figure 5, which indicates that the choice of the $m$ value is appropriate. The SampEn value also decreased as the parameter $r$ increased, although to a lesser degree. Therefore, $r$ is selected as 0.2.
3. Results and Discussion
3.1. Results on SampEn Values for WPT Coefficients
Voice signals from qi- and yin-deficient, as well as healthy, subjects were decomposed into subbands using WPT. The frequency bands for these subbands were as follows: 2 subbands at the first level (the frequency interval is 4 kHz), 4 subbands at the second level (2 kHz), 8 subbands at the third level (1 kHz), 16 subbands at the fourth level (0.5 kHz), and 32 subbands at the fifth level (0.25 kHz). SampEn values of the approximation and detail coefficients under the five-level WPT decomposition were computed using the parameters selected in Section 2.6.
The average SampEn values for the coefficients of levels 1–5 are illustrated in Figures 6(a)–6(e). The differences between healthy and qi- or yin-deficient samples are relatively high, except in 0–0.5 kHz and 7.5–8 kHz at the fourth level and in 0.25–0.5 kHz, 7.5–7.75 kHz, and 7.75–8 kHz at the fifth level. However, the differences between the qi- and yin-deficient samples are relatively low apart from the following frequency ranges: 0 kHz to 8 kHz in levels 1–5.
Figures 6(a)–6(e) also show that, with increasing wavelet packet levels, the frequency bands become finer. At the same time, more of the feature information contained in the voice signal is represented. Slight changes that cannot be reflected at low scales are represented at high scales. Furthermore, the SampEn values for qi-deficient, yin-deficient, and healthy samples tend to be higher as frequency increases. The SampEn values of qi-deficient samples are lower than those of yin-deficient samples in most of the frequency bands of 0–4 kHz in levels 1–5, while the SampEn values for qi- and yin-deficient samples are intertwined in 4–8 kHz.
3.2. Statistical Analysis
Statistical analysis software, SPSS 20, was applied to analyse the differences among the samples. All SampEn values of the WPT coefficients from the first to the fifth levels were analyzed to obtain the features with significant differences among the three groups of samples. Tables 3, 4, and 5 show that 47 frequency bands from levels 1 to 5 had SampEn values with significant differences.



3.3. Classification Analysis
LibSVM 2.93 was used to identify the auscultation signals. The feature parameters with significant differences (47 features in different bands) were chosen as the input vectors, formatted as required by LibSVM. The SVM type is C-SVC, and the RBF function was chosen as the kernel function for nonlinear training and testing after numerous experiments. The optimum parameters $C$ and $\gamma$ were obtained as 0.25 and 0.0625 using cross-validation ($C$ is the penalty factor, and $\gamma$ is the parameter of the kernel function). Table 6 shows the classification results using SVM, in which a good result for classifying the samples (up to 96%) was obtained. This finding indicates that the method applied in this paper is effective.
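For readers unfamiliar with the LibSVM input format, the sketch below shows how a feature vector might be written as one line of a LibSVM data file, that is, a class label followed by `index:value` pairs with indices starting at 1. The helper name `to_libsvm_line`, the class labels 0, 1, and 2, and the feature values are hypothetical and chosen only for illustration; real input would carry the 47 selected SampEn features per sample.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...' (indices start at 1)."""
    pairs = " ".join(f"{i}:{v:.6g}" for i, v in enumerate(features, start=1))
    return f"{label} {pairs}"


# Hypothetical labels and SampEn feature values, for illustration only.
samples = [
    (0, [0.412, 1.305, 0.987]),
    (1, [0.298, 1.120, 0.901]),
    (2, [0.305, 1.188, 0.934]),
]
for label, features in samples:
    print(to_libsvm_line(label, features))
```

With LibSVM's command-line tools, a C-SVC with an RBF kernel is usually requested through `svm-train -s 0 -t 2 -c <C> -g <gamma>`; the exact invocation used in the study is not given in the text, so this is stated only as the typical usage.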

3.4. Discussion
The quantitative analysis of the speech of healthy persons and deficient patients is one of the important tasks in the objectification and modernization of auscultation in TCM. The voices of healthy people are natural, gentle, clear, fluent, and understandable, while patients with deficiency syndrome speak feebly, in a low voice, and discontinuously. The SampEn values of healthy samples are higher than those of qi- or yin-deficient samples in most frequency bands. This may indicate that healthy persons have greater physiological adaptability than patients with deficiency syndrome. The variation trends of the SampEn values in the qi- and yin-deficient samples were similar, perhaps because both qi- and yin-deficient subjects belong to the deficiency syndrome, and the differences in voice signal characteristics between them are not pronounced. The classification result demonstrated that the SVM classifier was effective for the identification of the auscultation signals. Therefore, auscultation analysis based on WPT-SampEn-SVM was suitable for the identification of qi- and yin-deficient, as well as healthy, subjects.
4. Conclusions
In this paper, we proposed a new method for identifying auscultation signals in TCM from three kinds of samples, namely, qi- and yin-deficient, as well as healthy, samples. Instead of solely using traditional time or frequency domain features, we applied the nonlinear dynamic parameter SampEn together with a time-frequency analysis method, the wavelet packet transform, to obtain our feature parameters. Wavelet packets are specifically used because of their capability to partition both low- and high-frequency signals. At the same time, SampEn, a statistical parameter used to measure the predictability of the current amplitude values of a physiological signal, is adopted in our research to analyze the signals from the three kinds of samples. Experimental results illustrated that WPT-SampEn-SVM-based analysis was suitable for the identification of qi- and yin-deficient, as well as healthy, subjects. Our future research will aim to improve the performance of identifying deficient patients by analyzing the SampEn variability of the signals of reconstructed coefficients in different frequency bands of each level. In addition, the clinical sample size will be extended for the verification of our methods.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grants no. 30701072, 81173199, and 30901897) and the Shanghai 3rd Leading Academic Discipline Project (Grant no. S30302).
References
[1] Y. Q. Wang, Diagnostics of Traditional Chinese Medicine, Higher Education Press, Beijing, China, 2006.
[2] S. G. Zhao, “A modem research overview on the auscultation diagnosis of TCM,” Chinese Journal of the Practical Chinese with Modern Medicine, vol. 14, pp. 1218–1220, 2008.
[3] X. M. Mo, “Preliminary study on making use of sound ring instrument for the diagnosis of deficiency syndrome of the lung with cough,” Journal of Traditional Chinese Medicine Research, vol. 3, pp. 43–44, 1987.
[4] H. J. Wang, J. J. Yan, Y. Q. Wang, F. Li, and R. Guo, “Digital technology for objective auscultation in traditional Chinese medical diagnosis,” in Proceedings of the International Conference on Audio, Language and Image Processing (ICALIP '08), pp. 1100–1104, July 2008.
[5] J. J. Yan, Y. Q. Wang, H. J. Wang et al., “Nonlinear analysis in TCM acoustic diagnosis using delay vector variance,” in Proceedings of the 2nd International Conference on Bioinformatics and Biomedical Engineering (iCBBE '08), pp. 2099–2102, May 2006.
[6] C. C. Chiu, H. H. Chang, and C. H. Yang, “Objective auscultation for traditional Chinese medical diagnosis using novel acoustic parameters,” Computer Methods and Programs in Biomedicine, vol. 62, no. 2, pp. 99–107, 2000.
[7] X. M. Mo and Y. S. Zhang, “The current situation and prospect of auscultation research in TCM,” Foundation Medical Journal of Traditional Chinese Medicine, vol. 4, no. 1, 1998.
[8] X. H. Zhang, Concise Sound Medical, People's Health Publishing House, 1985.
[9] A. H. Tewfik, D. Sinha, and P. Jorgensen, “On the optimal choice of a wavelet for signal representation,” IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 747–765, 1992.
[10] C. H. Horng, The Principles and Methods of Diagnostics, The Illustrations of Chinese Medicine, chapter 4, Lead Press, Taipei, Taiwan, 1993.
[11] X. L. Wang, First exploration of the smell diagnosis on tuberculosis, Summary of Graduate Student Thesis, 1992.
[12] G. Cui, X. Cao, and X. Zhang, “Analysis of biological data with digital signal processing,” in Proceedings of the IEEE 7th Workshop on Multimedia Signal Processing (MMSP '05), Shanghai, China, November 2005.
[13] C. Roberto, Modern Digital Signal Process, Tshinghua University Press, Beijing, China, 2004.
[14] B. C. Li and J. S. Luo, Wavelet Analysis and Its Applications, Electronics Engineering Press, Beijing, China, 2003.
[15] X. H. Tang and Q. L. Li, Time-Frequency Analysis and Wavelet Transform, Science Press, Beijing, China, 2008.
[16] L. H. Yang, D. Q. Dai, and W. L. Huang, Wavelet Tour to Signal Processing, China Machine Press, Beijing, China, 2002.
[17] S. M. Pincus and A. L. Goldberger, “Physiological time-series analysis: what does regularity quantify?” American Journal of Physiology, vol. 266, no. 4, pp. H1643–H1656, 1994.
[18] J. S. Richman and J. R. Moorman, “Physiological time-series analysis using approximate and sample entropy,” American Journal of Physiology, vol. 278, no. 6, pp. H2039–H2049, 2000.
[19] V. N. Vapnik, Statistics Learning Theory, Wiley, New York, NY, USA, 1998.
[20] L. Z. Hou and D. M. Han, “Selection of the vowel sound in the throat sound detection,” Journal of Audiology and Speech Diseases, vol. 10, p. 16, 2002.
Copyright
Copyright © 2012 JianJun Yan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.