Cough is a common symptom of many respiratory diseases. The medical literature emphasizes that a system for the automatic, objective, and reliable detection of cough events is important and promising for assessing disease severity in chronic cough. To track the development of audio-based cough monitoring systems, we briefly describe the history of objective cough detection and then illustrate how cough sounds are generated. We analyze the probable endpoints of clinical cough studies, including cough frequency, cough intensity, and the acoustic properties of cough sounds. Finally, we introduce some successful cough monitoring devices and their recognition algorithms in detail. We conclude, first, that the acoustic variability of cough sounds within and between individuals makes it difficult to assess cough intensity; second, that great progress is being made in audio-based cough detection; and third, that accurate, portable, objective monitoring systems are likely to become available and widely used in home care and clinical trials in the near future.

1. Introduction

Cough is a common but complicated symptom of respiratory diseases. It is also one of the main reasons why people seek medical advice in America and China [1, 2]. Even though the importance of cough diagnosis has been widely acknowledged by academic organizations [2–4] in recent years, there is no gold standard for assessing cough because objective and accurate measures of cough frequency and severity are lacking [5]. When cough becomes chronic, it is so unpleasant and distressing that the quality of life of patients is significantly reduced [6]. Health care costs, medical consultations, and medication use hence become a heavy burden for them [7]. The assessment of cough severity is subjective at present: it relies on visual analogue scales (VAS), health-related quality of life (HRQOL), the Leicester cough questionnaire (LCQ), the cough-specific quality of life questionnaire (CQLQ), and so on [8, 9]. These tools have been validated in chronic or acute cough in clinical trials [9]. However, they are completed either by the patients themselves or by a parent [10], which conflicts with the standard that the primary outcome measure of a clinical trial should be objective. Moreover, some studies have shown that objective cough frequency measurements may correlate only weakly with subjective assessments of cough [11, 12], possibly because the tools apply different standards. The medical literature therefore indicates the need for an objective and reliable tool to measure the severity of cough [3].

As early as the 1950s, attempts were made to monitor cough objectively [13]. Since then, there have been three main ways to record cough. The first is based on airflow measurement at the mouth to obtain the flow dynamics of cough [14, 15]. However, this method is not suitable for continuous monitoring in the outpatient environment [16]. The second is based on the movement of the chest. For example, some researchers [17] invented a system that used an accelerometer placed on the volunteer’s chest wall to record cough events, but this system required researchers to count coughs manually. The third, the measurement of cough sounds, has become the most widespread because of advances in computer technology and the availability of portable digital sound recording devices [18].

The underlying disease determines the physical character of the cough sound [19]; cough has been described as dry, wet, loose, or whooping, depending on the amount of expectoration and sound quality. Therefore, methods based on cough sounds for counting and classifying cough events have been developed. This article focuses on audio-based methods and systems for the analysis and measurement of cough.

2. Cough Sound Basics

Figure 1 shows the airflow and acoustic signals of a cough. As shown in the figure, a classical cough usually starts with a deep inspiration, followed by glottis closure. During the glottis closure, the respiratory muscles contract against the closed glottis; the glottis then opens suddenly, producing a transient, fast expiratory airflow accompanied by the typical cough sound. Sometimes, several further partial glottis closures produce extra voiced sounds, which is also called a cough sequence [6, 20]. However, the origin of cough sounds is still unclear because the laryngeal structures and the resonance of the nasal and thoracic cavities are all involved in cough and their roles are uncertain to some extent [21].

The typical cough sound is usually divided into three phases (Figure 2) [3]: (1) an explosive expiration due to the sudden opening of the glottis, (2) an intermediate phase in which the cough sound attenuates, and (3) a voiced phase due to the closing of the vocal cords. In practice, a variety of cough patterns occur; for example, some cough sounds have only two phases (the intermediate phase and the voiced phase), and the explosive phase is usually prolonged in some diseases.

3. Endpoints of Objective Cough Assessment

Cough frequency evaluation is considered to be a gold standard for the objective evaluation of cough [8]. In addition, the intensity of coughing, the pattern of coughing, and the acoustic properties of cough sounds may be used as clinical endpoints [3, 8].

3.1. Cough Frequency

As Section 2 shows, the variety of cough patterns makes it difficult to identify and quantify cough. Nevertheless, coughing can be quantified in four different ways [22]:
(1) Explosive cough sounds: the number of characteristic explosive cough impulses.
(2) Cough seconds: the number of seconds that contain at least one explosive phase.
(3) Cough breaths: the number of breaths that include at least one cough.
(4) Cough epochs: the number of continuous bouts of coughing in which successive cough sounds are separated by no more than two seconds.
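The counting units above can be illustrated with a minimal Python sketch. It assumes that the onset times of the explosive phases (in seconds) have already been detected; the function name `cough_counts` and the input format are hypothetical and are not taken from the cited studies. Cough breaths are omitted because they require a respiratory signal in addition to the audio.

```python
import math
from typing import Dict, List


def cough_counts(onsets_s: List[float], max_gap_s: float = 2.0) -> Dict[str, int]:
    """Summarise explosive-phase onset times (in seconds) in three ways."""
    onsets = sorted(onsets_s)

    # (1) Explosive cough sounds: one count per detected explosive phase.
    explosive = len(onsets)

    # (2) Cough seconds: distinct one-second bins holding at least one onset.
    cough_seconds = len({math.floor(t) for t in onsets})

    # (4) Cough epochs: a new epoch starts whenever the gap to the
    # previous cough exceeds max_gap_s (two seconds by default).
    epochs = 0
    previous = None
    for t in onsets:
        if previous is None or t - previous > max_gap_s:
            epochs += 1
        previous = t

    return {
        "explosive_sounds": explosive,
        "cough_seconds": cough_seconds,
        "cough_epochs": epochs,
    }


if __name__ == "__main__":
    print(cough_counts([0.2, 0.9, 1.5, 5.0, 5.3, 12.0]))
```

The same recording can thus yield quite different totals depending on the unit chosen, which is why the unit should always be reported alongside a cough frequency.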

All of these units count cough events, and which of them is the most valid is still under investigation. Explosive cough sounds and cough seconds are closely linearly related under different circumstances, whereas cough epochs are less closely related to explosive cough sounds [6, 23]. As a result, current chronic cough frequency monitors usually use explosive cough sounds to evaluate cough frequency [22].

24-hour cough frequency was validated in a longitudinal observational study of 33 otherwise healthy subjects with acute cough [24]. 4-hour cough frequency was found to be responsive to improvements in cough severity following trials of therapy in 100 patients with chronic cough [25].

3.2. Cough Intensity

Chronic cough is a common condition associated with significant physical and psychological morbidity. However, health-related quality of life is only weakly related to cough frequency [26], which suggests that for some patients the intensity of coughing may be the more significant factor [27, 28]. The intensity of voluntary, induced, and spontaneous cough has been studied [29, 30], and peak cough flow rate, oesophageal pressure, and gastric pressure are important and relevant measures of cough intensity in patients with chronic cough [31]. The limitation of these indexes is that they are either invasive or impractical to measure in an ambulatory setting [8]. Cough sound is therefore a potential measure of cough intensity.

Cough intensity may be measured by cough sound power, peak energy, and mean energy [31, 32]. These indexes can be calculated over a time window of 0.5 second starting at the onset of phase 1, the explosive phase of the cough sound [32]. However, further research is needed to assess the responsiveness of cough sound as a measure of cough intensity.
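As a rough illustration of how such indexes might be computed, the following Python sketch takes a 0.5-second window starting at the onset of the explosive phase. The exact definitions used in [31, 32] are not given here, so the ones below are assumptions: power is the mean squared amplitude over the window, and peak and mean energy are the maximum and the average of the energies of short subframes (10 ms by default). The function name `cough_intensity` is hypothetical.

```python
from typing import Dict, Sequence


def cough_intensity(signal: Sequence[float], fs: int, onset_s: float,
                    window_s: float = 0.5, frame_s: float = 0.01) -> Dict[str, float]:
    """Assumed intensity indexes over a window starting at the explosive onset."""
    start = int(round(onset_s * fs))
    stop = min(len(signal), start + int(round(window_s * fs)))
    window = list(signal[start:stop])
    if not window:
        raise ValueError("the analysis window is empty")

    # Energy of each short subframe inside the window (assumed 10 ms long).
    frame_len = max(1, int(round(frame_s * fs)))
    frame_energy = [
        sum(x * x for x in window[i:i + frame_len])
        for i in range(0, len(window), frame_len)
    ]

    return {
        # Mean squared amplitude over the whole window.
        "power": sum(x * x for x in window) / len(window),
        # Largest and average subframe energy.
        "peak_energy": max(frame_energy),
        "mean_energy": sum(frame_energy) / len(frame_energy),
    }


if __name__ == "__main__":
    # Synthetic burst: 0.1 s at amplitude 2, then 0.4 s at amplitude 1.
    fs = 1000
    burst = [2.0] * 100 + [1.0] * 400 + [0.0] * 500
    print(cough_intensity(burst, fs, onset_s=0.0))
```

These values depend on microphone gain and placement, so they are only comparable between recordings made with a calibrated, fixed setup.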

3.3. Cough Patterns and Acoustic Properties of Cough Sounds

Cough patterns and the quality of a patient’s cough sound may carry useful information about the patient’s condition, and cough patterns and some cough sound features may serve as endpoints of clinical experiments [33]. However, further research is needed on the relationship between cough patterns (or the acoustic properties of cough sounds) and the underlying illness that triggers the cough.

4. Audio-Based Cough Monitoring Systems

In the analysis of cough sounds, researchers focus on two aspects: one is the study of cough recording and monitoring equipment and the other is the study of a cough sound processing algorithm.

As previously mentioned, cough frequency is the most valuable index of objective cough assessment [34]. Cough frequency monitoring is also the most mature of the current objective cough assessment technologies [22]. Manual counting of cough sounds remains the reference standard because the human ear outperforms other tools in counting cough events [35, 36]. Even so, manual counting is arduous and time consuming, which restricts its feasibility in larger-scale studies and clinical application [35]. Automatic monitoring of cough sounds is therefore the development trend in objective cough evaluation.

Currently, no accurate, fully automated cough detection system is available, because it is challenging to replicate the performance of the human ear in detecting cough sounds. Recent advances in digital storage devices and sound sensors have nevertheless made portable and accurate recording of cough sounds possible. Several cough monitors have been developed; they use audio signals alone (a microphone and/or a contact microphone) or in combination with other sensors such as accelerometers, pneumographic belts, electromyography electrodes, electrocardiography electrodes, induction plethysmography, and pulse oximeters (Figure 3). Drugman et al. [37] found that the audio microphone performed best among these sensors for cough detection. We therefore divide the existing cough frequency monitors into two groups: those that use audio signals only and those that use mixed signals.

4.1. Audio-Only Cough Monitors

Some cough monitors use audio only. The Hull Automatic Cough Counter (HACC, Castlefield Hospital, Hull, UK) uses a free-field microphone to record ambulatory sound for around 24 h (Figure 4(a)). After signal processing, it uses an artificial neural network (ANN) to detect coughs. The system labels coughs automatically, but the labelled coughs must still be counted manually. In a test of 33 patients with chronic cough, the HACC showed a sensitivity of 80% (range 55 to 100%) and a specificity of 96% [38]. Recordings longer than 24 h and further assessment under different conditions are still required.

The Leicester Cough Monitor (LCM, Glenfield Hospital, Leicester, UK) consists of a free-field microphone and an MP3 digital recording device [34] (Figure 4(b)). It also enables 24 h recordings. A keyword spotting method based on hidden Markov models is applied to select possible cough fragments [39]. Human experts then label some of these fragments to build a statistical model fitted to the current recording, and the resulting model is used to process the whole recording. The system has a sensitivity of 91% and a specificity of 99% [40] and has been used in clinical trials [41]. Crooks et al. [42] used a hybrid system consisting of the Hull Automatic Cough Counter (HACC) and Leicester Cough Monitor (LCM) software to measure cough frequency during convalescence from COPD exacerbation and achieved an overall sensitivity of 57.9% and a specificity of 98.2%.

The VitaloJAK™ system (Vitalograph Ltd., Buckingham, UK, and University Hospital of South Manchester, UK) is a semiautomated cough recording device with two sensors (Figure 4(c)). One is a free-air condenser microphone for manual validation, and the other is a chest wall air-coupled condenser microphone for recording cough sounds [43]. An algorithm based on a median frequency threshold compresses 24 h cough sound recordings into an average of 1.5 hours. The system is accurate but labor intensive and time consuming because the coughs are counted manually [44]. It reached a sensitivity of more than 99% in a 24 h ambulatory context on ten patients [43], and it can be used in children to measure cough frequency accurately [45].

Drugman et al. designed an acoustic system using ANNs that was tested on voluntary cough from ten healthy subjects in various circumstances, with a sensitivity and specificity of about 95% [46]. Larson et al. [16] proposed a precise and privacy-protecting cough monitor using a low-cost mobile microphone. They used principal component analysis (PCA) and a random forest classifier to reconstruct and classify the cough sounds, with an average true-positive rate of 92% and a false-positive rate of 0.5% from subjects in the wild. Amrulloh et al. designed an automated method to identify cough segments in pediatric sound recordings and achieved a sensitivity and specificity of 93% and 98%, respectively [5].

4.2. Cough Monitors with Mixed Signals

Two commercial systems use multiple signals to detect cough. The Lifeshirt™ (Vivometrics, San Diego, CA, USA) appeared in 2005. It included several sensors integrated in a shirt worn by the user: an electrocardiogram, induction plethysmography, a 3-axis accelerometer, and a contact microphone placed on the throat (Figure 4(d)). The device achieved a sensitivity of 78.1% and a specificity of 99.6%. Unfortunately, the Lifeshirt is no longer available because the company was liquidated in 2009.

The Pulmo Track-CC [47] (KarmelSonix Ltd., Haifa, Israel) includes a piezoelectric belt for monitoring chest wall motion, one lapel microphone, and two contact microphones placed on the trachea and the thorax (Figure 4(e)). The device was tested over about 2 h in healthy volunteers simulating coughing in different situations (while walking, climbing stairs, sitting, and lying supine, and in noisy environments). It achieved a sensitivity of 91% for detecting explosive cough sounds and a specificity of 94% on voluntary cough [47]. However, in a study led by Turner and Bothamley, the device had a sensitivity of only 26% for detecting coughs identified by ear [35], although it performed very well when detecting coughing caused by acute asthma [48].

4.3. The Ideal Cough Frequency Monitors

The ideal ambulatory cough monitoring system would have the following characteristics [3, 49]: mobility, unobtrusiveness, compactness, and privacy protection. More importantly, it should allow reliable 24-hour recording and distinguish cough from other sounds automatically with high sensitivity, high specificity, and proportionately few false-positive events compared with true-positive events. The audio cough monitoring systems mentioned above meet some of these requirements, but the huge number of noncough sounds still limits the development of cough monitoring systems.

5. Cough Sound Processing Algorithms

Automatically detecting cough events requires solutions to at least four major problems [39, 50, 51]: (1) ambient noise reduction, an important problem in any audio-based detection system; (2) differentiation from other patient sounds, especially speech, laughing, and sneezing, since even in the most severe cough the amount of coughing is far exceeded by the amount of talking; (3) the variability of cough acoustics, both within and between individuals, combined with the additional complexity of different respiratory diseases; and (4) classification of dry or wet cough, which is a significant medical indicator. Currently, multidisciplinary teams of researchers all over the world are attempting to address these problems with pattern recognition techniques such as neural networks, support vector machines (SVM), and naive Bayesian classifiers (Bayesian).

The general workflow for the automatic assessment of cough used in the literature [52, 53] is displayed in Figure 5. The sound signals are usually captured by a microphone, and the first step removes silence from the signals. Next, a wide variety of features is extracted, which results in a huge amount of data. Dimensionality reduction is therefore carried out by selecting only the most relevant features. Finally, part of the dataset is chosen as training data and used to train the classifiers, and the rest of the dataset is used for testing.

6. Silence Removal

The raw cough sound data contain many silent fragments, whose intensity is low. Removing silence saves storage space. In many studies [25, 37, 54], the signal is first divided into frames, and the start time and end time of cough events are then located by the double threshold method using time-domain features such as the zero-crossing rate and the energy entropy.

The energy entropy of a divided audio frame expresses its intensity. In its usual short-time form, the energy $E$ can be calculated by the following formula:

$$E = \sum_{n=1}^{N} x^{2}(n),$$

where $x(n)$ represents the cough sound signal after frame processing and $N$ is the number of samples in the frame.

The zero-crossing rate (ZCR) [55] is the rate of sign changes of a signal. It can enhance the accuracy of the detection of cough sound endpoints. It is defined as

$$Z = \frac{1}{N-1} \sum_{n=2}^{N} \left| \operatorname{sign}[x(n)] - \operatorname{sign}[x(n-1)] \right|,$$

where $x(n)$ represents the cough sound signal after frame processing, $N$ is the number of samples in the frame, and $\operatorname{sign}[A]$ is 1 if $A$ is greater than zero and 0 otherwise.
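The double threshold method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited system: frames do not overlap, the frame energy is taken as the sum of squared samples, the ZCR is the fraction of adjacent sample pairs whose sign indicator (1 if positive, 0 otherwise) changes, and the threshold values are chosen by the caller with `high >= low`. The name `detect_events` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def short_time_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose sign indicator changes."""
    signs = [1 if s > 0 else 0 for s in frame]
    changes = sum(abs(a - b) for a, b in zip(signs[1:], signs[:-1]))
    return changes / (len(frame) - 1)


def detect_events(x: Sequence[float], frame_len: int, high: float,
                  low: float, zcr_min: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of sound events (end exclusive)."""
    frames = [x[i:i + frame_len]
              for i in range(0, len(x) - frame_len + 1, frame_len)]
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]

    def is_active(i: int) -> bool:
        # A neighbouring frame joins an event if it passes the low energy
        # threshold or shows a high zero-crossing rate.
        return energy[i] >= low or zcr[i] >= zcr_min

    events = []
    i = 0
    while i < len(frames):
        if energy[i] >= high:          # core of an event (high threshold)
            start = end = i
            while start > 0 and is_active(start - 1):
                start -= 1             # extend backwards to the true onset
            while end + 1 < len(frames) and is_active(end + 1):
                end += 1               # extend forwards to the true offset
            events.append((start * frame_len, (end + 1) * frame_len))
            i = end + 1
        else:
            i += 1
    return events


if __name__ == "__main__":
    weak = [0.1, -0.1, 0.1, -0.1]
    strong = [1.0, -1.0, 1.0, -1.0]
    x = [0.0] * 8 + weak + strong * 2 + weak + [0.0] * 8
    print(detect_events(x, frame_len=4, high=1.0, low=0.1, zcr_min=0.5))
```

In the example, the weak onset and offset frames fall below the low energy threshold but are still attached to the event because of their high ZCR, which is how the ZCR can sharpen endpoint detection. On real recordings, background noise also has a high ZCR, so the thresholds would need to be tuned on labelled data.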

6.1. Feature Extraction

After silence removal, cough recognition mainly involves extracting features from cough data and then inputting them into a model classifier. Features can be detected from time-domain signals as mentioned above [56] or from frequency-domain signals. Several features have been successfully applied to monitor cough events, including mel-frequency cepstral coefficients (MFCCs) and the characteristic parameters learned by the convolutional neural network (CNN).

6.1.1. Mel-Frequency Cepstral Coefficients (MFCCs)

MFCC is based on the human hearing mechanism. The subjectively perceived frequency is not linear in the actual frequency; it follows the empirical formula [57]:

$$f_{\mathrm{mel}} = 2595 \log_{10}\left(1 + \frac{f}{700}\right),$$

where $f_{\mathrm{mel}}$ is the perceptual frequency in mel and $f$ is the actual frequency in hertz.
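As a quick numerical check of the hertz-to-mel mapping, the short Python sketch below assumes the widely used form of the empirical formula, 2595 log10(1 + f/700); the function names are illustrative. With this form, 1000 Hz corresponds to approximately 1000 mel.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Perceptual frequency in mel for an actual frequency in hertz."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse mapping, from mel back to hertz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


if __name__ == "__main__":
    for f in (100.0, 1000.0, 4000.0):
        print(f, "Hz ->", round(hz_to_mel(f), 1), "mel")
```

The mapping is roughly linear below 1 kHz and logarithmic above it, so equal steps in mel correspond to increasingly wide bands in hertz at high frequencies.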

Therefore, the frequency of cough signals is usually transformed into the perceptual frequency, which simulates auditory processing better. The concrete steps are as follows [58–61]:
(i) Preemphasize high frequencies, divide the signal into frames, and apply a window to each frame.
(ii) Take the Fourier transform of each frame.
(iii) Calculate the spectral line energy for each frame of data.
(iv) Pass the energy through a bank of mel-spaced filters and calculate the logarithm of the energy at each of the mel frequencies.
(v) Carry out the discrete cosine transform of the results of the fourth step.
(vi) The MFCCs are the resulting coefficients; usually the first 12 coefficients are taken.
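The MFCC steps can be sketched in plain Python as follows. This is a teaching sketch under several assumptions that are not specified in the text: a preemphasis coefficient of 0.97, a Hamming window, triangular mel filters, the mel formula 2595 log10(1 + f/700), an unnormalised DCT-II, and the first 12 coefficients counted from index 0. The naive transform is slow, so a real system would use an FFT library instead.

```python
import cmath
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(fs / 2.0)
    mel_points = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / fs) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # peak and falling edge
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc(signal, fs, frame_len=256, hop=128, n_filters=20, n_coeffs=12, pre=0.97):
    """Return one list of n_coeffs MFCCs per frame."""
    # (i) Preemphasis, framing, and windowing.
    emph = [signal[0]] + [signal[n] - pre * signal[n - 1]
                          for n in range(1, len(signal))]
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    bank = mel_filterbank(n_filters, frame_len, fs)
    features = []
    for start in range(0, len(emph) - frame_len + 1, hop):
        frame = [emph[start + n] * window[n] for n in range(frame_len)]
        # (ii) Fourier transform of the frame (naive DFT, non-negative bins).
        spectrum = [
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        # (iii) Spectral line energy.
        power = [abs(c) ** 2 for c in spectrum]
        # (iv) Log energy in each mel filter (floored to avoid log(0)).
        log_mel = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-12))
                   for row in bank]
        # (v) Discrete cosine transform; (vi) keep the first n_coeffs.
        features.append([
            sum(log_mel[m] * math.cos(math.pi * q * (m + 0.5) / n_filters)
                for m in range(n_filters))
            for q in range(n_coeffs)
        ])
    return features


if __name__ == "__main__":
    fs = 8000
    tone = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(1024)]
    feats = mfcc(tone, fs)
    print(len(feats), "frames,", len(feats[0]), "coefficients each")
```

Each frame is reduced to a 12-element vector, which is the form in which the features are usually passed to a classifier.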

6.1.2. Convolutional Neural Network (CNN)

CNN is an efficient recognition method that has been developed recently and has attracted extensive attention. Generally, the basic structure of a CNN consists of two kinds of layers. One is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, from which a local feature is extracted. Once the local feature is extracted, its positional relationship to other features is also determined. The other is the feature mapping layer: each computing layer is composed of multiple feature maps, each of which is a plane in which all neurons share the same weights.
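The two ideas in this description, local receptive fields and shared weights within a feature map, can be illustrated with a small Python sketch. It is a generic illustration rather than the architecture of any cited cough detector: one fixed 2×2 kernel is slid over a toy input (standing in for a patch of a spectrogram), followed by a ReLU nonlinearity and 2×2 max pooling. In a real CNN the kernel weights would be learned from data.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(inp: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Valid 2D convolution (cross-correlation) producing one feature map.

    Every output neuron sees only a local patch of the input (its receptive
    field), and all output neurons share the same kernel weights.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(inp) - kh + 1
    out_w = len(inp[0]) - kw + 1
    return [
        [bias + sum(inp[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
         for c in range(out_w)]
        for r in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Elementwise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; incomplete border blocks are dropped."""
    return [
        [max(fmap[r + i][c + j] for i in range(size) for j in range(size))
         for c in range(0, len(fmap[0]) - size + 1, size)]
        for r in range(0, len(fmap) - size + 1, size)
    ]


if __name__ == "__main__":
    # Toy input with a vertical edge between columns 1 and 2.
    image = [[0, 0, 1, 1]] * 4
    edge_kernel = [[-1, 1], [-1, 1]]   # responds to left-to-right increases
    fmap = relu(conv2d(image, edge_kernel))
    print(fmap)
    print(max_pool(fmap))
```

Because the same kernel is applied at every position, the feature map responds to the edge wherever it occurs, and the number of parameters stays small regardless of the input size.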

6.2. Learning by Classifiers

There are several classifier algorithms for detecting cough, such as the support vector machine (SVM), naive Bayesian classifier (Bayesian), neural network (NN), hidden Markov model (HMM), and dynamic time warping (DTW) [62]. Some studies have compared these classifier algorithms with each other [57]. The performance measures are explained as follows [63]:
(1) Accuracy is the percentage of samples in the testing data set that are correctly classified.
(2) Sensitivity is the proportion of actual positives that are correctly identified. It is defined as the ratio of correctly classified positive samples (true positives) to all actual positive samples.
(3) Specificity is the proportion of actual negatives that are correctly recognized. It is calculated as the ratio of correctly classified negative samples (true negatives) to all actual negative samples.
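These three measures follow directly from the counts of true and false positives and negatives, as the following Python sketch shows. The label convention (1 for cough, 0 for noncough) and the function name are assumptions made for the example.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = cough)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # coughs found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # coughs missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # noncoughs rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(classification_metrics(truth, predicted))
```

Because noncough sounds vastly outnumber coughs in ambulatory recordings, a high specificity can still correspond to many false alarms in absolute terms, so the raw counts are worth reporting alongside the ratios.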

The relative advantages and disadvantages of these algorithms cannot yet be determined conclusively because results differ between experiments [34, 64]. However, the combination of MFCCs and SVM is the most widely used [65, 66], and neural networks have the potential to model cough sounds and identify them accurately [67, 68].

7. Future Application of Audio-Based Cough Monitoring

Audio-based cough monitoring has potentially wide application in home medical equipment, clinical trials, and the development of new cough therapies.

7.1. Application in Home Care

Chronic cough is common in older people, and objective monitoring of chronic cough in daily life can help to improve their quality of life [69, 70]. Many doctors also stress the importance of early diagnosis of childhood asthma and infantile pneumonia [71–73], and objective cough assessment offers a possible means to this end. It has been reported that objective cough monitoring is helpful in the diagnosis and treatment of infantile pneumonia [71]. The Pulmo Track-CC produced by KarmelSonix Ltd. has been well received in the diagnosis of asthma [72]. As cough monitoring devices develop, they could be widely used in home care in the future.

7.2. Response to Treatments and New Antitussives

Clinical treatment trials are a critical part of the diagnosis and management of chronic cough [74]. Some studies have examined the minimal important difference in cough frequency measured with the monitors now available [24, 75]. Cough monitors are the first choice for the objective evaluation of cough and are therefore increasingly used to provide primary endpoints in clinical trials. They will be a key part of understanding the treatment response of patients with common respiratory diseases.

In recent years, novel antitussives have been under development, but the primary outcome measure for antitussive drugs is still subjective, which harms the interests of patients. The medical literature points out that the randomized, placebo-controlled, double-blind clinical trial is the gold standard [76, 77] and that the primary endpoint should be objective [78, 79]. Objective cough monitoring would be an ideal tool if it could successfully demonstrate the clinical efficacy of novel antitussives. Subjective outcome measures would still be used alongside it to assess symptoms and health-related quality of life.

8. Conclusions

Cough is one of the most important symptoms in respiratory clinical trials, yet objective indicators of cough severity are largely lacking. This is because technical limitations have prevented cough frequency, cough intensity, and other objective cough assessment indicators from being measured accurately. The situation has improved with the development of sound recording and monitoring techniques over the last 20 years.

The generation of cough involves not only the vocal cords but also the lungs, and cough sounds contain a wealth of individual information. Audio-based cough monitors are emerging. In this paper, the basic principles, hardware composition, and experimental results of cough monitoring instruments are analyzed in detail, together with the objective cough assessment algorithms and their advantages and disadvantages.

Audio-based cough detection systems are increasingly applied in clinical research and are becoming more important for the study of cough. As automated cough detection algorithms improve in quality and processing speed, audio-based cough monitors are expected to change the assessment of patients’ responses to treatment and to enter many households in the near future.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.