Contactless Monitoring of Breathing Patterns and Respiratory Rate at the Pit of the Neck: A Single Camera Approach
Vital signs monitoring is pivotal not only in clinical settings but also in home environments. Remote monitoring devices, systems, and services are emerging as tracking vital signs must be performed on a daily basis. Different types of sensors can be used to monitor breathing patterns and respiratory rate. However, the latter remains the least measured vital sign in several scenarios due to the intrusiveness of most adopted sensors. In this paper, we propose an inexpensive, off-the-shelf, and contactless measuring system for respiration signals that takes the pit of the neck as the region of interest. The system analyses video recorded by a single RGB camera and extracts the respiratory pattern from intensity variations of reflected light at the level of the collar bones and above the sternum. Breath-by-breath respiratory rate is then estimated from the processed breathing pattern. In addition, the effect of image resolution on monitoring breathing patterns and respiratory rate has been investigated. The proposed system was tested on twelve healthy volunteers (males and females) during quiet breathing at different sensor resolutions (i.e., HD 720, PAL, WVGA, VGA, SVGA, and NTSC). Signals collected with the proposed system have been compared against a reference signal in both the frequency domain and the time domain. With the HD 720 resolution, frequency-domain analysis showed perfect agreement between the average breathing frequency values gathered by the proposed measuring system and by the reference instrument. An average mean absolute error (MAE) of 0.55 breaths/min was found in breath-by-breath monitoring in the time domain, while Bland-Altman analysis showed a bias of −0.03 ± 1.78 breaths/min. Even at the lower camera resolution setting (i.e., NTSC), the system demonstrated good performance (MAE of 1.53 breaths/min, bias of −0.06 ± 2.08 breaths/min) for contactless monitoring of both breathing pattern and breath-by-breath respiratory rate over time.
Accurate measurement and monitoring of physiological parameters, such as body temperature, heart rate, respiratory patterns, and, above all, the respiration rate, play a crucial role in a wide range of applications in healthcare and sport activities [1, 2].
Temporal changes of physiological parameters can indicate relevant variations of the physiological status of the subject. Among the wide range of parameters that can be measured in clinical settings, the respiratory rate is the most crucial vital sign for detecting early changes in the health status of critically ill patients. In practice, respiratory rate is typically collected only at regular intervals by operators (i.e., every 8–10 hours) in the clinical setting, while it is often neglected in home-monitored people (i.e., telemonitoring and telerehabilitation). However, the respiratory rate has been demonstrated to be a significant and sensitive clinical predictor of serious adverse events: its value increases during exacerbation of COPD, it can be used to determine hospitalization, and it offers the opportunity for early intervention. Moreover, the respiratory rate has been found to be a more discriminatory parameter between stable and unstable patients than heart rate [1, 4].
Conventional techniques for measuring respiration parameters require sensors in contact with the subject. Measuring techniques based on the monitoring of several parameters sampled from inspiratory and/or expiratory flow (e.g., temperature, RH, CO2, and flow) are widely used. Sensors may also be attached directly to the torso or integrated in clothing fibers to collect respiratory-related chest or abdominal movements. Such monitoring systems may cause undesirable skin irritation and discomfort, especially when long-term monitoring is required or during sleep. Moreover, it has been shown that contact-based measurement techniques of this kind may influence the underlying physiological parameters being measured. Therefore, contactless monitoring systems are welcome as a way to overcome the issues related to placing sensors on patients.
For this reason, solutions (even commercial ones) based on the analysis of the sound recorded around the person, on the monitoring of temperature map changes with thermal cameras, and on depth map changes due to breathing have been designed and tested. However, they suffer from high cost, the need for specialized personnel, and sometimes a low signal-to-noise ratio. Optical motion capture systems have gained greater interest in the field of respiratory monitoring in both research and clinical scenarios. Other approaches rely on markers as color cues to track breathing motion. Recent advancements in video technology and machine vision software have allowed RGB cameras to become exciting solutions, as they provide low-cost and easy-to-use noncontact approaches for measuring and monitoring physiological signals.
Different types of cameras have been used to measure physiological parameters, including heart rate and respiratory rate, by adopting a specific camera sensor technology, working principle, or signal processing procedure. Two main methods have been used, based on remote photoplethysmography and on body motion estimation.
Several attempts have been made to extract respiratory features from video frames recording breathing-related movements of the thorax [8–10], the thoracoabdominal area [8, 11], the face area, and the area at the edge of the shoulder. Even though some studies consider a region of interest (ROI) that includes the neck region, none specifically considers the pit of the neck, which is a large, visible dip between the neck and the two collarbones that may be easily identified in the video.
Different approaches have also been used to postprocess the pixel data and extract the respiration-related signal from such videos: subtraction of two consecutive images [8, 11], analysis of pixel intensity changes based upon independent component analysis [12, 13], analysis of the average contributions of the red, green, and blue channels of the video [14, 16, 17], and analysis of optical flow. Even though breathing patterns and respiratory rates have been faithfully estimated using high-quality cameras [14, 16], several other approaches that rely on off-the-shelf webcams are also able to achieve the same level of monitoring accuracy [7, 8, 12, 13]. Even so, only a few studies have used notebook built-in webcams (usually used for video chat and video conferences) for contactless physiological monitoring.
Moreover, a large number of studies do not report the details of the camera adopted, and there is a lot of variability in the camera resolutions used in such studies (i.e., from 640 × 480 up to 2560 × 1920). Depending on the sensor resolution and the postprocessing method adopted, the ROI used for extracting the signal varies from study to study. Although several studies acquired video data with subjects at different distances from the camera [8, 11], none performed a comparative study with different camera resolutions.
Despite the large number of studies adopting video cameras for respiratory monitoring purposes, there is a lack of results about the validity and accuracy of such methods in practice, since the majority of these studies present proofs of concept or preliminary tests and report no error metrics [8, 11, 14, 16]. Where quantitative results are available, frequency-domain analysis is generally performed to extract the frequency content of the signal collected with the video method and to estimate the average breathing rate. However, time-domain analysis techniques may be useful to investigate breath-by-breath respiratory values over time and to analyze additional features of respiration that are not accessible with a frequency-domain analysis.
In this paper, we present a single-camera video-based respiratory monitoring system based on the selection of the pit of the neck area. The aim of the present study is threefold: (i) the development of the measuring system capable of noncontact monitoring of respiratory pattern by using RGB video signal acquired from a single built-in high-definition (HD) webcam; (ii) the experimental test of this monitoring system in extracting average and breath-by-breath breathing rate values using both frequency-domain and time-domain analyses; and (iii) the evaluation of the influence of the sensor setting (i.e., resolution of the video sensor) on the accuracy of both average and breath-by-breath breathing rate values.
The respiratory pattern is estimated by analyzing the intensity of reflected light at the level of the pit of the neck. Experimental trials are presented to test the measuring system in practice, that is, in the monitoring of the breathing pattern of twelve healthy subjects at a self-paced breathing rate. Lastly, an analysis of the performance of the proposed measuring system at different camera resolutions is presented and discussed.
2. Measuring System: CCD Sensor and Video Processing Algorithm
The proposed measuring system is composed of a hardware module for data recording and preprocessing and a software module for respiratory pattern extraction.
2.1. Hardware for Video Data Recording and Preprocessing
A video recorded with a CCD camera is considered as a series of f RGB frames, that is, polychromatic images. Each image is split into red (R), green (G), and blue (B) channels. Each element of the image matrix is a pixel with (x, y) coordinates referred to the coordinate system placed at the bottom left corner of the image. Each pixel value represents a color light intensity. Zero-valued pixels correspond to black, whereas the maximum value renders white. The numerical range of each pixel depends on the number of bits used to represent a given channel. For commercial 8-bit/channel cameras (24 bits for RGB colors), each channel has 2^8 = 256 levels, that is, values from 0 to 255.
In this work, we used the built-in CCD RGB webcam (iSight camera) of a laptop (MacBook Pro, Apple Inc.). Images were recorded at 24-bit RGB with three channels, 8 bits per channel. An ad hoc interface was developed in MATLAB to manage video recording and to provide useful event information to the subject (i.e., “hold breath,” “start breathing,” and “data collection completed”) during the data collection. The duration of data collection can also be defined through the interface. Each subject was asked to perform a sequence of actions, which were communicated to the subject via the graphical user interface and also timed by the experimenter.
The video is collected at a set frame rate of 30 Hz, which is enough to discretize the breathing movements that commonly occur up to 60 breaths per minute, equal to 1 Hz.
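The sampling argument above can be checked with a short worked inequality. This is only a sketch: the symbols $f_s$ (frame rate) and $f_{\max}$ (highest breathing frequency of interest) are introduced here for illustration and are not taken from the original text.

```latex
f_{\max} = \frac{60\ \text{breaths/min}}{60\ \text{s/min}} = 1\ \text{Hz},
\qquad
f_s = 30\ \text{Hz} \;\ge\; 2 f_{\max} = 2\ \text{Hz}
```

By the sampling theorem, a 30 Hz frame rate therefore leaves a wide margin above the minimum rate needed to represent breathing at up to 60 breaths per minute.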
The proposed system needs to collect an RGB video of a person seated in front of the camera (see Figure 1). The algorithm developed for the analysis is intended to recognize the respiratory signal from the intensity variations found in the recorded video. After video recording, users are therefore asked to select a pixel at the level of the pit of the neck (xCL, yCL) in the first frame of the video. The pit of the neck is the anatomical point near the suprasternal notch (fossa jugularis sternalis), also known as the jugular notch. It is a large, visible dip between the neck and the two collarbones that may be easily identified at the superior border of the manubrium of the sternum, between the clavicular notches.
The script automatically delineates ROI that consists of a rectangular region with dimensions where and . Details of the ROI selected starting from (xCL, yCL) are shown in Figure 1.
2.2. Extraction of Respiratory Pattern
To extract the respiratory pattern from the video, the selected ROI is first split into its red, green, and blue channels (Figure 1). At each frame, the intensity components of each color channel (i.e., red (R), green (G), and blue (B)) are obtained and then averaged along each line of the ROI.
Each line-averaged intensity is therefore a function of the line index and of the frame index. Each of these signals is then detrended, that is, its mean is removed (Figure 2(a)).
To extract the respiratory pattern from these line signals, the standard deviation of each line signal is computed. Then, the 5% of the signals with the highest standard deviations are selected (Figure 2(b)). The pattern is obtained by computing, at each frame, the mean value over the selected lines.
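The row-averaging, detrending, and line-selection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function name `respiratory_pattern`, the nested-list layout `roi_frames[f][y][x]`, the use of a single intensity value per pixel, and the use of the population standard deviation are all choices made here for the example.

```python
import math
from statistics import mean, pstdev


def respiratory_pattern(roi_frames, top_fraction=0.05):
    """Return one pattern value per frame.

    roi_frames[f][y][x] is the intensity of ROI pixel (x, y) in frame f
    (hypothetical layout: one intensity value per pixel).
    """
    n_frames = len(roi_frames)
    n_rows = len(roi_frames[0])

    # One time series per ROI line: mean intensity of that line in each frame.
    line_signals = [[mean(frame[y]) for frame in roi_frames] for y in range(n_rows)]

    # Detrend: remove each line signal's own mean over time.
    detrended = []
    for sig in line_signals:
        m = mean(sig)
        detrended.append([v - m for v in sig])

    # Rank lines by standard deviation and keep the top fraction (at least one).
    n_keep = max(1, round(top_fraction * n_rows))
    ranked = sorted(range(n_rows), key=lambda y: pstdev(detrended[y]), reverse=True)
    selected = ranked[:n_keep]

    # Pattern: mean over the selected lines at each frame.
    return [mean(detrended[y][f] for y in selected) for f in range(n_frames)]
```

In this sketch, lines whose intensity barely changes over time contribute nothing to the pattern, which matches the idea of keeping only the lines that move most with breathing.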
Figure 2(a) presents a typical trend: in the absence of chest wall movements, the variation of the intensity signal with respect to the baseline is characterized by low amplitude and high frequencies. When the RGB camera sensor is facing the chest wall, the respiratory content can be extracted using filtering operations. Adequate cut-off frequencies and bandwidth therefore need to be defined, and the filter parameters must be designed accurately to obtain proper performance of the measuring system. A band-pass configuration was chosen to control the whole bandwidth. The general configuration of the method fixes the low cut-off frequency at around 0.05 Hz, to reject the slow signal variations unrelated to respiratory movements. By filtering the signal content up to a high cut-off frequency of 2 Hz, the changes generated by the respiratory movements recorded by the CCD sensor can be adequately isolated and passed to the subsequent processing stages.
An infinite impulse response (IIR) design was adopted for the filter: a 3rd-order digital Butterworth filter. Its transfer function is expressed in terms of the numerator and denominator coefficients, together with the filter stage gain and the filter order. The filtered output in the z-domain is then obtained as the product of this transfer function and the input signal.
The filtered signal is then normalized (equation (3)) by subtracting its mean and dividing by its standard deviation.
The normalized signal is used for extracting the respiratory pattern and temporal information, since it is proportional to the changes in the intensity component and thus to the underlying respiratory signal of interest (Figure 3).
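For reference, a digital IIR filter is commonly written as a ratio of polynomials in $z^{-1}$. The form below is the usual textbook convention and is given only as a sketch; it is not necessarily the exact expression used by the authors, and the symbols $g$ (gain), $b_k$ and $a_k$ (coefficients), $n$ (order), $X(z)$ (input), and $Y(z)$ (output) are names assumed here.

```latex
H(z) = g\,\frac{\sum_{k=0}^{n} b_k\, z^{-k}}{1 + \sum_{k=1}^{n} a_k\, z^{-k}},
\qquad
Y(z) = H(z)\,X(z)
```

With the cut-off frequencies stated above, the pass band would extend from 0.05 Hz to 2 Hz.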
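The normalization step can be illustrated with a few lines of code. This is a sketch under assumptions: the function name `normalize` is hypothetical, and the population standard deviation is used here because the text does not say which estimator was adopted.

```python
from statistics import mean, pstdev


def normalize(signal):
    """Subtract the mean and divide by the standard deviation (z-score)."""
    mu = mean(signal)
    sigma = pstdev(signal)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("a constant signal cannot be normalized")
    return [(v - mu) / sigma for v in signal]
```

After this step the signal has zero mean and unit standard deviation, so traces from the camera and from a reference sensor can be overlaid on the same scale.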
2.3. Respiratory Rate Calculation
The breathing rate can be extracted from the normalized signal in either the frequency or the time domain.
In the frequency domain, the breathing rate can be identified via power spectral density (PSD) estimate. The PSD estimation aims to assess the spectral density of a signal from a sequence of time samples of the same signal (finite set of data). PSD is useful in signal detection, classification, and tracking for detecting any periodicities in the data, by observing peaks at the frequencies corresponding to these periodicities [18, 19].
The main approaches for frequency analysis consist of parametric methods (such as AR, ARMA) and nonparametric methods (window methods). Here, we focus on a nonparametric method.
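Let $y(t)$, $t = 0, \pm 1, \pm 2, \ldots$, be a deterministic discrete-time signal. Assuming that $\sum_{t=-\infty}^{\infty} |y(t)|^2 < \infty$, the discrete-time Fourier transform of the data sequence is $$Y(\omega) = \sum_{t=-\infty}^{\infty} y(t)\, e^{-i\omega t}.$$
Let $S(\omega)$ be the energy spectral density; then $$S(\omega) = |Y(\omega)|^2,$$ where $S(\omega)$ is the distribution of energy as a function of frequency. The power spectrum of a zero-mean stationary stochastic process can be calculated as the Fourier transform of its covariance function $r(k)$. Hence, the PSD can be defined as $$\phi(\omega) = \sum_{k=-\infty}^{\infty} r(k)\, e^{-i\omega k},$$ where $\phi(\omega)$ represents the distribution of signal power over frequency [18, 19].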
When using this method, the most pronounced peaks of the spectrum identify the periodicity of the signal. Each spectrum obtained with the PSD describes how the power of the signal is distributed over frequency. In other words, the power of the signal in a given frequency band can be calculated by integrating over the frequency values of the band. Consequently, the PSD can be used to evaluate both (i) the variability of the pattern over time across the whole frequency band and (ii) the average value of the respiratory rate.
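As an illustration of how a dominant spectral peak translates into an average respiratory rate, the sketch below computes a plain periodogram with a naive discrete Fourier transform and returns the peak frequency in breaths/min. It is only one possible nonparametric estimator, written under assumptions: the function name `dominant_rate_bpm`, the search band of 0.05 to 2 Hz (borrowed from the band-pass filter settings), and the two-sided periodogram scaling are choices made here, not details confirmed by the text.

```python
import cmath
import math


def dominant_rate_bpm(signal, fs, f_lo=0.05, f_hi=2.0):
    """Return the frequency of the largest periodogram peak, in breaths/min."""
    n = len(signal)
    m = sum(signal) / n
    x = [v - m for v in signal]  # remove the mean before the transform

    best_f, best_p = None, -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n  # frequency of DFT bin k, in Hz
        if f < f_lo or f > f_hi:
            continue
        # Naive DFT coefficient at bin k.
        c = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        p = abs(c) ** 2 / (fs * n)  # periodogram value (two-sided scaling)
        if p > best_p:
            best_f, best_p = f, p

    if best_f is None:
        raise ValueError("no frequency bin falls inside the search band")
    return 60.0 * best_f
```

For example, a 20 s synthetic sinusoid at 0.25 Hz sampled at 30 Hz has its peak in the bin at 0.25 Hz, which corresponds to 15 breaths/min. The frequency resolution of this estimate is the reciprocal of the recording length, so a ~170 s recording resolves steps of roughly 0.35 breaths/min.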
Unlike frequency-domain analysis, time-domain analysis requires specific points on the signal to be identified. Different approaches can be used, based on the detection of maximum and minimum points or on the identification of zero-crossing points. We used a method that combines both approaches in two steps. In the first step, the algorithm identifies the zero-crossing points on the video signal. This determines the onset of each respiratory cycle, which is characterized by a positive-going zero crossing, that is, the frame (or time) index at which the signal passes from negative to positive values. In the second step, the algorithm identifies the local minimum of the signal, and its index, between each pair of consecutive respiratory cycle onsets determined in the first step. The duration of each breath is then calculated as the time elapsed between two consecutive minima. Consequently, the breath-by-breath breathing rate, in breaths/min, is calculated as 60 divided by the breath duration in seconds.
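The two-step time-domain procedure can be sketched as follows. This is a minimal illustration, assuming a zero-mean (normalized) signal sampled at a constant rate; the function name `breath_by_breath_rate` and the exact inequality used to define a positive-going crossing are choices made for this example.

```python
def breath_by_breath_rate(signal, fs):
    """Return one respiratory rate value (breaths/min) per detected breath."""
    # Step 1: positive-going zero crossings mark the onset of each cycle.
    onsets = [i for i in range(1, len(signal)) if signal[i - 1] < 0 <= signal[i]]

    # Step 2: index of the signal minimum between consecutive onsets.
    minima = [
        min(range(a, b), key=lambda i: signal[i])
        for a, b in zip(onsets, onsets[1:])
    ]

    # Breath duration: time between consecutive minima; rate = 60 / duration.
    durations = [(j - i) / fs for i, j in zip(minima, minima[1:])]
    return [60.0 / d for d in durations]
```

On a clean sinusoid with a 4 s period this returns 15 breaths/min for every breath. On real recordings, spurious zero crossings caused by noise would produce implausibly short breaths, which is one reason the band-pass filtering stage matters.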
3. Tests and Experimental Trial
3.1. Participants and Reference Data
Our dataset consists of recordings of 12 participants (six males, six females; mean age 25 ± 3 years, mean height 163 ± 8 cm, mean weight 58 ± 9 kg). All the participants provided their informed consent. Each participant was invited to sit on a chair in front of the RGB camera at a distance of about 1.5 m (see Figure 1). The experiments were carried out indoors, with a stable amount of light delivered by neon lights and one window as sources of illumination. During the experiments, each video was recorded for ~170 s. Participants were asked to keep still and seated, breathe spontaneously, and face the webcam.
At the same time, the pressure drop (ΔP) that occurs during the exhalation/inhalation phases of respiration was collected by a differential pressure sensor (i.e., Sensirion SDP610, pressure range up to ±125 Pa, Figure 4(a)). So as not to obstruct the data collection with the webcam, the ΔP was recorded at the level of the nose. The ΔP was sampled at 100 Hz, and the data were sent to a remote laptop via a USB connection and archived via MATLAB. All the steps carried out on signals are summarized in Figure 5.
Then, we carried out a standard cumulative trapezoidal numerical integration of the ΔP signal over time (i.e., integrated ΔP) to provide a smooth signal for further analysis and to emphasize the maximum and minimum peaks of the signal (see Figure 4). This approach has been used in previous preliminary studies for extracting temporal respiratory features from pressure signals [20–22].
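Cumulative trapezoidal integration of a uniformly sampled signal can be written in a few lines. The sketch below is illustrative only: the function name `cumulative_trapezoid` is hypothetical, and the convention of starting the running integral at zero is an assumption chosen so that the output has the same length as the input.

```python
def cumulative_trapezoid(samples, fs):
    """Running trapezoidal integral of uniformly sampled data.

    Returns a list of the same length as `samples`, starting at 0.0.
    """
    if not samples:
        return []
    dt = 1.0 / fs  # sampling interval in seconds
    out = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        # Area of the trapezoid between two consecutive samples.
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out
```

For the pressure signal described above, `fs` would be 100 Hz. Because integration accumulates any constant offset into a slow drift, the subsequent band-pass filtering with a 0.05 Hz lower cut-off is a natural companion to this step.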
Afterwards, such integrated ΔP has been filtered using a bandpass Butterworth digital filter in the frequency range 0.05–2 Hz and normalized following the formula in (4). This normalized and integrated ΔP (Figure 4(b)) is the signal used for extracting reference respiratory pattern and respiratory rate values.
An example—obtained from one volunteer—of the ΔP trend collected by the pressure sensor, the normalized and integrated ΔP signal, and the signal extracted by the video processing algorithm is reported in Figure 4.
3.2. Respiratory Pattern and Respiratory Rate Comparisons
Signals obtained from the measuring systems have been compared to the reference signals in terms of similarity of curves and respiratory rate values. The similarity of the frequency content of signals and average respiratory rate values have been investigated from the normalized PSD.
The similarity between signals has been evaluated by overlapping the two normalized PSDs, taking that of the reference instrument as the reference PSD. The average respiratory rate value can be extracted from the dominant frequency peak. From the average values of breathing rate, the accuracy (expressed in %) of the proposed method is calculated from the breathing rate measured using the reference signal and that measured using the proposed method.
Additionally, the breath-by-breath respiratory rate values have been compared between instruments by extracting them with the time-domain analysis. To compare the values gathered by the reference instrument with those computed by the video-based method, we calculated the mean absolute error (MAE), in breaths per minute, as $$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| f_{v}(i) - f_{r}(i) \right|,$$ where $n$ is the number of breaths estimated for each subject in the trial, $f_{v}(i)$ is the $i$-th video-based value in breaths per minute, and $f_{r}(i)$ is the corresponding reference value in breaths per minute obtained from the reference signal data. The standard error (SE) of the mean is then calculated as $$\mathrm{SE} = \frac{\sigma}{\sqrt{n}},$$ where $\sigma$ is the standard deviation of the absolute differences between estimations and reference data ($|f_{v}(i) - f_{r}(i)|$).
Additionally, the strength of the association between the breath-by-breath values collected with the proposed method and those collected by the reference instrument was evaluated with the Spearman correlation coefficient. Then, the slope of the simple linear regression computed on these values was calculated with the y-intercept held fixed. Finally, the Bland-Altman analysis was used to evaluate the differences between the two methods: the mean of the differences (MOD) and the limits of agreement (LOAs) were used to determine the accuracy and the dispersion of the breath-by-breath respiratory rate differences.
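The MAE and SE definitions above can be illustrated numerically. This is a sketch with made-up example values, not data from the study; the function name `mae_and_se` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
import math
from statistics import mean, stdev


def mae_and_se(estimated, reference):
    """Mean absolute error and standard error of the mean, in breaths/min."""
    abs_err = [abs(e - r) for e, r in zip(estimated, reference)]
    mae = mean(abs_err)
    # Assumption: sample standard deviation of the absolute differences.
    se = stdev(abs_err) / math.sqrt(len(abs_err))
    return mae, se


# Illustrative values only (breaths/min), one entry per breath.
video_rates = [15.0, 16.0, 14.0, 15.0]
reference_rates = [15.0, 15.0, 15.0, 15.0]
mae, se = mae_and_se(video_rates, reference_rates)
half_width_95 = 1.96 * se  # half-width of a 95% confidence interval
```

Here the absolute errors are 0, 1, 1, and 0 breaths/min, so the MAE is 0.5 breaths/min.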
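The Bland-Altman quantities can be computed directly from the paired breath-by-breath values. The sketch below makes two assumptions that are not stated explicitly in the text: differences are taken as reference minus proposed method (chosen so that a negative MOD corresponds to overestimation by the proposed method), and the limits of agreement are MOD ± 1.96 times the sample standard deviation of the differences. The function name `bland_altman` is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(estimated, reference):
    """Return (MOD, (lower LOA, upper LOA)) in breaths/min."""
    # Assumption: difference defined as reference minus proposed method.
    diffs = [r - e for e, r in zip(estimated, reference)]
    mod = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample standard deviation
    return mod, (mod - half_width, mod + half_width)
```

A result reported as "bias of a ± b breaths/min" then corresponds to `mod = a` and `half_width = b` in this sketch.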
3.3. Influence of Sensor Size Resolution
To investigate the influence of camera sensor resolution on the accuracy of the proposed measuring system, we postprocessed the videos to decrease the resolution of each frame. This postprocessing was carried out in MATLAB. Bicubic interpolation was used for interpolating data points on a two-dimensional regular grid (sensor matrix). With this method, the output pixel value is a weighted average of the pixels in the nearest 4-by-4 neighborhood.
We decided to investigate the performance of 6 camera sensor resolutions (including the resolution of the original video, HD 720), since they can be considered the most common resolutions of commercial built-in webcams: HD 720, PAL, WVGA, VGA, NTSC, and SVGA, characterized by three different aspect ratios (i.e., 4 : 3, 5 : 3, and 16 : 9). Attributes such as the sensor size and the number of x and y pixels used in the ROI selection are reported in Table 1.
Since the ROI size is linked to the maximum size of x and y, it depends on the resolution of the CCD sensor. As a consequence, the number of lines used to compute the respiratory pattern from the video changes with the resolution (see Table 1). The same data analysis described in Section 3.2 was carried out on the signals, considering each postprocessed decreased-resolution signal as a separate signal. Furthermore, the same indicators were used for the respiratory pattern and respiratory rate comparisons against the reference signal.
4. Results
The results obtained from the proposed measuring system are compared to the reference ones. The analysis is carried out in the frequency and time domains separately.
4.1. Breathing Rate Estimation in the Frequency Domain
In the frequency domain, we computed the normalized PSD obtained in each trial. Average breathing rate is calculated indirectly by taking the maximum peak of the normalized PSD plot. The values for each volunteer are reported in Table 2.
The similarities between the normalized PSDs obtained with the reference signal and with the video-based signal are shown in Figure 6. Since the volunteers were asked to perform self-paced breathing, some of them present a pattern with high variations of the respiratory rate during data collection. In these cases, it is normal to obtain a PSD with dominant peaks at different frequencies (see Figure 6). With the proposed method, we obtained an average accuracy of 100% in the estimation of the average breathing rate.
4.2. Breathing Rate Estimation in the Time Domain
The analysis in the time domain provides additional information compared to the analysis in the frequency domain with the normalized PSD, for example, the breath-by-breath respiratory rate values. An average MAE value of 0.55 breaths/min was found, while the maximum value was 1.23 breaths/min. To specify the uncertainty around the estimate of the mean measurement, we use the SE, since it provides a confidence interval. Thus, we calculated the 95% confidence interval as 1.96 × SE. The computed confidence interval was always below 0.45 breaths/min.
By considering all the breaths collected in the 12 trials (), the average accuracy of the proposed method based on the analysis of the video signal is 97% in the breath-by-breath analysis. Linear regression analysis demonstrated a Spearman correlation coefficient of 0.97, with a slope . The Bland–Altman analysis reveals a slight overestimation of breathing rate with the proposed method (MOD of −0.03 breaths/min) and a small LOA amplitude (±1.78 breaths/min).
4.3. Influence of Sensor Size Resolution
Table 3 reports the dominant peak frequency of the normalized PSD obtained in each trial at the five different resolutions investigated (i.e., PAL, WVGA, VGA, NTSC, SVGA). The differences between the PSDs obtained from the reference signal and from the proposed technique with the five different camera settings can be assessed qualitatively in Figure 6, which also shows the normalized PSDs for the HD 720 camera setting.
For each subject, the normalized PSDs obtained at different resolutions are very similar in shape and dominant peaks. In all cases except trials 1 and 12, there is one dominant peak that is distinctly sharper than the surrounding peaks. In trials 1 and 12, the presence of several peaks highlights several changes in breathing rate during the data collection. An example of these differences in the time domain and frequency domain is reported in Figure 7. The average accuracies for average respiratory rate estimation were 98.7%, 94.9%, 94.9%, 95.6%, and 98.7% for PAL, WVGA, VGA, NTSC, and SVGA, respectively.
The results obtained in the time domain are reported in Figure 8, in terms of MAE and SE. Despite the fact that MAE and SE tend to increase when the resolution decreases (i.e., NTSC), the values are similar to those obtained with higher resolutions. The maximum MAE values were not higher than 1.45 breaths/min. Additionally, Table 4 reports the Bland-Altman analysis results (MOD ± 1.96 × SD), as well as the slopes of the linear regression curves and the values of the correlation coefficient.
Figure 9(a) shows the Bland-Altman and linear regression plots obtained using the original sensor resolution (HD 720), while Figure 9(b) shows the plots obtained using the NTSC resolution (with broader LOAs and a worse correlation coefficient).
Within the wide spectrum of physiological measurements that are useful for clinical assessment, respiratory rate plays a crucial role. In some conditions it must be monitored continuously, for instance when patients are in a clinical setting (i.e., intensive care unit) or need monitoring of physiological data at home (i.e., telemonitoring, telerehabilitation).
The use of unobtrusive solutions is widespread in respiratory monitoring. Optical technologies can allow nonintrusive and low-cost monitoring of respiratory patterns. Different solutions have been proposed based upon photo-reflective markers and frame subtraction. Although relatively new techniques based on the analysis of video collected by digital cameras have been demonstrated to be promising in respiratory monitoring, most of them monitor only the average respiration rate.
In this paper, we present a single-camera video-based respiratory monitoring system based on the selection of a small skin area near the base of the neck. The proposed method for extracting the respiratory pattern and the corresponding respiratory rate consists of three steps: (i) trunk wall motion data collection, (ii) ROI selection and intensity change analysis to extract video-based respiratory signal, and (iii) analysis in the frequency and time domain to investigate the frequency content of the signal and to extract the breath-by-breath respiratory rate.
Since the proposed method can work with very different built-in RGB cameras (webcams) available in most laptops, we have investigated the influence of sensor resolution (from HD 720 to NTSC) on the respiratory pattern and respiratory rate values extracted from the video signal. The method has been tested on 12 participants wearing t-shirts or sweaters during data collection in an unstructured environment. The postprocessed pressure drop signal collected at the nose was used as the reference signal in this work. The computed error measurements are on par with those reported in the literature [12, 13].
Results show excellent performance of the method with HD resolution (HD 720), with an accuracy of 100% in the estimation of the average breathing rate from the frequency-domain analysis. Additionally, the PSD spectra demonstrated the similarity of the breathing patterns collected at the different resolutions to the frequency content of the reference signal, with a lowest accuracy of 94.9% in the estimate of the average respiratory rate from the spectra. Despite the excellent results obtained in the frequency domain, further developments may be devoted to testing parametric methods for PSD estimation (for example, AR methods), given the periodicity of the respiratory signal.
In the calculation of breath-by-breath respiratory rate, the HD 720 camera setting shows the best results in terms of MAE (average value of 0.55 breaths/min) and SE. In this case, the method shows a bias of −0.03 ± 1.78 breaths/min in the calculation of breath-by-breath respiratory rate when compared to the reference values. With lower resolution (NTSC), the dispersion of the data is slightly higher (the LOAs are wider, ±2.08 breaths/min), while the MOD value is comparable. These biases are comparable to those reported in the literature for average respiratory rate. They are slightly worse than those of a study that used a spirometer as reference and a pseudo-Wigner-Ville time-frequency representation (with a resolution of 0.7324 bpm) for the signal analysis in standing position (−0.02 ± 0.83 breaths/min).
The analysis of more than 200 breaths (from 12 volunteers) suggests that sensor resolution influences the accuracy of the proposed method. NTSC resolution (whose ROI area is one third of the HD 720 area) shows the worst results, with an accuracy of 95.6% in the estimation of average breathing rate and an MAE of 1.45 breaths/min. In the breath-by-breath estimation, the correlation coefficient is 0.95, with a bias of −0.06 ± 2.08 breaths/min. These values can be compared with the respiratory rate bias obtained with other sensors, such as Doppler radar (via fast Fourier transform) with transmitter and receiver antennas compared against a respiration strap, and radar returns processed with harmonically related filters. The correlation coefficients and bias assessed with the proposed method are in line with those found with wearable systems based on respiratory inductive plethysmographic sensors or on optical fibers, which require expensive systems and contact with the patient for more accurate monitoring. The most important findings are therefore that (i) the proposed measuring system is able to detect small chest wall movements caused by respiration by calculating the pixel color differences between consecutive frames, and can extract the respiratory pattern even at low camera resolution (i.e., NTSC); and (ii) the errors found when comparing the average respiratory rate and the breath-by-breath values between instruments are acceptable for using the proposed measuring system to accurately monitor a subject with a commercial single camera, even at lower sensor resolutions, in the absence of movements unrelated to breathing. Further developments of the proposed system will include an additional technique based on pixel flow analysis to detect movements unrelated to breathing and to reduce their influence on the calculated breathing pattern.
Additionally, we are working on the automatic detection of the chest wall in order to identify automatically the ROI used for pattern calculation. Further tests will be carried out to investigate the performance of the proposed method in different scenarios (i.e., clinical setting room and intensive care unit), with different respiratory patterns (i.e., spontaneous breathing, high rate, deep breaths, apnea, and Cheyne-Stokes), and in different postures. These upgrades will be useful to test the proposed measuring system for telemonitoring purposes and, in general, for the monitoring of subjects at a distance. Given the encouraging performance in the breath-by-breath monitoring of respiratory rate, the present system could be profitably used for monitoring subjects in both clinical and home settings, as well as for spot physiological checks, by using a commercial built-in webcam of the kind commonly used for video calls. Other future work includes validation with ultrasound images and studying the effect of varying lighting conditions, distance, and clothing.
Reference raw data and videos are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The authors would like to thank Sara Iacoponi and Giuseppe Tardi for their helpful contribution to data collection and volunteer enrollment. Additionally, the authors would like to thank all the volunteers who devoted their spare time to this study. Daniel Simões Lopes is thankful for the financial support given by the Portuguese Foundation for Science and Technology, namely, for the postdoctoral grant SFRH/BPD/97449/2013 and the Portuguese funds with reference UID/CEC/50021/2013 and IT-MEDEX PTDC/EEI-SII/6038/2014.
M. A. Cretikos, R. Bellomo, K. Hillman, J. Chen, S. Finfer, and A. Flabouris, “Respiratory rate: the neglected vital sign,” The Medical Journal of Australia, vol. 188, no. 11, pp. 657–659, 2008.
A. Aliverti, V. Brusasco, P. T. Macklem, and A. Pedotti, Eds., Mechanics of Breathing, Springer Milan, Milano, 2002.
K. S. Tan, R. Saatchi, H. Elphick, and D. Burke, “Real-time vision based respiration monitoring system,” in 2010 7th International Symposium on Communication Systems, Networks & Digital Signal Processing (CSNDSP 2010), pp. 770–774, Newcastle upon Tyne, UK, July 2010.
Y. W. Bai, W. T. Li, and Y. W. Chen, “Design and implementation of an embedded monitor system for detection of a patient’s breath by double webcams,” in 2010 IEEE International Workshop on Medical Measurements and Applications, pp. 171–176, Ottawa, ON, Canada, May 2010.
S. L. Marple Jr, “Digital spectral analysis with applications,” The Journal of the Acoustical Society of America, vol. 86, p. 492, 1987.
P. Stoica and R. L. Moses, Spectral Analysis of Signals, Pearson/Prentice Hall, 2005.
B. A. Reyes, N. Reljin, Y. Kong, Y. Nam, and K. H. Chon, “Tidal volume and instantaneous respiration rate estimation using a volumetric surrogate signal acquired via a smartphone camera,” IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 3, pp. 764–777, 2017.
P. Grossman, M. Spoerle, and F. H. Wilhelm, “Reliability of respiratory tidal volume estimation by means of ambulatory inductive plethysmography,” Biomedical Sciences Instrumentation, vol. 42, pp. 193–198, 2006.