Abstract

Gas recognition is an emerging research area with many civil, military, and industrial applications. The success of any gas recognition system depends on its computational complexity and its robustness. In this work, we propose a new low-complexity recognition method, which is tested and successfully validated on a tin-oxide gas sensor array chip. The recognition system is based on a vector angle similarity measure between the query gas and the representatives of the different gas classes. The latter are obtained using a clustering algorithm based on the same measure within the training data set. Experimental results on our in-house gas sensor array show more than of correct recognition. The robustness of the proposed method is tested by recognizing gas measurements with simulated drift. Less than of performance degradation is noted in the worst-case scenario, which represents a significant improvement over the current state of the art.

1. Introduction

The detection and discrimination of gases using microelectronic gas sensor arrays are required in various industrial and domestic applications, such as automobiles, safety, indoor air quality, medicine, and the food industry [1, 2]. SnO$_2$-based gas sensing film technology is commonly used for such applications because of a number of advantages, including cost effectiveness, high sensitivity to various gases, and relative compatibility with standard CMOS fabrication processes [3]. The SnO$_2$-based sensors are operated at high temperature (typically 300 °C), obtained using integrated microhotplates (MHPs). SnO$_2$-based gas sensing film shows a high response to a large variety of target gases but a low level of selectivity to a given target gas. This phenomenon is widely exhibited in animal and human biological olfactory systems [4]. Besides low selectivity, other shortcomings such as high sensitivity to humidity, nonlinearity of the sensor response, drift, and slow response are associated with electronic gas sensors in general. Poor selectivity towards the monitored gas, or cross sensitivity towards other gases, makes a sensor's output unreliable. Long exposure cycles, aging, and poor stability cause a sensor's calibration curve to drift with time [5]. The drift can be explained as a random temporal variation of the sensor response when exposed to the same gases under identical conditions. These drifts are due to unknown dynamic processes in the sensor system (e.g., poisoning or aging of sensors) or environmental changes (e.g., temperature and pressure conditions). Many pattern recognition techniques have been proposed in the literature to deal with the low selectivity of electronic noses, and good overviews can be found in [2, 6]. However, the processing generally remains quite complex, involving many preprocessing operations as well as complex pattern recognition algorithms, especially with a large number of sensors. In addition, most of the proposed techniques do not address the drift problem, and those that do require retraining the proposed systems using simulated drift observations.

Gas measurements issued from our sensor array are vectors in $\mathbb{R}^{16}$. Usually, recognition in a multidimensional space requires a sufficient number of observations to estimate the parameters of the underlying model. This number increases rapidly with the dimension, so recognition becomes inaccurate in practice even though each dimension often brings additional information. This is the well-known curse of dimensionality phenomenon [7]. One way to overcome this problem is to use a dimensionality reduction technique, by either feature selection or feature extraction, to find a more compact representation of the data that keeps only the relevant information. Projection pursuit (PP) [8] is a dimensionality reduction technique that iteratively finds the projection axes maximizing a given criterion called the projection index (PI). Principal component analysis (PCA) is a PP with the variance as PI, linear discriminant analysis (LDA) is a PP with the class separability as PI (class separability is usually measured as the ratio of the between-class scatter to the within-class scatter), and independent component analysis (ICA) is a PP with the non-Gaussianity as PI [9, 10]. In this paper, we use a low-complexity yet efficient reduction technique based on the vector angle between observation vectors in $\mathbb{R}^{16}$ as a measure of their closeness. The proposed recognition system uses the vector angle similarity measure between the investigated gas and the different gas class representatives, which are obtained using a clustering algorithm with the same similarity measure. This modelling allowed us to reach a high recognition rate and has shown high robustness against the drift phenomenon.

The paper is organized as follows. The sensor array is described in Section 2. In Section 3, the recognition system is detailed and experiments on real gas measurements are presented. The robustness of the proposed technique is investigated in Section 4. Finally, a conclusion is presented in Section 5.

2. Sensor Array Characterization

The monolithic tin-oxide gas sensor array was designed and fabricated using our in-house 5 µm 1-metal, 1-poly CMOS process [3]. The cross-sectional view of the gas sensor is shown in Figure 1(a). The top view of the fabricated sensor element with the microheater and the electrodes is shown in Figure 1(b).

The MHP is at the center of the sensor element and has a dimension of . A 2.8 µm air gap is formed between the hotplate membrane and the substrate using a surface micromachining process. The sensing film is deposited onto the MHP using a sputtering method. The sensor signal is measured from the resistance variation across the two Pt electrodes. Different posttreatment combinations were performed on the sensors within the array, including metal catalysts (Pt, Pd, and Au) in 3 columns and ion implantations (B, P, and H) in 3 rows, which results in a response variation across the 16 sensors.

Vapors were injected into the gas chamber, with a diameter of and a reaction volume of , at a flow rate determined by the mass flow controllers (MFCs). The gas concentrations in the sensor chamber are adjusted by selecting the appropriate flow rate for each gas. The input signals that control the MFCs are pulse signals, generated by the data acquisition (DAQ) board, that correspond to the different concentrations. The sensor output is then processed by the same DAQ board. In this measurement, all the MHPs on the chip were heated to 300 °C, resulting in a total power consumption of 352 mW.

The test gases are relevant for a number of applications, mainly in the area of toxic and combustible gas identification. , and Ethanol are commonly found in mines and in daily life, and are potential causes of explosion or poisoning accidents in coal mines. Furthermore, gas mixtures, which are what is encountered in real situations, are more dangerous than single gases. Three binary mixtures of , and are selected as analyte gases and recognized as new gases.

The gas sensor array is characterized using a periodic exposure-cleaning operating mode. Gases are injected periodically over time, and each gas injection is followed by a cleaning phase, which can in some cases be done by injecting dry air. After the cleaning phase, the signal returns to the baseline, which is taken as the reference measured before gas injection.

3. The Proposed Gas Recognition Algorithm

3.1. Vector Angle Similarity Measure

The measurement obtained from the 16-sensor array can be represented by a 16-dimensional vector. In this representation, the vector angle can therefore be used as a similarity measure. Let $x$ and $y$ be two vectors belonging to $\mathbb{R}^{16}$. The vector angle is given by

$$\theta(x, y) = \arccos\left(\frac{\langle x, y\rangle}{\|x\|\,\|y\|}\right),$$

where $\langle x, y\rangle$ is the inner product between two vectors, and $\|x\|$ is the magnitude of a vector. This is a simple way to measure the similarity of vectors, that is, the closeness of their orientation in space, regardless of their magnitudes. Indeed, vectors belonging to the same category most likely occupy the same region in the high-dimensional space, and therefore the angle between them is expected to be small. Despite its simplicity, this measure has proven its effectiveness with hyperspectral and multispectral images for remote sensing [11, 12], where it is called the "spectral angle" due to the origin of the observation vectors, which represent the spectral reflectance of the soil in the observed area. This measure has also been used effectively in color image segmentation [13, 14]. In addition, using the vector angle measure substantially reduces the problem complexity and represents an efficient dimensionality reduction technique. In fact, dealing with data in a high-dimensional space gives rise to the well-known curse of dimensionality problem, also known as the Hughes phenomenon [7]. Usually the model complexity increases with dimensionality; consequently, the number of parameters increases and the size of the training data set required for a reliable estimation becomes large. In practice, the training data set is limited and the space becomes more and more empty as the dimensionality grows, which implies a loss of accuracy in parameter estimation and therefore a degradation in recognition performance. Using the vector angle is a simple way to overcome this problem.
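As a minimal sketch of the measure described above, the following Python function computes the angle between two measurement vectors using only the standard library. The function name `vector_angle` and the 4-element toy readings are illustrative assumptions (the real array has 16 sensors); the clamping step is a numerical safeguard that is not discussed in the text.

```python
import math

def vector_angle(x, y):
    """Return the angle (radians) between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cosine)

# Hypothetical 4-sensor readings, for illustration only.
reading = [1.0, 2.0, 3.0, 4.0]
stronger = [2.5 * v for v in reading]   # same response pattern, larger magnitude
different = [4.0, 3.0, 2.0, 1.0]        # different response pattern

print(vector_angle(reading, stronger))   # close to zero
print(vector_angle(reading, different))  # clearly larger
```

The first comparison illustrates the magnitude invariance mentioned above: scaling a reading by a common factor leaves its orientation, and hence the angle, essentially unchanged.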

3.2. Vector Angle Approximation

The ultimate goal of our research work is to integrate the sensor array and the pattern recognition system in CMOS technology. Typically, the $\arccos$, which is a nonlinear function, is implemented in hardware using a look-up table. High accuracy requires a large look-up table, which is a real drawback for our proposed pattern recognition system. To overcome this problem, we propose to use an approximation based on Taylor's series expansion. The expansion of the $\arccos$ is given by

$$\arccos(z) = \frac{\pi}{2} - z - \frac{z^{3}}{6} - \frac{3z^{5}}{40} - \cdots, \qquad |z| \le 1.$$

Extensive simulation results show that a first-order approximation is sufficient to keep the same recognition performance as the full expansion. Thus, the following formula is used for the computation of the vector angle:

$$\theta(x, y) \approx \frac{\pi}{2} - \frac{\langle x, y\rangle}{\|x\|\,\|y\|}.$$

In the next section, this measure is used to design the gas recognition system.
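The sketch below compares the exact angle with the first-order approximation, in which $\arccos(c)$ is replaced by $\pi/2 - c$ with $c$ the normalized inner product. One plausible reason the approximation does not hurt recognition is that both expressions are strictly decreasing in $c$, so they rank candidate representatives in the same order. The helper names and the toy vectors are assumptions made for illustration.

```python
import math

def cosine_similarity(x, y):
    """Normalized inner product of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def exact_angle(x, y):
    # Clamp before acos to absorb rounding error.
    return math.acos(max(-1.0, min(1.0, cosine_similarity(x, y))))

def approx_angle(x, y):
    # First-order Taylor approximation: arccos(c) ~ pi/2 - c (no look-up table).
    return math.pi / 2 - cosine_similarity(x, y)

def rank(query, representatives, angle_fn):
    """Indices of the representatives sorted from closest to farthest."""
    return sorted(range(len(representatives)),
                  key=lambda i: angle_fn(query, representatives[i]))

# Hypothetical 3-dimensional vectors, for illustration only.
query = [1.0, 2.0, 3.0]
representatives = [[1.0, 2.0, 2.5], [3.0, 1.0, 0.5], [0.2, 1.0, 3.0]]

exact_order = rank(query, representatives, exact_angle)
approx_order = rank(query, representatives, approx_angle)
print(exact_order, approx_order)
```

The approximate values differ from the true angles, but a nearest-representative decision only depends on the ordering, which is preserved here.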

3.3. The Recognition System

The K-means algorithm [9] is used to recover the potential local clusters in the data structure of each gas class. The choice of the number of clusters is a model selection problem guided by Occam's razor, which states that the simplest model that explains the data is to be preferred [9]. Since no parametric model is associated with the K-means algorithm, model selection techniques such as the Akaike information criterion and the Bayesian information criterion are not applicable. Thus, a large number of clusters is used at the beginning, and small clusters are then eliminated during the clustering process.
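The clustering step can be sketched as follows. This is not the authors' implementation: the deterministic initialization from the first samples, the `min_size` pruning threshold, and the use of the arithmetic mean as centroid are all assumptions chosen to keep the example short and reproducible.

```python
import math

def cos_sim(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def angle_kmeans(data, k_init, min_size=2, max_iter=50):
    """K-means with the vector angle as distance; small clusters are pruned."""
    # Assumed initialization: the first k_init samples (random in practice).
    centroids = [list(v) for v in data[:k_init]]
    for _ in range(max_iter):
        # Assignment: largest cosine similarity means smallest vector angle.
        clusters = [[] for _ in centroids]
        for v in data:
            best = max(range(len(centroids)), key=lambda j: cos_sim(v, centroids[j]))
            clusters[best].append(v)
        # Pruning: drop clusters that are too small, always keeping the largest.
        largest = max(clusters, key=len)
        kept = [c for c in clusters if len(c) >= max(1, min_size)] or [largest]
        new_centroids = [mean_vector(c) for c in kept]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids

# Hypothetical 2-D training vectors for one class: two orientations and one outlier.
data = [[1.0, 0.1], [0.1, 1.0], [1.0, 1.0], [2.0, 0.1],
        [3.0, 0.4], [0.2, 2.0], [0.3, 3.2]]
centroids = angle_kmeans(data, k_init=3, min_size=2)
print(len(centroids), centroids)
```

Starting from three clusters, the singleton cluster formed around the outlier is eliminated and its member is reassigned in the next iteration, so two representatives remain.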

The centroids of the obtained clusters are treated as representatives, which are compared to the candidate gas. The decision follows a simple rule: "a candidate gas is assigned to the class that owns the representative closest to the candidate" (Figure 2).

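Let $C_i = \{c_{i,1}, \ldots, c_{i,K_i}\}$ be the centroids of the class $i$, and $x$ be a candidate gas. The minimum angles between $x$ and the sets $C_i$ are computed as follows:

$$\theta_i = \min_{1 \le k \le K_i} \theta(x, c_{i,k}),$$

where $\theta(\cdot,\cdot)$ denotes the vector angle of Section 3.1. The decision rule is then

$$\hat{\imath} = \arg\min_{i} \theta_i.$$

The performance of this recognition system is investigated in the next section.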
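A minimal sketch of this decision rule is given below. The class labels `gas_A` and `gas_B`, the representative vectors, and the function names are hypothetical and serve only to illustrate the computation of the per-class minimum angle followed by the arg-min over classes.

```python
import math

def vector_angle(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / norms)))

def classify(query, representatives):
    """representatives maps a class label to its list of centroid vectors."""
    # Minimum angle between the query and each class's set of representatives.
    min_angles = {label: min(vector_angle(query, c) for c in cents)
                  for label, cents in representatives.items()}
    # Decision: the class with the smallest minimum angle.
    return min(min_angles, key=min_angles.get), min_angles

# Hypothetical representatives (3-D for readability; the array yields 16-D vectors).
representatives = {
    "gas_A": [[1.0, 2.0, 3.0], [1.0, 2.5, 3.5]],
    "gas_B": [[3.0, 1.0, 0.5]],
}
label, angles = classify([2.1, 4.0, 6.2], representatives)
print(label, angles)
```

With a single cluster per class, as in the experiments reported below, each list holds one vector and the inner minimum becomes trivial.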

3.4. Experiments

To evaluate the proposed recognition system, a large data set of measurements was generated. Different concentrations of 6 categories of gases, namely: CO, , Ethanol, , Ethanol–CO, and , were used, yielding a data set of 432 measurement vectors. The last three, which are mixtures, are considered as separate gases. This data set is partitioned into two parts: for the learning and for the test. The technique proposed in this work is compared with different classical recognition techniques. All these techniques are preceded by a dimensionality reduction step using PCA, while in our case dimensionality reduction is inherently embedded into the classifier, which constitutes a significant advantage. Table 1 reports the average accuracy obtained with an increasing number of retained PCA projection axes. Despite its low complexity, the proposed system performs well compared to the other techniques. In addition, the vector angle-based classifier reduces the dimensionality from the original space to a 1D space without any further preprocessing. Only one cluster per gas category is used in our experiments, so only one representative is needed for each category, which makes for a very low-complexity yet highly efficient recognition system. In other words, our recognition system has an embedded dimensionality reduction procedure: reduction and classification are performed simultaneously, which reduces the complexity of the proposed system. An on-chip implementation of the proposed classifier will therefore take advantage of the small number of parameters that need to be stored, while the computation requirements are kept very low.

4. Robustness Study

The drift phenomenon can be interpreted as a temporal variation of the pattern distribution in the feature space, which can cause a serious robustness issue for the recognition system. Therefore, it is generally necessary to retrain the entire recognition system using measurements affected by the drift in order to maintain the performance of the electronic nose.

Figure 3 illustrates an example of an additive drift, showing the response of the sensor as a function of the concentration of gases periodically injected into the gas chamber in which the sensors are placed. We can note that the baseline response of the sensor is shifted, which complicates the classification problem even further since the learned behavior of the sensor varies with time. The drift has been modelled as where is the sensor output before the drift experiment and was chosen randomly for each sensor [16]. Drift varying between 0 and 30 has been artificially generated. The performance of the recognition was evaluated over the drifted measurements. The identification results for the drifted measurements, using the system trained on nondrifted measurements, are shown in Figure 4 and compared to the recognition performance obtained in [15]. The proposed technique shows a higher accuracy than the one achieved in [15], both with and without retraining the system. It is also important to note that our performance remains nearly invariant as the drift increases. In the worst case, more than of recognition accuracy is obtained. Less than of performance degradation is noted with a drift figure of . These results suggest that using the angle between the query gas and the class representatives tends to be more robust than using other similarity measures. In our case, the class representatives are characteristic signatures of the corresponding gases; therefore, even in the presence of drift effects, the measurement vector of a given query gas is still closer to its class signature than to the other signatures.
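The following sketch illustrates, on toy data, why an angle-based comparison can tolerate drift. The drift model used here is an assumption made purely for illustration and is not taken from the paper or from [16]: each sensor output receives an offset proportional to its pre-drift value, with an independent random factor per sensor. The signatures and function names are likewise hypothetical.

```python
import math
import random

def vector_angle(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / norms)))

def apply_drift(x, level, rng):
    # ASSUMED model, for illustration only: sensor i becomes x_i * (1 + level * r_i),
    # with r_i drawn uniformly from [0, 1] independently for each sensor.
    return [v * (1.0 + level * rng.random()) for v in x]

rng = random.Random(0)  # fixed seed so the run is repeatable
signature_a = [1.0, 2.0, 3.0, 4.0]  # hypothetical class signatures
signature_b = [4.0, 3.0, 2.0, 1.0]

for level in (0.0, 0.1, 0.2, 0.3):
    drifted = apply_drift(signature_a, level, rng)
    to_own = vector_angle(drifted, signature_a)
    to_other = vector_angle(drifted, signature_b)
    print(f"drift level {level:.1f}: own class {to_own:.3f} rad, other class {to_other:.3f} rad")
```

Under this assumed model, the drifted vector rotates only slightly away from its own signature, while the angle to the other signature stays much larger, so a nearest-signature decision would be unchanged. A drift component common to all sensors would leave the angle unaffected altogether.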

5. Conclusion

In this paper, a novel gas recognition algorithm based on a vector angle similarity measure is proposed. The K-means algorithm is adapted to recover the local data structures in the training sets of the different classes by starting with a large number of clusters and eliminating small clusters during the clustering process.

The results show the effectiveness of the proposed technique compared to existing methods, even though it does not require any preprocessing. In fact, the dimensionality reduction embedded in the vector angle similarity measure avoids the additional preprocessing operations required by other methods.

In addition, the proposed technique shows very high robustness against the drift phenomenon without any additional retraining. The small number of parameters and the low complexity of the required computations suggest that the recognition system can be implemented efficiently on-chip. Work is underway to integrate the proposed algorithms with the sensor array to enable a single-chip electronic nose microsystem.

Acknowledgment

This research is supported by an Emerging High Impact area grant from HKUST (Ref. HIA05/06.EG03).