PCA/SVM-Based Method for Pattern Detection in a Multisensor System
This paper presents a multivariate analysis framework for pattern detection in a multisensor system. Although principal component analysis (PCA) and support vector machines (SVMs) are commonly used in pattern recognition, an effective PCA/SVM methodology for multisensor systems remains unexplored. Pattern detection in a multisensor system has long been a challenge. For example, object inspections in multisensor systems are difficult to perform because inspectors might fail to use multiple sensing devices when concurrently detecting different patterns. To resolve this issue, this study proposes a novel framework for establishing indicators and corresponding thresholds to identify patterns in the system; it employs a feature-based scheme that integrates PCA with an SVM. Experiments were conducted using a tactile and optical measurement system. The experimental results demonstrated that the proposed method can effectively identify various patterns in multisensor systems. Moreover, the proposed framework established alarm indicators and corresponding thresholds that can be used for pattern detection.
Multisensor systems have been widely used in various industrial applications, and tactile and optical measurements are often present in such systems. Signal processing methods have become a major priority because the data measured by multiple sensors contain imperfections in the form of imprecision, uncertainty (noise), and incoherence (irrelevant information). Pattern detection in multisensor systems has long been a challenge. The current study proposes a framework that uses indicators and the corresponding thresholds to identify patterns in the system. The proposed method analyzes the multivariate signals and derives the thresholds for detecting different patterns. The principal component analysis- (PCA-) integrated feature-based method can handle imperfections in the data from multiple sensors.
Although PCA and the support vector machine (SVM) are commonly used in pattern recognition, an effective PCA/SVM methodology designed specifically for multisensor systems remains unexplored. Object inspections in multisensor systems are difficult to perform because inspectors might fail to use multiple sensing devices when concurrently detecting different patterns. Therefore, this paper presents a pattern detection methodology for multisensor systems based on PCA/SVM algorithms. The contributions of this study are summarized as follows. The method employs a PCA/SVM scheme for effectively detecting patterns in a tactile and optical measurement system. The system can determine the ideal number of detection variables in a processing framework to detect patterns in data that contain imperfections. Finally, the dimensionality reduction technique can establish alarm indicators and corresponding thresholds to identify patterns in the multisensor system.
The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 presents the multivariate processing framework and the proposed method for detecting samples. Section 4 presents the experimental results derived from applying the detection system to various samples and a comparison of various existing methods. The final section provides the conclusions of this study.
2. Related Work
This section describes previous PCA-based schemes related to the proposed method and finally addresses differences between the proposed method and the previous approaches.
Recent studies have investigated PCA-based methods for fault detection and feature extraction. Researchers have tried to manage imperfections in multivariate sensed data by using various PCA-based approaches. Hu et al. employed PCA to evaluate the sensitivity of chiller sensor fault detection and used the same PCA model to show that each sensor had a different level of fault detection sensitivity. Wang et al. proposed a PCA-based optimal sensor selection method for the condition monitoring of a distributed power generation system involving wind turbines. Xu et al. presented a Bayesian wavelet PCA methodology for turbomachinery damage diagnosis under uncertainty. Sattar et al. analyzed acoustic big data from ocean environments for fish sounds; their method involved multiresolution acoustic feature extraction and robust PCA-based feature selection for effectively classifying three types of fish sounds. PCA-based approaches associated with an SVM have been proposed as highly effective for classifying samples. Naveenkumar and Vadivel applied PCA techniques to reduce the dimensions of depth motion map (DMM) features and presented a PCA-based DMM feature fused with SVMs for human action recognition. Gao and Hou proposed an SVM-integrated PCA scheme for fault diagnosis and compared it with other related diagnostic methods to demonstrate its effectiveness.
The PCA-based method proposed in this work is similar to these published PCA-based methods; it especially resembles the schemes that combined PCA with SVMs. However, this study proposes a feature-based scheme that integrates PCA with an SVM classifier; the proposed method uses indicators and corresponding thresholds for pattern detection in a multisensor system. The PCA/SVM-based method involves PCA-based data selection and image feature extraction for SVM classification; this method can be used to solve the detection problems inherent in imprecise, uncertain, and incoherent data from multiple sensors. The present study quantitatively compared existing pattern detection methods.
3. Proposed Method
This section describes the PCA-based method and then introduces the SVM-based algorithm and the system (including the facility and instrumentation) for pattern detection in a multisensor system.
3.1. PCA-Based Method
This section describes the PCA-based approach for the selection of suitable detection variables for effectively detecting patterns.
PCA is a common statistical method used to analyze the inherent structure of data. It reduces the data dimensionality by rotating the coordinate axes. The proposed PCA involved an eigenvalue decomposition that produced eigenvalues and eigenvectors representing the amount of variation in the sensed data. A set of correlated high-dimensional sensed data was transformed into a set of uncorrelated lower-dimensional data, and the PCA minimized the squared reconstruction error in this dimensionality reduction. The matrix $X$ consisted of the sensed data with $n$ samples (rows) on $m$ process variables (columns) and can be expressed as $X = [x_1, x_2, \ldots, x_n]^T$, where $x_i$ is the $i$th normalized sample column vector. The covariance matrix $S$, which demonstrated correlation, can be expressed as $S = \frac{1}{n-1} X^T X$. After the eigenvalue decomposition of $S$, a sample $x$ can be expressed as $x = \hat{x} + \tilde{x}$, where $\hat{x}$ and $\tilde{x}$ are the projection vectors of $x$ onto the principal component subspace and the residual subspace, respectively. Then, $\hat{x}$ was used to detect patterns and can be defined as $\hat{x} = P P^T x$, where $P$ contains the first $k$ columns of the eigenvector matrix, and its column vectors correspond to the nonnegative real eigenvalues. The corresponding eigenvalues in decreasing magnitude are expressed as $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k \geq 0$. An indicator established from the retained principal components was employed to determine whether the sensed data belong to the pattern.
Figure 1 displays a flowchart of the PCA-based approach. Each sensed data set has 1024 samples. To train the PCA model, 512 samples from each data set were randomly selected as training samples, and the remaining samples were used to evaluate the accuracy of the PCA-based selection. The proposed approach applies the following steps to select suitable variables for detection with multiple sensors.
Step 1. Input the sensed data from the detection sensors.
Step 2. Normalize the data with zero mean and unit variance.
Step 3. Employ the scree test to determine the optimal principal components after the eigenvalue decomposition of the normalized data. The scree test uses a binary selector, Scr, to reduce the number of principal components and determine the optimal components on the basis of the eigenvalue of each component and the communalities for each process variable. If the eigenvalue exceeds 1 and the communality is greater than 0.5, then Scr is equal to 1 and the component is retained; otherwise, Scr = 0 and the component is eliminated. Generally, components with high eigenvalues should be retained and those with low eigenvalues eliminated, but the boundary between high and low is not clear-cut. The scree test approach is appropriate for general models and may not be too restrictive for models with many variables.
Step 4. Establish the indicator on the basis of the optimal principal components.
Step 5. Classify the indicator by using an SVM based on the reference indicators obtained from the PCA model training.
Each reference indicator was obtained from the PCA model training, and an index taking the values 0, 1, 2, and 3 labeled the indicators. Each indicator had 512 samples for each sensed data set in the training. The training procedure can be summarized as follows: select the operational data; normalize the data into training data; determine the optimal principal components from the training data; and establish the indicator on the basis of the optimal principal components.
Step 6. Terminate the process and obtain detection results.
For example, suppose that Step 1 tests new samples with four process variables. Step 2 normalizes the samples. Step 3 determines the optimal principal components on the basis of the scree test. Step 4 establishes the indicator from these components. Step 5 then classifies the indicator by using the SVM. Step 6 then stops the process and obtains the detection result, class D. In this example, the algorithm obtains suitable variables and activates an alarm for a pattern that transits from class C to class D.
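Steps 1 to 4 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the data are synthetic, the function names `fit_pca_selection` and `project` are hypothetical, components are retained when their eigenvalue exceeds 1, and the communalities are only reported. The projected scores stand in for the indicator, whose exact definition is not reproduced here.

```python
import numpy as np


def fit_pca_selection(X_train, eig_min=1.0):
    """Fit normalization and PCA on training data (rows: samples, columns: variables)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std                     # Step 2: zero mean, unit variance
    S = Z.T @ Z / (Z.shape[0] - 1)                 # covariance of the normalized data
    eigvals, eigvecs = np.linalg.eigh(S)           # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # re-sort to decreasing magnitude
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > eig_min                       # assumed retention rule for Step 3
    P = eigvecs[:, keep]                           # retained eigenvectors
    loadings = P * np.sqrt(eigvals[keep])
    communalities = (loadings ** 2).sum(axis=1)    # one value per process variable
    return {"mean": mean, "std": std, "P": P,
            "eigvals": eigvals, "communalities": communalities}


def project(model, X):
    """Normalize new samples with the training statistics and project them onto P."""
    Z = (X - model["mean"]) / model["std"]
    return Z @ model["P"]


# Synthetic stand-in for one sensed data set: 1024 samples, 4 correlated variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1024, 1))
X = latent @ np.array([[1.0, 0.9, -0.8, 0.7]]) + 0.3 * rng.normal(size=(1024, 4))

idx = rng.permutation(1024)
train, rest = X[idx[:512]], X[idx[512:]]           # 512 random training samples
model = fit_pca_selection(train)
scores = project(model, rest)                      # inputs for a later classifier
print("retained components:", model["P"].shape[1])
print("communalities:", np.round(model["communalities"], 2))
```

Fitting the normalization statistics on the training half only, and reusing them for the remaining samples, keeps the evaluation samples out of the model training.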
3.2. EFD-Based Algorithm and SVM Classifier
This section describes the edge feature description- (EFD-) based algorithm, including adaptive region-growing (ARG) segmentation and EFD-based extraction. The ARG segmentation uses a set of initial seeds and adaptively groups neighboring pixels into growth regions. For inclusion in one of the regions, a pixel must be eight-connected to at least one pixel in that region. Regions are merged if a pixel is connected to more than one region. EFD-based extraction applies an edge feature description to the ARG-segmented image. Edge pixels in the segmented images typically belong to one of eight possible edge patterns. After all pixels in an image have been processed, the edge is classified using edge feature vectors. The edge descriptor from the feature description is a vector of seven coefficients of the normalized edge numbers, each ranging from 0 to 1: one coefficient each for Edge Patterns 1, 3, 5, 6, 7, and 8, and one coefficient for the normalized edge numbers from Edge Pattern 2 and Edge Pattern 4 combined.
As shown in Figure 2, the EFD-based algorithm applies the following steps to adaptively obtain suitable image parameters for the pattern detection.
Step 1. A total of 200 sample image patterns are tested for a given number of descriptors.
Step 2. The ARG segmentation is implemented using a set of initial seeds and groups neighboring pixels with each initial seed within the growth regions.
Step 3. EFD-based extraction is implemented on the basis of the given number of descriptors and extracts features as edge descriptors from the segmented image.
Step 4. The SVMs classify images.
Step 5. The recognition rate is determined for the given image.
The recognition rate is defined as $R = N_c / N_t$, where $N_c$ is the number of accurately classified images in the test run and $N_t$ is the total number of test sets ($N_t$ is 200 in this case). If the recognition rate exceeds a given threshold value, then Step 6 commences; otherwise, Steps 2–5 are repeated.
Step 6. The process stops when the sample images in all cases of the given number of descriptors have been tested; otherwise, Steps 2–5 are repeated. In addition, the algorithm stops when any segmented image fails to satisfy the condition in Step 5.
For example, suppose Step 1 inputs the sample image pattern with a set of candidate descriptors. Step 2 implements the ARG segmentation. Step 3 employs an EFD-based scheme to extract features, which are classified using an SVM in Step 4. When the threshold recognition rate is 0.9 (i.e., a 90% accuracy rate), Steps 2–5 are repeated until the recognition rate exceeds 0.9. Step 6 determines whether to stop the process. Thus, the method automatically obtains suitable descriptors for effectively detecting patterns.
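The seeded, eight-connected grouping in Step 2 can be illustrated with a simplified region-growing sketch. It is not the ARG algorithm itself: the acceptance criterion (intensity within a fixed tolerance `tol` of the seed intensity) and the function name `region_grow` are assumptions made for illustration, and region merging is reduced to letting an earlier region absorb any later seed that it already covers.

```python
from collections import deque

import numpy as np


def region_grow(image, seeds, tol=0.1):
    """Label pixels reachable from each seed through 8-connected neighbors.

    A neighbor joins a region when its intensity differs from the seed
    intensity by at most `tol` (an assumed criterion, not the paper's).
    """
    h, w = image.shape
    labels = np.zeros((h, w), dtype=int)           # 0 means unassigned
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]              # the 8 neighbors of a pixel
    for label, (sr, sc) in enumerate(seeds, start=1):
        if labels[sr, sc]:
            continue                               # seed already absorbed earlier
        labels[sr, sc] = label
        queue = deque([(sr, sc)])
        while queue:
            r, c = queue.popleft()
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < h and 0 <= nc < w
                if inside and not labels[nr, nc]:
                    if abs(image[nr, nc] - image[sr, sc]) <= tol:
                        labels[nr, nc] = label
                        queue.append((nr, nc))
    return labels


# Toy image normalized to [0, 1]: a bright diagonal stroke on a dark background.
image = np.zeros((5, 5))
image[0, 0] = image[1, 1] = image[2, 2] = 1.0
labels = region_grow(image, seeds=[(0, 0)])
print(labels)
```

The diagonal stroke is recovered only because diagonal neighbors count as connected; with four-connectivity the region would stop at the seed pixel.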
Consider a set of feature values belonging to two separate classes, with inputs $x_i \in \mathbb{R}^d$ ($d$-dimensional input space) and class labels (target outputs) $y_i \in \{-1, +1\}$. An SVM is employed to implement the classification. This SVM constructs a hyperplane as a decision surface that maximizes the margin of separation between positive and negative examples. For this SVM classifier, two parameters must be optimized, namely, the penalty parameter $C$ and the radial basis function kernel parameter $\gamma$. The parameter $C$ is a user-specified positive parameter that controls the trade-off between the complexity of the SVM and the number of misclassified training samples.
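For reference, the role of the two parameters can be seen in the standard soft-margin formulation of an SVM with a radial basis function kernel. The notation below is the textbook form and is an assumption about the symbols used in this study: $C$ denotes the penalty parameter, $\xi_i$ the slack variables, $\phi$ the feature map, and $N$ the number of training samples.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i
\quad\text{subject to}\quad
y_i\bigl(w^{T}\phi(x_i)+b\bigr)\ge 1-\xi_i,\qquad \xi_i\ge 0,
\\[4pt]
K(x_i,x_j)=\phi(x_i)^{T}\phi(x_j)=\exp\bigl(-\gamma\lVert x_i-x_j\rVert^{2}\bigr).
```

A larger $C$ penalizes margin violations more heavily, whereas a larger $\gamma$ narrows the kernel and makes the decision surface more local.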
This study adopted the hold-out procedure for determining the two parameters; in this procedure, samples were divided into training samples, on which the classifiers were trained, and other samples, on which the classifier accuracy was tested. Table 1 lists the classes of the pattern samples used in the SVM model. The SVM data comprised 300 sensed data sets (150 sets from vibration signals; 150 sets from fluctuation signals) and 280 pattern images. To train the SVM model, 100 data sets (50 sets from vibration signals; 50 sets from fluctuation signals) and 80 images were randomly selected as the training samples, and the remaining data were used for evaluating the SVM classifier accuracy. Table 2 lists the training results with different classes of samples. The SVM was tested at various combinations of the two parameters, and Figure 3 presents the testing accuracy at these combinations. High testing accuracy was realized when the combination $C = 2^{11}$, $\gamma = 2^{-5}$ was used in the SVM for both the sensed data sets and the pattern images. Table 3 lists the classification results for different sample sizes, indicating that sample sizes are apparently unrelated to the classification results.
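The hold-out search over the two parameters can be sketched with scikit-learn as follows. The data are synthetic stand-ins rather than the measured signals, the exponent grids are assumptions chosen to include $2^{11}$ and $2^{-5}$, and the helper name `holdout_grid_search` is hypothetical; only the 100/200 split of 300 data sets mirrors the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def holdout_grid_search(X, y, c_exponents, gamma_exponents, train_size=100, seed=0):
    """Return (best_accuracy, best_C, best_gamma) from a single hold-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=train_size, random_state=seed, stratify=y
    )
    best = (-1.0, None, None)
    for c_exp in c_exponents:
        for g_exp in gamma_exponents:
            C, gamma = 2.0 ** c_exp, 2.0 ** g_exp
            clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train, y_train)
            acc = clf.score(X_test, y_test)        # accuracy on held-out samples
            if acc > best[0]:
                best = (acc, C, gamma)
    return best


# Synthetic stand-in data (not the paper's measurements): two separated clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (150, 3)), rng.normal(4.0, 1.0, (150, 3))])
y = np.array([0] * 150 + [1] * 150)

best_acc, best_C, best_gamma = holdout_grid_search(
    X, y, c_exponents=range(-5, 16, 4), gamma_exponents=range(-15, 4, 2)
)
print(f"hold-out accuracy={best_acc:.3f} at C={best_C}, gamma={best_gamma}")
```

A single hold-out split is cheaper than cross-validation but more sensitive to the particular random split, which is one reason to fix the random seed when comparing parameter combinations.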
3.3. Detection System
This section describes the development of the proposed multisensory system. As an example of tactile and optical measurements in the system, pattern detection was tested on measurements of a cylinder in a water tunnel.
A detection system was designed and built for this study (Figure 4). At the entrance of the test section, a pitot tube was used to monitor the inlet flow velocity. Table 1 lists the classes of pattern samples used in the experiments. A 16 mm diameter cylinder was equipped with two accelerometers, both of which had a sensitivity of 4.693 mV/g and a frequency response in the range from 2 to 8 kHz. A steel wire under tension was used to regulate the natural frequency of the cylinder, which was 22 Hz in the test. The accelerometers were installed inside the cylinder at the middle of the span to measure the strength of the cylinder vibrations. A hot-film anemometer with a hot-film probe measured the corresponding flow velocity fluctuations. The accelerometers and the anemometer forwarded their signals to an industrial computer, which converted these signals to obtain displacements and velocity spectra. To obtain image signals, a fluorescent dye was added to the water and the industrial computer triggered a charge-coupled device (CCD) camera through Wi-Fi to acquire synchronous images. The synchronous image information was transmitted from the CCD camera to the industrial computer, which captured the synchronous images with a frame grabber.
Figure 5 displays a schematic of the processing framework. The framework enables the sensing and identification of patterns by employing a suitable processing strategy based on the sensing results. The pattern detection procedure can be summarized as follows. First, the signals from the multiple sensors are input. The input image signals are converted to 1024 × 768 pixel images with an 8-bit gray level and normalized to values between 0 and 1 (image preprocessing), and a value of 1 is set for the initial seeds in the image; an ARG-based algorithm then uses the initial seeds to group neighboring pixels. The vibration and fluctuation signals are normalized with zero mean and unit variance (data translation). Next, the feature extraction methods are executed: PCA-based selection for the vibration/fluctuation data, PCA-based selection for the images, and ARG-EFD-based feature selection for the images. This stage involves PCA-based selection with PCA model training, ARG image segmentation, the establishment of indicators on the basis of the optimal principal components, and EFD-based extraction. The SVM then classifies the images and the indicators, the latter on the basis of the values obtained from PCA model training. Finally, the optimal feature extraction methods are identified on the basis of the time-cost function and accuracy rates, and the detection results are obtained; in this case, the optimal methods were PCA-based selection for the vibration/fluctuation data and ARG-EFD-based feature selection for the images. Samples from multiple sensors (including the 200 data sets and the 200 synchronous images for each class listed in Table 1) were used for pattern detection. For detecting class D, the indicator and edge descriptors were set to the values obtained from the PCA-based and the ARG-EFD-based algorithms, respectively, for class D samples; the SVM classification of both showed that the samples belonged to class D. The procedure was complete for a given sample when no data set or image remained in the data/image queue. The indicator and edge descriptors were then reset, and other samples were detected similarly. The detection was complete for all samples when the data/image queue was empty.
4. Experimental Results and Discussion
This section describes the general classification results obtained using the proposed multivariate processing framework to detect patterns in a multisensor system. Experiments were conducted to examine the accuracy and performance of the proposed method. The major results from each experiment included pattern detections, effectiveness of the proposed framework, algorithm accuracy, and algorithm performance. The results revealed that the proposed algorithm can be used as an alarm indicator for pattern detection in the tactile and optical measurement system. The proposed framework, which employs a feature-based PCA algorithm and incorporates SVM classification, was able to classify patterns in the multisensory system effectively and detect patterns in sensed data containing noise and irrelevant information. Moreover, this study determined that the proposed algorithm outperformed existing methods.
4.1. General Classification Results Obtained Using the PCA/SVM-Based Method
This section describes the general classification test conducted to verify the applicability of the PCA/SVM-based method. As displayed in Figure 6, the experimental setup included hex key samples, a vibration trigger, a vibration sensor, a CCD camera, and an industrial computer. Data regarding the test samples are listed in Table 4. The samples were selected from the 200 validation samples for each class. One of the samples, installed on a black steel plate, was tested for the classification. As in the detection system, a steel wire under tension was used to regulate the natural frequency of the plate, which was 17 Hz in the test. To produce vibrations, signals were transmitted from the industrial computer to the vibration trigger through Bluetooth, and the trigger then induced vibrations (at 17 Hz) to shake the plate. To obtain vibration signals, an accelerometer in the vibration sensor measured the strength of the vibrations and forwarded the signals through Wi-Fi to the industrial computer, which converted them to displacements. To obtain image signals, the industrial computer triggered the CCD camera through Wi-Fi to acquire synchronous images of the hex key samples. The synchronous image information was transmitted from the CCD camera to the industrial computer, which was equipped with a frame grabber for capturing the hex key images.
The vibration sensor obtained two process variables (signals in two directions). The PCA-based method normalized the signals and obtained the principal components after the eigenvalue decomposition of the normalized data. The indicator was then established for each corresponding class. The vibration behavior was further examined by a spectral analysis of the vibrations. Figure 7 displays the amplitude spectra of the plate. The dominant frequencies and indicators are listed in Table 5. The optimal principal component was selected, and the corresponding indicator was established for SVM classification. Figure 8 shows an example of the hex key image segmentation using the feature-based algorithm with the selected descriptors. These descriptors were the optimal selections in the classification because they yielded the highest average accuracy rate of 98% (Table 6). Table 7 lists the classification results obtained using the PCA/SVM-based method (a feature-based scheme that integrated PCA with an SVM classifier) and indicates an average accuracy rate of 98%.
This section describes a quantitative comparison of the detection methods. The detection methods employed multiple sensors, namely, a tactile sensor (a vibration sensor) and an optical sensor (a CCD camera). Three learning classifiers, based on the Bayes classifier (BC), artificial neural network (ANN), and $k$-nearest neighbor (KNN) methods, were compared with the proposed method. As before, 100 data sets and 80 images were randomly selected as training samples, and the other 200 sets and 200 images were used as validation samples for evaluating the performance of the detection methods. In the KNN classification, the input data were the features derived from the data sets and from the images, and the output was the class membership. The input data were classified by a majority vote of their $k = 5$ nearest neighbors. The ANN was a three-layer neural network with three neurons and four neurons in the input layer, four corresponding neurons in the hidden layer, and three corresponding neurons in the output layer. The BC assigned a feature vector, comprising the features derived from the input data sets and those derived from an input image, to a class on the basis of the mean vectors and the covariance matrix of the coefficients derived for each class; an input was identified as the class that minimized the resulting decision value. Figure 9 presents the experimental accuracy rates. The accuracy rate for the system employing multiple sensors was higher than that for a system employing other sensors because the PCA/SVM-based method was able to handle imperfections in the detection, especially for multiple sensors. The accuracy rates for employing multiple sensors were 98% for the SVM, 92% for the KNN, 90% for the ANN, and 89% for the BC, demonstrating that the SVM had the greatest accuracy. This may be because the SVM mapped the input data sets into a high-dimensional feature space and maximized the margin between two classes in the feature space. The high-dimensional feature space was suitable for a data set with only a small number of training samples.
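The KNN baseline with a majority vote over five neighbors can be sketched in plain Python as follows. The two-dimensional feature vectors and the class labels "C" and "D" are hypothetical toy values, not the features used in the experiments, and the Euclidean distance is an assumed choice of metric.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Classify `query` by a majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance with its label, then sort.
    ranked = sorted((math.dist(x, query), label) for x, label in zip(train_X, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional features for two classes.
train_X = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (-0.3, 0.2), (0.4, -0.4),
           (5.0, 5.0), (5.4, 4.8), (4.7, 5.3), (5.2, 5.5), (4.6, 4.6)]
train_y = ["C"] * 5 + ["D"] * 5

near_d = knn_predict(train_X, train_y, (4.8, 5.0))
near_c = knn_predict(train_X, train_y, (0.2, 0.1))
print(near_d, near_c)
```

An odd value of k avoids tied votes in two-class problems, which is one common reason for choosing k = 5.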
4.2. Pattern Detection in a Multisensor System
This section describes the test conducted to verify the applicability of the multisensor system. As displayed in Figure 4, the experimental setup included two accelerometers, a hot-film probe, a CCD camera, and an industrial computer. Data regarding the test patterns are listed in Table 1. The patterns were selected from the 200 validation samples for each class. During detection, an accelerometer measured the strength of vibrations and a hot-film anemometer measured the corresponding flow velocity fluctuations. The sensors forwarded their signals through Wi-Fi to the industrial computer, which converted the signals to displacements and velocity spectra. To characterize the magnitude of the cylinder vibration, a dimensionless parameter can be introduced as $A_{\mathrm{rms}}/D$, where $A_{\mathrm{rms}}$ is the root mean square (r.m.s.) value of the cylinder’s amplitude, obtained from the r.m.s. values of the cylinder displacements in the two measured directions, and $D$ is the diameter of the cylinder. To obtain image signals, the industrial computer triggered a CCD camera through Wi-Fi to acquire synchronous images.
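A dimensionless vibration amplitude of this kind can be computed from two displacement signals as sketched below. The way the two directional r.m.s. values are combined (a Euclidean norm) and the function name `amplitude_ratio_percent` are assumptions, since the combining formula is not reproduced in this text; the signals are synthetic 22 Hz sinusoids, and only the 16 mm diameter and the 10% level for large-amplitude vibrations come from the paper.

```python
import numpy as np


def amplitude_ratio_percent(disp_a, disp_b, diameter):
    """Return the r.m.s. vibration amplitude as a percentage of the diameter.

    disp_a and disp_b are displacement signals in two directions, in the same
    length unit as `diameter`. Combining them with a Euclidean norm is an
    assumption made for this sketch.
    """
    rms_a = np.sqrt(np.mean((disp_a - disp_a.mean()) ** 2))
    rms_b = np.sqrt(np.mean((disp_b - disp_b.mean()) ** 2))
    return 100.0 * np.hypot(rms_a, rms_b) / diameter


# Synthetic displacements in mm: one second sampled at 1024 Hz, 22 Hz vibration.
t = np.arange(1024) / 1024.0
disp_a = 0.5 * np.sin(2 * np.pi * 22 * t + 0.3)
disp_b = 2.5 * np.sin(2 * np.pi * 22 * t)

ratio = amplitude_ratio_percent(disp_a, disp_b, diameter=16.0)
is_large_amplitude = ratio > 10.0                  # 10% level taken from the text
print(f"amplitude ratio = {ratio:.2f}%, large amplitude: {is_large_amplitude}")
```

Expressing the amplitude relative to the cylinder diameter makes the 10% level independent of the measurement units.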
During detection, the vibration sensor and hot-film anemometer measured the vibrations and velocity fluctuations, respectively. The PCA-based method normalized the vibration signals (in the two measured directions) and the fluctuation signals and obtained the principal components after the eigenvalue decomposition. An indicator was established for each corresponding class. Table 8 lists the vibration amplitudes, dominant frequencies, and indicators in the detection; the vibration amplitudes are expressed as percentages for each class. Class D, with large-amplitude vibrations (more than 10%), was detected; the corresponding threshold velocity (at which patterns transited from class C to class D) was 1.75 m/s. The vibration behavior was further examined in terms of the dominant frequencies for each class. Class D caused serious vibration of the cylinder when asymmetric vortices were shed from the structure and the vortex shedding frequency matched the cylinder’s natural frequency (22 Hz). Furthermore, the optimal principal components (those with eigenvalues greater than 1) were determined for each class, and the alarm indicator was established in the experiment. Figure 10 shows the image segmentations using the feature-based algorithm with the selected descriptors. These descriptors were the optimal selections in the classification because they produced detailed and continuous contours for the pattern samples. Table 9 lists the classification results obtained using the PCA/SVM-based method; it indicates an average accuracy rate of 98%.
To evaluate the method, a time-cost function approximately quantified the amount of time required by an algorithm used in binary search tree operations; it was described by $T(n) = O(\log n)$, which bounds the logarithmic time required by an algorithm for all $n$-sized inputs in big-$O$ notation, excluding coefficients and lower-order terms. Figure 11 presents the time-cost function and classification accuracy rates in the detection. The accuracy rates were 91% for the tactile sensor, 96% for the optical sensor, and 98% for the multiple sensor system. The time-cost function for the optical sensor was lower than those for the other two sensor systems because the feature-based optical sensor directly used edge descriptors in the segmented image to recognize the pattern samples. For time costs lower than 30 μs, the PCA/SVM-based method employing a multiple sensor system outperformed the methods employing other sensor systems.
This paper proposes a feature-based PCA algorithm for effectively identifying patterns in a multisensor system. The PCA/SVM-based method can be used to solve the detection problem that arises when the data from multiple sensors contain imperfections in the form of imprecision, uncertainty, and incoherence. The system combined the PCA/feature-based algorithm with an SVM for detecting patterns. The results demonstrated that the proposed algorithm could reduce the number of detection variables in the multivariate processing framework and effectively detect patterns in the multisensor system. The framework was able to establish alarm indicators and the corresponding thresholds for identifying patterns in the system. The detection system applied suitable detection parameters to attain an average recognition rate of 98%. The proposed method outperformed the existing methods, and the multisensor configuration outperformed the single-sensor configurations (a tactile or an optical sensor).
Conflicts of Interest
The author has no conflicts of interest to declare regarding the publication of this paper.
This work was supported by Shih Chien University under Project 106-05-05002-03.
M. Naveenkumar and A. Vadivel, “3-D projected PCA based DMM feature fusing with SMO-SVM for human action recognition,” Procedia Computer Science, vol. 89, pp. 759–763, 2016.
J. E. Jackson, A User’s Guide to Principal Components, John Wiley & Sons, New York, NY, USA, 2004.