Special Issue: Sensing and Intelligent Perception in Robotic Applications
Novel SGH Recognition Algorithm Based Robot Binocular Vision System for Sorting Process
To achieve automatic sorting of commodity trademarks, a binocular vision system is constructed in this paper. By adjusting the camera pose, the system obtains a wider shooting perspective. To improve sorting accuracy, a new SGH recognition method is proposed. SGH combines the spatial color histogram (S feature), the gray-level cooccurrence matrix (G feature), and Hu moments (H feature), which represent the color feature, the texture feature, and the shape feature, respectively. A similarity judgment function is built from SGH. The experimental results show that the SGH algorithm achieves higher recognition accuracy than recognition methods based on a single feature.
1. Introduction
In an automatic commodity sorting system, the robot executes a recognition algorithm to judge the type of trademark from the image information acquired by the vision sensor. To improve sorting efficiency, two aspects should be improved: the camera sensor should be flexibly adjustable and have a large field of view, and the recognition algorithm should have high accuracy.
Machine vision recognition is usually implemented in several steps: image acquisition, image preprocessing, image segmentation, feature extraction, and recognition classification. The purpose of studying new visual sensing systems is to obtain a larger shooting range and a higher image definition, which lays a good foundation for the subsequent image processing steps. Among the steps that follow image acquisition, feature extraction has the greatest influence on the performance of the recognition algorithm.
In machine vision recognition, the color feature is widely used. Zhou and Ruan proposed a recognition approach for gesture images that constructs and fills the color space of the hand-skin image and then applies support vector machine classification. Binder and Kawanable separated the original image into subimages under multiple color channels, built a collection of SIFT features in each subimage with the help of the color histogram, and completed the recognition by fusing these SIFT features. S. Ghosh and A. Ghosh applied machine vision recognition to the comparison of medical image data. Color segmentation was performed in the RGB color space of the images to be recognized, and the features in the lesion area were restored with pseudocolor, which improved the efficiency and accuracy of pathological diagnosis.
Apart from the color feature, the texture feature is also frequently used. Lira et al. proposed a supervised learning technique for texture feature recognition. The texture features were calculated from self-designed computing moments and the static statistical features of the image, and the resulting texture information was input into a Bayes classifier for recognition and comparison. In Wang and Yau's study on facial gender recognition, a self-adapted texture detection method focusing on moustache texture features was designed. Combined with color degradation and shadow segmentation, the method achieved more than 93% accuracy in gender recognition.
The shape feature of the image is widely used in existing methods as well. Whytock et al. built a Poisson distance approximation function to determine the shape edges in the image and then extract the shape features. The recognition was accomplished by comparing the extracted features with a shape feature library. Tan et al. built an eigenvalue matrix to describe the shape feature and applied it to the automatic recognition of flower blossoms. The experimental results showed that the accuracy of the method is above 80%.
In this paper, we carry out the recognition on trademarks with the help of a binocular camera sensor. Trademark images have different features. Some have obvious color features, some have rich texture features, and some have significant shape features. Therefore, when designing the trademark recognition algorithm, we fuse the three types of features and use the images taken by the binocular camera sensor to achieve a higher recognition accuracy and efficiency.
2. Composition of the Hardware System
2.1. Binocular Camera Sensor
To obtain a larger shooting range, we design a binocular camera sensor whose composition is shown in Figure 1.
The sensor contains two cameras, which are fixed on a bracket that can rotate and pitch with the movement of the tripod base. A light source is arranged between the two cameras and can be switched on or off according to the scene illumination conditions. Because of the spacing between the cameras on the bracket, the two cameras together cover a larger field of view. With its adjustable pose and larger view, the binocular camera sensor is suitable for the robot sorting process.
2.2. Recognition System Based on the Binocular Camera Sensor
We set up the machine vision recognition system on the basis of the binocular camera sensor, which is shown in Figure 2.
First, the binocular camera sensor obtains a satisfactory shooting view by adjusting the poses of the two cameras. Then, the captured images are transmitted to the computer via the image capture card. Finally, the computer executes the recognition algorithm to process and analyze the images and outputs the recognition result.
3. SGH Recognition Algorithm
Images captured by the camera often contain various kinds of feature information, so it is usually hard to guarantee robustness when a single feature is used for recognition. For that reason, we combine the color feature, the texture feature, and the shape feature in the recognition process.
3.1. S Feature
The color histogram is a significant feature to indicate the color information of the images and it is also the most common color feature used in image retrieval. Histograms like the color histogram, the overall histogram, the fuzzy histogram, and the accumulated histogram have been widely used in the existing recognition algorithms. However, these methods only focus on the statistical information of the color and ignore the relation between the color feature and the spatial position of the pixel. The spatial color histogram is a combination of the color and the space information which is adopted in our recognition algorithm.
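Let the image be denoted by $I$ and its size by $M \times N$. For each color bin $k$, we define a new intermediate descriptive variable, which is indicated as
$$h_k = \frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}\delta\bigl(I(x,y)-k\bigr). \tag{1}$$
In (1), $I(x,y)$ denotes the color of the pixel $(x,y)$ of the image, $\delta(\cdot)$ equals 1 when its argument is zero and 0 otherwise, and $h_k$ has been normalized and indicates the color histogram information. Another two intermediate descriptive variables are defined as
$$\mu_k = \frac{1}{MN\,h_k}\sum_{I(x,y)=k}(x,y), \tag{2}$$
$$\sigma_k = \sqrt{\frac{1}{MN\,h_k}\sum_{I(x,y)=k} d\bigl((x,y),\mu_k\bigr)^{2}}. \tag{3}$$
In (3), $d(\cdot,\cdot)$ indicates the Euclidean distance.
Based on $h_k$, $\mu_k$, and $\sigma_k$, we can define the mathematical form of the spatial color histogram over $K$ color bins, which is
$$S = \bigl\{(h_k,\ \mu_k,\ \sigma_k)\bigr\}_{k=1}^{K}. \tag{4}$$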
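The spatial color histogram described above can be illustrated with a minimal sketch. It assumes the image has already been quantized into bin indices and that each bin is described by its normalized count, the mean position of its pixels, and the root-mean-square Euclidean distance of those pixels from that mean. The function name `spatial_color_histogram` and the quantization step are assumptions for illustration, not the authors' implementation.

```python
import math


def spatial_color_histogram(image, num_bins):
    """Return one (h, mean_position, sigma) triple per color bin.

    `image` is a list of rows whose entries are already-quantized bin
    indices in range(num_bins); the quantization itself is assumed.
    """
    total = sum(len(row) for row in image)
    counts = [0] * num_bins
    sum_r = [0.0] * num_bins
    sum_c = [0.0] * num_bins
    # First pass: per-bin pixel counts and coordinate sums.
    for r, row in enumerate(image):
        for c, k in enumerate(row):
            counts[k] += 1
            sum_r[k] += r
            sum_c[k] += c
    means = [
        (sum_r[k] / counts[k], sum_c[k] / counts[k]) if counts[k] else None
        for k in range(num_bins)
    ]
    # Second pass: squared Euclidean distance of each pixel to its bin mean.
    sq_dist = [0.0] * num_bins
    for r, row in enumerate(image):
        for c, k in enumerate(row):
            mean_r, mean_c = means[k]
            sq_dist[k] += (r - mean_r) ** 2 + (c - mean_c) ** 2
    features = []
    for k in range(num_bins):
        if counts[k] == 0:
            features.append((0.0, None, 0.0))  # empty bin
        else:
            sigma = math.sqrt(sq_dist[k] / counts[k])
            features.append((counts[k] / total, means[k], sigma))
    return features


# Two images with identical color histograms but different layouts.
blocks = spatial_color_histogram([[0, 0, 1, 1], [0, 0, 1, 1]], num_bins=2)
stripes = spatial_color_histogram([[0, 1, 0, 1], [1, 0, 1, 0]], num_bins=2)
```

In this example both images give the same histogram value for each bin, yet the spread term differs, which is the spatial information that a plain color histogram ignores.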
3.2. G Feature
The gray-level cooccurrence matrix (GLCM) is the most common measure of the texture feature of an image. In the recognition process, the matrix $P(i,j)$ represents the gray-level correlation between pairs of pixels: it counts how often the gray levels $i$ and $j$ co-occur at a fixed pixel offset. The GLCM is usually normalized into $p(i,j) = P(i,j)/\sum_{i}\sum_{j}P(i,j)$ during the calculation, so that $p(i,j)$ characterizes the correlation between any two gray levels of the image texture.
When indicating the texture features, the GLCM usually needs 4 kinds of descriptors, which are listed as follows.
(a) Energy. Energy represents the uniformity of the gray-level distribution of an image. If the entries of the GLCM concentrate around the leading diagonal, we consider the gray-level distribution of the image relatively uniform. Under this circumstance, a larger energy value indicates a coarser texture. The equation to calculate the energy is as follows:
$$f_1 = \sum_{i=1}^{L}\sum_{j=1}^{L} p(i,j)^{2}. \tag{5}$$
In formula (5), $L$ represents the total number of gray levels of the image texture.
(b) Entropy. Entropy measures the total information of an image, and a large entropy value usually indicates a rich texture. The equation to get the entropy is as follows:
$$f_2 = -\sum_{i=1}^{L}\sum_{j=1}^{L} p(i,j)\log p(i,j). \tag{6}$$
(c) Contrast. Contrast measures the definition of the texture. If the texture is clear, the contrast value is large. The equation for the contrast is as follows:
$$f_3 = \sum_{i=1}^{L}\sum_{j=1}^{L} (i-j)^{2}\,p(i,j). \tag{7}$$
(d) Relativity. Relativity (correlation) indicates the gray-level similarity between the row and the column elements of the GLCM. We can get the relativity based on the following equation:
$$f_4 = \frac{\sum_{i=1}^{L}\sum_{j=1}^{L}(i-\mu_x)(j-\mu_y)\,p(i,j)}{\sigma_x\sigma_y}. \tag{8}$$
Here, $\mu_x$, $\mu_y$, $\sigma_x^{2}$, and $\sigma_y^{2}$ denote the mean values and the variances, respectively, and the equations for these parameters are
$$\mu_x = \sum_{i=1}^{L} i\sum_{j=1}^{L} p(i,j),\quad \mu_y = \sum_{j=1}^{L} j\sum_{i=1}^{L} p(i,j),\quad \sigma_x^{2} = \sum_{i=1}^{L}(i-\mu_x)^{2}\sum_{j=1}^{L} p(i,j),\quad \sigma_y^{2} = \sum_{j=1}^{L}(j-\mu_y)^{2}\sum_{i=1}^{L} p(i,j). \tag{9}$$
Altogether, the texture feature can be denoted as follows:
$$G = [f_1,\ f_2,\ f_3,\ f_4]. \tag{10}$$
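The four texture descriptors can be sketched in a few lines. The example below assumes a symmetric GLCM built for a single horizontal offset, gray levels indexed from 0, and the natural logarithm in the entropy. The function names `glcm` and `texture_features` and the choice of offset are assumptions for illustration, since the source does not state which offsets or quantization the authors used.

```python
import math


def glcm(image, levels, dr=0, dc=1):
    """Normalized, symmetric co-occurrence matrix for the pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows = len(image)
    for r in range(rows):
        for c in range(len(image[r])):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < len(image[r2]):
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image has no pixel pairs for this offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Return [energy, entropy, contrast, correlation] of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    mu_x = sum(i * v for i, _, v in cells)
    mu_y = sum(j * v for _, j, v in cells)
    var_x = sum((i - mu_x) ** 2 * v for i, _, v in cells)
    var_y = sum((j - mu_y) ** 2 * v for _, j, v in cells)
    denom = math.sqrt(var_x * var_y)
    # A constant image has zero variance; report zero correlation in that case.
    correlation = (
        sum((i - mu_x) * (j - mu_y) * v for i, j, v in cells) / denom
        if denom > 0 else 0.0
    )
    return [energy, entropy, contrast, correlation]


flat = texture_features(glcm([[0, 0], [0, 0]], levels=2))
striped = texture_features(glcm([[0, 1, 0, 1], [0, 1, 0, 1]], levels=2))
```

A constant patch yields maximal energy with zero entropy and contrast, while the striped patch yields lower energy and higher entropy and contrast, which matches the qualitative reading of the descriptors given above.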
3.3. H Feature
The shape of the image can be seen as an advanced expression of the image visual effect. However, the shape may change if the basic information of the image (such as the size of the image) varies, which hinders its usage in machine vision recognition. For that reason, many feature description operators based on shape invariance have been proposed, such as the inertia moment, the interior angle, the Harris corner, and the Hu moment. In this paper, we use the Hu moment to indicate the shape feature of the image.
Commonly, the Hu moment is computed over the region, and its calculation is very complex. To obtain the Hu moment in a shorter time, we propose a new calculation approach based on the contour. The specific steps of the approach are as follows.
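(a) Calculate the line integral
$$m_{pq} = \int_{C} x^{p}y^{q}\,ds, \tag{11}$$
in which $(x,y)$ denotes the coordinate of a pixel that the curve $C$ passes through and $ds = \sqrt{dx^{2}+dy^{2}}$ ($p = 0, 1, 2, \ldots$ and $q = 0, 1, 2, \ldots$). When $p = q = 0$, $m_{00}$ represents the arc length of the curve.
(b) Get the $(p+q)$th central moment with the equation shown in the following:
$$\mu_{pq} = \int_{C} (x-\bar{x})^{p}(y-\bar{y})^{q}\,ds. \tag{12}$$
In (12), $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$.
(c) Get the Hu set of invariant moments, which are shown as follows:
$$\begin{aligned}
\phi_1 &= \mu_{20}+\mu_{02},\\
\phi_2 &= (\mu_{20}-\mu_{02})^{2}+4\mu_{11}^{2},\\
\phi_3 &= (\mu_{30}-3\mu_{12})^{2}+(3\mu_{21}-\mu_{03})^{2},\\
\phi_4 &= (\mu_{30}+\mu_{12})^{2}+(\mu_{21}+\mu_{03})^{2},\\
\phi_5 &= (\mu_{30}-3\mu_{12})(\mu_{30}+\mu_{12})\bigl[(\mu_{30}+\mu_{12})^{2}-3(\mu_{21}+\mu_{03})^{2}\bigr]\\
&\quad+(3\mu_{21}-\mu_{03})(\mu_{21}+\mu_{03})\bigl[3(\mu_{30}+\mu_{12})^{2}-(\mu_{21}+\mu_{03})^{2}\bigr],\\
\phi_6 &= (\mu_{20}-\mu_{02})\bigl[(\mu_{30}+\mu_{12})^{2}-(\mu_{21}+\mu_{03})^{2}\bigr]+4\mu_{11}(\mu_{30}+\mu_{12})(\mu_{21}+\mu_{03}),\\
\phi_7 &= (3\mu_{21}-\mu_{03})(\mu_{30}+\mu_{12})\bigl[(\mu_{30}+\mu_{12})^{2}-3(\mu_{21}+\mu_{03})^{2}\bigr]\\
&\quad-(\mu_{30}-3\mu_{12})(\mu_{21}+\mu_{03})\bigl[3(\mu_{30}+\mu_{12})^{2}-(\mu_{21}+\mu_{03})^{2}\bigr].
\end{aligned} \tag{13}$$
When we get $\phi_1$ to $\phi_7$, we have the Hu moment based on the contour. Here, we use a one-dimensional array to represent the result, which is $H = [\phi_1, \phi_2, \ldots, \phi_7]$.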
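The contour-based steps above can be sketched as follows. The example assumes the contour is given as a closed polygon and approximates each line integral with the midpoint rule on every edge. It also divides each central moment by the arc length raised to the power p + q + 1 so that the result does not change with scale. That normalization, like the function names `contour_moment` and `hu_moments`, is an assumption for illustration and is not stated in the source.

```python
import math


def contour_moment(points, p, q, cx=0.0, cy=0.0):
    """Midpoint-rule approximation of the line integral of
    (x - cx)^p * (y - cy)^q ds along a closed polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        ds = math.hypot(x1 - x0, y1 - y0)  # edge length
        xm, ym = (x0 + x1) / 2 - cx, (y0 + y1) / 2 - cy
        total += xm ** p * ym ** q * ds
    return total


def hu_moments(points):
    """Seven Hu invariants computed from contour (line-integral) moments."""
    m00 = contour_moment(points, 0, 0)  # arc length of the contour
    cx = contour_moment(points, 1, 0) / m00
    cy = contour_moment(points, 0, 1) / m00

    def eta(p, q):
        # Assumed scale normalization for contour moments (not from the source).
        return contour_moment(points, p, q, cx, cy) / m00 ** (p + q + 1)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]


# An L-shaped contour and a copy rotated by 90 degrees, scaled by 2.5, and shifted.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
moved = [(-2.5 * y + 7.0, 2.5 * x - 3.0) for x, y in l_shape]
h_original = hu_moments(l_shape)
h_moved = hu_moments(moved)
```

Because only the contour points are visited, the cost grows with the contour length rather than with the region area, which is consistent with the motivation given for the contour-based approach. The two arrays in the example agree up to floating-point error.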
3.4. The Recognition Judging Function
When applying the recognition system shown in Figure 2, we establish an image database that contains all the possible situations during the recognition process. Therefore, the recognition process has been transformed into the feature comparison between the captured image and the image database.
When carrying out the feature comparison, we compare the color feature, the texture feature, and the shape feature of the captured image with those in the image database and consequently get a similarity-judging formula, which is shown as follows:
$$\mathrm{Sim}(Q,D) = w_S\,\mathrm{Sim}_S(Q,D) + w_G\,\mathrm{Sim}_G(Q,D) + w_H\,\mathrm{Sim}_H(Q,D). \tag{14}$$
In the formula, $\mathrm{Sim}_S$, $\mathrm{Sim}_G$, and $\mathrm{Sim}_H$ indicate the color feature similarity, the texture feature similarity, and the shape feature similarity between the captured image and the image database, respectively. In addition, $Q$ and $D$ represent the captured image and the image database. Also, we use $w_S$, $w_G$, and $w_H$ to denote the weights of the three features in the judging function.
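A minimal sketch of such a judging function is given below, assuming that the three per-feature similarities are combined as a weighted sum and that each similarity is derived from the Euclidean distance between normalized feature vectors. The distance-to-similarity mapping, the weight values, and the names `feature_similarity`, `fused_similarity`, and `recognize` are assumptions for illustration, since the source does not specify them.

```python
import math


def feature_similarity(u, v):
    """Map the Euclidean distance between two feature vectors into (0, 1].

    The 1 / (1 + distance) form is an assumed choice, not taken from the paper.
    """
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + distance)


def fused_similarity(query, reference, weights):
    """Weighted sum of the S, G, and H feature similarities."""
    return sum(
        weights[name] * feature_similarity(query[name], reference[name])
        for name in ("S", "G", "H")
    )


def recognize(query, database, weights):
    """Return the label of the database entry most similar to the query."""
    return max(
        database,
        key=lambda label: fused_similarity(query, database[label], weights),
    )


# Hypothetical normalized feature vectors (3 S values, 4 G values, 7 H values).
database = {
    "brand_a": {"S": [0.2, 0.5, 0.1], "G": [0.6, 0.3, 0.2, 0.9], "H": [0.1] * 7},
    "brand_b": {"S": [0.7, 0.4, 0.6], "G": [0.2, 0.8, 0.7, 0.1], "H": [0.5] * 7},
}
weights = {"S": 0.4, "G": 0.3, "H": 0.3}  # assumed weights that sum to 1
query = {"S": [0.25, 0.5, 0.1], "G": [0.6, 0.35, 0.2, 0.9], "H": [0.1] * 7}
best = recognize(query, database, weights)
```

With weights that sum to 1, an image compared with itself scores 1, and the score decreases as any of the three features drifts away from the reference.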
4. Experimental Results and Analysis
To verify the proposed recognition algorithm and the corresponding recognition system, we carry out an experiment. For the hardware, the recognition system is based on the binocular camera sensor. The cameras used in our experiment are DH-SV2001FC/FM color industrial cameras manufactured by Daheng (group) Co., Ltd., whose highest resolution is 1628 × 1236. The computer is a Dell Inspiron laptop with a dual-core 2.0 GHz CPU, 8 GB of RAM, and a 1 TB hard drive. As for the software, we write the algorithm in C++ and select 1,000 trademark images to establish the image database.
In our experiment, we attach some trademark images to a planar cardboard and position the board perpendicular to the principal optical axes of the two cameras. Images taken by the two cameras are shown in Figure 3.
Figure 3: Images taken by the two cameras. (a) Images taken by the left camera. (b) Images taken by the right camera.
As shown in Figure 3, restricted by its shooting range, each camera captures only part of the trademark board. Therefore, only the valid trademarks are selected and marked by the segmentation algorithm. In our experiment, 16 valid trademarks are selected and compared with the image database.
From these 16 images, we extract 3 vectors of the S feature, 4 vectors of the G feature, and 7 vectors of the H feature. After normalization, the feature data are as shown in Table 1.
The visual recognition platform is shown in Figure 4. In the platform, the upper-left part is the display area for the visual sensor information, the middle-left part is the conclusion area, the lower-left part is the function button area, and the right part is the information area for the feature vectors.
To verify the effectiveness of the proposed multifeature fusion recognition algorithm, we compare it with recognition based only on the color feature, the texture feature, and the shape feature, respectively. Recognition accuracy is defined as follows:
$$\eta = \frac{N(I_a)}{N(I_r)} \times 100\%. \tag{15}$$
In formula (15), $\eta$ is the recognition accuracy, $I_a$ denotes the accurately identified images, $I_r$ denotes the reference images in the database, and $N(\cdot)$ is the number of images.
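The accuracy measure can be illustrated with a short sketch, assuming that it is the percentage of test images whose predicted label matches the expected label. The function name `recognition_accuracy` and the labels are hypothetical.

```python
def recognition_accuracy(predicted, expected):
    """Percentage of images whose predicted label equals the expected label."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)


# Hypothetical run: 16 test trademarks, 14 of them labeled correctly.
expected = [f"brand_{n}" for n in range(16)]
predicted = expected[:14] + ["unknown", "unknown"]
accuracy = recognition_accuracy(predicted, expected)
```

Under this reading, 14, 13, 12, and 9 correct results out of 16 images give 87.5%, 81.25%, 75%, and 56.25%, which is consistent with the accuracies reported for the 16-image case.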
Recognition accuracy results of four kinds of methods are shown in Table 2.
From Table 2, we can see that the advantage of the proposed algorithm in recognition accuracy becomes apparent as the number of images increases. When the number of trademark images to be recognized reaches 16, the accuracy of the SGH recognition algorithm is still 87.5%. By contrast, the accuracies of the recognition methods based only on the color feature, the texture feature, and the shape feature are 81.3%, 75%, and 56.3%, respectively.
The accuracy of the shape-feature based recognition is low because the shapes in the trademarks used for the experiment are simple and sometimes resemble one another. Likewise, the accuracy of the color-feature based recognition is relatively high because the trademarks contain very rich color information, which offers more references for the similarity judgment.
As for SGH recognition algorithm that we propose, the vital reason for its higher recognition accuracy is that it utilizes the three features simultaneously, which provides more effective information for the judgment.
We also find that, for all four algorithms, the recognition accuracy decreases as the number of images increases. This is because the similarity judgment becomes less discriminative when a large number of images is involved.
Further, we compare the time consumption of the four recognition methods, which is shown in Figure 5.
From the curves in Figure 5, we can see that, because it uses the three features simultaneously, the proposed algorithm consumes more time than the other three methods. However, as the number of images goes up, the time consumption of the proposed algorithm keeps declining. Thus, when dealing with a large number of images, the disadvantage of the proposed algorithm diminishes.
5. Conclusions
In this work, a new vision sensor with a larger field of view is designed for use in the sorting system. Based on this sensor, we establish a machine vision recognition system and design a novel SGH recognition algorithm.
The SGH algorithm uses multiple features: the color feature, the texture feature, and the shape feature. These three types of features are represented by the spatial color histogram, the gray-level cooccurrence matrix, and the Hu moment, respectively. SGH is superior to traditional recognition algorithms based on a single feature, which are limited by the partial representation power of that feature. Experimental results show that the accuracy of the proposed algorithm is higher than that of the recognition methods based only on the color feature, the texture feature, or the shape feature.
With respect to time consumption, the SGH algorithm requires more time than the single-feature algorithms, but this cost is offset by its superior recognition performance. Moreover, when the number of images increases, the time consumption of SGH shows an obvious declining tendency. This indicates that, in a large-scale system, the efficiency of SGH approaches that of the traditional methods. In the future, we will focus on how to further reduce the execution time and enhance the flexibility of our algorithm.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This study was supported by the Heilongjiang Province ordinary college training program for New Century Excellent Talents under Grant no. 1254-NCET-008.
References
A. Binder and M. Kawanable, “Enhancing recognition of visual concepts with primitive color histograms via non-sparse multiple kernel learning,” in Proceedings of the 10th Workshop of the Cross-Language Evaluation Forum (CLEF ’09), vol. 6242, pp. 269–276, Corfu, Greece, September–October 2009.