Abstract

An approach is developed for extracting affine invariant descriptors by cutting an object into slices. The gray values of all pixels in each slice are summed to construct the descriptors, which makes them very robust to additive noise. To establish slice correspondence between an object and its affine-transformed version, the general contour (GC) of the object is constructed by performing projection along lines with different polar angles. Affine invariant division curves are then derived from the GC. A slice is formed by the points that fall in the region enclosed by two adjacent division curves. Several experiments have been conducted to test and evaluate the proposed method, and the results show that it is very robust to noise.

1. Introduction

Object recognition is an important topic in computer vision and has found numerous applications in the real world. A common difficulty in object recognition is that the shape of an object is often distorted when it is observed from different orientations; this distortion can be appropriately described by a perspective transformation [1]. Furthermore, if the size of the observed object is far smaller than the distance between the object and the observing position, the change in the object's shape can be described by an affine transform.

The extraction of affine features plays a very important role in pattern recognition and computer vision [2–4]. Many algorithms have been developed. Depending on whether the features are extracted from the contour only or from the whole shape region, approaches using invariant features can be classified into two main categories: region-based methods and contour-based methods [5].

The contour usually offers more shape information than the interior content [5], and contour-based methods provide better data reduction. Examples of these approaches include Fourier descriptors (FD) [6], wavelet descriptors [1, 7–9], principal components analysis descriptors [10], and gradient-based local descriptors [11]. Mokhtarian and Abbasi [12] use the maxima of the curvature scale space image to represent 2D shapes under affine transforms. However, the performance of these contour-based methods depends strongly on the success of the boundary extraction process, and they can only be applied to objects with a single boundary.

In contrast to contour-based methods, region-based techniques take all pixels within a shape region into account to obtain the shape representation. Moment invariant methods are the most widely used techniques. The commonly used affine moment invariants (AMIs) [13, 14] are extensions of the classical moment invariants first developed by Hu [15]. Although moment-based methods are applicable to binary or gray-scale images, it is shown in [16] that high-order moments are more vulnerable to white noise, which makes their use undesirable in image representation and pattern recognition. On the other hand, only moments of higher orders carry the fine details of an image. These two conflicting factors generally limit the use of AMIs in object representation and recognition. Some novel region-based methods have also been proposed to extract affine invariant features. Ben and Arids propose a frequency domain technique [17]. Petrou and Kadyrov propose the trace transform, which is based on applying a particular combination of functions to the image [18]. Recently, a novel approach called multiscale autoconvolution (MSA) was derived by Rahtu et al. [19], and promising results have been obtained with MSA in various object classification problems. These methods give high accuracy, but usually at the expense of high complexity and computational demands. Furthermore, some of these methods are sensitive to noise in the background [19].

To derive robust affine invariant descriptors, a region-based method is proposed in this paper. We cut the object into slices, and the affine invariant descriptors are constructed by summing the gray values associated with every pixel in each slice.

To establish slice correspondence between an object and its affine-transformed version, the central projection transformation (CPT) is employed. CPT was first proposed in [20] and further developed in [21] to extract rotation invariant features. By performing projection along lines with different polar angles, any object can be converted to a closed curve, which is referred to as the general contour (GC) of the object. It can be proved that the GC derived from an affine-transformed object is the same affine-transformed version of the GC derived from the original object.

Then, some affine invariant closed curves, which are called division curves, are derived from the object based on the obtained GC. Points on these division curves are selected as constant division points on the line segment connecting the centroid of the object and points on the obtained GC. As the constant varies, various division curves can be derived. Consequently, a slice is formed by the points that fall in the region between two adjacent division curves. The gray values associated with the points in this slice are summed to extract affine invariant descriptors. Several experiments have been carried out to illustrate the proposed method from different aspects, and satisfactory results have been achieved.

The rest of the paper is organized as follows. In Section 2, the GC of an object and its properties are introduced. In Section 3, the affine invariant descriptors are constructed, and it is shown that these descriptors are robust to noise. The experiments and results are presented in Section 4. Finally, some concluding remarks are given in Section 5.

2. The GC and Its Characteristics

Any object can be converted to a closed curve, the general contour (GC) of the object, by taking projections along lines from the centroid at different angles (the central projection transform). In this section, we study the characteristics of the GC.

2.1. The GC of an Object

Suppose that an object is represented by $f(x, y)$ in the $xy$-plane. First, the origin of the reference system is moved to the centroid of the object, denoted by $O(x_0, y_0)$, which can be computed from the geometric moments as follows:

$$x_0 = \frac{m_{10}}{m_{00}}, \qquad y_0 = \frac{m_{01}}{m_{00}}, \tag{2.1}$$

where $m_{pq} = \iint x^p y^q f(x, y)\,dx\,dy$. Let $M$ be the longest distance from $O$ to a point on the pattern.

To derive the GC of an object, the Cartesian coordinate system is converted to a polar coordinate system. The conversion is based on the following relations:

$$x = r\cos\theta, \qquad y = r\sin\theta. \tag{2.2}$$

Hence, the shape can be represented by a function of $r$ and $\theta$, namely,

$$f(x, y) = f(r\cos\theta, r\sin\theta), \tag{2.3}$$

where $r \in [0, M]$ and $\theta \in [0, 2\pi)$.

After the conversion of the system, we apply the CPT to the object by computing the following integral:

$$g(\theta) = \int_0^M f(r\cos\theta, r\sin\theta)\,dr, \tag{2.4}$$

where $\theta \in [0, 2\pi)$. The function $g(\theta)$ is, in fact, equal to the total mass distributed along the direction $\theta$, for angles from $0$ to $2\pi$. The CPT method has been used to extract rotation invariant signatures by combining wavelet analysis and fractal theory in [21]. A satisfactory classification rate has been achieved in the recognition of rotated English letters, Chinese characters, and handwritten signatures. For more details of CPT, refer to [20].

From a practical point of view, the images to be analyzed by a recognition system are most often stored in discrete formats. For such two-dimensional discrete patterns, we should modify (2.4) into the following expression:

$$g(\theta_k) = \sum_{r=0}^{M} f(r\cos\theta_k, r\sin\theta_k), \tag{2.5}$$

where $\theta_k \in [0, 2\pi)$.

Definition 2.1. For an angle $\theta \in [0, 2\pi)$, let $g(\theta)$ be given by (2.4); then $(\theta, g(\theta))$ denotes a point in the plane (in polar coordinates). As $\theta$ goes from $0$ to $2\pi$, the point $(\theta, g(\theta))$ traces a closed curve. We call this closed curve the general contour (GC) of the object.
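The discrete projection described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `central_projection`, the parameter `n_angles`, the unit radial step, and the nearest-pixel sampling are all assumptions made here for concreteness.

```python
import numpy as np

def central_projection(image, n_angles=360):
    """Discrete central projection of a gray-scale image about its centroid.

    Returns (thetas, g): sampled angles in [0, 2*pi) and, for each angle,
    the sum of gray values along the ray leaving the centroid at that angle.
    The sampling scheme (unit radial steps, nearest pixel) is an assumption.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    ys, xs = np.indices(img.shape)            # y = row index, x = column index

    # Centroid from the geometric moments m00, m10, m01.
    m00 = img.sum()
    if m00 <= 0:
        raise ValueError("image has no mass; the centroid is undefined")
    x0 = (xs * img).sum() / m00
    y0 = (ys * img).sum() / m00

    # Longest distance from the centroid to a nonzero pixel.
    mask = img > 0
    max_r = np.hypot(xs[mask] - x0, ys[mask] - y0).max()

    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    radii = np.arange(0, int(np.ceil(max_r)) + 1)   # unit radial steps

    # Nearest-pixel coordinates of every (theta, r) sample.
    px = np.rint(x0 + np.outer(np.cos(thetas), radii)).astype(int)
    py = np.rint(y0 + np.outer(np.sin(thetas), radii)).astype(int)
    inside = (px >= 0) & (px < w) & (py >= 0) & (py < h)

    # Sum the gray values along each ray; samples outside the image count as 0.
    samples = img[np.clip(py, 0, h - 1), np.clip(px, 0, w - 1)]
    g = np.where(inside, samples, 0.0).sum(axis=1)
    return thetas, g
```

The pairs `(thetas[k], g[k])`, read as polar coordinates about the centroid, give a discrete version of the general contour. Because every `g[k]` is a sum over many pixels, zero-mean noise tends to average out along each ray.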

For an object , we denote the GC extracted from it as . By discrete form (2.5), the discrete GC of the object can be derived. For example, Figure 1(a) is a gray-scale image taken from the well-known Columbia Coil-20 database [22], and Figure 1(b) shows a binary image taken from the MPEG-7 database. Figures 1(c) and 1(d) show the GCs of Figures 1(a) and 1(b), respectively.

2.2. The Properties of GC

The GC of an object has the following properties: single contour, affine invariant, and robust to noise.

Single Contour
By (2.4), a single value corresponds to each angle. Consequently, a single closed curve (GC) can be derived from any object. For instance, see the GCs of Figures 1(a) and 1(b) in Figures 1(c) and 1(d). Those objects have been concentrated into an integral pattern. In real life, many objects consist of several separable components, and contour-based methods are inapplicable to such objects. By performing projection along lines with different polar angles, a single closed curve can be derived, so contour-based methods can be applied to any object. Consequently, shape representation based on the GC of the object may provide better data reduction than some region-based methods.

Affine Invariant
An affine map takes parallel lines to parallel lines and intersecting lines to intersecting lines. Based on these facts, it can be proved that the GC extracted from an affine-transformed object is also an affine-transformed version of the GC extracted from the original object. Figure 2(a) shows an affine-transformed version of Figure 1(a), and Figure 2(c) shows the GC derived from Figure 2(a). Comparing the GCs in Figures 1(c), 2(c), 1(d), and 2(d), we can see that the GC of an object is affine invariant.

Robustness to Noise
It is shown in [23] that the Radon transform is quite robust to noise. We can similarly show that the GC derived from the object is robust to additive noise, because pixel values are summed to generate the GC.

3. Feature Extraction by Cutting an Object into Slices

To extract affine invariant features, we cut the object into slices. These slices are regions enclosed by affine invariant closed curves, which are derived from the GC of the object. A slice derived from the affine-transformed object is the same affine-transformed version of the slice derived from the original object.

3.1. Cutting Object into Slices

Prior to cutting the object into slices, we derive affine invariant closed curves, which are called division curves of the object.

Definition 3.1. For an object , suppose that is its GC, and is the centroid of the object as defined in (2.1). If and coincide, the point is selected as the centroid . Otherwise, the point is selected on the line segment connecting the centroid and point on the GC such that the following equation is satisfied. That is, where is a constant. As moves along the GC, the locus of point forms a closed curve. We denote this closed curve as , which is called the -division curve of the object.
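The construction in Definition 3.1 can be illustrated with a short sketch. It assumes that the division point lies at a fixed fraction `lam` of the way from the centroid to each GC point; the exact form of the defining equation, the function name `division_curve`, and the parameter name `lam` are assumptions made here, not taken from the paper.

```python
import numpy as np

def division_curve(gc_points, centroid, lam):
    """Return the points Q = O + lam * (P - O) for every GC point P.

    gc_points : (n, 2) array of GC points in Cartesian coordinates.
    centroid  : length-2 array, the centroid O.
    lam       : assumed division constant (fraction of the segment OP).
    If P coincides with O, the formula returns O itself.
    """
    P = np.asarray(gc_points, dtype=float)
    O = np.asarray(centroid, dtype=float)
    return O + lam * (P - O)
```

Since an affine map preserves ratios of distances along a line, applying the map to the GC and the centroid first and then calling `division_curve` gives the same points as applying the map to the output of `division_curve`. This is the property the slices rely on.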

As the constant varies, different division curves are obtained. Figure 3 shows division curves of Figures 1(a) and 2(a). We can observe that the division curves extracted from the affine-transformed object are also affine-transformed versions of the division curves extracted from the original object.

We denote the region enclosed by two different division curves -division curve and -division curve as , which is called -slice of the object . Figure 4 shows some slices of the object in Figure 1(a).

3.2. Affine Invariant Descriptors

Using different division curves, the object can be cut into a number of slices. We could employ well-known methods such as AMIs [13] and MSA [19] to extract an affine invariant feature vector from each of these slices and then recognize the object by combining these feature vectors into a single vector. However, the moment-based method is very sensitive to noise, and MSA has high computational complexity. In this paper, we instead extract affine invariant features by summing the gray values associated with the points in the region of each derived slice. It will be shown that these features are very robust to noise.

Choose a series of numbers such that . For an object , we denote as the mass of -slice of the object ; that is, We denote as follows: We will prove that are affine invariants. In the experiments of this paper, the objects are cut equally into parts. If we set the maximum , then the numbers are set to In this paper, is set to 4.
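As a rough illustration of how slice masses might be accumulated, the sketch below assigns each pixel to a slice by comparing its distance from the centroid with the GC radius at the pixel's polar angle. Several points are assumptions made here and are not taken from the paper: the GC is supplied as radii `g` sampled at uniformly spaced angles `thetas`, the division constants `lambdas` scale the GC radius directly, and each slice mass is normalized by the total mass of all slices. The experiments in the paper use an equal-area parameterization of the GC instead of this polar lookup.

```python
import numpy as np

def slice_descriptors(image, thetas, g, lambdas):
    """Normalized slice masses of a gray-scale image (illustrative sketch).

    thetas  : uniformly spaced angles in [0, 2*pi) (assumed sampling).
    g       : GC radius for each angle in thetas.
    lambdas : increasing division constants, e.g. [0, 1, 2] (hypothetical).
    """
    img = np.asarray(image, dtype=float)
    g = np.asarray(g, dtype=float)
    ys, xs = np.indices(img.shape)

    # Centroid from the geometric moments.
    m00 = img.sum()
    x0 = (xs * img).sum() / m00
    y0 = (ys * img).sum() / m00

    # Polar coordinates of every pixel about the centroid.
    dx, dy = xs - x0, ys - y0
    r = np.hypot(dx, dy)
    angle = np.mod(np.arctan2(dy, dx), 2.0 * np.pi)

    # Index of the nearest sampled angle.
    n = len(thetas)
    k = np.rint(angle / (2.0 * np.pi) * n).astype(int) % n
    g_at_pixel = g[k]

    # Relative radius r / g(theta); rays with g == 0 contribute to no slice.
    safe_g = np.where(g_at_pixel > 0, g_at_pixel, 1.0)
    ratio = np.where(g_at_pixel > 0, r / safe_g, np.inf)

    # Mass of each slice: sum of gray values with lo <= ratio < hi.
    edges = np.asarray(lambdas, dtype=float)
    masses = np.array([img[(ratio >= lo) & (ratio < hi)].sum()
                       for lo, hi in zip(edges[:-1], edges[1:])])
    total = masses.sum()
    if total == 0:
        raise ValueError("no mass falls inside the requested slices")
    return masses / total   # assumed normalization: ratio of slice masses
```

Dividing by a total mass removes the common determinant factor that an affine map introduces into every slice mass, which is why a ratio of masses, rather than a raw mass, is used as the descriptor in this sketch.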

Theorem 3.2. For an object , as given in (3.3) are affine invariants.

Proof. As mentioned above, the GC derived from the affine-transformed object is the same affine-transformed version of the GC derived from the original object. In addition, an affine transform preserves the ratios of distances along a line. Consequently, the -division curve derived from the affine-transformed object is the same affine-transformed version of the -division curve derived from the original object. As a result, the -slice derived from the affine-transformed object is the same affine-transformed version of the -slice derived from the original object .
Affine maps have the relative mass invariance property, which states that the mass of an affinely transformed object equals the mass of the original object multiplied by the determinant of the transformation matrix. In other words, the slice of the original object and the slice of the affinely transformed object satisfy the following equation: It follows that Hence, Therefore, are affine invariants.
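The scaling step of the proof can be written out as follows. The notation is an assumption introduced here for illustration: $S$ and $S'$ denote a slice and its image under the affine map $\mathbf{x}' = A\mathbf{x} + \mathbf{b}$, $m(\cdot)$ denotes mass, and the gray value is carried along with each point, so that $f'(\mathbf{x}') = f(\mathbf{x})$.

```latex
% Change of variables x' = A x + b, with Jacobian |det A|:
m(S') = \iint_{S'} f'(x', y')\,dx'\,dy'
      = \iint_{S} f(x, y)\,\lvert \det A \rvert\,dx\,dy
      = \lvert \det A \rvert\, m(S).

% The common factor cancels in any ratio of slice masses:
\frac{m(S_i')}{m(S_j')}
      = \frac{\lvert \det A \rvert\, m(S_i)}{\lvert \det A \rvert\, m(S_j)}
      = \frac{m(S_i)}{m(S_j)}.
```

Any descriptor built as a ratio of slice masses is therefore unchanged by the affine map, provided the slices themselves correspond under the map.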

We call the quantities given in (3.3) affine invariant descriptors.

3.3. Robustness to Noise

In this section, we study the noise robustness of the affine invariant descriptors given in (3.3). Let be the original image whose intensity values are random variables with mean and variance . Suppose that the image is corrupted by additive noise with zero mean and variance .

Since the affine invariant descriptors defined in (3.3) are the integral of gray values of the slices for the continuous case, the integral of noise in the slice is constant and is equal to the mean value of the noise which is assumed to be zero. Therefore, zero-mean white noise has no effect on the descriptors of the image in this situation.

In practice, the image is stored by a finite number of pixels. As aforementioned, we add up intensity values of the pixels in a slice to calculate the affine invariant descriptors. Assume that we add up pixels of to calculate , where is the number of pixels in . Suppose that denotes the expected values and that denotes the variance. Therefore, Then, the expected value of is

Equations (3.8) and (3.9) indicate how the descriptor values relate to the mean and the variance of the original image.

After introducing the noise with zero mean and variance to the image, the signal-to-noise ratio (SNR) of the image is It follows from (3.9) that the SNR of the affine invariant descriptors can be given as follows: This means the SNR is increased by . Because in many practical situations , we may alternatively write Hence,

This shows that SNR has been increased by a factor of , which is practically a large quantity. As a result, the affine invariant descriptors are very robust to additive noise.
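The size of this gain can be checked with a small numerical sketch. The formulas below are assumptions made here, since they are one common way to define these quantities: pixel intensities are treated as independent with mean `mu` and variance `sigma2`, the noise is independent with zero mean and variance `noise_var`, and the SNR is taken as expected signal power divided by noise power. All function and parameter names are hypothetical.

```python
def snr_image(mu, sigma2, noise_var):
    """Per-pixel SNR: expected signal power E[f^2] = sigma2 + mu^2 over noise variance."""
    return (sigma2 + mu ** 2) / noise_var

def snr_descriptor(n_pixels, mu, sigma2, noise_var):
    """SNR of a sum of n_pixels independent pixels.

    Signal power: E[(sum f)^2] = n * sigma2 + n^2 * mu^2.
    Noise power:  Var(sum of noise) = n * noise_var.
    """
    signal_power = n_pixels * sigma2 + n_pixels ** 2 * mu ** 2
    noise_power = n_pixels * noise_var
    return signal_power / noise_power

# Hypothetical values: a slice of 500 pixels with mean 100 and variance 25.
gain = snr_descriptor(500, 100.0, 25.0, 100.0) / snr_image(100.0, 25.0, 100.0)
```

Under these assumptions the difference between the two SNRs is `(n_pixels - 1) * mu**2 / noise_var`, and when `sigma2` is small compared with `mu**2` the ratio `gain` is close to the number of pixels in the slice.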

4. Experiments

In this section, some experiments are carried out to illustrate the performance of the proposed method. The gray-scale images utilized in our experiments are taken from the well-known Columbia Coil-20 database [22], which contains 20 different objects shown in Figure 5. The Coil-20 database includes some sets of similar objects, such as three toy cars, ANACIN, and TYLENOL packs. They can be easily misclassified due to their similarity.

In the experiments, affine transformations are generated by the following transformation matrix [7]: where denote the scaling and rotation transformations, respectively, and denote the skewing transformation. For each image, the affine transformations are generated by setting the parameters in (4.1) as follows: , , , and .
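One common way to build a matrix from scaling, rotation, and skewing parameters is sketched below. The factorization used here (a scale `k` times a rotation by `theta` times an upper-triangular skew with parameters `a` and `b`) is an assumption made for illustration, since the matrix itself is not reproduced in the text; the parameter values in the test are likewise hypothetical.

```python
import numpy as np

def affine_matrix(k, theta, a, b):
    """Assumed form: A = k * R(theta) @ [[a, b], [0, 1/a]].

    k     : scaling factor.
    theta : rotation angle in radians.
    a, b  : skewing parameters (a must be nonzero).
    """
    if a == 0:
        raise ValueError("skew parameter a must be nonzero")
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
    skew = np.array([[a, b],
                     [0.0, 1.0 / a]])
    return k * rotation @ skew
```

With this form the skew factor has unit determinant, so `det(A) = k**2` and only the scaling parameter changes the total mass of the transformed object.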

In this paper, is set to 4, and the classification accuracy is defined as

4.1. Implementation Issues

In practice, the objects are not available as continuous functions; we only have a finite number of discrete samples, and digital images are represented as matrices. Under an affine transform, the position of each point changes, and the number of points in a region may change too. Hence, the GC should be parameterized to establish a one-to-one point correspondence between the GC and its affine-transformed version.

Several parameterizations have been reported. In this paper, we adopt a curve normalization approach proposed by Yang et al. [24], which is called EAN. The EAN method mainly consists of the following steps.
(i) For the discrete GC , compute the total area of the GC by the following formula: Let the number of points on the contour after EAN be too. Denote .
(ii) Select the starting point on the GC as the starting point of the normalized curve. From on the GC, search for a point along the GC such that the area of each closed zone, namely, the polygon , equals , where denotes the centroid of the object.
(iii) Using the same method, from point , calculate all the points , along the GC.

This normalization provides a one-to-one relation between the points of the original GC and those of the transformed GC. For more information on EAN, refer to [24].
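The idea behind the steps above can be sketched as an equal-area resampling of a closed polygon. This is a simplified illustration and not the EAN procedure of [24] itself: it assumes the contour is star-shaped about the centroid (which holds for a curve that has one radius per angle), it places new points by linear interpolation along polygon edges, and the function name `equal_area_normalize` and the parameter `n_out` are hypothetical.

```python
import numpy as np

def equal_area_normalize(points, centroid, n_out):
    """Resample a closed contour so that consecutive points, together with
    the centroid, enclose triangles of equal area.

    points   : (n, 2) array of contour points in order; points[0] is the start.
    centroid : length-2 array, the centroid O.
    n_out    : number of points on the normalized contour.
    """
    P = np.asarray(points, dtype=float)
    O = np.asarray(centroid, dtype=float)
    nxt = np.roll(P, -1, axis=0)                 # successor of each point

    # Area of each triangle (O, P_k, P_{k+1}).
    d1, d2 = P - O, nxt - O
    tri = 0.5 * np.abs(d1[:, 0] * d2[:, 1] - d1[:, 1] * d2[:, 0])
    cum = np.concatenate(([0.0], np.cumsum(tri)))
    total = cum[-1]

    out = np.empty((n_out, 2))
    for j in range(n_out):
        target = total * j / n_out               # accumulated area at point j
        k = int(np.searchsorted(cum, target, side="right")) - 1
        k = min(k, len(P) - 1)
        # The swept area grows linearly along a straight edge, so the
        # required position on edge k is found by linear interpolation.
        t = 0.0 if tri[k] == 0 else (target - cum[k]) / tri[k]
        out[j] = P[k] + t * (nxt[k] - P[k])
    return out
```

Because an affine map multiplies every area by the same determinant, points chosen at equal fractions of the total area correspond to one another before and after the transformation, provided the starting points correspond.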

Consequently, the -division curve can be constructed so that there is a one-to-one point correspondence between each -division curve and its affine-transformed version. Finally, the region enclosed by two different division curves forms a slice of the object.

4.2. Comparison with AMIs and MSA

In this experiment, we compare the proposed method with MSA and AMIs. The AMIs method is implemented as discussed in [13], and 3 AMI invariants are used. The MSA method is implemented as discussed in [19], and 29 MSA invariants are used. As mentioned above, the Coil-20 database is employed. Each image is transformed 140 times; that is, the test is repeated 2800 times with each method. In this experiment, our method is performed for . The classification accuracies of the proposed method, AMIs, and MSA are 98.59%, 100%, and 95.31%, respectively. The results indicate that AMIs perform best in this noise-free test and that the proposed method slightly outperforms MSA.

We first add salt and pepper noise with intensities varying from 0.005 to 0.03 to the transformed images. Figure 6 shows the classification accuracies of all methods at the corresponding noise levels. We can observe that the classification accuracy of AMIs decreases rapidly as soon as a small amount of noise is introduced: it drops from 100% to less than 50% when the noise intensity is 0.005. MSA performs much better than AMIs, but its results are still not satisfactory. The drop in classification accuracy of the proposed method is less than three percent, and the proposed method maintains high accuracy even at high noise levels.

We then add Gaussian noise with zero mean and variance varying from to to the transformed images. Figure 7 plots the classification accuracies of all methods at the corresponding noise levels. The results indicate that AMIs and MSA are much more sensitive to Gaussian noise than to salt and pepper noise. Their classification accuracies fall quickly once the image is corrupted by Gaussian noise. In contrast, the proposed method greatly outperforms AMIs and MSA at every noise level.

4.3. The Effect of Slice Size

The affine invariant descriptors are constructed by cutting the object into slices. In this experiment, we test the performance of the proposed method with different slice sizes. The slice size is determined by in (3.4). With a large , the object is cut into small slices at a high computational cost. On the other hand, if is small, the object is cut into large slices at a low computational cost.

The gray-scale images of the Coil-20 database are employed. These images are transformed as described above; that is, the test is repeated 2800 times. The transformed images are classified according to their affine invariant descriptors by comparing their Euclidean distances to the descriptors of the training images. The accuracies are shown in Table 1. As increases, finer details of the object can be captured by smaller slices. Hence, high accuracy can be achieved (e.g., 99.55% accuracy for ).
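The distance-based classification step can be sketched as a nearest-neighbour rule. The function names and the definition of accuracy as the fraction of correctly labeled test images are assumptions made here for illustration.

```python
import numpy as np

def classify_nearest(train_descriptors, train_labels, test_descriptors):
    """Assign each test descriptor the label of the closest training
    descriptor under the Euclidean distance."""
    train = np.asarray(train_descriptors, dtype=float)
    test = np.asarray(test_descriptors, dtype=float)
    # Pairwise distances: rows index test vectors, columns index training vectors.
    dist = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    return np.asarray(train_labels)[nearest]

def accuracy(predicted, expected):
    """Fraction of test images classified correctly (assumed definition)."""
    return float(np.mean(np.asarray(predicted) == np.asarray(expected)))
```

With one training descriptor per object, `train_descriptors` would have 20 rows for Coil-20, and each transformed image contributes one row of `test_descriptors`.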

In real-life situations, images are usually noisy for various reasons, so we also test the robustness of the proposed method in this experiment. Gaussian noise with different intensities is added to every test image. The intensity level is set to . The accuracies are also shown in Table 1. We can observe that the accuracies decrease with increasing noise level. Furthermore, under noisy conditions, although fine details of the object can be captured by small slices, discretization error affects the accuracy. For instance, the accuracy for is lower than that for at all noise levels.

4.4. Discussions

In this study, we cut the object into slices, and the affine invariant features are derived by summing the gray values associated with every pixel in each slice. Experimental results show that the proposed method is very robust to conventional Gaussian noise. However, conventional Gaussian noise does not cover every case of practical interest. Recently, fractional Gaussian noise (fGn) has attracted more interest than the conventional model. For more details, see [25–28]. The simulation of fGn was discussed in [25, 29]. Multiscaled fGn can be found in [25, 30–32]. Our future research will consider fGn in the proposed scheme.

5. Conclusions

In this paper, we describe a novel approach for extracting affine invariant features by cutting an object into slices. First, the general contour (GC) is derived from the object by performing projection along lines with different polar angles. Next, some affine invariant curves, which are called division curves, are derived from the object based on the GC. Then, a slice is formed by the points that fall in the region between two adjacent division curves. The affine invariant features are derived by summing the gray values associated with every pixel in each slice. Because they are sums of pixel values, these features are very robust to additive noise. In comparison with AMIs and MSA, the proposed method is more robust to noise in the background.

In future work, the characteristics of the slices associated with an object will be studied in depth, and more experimental results will be reported.

Acknowledgments

The authors would like to thank the anonymous referees for their suggestions and insightful comments. This work was supported in part by the National Science Foundation under Grants nos. 60973157, 60873168, 60873102, and 61003209, in part by the Natural Science Foundation of Jiangsu Province Education Department under Grant no. 08KJB520004, and in part by the “Qing-Lan” Program for Young Excellent Teachers of Jiangsu Province. This work was also supported in part by the Science Research Foundation of Nanjing University of Information Science and Technology under Grant no. 20070123.