Abstract

The wavelet transform is a well-known signal analysis method in several engineering disciplines. In image processing and pattern recognition, the wavelet transform is used in many applications for image coding as well as feature extraction purposes. It can be used to describe a given object shape by wavelet descriptors (WD). Thus, it is used to recognize objects according to their contour shape by deriving a number of WD and comparing them with the WD of stored contour patterns. For our method, we use a periodical angle function derived from an extracted object contour. In order to apply the WD, the Mexican Hat can be used as the mother wavelet. In this paper, the method of object shape recognition using wavelet descriptors is described coherently and includes details relating to the method of applying the periodical angle function and the derivation of the formulas for the Haar as well as Mexican Hat wavelet descriptors. To evaluate the results of object recognition when using wavelet descriptors taking into account the dependence on the starting point, the paper describes a sufficient method for the comparison of wavelet descriptors using the minimum distance matrix.

1. Introduction

Shape representation is an important step in object recognition tasks and plays a key role in many industrial applications. The recognition of 2D flat objects (e.g., plastic seals or aluminium profiles in the automobile industry) is a prime example of the need to find an appropriate shape representation which guarantees the detection of the objects despite slight object differences. Another example is the recognition of classes of weed species in agricultural applications with the objective of using the appropriate amount and type of pesticides in order to kill the pests. The problem here is that the weeds’ shapes change significantly according to their growing stages. For the above given reasons, it is necessary to find an adequate contour description, which reflects both rough and detailed information about the object shape.

In recent years, many methods for shape representation and recognition have been proposed [1–10]. An advanced review of shape representation techniques can be found in [1, 2]. One can distinguish between contour-oriented and region-oriented shape description techniques. The former describe the object outlines and the latter describe the object areas. In both cases, suitable contour and region primitives are first used in order to extract the relevant objects from a given image (e.g., Freeman codes [3], polygon lines [10], and pixels or image squares [4]). Based on this data, shape description can be performed by applying a feature vector in order to represent the given objects. Using such a feature vector, the recognition task can be solved by comparing the feature vectors of different objects. In this paper, we use contour polygons and apply feature vectors using wavelet descriptors.

The most popular method related to the shape description using contour data is described in [9]. The method represents shapes by the so-called Fourier descriptors. The classical Fourier transform allows a loss-free transmission of an image signal in the frequency domain, but unfortunately it loses the full spatial resolution. The wavelet transform offers a good solution for this problem and additionally allows the selection of appropriate basis functions for the signal to be analyzed. This makes the wavelet transform particularly useful for many image processing problems. The mathematical foundations of the wavelet transform were developed in the early nineties, and the term “Wavelet” was used in the 1980s for functions that generalize the short-time Fourier transform [11]. The multiresolution analysis (MRA) using orthonormal basis functions was introduced in 1989 [12, 13]. A comparison between the short-time or windowed Fourier transform and the wavelet transform is discussed in [14]. The wavelet transform was used in 1996 to describe contour shapes [15], in a method based on the theory of periodized wavelets that uses the MRA to apply wavelet descriptors. In [16–18] similar approaches are presented, which use different coordinates to describe a given object shape. One of the unattractive properties of the wavelet transform is the dependency of the WD on the selected starting point. This problem was reported in [19] and several solutions based on fixing the starting point were discussed. Hu et al. report a solution using Zernike moments [19].

In this paper, we present the derivation of wavelet descriptors using a periodical angle function on the basis of the Mexican Hat and Haar wavelet. The new method is based on the publications listed in [21–24] and is described in this paper much more coherently and, specifically, provides further details related to the following important points.
(i) Applying the angle function: in some cases, applying angle functions hides some sources of error, for example, concave or convex object shape. Therefore, the paper shows the method of applying the angle function step by step (see (1)–(6)).
(ii) Derivation of the formulas for the Haar wavelet descriptors: equations (10)–(14) show the derivation of the Haar wavelet descriptors. The transition from (13) to (14) is given in Appendix A.
(iii) Derivation of the formulas for the Mexican Hat wavelet descriptors: equations (15)–(22) show the derivation of the Mexican Hat wavelet descriptors. The transition from (21) to (22) is given in Appendix B.
(iv) Performance assessment using the minimum distance matrix: presented here is an evaluation of the performance of the recognition method when using wavelet descriptors, taking into account the dependence on the starting point.
(v) Performance assessment compared to Fourier descriptors: a comparison between wavelet descriptors and Fourier descriptors is performed in this paper using the minimum distance matrix to show the efficiency of the wavelet descriptors as opposed to the Fourier descriptors.

To represent a given object shape, we show how to apply a periodical angle function using the polygon data of a given object shape. This angle function must be free from any singularity, which might arise from object rotations. For that reason, the paper shows the derivation of the angle function for a simple geometric object. To obtain a suitable number of WD, we normalize the angle function over the interval and derive a wavelet building set in the same interval. The results are shown on the basis of a simple example to illustrate the different steps of the new method. We also present results related to the recognition of puzzle pieces for two different wavelet types and compare the results of these different implementations in order to find the appropriate wavelet building set for this application.

The paper is organized as follows. Section 2 addresses the derivation of the angle function and describes the problem of singularity. Section 3 introduces the continuous wavelet transformation. The derivation of the WD using the Haar as well as the Mexican Hat function is presented in Sections 4 and 5. In Section 6, the way of applying a suitable wavelet building set is addressed and discussed. In Section 7, the results of using the derived WD to recognize object shapes are discussed based on a robot vision system for puzzle composition. In this context, the minimum distance approach is described, which is used to compare two different WD sets. Since objects in real images are affected by noise and image digitization, we discuss the impact of image noise on the angle function and thus on the derived WD in Section 8. For this purpose, we added artificial noise to the image of a puzzle piece and compared both the angle functions and the WD of the piece with and without noise. A comparison between the Fourier and wavelet descriptors in recognition tasks is shown in Section 9. In Section 10, the starting point problem is discussed.

2. Shape Description Using an Angle Function

To derive an angle function, we use the polygon information of a given object shape derived after contour extraction and approximation [10]. Figure 1 shows the example of an object shape with five edges; its derived angle (green colored) and the periodical angle function (blue colored) are shown in Figure 2. The red point in Figure 1 indicates the starting point. The order of the polygon vertices is given here in a clockwise direction. This order is used for outer contours. For inner contours, the order is counter-clockwise.

The horizontal axis in the diagram of the angle function represents the measured length, given in pixels, between the starting point and a considered point on the contour. In the diagram of the periodical angle function, the horizontal axis represents the normalized length within the normalization interval. In both diagrams, the vertical axis represents an angle given in radians.

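To obtain the angle function of the given shape, we first calculate the length $l_i$ and the angle $\alpha_i$ of every edge with respect to the $x$-axis according to the following:

$$l_i = \sqrt{\Delta x_i^2 + \Delta y_i^2}, \qquad \Delta x_i = x_{i+1} - x_i, \quad \Delta y_i = y_{i+1} - y_i, \qquad i = 1, \ldots, N,$$

where $N$ is the number of polygon edges and $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$ are the coordinates of the polygon corners $P_i$ and $P_{i+1}$, respectively (with $P_{N+1} = P_1$). We define the absolute angle $\alpha_i$ of the polygon edge $i$ as follows:

$$\alpha_i = \begin{cases} \pi/2 & \text{if } \Delta x_i = 0,\ \Delta y_i > 0, \\ 3\pi/2 & \text{if } \Delta x_i = 0,\ \Delta y_i < 0, \\ \arctan(\Delta y_i / \Delta x_i) & \text{if } \Delta x_i > 0,\ \Delta y_i \ge 0, \\ \arctan(\Delta y_i / \Delta x_i) + 2\pi & \text{if } \Delta x_i > 0,\ \Delta y_i < 0, \\ \arctan(\Delta y_i / \Delta x_i) + \pi & \text{if } \Delta x_i < 0. \end{cases}$$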
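The computation of edge lengths and absolute angles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `edge_lengths_and_angles` is hypothetical, the polygon is taken as a closed list of `(x, y)` corners, and the absolute angle is assumed to be the four-quadrant angle against the x-axis mapped into [0, 2π), which matches the statement that the absolute angles are always positive.

```python
import math


def edge_lengths_and_angles(vertices):
    """Return (lengths, angles) for the closed polygon given by `vertices`.

    vertices: list of (x, y) tuples in contour order; the last corner is
    implicitly connected back to the first one.
    Angles are measured against the x-axis and mapped into [0, 2*pi)
    (an assumption of this sketch).
    """
    n = len(vertices)
    lengths, angles = [], []
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        dx, dy = x1 - x0, y1 - y0
        lengths.append(math.hypot(dx, dy))
        # atan2 handles all four quadrants and dx == 0; the modulo makes
        # negative results positive.
        angles.append(math.atan2(dy, dx) % (2.0 * math.pi))
    return lengths, angles


# A unit square traversed clockwise (in a y-up coordinate system).
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
lengths, angles = edge_lengths_and_angles(square)
```

Using `atan2` avoids a separate case for vertical edges, where the quotient of the coordinate differences is undefined.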

Figure 3 shows the calculated values. Here, the angles are given in radians. As shown in this figure, the absolute angles are always positive. To obtain the angle function, we calculate the angle differences between every polygon edge and the first polygon edge by the following, taking into consideration that differences higher than $\pi$ and smaller than $-\pi$ must be corrected in order to fulfill the condition of modulo $2\pi$: For the above given example we obtain the following values: The angle function is defined as follows:

The length represents the accumulated length beginning from the given starting point.
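A minimal sketch of the two steps just described (angle differences against the first edge, and the accumulated contour length) is given below. The helper names `wrap_to_pi` and `angle_function` are hypothetical, and the correction thresholds are assumed to be ±π with a correction of 2π; the angle function is represented as a piecewise-constant function by its breakpoints and values.

```python
import math


def wrap_to_pi(angle):
    """Map an angle to the interval (-pi, pi] by adding or subtracting 2*pi."""
    while angle > math.pi:
        angle -= 2.0 * math.pi
    while angle <= -math.pi:
        angle += 2.0 * math.pi
    return angle


def angle_function(lengths, angles):
    """Return (breakpoints, values) of a piecewise-constant angle function.

    breakpoints[k] is the accumulated contour length at the start of edge k,
    with one extra entry holding the total length; values[k] is the wrapped
    angle difference between edge k and the first edge.
    """
    values = [wrap_to_pi(a - angles[0]) for a in angles]
    breakpoints = [0.0]
    for length in lengths:
        breakpoints.append(breakpoints[-1] + length)
    return breakpoints, values


breakpoints, values = angle_function([1.0, 2.0, 1.0], [0.5, 6.0, 3.0])
```

The last entry of `breakpoints` is the total contour length, which is the quantity needed later to scale the function to a fixed interval.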

The derived angle function is defined within the interval , where is the total length (circumference) of the given contour polygon and can be scaled in the interval using the following parameter transformation:

Using the following normalization: where the positive sign is given for outer and negative sign for inner contours. is a periodical function with a period of (see Figure 2). We will use this function to execute the continuous wavelet transform and apply the wavelet descriptors. The periodical angle function is better suited than the initial angle function because the recognition can be performed independently of the size of the considered objects. This is important since the object size changes in the camera image according to the distance between camera and object. In this paper we consider outer contours. For inner contours, the negative sign in (6) is simply used instead of the positive sign.

3. Wavelet Transformation

Similar to the Fourier transform (FT), the wavelet transform (WT) uses elementary functions, called wavelets, to describe a given signal. In contrast to the FT, which uses harmonic functions with different frequencies, the WT uses only one basis wavelet (mother wavelet) to derive the reconstruction signals [14]. Through dilatation, compression, and shifting of the mother wavelet, we can derive new variants of this signal, which together constitute the so-called wavelet building set. The general derivation formula of wavelets from the mother wavelet is given as follows:

where is the compression or dilatation parameter and is the shifting parameter. Figure 4 shows the mother wavelet based on the Haar function and some derived variants resulting from compression, dilatation, and shifting using (7). Figure 5 shows the equivalent Mexican Hat wavelets. The function can be scaled over the interval similar to the periodic angle function.
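To make the construction of a wavelet building set concrete, the following sketch derives scaled and shifted variants from a mother wavelet. It rests on assumptions not confirmed by the text: the common convention ψ_{a,b}(t) = ψ((t − b)/a)/√a with a > 0, a Haar mother wavelet supported on [0, 1), and an unnormalized Mexican Hat (1 − t²)·exp(−t²/2). The names `haar`, `mexican_hat`, and `scaled_wavelet` are illustrative only.

```python
import math


def haar(t):
    """Haar mother wavelet, assumed here to live on [0, 1)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def mexican_hat(t):
    """Mexican Hat mother wavelet without a normalization constant."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def scaled_wavelet(mother, a, b):
    """Return the function t -> mother((t - b) / a) / sqrt(a) for a > 0."""
    if a <= 0:
        raise ValueError("the scale parameter a must be positive")
    norm = 1.0 / math.sqrt(a)
    return lambda t: norm * mother((t - b) / a)


# A Haar wavelet dilated by a = 2 and shifted to start at b = 1.
psi = scaled_wavelet(haar, a=2.0, b=1.0)
samples = [psi(t) for t in (0.5, 1.5, 2.5, 3.0)]
```

Values of `a` above 1 stretch the mother wavelet and values below 1 compress it, while `b` moves it along the axis.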

Based on (7), (8) shows the coefficients of the continuous wavelet transform for the derived angle function given in (6). We will call these coefficients wavelet descriptors (WD) similar to the name of the Fourier descriptors (FD). Based on the MRA, we receive the approximate signal for and detail signal for ,

Since the wavelets are time-limited variants of the basis function, we can limit the integration in (8) to the definition interval and finally receive the following:
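As a numerical cross-check for closed-form descriptor formulas, a wavelet coefficient restricted to the definition interval can be approximated by simple quadrature. The sketch below uses the midpoint rule and again assumes the 1/√a convention; `wavelet_coefficient` and its parameters are hypothetical names, not the paper's notation.

```python
import math


def wavelet_coefficient(signal, mother, a, b, t_min, t_max, steps=10000):
    """Approximate (1/sqrt(a)) * integral of signal(t) * mother((t - b) / a)
    over [t_min, t_max] with the midpoint rule."""
    h = (t_max - t_min) / steps
    total = 0.0
    for k in range(steps):
        t = t_min + (k + 0.5) * h
        total += signal(t) * mother((t - b) / a)
    return total * h / math.sqrt(a)


def haar(t):
    """Haar mother wavelet, assumed here to live on [0, 1)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


# A step signal that is 1 on the first half of [0, 1] and 0 on the second.
step = lambda t: 1.0 if t < 0.5 else 0.0
coefficient = wavelet_coefficient(step, haar, a=1.0, b=0.0, t_min=0.0, t_max=1.0)
```

For piecewise-constant angle functions the integral can be solved exactly, so quadrature of this kind is mainly useful for testing an implementation.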

4. Haar Wavelet Descriptors (HWD)

To calculate the Haar Wavelet descriptors, we just replace the function in (9) by the scaled Haar function and set the integration limits to the interval , which represents the nonzero value between the left and right border of the Haar Wavelet. We thus obtain the following expression:

If we replace the function by according to (6), we obtain

If we now replace the parameter by according to (5), we receive the final expression of the Haar Wavelet descriptors:

The integration in (12) depends on the positions of the low-high and high-low edges of the Haar Wavelet as shown in Figure 6. To execute the integration, we divide the first integral in (12) into three subintegrals , , and and the second integral into , , and according to the location of the Haar Wavelet within the defined interval of the angle function (see Figure 6). We thus obtain (13). The integration outside of the interval is always equal to zero,

After solving the integrals to , we receive the final expression of the HWD as given in the following: The first four terms of (14) depend only on the polygon starting point and the parameters and and do not include any shape information. Only the last two terms include the angle differences between every two consecutive edges and therefore information about the contour shape.
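Because the angle function is constant along each polygon edge, the Haar integral reduces to sums of value-times-length terms. The sketch below shows this idea for a generic piecewise-constant function; it is not a transcription of (14). It assumes a Haar wavelet with support [b, b + a), a sign change at b + a/2, and the 1/√a factor, and the function names are hypothetical.

```python
import bisect
import math


def integral_piecewise_constant(breakpoints, values, lo, hi):
    """Integrate a piecewise-constant function over [lo, hi].

    The function equals values[k] on [breakpoints[k], breakpoints[k + 1])
    and is treated as zero outside [breakpoints[0], breakpoints[-1]].
    """
    lo = max(lo, breakpoints[0])
    hi = min(hi, breakpoints[-1])
    if hi <= lo:
        return 0.0
    total = 0.0
    # Index of the segment that contains lo.
    k = max(bisect.bisect_right(breakpoints, lo) - 1, 0)
    while k < len(values) and breakpoints[k] < hi:
        left = max(lo, breakpoints[k])
        right = min(hi, breakpoints[k + 1])
        total += values[k] * (right - left)
        k += 1
    return total


def haar_descriptor(breakpoints, values, a, b):
    """Exact Haar coefficient of a piecewise-constant function."""
    mid = b + 0.5 * a
    positive = integral_piecewise_constant(breakpoints, values, b, mid)
    negative = integral_piecewise_constant(breakpoints, values, mid, b + a)
    return (positive - negative) / math.sqrt(a)


descriptor = haar_descriptor([0.0, 1.0, 3.0, 4.0], [0.0, 2.0, -1.0], a=4.0, b=0.0)
```

Parts of the wavelet that fall outside the definition interval contribute nothing, in line with the remark that the integration outside of the interval is always zero.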

5. Mexican Hat Wavelet Descriptors (MWD)

Similar to the Haar Wavelet descriptors, we can calculate the Mexican Hat wavelet descriptors as given below. Using the Mexican Hat function as basis wavelet we receive the wavelet building set as given in the following: The corresponding wavelet descriptors are expressed as Using the parameter transformation in (5) and the normalization in (6), we obtain the following: After small modification, we receive the following: Multiplying the terms in (18) then yields the following: The above given integration includes the following four terms: Equation (20) can be expressed as follows: After solving the integrals in (21), we receive the final expression of the MWD as given in the following:

In (22) only the first term includes shape information, since it includes . All other terms depend on the parameters and and are constant for a given Mexican Hat wavelet. For this reason, these terms do not need to be considered in the comparison of the wavelet descriptors given in (22).

6. Derivation of Wavelet Descriptors

The wavelets used in (7) can be seen as a filter bank with high- and low-frequency signals. With the increase in scale , the function is dilated in time to focus on long-time behavior of the associated signal . In general, a large scale allows a global view of the signal, while a small scale shows a detailed view of the signal. To take this into account and to obtain suitable WD for representing a given object shape, we vary the values of the compression or dilatation parameter and the shifting parameter according to the following equations: with ;  : number of WD and

As shown in (23) the parameters and are always positive. By changing the parameters   and  in this equation, we obtain a sufficient wavelet building set, which covers the complete definition interval of the angle function. Similar to the MRA [14], we can vary the parameters , , and to construct a wavelet building set with different low- and high-frequency signals to obtain components from the approximation as well as detail signal. This is important, since the components of the approximation signal are needed to describe the rough shape and the detailed signal components to describe small shape changes of the object. For a given value of and depending on , we receive values . For all these values we use the wavelets as scaling functions. All other wavelets, for which , are used as approximation functions. To receive components from the detail signal, which corresponds to the wavelet signals with high-frequency , we can choose the higher values of with the appropriate values of . A better alternative, however, is to use the reciprocal values of , which are used to receive components from the approximation signal. Generally, only a small number of WD (e.g., 32 or 64) is needed in practical recognition applications to describe different object shapes. In this case, the parameter can be set to 4 if we use the reciprocal value of to include components of the detail signal. For , Figure 7 shows a part of the Haar as well as Mexican Hat wavelet functions.

As shown in this figure, small values of the parameter produce compressed variants of the mother wavelet, whereas large values create dilated variants. In both cases, we receive an approximation signal of the wavelet transformation, since . To receive components of the detail signal, which describes small details of the contour shape, we can use in combination with the same values of . For such values we obtain WD that are suited to capturing fine-scale similarities between the compared shapes.

7. Experimental Results

Figure 8 shows the overall procedure of the proposed shape description and recognition approach using WD. The implementation was carried out as part of a robot vision system for puzzle composition. The WD comparison is done here using the minimum distance matrix (MDM).

This section presents some results of the above-described method. For better illustration, we apply the wavelet descriptor method within a robot vision system environment to enable the robot to compose a puzzle of size 20 × 30 cm. The puzzle is made of wood and includes 10 different puzzle pieces. The size of the puzzle pieces varies between 3 × 3 and 13 × 7 cm. Here, the robot has to recognize the pieces based on their shapes and place them into the correct slots of the puzzle board. The pieces are distributed on a flat surface near the puzzle board so that a CCD camera can acquire a grey level image of the puzzle. The location and orientation of the puzzle pieces and puzzle board can be chosen arbitrarily. The recognition of the pieces is made by comparing the wavelet descriptors using the minimum distance method. The overall procedure of the proposed shape description and recognition approach is given in Figure 8.

Figure 9 shows the camera image. The shapes enclosed within the blue rectangle are the slots in the puzzle board. The three crosses serve as reference points with known robot coordinates and are used to transform pixel coordinates into robot coordinates. The crosses do not affect the presented results. Figure 10 shows the extracted and approximated contours of the puzzle pieces as polygons. Here the calculated polygon vertices of every contour are marked by a red point; the starting points are marked on each contour by a circle. The number of polygon vertices in this image varies between 10 and 23 points. The chosen approximation method [10] takes into account the curvature along the given contour so that contour parts with high curvature are mapped by a higher number of polygon vertices than contour parts with slight curvature. It is important to mention that the number as well as the positions of the polygon vertices vary slightly for the same shape in different images due to quantization and binarization noise. Thus, the angles between the polygon edges of an extracted contour will also vary accordingly. A separate assessment of the impact of noise on the angle functions and hence the wavelet descriptors is shown in Section 8 based on an artificially noisy image of a puzzle piece. For every contour of the puzzle pieces we determined 25 WD from the approximation and another 25 WD from the detail signal. The WD are used to identify the puzzle pieces by calculating the minimum distance matrix as shown below. Figures 11 and 12 show the first 16 MWD and HWD obtained from the approximation () and the detail signal () for the giraffe as well as horse shaped puzzle pieces given in Figure 9. The starting points used for the derived angle functions are marked in Figure 9 in green color. The dilatation or compression parameter and shifting parameter are calculated as given in (23) for and .
The reported experimental results are calculated in all cases using the strategy described in Section 10 to ensure starting point independence.

To measure the similarity between two object shapes we can calculate the differences between the WD of the different shapes using the following Euclidean distance : where and are the WD of the two compared shapes and is the number of WD taken from the approximation and/or detail signal. Table 1 shows the Euclidean distance matrix calculated from the MWD approximation signal for the shapes of Figure 9 for . In this matrix, every cell value represents the smallest Euclidean distance between the two puzzle shapes given in the row and column of the selected cell, calculated by varying the starting points as described in Section 10. To explain the results in Table 1, Figure 13 shows, as a bar chart, the Euclidean distance values of MWD between the puzzle pieces “Horse” (sixth row of Table 1) and “Giraffe” (eighth row of Table 1) and the ten shapes within the puzzle board (columns 1 to 10 of Table 1), calculated for from the approximation signal.

As can be seen from Figure 13, the Euclidean distances are small for the same shapes and relatively large for different shapes. Thus, these values are adequate for recognizing the given shapes “Horse” and “Giraffe.” This fact applies to all shapes of Figure 9 (see green bar in Figure 13). Similar results are obtained using HWD instead of MWD with the difference that the values are higher. Comparable results were also obtained for MWD and HWD of the detail signals as well. By combining the 25 WD from the approximation and 25 WD from the detail signal, we receive the final MWD and HWD components, which represent the different puzzle shapes.
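The distance-based comparison described in this section can be sketched as follows. The code computes plain Euclidean distances between descriptor vectors and picks, for each piece, the slot with the smallest distance; the additional minimization over starting points is left out here. All names (`euclidean_distance`, `distance_matrix`, `recognize`) and the toy descriptor values are hypothetical.

```python
import math


def euclidean_distance(wd_a, wd_b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(wd_a) != len(wd_b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(wd_a, wd_b)))


def distance_matrix(pieces, slots):
    """Row i, column j holds the distance between piece i and slot j."""
    return [[euclidean_distance(p, s) for s in slots] for p in pieces]


def recognize(matrix):
    """For every row, return the column index with the smallest distance."""
    return [min(range(len(row)), key=row.__getitem__) for row in matrix]


# Toy descriptors with two values each, standing in for real WD sets.
pieces = [[0.0, 0.0], [10.0, 10.0]]
slots = [[9.0, 9.0], [1.0, 0.0]]
assignment = recognize(distance_matrix(pieces, slots))
```

When unknown objects may be present, the smallest distance in a row would additionally be compared against a threshold before the match is accepted.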

8. Impact of Image Noises

To study the impact of image noise on the angle function and thus on the derived WD, we added artificial noise to the image of “Giraffe.” Figure 14 shows the image with and without noise as well as the extracted and approximated contours of both images.

As shown in Figure 14, the contour polygons of “Giraffe” show large differences because of noise. The number of polygon vertices for the shape without noise is 22 and with noise only 18. The positions of the polygon vertices differ slightly. The main inertial axes show, however, large differences. Figure 15 shows the periodical angle function compared for both cases for the same starting point. As shown in this figure, the differences are relatively small despite the changes of the number and positions of the polygon vertices. The differences between the WD are marginal (Figure 16). Here 25 MWD from the approximation and 25 MWD from the detail signal were calculated. Here, the minimum distance between the WD is 0.6.

It should be noted that the individual values of the WD have marginal differences, despite the relatively large minimum distance of 0.6. The evaluation of the WD by comparing the individual values, for example, using a fuzzy method [8], can deliver better results.

9. Comparison between FD and WD

To compare the wavelet descriptors with the Fourier descriptors, we first calculate 50 FD of all given shapes of Figure 9. Figure 17 shows the first 16 values of the calculated FD for both considered shapes “Horse” and “Giraffe.” The Euclidean distance matrix is given in Table 2, where all 50 FD were considered. The Euclidean distance values between the puzzle pieces “Horse” and “Giraffe” and the ten shapes within the puzzle board are drawn in Figure 18. The diagrams show that the minimum distance values of the FD are also suitable, similar to the WD, for recognizing both shapes, since the distances between the same shapes are the smallest. The only difference is the relatively small values of the minimum distance in comparison with the values of Table 1 and Figure 13. This can cause confusion in recognition tasks when the images are affected by noise. In order to assess the influence of image noise on the values of the FD, we calculated the FD for the shape “Giraffe” with artificial noise (see Figure 14). Figure 19 compares the values of this shape with and without noise. As seen in this figure, the individual values differ only slightly, similar to the WD. The minimum distance between the same shapes is 0.43.

As shown above, using the approximation and/or detail signal, it is very easy to recognize object shapes using a small number of MWD or HWD by calculating the Euclidean distance given in (25). For the example in Figure 9, the results show that the recognition can be achieved using either the approximation signal, the detail signal, or the combination of both signals. In addition, the recognition is comparable if we use the MWD or HWD, with the difference that the values of HWD are higher than the values of MWD. The minimum distance is a simple way to evaluate WD. This method has the disadvantage of losing the information about the local WD differences. For this reason, the comparison of each WD can deliver better results in some cases, although it requires a higher computational effort.

Compared to the recognition using the FD, the recognition using WD is better suited because the differences between the minimum distance values are significantly higher. This is important for recognition tasks in which not only known but also unknown objects are present. In this case, it is necessary to define a threshold for the minimum distance value to distinguish known and unknown objects. The disadvantage of the WD is their dependency on the starting point on the contour. This problem can be solved by calculating the WD for every possible starting point. This will be explained below in Section 10.

10. Solving the Problem of Starting Point

The results in Section 7 are obtained under the condition of starting point equality. If the starting points change, the angle functions will also change and with them the corresponding WD. If we change the starting point of “Horse,” for instance, from the green colored position to the red one, we receive the MWD as shown in Figure 20. Here both approximation (MWD 1–25) and detail signals (MWD 26–50) are shown in the same diagram.

As shown in Figure 20, the change of the starting point leads to large changes in the WD. The Euclidean distances between the MWD of the same object with different starting points are 4.8 for the approximation and 6.1 for the detail signal. This is due to the change of the angle function within the interval according to the change of the starting point. Figure 21 shows the periodical angle functions of the example shape “Giraffe” for the two different starting points of Figure 9.

Since the position of the starting point as well as the number of polygon corners for a given object in real applications depends on several parameters that cannot be fixed, such as the position and rotation of the objects in the image, the number of objects, and the extraction and approximation method, the above-mentioned issue can cause confusion in recognition tasks: it is not clear whether large values of the Euclidean distance are related to shape differences or to different starting points. In that case, the recognition process using the minimum distance method will fail. To solve this problem, it is necessary either to specify a striking point as a starting position on the contour and to calculate the WD for this point or to calculate the WD for all found polygon corners as shown below.

The first strategy can be carried out using the inertial axes of the given objects if these do not show large changes across different images. A striking point could be an intersection point between one of these axes and the considered contour. In Figure 22 the inertial axes for the example of Figure 9 are drawn in red color. As shown in this figure, the axes of inertia have small as well as large deviations despite the low image noise. The large deviations occur in many cases due to existing object symmetries, as is the case for some of the puzzle pieces. For this reason, we solve the problem of the starting point using the second strategy as follows.

Suppose we have a number of object samples and an unknown object , which must be classified according to one of the given object classes. The procedure can then be performed as follows.
(i) Calculate the WD of all objects for an arbitrary starting point and store them in a database.
(ii) Calculate the WD sets (e.g., 25 WD from the approximation and 25 WD from the detail signal) for all possible starting points () of the unknown object . This can be done easily if we use the polygon description of the object contour and change the starting point from one polygon corner to the next by shifting the length and corresponding angles of the contour polygon.
(iii) Compare the WD sets of the unknown object with the stored WD of the object samples using the Euclidean distance method. We receive a number of Euclidean distances ; ; according to the number of different starting points used in step (ii) and the number of object samples given in step (i).
(iv) Find the minimum value of . The stored object sample related to this minimum value represents the recognized object.
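The four steps above can be sketched as a short search loop. This is an illustration under assumptions: `descriptor_fn` is a hypothetical callable that maps the edge lengths and angles of a polygon (in starting-point order) to a list of WD, `samples` is a hypothetical dictionary of stored WD per object class, and shifting the starting point is modeled as a cyclic rotation of both lists.

```python
import math


def rotate_start(lengths, angles, k):
    """Move the starting point to polygon corner k by cyclic rotation."""
    return lengths[k:] + lengths[:k], angles[k:] + angles[:k]


def euclidean_distance(wd_a, wd_b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(wd_a, wd_b)))


def classify(lengths, angles, samples, descriptor_fn):
    """Return (distance, sample name, starting corner) of the best match.

    samples: dict mapping a sample name to its stored descriptor list.
    descriptor_fn: hypothetical callable (lengths, angles) -> descriptor list.
    """
    best = (math.inf, None, None)
    for k in range(len(lengths)):
        wd = descriptor_fn(*rotate_start(lengths, angles, k))
        for name, stored in samples.items():
            d = euclidean_distance(wd, stored)
            if d < best[0]:
                best = (d, name, k)
    return best


# Toy run: the "descriptor" is just the list of edge lengths, which depends
# on the starting point in the same way real WD do.
toy_descriptor = lambda lengths, angles: list(lengths)
stored = {"a": [1.0, 2.0, 3.0], "b": [5.0, 5.0, 5.0]}
result = classify([3.0, 1.0, 2.0], [0.0, 0.0, 0.0], stored, toy_descriptor)
```

Only the unknown object is evaluated at every corner; the stored samples keep a single descriptor set each, so the cost grows with the number of polygon corners of the unknown object times the number of samples.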

The strategy given above allows for complete recognition regardless of the starting point. This feature constitutes an important condition for use in industrial applications. The method needs higher computational effort. This is, however, negligible relative to the image acquisition and preprocessing time. For the example in Figure 9 with 23 objects (including the three reference crosses) and a total of 333 polygon corners for all objects, we calculated 50 WD for every object and every polygon corner (this means 333 × 50 = 16,650 WD). The total time needed for all WD was 48 ms using a standard PC with a 3 GHz processor. For an object with, for example, 20 polygon vertices, this means that the time needed to calculate the WD according to (14) and (22) is smaller than 3 ms. Compared with the image acquisition time of 40 ms (European CCIR norm), the effort is reasonable.
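As a plausibility check of these timing figures, and assuming that the computation time is spread evenly over all starting points (an assumption that the text does not state explicitly), the numbers relate as follows:

```latex
\[
333 \times 50 = 16\,650 \ \text{WD}, \qquad
\frac{48\ \text{ms}}{333} \approx 0.144\ \text{ms per starting point}, \qquad
20 \times 0.144\ \text{ms} \approx 2.9\ \text{ms} < 3\ \text{ms}.
\]
```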

11. Conclusion

The representation of object contours using wavelet descriptors is more efficient than Fourier descriptors in object recognition tasks, since the differences between the WD for different objects are significantly larger than between the FD. In particular, the Mexican Hat as well as the Haar function are suitable for use as mother wavelets to obtain a sufficient number of WD, which can be used in recognition tasks. The WD can be calculated very easily using (14) for the HWD and (22) for the MWD. The number of WD needed to recognize a given object increases according to the complexity of the object shapes and must be set according to the given application. It is possible in some cases to use only the components of the approximation signal in order to recognize an unknown object using the minimum distance method, but generally the use of the detail signal will include detail information about small contour changes between the compared objects. The starting point on the contour has a large influence on the recognition process because the values of the WD depend strongly on it. The paper describes one possible solution, in which not only one set of WD is computed and compared with the stored WD of the object samples but also several sets of WD according to the different starting points. This will increase the time consumption of the recognition process compared with the time consumption needed using FD. However, this is no longer a problem due to the speed of the current generation of computers.

Appendices

A.

If we assume that edge is located within polygon edge , edge within polygon edge , and edge within polygon edge (see Figure 6), the subintegrals in (13) can be calculated as follows:

Here ,    for and   for . In the example of Figure 6, , , . In this example, is equal to zero, since .

Because in , , , and within the integration limits are constant, these subintegrals can be easily solved and we obtain the following:

, , and are the values of the angle function at the polygon edges , , and . These values change for a given contour only if the polygon starting point changes.

The integration in and can be divided into several parts according to the number of polygon edges included within the integral limits. For , we receive

Since the value of within the integral limits is constant we receive

We can modify (A.4) as given below and finally receive

The given difference in (18) represents the angle difference between the polygon edge and and can be replaced by . We receive Similar to we receive the following for :

If we now add we receive the following expression: Equation (A.8) can be simplified to (A.9) by combining similar terms:

If we insert (A.9) into (13) we receive the Haar wavelet descriptors as given by (14).

B.

Using the following parameter transformation:

with and ,

the values in (21) can be written as follows:

The terms and represent constant values because they are independent of the angle function .

These terms include the following different integrations: The sum of and results in the following: After combining similar terms we obtain the following:

The terms and include the angle function and can be calculated by dividing the total integral into several integral terms. For we receive

where and the number of polygon edges.

Since the values within the integral limits are constant, we can write The integration yields the following expression: or and after small modification we obtain the following: With we receive the following:

If we now add and , we receive the following:

Finally, by adding all four terms and using (21) we receive the Mexican Hat wavelet descriptors (MWD) as given in (22).