Abstract

We apply the concept of Fuzzy Transform (for short, F-transform) for improving the results of image matching based on the Greatest Eigen Fuzzy Set (for short, GEFS) with respect to the max-min composition and the Smallest Eigen Fuzzy Set (for short, SEFS) with respect to the min-max composition, already studied in the literature. The direct F-transform of an image can be compared with the direct F-transform of a sample image to be matched, and we use suitable indexes to measure the degree of similarity between the two images. We perform our experiments on an image dataset extracted from the well-known Prima Project View Sphere Database, comparing the results obtained with this method with those based on GEFS and SEFS. Other experiments are performed on frames of videos extracted from the Ohio State University dataset.

1. Introduction

Solution methods of fuzzy relational equations have been well studied in the literature (cf., e.g., [1–15]) and applied to image processing problems like image compression [16–19] and image reconstruction [7, 8, 20–22]. In particular, Eigen Fuzzy Sets [23–25] have been applied to image processing and medical diagnosis [2, 6, 7, 16]. If an image of sizes $N \times N$ (pixels) is interpreted as a fuzzy relation $R$ on the set $X \times X$, $X = \{1, \ldots, N\}$, the concepts of the Greatest Eigen Fuzzy Set (for short, GEFS) of $R$ with respect to the max-min composition and of the Smallest Eigen Fuzzy Set (for short, SEFS) of $R$ with respect to the min-max composition [2, 24, 25] were studied and used in [26, 27] for an image matching process defined over square images. The GEFS and SEFS of the original image are compared with the GEFS and SEFS of the image to be matched by using a similarity measure based on the Root Mean Square Error (for short, RMSE). The advantage of using GEFS and SEFS in terms of memory storage is that we can compress an image dataset (in which each image has sizes $N \times N$) into a dataset in which each image is stored by means of its GEFS and SEFS, which together have dimension $2N$.

The main disadvantage of using GEFS and SEFS is that we cannot compare images in which the number of rows is different from the number of columns. Our aim is to show that we can use the F-transform for image matching problems, reducing a dataset of images of sizes $N \times M$ (in general, $N$ is not necessarily equal to $M$) into a dataset whose dimensions are comparable with those obtained by using GEFS and SEFS if $N = M$, so gaining the same convenience in terms of memory storage.

The F-transform based method [28–30] is used in the literature for image and video compression [29, 31–33], image segmentation [20], and data analysis [22, 34]; indeed, in [31, 32] the quality of the decoded images obtained by using the F-transform compression method is shown to be better than that obtained with fuzzy relation equations and fully comparable with the JPEG technique.

The main characteristic of the F-transform method is that it maintains an acceptable quality in the reconstructed image even under strong compression rates; indeed, in [20] the authors show that the segmentation process can be applied directly over the compressed images. Here we use the direct F-transform in image matching analysis with the aim of reducing the memory used to store the image dataset. In fact, we compress a monochromatic image (or a band of a multiband image) of sizes $N \times M$ via the direct F-transform to a matrix of sizes $n \times m$ (with $n < N$ and $m < M$), using a compression rate $\rho = (n \cdot m)/(N \cdot M)$.

By using a distance, we compare the F-transform of each image with the F-transform of the sample image. We also adopt a preprocessing phase for compressing each image with several compression rates. In Figure 1 we show the preprocessing phase on a dataset of color images. We compress each color image in the three monochromatic components corresponding to the three bands R, G, and B.

At the end of the preprocessing phase we can use the compressed image dataset for image matching analysis. Supposing that the original image dataset is composed of $s$ color images of sizes $N \times M$ and that a compression rate $\rho = (n \cdot m)/(N \cdot M)$ is used, the compressed image dataset is constituted in total of $3 \cdot s \cdot n \cdot m$ pixels.
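
As a purely illustrative example (the figures below are hypothetical and are not those of the datasets used in Section 5): with $s = 100$ color images of sizes $N \times M = 256 \times 256$ and a coding of each band into $n \times m = 64 \times 64$ F-transform components, the compressed dataset occupies $3 \cdot 100 \cdot 64 \cdot 64 = 1228800$ values against the $3 \cdot 100 \cdot 256 \cdot 256 = 19660800$ pixels of the original dataset, that is, a compression rate $\rho = (64 \cdot 64)/(256 \cdot 256) = 1/16$.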

In Figure 2 we schematize the image matching process. The sample image is compressed by the F-transform method; then we compare the three compressed bands of each image obtained via F-transform with those deduced for the sample image by using the Peak Signal to Noise Ratio (for short, PSNR). At the end of this process, we determine the image in the dataset with the greatest overall PSNR with respect to the sample image.
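
To make the matching step of Figure 2 concrete, the following Python/NumPy sketch selects the dataset image with the greatest overall PSNR. All function and variable names are ours, and the combination of the three band PSNRs by their mean is an assumption (cf. the overall PSNR of Section 4), not a formula reproduced from the paper.

    import numpy as np

    def overall_psnr(sample_coded, coded, L=255.0):
        """Mean PSNR over the three bands R, G, B; each argument is a list of three
        arrays holding the F-transform components of one band (same shapes).
        L is the length of the gray scale; pass L=1.0 if the components are kept
        normalized in [0, 1]."""
        values = []
        for S, C in zip(sample_coded, coded):
            rmse = max(np.sqrt(np.mean((S - C) ** 2)), 1e-12)  # guard for identical bands
            values.append(20.0 * np.log10(L / rmse))
        return sum(values) / 3.0

    def best_match(sample_coded, dataset):
        """dataset: dict mapping an image identifier to its three coded bands.
        Returns the identifier of the image with the greatest overall PSNR."""
        return max(dataset, key=lambda key: overall_psnr(sample_coded, dataset[key]))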

Here a monochromatic image or a band of a color image of sizes $N \times M$ is interpreted as a fuzzy relation $R$ whose entries are obtained by normalizing the intensity $P(i,j)$ of each pixel with respect to the length $L$ of the gray scale, that is, $R(i,j) = P(i,j)/L$. We show that our F-transform approach can also be applied in image matching processes to images of sizes $N \times M$ (eventually, $N \neq M$), giving results analogous to those obtained with the GEFS and SEFS based method. The comparison tests are made on the color image dataset extracted from the View Sphere Database, an image dataset consisting of images of objects in which each object is photographed from various directions by a camera placed on a semisphere whose center is the object itself. We also use sample color video datasets of the Ohio State University for our tests. Each video is composed of frames consisting of color images; we show the results for the Mom-Daughter and sflowg motions. In Section 2 we recall the concept of F-transform in two variables. In Section 3 we recall the GEFS and SEFS based method; in Section 4 we propose our image matching method based on F-transforms. Our experiments are illustrated in Section 5, and Section 6 contains concluding remarks.

2. F-Transforms in Two Variables

Following [29] and limiting ourselves to the discrete case, let $n \geq 2$ and $x_1 < x_2 < \cdots < x_n$ be an increasing sequence of points (nodes) of $[a,b]$, $x_1 = a$, $x_n = b$. We say that the fuzzy sets $A_1, \ldots, A_n : [a,b] \to [0,1]$ (basic functions) form a fuzzy partition of $[a,b]$ if the following hold: (1) $A_k(x_k) = 1$ for every $k = 1, \ldots, n$; (2) $A_k(x) = 0$ if $x \notin (x_{k-1}, x_{k+1})$, where we set $x_0 = a$ and $x_{n+1} = b$; (3) $A_k$ is a continuous function on $[a,b]$; (4) $A_k$ strictly increases on $[x_{k-1}, x_k]$ for $k = 2, \ldots, n$ and strictly decreases on $[x_k, x_{k+1}]$ for $k = 1, \ldots, n-1$; (5) $\sum_{k=1}^{n} A_k(x) = 1$ for every $x \in [a,b]$.

The fuzzy partition $\{A_1, \ldots, A_n\}$ is said to be uniform if (6) $n \geq 3$ and $x_k = a + h \cdot (k-1)$, where $h = (b-a)/(n-1)$ and $k = 1, \ldots, n$ (equidistant nodes); (7) $A_k(x_k - x) = A_k(x_k + x)$ for every $x \in [0, h]$ and $k = 2, \ldots, n-1$; (8) $A_{k+1}(x) = A_k(x - h)$ for every $x \in [x_k, x_{k+1}]$ and $k = 1, \ldots, n-1$.

Let $x_1 < \cdots < x_n \in [a,b]$ and $y_1 < \cdots < y_m \in [c,d]$ be nodes such that $x_1 = a$, $x_n = b$, $y_1 = c$, $y_m = d$. Furthermore, let $\{A_1, \ldots, A_n\}$ be a fuzzy partition of $[a,b]$, let $\{B_1, \ldots, B_m\}$ be a fuzzy partition of $[c,d]$, and let $f$ be an assigned function, $f : P \times Q \to [0,1]$, with $P = \{p_1, \ldots, p_N\} \subset [a,b]$ and $Q = \{q_1, \ldots, q_M\} \subset [c,d]$ being “sufficiently dense” sets with respect to the chosen partitions; that is, for each $k = 1, \ldots, n$ (resp., $l = 1, \ldots, m$) there exists an index $i \in \{1, \ldots, N\}$ (resp., $j \in \{1, \ldots, M\}$) such that $A_k(p_i) > 0$ (resp., $B_l(q_j) > 0$). The matrix $[F_{kl}]$ is said to be the direct F-transform of $f$ with respect to $\{A_1, \ldots, A_n\}$ and $\{B_1, \ldots, B_m\}$ if we have for each $k = 1, \ldots, n$ and $l = 1, \ldots, m$

$$F_{kl} = \frac{\sum_{j=1}^{M} \sum_{i=1}^{N} f(p_i, q_j) A_k(p_i) B_l(q_j)}{\sum_{j=1}^{M} \sum_{i=1}^{N} A_k(p_i) B_l(q_j)}.$$

Then the inverse F-transform of $f$ with respect to $\{A_1, \ldots, A_n\}$ and $\{B_1, \ldots, B_m\}$ is the function $f^{F}_{nm}$ defined on $P \times Q$ as

$$f^{F}_{nm}(p_i, q_j) = \sum_{k=1}^{n} \sum_{l=1}^{m} F_{kl} A_k(p_i) B_l(q_j).$$

The following existence theorem holds [29].
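
A minimal computational sketch of these definitions is given below (Python with NumPy; all names are ours). The raised-cosine basic functions used here are one common choice satisfying conditions (1)–(8) and are adopted only as an assumption; any other uniform fuzzy partition could be used in their place.

    import numpy as np

    def basic_functions(N, n):
        """Uniform fuzzy partition of the points {0, ..., N-1} with n equidistant
        nodes and raised-cosine basic functions; returns an (n, N) matrix A with
        A[k, i] = A_k(i)."""
        x = np.arange(N, dtype=float)
        nodes = np.linspace(0.0, N - 1.0, n)
        h = (N - 1.0) / (n - 1.0)                      # distance between nodes
        A = np.zeros((n, N))
        for k, xk in enumerate(nodes):
            inside = np.abs(x - xk) <= h
            A[k, inside] = 0.5 * (np.cos(np.pi * (x[inside] - xk) / h) + 1.0)
        return A

    def direct_F_transform(f, n, m):
        """Direct F-transform of an N x M array f (values in [0, 1]) into an
        n x m matrix F, following the definition above."""
        N, M = f.shape
        A, B = basic_functions(N, n), basic_functions(M, m)
        return (A @ f @ B.T) / np.outer(A.sum(axis=1), B.sum(axis=1))

    def inverse_F_transform(F, N, M):
        """Inverse F-transform: reconstructs an N x M array from the n x m matrix F."""
        n, m = F.shape
        A, B = basic_functions(N, n), basic_functions(M, m)
        return A.T @ F @ B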

Theorem 1. Let $f : P \times Q \to [0,1]$ be a given function, with $P = \{p_1, \ldots, p_N\} \subset [a,b]$ and $Q = \{q_1, \ldots, q_M\} \subset [c,d]$. Then for every $\varepsilon > 0$ there exist two integers $n = n(\varepsilon)$, $m = m(\varepsilon)$ and related fuzzy partitions $\{A_1, \ldots, A_n\}$ of $[a,b]$ and $\{B_1, \ldots, B_m\}$ of $[c,d]$ such that the sets $P$, $Q$ are sufficiently dense with respect to such partitions and $|f(p_i, q_j) - f^{F}_{nm}(p_i, q_j)| < \varepsilon$ is satisfied for every $i = 1, \ldots, N$ and $j = 1, \ldots, M$.

Let $R$ be a gray image of sizes $N \times M$, seen as a fuzzy relation $R : \{1, \ldots, N\} \times \{1, \ldots, M\} \to [0,1]$, with $R(i,j) = P(i,j)/L$ being the normalized value of the pixel $P(i,j)$, where $L$ is the length of the gray scale. In [27] $R$ is compressed via the F-transform $F = [F_{kl}]$ ($n < N$, $m < M$), defined for each $k = 1, \ldots, n$ and $l = 1, \ldots, m$ as

$$F_{kl} = \frac{\sum_{j=1}^{M} \sum_{i=1}^{N} R(i,j) A_k(i) B_l(j)}{\sum_{j=1}^{M} \sum_{i=1}^{N} A_k(i) B_l(j)},$$

where $\{A_1, \ldots, A_n\}$ (resp., $\{B_1, \ldots, B_m\}$) is a fuzzy partition of $[1, N]$ (resp., $[1, M]$). The following fuzzy relation $R^{F}_{nm}$ is the decoded version of $R$, and it is defined as

$$R^{F}_{nm}(i,j) = \sum_{k=1}^{n} \sum_{l=1}^{m} F_{kl} A_k(i) B_l(j)$$

for every $(i,j) \in \{1, \ldots, N\} \times \{1, \ldots, M\}$. We have subdivided $R$ in submatrices $R_B$ of sizes $N_B \times M_B$, called blocks (cf., e.g., [2, 16]), compressed to blocks $F_B$ of sizes $n_B \times m_B$ via the direct F-transform defined for each $k = 1, \ldots, n_B$ and $l = 1, \ldots, m_B$ as

$$F_B[k,l] = \frac{\sum_{j=1}^{M_B} \sum_{i=1}^{N_B} R_B(i,j) A_k(i) B_l(j)}{\sum_{j=1}^{M_B} \sum_{i=1}^{N_B} A_k(i) B_l(j)}.$$

The basic functions $A_k$ (resp., $B_l$) constitute a uniform fuzzy partition of $[1, N_B]$ (resp., $[1, M_B]$) in the sense above, with equidistant nodes $x_k = 1 + h(k-1)$, $h = (N_B - 1)/(n_B - 1)$, $k = 1, \ldots, n_B$ (resp., $y_l = 1 + s(l-1)$, $s = (M_B - 1)/(m_B - 1)$, $l = 1, \ldots, m_B$). We decompress $F_B$ to the block $R'_B$ of sizes $N_B \times M_B$ by setting, for every $(i,j)$,

$$R'_B(i,j) = \sum_{k=1}^{n_B} \sum_{l=1}^{m_B} F_B[k,l] A_k(i) B_l(j),$$

which approximates $R_B$ up to an arbitrary quantity $\varepsilon$ in the sense of Theorem 1, which, unfortunately, does not give a method for finding two integers $n_B$ and $m_B$ such that the error stays below a prescribed $\varepsilon$. Then we try several values of $n_B$ and $m_B$. For every compression rate $\rho = (n_B \cdot m_B)/(N_B \cdot M_B)$, we evaluate the quality of the reconstructed image via the PSNR defined as

$$\mathrm{PSNR} = 20 \cdot \log_{10} \frac{L}{\mathrm{RMSE}}, \quad (9)$$

where RMSE is

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \sum_{j=1}^{M} \left(P(i,j) - P'(i,j)\right)^{2}}{N \cdot M}}. \quad (10)$$

Here $P'(i,j) = L \cdot R'(i,j)$, and $R'$ is the reconstructed image obtained by recomposing the blocks $R'_B$.
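
The block-wise coding and the quality evaluation (9)-(10) can be sketched as follows, reusing direct_F_transform and inverse_F_transform (and the NumPy import) from the previous sketch; the names and the assumption that the image sizes are multiples of the block sizes are ours.

    def compress_blocks(R, block_shape, coded_shape):
        """Codes a normalized band R (N x M) block by block with the direct
        F-transform; assumes N, M are multiples of the block sizes."""
        NB, MB = block_shape
        return [direct_F_transform(R[i:i + NB, j:j + MB], *coded_shape)
                for i in range(0, R.shape[0], NB)
                for j in range(0, R.shape[1], MB)]

    def decompress_blocks(coded, image_shape, block_shape):
        """Recomposes the decoded band from the coded blocks via the inverse F-transform."""
        NB, MB = block_shape
        R_rec, blocks = np.zeros(image_shape), iter(coded)
        for i in range(0, image_shape[0], NB):
            for j in range(0, image_shape[1], MB):
                R_rec[i:i + NB, j:j + MB] = inverse_F_transform(next(blocks), NB, MB)
        return R_rec

    def psnr(P, P_rec, L=255.0):
        """PSNR (9) between the original and reconstructed pixel values (0..L),
        with RMSE as in (10)."""
        rmse = max(np.sqrt(np.mean((P - P_rec) ** 2)), 1e-12)
        return 20.0 * np.log10(L / rmse)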

3. Max-Min and Min-Max Eigen Fuzzy Sets

Let $X = \{x_1, \ldots, x_N\}$ be a nonempty finite set, $R : X \times X \to [0,1]$, and $A : X \to [0,1]$ such that

$$A \circ R = A,$$

where “$\circ$” is the max-min composition. In terms of membership functions, we have that

$$A(y) = \max_{x \in X} \{\min\{A(x), R(x,y)\}\}$$

for all $x, y \in X$, and $A$ is defined as an Eigen Fuzzy Set of $R$. Let $A_1, A_2, \ldots, A_p, \ldots$ be defined iteratively by

$$A_1(y) = \max_{x \in X} R(x,y), \qquad A_{p+1} = A_p \circ R \quad (p = 1, 2, \ldots)$$

for every $y \in X$. It is known [2, 24, 25] that there exists an integer $p$ such that $A_p$ is the GEFS of $R$ with respect to the max-min composition. We also consider the following equation:

$$A \,\square\, R = A, \quad (14)$$

where “$\square$” denotes the min-max composition, that is, in terms of membership functions,

$$A(y) = \min_{x \in X} \{\max\{A(x), R(x,y)\}\}$$

for all $x, y \in X$, and $A$ is also defined to be an Eigen Fuzzy Set of $R$ with respect to the min-max composition. It is easily seen that (14) is equivalent to the following:

$$A' \circ R' = A',$$

where $R'$ and $A'$ are pointwise defined as $R'(x,y) = 1 - R(x,y)$ and $A'(x) = 1 - A(x)$ for all $x, y \in X$. Since $A'_p$ for some integer $p$ is the GEFS of $R'$ with respect to the max-min composition, it is immediately proved that the fuzzy set $B$ defined as $B(x) = 1 - A'_p(x)$ for every $x \in X$ is the SEFS of $R$ with respect to the min-max composition.
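
The iterative computation described above can be sketched as follows (Python with NumPy; names are ours). Since the iterates are nonincreasing and take values among the entries of $R$, the loop stops after finitely many steps.

    import numpy as np

    def gefs(R):
        """Greatest Eigen Fuzzy Set of the fuzzy relation R (N x N) with respect to
        the max-min composition, obtained by iterating A_{p+1} = A_p o R starting
        from the greatest fuzzy set (all memberships equal to 1); the first iterate
        coincides with A_1 above (the column maxima of R)."""
        A = np.ones(R.shape[0])
        while True:
            A_next = np.max(np.minimum(A[:, None], R), axis=0)  # (A o R)(y)
            if np.array_equal(A_next, A):
                return A
            A = A_next

    def sefs(R):
        """Smallest Eigen Fuzzy Set of R with respect to the min-max composition,
        obtained as the complement of the GEFS of the complement relation."""
        return 1.0 - gefs(1.0 - R)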

In [27] a distance based on GEFS and SEFS is used for image matching over images of sizes $N \times N$. Indeed, considering two single-band images of sizes $N \times N$, say $i_1$ and $i_2$, such distance (17) is based on the RMSE between the corresponding Eigen Fuzzy Sets, where $A_{i_t}$ and $B_{i_t}$ ($t = 1, 2$) are the GEFS and SEFS of the fuzzy relation $R_{i_t}$, respectively, obtained by normalizing in $[0,1]$ the pixels of the image $i_t$.
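
Since the explicit expression of (17) is not reproduced here, the sketch below (using gefs and sefs from the previous sketch) only illustrates one plausible RMSE-style combination of the GEFS and SEFS of two normalized bands; it is an assumption of ours and not necessarily the exact formula used in [26, 27].

    def gefs_sefs_distance(R1, R2):
        """Illustrative RMSE-style distance between two N x N normalized bands,
        built from their GEFS (max-min) and SEFS (min-max); cf. (17)."""
        dA = gefs(R1) - gefs(R2)
        dB = sefs(R1) - sefs(R2)
        return np.sqrt(np.mean(np.concatenate((dA, dB)) ** 2))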

In [26, 27] experiments are presented over color images of sizes $N \times N$ concerning two objects (an eraser and a pen) extracted from the View Sphere Database. Each object is put in the center of a semisphere on which a camera is placed in 91 different directions. The camera establishes an image (photograph) of the object for each direction, which can be identified by two angles $\theta$ and $\varphi$, as illustrated in Figure 3.

A sample image (with given angles $\theta$ and $\varphi$ for the eraser and for the pen) is to be compared with another image chosen among the remaining 90 directions. GEFS and SEFS are calculated in the three components of each image in the RGB space, for which it is natural to assume the extension (18) of the distance (17), obtained from the three measures (17) calculated in each band R, G, B. For image matching, the GEFS and SEFS components in each band are extracted from each image, thus forming a dataset with reduced storage memory. An image is compared with the images in the dataset using (18). If the dataset contains $s$ color images of sizes $N \times N$, the dimension of the original dataset is $3 \cdot s \cdot N^{2}$ pixels, whereas the dimension of the GEFS and SEFS dataset is $3 \cdot s \cdot 2N$ values, so we have a compression rate given by

$$\rho_0 = \frac{3 \cdot s \cdot 2N}{3 \cdot s \cdot N^{2}} = \frac{2}{N}. \quad (19)$$

So we obtain a stronger compression rate as $N$ increases.
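
For instance (a hypothetical value, only to fix ideas): for square images with $N = 256$, the rate (19) gives $\rho_0 = 2/256 \approx 0.0078$; that is, the GEFS and SEFS dataset takes less than 1% of the memory of the original dataset.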

4. The Image Matching Process via F-Transforms

We consider an image dataset formed by color images of sizes $N \times M$. In the preprocessing phase we compress each image of the dataset using the direct F-transform. Each image is divided into blocks of sizes $N_B \times M_B$, and each block is compressed into a block of sizes $n_B \times m_B$. Thus the images are coded with a compression rate $\rho = (n_B \cdot m_B)/(N_B \cdot M_B)$. In our experiments we set the sizes of the original and compressed blocks so that $\rho$ is comparable with the compression rate (19) obtained via GEFS and SEFS; for example, for square images of sizes $N \times N$ the block sizes are chosen so that $\rho$ is close to $2/N$.
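
A purely illustrative choice (not necessarily the one adopted in Section 5): blocks of sizes $N_B \times M_B = 16 \times 16$ coded to $n_B \times m_B = 2 \times 1$ components give $\rho = 2/256 \approx 0.0078$, which equals the rate $2/N$ of (19) for square images with $N = 256$.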

In the reduced dataset we store the F-transform components of each image. We use the PSNR between a sample image $E$ and an image $O$ of the dataset, defined for every compression rate $\rho$ (cf. (9)) as

$$\mathrm{PSNR}_{\rho}(E, O) = 20 \cdot \log_{10} \frac{L}{\mathrm{RMSE}_{\rho}(E, O)}, \quad (20)$$

where RMSE (Root Mean Square Error) is given by (cf. (10))

$$\mathrm{RMSE}_{\rho}(E, O) = \sqrt{\frac{\sum_{i=1}^{N} \sum_{j=1}^{M} \left(E(i,j) - O(i,j)\right)^{2}}{N \cdot M}}. \quad (21)$$

If we have color images, we define an overall PSNR, denoted by (22), obtained from the three similarity measures (20) calculated in each band R, G, B at compression rate $\rho$. In our experiments we compare the results obtained by using the F-transform (resp., GEFS and SEFS) based method with the PSNR (20) (resp., the distance (18)). We use the color image datasets of 256 gray levels and of sizes $N \times N$ pixels, available in the View Sphere Database; for each object considered, the best image of the object itself maximizes the PSNR (22). In other experiments we use our F-transform method over color video datasets in which each frame is formed by images of 256 gray levels and of sizes $N \times M$, available in the Ohio State University sample digital color video database. A color video is schematically formed by a sequence of frames. If we consider a frame in a video as the sample image, we verify that the image with the greatest PSNR with respect to the sample image is an image with frame number close to the frame number of the sample image.

5. Results of Tests

We compare the results obtained by using the GEFS and SEFS and F-transform based methods for image matching on all the image datasets, each of sizes $N \times N$, extracted from the View Sphere Database. In the first image dataset, concerning an eraser, we consider as sample image the image obtained from the camera in the direction with given angles $\theta$ and $\varphi$. For brevity, we consider a dataset of 40 test images, and we compare the sample image with the images taken from these 40 other directions. In Table 1 (resp., Table 2) we report the distances (17) and (18) (resp., PSNR (20) and (22) with a fixed compression rate $\rho$) obtained using the GEFS and SEFS (resp., F-transform) based method.

In Figure 4 we show the trend of the index PSNR obtained by the F-transform method with respect to the distance (18) obtained using the GEFS and SEFS method.

As we can see from Tables 1 and 2, both methods give the same answer: the image most similar to the eraser image taken in the sample direction (Figure 5) is the one taken in the direction shown in Figure 6. The trend in Figure 4 shows that the value of the distance (18) increases as the PSNR decreases.

In order to have a further confirmation of our approach, we have considered a second object, a pen, contained in the View Sphere Database, whose sample image is obtained from the camera in the direction with given angles $\theta$ and $\varphi$. We also limit the problem to a dataset of 40 test images, whose best distances (17) and (18) (resp., PSNR (20) and (22) with a fixed compression rate $\rho$) under the GEFS and SEFS (resp., F-transform) based method are reported in Table 3 (resp., Table 4).

In Figure 7 we show the trend of the index PSNR obtained by the F-transform method with respect to the distance (18) obtained by using the GEFS and SEFS method. As we can see from Tables 3 and 4, for both methods the image most similar to the sample image taken in the sample direction (Figure 8) is the one taken in the direction shown in Figure 9. Also in this example, the trend in Figure 7 shows that the value of the distance (18) increases as the PSNR decreases.

Now we present the results over a sequence of frames of a video, Mom-Daughter, available in the Ohio State University sample digital color video database. Each frame is a color image of sizes $N \times M$ with 256 gray levels for each band. We use our method with a fixed compression rate $\rho$; that is, in each band every frame is decomposed into 150 blocks, and each block of sizes $N_B \times M_B$ is compressed to a block of sizes $n_B \times m_B$. Since $N \neq M$, the GEFS and SEFS based method is not applicable. We set the sample image as the image corresponding to the first frame of the video. We expect that the image with the highest PSNR with respect to the sample image is an image whose frame number is close to the frame number of the sample image. In Table 5 we report the best results obtained using the F-transform based method in terms of (20) and (22). As expected, albeit with slight variations, the PSNRs diminish as the frame number increases, and the second frame (Figure 11) is the frame with the greatest PSNR with respect to the first frame (Figure 10), which contains the sample image.

In Figure 12 we show the trend of the PSNR (22) with respect to the frame number. This trend is obtained for all the sample video frames in the video dataset. For reasons of brevity, we report here only the results obtained for another test, performed on the sequence of frames of another video in the Ohio State University sample digital video database, the video sflowg. The PSNR in Figure 15 diminishes as the frame number increases, and the second frame (Figure 14) is the frame with the greatest PSNR with respect to the first frame (Figure 13), which contains the sample image.

To further support the validity of the F-transform method for all the sample frames, we measure, for the frame with the greatest PSNR with respect to the sample frame, the corresponding value PSNR0 obtained by using the original frame instead of the compressed frame decoded via the inverse F-transform. In Figure 16 we show the trend of the difference PSNR0 − PSNR with respect to PSNR0. The trend indicates that this difference is always less than 2. This result shows that if we compress the images in the dataset with rate $\rho$ by using the F-transform method, we can use the compressed image dataset for image matching processes, comparing the decompressed image with a sample image despite the loss of information due to the compression.

6. Conclusions

The results on the images of sizes $N \times N$ of the View Sphere Database show that, using our F-transform based method, we obtain the same results, in terms of image matching and of reduced memory storage, as those reached via the GEFS and SEFS based method, which is applicable only over images with $N = M$, whereas our method applies to images of any sizes.

Moreover, our tests executed on color video frames of sizes $N \times M$ ($N \neq M$, with 256 gray levels) of the Ohio State University color video dataset show that, by choosing the first frame as the sample image, we obtain as the image with the highest PSNR the one corresponding to the successive frame, as expected, despite the loss of information on the decoded images caused by the compression process.