Abstract

In recent years, researchers in automatic content-based image retrieval have devoted considerable effort to methods that retrieve the images most relevant to a query image. This paper presents a novel algorithm for increasing the precision of content-based image retrieval based on an electromagnetism optimization technique. Electromagnetism optimization is a nature-inspired technique that follows a collective attraction-repulsion mechanism by treating each image as an electrical charge. The algorithm is composed of two phases: fitness function measurement and electromagnetism optimization. It is implemented on a database of 8,000 images spread across 80 classes, with 100 images in each class. Eight thousand queries are run against the database, and the overall average precision is computed. Experimental results show that the proposed approach significantly improves retrieval precision.

1. Introduction

Images have rich content, and each image is described by its own features. Content-based image retrieval (CBIR) uses features such as color, texture, and shape to retrieve relevant images from a large image database. In the last ten years, many researchers working on image retrieval have concentrated on CBIR algorithms. The performance of a CBIR algorithm depends strongly on the availability of suitable features that represent the semantic aspects of the images. There are many ways to retrieve images in CBIR, as reported in [1].

A typical CBIR algorithm performs two major tasks. The first is the offline feature extraction task, in which a set of features is extracted to describe the visual content of each image in the database. The second is the online image retrieval task, in which the feature vector of a given query image is computed and compared with the feature vector of each image in the database, and the images that are similar to the query image are retrieved.

Two issues in CBIR that have attracted a great amount of research are the efficiency and the accuracy of the search and retrieval algorithms. The efficiency of a search process can be improved by using proper indexing techniques. The retrieval accuracy can be improved by increasing the number of selected features. However, both methods increase the complexity of the retrieval system. Nonetheless, an effective CBIR algorithm needs both an efficient search mechanism and an accurate set of visual features [2].

Unfortunately, current CBIR methods do not always retrieve satisfactory results for a query because of the gap between low-level features and high-level semantic concepts. Therefore, many algorithms have been proposed to improve retrieval accuracy by retrieving more images that are similar to the query image [2-5]. For instance, the relevance feedback mechanism has been used to enhance CBIR systems through continuous interaction with the user [4, 6-8]. In addition, several algorithms have been proposed to optimize retrieval parameters, including genetic algorithms [9-11] and particle swarm optimization [6, 12]. Most of these approaches, however, do not give sufficient retrieval accuracy, although they provide an efficient search mechanism with an acceptable computation time.

The purpose of this paper is to present a novel algorithm for increasing the precision in content-based image retrieval using electromagnetism-like (EMag) algorithm.

The electromagnetism-like algorithm is a population-based metaheuristic that was first introduced by Birbil and Fang [13] to solve continuous optimization models with bounded variables.

The EMag algorithm is a simple, direct search algorithm inspired by the electromagnetism phenomenon. It is a flexible and effective nature-based approach for single-objective optimization problems [14]. Two of its properties are as follows.
(i) It does not use crossover or mutation operators to explore feasible regions; instead, it implements a collective attraction-repulsion mechanism, which reduces the computational cost with respect to memory allocation and execution time.
(ii) No gradient information is required, and it employs a decimal representation, in clear contrast to the genetic algorithm.

The EMag algorithm originates from electromagnetism theory in physics, which it simulates by treating each particle as an electrical charge. The charge of each particle is determined by a fitness function, and each particle moves according to the degree of attraction or repulsion among the particles. The attraction-repulsion mechanism of EMag corresponds to reproduction, crossover, and mutation in the genetic algorithm [15].

EMag has been successfully applied to different sorts of engineering problems, such as resource-constrained project scheduling [16], vehicle routing [17], array pattern optimization [15], image processing [14], neural network training [18], and control systems [19]. However, to our knowledge, EMag has not yet been applied to image retrieval.

In this paper, we apply EMag to the content-based image retrieval process and examine the degree of precision improvement obtained when electromagnetism optimization is used in a CBIR system. To establish this, we first retrieve images using a traditional image retrieval method and then retrieve images using the electromagnetism optimization method.

The rest of the paper is organized as follows. Section 2 presents the proposed algorithm for content-based image retrieval. Section 3 presents the image feature extraction process. Sections 4 and 5 provide the experimental results and comparisons with other methods. The conclusion and future research directions are presented in Section 6.

2. The Proposed Algorithm

Traditionally, the semantic model used to describe a CBIR algorithm comprises two parts, namely, the image database and the query image. A set of features is extracted to describe the content of each image in the database. Subsequently, the feature values extracted from the query image are compared with the feature values of each image already stored in the database. Similarity to the query image is measured by searching the database for the most similar image features, and the retrieved images are ranked by their Euclidean distances to the query.

Based on this concept, we design an image retrieval system that uses the EMag algorithm, as shown in Figure 1. Our system operates in two phases: fitness function measurement and the EMag optimization algorithm.
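The distance-based ranking described above can be illustrated with a minimal sketch. The function names, the dictionary layout of the database, and the toy two-dimensional feature vectors are assumptions made for this example; they are not taken from the system evaluated in this paper.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query_features, database, top_k=20):
    """Return (image_id, distance) pairs for the top_k closest images.

    `database` is a hypothetical mapping from an image id to the feature
    vector that was extracted offline for that image.
    """
    scored = [
        (image_id, euclidean_distance(query_features, features))
        for image_id, features in database.items()
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


# Toy database with made-up feature vectors.
toy_db = {"img_a": [0.0, 0.0], "img_b": [3.0, 4.0], "img_c": [1.0, 1.0]}
ranking = rank_by_distance([0.0, 0.0], toy_db, top_k=2)
```

Here `ranking` lists the two database images closest to the query, nearest first. In a real system the feature vectors would be computed offline and only the query vector at retrieval time.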

(1) The fitness function is important for successful application of the EMag optimization algorithm. The image associated with the smallest fitness value is more relevant to the query image given by the user. In phase one, the fitness value of each image with respect to the query image is calculated. Each image measures its effectiveness by the fitness function . In this work, the fitness function is defined as follows: where , are the feature values extracted from the query image and each image stored in a database, respectively; is set to a small positive value (0.01) to avoid division by zero; is the number of image classes in a database, and is the precision value, which is defined as

(2) The optimization procedure (EMag) is used in phase two to determine relevant images by minimizing the fitness function. As a result of this phase, images are partitioned into two groups. The first group contains the relevant images, and the second group contains the irrelevant ones. The steps of the EMag algorithm for the optimization of image retrieval are described as follows.

(i) Local Search of EMag Algorithm. Each retrieved image is regarded as a charged image, and all images are assumed to be uniformly distributed between the upper and lower bounds. The algorithm seeks to minimize the fitness function because a smaller value implies higher accuracy in content-based image retrieval. The procedure ends when all the database images have been evaluated, and the image with the smallest fitness value is chosen as a relevant image for the query image.

(ii) Calculation of Charge. According to electromagnetism theory, “the force exerted on a point via other points is inversely proportional to the distance between the points and directly proportional to the product of their charges” [22]. The particle (image) therefore moves following the resultant Coulomb force, which is produced among particles as a charge-like value. In the EMag implementation, the charge of each image is determined by its fitness value and obtained as follows [23]: where and are the values of fitness function for image and image , respectively; is the image which has the smallest fitness value. The overall resultant force between all images determines the actual effect of the optimization process. The final force vector for each image is evaluated under Coulomb’s law as follows [23]: In this formula, represents attraction and corresponds to repulsion. Note that an image does not exert a force on itself.

(iii) Change the Position of Images in the Rank. Images are partitioned into two groups according to the corresponding force vector. Each image changes its position according to the resultant force, which can be given [15] as follows: where is the image position after retrieval and is the image at position ; is the random step length assumed to be uniformly distributed between 0 and 1. The and represent the upper and lower bounds, respectively. The image moves toward the upper bound by a random step length when the resultant force is positive, or moves toward the lower bound when the resultant force is negative. The optimum image ( ) does not move, because it is an image with absolute attraction which attracts all other images in the image database. Throughout this phase, two groups of images are created: relevant and irrelevant images. The images that moved toward are the closest to the query image and therefore represent the relevant images, whereas the images that moved toward the represent the irrelevant images.

Finally, the system ranks and retrieves the relevant images. The advantage of EMag is that it brings very similar images together, and these images are likely to be relevant for similar queries.
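To make the three steps concrete, the following minimal sketch implements the charge, force, and movement rules in the commonly published form of the electromagnetism-like mechanism of Birbil and Fang. It is an illustration under stated assumptions, not the exact formulation used in this paper: positions are scalars (a one-dimensional rank axis), the fitness values are made up, and all function names are hypothetical.

```python
import math
import random


def charges(fitness, dim=1):
    """Charge of each particle; the smallest fitness gets the largest charge."""
    best = min(fitness)
    total = sum(f - best for f in fitness)
    if total == 0:  # every particle is equally fit
        return [1.0] * len(fitness)
    return [math.exp(-dim * (f - best) / total) for f in fitness]


def forces(positions, fitness, q):
    """Resultant 1-D force on each particle (no particle acts on itself)."""
    result = []
    for i, (xi, fi) in enumerate(zip(positions, fitness)):
        total = 0.0
        for j, (xj, fj) in enumerate(zip(positions, fitness)):
            if i == j or xi == xj:
                continue
            pull = (xj - xi) * q[i] * q[j] / (xj - xi) ** 2
            # A fitter particle attracts; a worse or equal one repels.
            total += pull if fj < fi else -pull
        result.append(total)
    return result


def move(positions, fitness, force, lower, upper, rng):
    """Move each particle a random step along its force; the best one stays."""
    best = min(range(len(fitness)), key=fitness.__getitem__)
    moved = []
    for i, (x, f) in enumerate(zip(positions, force)):
        if i == best or f == 0:
            moved.append(x)
            continue
        step = rng.random()  # uniform in [0, 1)
        room = (upper - x) if f > 0 else (x - lower)
        moved.append(x + step * math.copysign(1.0, f) * room)
    return moved


# Made-up example: three images on a rank axis bounded by 0 and 1.
positions = [0.2, 0.5, 0.9]
fitness = [0.1, 0.4, 0.8]
q = charges(fitness)
new_positions = move(positions, fitness, forces(positions, fitness, q),
                     lower=0.0, upper=1.0, rng=random.Random(0))
```

In this toy run the image with the smallest fitness keeps its position, while the other two are pulled in its direction and remain inside the bounds. Which bound corresponds to the relevant group depends on how the positions are initialized, so the sketch does not fix that mapping.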

Algorithmic complexity is concerned with how fast or slow a particular algorithm performs, and the goal of complexity analysis is to classify the proposed algorithm according to its performance. We use big-O notation, a mathematical construct for describing algorithmic complexity, to express the runtime complexity of an algorithm. One common complexity class is O(log n), known as logarithmic time, which means that the running time of the algorithm depends on the logarithm of the input size n. This generally means that the algorithm deals with a data set that is iteratively partitioned. The size of the input therefore determines how the running time grows. Since O(log n) grows more slowly than O(n), an algorithm of complexity O(log n) is preferable because it is faster.

In this work, the retrieval time is the sum of the features extraction time () of the query image, the similarity time () between the query image and every image in the database, and the sorting time () of the retrieved images to rank all images in the database according to their similarity to the query. Therefore, the total retrieval time is where is the number of images in the database. If we ignore the features extraction time of the query image (which depend on the particular hardware the program is run on), we could say that grows at the order of . By using the big-O notation, we could write The same approach is applied for the retrieval time of the optimized images (): grows at the order of L as well; therefore, Here, is the number of images in the bounds nearest to the query image. Since , therefore , which indicates that for EMag algorithm the search time does not increase linearly with the database size.
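The argument above can be written out as a short derivation. The symbols below are assumptions introduced for readability, since the original symbols are not legible in this copy: T_f is the feature extraction time of the query, T_s the time of one similarity computation, T_r the ranking time, N the number of images in the database, and L the number of images in the bounds nearest to the query.

```latex
\begin{align}
  T(N) &= T_f + N\,T_s + T_r(N), \\
  T(N) &= O(N)
      && \text{(ignoring } T_f \text{ and taking the ranking cost as at most linear)}, \\
  T_{\mathrm{opt}}(L) &= O(L), \qquad L \le N .
\end{align}
```

Under these assumptions the optimized retrieval time is governed by L rather than by N. Note that if the ranking step uses a comparison sort, its cost is O(N log N), in which case the linear estimate applies to the similarity stage only.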

3. Image Features

The choice of appropriate image features is one of the key issues in querying image databases by similarity. Feature extraction that captures the visual content of images for indexing and retrieval can use global or local features. Global features are calculated over the entire image. The advantage of global extraction is its high speed, both in extracting features and in computing similarity. However, global features cannot handle parts of the image that have different characteristics. Therefore, local feature extraction is also necessary [24].

A full description of all image features and their comparative effectiveness for image retrieval is beyond the scope of this paper. Instead, our aim is to evaluate only the precision improvement of a CBIR system when the electromagnetism optimization algorithm is applied in its retrieval mechanism. Nevertheless, because the effectiveness of the initial retrieval by the proposed method is related to the nature and quality of the features used to represent image content, the following features are considered in this work:
(i) color features (histogram);
(ii) texture features (gray level cooccurrence matrix).

It is noteworthy that, in order to extract sufficient information from each image, all images are resized into pixels size and then divided into 16 nonoverlapping blocks of pixels size. All statistical measures are computed for the multichannel image matrices (Red, Green, and Blue), and their average is determined.
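The block partition can be sketched as follows. The 4 x 4 grid layout is an assumption (the text states only that there are 16 nonoverlapping blocks), the 8 x 8 toy image stands in for a resized image, and the function names are hypothetical.

```python
def split_into_blocks(image, grid=4):
    """Split a 2-D list of pixel rows into grid * grid nonoverlapping blocks.

    Blocks are returned in row-major order. The image height and width
    must both be divisible by `grid`.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image size must be divisible by the grid size")
    block_h, block_w = height // grid, width // grid
    return [
        [row[c * block_w:(c + 1) * block_w]
         for row in image[r * block_h:(r + 1) * block_h]]
        for r in range(grid)
        for c in range(grid)
    ]


def channel_average(per_channel_values):
    """Average one statistic over the Red, Green, and Blue channels."""
    return sum(per_channel_values) / len(per_channel_values)


# Toy single-channel image whose pixel value encodes its position.
toy_image = [[r * 8 + c for c in range(8)] for r in range(8)]
blocks = split_into_blocks(toy_image)
```

Each block is then passed to the feature functions separately, and a statistic computed on the three color channels of a block is combined with `channel_average`.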

(1) Color Features. A color feature is a numeric quantity that describes the color of an image, and color is one of the easiest visual features to use in image retrieval. The color histogram of an image is the probability mass function of the image intensities. It gives the distribution of the number of pixels over the intensities of the three color channels (RGB).

The color features in this work are the standard deviation and the entropy of the color histogram for each color band of each block in each image. Standard deviation describes the spread of the data. The following equation defines standard deviation for an image block of size [25]: where
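As an illustration of the two color statistics, the sketch below computes the standard deviation of the intensities in one block and the Shannon entropy of its normalized histogram for a single color band. This is one plausible reading of the description above, not a reproduction of the paper's equations; the population form of the standard deviation, the base-2 logarithm, and the function names are assumptions.

```python
import math


def block_std(block):
    """Population standard deviation of the intensities in one block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))


def histogram_entropy(block, levels=256):
    """Shannon entropy, in bits, of the normalized intensity histogram."""
    counts = [0] * levels
    total = 0
    for row in block:
        for p in row:
            counts[p] += 1
            total += 1
    # Empty histogram bins contribute nothing to the entropy.
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


# Toy block for one color band: half black, half white.
toy_block = [[0, 0], [255, 255]]
std_value = block_std(toy_block)
entropy_value = histogram_entropy(toy_block)
```

For a block with a single intensity both statistics are zero, so larger values indicate a wider spread of intensities within the block.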

(2) Texture Features. Texture is an important property of images. One of the most traditional ways to analyze the spatial distribution of the gray levels of an image is the gray level cooccurrence matrix (GLCM). In this work, the cooccurrence texture features are extracted for each block of the input image. The parameters used most frequently in the literature are entropy, homogeneity, contrast, and energy [26].
(i) Entropy measures the amount of information: the larger the entropy, the greater the amount of information carried by the image. It is inversely correlated with energy. The entropy feature of the GLCM is one of the features with the best discriminatory power; it is implemented using Matlab and is given in the following equation: where is the number of occurrences of gray levels, within the given image window.
(ii) Homogeneity measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal, where
(iii) Contrast measures the intensity contrast between a pixel and its neighbor over the whole image, where
(iv) Energy measures the sum of squared elements in the GLCM, where

Since different features produce values in different ranges, a normalization method is applied to every computed feature.
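The four texture measures can be sketched with their widely used textbook definitions on a normalized cooccurrence matrix. The exact forms in this paper may differ; the horizontal one-pixel offset, the base-2 logarithm in the entropy, the non-symmetric matrix, and the function names are assumptions made for this example.

```python
import math


def glcm(block, levels, dx=1, dy=0):
    """Normalized cooccurrence matrix for the pixel offset (dx, dy).

    The block must contain at least one valid pixel pair for the offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(block), len(block[0])
    total = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[block[y][x]][block[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Entropy, homogeneity, contrast, and energy of a normalized GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
    }


# Toy 4 x 4 block quantized to four gray levels.
toy_block = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
texture = glcm_features(glcm(toy_block, levels=4))
```

A block with one uniform gray level gives zero entropy and contrast and unit homogeneity and energy, which matches the inverse relation between entropy and energy noted above.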
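The text does not name the normalization method. As a minimal sketch, the following assumes min-max scaling of each feature to the range [0, 1] across the database; both the choice of method and the function name are assumptions.

```python
def min_max_normalize(column):
    """Rescale the values of one feature across the database to [0, 1]."""
    low, high = min(column), max(column)
    if high == low:  # a constant feature cannot be rescaled
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


# Made-up values of a single feature for three images.
normalized = min_max_normalize([2.0, 4.0, 6.0])
```

Scaling each feature separately keeps a feature with a large numeric range, such as contrast, from dominating the distance computation.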

4. Experimental Results

To evaluate the effectiveness of the proposed approach, experiments were carried out on a subset of images from the Corel Photo Gallery. This subset contains about 20,000 images of very diverse subject matter, each of which was manually labeled. The Corel image database is a benchmark widely used to evaluate the performance of CBIR algorithms. The original Corel database includes many classes, each of which contains 100 or more images. For this work, some of the categories were reorganized into 80 classes with 100 images in each class. We therefore used 8,000 query images (80 classes with 100 images in each class), and average precision results are reported. Samples are shown in Figure 2.

For all the experiments, we set the number of retrieved images to 20. For traditional CBIR, we used the Euclidean distance to compute the similarity between images, while for EMag CBIR, we used the fitness function to assign a fitness value to each image. Search by similarity rank returns the requested number of images in increasing order of distance from the query image. It is assumed that relevant images are returned before nonrelevant images. Therefore, a good effectiveness measure should capture the concept of similarity, and one of the most commonly used effectiveness measures is Precision-Recall.

The performance tests for the proposed algorithm were implemented using Matlab 2010a on an Intel(R) Core i7 at 2.2 GHz with 4 GB of DDR3 memory, running 64-bit Windows 7. Each image in the database is used as a query to retrieve similar images.
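The Precision-Recall measure can be sketched with its standard definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of all relevant images that are retrieved. The function name, the class labels, and the toy ranking below are assumptions for illustration; images from the same class as the query are counted as relevant.

```python
def precision_recall_at_k(ranked_labels, query_label, k, class_size):
    """Precision and recall over the first k retrieved images.

    `ranked_labels` holds the class label of each retrieved image, best
    match first; `class_size` is the number of relevant images in the
    database for this query.
    """
    if k <= 0 or class_size <= 0:
        raise ValueError("k and class_size must be positive")
    top = ranked_labels[:k]
    hits = sum(1 for label in top if label == query_label)
    return hits / len(top), hits / class_size


# Made-up ranking: 15 of the first 20 results share the query's class.
toy_ranking = ["bus"] * 15 + ["horse"] * 5
precision, recall = precision_recall_at_k(toy_ranking, "bus", k=20, class_size=100)
```

With 20 retrieved images and 100 images per class, recall cannot exceed 0.2, which is why the comparison in this section concentrates on precision.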

Figure 3 shows the retrieval results of a traditional CBIR for a given query image. The first 20 retrieved images are shown in Figure 3(a). Images from the same class are considered similar images. The results indicate that several irrelevant images are also retrieved. The graph in Figure 3(b) shows the average precision versus the number of retrieved images, while Figure 3(c) shows the average precision versus the average recall for the given query image. The small number of relevant images can be explained by the algorithm's reliance on low-level features alone.

Figure 4 shows the results of EMag CBIR for the given query image. The first 20 relevant images are shown in Figure 4(a). The graph in Figure 4(b) shows the average precision versus the number of retrieved images. This result shows that the use of EMag can greatly increase the average precision of a content-based search. Further, the graph in Figure 4(c) shows that the proposed EMag algorithm is effective in providing higher retrieval precision, especially at low recall levels.

The distinctive aspect of this work is the use of an optimization technique to improve the precision of the retrieved images. Our experimental findings clearly indicate that optimization helps to return a large number of images relevant to the query image. Figure 5 further shows that images relevant to the query are ranked higher in the results of the proposed method than in those of the traditional CBIR method. It is also noteworthy that our algorithm improves the overall average precision from 38.64% to 83.57%, which is considerably high for an image retrieval algorithm.

A further experiment is performed with a more difficult, noisy query image. The experiment simulates a noisy query image by convolving a Gaussian filter PSF (point-spread function) with the query image using the Matlab function imfilter and then adding Gaussian noise of variance to the blurred query image using the Matlab function imnoise, as shown in Figure 6. The findings show that adding noise and blur to the query image reduces the overall average precision from 83.57% to 64.10%, as shown in Figure 7. However, 64.10% is still considerably high for an image retrieval algorithm in this noisy situation, which suggests that the proposed algorithm is robust to noise and blurring.

To further demonstrate the robustness of the proposed algorithm to noise and blurring, Figure 7 compares the average precision values obtained by the traditional CBIR, the EMag with the blurred noisy query image, and the proposed EMag. The comparison shows that the proposed EMag algorithm performed best, while the EMag with the blurred noisy query image still performed better than the traditional CBIR.
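The degradation procedure can be imitated outside Matlab with the minimal sketch below, which blurs a grayscale image with a Gaussian kernel using zero padding and then adds zero-mean Gaussian noise on a [0, 1] intensity scale. The kernel size, the sigma, the noise variance, and the function names are hypothetical, since the values used in the experiment are not stated here; the code is not a reimplementation of imfilter or imnoise.

```python
import math
import random


def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian point-spread function (size is odd)."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def blur(image, kernel):
    """Filter the image with the kernel, treating outside pixels as zero."""
    h, w, half = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                for kx, kv in enumerate(krow):
                    sy, sx = y + ky - half, x + kx - half
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += kv * image[sy][sx]
            out[y][x] = acc
    return out


def add_gaussian_noise(image, variance, rng):
    """Add zero-mean Gaussian noise and clip to the [0, 1] intensity range."""
    sd = math.sqrt(variance)
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sd))) for p in row]
            for row in image]


# Toy 5 x 5 image with a single bright pixel in the center.
toy = [[0.0] * 5 for _ in range(5)]
toy[2][2] = 1.0
kernel = gaussian_kernel(size=3, sigma=1.0)    # hypothetical PSF settings
blurred = blur(toy, kernel)
noisy = add_gaussian_noise(blurred, variance=0.01, rng=random.Random(0))
```

Blurring spreads the bright pixel over its neighbors, and the noise then perturbs every pixel, so the features extracted from `noisy` differ from those of the clean query.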

5. Comparison with Other Methods

In order to show the superiority of our approach, we compare it with those in [6, 10, 20, 21]. Table 1 gives an overall view of the performance of the different methods, which used the same database but different features.

Note that the researchers in [20] used a combination of the color layout descriptor and the Gabor filter for content-based image retrieval. In [21], color, texture, and shape features are used as visual features. The average precision values for these works are 58.29% and 59.5%, respectively. These approaches rely on low-level features to search for visually similar images. Table 1 also compares our algorithm with retrieval algorithms that use particle swarm optimization with relevance feedback [6] and a CBIR method based on an interactive genetic algorithm [10]. Both approaches used optimization with relevance feedback to enhance the precision of the retrieved images. When relevance feedback is used, the error rate may be higher because the user's choices may change while selecting the relevant images, which in turn reduces the effectiveness of the algorithm.

6. Conclusion

In this paper, we have presented a novel algorithm for increasing the precision of content-based image retrieval based on electromagnetism optimization. The proposed algorithm takes advantage of the electromagnetism theory of physics, which it simulates by treating each image as an electrical charge. The charge of each image is determined by a fitness function, and each image moves according to the degree of attraction or repulsion among the images. Our experimental results show that the proposed EMag algorithm significantly improves the average retrieval precision compared with traditional CBIR techniques. The EMag algorithm is also shown to be robust to noisy and blurred query images. Future work will implement EMag for image feature reduction to further speed up the image retrieval process. The use of learning principles in image search engines should also be considered in order to increase the number of relevant images returned.

Acknowledgments

The authors would like to thank the reviewers for their comments. This research has been funded by the University of Malaya, under the Grant no. RG066-11ICT.