Abstract

We present an overview of methods and applications of automatic characterization of the appearance of materials through colour and texture analysis. We propose a taxonomy based on three classes of methods (spectral, spatial, and hybrid) and discuss their general advantages and disadvantages. For each class we present a set of methods that are computationally cheap and easy to implement and that have proved reliable in many applications. We put these methods in the context of typical industrial environments and provide examples of their application in the following tasks: surface grading, surface inspection, and content-based image retrieval. We emphasize the potential benefits that would come from a wide implementation of these methods, such as better product quality, new services, and higher customer satisfaction.

1. Introduction

Computer vision has been a topic of intense research activity for decades. The wide availability of imaging devices, as well as the continuously increasing computing power, has contributed to the development of successful applications in many areas: remote sensing, computer-aided diagnosis, automatic surveillance, crowd monitoring, and food control are just some examples [1]. Manufacturing is also an important sector that benefits from computer vision methods: robotics [2], process automation [3], and quality control [4] are typical applications in this context. Most of these applications share the same underlying problem: the automatic characterization of the visual appearance of the materials that one has to deal with in the various domains.

This is particularly true for products with high aesthetic value [5, 6], such as natural stone [7], ceramic [8], parquet [9], and fabric [10, 11], to cite some. In such cases it is the visual appearance itself that largely determines the quality, and therefore the price, of the material. The measurement of visual appearance can be considered a part of “soft metrology,” the aim of which is the objective quantification of the properties of materials that are determined by sensorial human response. In an increasingly competitive worldwide market, reliable and effective assessment of this feature is key to ensuring high quality standards and the success of a company.

Traditionally, the analysis of the visual appearance has been accomplished by skilled operators. This is a lengthy, tedious, scarcely reproducible, and largely subjective approach [12]. To overcome these issues many companies are abandoning manual procedures and moving towards automated computer vision systems, which in principle provide higher quality standards, better reproducibility, and more reliable product records. These systems employ image processing techniques to encode the visual appearance of a particular product through the analysis of two main visual features, namely, colour and texture, even though other characteristics such as contrast and gloss can be considered as well [5].

Intuitively, colour and texture provide most of the information that we need to infer the material an object is composed of, as well as to distinguish one material from another. Neuroimaging research has in fact confirmed this common belief [13]. Colour is the result of the interaction of three elements [14]: an intrinsic characteristic of the material (reflectance spectrum), an accidental condition (illumination), and the response of the sensor (human eye or camera). Texture, a more elusive concept, has been defined in many ways, though none of the definitions given so far is entirely satisfactory. Herein we will mention the definition given by Davies [15], according to whom texture is “the property of a surface that gives rise to the local variability in appearance,” where such variability may be either an intrinsic characteristic of the material or the consequence of surface roughness. The concept of stationary texture, which means that the statistical characteristics of the texture are the same everywhere [16], is also useful in this context. In the remainder of the paper we implicitly assume that when we use the word texture we refer to a stationary texture.

We present in this paper an overview of colour and texture analysis methods along with their applications in industry. The remainder of the manuscript is organized as follows. Section 2 gives an up-to-date overview of the methods for automatic characterization of the visual appearance and proposes a taxonomy to group them in a meaningful way. Particular attention is given to robustness against noise factors, such as variations in illumination and viewpoint. Section 3 deals with the main applications in industry, with particular emphasis on grading, surface inspection, and content-based image retrieval (CBIR). Concluding remarks are reported in Section 4, along with a discussion on future research directions and potentially innovative applications.

2. Methods

Originally, approaches to automatic characterization of the visual appearance of materials considered colour and texture separately. In the last decade, however, an increasing interest has been devoted to integrating these two sources of data into a unifying model. Strong evidence has in fact supported this idea and confirmed that incorporating colour into the texture model has beneficial effects in many applications [17, 18]. The classification scheme that we propose in this paper reflects this idea, namely, that the methods can be based on one of the two features separately (either colour or texture) or consider the two of them jointly.

Based on these considerations it makes sense to define the following three main categories: spectral, spatial, and hybrid methods [19]. The first group is based on colour only: the relative spatial distribution of the pixel values is not taken into account. The second group comprises texture-based methods: they are called spatial methods since they take into account the relative variation of pixel intensity (grayscale values) in the spatial domain but discard colour altogether (images are first converted into grayscale). Finally, those methods that consider colour and texture jointly are referred to as hybrid methods. Before describing the three groups in detail, we will first discuss three points that apply to each method in general; these are dimensionality, robustness to noise factors, and parameter tuning.

Dimensionality refers to the number of parameters a method relies upon. Whichever approach we use to characterize the visual appearance of a material, this will be represented through a finite set of quantitative parameters, which are usually referred to as the feature vector. The appearance of a material can therefore be regarded as a point in an $n$-dimensional space, where $n$ is the number of parameters. Ideally, a good method should provide high discrimination accuracy with as few parameters as possible. Computation, in fact, becomes costly as dimensionality increases, representing a serious limitation for applications that require real-time processing. Furthermore, long feature vectors are likely to degrade the performance of a classifier, as a consequence of the well-known phenomenon of “curse of dimensionality” [20].

Noise factors represent uncontrollable sources of uncertainty that commonly occur in nonideal conditions. In real working environments it can be quite complicated to maintain steady and controlled imaging conditions. Most commonly one has to deal with variable conditions, such as changes in rotation, scale, illumination, and so forth. It can be quite difficult to compensate for such sources of noise. Some of them are likely to degrade the performance of a method so strongly as to make it virtually useless. To illustrate this concept Figure 1 shows how the appearance of a material can be affected when exposed to different sources of noise.

The effects of changes in illumination are reported in Figures 1(a) and 1(b). Changes in translation, scale, rotation, and/or viewpoint are shown in Figures 1(c), 1(d), 1(e), and 1(f). As we discuss in the following subsections, each class of methods is sensitive to one or more sources of noise. It is therefore the designer’s responsibility to select the method which is most appropriate to the specific application domain, considering the potential noise factors that can arise in their specific context.

Finally, it is important to mention that virtually any spectral, spatial, or hybrid image descriptor depends on one or more parameters (see Table 1 for a roundup of the methods presented here and their parameters). In any specific application, the performance of a method may vary greatly depending on the setup of such parameters. When it comes to implementing industrial systems, it is advisable to stick to the values suggested in the literature as a first step; then, to further improve the results, it is possible to optimise the parameters’ values in a supervised fashion, adapting them to the material, to the kind of image, and to the experience of the user.

In the following subsections we analyse spectral, spatial, and hybrid approaches to colour texture analysis and discuss their pros and cons. For each class we present a set of representative methods and their use in three different industrial applications: grading, surface inspection, and CBIR. In Table 2 we summarise the robustness of each method to different noise factors (i.e., changes in illumination intensity, viewpoint, rotation, and scale) and report a qualitative appraisal of the computational time. The methods included in the experiments have been selected on the basis of classification accuracy, computational efficiency, and ease of implementation.

2.1. Spectral Methods

Spectral methods consider the colour content of the image regardless of its spatial distribution. This is usually represented through RGB triplets, since this is the format used by most imaging devices. Hyperspectral imaging is also gaining importance in practical applications: for a review of methods the interested reader is referred to the work of Chang [21]. Herein we limit our discussion to the traditional and well-established domain of trichromatic images. It is worth recalling, at this point, some basic facts about colour spaces. We know that they can be device-independent (colorimetric) or device-dependent. The first group includes any colour space that can be converted into XYZ without additional information. In principle device-independent data are absolute values that do not depend on the imaging system. They also carry information about the illuminant under which the measurement is taken (e.g., D65, D50, etc.). In contrast, device-dependent colour spaces encode device-specific data. Consequently colour data acquired with two different devices are unrelated, and no information is conveyed about the illuminant.

The problem of choosing the best colour space for visual similarity has been debated at length in the literature [18, 22, 23]. As for robustness against changes in illumination, it is clear that, in principle, device-independent data are the best option. The problem, however, is that standard imaging devices do not provide this kind of data, which therefore need to be estimated through calibration. Regarding accuracy, it has been suggested that perceptually uniform spaces (e.g., CIE Lab) should provide the best results, for distance in these spaces correlates with dissimilarity in the human vision system. Experimental results, however, have not confirmed this hypothesis; on the contrary, they are rather inconclusive and fairly contradictory [24, 25].

We subdivide spectral methods into two subgroups: methods based on colour statistics and methods based on colour histograms. The methods of the first group consider various combinations of global statistical parameters (such as mean value, standard deviation, median, etc.) which are computed directly from the colour data. These are also referred to as soft colour descriptors [26]. Within this group it has been shown that parameters as simple as the average values of each $R$, $G$, and $B$ channel can discriminate quite well among different materials. Kukkonen et al. [8] reported a successful application of this method in surface grading of ceramic tiles. In the same domain of application, López et al. [26] proved the effectiveness of various combinations of soft colour descriptors such as mean, standard deviation, and moments. Niskanen et al. proposed the use of colour centiles (values that divide the cumulative probability distribution of each colour channel into the percentage required) for defect detection in parquet slabs [27].
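As an illustration of how little computation soft colour descriptors require, the following minimal sketch computes per-channel means, standard deviations, and centiles from a list of RGB pixels. The function names, the default percentages, and the nearest-rank rule used for the centiles are assumptions made for this example; they are not taken from the cited works.

```python
from statistics import mean, pstdev


def soft_colour_descriptors(pixels):
    """Return [mean_R, mean_G, mean_B, std_R, std_G, std_B].

    `pixels` is a flat list of (R, G, B) tuples; spatial position is ignored,
    which is what makes this a purely spectral descriptor.
    """
    channels = list(zip(*pixels))  # three tuples: all R, all G, all B values
    return [mean(c) for c in channels] + [pstdev(c) for c in channels]


def colour_centiles(pixels, percents=(25, 50, 75)):
    """Per-channel centiles using the nearest-rank rule (an assumed convention)."""
    features = []
    for channel in zip(*pixels):
        ordered = sorted(channel)
        n = len(ordered)
        for p in percents:
            rank = max(1, (p * n + 99) // 100)  # integer ceil(p * n / 100)
            features.append(ordered[rank - 1])
    return features


if __name__ == "__main__":
    sample = [(0, 0, 0), (10, 20, 30), (20, 40, 60), (30, 60, 90)]
    print(soft_colour_descriptors(sample))
    print(colour_centiles(sample))
```

The resulting feature vectors are short (six and nine values here), which keeps dimensionality low.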

The second group is based on the concept of colour histogram. This is an approximate estimate of the probability of occurrence of each colour in an image. The number of components of the colour histogram is determined through a suitable quantization of the colour space. The progenitor of this group is the 3D colour histogram [28]. This is obtained by dividing the colour space into cells of equal volume. This method is still valid today but presents the disadvantage of a high dimensionality. Modified versions have been proposed with the aim of avoiding this problem. Among these it is worth mentioning Paschos’ chromaticity moments [22], which discard the intensity channel and consider the two-dimensional histogram arising from the two chromatic channels only. The colour images are first converted into the CIE $xyY$ space (a process which requires colour calibration) and then the $Y$ (intensity) channel is discarded. The resulting two-dimensional distribution over the $xy$ plane is characterized through a set of moments.
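The quantization step behind the 3D colour histogram can be sketched as follows. This is a minimal example under stated assumptions: 8-bit RGB input, the same number of bins on each axis, and a hypothetical function name; the cited works may use different quantization schemes.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4, levels=256):
    """Normalised 3D colour histogram of a flat list of (R, G, B) tuples.

    The RGB cube is divided into bins_per_channel**3 cells of equal volume;
    the result is a flat list of cell probabilities.
    """
    step = levels / bins_per_channel
    counts = Counter()
    for r, g, b in pixels:
        counts[(int(r // step), int(g // step), int(b // step))] += 1
    n = len(pixels)
    k = bins_per_channel
    # Flatten in (R, G, B) order so that cell (i, j, l) maps to i*k*k + j*k + l.
    return [counts[(i, j, l)] / n
            for i in range(k) for j in range(k) for l in range(k)]


if __name__ == "__main__":
    sample = [(0, 0, 0), (255, 255, 255), (10, 10, 10), (200, 0, 0)]
    hist = colour_histogram(sample)
    print(len(hist), hist[0], hist[48], hist[63])
```

With four bins per channel the feature vector already has 64 components, and the count grows with the cube of the number of bins, which illustrates the dimensionality issue mentioned above.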

We conclude this subsection with some remarks about spectral methods as a whole. It is important to point out that, due to the fact that this family of techniques considers the colour content of the image regardless of the spatial distribution, any spatial information is lost. A positive outcome of that is a fairly good robustness against translation, rotation and changes of the viewing angle (see Table 2). A negative one is that dissimilar images may have similar colour distributions, and therefore these methods can fail in some cases. To illustrate this concept, we captured an image of a granite slab (Figure 2(a)) and rearranged the spatial distribution of the pixels (Figure 2(b)). Both images have exactly the same colour content, but their appearance is quite different.
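The pixel-rearrangement argument can be reproduced with a few lines of code. The sketch below, with hypothetical function names and a fixed random seed chosen only for repeatability, permutes the pixel positions of a small image and checks that the colour content is unchanged.

```python
import random
from collections import Counter


def shuffle_pixels(image, seed=0):
    """Return a copy of a 2D image (list of rows) with pixel positions permuted."""
    height, width = len(image), len(image[0])
    flat = [pixel for row in image for pixel in row]
    random.Random(seed).shuffle(flat)
    return [flat[r * width:(r + 1) * width] for r in range(height)]


def colour_content(image):
    """Multiset of colours in the image, regardless of where they occur."""
    return Counter(pixel for row in image for pixel in row)


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    image = [[red] * 4, [red] * 4, [blue] * 4, [blue] * 4]
    shuffled = shuffle_pixels(image)
    # Any purely spectral descriptor is identical for the two images.
    print(colour_content(image) == colour_content(shuffled))
```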

In summary, spectral methods are highly recommended in the presence of steady illumination conditions. In such situations they can compensate very well for changes in viewpoint, rotation, and scale. By contrast, they should be avoided when illumination conditions are variable or unknown. For quantitative performance analyses on spectral methods applied to wood and ceramics the interested reader may find useful data in [8, 9, 24, 26].

2.2. Spatial Methods

Spatial methods are based on pure texture. They capture the visual appearance of a material through features extracted from grayscale images, therefore discarding colour information. Research in this area has been intense for many years. Various attempts to categorize the methods of this group have been proposed in the past [29–31]. Such classification schemes, however, have proved rather unsatisfactory, due to the high degree of overlap that exists among many methods. In an effort to solve this problem, Xie and Mirmehdi have recently proposed a fuzzy categorization [32]. In their scheme a method can belong to one or more of the following four classes: statistical, structural, model-based, and signal processing-based. Statistical approaches consider the spatial distribution of pixel values. Structural methods describe textures in terms of the spatial arrangement of certain elementary patterns, which are usually referred to as texels or textons. Model-based methods characterize the texture through a set of parameters of a predefined mathematical model. Finally the signal processing-based methods typically employ a bank of filters to analyse texture at different frequencies and orientations. Herein we selected four classic methods that are easy to implement and have proved effective in many applications: gray level co-occurrence matrices, local binary patterns, coordinated clusters representation, and Gabor filters.

Gray level co-occurrence matrices (GLCM) are among the most common texture descriptors [33]. They are also important from a historical perspective, as they are one of the methods that pioneered texture analysis. This approach belongs to the statistical group. It is based on the bidimensional joint probability of the grayscale values of pairs of pixels separated by a predefined displacement vector. Each displacement generates one co-occurrence matrix, from which a set of statistical parameters are computed. Herein we considered the following: contrast, correlation, energy, entropy, and homogeneity. Most commonly the set of displacements includes one-pixel shifts along four directions: 0°, 45°, 90°, and 135°. This is the setting adopted in our implementation. The matrices corresponding to the four displacements are averaged for rotation-invariance. Invariance to illumination changes can be obtained by using statistical parameters that are invariant to illumination intensity, such as energy, entropy, contrast, correlation, and homogeneity.
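The following minimal sketch shows one way to compute a co-occurrence matrix, average it over the four one-pixel displacements, and derive the five features listed above. Several details are assumptions made for this example: the matrix is made symmetric, homogeneity uses the 1/(1 + |i − j|) weighting (other definitions exist), correlation is set to zero when a marginal variance vanishes, and the image is assumed to be already quantized to integer gray levels in the range 0 to levels − 1.

```python
import math

# One-pixel displacements (dx, dy) for 0, 45, 90 and 135 degrees (y grows downwards).
DISPLACEMENTS = [(1, 0), (1, -1), (0, -1), (-1, -1)]


def glcm(image, dx, dy, levels):
    """Normalised, symmetric co-occurrence matrix for displacement (dx, dy)."""
    h, w = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                i, j = image[y][x], image[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
                total += 2
    return [[c / total for c in row] for row in counts]


def averaged_glcm(image, levels):
    """Average the four directional matrices (approximate rotation invariance)."""
    mats = [glcm(image, dx, dy, levels) for dx, dy in DISPLACEMENTS]
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(levels)]
            for i in range(levels)]


def glcm_features(p):
    """Contrast, correlation, energy, entropy and homogeneity of a matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    if var_i > 0 and var_j > 0:
        cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
        correlation = cov / math.sqrt(var_i * var_j)
    else:
        correlation = 0.0  # assumed convention for flat images
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "correlation": correlation,
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


if __name__ == "__main__":
    img = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
    print(glcm_features(averaged_glcm(img, levels=4)))
```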

Local binary patterns (LBP) [34] and coordinated clusters representation (CCR) [35] are both statistical and structural. They characterize textures through the probability of occurrence of certain elemental patterns that can occur in a neighbourhood of predefined shape and size. Experiments have shown that good results can be obtained using a neighbourhood as small as a $3 \times 3$ window. Both LBP and CCR consider the probability of occurrence of the binary patterns that result from each position of the neighbourhood as it slides by one-pixel steps along the image. Binarization is based on thresholding: this is local in the case of LBP (threshold is the value of the central pixel) and global in the case of CCR (threshold is computed from the whole image through isentropic partition of the grayscale histogram). This gives $2^8$ and $2^9$ possible binary patterns in the two cases (LBP and CCR, resp.). Rotationally invariant versions of the two methods can be easily obtained considering an interpolated circular neighbourhood [34]. These two rotation-invariant versions, which are the ones used herein, are described in [36].
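The difference between local and global thresholding can be made concrete with a small sketch. The code below computes the basic (not rotation-invariant) pattern histograms on a square 3 × 3 window; the neighbour ordering, the use of `>=` in the comparisons, and the function names are assumptions for this example. For CCR the global threshold is passed in as a parameter rather than computed through isentropic partition.

```python
# Eight neighbours of the 3x3 window, clockwise from the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """Normalised histogram of the 2**8 basic LBP codes of a grayscale image."""
    h, w = len(image), len(image[0])
    hist = [0] * 256
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = image[y][x]  # local threshold: the central pixel
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                if image[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    n = (h - 2) * (w - 2)
    return [c / n for c in hist]


def ccr_histogram(image, threshold):
    """Normalised histogram of the 2**9 binary 3x3 patterns after global thresholding."""
    h, w = len(image), len(image[0])
    binary = [[1 if v >= threshold else 0 for v in row] for row in image]
    hist = [0] * 512
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            code = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):  # all nine pixels, centre included
                    code = (code << 1) | binary[y + dy][x + dx]
            hist[code] += 1
    n = (h - 2) * (w - 2)
    return [c / n for c in hist]


if __name__ == "__main__":
    img = [[3, 7, 1, 4], [9, 2, 8, 5], [6, 0, 4, 2], [1, 5, 3, 7]]
    brighter = [[2 * v + 10 for v in row] for row in img]
    # LBP depends only on the ordering of gray levels within each window.
    print(lbp_histogram(img) == lbp_histogram(brighter))
```

Because the LBP comparison is purely ordinal, any monotonically increasing change of intensity leaves the histogram unchanged, as the usage example checks.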

Gabor filters [37] have played a significant role in texture analysis and are ubiquitous in many applications. The reason for the success of the method perhaps lies in its capability of mimicking the human vision system [38]. The approach is based on multichannel, frequency- and orientation-selective analysis of the image. The implementation therefore consists in designing a bank of filters and selecting a proper set of parameters for each filter [39]. Texture features are typically the mean and the standard deviation of the absolute value of the transformed images. Invariance against rotation can be easily achieved through DFT normalization.
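A minimal filter-bank sketch is given below. It is an illustration under several assumptions: an isotropic Gaussian envelope, an odd kernel size, filtering restricted to positions where the kernel fits entirely inside the image, and hypothetical function names and default parameter values. It does not include the DFT normalization step for rotation invariance.

```python
import cmath
import math
from statistics import mean, pstdev


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid.

    The sinusoid varies along the direction given by `theta` (radians).
    `size` is assumed to be odd.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            row.append(envelope * cmath.exp(2j * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def gabor_features(image, wavelengths, orientations, size=7, sigma=2.0):
    """Mean and standard deviation of the response magnitude for each filter."""
    h, w = len(image), len(image[0])
    half = size // 2
    features = []
    for wavelength in wavelengths:
        for k in range(orientations):
            kern = gabor_kernel(size, wavelength, math.pi * k / orientations, sigma)
            magnitudes = []
            for y in range(half, h - half):
                for x in range(half, w - half):
                    response = sum(kern[j][i] * image[y + j - half][x + i - half]
                                   for j in range(size) for i in range(size))
                    magnitudes.append(abs(response))
            features += [mean(magnitudes), pstdev(magnitudes)]
    return features


if __name__ == "__main__":
    # Vertical stripes with a period of four pixels.
    stripes = [[1 if x % 4 < 2 else 0 for x in range(16)] for _ in range(16)]
    print(gabor_features(stripes, wavelengths=[4], orientations=2))
```

In the usage example the filter tuned to horizontal variation with a four-pixel wavelength responds much more strongly than the one tuned to vertical variation, which is the orientation selectivity the method relies on.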

Compared with spectral methods, spatial methods show rather opposed characteristics. Whereas the former are invariant against changes in rotation, viewpoint, and scale, the latter are not. They are, in fact, quite sensitive to changes in viewpoint, scale, and rotation. Invariance against rotation can be easily introduced in some spatial methods (the methods considered herein are all rotationally invariant), but invariance to viewpoint and scale is much more difficult to achieve. In contrast, whereas spectral methods are scarcely resilient to illumination changes, spatial methods are more robust in this sense. Some of them are intrinsically invariant to changes in illumination intensity (e.g., LBP); in the others a change in illumination intensity can be easily compensated for. In summary, spatial methods are recommended when we are interested in discriminating materials’ appearance on the basis of local changes, that is, texture. Hence they are appealing for those materials with a clear local structure, such as woven fabric, wood, granite, and so forth; by contrast they are scarcely descriptive of materials with “flat” appearance. Operating conditions can tolerate only slight changes in viewpoint and scale, whereas changes in illumination intensity and rotation are acceptable when the proper methods are used (see Table 2).

2.3. Hybrid Methods

In an attempt to exploit spatial and spectral information jointly, researchers have devoted significant attention to the development of combined (hybrid) methods. Experiments have in fact confirmed that combined approaches can provide better performance when compared with methods that consider texture and colour separately [18]. Approaches that consider the two sources of data can rely on one of the following strategies: considering colour and texture separately and concatenating the resulting feature vectors; extracting texture features from each colour channel separately (intra-channel features); and extracting texture features from pairs of colour channels (inter-channel features). There is no general consensus, at present, about which of these strategies is the best [26]. In this subsection we report four methods that have proved reliable and effective in a number of applications: integrative co-occurrence matrices, opponent colour local binary patterns, colour ranklets, and local binary patterns + colour percentiles.

GLCM can be easily extended to colour images considering both the co-occurrence probability within one colour channel (intra-channel features) and between pairs of colour channels (inter-channel features). The latter are usually extracted from the $R$-$G$, $R$-$B$, and $G$-$B$ pairs, as proposed in the implementation of Palm [40]. Texture features are extracted as in GLCM. The method is usually referred to as integrative co-occurrence matrices (integrative CM).

Local binary patterns have also been extended to colour images in a similar way. The method is referred to as opponent colour local binary patterns (OCLBP). In this model intra-channel features are obtained by applying the LBP operator to each channel separately. Inter-channel features are computed for pairs of colour channels (i.e., $R$-$G$, $R$-$B$, and $G$-$B$) by taking the neighbourhood from one channel and the threshold from the other channel [41]. The method considered herein makes use of the rotationally invariant version of LBP and is therefore referred to as rotation-invariant OCLBP.
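The intra- and inter-channel scheme can be sketched by letting the LBP operator take its neighbourhood and its threshold from two possibly different channels. The example below uses the basic 3 × 3 operator rather than the rotation-invariant version, and the function names and neighbour ordering are assumptions for this illustration.

```python
# Eight neighbours of the 3x3 window, clockwise from the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def cross_lbp_histogram(neighbour_channel, centre_channel):
    """LBP histogram with neighbours from one channel and threshold from another.

    Passing the same channel twice gives the ordinary (intra-channel) LBP.
    """
    h, w = len(centre_channel), len(centre_channel[0])
    hist = [0] * 256
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            threshold = centre_channel[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(OFFSETS):
                if neighbour_channel[y + dy][x + dx] >= threshold:
                    code |= 1 << bit
            hist[code] += 1
    n = (h - 2) * (w - 2)
    return [c / n for c in hist]


def oclbp(r, g, b):
    """Concatenate three intra-channel and three inter-channel histograms."""
    pairs = [(r, r), (g, g), (b, b),   # intra-channel
             (r, g), (r, b), (g, b)]   # inter-channel
    features = []
    for neighbours, centre in pairs:
        features += cross_lbp_histogram(neighbours, centre)
    return features


if __name__ == "__main__":
    r = [[3, 7, 1, 4], [9, 2, 8, 5], [6, 0, 4, 2], [1, 5, 3, 7]]
    g = [[2, 2, 6, 1], [4, 8, 3, 3], [7, 1, 5, 9], [0, 6, 2, 4]]
    b = [[5, 1, 4, 8], [2, 6, 0, 7], [3, 9, 1, 5], [8, 2, 6, 0]]
    print(len(oclbp(r, g, b)))
```

The feature vector is six times longer than that of grayscale LBP, which is the price paid for the added colour information.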

In a similar way ranklets, a nonparametric local measure of relative intensity based on the ranklet transform, have been extended to colour textures by taking both intra- and inter-channel features in the RGB space [42]. Since both colour ranklets and OCLBP are based on relative differences or ratios between colour values in the $R$, $G$, and $B$ channels, they are theoretically invariant to changes in illumination intensity.

Finally we mention an approach where texture and colour are computed separately and the resulting features are merged. This method considers rotation-invariant LBP as texture descriptor and colour centiles as colour descriptors. Successful applications of this method have been reported in defect detection of parquet slabs [27].

In Sections 2.1 and 2.2 we have highlighted that spectral and spatial methods capture different and complementary aspects of materials’ appearance: spectral methods are able to discriminate among materials with similar texture but different colour, whereas spatial methods can distinguish between materials with similar colour but different texture. It therefore makes sense to combine the two concepts into joint descriptors. However, it has been pointed out that the gain in accuracy that one obtains with hybrid methods comes with a significant loss in robustness [43]. In fact, whereas spectral and spatial methods alone have some interesting invariance features, which have been mentioned in the preceding subsections, these are lost in hybrid methods (see Table 2). Any type of change in illumination, viewpoint, rotation, and scale can in principle affect these methods and result in a significant loss of performance. Quantitative performance analyses of hybrid methods applied to granite and wood can be found in [7, 27].

3. Industrial Applications

Automatic characterization of the visual appearance of materials has many interesting applications in industry. In this paper we are particularly concerned with the following main classes of problems: grading, surface inspection, and content-based image retrieval. The aim of this section is to provide the reader with a description of these scenarios, as well as to present practical examples of application in the three contexts.

Grading is a process where products are grouped into lots based on the criterion of “similar appearance” [24]. This means deciding which category or grade (among a set of predefined classes) each material belongs to [30]. Grading applications are related to stages of the supply chain such as ordering, delivering, and warehousing. This process implies two major steps: extraction of visual features and assignment of labels. In the preceding section we presented a set of methods that can be used in the first step. The second step is usually referred to as a classification problem, which can be either supervised or unsupervised. It is beyond the scope of this paper to describe in detail these concepts and related methods. The interested reader will find a comprehensive reference in the classic text of Duda et al. [20]. In practical applications supervised classification consists in assigning a class label to a product among a set of predefined classes, each possible class being represented by one or more training samples. In contrast, in unsupervised classification we are given a set of products to be grouped into a predefined number of classes, but no training samples are given in this case. Unsupervised classification is also referred to as clustering.

Surface inspection is concerned with the detection of visible surface defects such as stains, cracks, veins, knots, and so forth. An interesting and up-to-date review of recent advances in this field can be found in [44]. Surface inspection is based on image segmentation, a process where an image of the material under control is split into regions with homogeneous visual properties. Segmentation has been extensively studied by the computer vision community, and many approaches are described in the literature. They range from simple pixel-by-pixel classification to more involved methods, such as region growing.

Content-based image retrieval is the process of searching large datasets based on visual content, rather than metadata such as keywords, tags, and so forth. For a review of methods readers may find the work of Vassilieva useful [45]. Potential applications of CBIR in industry are appealing, though not completely exploited so far. An interesting use, for instance, is in the problem of finding the material that is most similar to a query sample. We can imagine, for example, a situation where a customer needs to replace some broken pieces of granite, ceramics, and so forth or has to enlarge a façade or floor which has previously been covered with material of a certain type. In such cases the customer will request a material “similar to” his (the query sample). A CBIR system—perhaps distributed over a network of potential providers—would be of great help in such a case and would foster cooperation among companies in a global marketplace.

3.1. Grading

In this section we present an example of surface grading applied to four different types of industrial materials: wood, ceramics, textile, and granite. The first three databases (Figures 3(a), 3(b), and 3(c)) are composed of 30 classes each—one image per class. The images have been picked from a public internet repository (http://www.archibaseplanet.com), and the resolution is pixels. In order to get more examples of each material, the images have been subdivided into four nonoverlapping subimages. The fourth dataset (Figure 3(d)) contains 12 granite classes, with four images for each class. The granite images (http://dismac.dii.unipg.it/mm) have been acquired in our laboratory under controlled illumination conditions using an illuminator (Monster DOME Light 18.25) and a digital camera (Samsung S850). Image resolution is pixels and no further subdivision is performed in this case.

To assess the performance of the descriptors presented in Section 2, we carried out a supervised classification experiment. First, each dataset is randomly split into two subsets: one for training (training set) and the other for classification (validation set). The subdivision guarantees that half of the images of each class are used for training and half for validation (this procedure is known as stratified sampling). Second, each image of the validation set is classified through the nearest neighbour rule (1-NN) and distance [20].

Finally, classification accuracy is estimated as the percentage of images of the validation set that have been classified correctly. To make the results stable, the subdivision into training and validation set is repeated 100 times and the final accuracy is the average value over the 100 problems. The general expression to compute the classification accuracy is

$$A = \frac{100}{P} \sum_{p=1}^{P} \frac{C_p}{V_p}$$

where $A$ is the classification accuracy (in percent), $P$ the total number of problems ($P = 100$, in this case), $C_p$ the number of correctly classified samples, and $V_p$ the number of the validation samples in the $p$th problem.
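The evaluation protocol (repeated stratified half/half splits, 1-NN classification, averaged accuracy) can be sketched as follows. The function names are hypothetical, and the L1 (city-block) distance is used here only as an assumption for the sake of a runnable example; any other distance function can be passed in its place.

```python
import random


def l1_distance(a, b):
    """City-block distance between two feature vectors (assumed for this sketch)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbour(train, query, distance=l1_distance):
    """Label of the training sample closest to `query` (1-NN rule)."""
    return min(train, key=lambda sample: distance(sample[0], query))[1]


def stratified_split(samples, rng):
    """Put half of the samples of each class in the training set, half in validation."""
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append((features, label))
    train, validation = [], []
    for items in by_class.values():
        items = items[:]
        rng.shuffle(items)
        half = len(items) // 2
        train += items[:half]
        validation += items[half:]
    return train, validation


def average_accuracy(samples, problems=100, seed=0):
    """Percentage of correct 1-NN decisions, averaged over random splits."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(problems):
        train, validation = stratified_split(samples, rng)
        correct = sum(nearest_neighbour(train, f) == y for f, y in validation)
        total += correct / len(validation)
    return 100 * total / problems


if __name__ == "__main__":
    data = [([0.1 * i], "oak") for i in range(4)] + \
           [([10 + 0.1 * i], "pine") for i in range(4)]
    print(average_accuracy(data, problems=10))
```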

The results of the experiments are summarized in Table 3. The first two columns report the class and name of each method, the third column the dimension of the feature space, and the following columns the average accuracy over the four datasets.

In general the results are fairly good and confirm the effectiveness of the considered methods for grading. Among the spatial methods, Gabor filters are among those showing the best accuracy. Spectral methods perform quite well too: it is worth mentioning that a method as simple as the mean values of each $R$, $G$, and $B$ channel provides rather good accuracy. Hybrid methods, on average, are the most effective, as one would expect.

3.2. Surface Inspection

In this section we describe a typical surface inspection task consisting of detecting defective areas on the inspected surface. Three images of defective specimens of fabric, paper, and wood (see Figure 4) have been acquired in our laboratory under controlled illumination conditions using the setup described in Section 3.1. The task consists of segmenting the image into defective and non-defective zones. In the experiment we avoided complicated segmentation methods such as region growing or morphological analysis and stuck to the simple and easy-to-implement per-pixel classification. We considered one method for each of the spectral, spatial, and hybrid groups presented in Section 2, namely, $R$, $G$, and $B$ values, rotation-invariant LBP, and rotation-invariant LBP + colour centiles. Segmentation has been performed through supervised classification based on the Naïve Bayes algorithm [20] with a normal probability density kernel. For each specimen approximately 5% of the whole area has been used for training and the remaining 95% for validation. The “true” position and extension of the defects (ground truth) have been manually determined by human experts.
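Per-pixel classification with a Gaussian Naïve Bayes model can be sketched in a few lines. The code below is an illustration under stated assumptions: each pixel is described by a short feature vector (here RGB values), features are treated as independent normal variables within each class, a small constant is added to the variances to avoid division by zero, and the function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(samples):
    """Estimate prior, per-feature mean and variance for each class.

    `samples` is a list of (feature_vector, label) pairs, e.g. labelled pixels.
    """
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append(features)
    model = {}
    for label, vectors in by_class.items():
        columns = list(zip(*vectors))
        model[label] = (
            len(vectors) / len(samples),
            [mean(c) for c in columns],
            [pvariance(c) + 1e-9 for c in columns],  # variance floor (assumption)
        )
    return model


def predict(model, features):
    """Return the class with the highest log-posterior for one pixel."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for x, mu, var in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


if __name__ == "__main__":
    training = [((200, 201, 199), "sound"), ((198, 200, 202), "sound"),
                ((50, 40, 30), "defect"), ((55, 42, 28), "defect")]
    model = fit_gaussian_nb(training)
    print(predict(model, (199, 199, 199)), predict(model, (52, 41, 29)))
```

Applying `predict` to every pixel of the image yields the defective/non-defective map; in this sketch the training pixels stand in for the small labelled fraction of the specimen.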

As a measure of accuracy, we considered the sum of the percentage of foreground pixels (i.e., defective areas) correctly classified as foreground and that of background pixels (i.e., non-defective areas) correctly classified as background. This parameter gives an overall estimate of the effectiveness of the segmentation process. In formulas we have

$$A = 100 \cdot \frac{|B \cap B_t| + |F \cap (I \setminus B_t)|}{|I|}$$

where $A$ is the overall accuracy; $I$ the whole image; $B$ and $F$ the background and foreground produced by the automatic segmentation procedure; $B_t$ the “true” background; and $|\cdot|$ denotes the number of pixels in a region. The segmentation results are summarised in Figure 4. The results show that, for each problem, there is at least one method that produces good segmentation results. Conversely, not all the methods perform well in each problem, suggesting that the selection of the right descriptor is a domain-specific problem which needs to be considered with extreme care. Note that the results presented in Figure 4 could be further enhanced through some postprocessing steps, such as region growing, morphological filtering, and so forth.
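The accuracy measure described in words above (correctly classified foreground plus correctly classified background, as a percentage of the image) can be computed directly from two binary masks. The sketch below assumes masks stored as lists of rows with 0 for background and 1 for foreground; the function name is hypothetical.

```python
def segmentation_accuracy(predicted, truth):
    """Percentage of pixels whose predicted label matches the ground truth.

    Both arguments are 2D lists of 0 (background) and 1 (foreground)
    with identical dimensions.
    """
    total = correct_background = correct_foreground = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for p, t in zip(predicted_row, truth_row):
            total += 1
            if p == 0 and t == 0:
                correct_background += 1
            elif p == 1 and t == 1:
                correct_foreground += 1
    return 100.0 * (correct_background + correct_foreground) / total


if __name__ == "__main__":
    predicted = [[0, 0], [1, 1]]
    truth = [[0, 1], [1, 1]]
    print(segmentation_accuracy(predicted, truth))  # one of four pixels is wrong
```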

3.3. Content-Based Image Retrieval

Finally we present an application of content-based image retrieval. To illustrate this application we use a software prototype developed by the authors (Figure 5).

In the example presented in this section we considered different types of wood (parquet). The first step consists of creating a database of wood images. In this phase the user selects a set of images and uploads each of them to the database. When an image is uploaded (Figure 5(a)), the system computes its visual characteristics using one or more of the descriptors discussed in Section 2, as chosen by the user. The visual characteristics are then stored within the image. This process could be automated through an appropriate online acquisition and recording system. In the example shown in this section we created a database of 30 types of wood (see Figure 6).

Once the database is created, the user can submit a query image and perform content-based retrieval. In this stage the user selects the visual characteristic to be used for the retrieval and the distance type. The system presents the user with the four most similar images in the database, in descending order of similarity from left to right (Figure 5(b)). The results of a retrieval experiment are reported in Figure 7, with the query image on the left and the retrieved images on the right. Each row corresponds to a different descriptor.
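The retrieval step amounts to a nearest-neighbour search in descriptor space. The following sketch shows one possible way to do it and is not the prototype's actual code. The two distance functions, the function names, the wood names, and the three-component feature vectors are all hypothetical, since the distances offered by the prototype are not specified here.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def retrieve(query, database, distance=euclidean, k=4):
    """Return the names of the k database images closest to the query,
    ordered from most to least similar."""
    ranked = sorted(database.items(),
                    key=lambda item: distance(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical precomputed descriptors, one vector per stored image.
database = {
    "oak": [0.62, 0.41, 0.20],
    "walnut": [0.30, 0.18, 0.10],
    "maple": [0.80, 0.66, 0.45],
    "cherry": [0.58, 0.30, 0.22],
    "ash": [0.75, 0.60, 0.40],
}
query = [0.60, 0.40, 0.21]

top4 = retrieve(query, database)
closest_l1 = retrieve(query, database, distance=manhattan, k=1)
```

Because the descriptors are computed once at upload time, a query only requires computing the descriptor of the query image and one distance per stored image.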

It is interesting to comment on the results provided by the three methods. First, notice that the query image presents a slight texture formed by approximately vertical lines. The first three images retrieved through the texture descriptor do present a texture very similar to that of the query image, as one would expect. On the other hand, the first two images returned by colour centiles are very similar in colour to the query image but have very little texture. Finally, the combination of the texture descriptor and colour centiles provides a balance between colour and texture.

4. Conclusions

In this paper we presented an overview of methods for automatic characterization of the visual appearance of materials and illustrated a set of applications in industry. The aim of the paper was to provide both researchers and practitioners with a survey of methods and applications in an area where research interest is currently high. The implementation of the methods presented in this paper may lead to better products as well as new services. This is particularly interesting for products where visual appearance plays a crucial role: marble, granite, ceramic, parquet, leather, fabric, and paper are just some examples. In an increasingly competitive global market, effective measurement, recording, and retrieval of visual features are key to ensuring high quality standards. The benefits expected from a wider adoption of these methods include better product quality, fewer faulty products, and higher customer satisfaction. In the near future emerging technologies such as RFID are likely to play an important role in this context. It is easy to imagine a scenario where the visual characteristics of a product are stored together with the product in RFID tags, as a sort of product identity card. This would bring significant improvements to warehouse management and the supply chain.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was partially supported by the Spanish Government within Project no. CTM2010-16573 and by the European Commission within Projects no. Life09-ENV/FI/000568 and no. LIFE12-ENV/IT/000411.