ISRN Machine Vision The latest articles from Hindawi Publishing Corporation © 2014, Hindawi Publishing Corporation. All rights reserved. Performance Evaluation of Noise Reduction Filters for Color Images through Normalized Color Difference (NCD) Decomposition Wed, 22 Jan 2014 13:24:08 +0000 Removing noise without producing image distortion is a challenging goal for any image denoising filter. Thus, the different amounts of residual noise and unwanted blur should be evaluated to analyze the actual performance of a denoising process. In this paper a novel full-reference method for measuring such features in color images is presented. The proposed approach is based on the decomposition of the normalized color difference (NCD) into three components that separately take into account different classes of filtering errors such as the inaccuracy in filtering noise pulses, the inaccuracy in reducing Gaussian noise, and the amount of collateral distortion. Computer simulations show that the proposed method offers significant advantages over other measures of filtering performance in the literature, including the recently proposed vector techniques. Fabrizio Russo Copyright © 2014 Fabrizio Russo. All rights reserved. Novel Approach for Rooftop Detection Using Support Vector Machine Mon, 23 Dec 2013 13:49:16 +0000 A new method for detecting rooftops in satellite images is presented. The proposed method is based on a combination of machine learning techniques, namely, k-means clustering and support vector machines (SVM). First, k-means clustering is used to segment the image into a set of rooftop candidates; these are homogeneous regions in the image which are potentially associated with rooftop areas. Next, the candidates are submitted to a classification stage which determines which amongst them correspond to “true” rooftops. To achieve improved accuracy, a novel two-pass classification process is used.
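The candidate-generation stage just described can be sketched as follows. This is a minimal pixel-intensity illustration of k-means segmentation only, not the authors' implementation; the SVM passes and the "histogram method" are omitted, and the quantile-based initialisation is an assumption for the sketch.

```python
import numpy as np

def kmeans_segment(image, k=3, iters=20):
    """Cluster pixel intensities with k-means and return a label map.

    Each cluster is a set of homogeneous pixels; in the rooftop
    pipeline such regions would be screened as candidates by a
    downstream SVM classifier.
    """
    pixels = image.reshape(-1, 1).astype(float)
    # initialise centers at intensity quantiles (illustrative choice)
    centers = np.quantile(pixels, np.linspace(0, 1, k))[:, None]
    for _ in range(iters):
        dists = np.abs(pixels - centers.T)      # (N, k) distances
        labels = dists.argmin(axis=1)
        for j in range(k):                      # recompute cluster means
            if np.any(labels == j):
                centers[j, 0] = pixels[labels == j].mean()
    return labels.reshape(image.shape)

# toy image: three flat intensity regions
img = np.zeros((6, 9))
img[:, 3:6] = 0.5
img[:, 6:] = 1.0
labels = kmeans_segment(img, k=3)
```

Each flat region ends up in its own cluster, giving one candidate region per intensity mode.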
In the first pass, a trained SVM is used in the normal way to distinguish between rooftop and nonrooftop regions. However, this can be a challenging task, resulting in a relatively high rate of misclassification. Hence, the second pass, which we call the “histogram method,” was devised with the aim of detecting rooftops which were missed in the first pass. The performance of the model is assessed both in terms of the percentage of correctly classified candidates and the accuracy of the estimated rooftop area. Hayk Baluyan, Bikash Joshi, Amer Al Hinai, and Wei Lee Woon Copyright © 2013 Hayk Baluyan et al. All rights reserved. A Robust Illumination Normalization Method Based on Mean Estimation for Face Recognition Mon, 16 Dec 2013 09:47:44 +0000 An illumination normalization method for face recognition is developed, since it is difficult to control lighting conditions efficiently in practical applications. Considering that the irradiating light varies little within a local area, a mean estimation method is used to simulate the illumination component of a face image. The illumination component is removed by subtracting the mean estimation from the original image. In order to highlight face texture features and suppress the impact of adjacent domains, a ratio of the quotient image and its modulus mean value is obtained. The exponential of this ratio closely approximates a relative reflection component. Since the gray value of facial organs is less than that of the facial skin, postprocessing is applied to the images in order to highlight facial texture for face recognition. Experiments show that the performance of the proposed method is superior to that of state-of-the-art methods. Yong Luo, Ye-Peng Guan, and Chang-Qi Zhang Copyright © 2013 Yong Luo et al. All rights reserved. Resection-Intersection Bundle Adjustment Revisited Thu, 12 Dec 2013 18:25:54 +0000 Bundle adjustment is one of the essential components of the computer vision toolbox.
This paper revisits the resection-intersection approach, which has previously been shown to have inferior convergence properties. Modifications are proposed that greatly improve the performance of this method, resulting in a fast and accurate approach. Firstly, a linear triangulation step is added to the intersection stage, yielding higher accuracy and improved convergence rate. Secondly, the effect of parameter updates is tracked in order to reduce wasteful computation; only variables coupled to significantly changing variables are updated. This leads to significant improvements in computation time, at the cost of a small, controllable increase in error. Loop closures are handled effectively without the need for additional network modelling. The proposed approach is shown experimentally to yield comparable accuracy to a full sparse bundle adjustment (20% error increase) while computation time scales much better with the number of variables. Experiments on a progressive reconstruction system show the proposed method to be more efficient by a factor of 65 to 177, and 4.5 times more accurate (increasing over time) than a localised sparse bundle adjustment approach. Ruan Lakemond, Clinton Fookes, and Sridha Sridharan Copyright © 2013 Ruan Lakemond et al. All rights reserved. Active Object Recognition with a Space-Variant Retina Thu, 05 Dec 2013 16:39:42 +0000 When independent component analysis (ICA) is applied to color natural images, the representation it learns has spatiochromatic properties similar to the responses of neurons in primary visual cortex. Existing models of ICA have only been applied to pixel patches. This does not take into account the space-variant nature of human vision. To address this, we use the space-variant log-polar transformation to acquire samples from color natural images, and then we apply ICA to the acquired samples. We analyze the spatiochromatic properties of the learned ICA filters. 
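The space-variant sampling stage described above can be sketched as follows. This is an illustrative nearest-neighbour log-polar sampler under assumed grid parameters; the ICA stage that would consume the gathered patches is omitted.

```python
import numpy as np

def log_polar_samples(image, center, n_rho=8, n_theta=16, r_max=None):
    """Sample an image on a log-polar grid (nearest-neighbour).

    Radii grow exponentially, so sampling is dense near the fixation
    point and sparse in the periphery, which is the space-variant
    layout the abstract describes.
    """
    h, w = image.shape[:2]
    cy, cx = center
    if r_max is None:
        r_max = min(h, w) / 2
    rhos = np.exp(np.linspace(0, np.log(r_max), n_rho))     # 1 .. r_max
    thetas = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    out = np.zeros((n_rho, n_theta) + image.shape[2:], dtype=image.dtype)
    for i, r in enumerate(rhos):
        for j, t in enumerate(thetas):
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            if 0 <= y < h and 0 <= x < w:                   # clip to image
                out[i, j] = image[y, x]
    return out

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
patch = log_polar_samples(img, center=(32, 32))
```

Patches gathered this way from many fixation points would form the training set for the ICA filters.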
Qualitatively, the model matches the receptive field properties of neurons in primary visual cortex, including exhibiting the same opponent-color structure and a higher density of receptive fields in the foveal region compared to the periphery. We also adopt the “self-taught learning” paradigm from machine learning to assess the model’s efficacy at active object and face classification, and the model is competitive with the best approaches in computer vision. Christopher Kanan Copyright © 2013 Christopher Kanan. All rights reserved. Towards Understanding the Formation of Uniform Local Binary Patterns Wed, 31 Jul 2013 09:03:22 +0000 The research reported in this paper focuses on the modeling of Local Binary Patterns (LBPs) and presents an a priori model where LBPs are considered as combinations of permutations. The aim is to increase the understanding of the mechanisms related to the formation of uniform LBPs. Uniform patterns are known to exhibit high discriminative capability; however, so far the reasons for this have not been fully explored. We report an observation that although the overall a priori probability of uniform LBPs is high, it is mostly due to the high probability of only certain classes of patterns, while the a priori probability of other patterns is very low. In order to examine this behavior, the relationship between the runs up and down test for randomness of permutations and the uniform LBPs was studied. Quantitative experiments were then carried out to show that the relative effect of uniform patterns on the LBP histogram is strengthened with deterministic data, in comparison with the i.i.d. model. This was verified by using an a priori model as well as through experiments with natural image data. It was further illustrated that specific uniform LBP codes can also provide responses to salient shapes, that is, to monotonically changing intensity functions and edges within the image microstructure.
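The notion of pattern uniformity used above can be made concrete with a short check (a sketch for the standard 8-bit case, not the authors' code):

```python
def is_uniform(code, bits=8):
    """An LBP code is 'uniform' if its circular bit string contains at
    most two 0/1 transitions (e.g. 00111000 has 2; 01010000 has 4)."""
    transitions = 0
    for i in range(bits):
        b1 = (code >> i) & 1
        b2 = (code >> ((i + 1) % bits)) & 1   # circular neighbour
        transitions += b1 != b2
    return transitions <= 2

# Of the 256 possible 8-bit patterns, exactly 58 are uniform: the small
# set that carries most of the a priori probability mass.
uniform_codes = [c for c in range(256) if is_uniform(c)]
```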
Olli Lahdenoja, Jonne Poikonen, and Mika Laiho Copyright © 2013 Olli Lahdenoja et al. All rights reserved. Affine-Invariant Feature Extraction for Activity Recognition Mon, 15 Jul 2013 12:58:45 +0000 We propose an innovative approach for human activity recognition based on affine-invariant shape representation and SVM-based feature classification. In this approach, a compact, computationally efficient affine-invariant representation of action shapes is developed by using affine moment invariants. Dynamic affine invariants are derived from the 3D spatiotemporal action volume and the average image created from the 3D volume and classified by an SVM classifier. On two standard benchmark action datasets (KTH and Weizmann datasets), the approach yields promising results that compare favorably with those previously reported in the literature, while maintaining real-time performance. Samy Sadek, Ayoub Al-Hamadi, Gerald Krell, and Bernd Michaelis Copyright © 2013 Samy Sadek et al. All rights reserved. Vision Measurement Scheme Using Single Camera Rotation Wed, 05 Jun 2013 09:19:37 +0000 We propose a vision measurement scheme for estimating the distance or size of an object in a static scene, which requires a single camera with a 3-axis accelerometer sensor rotating around a fixed axis. First, we formulate the rotation matrix and translation vector from one coordinate system of the camera to another in terms of the rotation angle, which can be computed from the sensor readouts. Second, with the camera calibration data and through coordinate system transformation, we propose a method for calculating the orientation and position of the rotation axis relative to the camera coordinate system. Finally, given the rotation angle and the images of the object in the static scene at two different positions, one before and the other after camera rotation, the 3D coordinates of points on the object can be determined. Experimental results show the validity of our method.
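The rotation matrix mentioned in the first step can be written down with Rodrigues' formula. A minimal sketch, where the angle stands in for the accelerometer-derived readout and the axis for the calibrated rotation axis:

```python
import numpy as np

def axis_angle_rotation(axis, angle):
    """Rotation matrix for `angle` radians about `axis` via Rodrigues'
    formula: R = I + sin(a) K + (1 - cos(a)) K^2, with K the
    skew-symmetric cross-product matrix of the unit axis."""
    k = np.asarray(axis, dtype=float)
    k /= np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

# 90-degree rotation about the z-axis sends the x-axis to the y-axis
R = axis_angle_rotation([0, 0, 1], np.pi / 2)
```

With two such poses, corresponding image points can be triangulated to recover 3D coordinates, which is the third step of the scheme.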
Shidu Dong Copyright © 2013 Shidu Dong. All rights reserved. Visible and Infrared Face Identification via Sparse Representation Sun, 02 Jun 2013 08:58:21 +0000 We present a facial recognition technique based on facial sparse representation. A dictionary is learned from data, and patches extracted from a face are decomposed in a sparse manner onto this dictionary. We particularly focus on the design of the dictionaries, which play a crucial role in the final identification rates. Applied to various databases and modalities, we show that this approach achieves promising performance. We also propose a score fusion framework that quantifies the saliency of classifier outputs and merges them according to these saliencies. Pierre Buyssens and Marinette Revenu Copyright © 2013 Pierre Buyssens and Marinette Revenu. All rights reserved. Deformable Contour-Based Maneuvering Flying Vehicle Tracking in Color Video Sequences Tue, 19 Mar 2013 10:27:02 +0000 This paper presents a new method for the tracking of maneuvering flying vehicles using a deformable contour model in color video sequences. The proposed approach concentrates on targets with maneuvering motion in the sky, which involves fundamental aspect change stemming from 3D rotation of the target or the video camera. In order to segment and track the aircraft in a video, the target contour is first initialized manually in a key frame, and then it is matched and tracked automatically in the subsequent frames. Generally, active contour models employ a set of energy functions based on edge, texture, color, and shape features. Afterwards, the objective function is minimized iteratively to track the target contour. In the proposed algorithm, we employ a Game of Life cellular automaton to manage the deformation of snake pixels (snaxels) in each epoch of the minimization procedure. Furthermore, to cope with the large aspect changes of aircraft, a Gaussian model is used to represent the target color in RGB space.
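A single-Gaussian RGB colour model of this kind can be sketched as follows: fit a mean and covariance to target pixels, then score new pixels by squared Mahalanobis distance (small means "looks like the target"). The data and regularisation constant are illustrative assumptions.

```python
import numpy as np

def fit_color_model(samples):
    """Fit one Gaussian to N x 3 RGB target samples."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(3)  # regularize
    return mean, np.linalg.inv(cov)

def mahalanobis_sq(pixels, mean, cov_inv):
    """Squared Mahalanobis distance of M x 3 pixels to the model."""
    d = pixels - mean
    return np.einsum('ij,jk,ik->i', d, cov_inv, d)

rng = np.random.default_rng(1)
target = rng.normal([200, 40, 40], 5, size=(500, 3))   # reddish target
model = fit_color_model(target)
test_px = np.array([[201.0, 41.0, 39.0],    # on-model colour
                    [40.0, 200.0, 40.0]])   # greenish background
d = mahalanobis_sq(test_px, *model)
```

Updating `mean` and the covariance over time corresponds to the dynamic update of the prior distribution mentioned next.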
To compensate for changes in the luminance and chrominance components of the target, the prior distribution function is dynamically updated during tracking. The proposed algorithm is evaluated using the collected dataset, and the expected probability of tracking error is calculated. Experiments show promising results for the proposed algorithm. Samira Sabouri, Alireza Behrad, and Hassan Ghassemian Copyright © 2013 Samira Sabouri et al. All rights reserved. LoCoBoard: Low-Cost Interactive Whiteboard Using Computer Vision Algorithms Wed, 13 Mar 2013 14:13:43 +0000 In the current digital age, the adoption of natural interfaces between humans and machines is increasingly important. This trend is particularly significant in the education sector, where interactive tools and applications can ease the presentation and comprehension of complex concepts, stimulate collaborative work, and improve teaching practices. As an important step towards this vision, interactive whiteboards are gaining widespread adoption in various levels of education. Nevertheless, these solutions are usually expensive, making their acceptance slow, especially in countries with more fragile economies. In this context, we present the low-cost interactive whiteboard (LoCoBoard) project, an open-source interactive whiteboard with low-cost hardware requirements, usually accessible in our daily lives, for an easy installation: a webcam-equipped computer, a video projector, and an infrared pointing device. The detection software framework offers five different Pointer Location algorithms with support for the Tangible User Interface Object protocol and supports multiple operating systems. We discuss the detailed physical and logical structure of LoCoBoard and compare its performance with that of similar systems. We believe that the proposed solution may represent a valuable contribution to easing access to interactive whiteboards and increasing their widespread use, with obvious benefits.
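A pointer-location algorithm of the kind the framework provides can be sketched as a threshold-and-centroid baseline. This is an illustrative strategy only; the five LoCoBoard algorithms are not reproduced here, and the threshold value is an assumption.

```python
import numpy as np

def locate_pointer(frame, threshold=200):
    """Locate a bright infrared spot in a grayscale frame.

    Threshold the frame and return the (x, y) centroid of the bright
    pixels, or None when no pixel clears the threshold.
    """
    ys, xs = np.nonzero(frame >= threshold)
    if len(xs) == 0:
        return None
    return float(xs.mean()), float(ys.mean())

frame = np.zeros((120, 160))
frame[50:53, 70:73] = 255          # a small IR spot
pos = locate_pointer(frame)
```

The returned camera coordinates would then be mapped to screen coordinates through the projector calibration.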
Christophe Soares, Rui S. Moreira, José M. Torres, and Pedro Sobral Copyright © 2013 Christophe Soares et al. All rights reserved. Area Optimized FPGA-Based Implementation of the Sobel Compass Edge Detector Thu, 07 Mar 2013 15:25:07 +0000 This paper presents a new FPGA resource optimized hardware architecture for real-time edge detection using the Sobel compass operator. The architecture uses a single processing element to compute the gradient for all directions. This greatly economizes on FPGA resource usage (more than a 40% reduction) while maintaining real-time video frame rates. The measured performance of the architecture is 50 fps for standard PAL size video and 200 fps for CIF size video. The use of pipelining further improved the performance (185 fps for PAL size video and 740 fps for CIF size video) without a significant increase in FPGA resources. Sanjay Singh, Anil Kumar Saini, Ravi Saini, A. S. Mandal, Chandra Shekhar, and Anil Vohra Copyright © 2013 Sanjay Singh et al. All rights reserved. New Brodatz-Based Image Databases for Grayscale, Color, and Multiband Texture Analysis Sun, 24 Feb 2013 19:21:31 +0000 Grayscale and color textures can have spectral informative content. This spectral information coexists with the grayscale or chromatic spatial pattern that characterizes the texture. This informative and nontextural spectral content can be a source of confusion for rigorous evaluations of the intrinsic textural performance of texture methods. In this paper, we used basic image processing tools to develop a new class of textures in which texture information is the only source of discrimination. Spectral information in this new class of textures contributes only to forming texture. The textures are grouped into two databases. The first is the Normalized Brodatz Texture database (NBT), which is a collection of grayscale images. The second is the Multiband Texture (MBT) database, which is a collection of color texture images.
Thus, this new class of textures is ideal for rigorous comparisons between texture analysis methods based only on their intrinsic performance on texture characterization. Safia Abdelmounaime and He Dong-Chen Copyright © 2013 Safia Abdelmounaime and He Dong-Chen. All rights reserved. Fast Exact Nearest Neighbour Matching in High Dimensions Using k-d Sort Sun, 17 Feb 2013 09:20:28 +0000 Data structures such as k-d trees and hierarchical k-means trees perform very well in approximate nearest neighbour matching, but are only marginally more effective than linear search when performing exact matching in high-dimensional image descriptor data. This paper presents several improvements to linear search that allow it to outperform existing methods and recommends two approaches to exact matching. The first method reduces the number of operations by evaluating the distance measure in order of significance of the query dimensions and terminating when the partial distance exceeds the search threshold. This method does not require preprocessing and significantly outperforms existing methods. The second method improves query speed further by presorting the data using a data structure called k-d sort. The order information is used as a priority queue to reduce the time taken to find the exact match and to restrict the range of data searched. Construction of the k-d sort structure is very simple to implement, does not require any parameter tuning, and requires significantly less time than the best-performing tree structure, and data can be added to the structure relatively efficiently. Ruan Lakemond, Clinton Fookes, and Sridha Sridharan Copyright © 2013 Ruan Lakemond et al. All rights reserved. Multimodal Markov Random Field for Image Reranking Based on Relevance Feedback Mon, 11 Feb 2013 14:08:00 +0000 This paper introduces a multimodal approach for the reranking of image retrieval results based on relevance feedback.
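The first exact-matching improvement described in the nearest-neighbour paper above (accumulating the distance one dimension at a time and abandoning a candidate once the partial sum exceeds the best distance so far) can be sketched as follows. The identity dimension order is used for simplicity; the significance ordering and the k-d sort presorting stage are not reproduced.

```python
import numpy as np

def nn_partial_distance(query, data, order=None):
    """Exact nearest neighbour by linear scan with early termination.

    `order` may list dimensions by decreasing significance; here it
    defaults to the natural order. The result is always the true
    nearest neighbour, only fewer arithmetic operations are done.
    """
    dims = order if order is not None else range(len(query))
    best_i, best_d = -1, np.inf
    for i, point in enumerate(data):
        partial = 0.0
        for d in dims:
            partial += (query[d] - point[d]) ** 2
            if partial >= best_d:      # cannot beat current best: stop
                break
        else:                          # loop finished: new best match
            best_i, best_d = i, partial
    return best_i, best_d

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 32))
query = data[17] + 0.01
i, _ = nn_partial_distance(query, data)
```

The early exit is what makes the scan cheap in high dimensions: most candidates are rejected after a handful of dimensions.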
We consider the problem of reordering the ranked list of images returned by an image retrieval system, in such a way that images relevant to a query are moved to the first positions of the list. We propose a Markov random field (MRF) model that aims at classifying the images in the initial retrieval-result list as relevant or irrelevant; the output of the MRF is used to generate a new list of ranked images. The MRF takes into account (1) the rank information provided by the initial retrieval system, (2) similarities among images in the list, and (3) relevance feedback information. Hence, the problem of image reranking is reduced to that of minimizing an energy function that represents a trade-off between image relevance and interimage similarity. The proposed MRF is multimodal, as it can take advantage of both the visual and textual information by which images are described. We report experimental results in the IAPR TC12 collection using visual and textual features to represent images. Experimental results show that our method is able to improve the ranking provided by the base retrieval system. Also, the multimodal MRF outperforms unimodal (i.e., either text-based or image-based) MRFs that we have developed in previous work. Furthermore, the proposed MRF outperforms baseline multimodal methods that combine information from unimodal MRFs. Ricardo Omar Chávez, Hugo Jair Escalante, Manuel Montes-y-Gómez, and Luis Enrique Sucar Copyright © 2013 Ricardo Omar Chávez et al. All rights reserved. Chord-Length Shape Features for Human Activity Recognition Mon, 26 Nov 2012 14:21:12 +0000 Despite their high stability and compactness, chord-length shape features have received relatively little attention in the human action recognition literature. In this paper, we present a new approach for human activity recognition based on chord-length shape features. The most interesting contribution of this paper is twofold.
We first show how a compact, computationally efficient shape descriptor, the chord-length shape feature, is constructed using 1D chord-length functions. Second, we show how fuzzy membership functions are used to partition action snippets into a number of temporal states. On two benchmark action datasets (KTH and WEIZMANN), the approach yields promising results that compare favorably with those previously reported in the literature, while maintaining real-time performance. Samy Sadek, Ayoub Al-Hamadi, Bernd Michaelis, and Usama Sayed Copyright © 2012 Samy Sadek et al. All rights reserved. Wavelet-Based Multiscale Adaptive LBP with Directional Statistical Features for Recognizing Artificial Faces Thu, 01 Nov 2012 14:33:55 +0000 Recognizing avatar faces is a very important issue for the security of virtual worlds. In this paper, a novel face recognition technique based on the wavelet transform and the multiscale representation of the adaptive local binary pattern (ALBP) with directional statistical features is proposed to increase the accuracy rate of recognizing avatars in different virtual worlds. The proposed technique consists of three stages: preprocessing, feature extraction, and recognition. In the preprocessing and feature extraction stages, wavelet decomposition is used to enhance the common features of the same subject of images and the multiscale ALBP (MALBP) is used to extract representative features from each facial image. Then, in the recognition stage, the wavelet MALBP (WMALBP) histogram dissimilarity, together with the statistical features of each test image and each class model, is used within a nearest neighbor classifier to improve classification accuracy.
Experiments conducted on two virtual world avatar face image datasets show that our technique performs better than LBP, PCA, multiscale local binary pattern, ALBP, and ALBP with directional statistical features (ALBPF) in terms of the accuracy and the time required to classify each facial image to its subject. Abdallah A. Mohamed and Roman V. Yampolskiy Copyright © 2012 Abdallah A. Mohamed and Roman V. Yampolskiy. All rights reserved. Practical Recognition System for Text Printed on Clear Reflected Material Sun, 14 Oct 2012 14:46:53 +0000 Text embedded in an image contains useful information for applications in the medical, industrial, commercial, and research fields. While many systems have been designed to correctly identify text in images, no work addressing the recognition of degraded text on clear plastic has been found. This paper posits novel methods and an apparatus for extracting text from an image under the following practical assumptions: (a) poor background contrast, (b) white, curved, and/or differing fonts or character widths between sets of images, (c) dotted text printed on curved reflective material, and/or (d) touching characters. Methods were evaluated using a total of 100 unique test images containing a variety of texts captured from water bottles. These tests averaged a processing time of ~10 seconds (using MATLAB R2008A on an HP 8510 W with 4 GB of RAM and a 2.3 GHz processor), and experimental results yielded an average recognition rate of 90 to 93% using customized systems generated by the proposed method. Khader Mohammad and Sos Agaian Copyright © 2012 Khader Mohammad and Sos Agaian. All rights reserved. Local Stereo Matching Using Adaptive Local Segmentation Thu, 23 Aug 2012 08:36:08 +0000 We propose a new dense local stereo matching framework for gray-level images based on an adaptive local segmentation using a dynamic threshold.
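Segmentation against a dynamic, locally computed threshold can be sketched as follows. The Niblack-style rule (local mean plus a multiple of the local standard deviation) is an illustrative stand-in; the paper derives its threshold from local intensity variation differently.

```python
import numpy as np

def local_threshold_segment(image, win=5, k=0.0):
    """Label each pixel by comparing it with a threshold computed
    from its own local window: mean + k * std of the window."""
    h, w = image.shape
    r = win // 2
    padded = np.pad(image.astype(float), r, mode='edge')
    labels = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            window = padded[y:y + win, x:x + win]   # centred on (y, x)
            labels[y, x] = image[y, x] > window.mean() + k * window.std()
    return labels

# a step edge: pixels just right of the edge exceed their local mean
img = np.zeros((8, 8))
img[:, 4:] = 1.0
seg = local_threshold_segment(img)
```

Because the threshold adapts to each window, the rule responds near intensity edges while staying quiet in flat regions.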
We define a new validity domain of the frontoparallel assumption based on the local intensity variations in the four neighborhoods of the matching pixel. The preprocessing step smoothes low-textured areas and sharpens texture edges, whereas the postprocessing step detects and recovers occluded and unreliable disparities. The algorithm achieves high stereo reconstruction quality in regions with uniform intensities as well as in textured regions. The algorithm is robust against local radiometric differences and successfully recovers disparities around object edges, disparities of thin objects, and disparities in occluded regions. Moreover, our algorithm intrinsically prevents errors caused by occlusion from propagating into nonoccluded regions. It has only a small number of parameters. The performance of our algorithm is evaluated on the Middlebury test bed stereo images. It ranks highly on the evaluation list, outperforming many local and global stereo algorithms using color images. Among the local algorithms relying on the frontoparallel assumption, our algorithm is the best-ranked algorithm. We also demonstrate that our algorithm works well on practical examples, such as disparity estimation for a tomato seedling and 3D reconstruction of a face. Sanja Damjanović, Ferdinand van der Heijden, and Luuk J. Spreeuwers Copyright © 2012 Sanja Damjanović et al. All rights reserved. Joint Segmentation and Groupwise Registration of Cardiac Perfusion Images Using Temporal Information Mon, 11 Jun 2012 11:50:20 +0000 We propose a joint segmentation and groupwise registration method for cardiac perfusion images by using temporal information. The nature of perfusion images makes groupwise registration especially attractive, as the temporal information from the entire image sequence can be used. Registration aims to maximize the smoothness of the intensity signal, while segmentation minimizes a pixel’s dissimilarity with other pixels having the same segmentation label.
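The two objectives just stated can be illustrated as separate cost terms. This is only a sketch of the terms themselves under simplified assumptions (scalar features, no spatial regularisation); the actual method combines them in one energy optimized with B-splines.

```python
import numpy as np

def registration_cost(signals):
    """Temporal-smoothness term: sum of squared differences between
    consecutive time points of each pixel's intensity signal
    (T x N array). Good registration makes these signals smooth."""
    return float((np.diff(signals, axis=0) ** 2).sum())

def segmentation_cost(features, labels):
    """Label-homogeneity term: each pixel's squared distance to the
    mean feature of the pixels sharing its segmentation label."""
    cost = 0.0
    for lab in np.unique(labels):
        f = features[labels == lab]
        cost += ((f - f.mean(axis=0)) ** 2).sum()
    return float(cost)

# a perfectly smooth signal has zero registration cost ...
smooth_cost = registration_cost(np.ones((10, 4)))
# ... and a well-separated labelling has a lower homogeneity cost
feats = np.array([[0.0], [0.1], [5.0], [5.1]])
good = np.array([0, 0, 1, 1])
bad = np.array([0, 1, 0, 1])
```

Minimizing the sum of both terms couples the two problems, which is what the joint framework exploits.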
The cost function is optimized in an iterative fashion using B-splines. Tests on real patient datasets show that, compared to two other methods, our method achieves lower registration error and higher segmentation accuracy. This is attributed to the use of temporal information for groupwise registration and mutually complementary registration and segmentation information in one framework, while other methods solve the two problems separately. Dwarikanath Mahapatra Copyright © 2012 Dwarikanath Mahapatra. All rights reserved. Spatiotemporal Relations and Modeling Motion Classes by Combined Topological and Directional Relations Method Wed, 16 May 2012 10:18:10 +0000 Defining spatiotemporal relations and modeling motion events are emerging issues of current research. Motion events are subclasses of spatiotemporal relations, where stable and unstable spatiotemporal topological relations and the temporal order of occurrence of a primitive event play an important role. In this paper, we propose a theory of spatiotemporal relations based on topological and orientation perspectives. This theory characterizes spatiotemporal relations into different classes according to the application domain and topological stability. It supports commonsense reasoning and the modeling of motion events in diverse applications, with motion classes as primitives that describe changes in orientation and topological relations. Orientation information is added to remove the locative symmetry of topological relations from motion events, and these events are defined in a systematic way. This will help to improve the understanding of spatial scenarios in spatiotemporal applications. Nadeem Salamat and El-hadi Zahzah Copyright © 2012 Nadeem Salamat and El-hadi Zahzah. All rights reserved.
3D Human Motion Tracking and Reconstruction Using DCT Matrix Descriptor Mon, 07 May 2012 18:14:05 +0000 One of the most important issues in human motion analysis is the tracking and 3D reconstruction of human motion, which utilizes the anatomic points' positions. These points can uniquely define the position and orientation of all anatomical segments. In this work, a new method is proposed for tracking and 3D reconstruction of human motion from the image sequence of a monocular static camera. In this method, 2D tracking is used for 3D reconstruction, and a database of selected frames is used to correct the tracking process. The method utilizes a new image descriptor based on the discrete cosine transform (DCT), which is employed in different stages of the algorithm. The advantage of this descriptor is the capability of selecting proper frequency regions for various tasks, which results in efficient tracking and pose-matching algorithms. The tracking and matching algorithms are based on reference descriptor matrixes (RDMs), which are updated after each stage based on the frequency regions in DCT blocks. Finally, 3D reconstruction is performed using Taylor’s method. Experimental results show the promise of the algorithm. Alireza Behrad and Nadia Roodsarabi Copyright © 2012 Alireza Behrad and Nadia Roodsarabi. All rights reserved. An Effective Slow-Motion Detection Approach for Compressed Soccer Videos Sun, 25 Mar 2012 09:41:29 +0000 Slow-motion replays are content-rich segments of broadcast soccer videos. In this paper, we propose an efficient method for the detection of slow-motion shots produced by high-speed cameras in soccer broadcasts. A rich set of color, motion, and cinematic features is extracted from compressed video by partial decoding of the MPEG-1 bitstream. Then, slow-motion shots are modeled by SVM classifiers for each shot class. A set of six full-match soccer games is used for training and evaluation of the proposed method.
Our algorithm achieves satisfactory accuracy along with high speed for slow-motion detection in soccer videos. Vahid Kiani and Hamid Reza Pourreza Copyright © 2012 Vahid Kiani and Hamid Reza Pourreza. All rights reserved. A Wavelet-Domain Local Dominant Feature Selection Scheme for Face Recognition Sun, 11 Mar 2012 08:29:06 +0000 A multiresolution feature extraction algorithm for face recognition is proposed based on the two-dimensional discrete wavelet transform (2D-DWT), which efficiently exploits the local spatial variations in a face image. For feature extraction, instead of considering the entire face image, an entropy-based local band selection criterion is developed, which selects highly informative horizontal segments from the face image. In order to capture the local spatial variations within these bands precisely, the horizontal band is segmented into several small spatial modules. The effect of modularization in terms of the entropy content of the face images has been investigated. Dominant wavelet coefficients corresponding to each module residing inside those bands are selected as features. A histogram-based threshold criterion is proposed to select dominant coefficients, which drastically reduces the feature dimension and provides high within-class compactness and high between-class separability. The effect of using different mother wavelets for the purpose of feature extraction has also been investigated. PCA is performed to further reduce the dimensionality of the feature space. Extensive experimentation is carried out on standard face databases, and a very high degree of recognition accuracy is achieved by the proposed method in comparison to those obtained by some of the existing methods. Hafiz Imtiaz and Shaikh Anowarul Fattah Copyright © 2012 Hafiz Imtiaz and Shaikh Anowarul Fattah. All rights reserved.
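The entropy-based band selection idea in the face recognition abstract above can be sketched as follows: split the image into horizontal bands, compute the Shannon entropy of each band's intensity histogram, and keep the most informative ones. Band count and bin count are illustrative assumptions.

```python
import numpy as np

def band_entropy(image, n_bands=8, bins=32):
    """Shannon entropy (bits) of each horizontal band of a grayscale
    image with intensities in [0, 1]. High-entropy bands are the
    'informative' segments a selection criterion would keep."""
    bands = np.array_split(image, n_bands, axis=0)
    ents = []
    for b in bands:
        hist, _ = np.histogram(b, bins=bins, range=(0.0, 1.0))
        p = hist / hist.sum()
        p = p[p > 0]                      # ignore empty bins
        ents.append(float(-(p * np.log2(p)).sum()))
    return ents

rng = np.random.default_rng(0)
img = np.full((64, 64), 0.5)              # flat, low-information image
img[24:32] = rng.random((8, 64))          # one textured band
ents = band_entropy(img)
```

The textured band stands out with maximal entropy; in the full scheme, modules inside such bands would then supply the dominant wavelet coefficients.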
An Effective Color Addition to Feature Detection and Description for Book Spine Image Matching Sun, 18 Dec 2011 10:00:56 +0000 The important task of library book inventory, or shelf-reading, requires humans to remove each book from a library shelf, open the front cover, scan a barcode, and reshelve the book. It is a labor-intensive and often error-prone process. Technologies such as 2D barcode scanning or radio frequency identification (RFID) tags have recently been proposed to improve this process. They both incur significant upfront costs and require a large investment of time to fit books with special tags before the system can be productive. A vision-based automation system is proposed to improve this process without those prohibitively high upfront costs. This low-cost shelf-reading system uses a hand-held imaging device such as a smartphone to capture book spine images and a server that processes feature descriptors in these images for book identification. Existing color feature descriptors for feature matching typically use grayscale feature detectors, which omit important color edges. Also, photometric-invariant color feature descriptors require unnecessary computations to provide color descriptor information. This paper presents the development of a simple color enhancement feature descriptor called Color Difference-of-Gaussians SIFT (CDSIFT). CDSIFT is well suited for library inventory process automation, and this paper introduces such a system for this unique application. Spencer G. Fowers and Dah-Jye Lee Copyright © 2012 Spencer G. Fowers and Dah-Jye Lee. All rights reserved. On the Brittleness of Handwritten Digit Recognition Models Wed, 30 Nov 2011 14:51:38 +0000 Handwritten digit recognition is an important benchmark task in computer vision. Learning algorithms and feature representations which offer excellent performance for this task have been known for some time. 
Here, we focus on two major practical considerations: the relationship between the amount of training data and the error rate (corresponding to the effort of collecting training data to build a model with a given maximum error rate) and the transferability of a model's expertise between different datasets (corresponding to its usefulness for general handwritten digit recognition). While the relationship between the amount of training data and the error rate is very stable and to some extent independent of the specific dataset used (only the classifier and feature representation have a significant effect), it has proven impossible to transfer low error rates on one or two pooled datasets to similarly low error rates on another dataset. We call this weakness brittleness, after an old Artificial Intelligence term with the same meaning. It may be a general weakness of trained image classification systems. Alexander K. Seewald Copyright © 2012 Alexander K. Seewald. All rights reserved. Analysis of Facial Images across Age Progression by Humans Tue, 01 Nov 2011 13:16:08 +0000 The appearance of human faces can undergo large variations with aging. The analysis of facial images taken over age progression has recently attracted increasing attention in the computer vision community. Human abilities for such analysis are, however, less studied. In this paper, we conduct a thorough study of human performance on two tasks, face verification and age estimation, for facial images taken at different ages. Detailed and rigorous experimental analysis is provided, which helps in understanding the roles of different factors, including age group, age gap, race, and gender. In addition, our study leads to an interesting observation: for age estimation, photos of adults are more challenging than those of young people. We expect the study to provide a reference for machine-based solutions. 
Jingting Zeng, Haibin Ling, Longin Jan Latecki, Shanon Fitzhugh, and Guodong Guo Copyright © 2012 Jingting Zeng et al. All rights reserved. An Artificial Cellular Convolution Architecture for Real-Time Image Processing Tue, 01 Nov 2011 13:11:37 +0000 An artificial cell comprises the most basic elements of a hierarchical system: it has minimal functionality but is general enough to obey the rules of “artificial life.” The abilities to replicate, to organize into hierarchies, and to generalize within an environment are among the properties of an artificial cell. We present a hardware artificial cell with the properties of generalization, self-organization, and reproducibility. The cells are used in a parallel hardware architecture that implements the real-time 2D image convolution operation. The proposed hardware design is implemented on an FPGA and tested on images. We report improved processing speeds and demonstrate the design's usefulness in an image filtering application. H. Mahrous and A. P. James Copyright © 2012 H. Mahrous and A. P. James. All rights reserved. Vessel Extraction of Conjunctival Images Using LBPs and ANFIS Thu, 20 Oct 2011 09:19:24 +0000 The main goal of medical imaging applications is to diagnose diseases, prevent their progression, and ultimately cure patients. The number of people suffering from diabetes has grown rapidly in recent years in many countries, and early diagnosis is needed to prevent subsequent complications such as blindness. One of the first ways to detect this disease is the analysis of vessels in parts of the eye such as the retina and conjunctiva. Studies of the effects of conjunctival vessel changes in diabetes have shown that conjunctival vessel extraction and analysis is an effective approach for this purpose. In this paper, we propose a method to detect and extract the vessels of the conjunctiva automatically. 
This is the first stage of the diabetes diagnosis process. We first extract textural features from each pixel of the conjunctival image using local binary patterns (LBPs) and then classify each pixel as vessel or nonvessel from its feature vector using a supervised classifier, the adaptive neuro-fuzzy inference system (ANFIS). We tested the proposed algorithm on 40 conjunctival images to demonstrate the performance and efficiency of our method. Seyed Mohsen Zabihi, Hamid Reza Pourreza, and Touka Banaee Copyright © 2012 Seyed Mohsen Zabihi et al. All rights reserved.
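As a rough illustration of the per-pixel texture features used in the vessel extraction entry above, the basic 8-neighbour LBP code can be computed as below. This is a generic LBP sketch under our own assumptions (the function name is ours, and uniform patterns, multiple radii, and the ANFIS classifier itself are beyond it), not the authors' implementation.

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour local binary pattern: each interior pixel gets an
    8-bit code, one bit per neighbour whose value is >= the centre pixel."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]   # clockwise ring
    H, W = img.shape
    centre = img[1:H - 1, 1:W - 1]
    codes = np.zeros((H - 2, W - 2), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:H - 1 + dy, 1 + dx:W - 1 + dx]
        codes |= (neighbour >= centre).astype(np.uint8) << bit
    return codes
```

A per-pixel feature vector can then be formed, for example, from a histogram of these codes over a local window before being passed to the supervised classifier.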