Abstract

Most existing license plate (LP) detection systems have advanced considerably in image processing, but they remain restricted with respect to environmental conditions and plate variations. With increased mobility and internationalization, there is a need for a universal LP detection system that can handle multiple LPs from many countries and any vehicle, in an open environment and all weather conditions, with different plate variations. This paper presents a novel LP detection method using different clustering techniques based on geometrical properties of the LP characters and proposes a new character extraction method for LP character components that are noisy or missed because of noise between the LP characters and the LP border. Because it relies on the geometrical properties of the set of characters in the LP, the proposed method detects multiple LPs from an input image or video, with different plate variations, under different environmental and weather conditions. The proposed method is tested on the standard media-lab and Application Oriented License Plate (AOLP) benchmark LP recognition databases and achieves success rates of 97.3% and 93.7%, respectively. The results indicate that the proposed approach is comparable to previously published methods that were evaluated on publicly available benchmark LP databases.

1. Introduction

License plate recognition (LPR) systems play a key role in intelligent transportation systems, such as traffic control, parking lot access control, electronic toll collection, and information management. A typical LPR system contains four processing steps. The first step is to acquire the image or video from the camera. The second step is to detect the LP in the input image. The third step is to extract the characters from the LP, and the final step is to recognize the extracted characters using different classifiers. These four steps can be achieved by combining different image processing and pattern recognition techniques. Of these four steps, LP detection and character recognition are the most crucial for the success of LPR systems.

LP detection systems have developed significantly for more than a decade, with good reported performance, but most of these systems were evaluated on proprietary data sets captured under controlled environmental conditions and plate variations. To assess the performance of LP detection methods, there is a need for a common, publicly available benchmark LP data set containing videos and images taken in an open environment and with different plate variations. Such a data set for performance evaluation of LPR systems was initiated by Anagnostopoulos et al. in paper [1]; it contains 741 still images of Greek LPs under several open environmental conditions and plate variations and is available at [2]. To evaluate the proposed approach, we used the 741 still images of the media-lab Greek LP database and 159 Indian and Israeli LPs from videos and still images. Because the media-lab Greek LP database lacks motorcycles, vehicles with rotated LPs, combinations of different vehicle types, and images with more than one motorcycle, the Indian and Israeli LP images were selected with care to cover the plate variations missing from the media-lab Greek LP database.

In this paper, we propose a new approach for finding the LP/LPs in an image using various clustering techniques on geometrical properties of the LP characters, and a new approach for finding and extracting LP characters that are noisy or missed because of noise such as dirt, screws, or stamps between the LP characters and the LP border. The proposed clustering techniques use geometrical properties of the components of LP characters, such as the distance between the components, the angle between the components, and the height of the components, to find the probable LP/LPs. This is the first time that different clustering techniques have been applied to geometrical properties of the components of an input image to find the probable LP/LPs. The proposed geometry-based clustering method is scale and rotation invariant and is suitable for LP detection in many countries, for any type of vehicle or motorcycle with different plate variations.

On publicly available benchmark LP databases, the proposed LP detection method compares favourably with competing LP detection methods from the literature. It is not appropriate to declare which methods are better in general, because most previously published methods were evaluated on proprietary data sets with restricted conditions, and those data sets were not released to the public for independent assessment. In this paper, we propose new methods for LP detection, noisy/missed character extraction, and LP character rotation correction. The contributions of this paper are as follows:
(i) A new method for LP detection, using distance-based, line-based, and height-based clustering techniques on geometrical properties of the LP components.
(ii) A new method to remove unwanted clustered components, using the thinning and resizing technique.
(iii) A new method for correcting the rotation of the probable LP cluster components, using the average angle between the $x$-axis and the lines joining the left-top coordinates of successive probable LP cluster components.
(iv) A new method to extract the characters of the probable LP that are noisy or missed because of noise between the LP characters and the LP border.

The remaining sections of this paper are organized as follows. Section 2 reviews the existing similar research. Section 3 describes the proposed approach for multiple LPs detection. Section 4 elaborates on the proposed methodology for multiple LPs detection. Section 5 describes the extraction of characters that are noisy or missed because of noise between the LP characters and the LP border; it also describes a method for rotating the probable LP characters if the vehicle LP is rotated. Experimental results are discussed in Section 6, and Section 7 concludes the paper.

2. Existing Similar Research

Many LPR algorithms have been proposed in the literature over the past ten years, yet LP detection remains a challenging area because of varying environmental conditions and plate variations. The literature offers no LP detection method that works for many countries and for all types of vehicles and motorcycles without constraints. LP detection is a challenging and crucial part of LPR systems, and it influences the recognition rate. Most existing LP detection methods in the literature are based on edge information, morphological operations, template matching, or color information of the LP.

Lee et al. in paper [3] proposed a color image processing (CIP) method to extract the LPs of Korean cars, based on the colors of the LP background and the LP characters, using the color histogram. A Neural Network (NN) classifier is used to classify colors. The method uses the aspect ratio of the LP region to select the most probable LP and reported a 91.25% success rate for LP detection over 80 car images. The drawback of paper [3] is that LP detection does not work properly if the vehicle color matches either the background color of the LP or the color of the LP characters. A hybrid LP localization scheme based on edge statistics and morphology (ESM) is presented by Bai and Liu in paper [4]. Their approach has four sections: Section 1 handles vertical edge detection, Section 2 performs the edge statistical analysis, Section 3 performs hierarchical-based LP location, and Section 4 performs morphology-based LP extraction. The paper reported a 99.6% overall success rate for detecting the LP over 9825 images. The drawback of paper [4] is that it relies on edge information and morphology-based approaches to detect the LP. Some LPs are not easy to detect using edge information, and morphology-based approaches must define a Structuring Element (SE) to perform the morphological operations that find the probable LP/LPs in an input image. Defining a particular SE is a nongeneric approach and fails to detect the LP/LPs when their characteristics vary across the input images.

Yang et al. in paper [5] proposed a new method based on fixed color collocation (FCC) to locate the LP. This method used the color collocation of the plate’s background and characters, to recognize the LPs. This paper reported 95% success rate for LP detection. The drawback of paper [5] is also similar to the drawback of paper [3]. Anagnostopoulos et al. in paper [6] proposed a new image segmentation method called sliding concentric windows (SCW) for LP detection. The SCW method works based on local irregularities in the image. The method uses statistics such as standard deviation and mean value, for possible LP location. SCW uses two concentric windows A and B with different sizes, to scan the image from left to right and top to bottom to find the mean and standard deviation of the regions of the concentric windows. If the ratio of the statistical measurements exceeds a threshold value set by the user, then the central pixel of the two concentric windows is considered to be the part of LP. This paper reported a success rate of 96.5% for LP detection using media-lab proprietary LP data set. The limitation of paper [6] is that the statistical measurement threshold value set by the user has to be decided according to the application after a trial-and-error procedure, which is not a generic solution.

Faradji et al. in paper [7] proposed a real-time and robust (RTR) method to find the LP location. The method has several stages that combine a Sobel mask, histogram analysis, and morphological operations. The overall success rate reported for detecting the LP was 83.5%. The limitation of paper [7] is the same as that of paper [4]. Huang et al. in paper [8] proposed an LPR strategy for motorcycles for checking annual inspection status. The LP data set considered in that paper contains only motorcycles whose LP characters fall on a single line. The method finds the LP using a search window with the help of horizontal and vertical projections (SWHVP) and reported an average LP detection rate of 97.55%. The drawback of paper [8] is that it does not state how the initial size of the search window for the horizontal and vertical projections is obtained. It also uses a morphology-based dilation operation, which is not a generic solution for detecting the LP/LPs, as explained in the drawback of paper [4]. Wen et al. in paper [9] proposed two methods (two passes) to find the LP in an input image. Both methods are based on the Connected Component Analysis (CCA) model. Before these methods are applied, the input image is binarized using an improved Bernsen algorithm to remove shadows and uneven illumination. Method 1 finds the candidate regions based on prior knowledge of the LP; the frame is detected using the CCA methodology, and if the frame is broken, the LP cannot be detected correctly. When the LP is not detected using Method 1, Method 2 is adopted. Method 2 extracts the LP using a large numeral extraction technique. The paper reported a success rate of 97.16%. The drawback of paper [9] is that Method 1 fails to detect the LP/LPs if the LP frame is broken, and Method 2 fails to detect the LP/LPs if they are not in a horizontal position.

Haneda and Hanaizumi in paper [10] proposed the RELIP algorithm, which performs a global search for the probable LP using multiple templates, a 3D cross-correlation function, and Principal Component Expansion. The method uses corner detection to remove deformation of LPs. RELIP reported a 97% LP detection success rate. The drawback of paper [10] is that it relies on spatial similarity with an LP template to detect the LP, which is not a generic solution in a real-world context, because LPs seen from various viewpoints show deformations such as tilt, rotation, and pan. Zhou et al. in paper [11] proposed a new approach for LP detection based on Principal Visual Word (PVW) discovery and visual word matching. In visual word matching, the extracted SIFT features of the test image are compared with all discovered PVWs, and the LP is located based on the matching results. The method reported a 93.2% success rate on a proprietary data set and an 84.8% success rate on the Caltech data set. The drawback of paper [11] is that it depends on prior knowledge of the LP.

Al-Ghaili et al. in paper [12] proposed a new approach in which a color image is converted to grayscale and then the adaptive threshold is applied on the grayscale image to convert it into a binary image. ULEA method is applied to the grayscale image to enhance the quality, by removing the noise. Next, VEDA is applied to detect the LP from the input image. In order to detect the true LP, some statistical and logical operations are applied. The success rate reported by this paper was 91.65% for LP detection. The drawback of paper [12] is that it extracts the LP/LPs from the input image based on extracting vertical edges, which is the same as the drawback of paper [4]. Hsu et al. in paper [13] proposed a new approach (AOLP) for detecting LP candidates, using Expectation-Maximization clustering method on vertical edges of grayscale images. This paper reported 93.33% success rate on AOLP proprietary benchmark LP data set and reported 92.1% success rate on media-lab benchmark LP data set, for LP detection. It is mentioned in the paper that the LPR solution is designed primarily based on LPs of Taiwan and is not optimal for other countries. The drawback of paper [13] is the same as that in paper [4] because its LP detection is based on extracting vertical edges.

Abo Samra and Khalefah in paper [14] proposed a new LP localization algorithm using dynamic image processing techniques and genetic algorithms (GA) (DIP-GA). The CCA technique is used to detect the candidate objects of the input image, and it is improved with the help of a modified GA. The system is made adaptable to any country by introducing a scale-invariant geometric relationship matrix to model the LP symbols. The speed of LP detection is improved by introducing two new crossover operators. The system reported a 97.61% success rate on the publicly available media-lab benchmark LP database, considering only 335 of its 741 images. The paper also reported a 98.75% success rate on a proprietary data set of 800 image samples and a 98.41% overall accuracy. The drawback of paper [14] is that it is not able to detect multiple LPs in an image.

The LP detection methods that use edge information and morphological operations mainly focus on finding components that are rectangular in shape with a specific aspect ratio. Such methods fail to identify the LP if it does not follow a rectangular shape with the expected aspect ratio. The problem with template matching LP detection methods is that they do not work for all types of plate variations. The color-based LP extraction methods use the background color of the LP to identify the probable LP candidate region, because some countries use a particular background color in their LPs. Such LPR systems fail to detect the LP properly if the body color of the vehicle matches the LP background color. All of the above categories use features of the LP in an image to extract the LP location from the input image, and the features they adopt limit their ability to extract the LP/LPs.

In this paper, we propose for the first time an LP detection method that works for the LPs of many countries, whatever their shape, by applying different clustering techniques to geometrical properties of the character components of the LP/LPs in the input image. The advantage of applying clustering techniques to geometrical properties of the character components of the LP is that they are independent of scale, rotation, tilt, and orientation. Very few publications in the literature address LP detection under the various environmental conditions and plate variations mentioned above in a way that works for any type of vehicle or motorcycle with any LP shape.

3. Proposed Approach for Multiple License Plates Detection

This paper proposes a new method for LP/LPs detection and noisy characters extraction, for any type of vehicle or motorcycle, with different plate variations, under different environmental and weather conditions. The environmental conditions include different illumination, weather, and background conditions. The plate variations include a plate located anywhere on the vehicle, many plates in a single image, combinations of vehicles with different plate orientations, different plate sizes, different plate background colors, plates with dirt, rotated plates, tilted LPs, and LPs with two lines of characters of different sizes. The proposed method can be regarded as a generalized method for identifying the LP/LPs, because it is independent of plate variations under different environmental and weather conditions and is applicable to many countries and to any vehicle, including those with multiple lines in the LP. The proposed approach does not use edge detection, template matching, morphological operations, or color information of the LPs, all of which are used extensively in previously published papers to detect the LP.

The proposed method uses CCA to label the components and applies different clustering techniques to geometrical properties of the labelled components, such as the location of the components, the angle between the components, and the height of the components, to extract the probable LP/LPs. In most countries, the LP characters are close to each other, are positioned along one or multiple lines, and are similar in height. Based on these properties, clustering techniques can be applied to the geometrical properties of LP characters to identify the LP/LPs in an input image. Because many nations follow most of these properties when designing their LPs, the proposed approach is applicable to the LPs of those nations whose designs follow the character properties described in this paper. The proposed method contains the following steps.
(i) Apply preprocessing steps to the input image. If the input is a video, split it into frames and then apply the preprocessing steps to each frame. After preprocessing, label the components of the input image to find the number of components and to extract the geometrical properties of each component, such as left-top coordinates, width, and height.
(ii) Apply the newly proposed distance-based clustering algorithm to the left-top coordinates of the components. This algorithm divides the image components into distance-based clusters whose members are close to each other. Note the number of distance-based clusters formed from the components of the preprocessed input image.
(iii) Recluster each distance-based cluster into line-based clusters. If the lines joining the left-top coordinates of components in a distance-based cluster subtend a similar angle with the $x$-axis, construct a line-based cluster from those components. The line-based clustering algorithm thus reclusters the distance-based cluster components into line-based clusters whose members are in line and close to each other.
(iv) Recluster each line-based cluster based on the height of its components. This is height-based clustering, which reclusters line-based clusters into height-based clusters.
(v) After the distance-based, line-based, and height-based clustering techniques, the components of the resultant clusters are near to each other, positioned along a line, and similar in height. These properties belong to the LP characters of any vehicle or motorcycle in the world.
(vi) The resultant clusters may include a few non-LP clusters in which all the components follow the properties of LP characters. To remove such non-LP clusters, apply the thinning and resizing technique to the height-based clusters. The clusters that remain are the probable LP clusters in the image.
(vii) Refine the clusters further by finding the border of the cluster components. If the border percentage of a cluster is less than a predefined threshold, remove that cluster from the list of probable LP clusters. The clusters that remain after this step are the probable LPs.
(viii) Apply the newly proposed character extraction method to each of the remaining probable LPs, to extract the characters that are noisy or missed because of noise such as screws, dirt, or stamps between the LP characters and the border of the LP.
(ix) After extracting the noisy/missed characters, rotate the probable LP characters using the average angle between the $x$-axis and the lines joining the left-top coordinates of adjacent components, so that all the probable LP characters lie parallel to the $x$-axis.

The procedures outlined above are explained in detail in the following sections.

4. Detailed Description of the Proposed Approach for Multiple License Plates Detection from Videos and Still Images

This section describes in detail the proposed approach to find the probable LP/LPs in an image with the help of various clustering, thinning, and resizing techniques. This section also explains the method and the need to further refine the probable LP/LPs by finding the border of the LP/LPs.

4.1. Preprocessing Stage

Steps
(1) Because of uneven illumination in an open environment and the presence of shadows, traditional threshold binarization methods do not give satisfactory results on the input image. In this paper, we use the Bernsen algorithm to overcome the illumination and shadow problems in an image. Let $f(x, y)$ denote the gray value at a point $(x, y)$ of an image. Let $(x, y)$ be the centre of a block of size $(2w + 1) \times (2w + 1)$ in the image, where $w$ is the number of pixels on each side of the centre. The threshold $T(x, y)$ at the point $(x, y)$ can be computed using
$$T(x, y) = \frac{1}{2}\left(\max_{-w \le k,\, l \le w} f(x + k, y + l) + \min_{-w \le k,\, l \le w} f(x + k, y + l)\right). \tag{1}$$
(2) Convert the input image 1 (shown in Figure 1) into grayscale. If the input is a video, convert it into frames and then convert each frame into grayscale. Apply the Bernsen algorithm to overcome the uneven illumination or shadows present in the grayscale image. Complement the binarized image; the output is shown in the second column of Figure 1. Remove from the complemented binary image the components whose height is less than three pixels, because no LP character is less than three pixels in height. The third column of Figure 1 shows the output after removing these components.
(3) To carry out the remaining operations on individual components, such as distance-based clustering, line-based clustering, and height-based clustering, extract the geometrical properties of the individual components, such as left-top coordinates, width, and height.
(4) The geometrical properties of the individual components described in Step 3 can be extracted using the following procedure:
(a) To find the number of components present in the image, use the CCA method to label the preprocessed image. The number of labels gives the number of components present in the input image at this stage.
(b) After labelling, find the left-top coordinates, width, and height of the components.
(c) Crop each individual component from the preprocessed image using its left-top coordinates, width, and height.
(d) Save the cropped components of the input image as separate image components.
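The preprocessing steps above can be illustrated with a minimal sketch. It is written in plain Python on a list-of-rows image so that it stays self-contained, and it is not the authors' implementation. The function names `bernsen_binarize` and `label_components`, the default block half-width `w=1`, and the use of 8-connectivity are assumptions made for illustration; the sketch also uses only the midrange threshold of the Bernsen algorithm and folds the complement step into the binarization.

```python
from collections import deque


def bernsen_binarize(gray, w=1):
    """Binarize with the midrange threshold T = (local max + local min) / 2
    over a (2w+1) x (2w+1) block, then complement so that pixels darker
    than the threshold become foreground (1)."""
    rows, cols = len(gray), len(gray[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            block = [gray[j][i]
                     for j in range(max(0, y - w), min(rows, y + w + 1))
                     for i in range(max(0, x - w), min(cols, x + w + 1))]
            threshold = (max(block) + min(block)) / 2
            out[y][x] = 1 if gray[y][x] < threshold else 0
    return out


def label_components(binary, min_height=3):
    """Connected component analysis (8-connectivity, an assumption).
    Returns (left, top, width, height) per component, dropping
    components shorter than min_height pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            top, bottom, left, right = y, y, x, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            height = bottom - top + 1
            if height >= min_height:
                boxes.append((left, top, right - left + 1, height))
    return boxes


# A bright 7 x 9 image with a dark 4-pixel-tall bar and a dark single-pixel speck.
gray = [[200] * 9 for _ in range(7)]
for row in range(1, 5):
    gray[row][2] = 50
gray[5][6] = 50
print(label_components(bernsen_binarize(gray)))
```

In the toy image, the speck is binarized as foreground but discarded by the three-pixel height filter, so only the bar's bounding box is reported. A practical implementation would more likely use an array library for the local maximum and minimum filters.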

4.2. Clustering Stage

The purpose of the clustering stage is to prepare the probable LP clusters from components of the preprocessed input image. In this stage, the proposed system performs three types of clustering techniques, one after the other. The first clustering technique is the distance-based clustering, whose purpose is to divide components of the preprocessed input image into groups which are near to each other. The second clustering technique is the line-based clustering, whose significance is to divide each distance-based cluster into an array of line-based clusters. The components of the line-based clusters are in line and close to each other from the viewpoint of left-top coordinates of the components. The third clustering technique is the height-based clustering, whose significance is to regroup the line-based cluster components which are similar in their height. After all these clustering techniques, the resultant cluster components in each cluster are close to each other, positioned in a line and alike in their heights which are the probable LP/LPs of the input image.

The context-dependent variables are cluster_size and max_distance. The variable cluster_size indicates the minimum number of characters in a row of the LP and can be tuned to satisfy country-specific LP constraints. In our experiments, cluster_size is set to 4. The variable max_distance is used during the various clustering stages and specifies the maximum distance allowed between successive components of an LP cluster. Its value is computed as one-third of the number of columns of the input image. The proposed geometry-based clustering method has the following steps.

Steps
(1) After the preprocessing stage, cluster all the components of the image based on the distance between the left-top coordinates of the individual components, using the distance-based clustering algorithm explained in the following steps.
(2) The distance-based clustering algorithm prepares a square matrix (the distance matrix) with one row and one column for each component of the input image after the preprocessing stage. In the distance matrix, the 1st row holds the distances between the left-top coordinates of the 1st labelled component and those of the rest of the components, the 2nd row holds the distances between the 2nd labelled component and the rest of the components, and so on. With the help of the distance matrix, the algorithm prepares at most one cluster per component.
(3) Remove those distance-based clusters whose size is less than cluster_size and those which are subsets of other distance-based clusters. When performing the subset removal, retain the clusters with the larger number of components. At this stage, the components of the image are clustered into groups based on the distance between the left-top coordinates of the individual components.
(4) Figure 2 shows an example image whose components are clustered into two distance-based clusters, each highlighted with a red rectangular border.
(5) Figure 3 shows the resultant image after distance-based clustering; the distance-based clusters obtained at this stage are shown as ellipses. Because most of the components of the input image are very small, the components forming the distance-based clusters cannot be seen with the naked eye in the input image. Hence, the clustering is illustrated with the example in Figure 2.
(6) Now apply the line-based clustering technique to each distance-based cluster, to find those components which are in line with each other.
(7) In line-based clustering, consider each distance-based cluster in turn, take each component from the cluster, and draw a line from the left-top coordinates of that component to the left-top coordinates of the next component in the current cluster. Find the angle between the $x$-axis and the line drawn between the two left-top coordinates, as shown in (2). In the same way, find the angles between the $x$-axis and the lines to the rest of the components, and collect them in a square matrix (the angle matrix) with one row and one column for each component of the distance-based cluster:
$$\theta = \tan^{-1}\left(\frac{y_2 - y_1}{x_2 - x_1}\right), \tag{2}$$
where $(x_1, y_1)$ and $(x_2, y_2)$ are the left-top coordinates of the two components.
(8) For each row of the angle matrix, cluster as a line-based cluster those components which subtend a similar angle with the $x$-axis and which are close to each other (based on max_distance). Now remove those line-based clusters which are smaller than cluster_size and those which are subsets of other line-based clusters. This is the line-based clustering technique.
(9) Line-based clustering reclusters the distance-based cluster components according to the similarity of their angles and their closeness. After this stage, the components of an input image have been clustered using the distance-based and line-based clustering techniques.
(10) Figure 4 shows an example image in which the distance-based clusters (shown with red borders) are divided into line-based clusters (shown with green borders). In Figure 4, the first distance-based cluster is divided into two line-based clusters, indicated by green rectangular boxes. The second distance-based cluster, with the components C10, C11, C12, and C13, yields a line-based cluster with the components C10, C11, and C13; the component C12 is removed from the resultant list because it is not in line with the other components. The resultant image after line-based clustering is shown in Figure 5, in which the line-based clusters are marked with rectangular boxes.
(11) Figures 3 and 5 do not differ as far as the number and positions of the components are concerned. The resultant distance-based clusters are shown as ellipses of various colors in Figure 3, and the resultant line-based clusters are shown as rectangles of various colors in Figure 5.
(12) After line-based clustering, apply the height-based clustering technique to remove the few unwanted/junk components which are very close to and in line with the other components but differ considerably in height from the other components of their line-based cluster. Now remove those clusters which are smaller than cluster_size and those which are subsets of other clusters. This is height-based clustering.
(13) Figure 6 shows an example image in which the component C3 (indicated with the arrow) differs considerably in height from the rest of the components; it is removed from the line-based cluster, which results in a height-based cluster with the components C1, C2, C4, and C5. Figure 7 shows the image after height-based clustering, with the resultant height-based clusters shown as ellipses.
(14) After height-based clustering, a few clusters in the final cluster list may consist entirely of unwanted components that nevertheless satisfy all the criteria of distance-based, line-based, and height-based clustering. Such junk component clusters can be removed by the thinning and resizing technique.
(15) Apply the infinite thinning and resizing technique to each height-based cluster of the image. Thinning is a morphological operation used for skeletonization of the binary image components. When the thinning and resizing operations are applied, junk components retain their shape, but the LP characters fade away completely. Remove from the cluster list those clusters which retain their shape after the thinning and resizing operations. This technique removes all junk cluster components from the final cluster list. Figure 8 shows the image after the thinning and resizing technique, with the remaining clusters shown as ellipses.
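The three clustering stages in the steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: each component is a `(left, top, width, height)` tuple, a distance-based cluster is taken to be every component within max_distance of a given component (one row of the distance matrix), and the angle tolerance `angle_tol`, the relative height tolerance `height_tol`, and the use of the median height as the reference are hypothetical choices, since the text does not state how "similar" is measured. The max_distance check inside line-based clustering and the thinning and resizing step are omitted for brevity.

```python
import math


def prune(clusters, cluster_size):
    """Drop clusters that are too small or are proper subsets of another cluster."""
    unique = {frozenset(c) for c in clusters if len(c) >= cluster_size}
    kept = [c for c in unique if not any(c < other for other in unique)]
    return sorted(sorted(c) for c in kept)


def distance_clusters(boxes, max_distance, cluster_size):
    """One candidate cluster per component: all components whose left-top
    corner lies within max_distance of that component's left-top corner."""
    clusters = []
    for xi, yi, _, _ in boxes:
        clusters.append([j for j, (xj, yj, _, _) in enumerate(boxes)
                         if math.hypot(xj - xi, yj - yi) <= max_distance])
    return prune(clusters, cluster_size)


def line_clusters(boxes, cluster, cluster_size, angle_tol=5.0):
    """Group members whose joining lines subtend a similar angle with the x-axis."""
    clusters = []
    for i in cluster:
        xi, yi = boxes[i][0], boxes[i][1]
        # One row of the angle matrix, with angles folded into [0, 180) degrees.
        angles = {j: math.degrees(math.atan2(boxes[j][1] - yi, boxes[j][0] - xi)) % 180.0
                  for j in cluster if j != i}
        for a in angles.values():
            group = [i]
            for k, b in angles.items():
                diff = abs(a - b)
                if min(diff, 180.0 - diff) <= angle_tol:
                    group.append(k)
            clusters.append(group)
    return prune(clusters, cluster_size)


def height_clusters(boxes, cluster, cluster_size, height_tol=0.2):
    """Keep members whose height is close to the cluster's median height."""
    heights = sorted(boxes[i][3] for i in cluster)
    median = heights[len(heights) // 2]
    kept = [i for i in cluster if abs(boxes[i][3] - median) <= height_tol * median]
    return kept if len(kept) >= cluster_size else []


def probable_plates(boxes, image_width, cluster_size=4):
    """Run distance-based, line-based, then height-based clustering."""
    max_distance = image_width / 3  # one-third of the image columns, as in the text
    result = []
    for dc in distance_clusters(boxes, max_distance, cluster_size):
        for lc in line_clusters(boxes, dc, cluster_size):
            hc = height_clusters(boxes, lc, cluster_size)
            if hc:
                result.append(hc)
    return prune(result, cluster_size)


# Five in-line components (index 2 is much taller), one off-line, one far away.
boxes = [(10, 50, 8, 20), (22, 50, 8, 20), (34, 50, 8, 45), (46, 50, 8, 20),
         (58, 50, 8, 20), (40, 10, 8, 20), (280, 200, 8, 20)]
print(probable_plates(boxes, image_width=300))
```

In this toy input, the far-away component is dropped by distance-based clustering, the off-line component by line-based clustering, and the tall component by height-based clustering, which mirrors the roles the three stages play in Figures 2, 4, and 6.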

4.3. Finding Border of the License Plate

In contrast to a typical LPR system, the proposed system first finds the probable LP characters and then finds the border around those components. The border is needed because an image can contain groups of non-LP characters that share the properties of LP characters but have no border around them. To reject such characters, the system looks for the border of the LP.

Predefined Border Percentage: Beta. Beta is a user-defined variable that sets the minimum border percentage a cluster must have to be accepted as an LP. After rigorous experimentation on many data sets from multiple countries, we set the value of Beta to 70%.

Steps
(1) Consider the probable LP clusters available at this stage. Take each individual component of a cluster and traverse upward from the left-top coordinates of the component until the traversal reaches a border point or has covered three times the height of the component.
(2) Apply the same procedure in the downward direction to find the bottom border point for each individual component.
(3) Save the top and bottom border points for all the components of each cluster.
(4) Find the border percentage for each cluster using the top and bottom border points. Retain a cluster only when its percentage is greater than or equal to the predefined border percentage “Beta.” Retained clusters indicate the LP/LPs of the image, and their components indicate the individual characters of the LP.
(5) Remove those clusters that fall below the predefined border percentage “Beta.” This further refines the cluster list to obtain the required LP region. Figure 9 shows the image after LP border identification.
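The border test above can be sketched in Python as follows. Several details are assumptions made for illustration, since the text does not fix them: the image is a binary grid with 1 for foreground, each component is a bounding box `(x, y, w, h)`, the downward traversal starts just below the box, and the border percentage is taken as the share of components for which both a top and a bottom border point are found.

```python
BETA = 70.0  # predefined border percentage stated in the text

def find_border_point(image, x, y_start, step, max_steps):
    """Walk vertically from (x, y_start); return the row of the first
    foreground pixel, or None if none is met within max_steps rows."""
    y = y_start
    for _ in range(max_steps):
        if y < 0 or y >= len(image):
            return None
        if image[y][x] == 1:
            return y
        y += step
    return None

def border_percentage(image, cluster):
    """Percentage of components with both a top and a bottom border point
    (an assumed definition; the paper does not spell out the formula)."""
    hits = 0
    for (x, y, w, h) in cluster:
        top = find_border_point(image, x, y - 1, -1, 3 * h)    # upward
        bottom = find_border_point(image, x, y + h, +1, 3 * h)  # downward
        if top is not None and bottom is not None:
            hits += 1
    return 100.0 * hits / len(cluster)

def keep_cluster(image, cluster, beta=BETA):
    """Retain the cluster only if its border percentage reaches beta."""
    return border_percentage(image, cluster) >= beta

# Toy image: border lines on rows 1 and 10 covering columns 0..14 only.
image = [[0] * 20 for _ in range(12)]
for col in range(15):
    image[1][col] = 1
    image[10][col] = 1
# Four hypothetical character boxes; the last one lies outside the border.
cluster = [(2, 4, 2, 4), (6, 4, 2, 4), (10, 4, 2, 4), (16, 4, 2, 4)]
pct = border_percentage(image, cluster)
```

In this toy case three of the four components find both border points, so the cluster passes the 70% threshold, whereas a cluster made of only the last two boxes would be rejected.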

5. Noisy Characters Extraction and License Plate Characters Rotation

The noisy/missed characters extraction stage recovers the few LP characters that may have been missed during the previous stages because of noise such as a screw, dirt, or a stamp between the LP border and the LP characters. This section proposes a new approach to extract such noisy/missed characters. The proposed noisy/missed characters extraction algorithm is applied after the border-finding stage of Section 4.3.

Steps
(1) The proposed algorithm uses (3) to find the left-top coordinates of a noisy LP character at a given distance from the left-top coordinates of the leftmost cluster component, along the average slope amongst the cluster components.
(2) At this stage, a few probable LP clusters remain. For each probable LP cluster, find the average height and the average slope between the left-top coordinates of subsequent cluster components and the x-axis.
(3) From each probable LP cluster, take the left-top coordinates of the first component and move downward by one-fourth of the average height of the probable LP cluster. Then traverse the image to the right, one pixel at a time, using (3), to find any noisy/missed LP character components.
(4) If a pixel is found that is neither part of a probable LP cluster component nor a background pixel, it belongs to a noisy/missed LP character component; find the left-top coordinates and the width of that component. This requires three traversals, as follows:
(i) The first traversal moves towards the top of the noisy/missed character component to align the slope of its left-top coordinates with the average slope. The second traversal moves towards the left of the missed component in the direction of the average slope, within the average height, using (3), until it reaches the leftmost pixel of the noisy/missed component. That point is fixed as the left-top coordinates of the noisy/missed component.
(ii) From the newly found left-top coordinates of the noisy/missed LP character component, move towards the right (third traversal) in the direction of the average slope, within the average height, using (3), until the rightmost point of the missed component is reached. The difference between the x-coordinates of the newly found left-top point and the rightmost point is the width of the noisy/missed component.
(iii) Crop the noisy/missed component using its left-top coordinates, its width, and the average height of the cluster components.
(5) Repeat the same procedure until the traversal reaches the last component of the probable LP cluster. Repeat the same procedure for all probable LP clusters.
(6) Rotate each individual component of the probable LP cluster by the average angle, which is calculated from the average slope amongst the probable LP cluster components using (4). Each component of the binary image is defined as shown in (5).
(7) Consider the image component before rotation and the image component after rotation. The average angle is used to rotate the component, and the position of each individual pixel of the rotated component is obtained by rotating the corresponding pixel of the original component through the average angle.
(8) In a rotated image, if the average angle of the probable cluster components is above a certain threshold, part of a neighboring character component may be present in the target component. In such a case, retain the biggest component in the target component and remove the rest.
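The slope-guided traversal and the rotation step can be sketched in Python. The equations referenced as (3) and (4) are not reproduced in this text, so the sketch assumes standard forms: a straight line y = y0 + m(x - x0) for the traversal, an angle of arctan(m) for the average slope m, and an ordinary 2D rotation about a chosen center. All function names and the toy data are hypothetical.

```python
import math

def average_slope(points):
    """Mean slope between left-top coordinates of consecutive components."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in zip(points, points[1:]) if x2 != x1]
    return sum(slopes) / len(slopes)

def scan_along_slope(image, start, slope, x_end, known_boxes):
    """Walk right from start, one pixel at a time, along the average slope.
    Return the first foreground pixel outside all known component boxes."""
    x0, y0 = start
    for x in range(x0, x_end + 1):
        y = int(round(y0 + slope * (x - x0)))
        if not (0 <= y < len(image) and 0 <= x < len(image[0])):
            break
        in_known = any(bx <= x < bx + bw and by <= y < by + bh
                       for bx, by, bw, bh in known_boxes)
        if image[y][x] == 1 and not in_known:
            return (x, y)
    return None

def rotate_points(pixels, theta, center):
    """Rotate pixel coordinates by theta radians about center."""
    cx, cy = center
    c, s = math.cos(theta), math.sin(theta)
    return [(cx + (x - cx) * c - (y - cy) * s,
             cy + (x - cx) * s + (y - cy) * c) for x, y in pixels]

# Toy plate: two detected characters and one missed character in between.
image = [[0] * 30 for _ in range(10)]
known = [(2, 2, 3, 6), (20, 2, 3, 6)]
missed = (10, 2, 3, 6)
for bx, by, bw, bh in known + [missed]:
    for row in range(by, by + bh):
        for col in range(bx, bx + bw):
            image[row][col] = 1
avg_height = 6
start = (known[0][0], known[0][1] + avg_height // 4)  # quarter height down
found = scan_along_slope(image, start, 0.0, 22, known)

# Deskew: left-top points lying on a line of slope 0.5 become horizontal.
tops = [(0, 0), (10, 5), (20, 10)]
theta = math.atan(average_slope(tops))
deskewed = rotate_points(tops, -theta, (0, 0))
```

Here the scan reports the first pixel of the missed character, and rotating by the negative of the average angle brings the three left-top points onto a horizontal line. The cropping and the three-traversal refinement of step (4) are omitted for brevity.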

6. Experimental Results

The concepts described above are implemented in MATLAB on a machine with an Intel Core i3 processor and 4 GB of RAM. The performance of the proposed LP detection method is compared with that of several competitive LP detection methods using the publicly available media-lab benchmark LP database, Israeli LP images from the web, and proprietary Indian LP images, covering different plate variations and weather conditions in an open environment. These data sets contain 900 images in total. To further assess the performance of the proposed LP detection method, we have also used the AOLP benchmark LP database, which has 2049 images in three subsets. For experimentation, the Indian, Israeli, and media-lab Greek benchmark LPs are treated as a single data set whose LP characteristics are described in Table 1.

In an open environment, an image can be captured by a camera in many possible ways. The geometry-based clustering techniques proposed in this paper are invariant to size, tilt, pan, and rotation, which is why the proposed approach works properly with extreme observation views. There is no restriction on the size of the LP characters, as can be observed from Figures 10(a23) and 10(b23). Very few methods in the literature address LP detection for motorcycles, where the LP characters fall in two lines and the characters in each line are of a different size. The proposed method works for any type of vehicle, including motorcycles, vans, and trucks, with multiple lines of LP characters and with characters of a different size in each line. The proposed approach fails to detect the LP/LPs in an input image if the LP characters touch the border of the LP, if the LP contains fewer than cluster_size characters, or if the LP contains no characters at all. Table 1 summarizes the various characteristics of the LPs in the images shown in Figure 10. For example, Figure 10(a1) (S. number 1 in Table 1) shows an LP in an input image from Greece in an open environment. For all images, we have assumed neither a fixed number of characters in the LP nor a fixed number of lines. We have assumed that the sizes of the LP characters may differ because of different viewing conditions.

Figure 10 shows sample results for all categories of LPs, including images with different plate variations and different environmental and weather conditions from the media-lab, Israeli, and proprietary Indian LP databases, which are summarized in Table 1. The odd columns of Figure 10 show the actual images and the even columns show the binarized images. A red border in a binary image of Figure 10 indicates that the LP was identified; a binary image without a red border is a failed case. Most vehicles in many countries, including motorcycles, have only a single line of characters in their LPs, but in a few countries such as India the LP characters may fall into more than one line, and the characters in each line may vary in size, as shown in Figures 10(a8), 10(a18), and 10(a19). For such LPs, most existing LPR systems detect only the line of characters that is bigger in size and therefore do not satisfy real-time requirements. The proposed approach works in all such conditions.

Figures 10(a21), 10(b21), 10(a22), and 10(b22) show images in which LP detection failed because the LP characters overlap the LP border owing to heavy dirt between the LP characters and the LP border. Figures 10(a24) and 10(b24) show an LP whose 2nd character (“K”) and 6th character (“7”) touch the border of the LP because of noise such as a screw, dirt, or a stamp between the LP characters and the border of the LP. These characters are extracted successfully using the proposed noisy/missed characters extraction algorithm of Section 5. Figures 10(a1)–10(a9), 10(a12), 10(a21), 10(a22), and 10(a24) show Greek LPs, Figures 10(a13)–10(a19) show Indian LPs, Figures 10(a10), 10(a11), and 10(a23) show Israeli LPs, and Figure 10(a20) shows an LP from an Urdu-speaking country.

As Table 1 and Figure 10 show, the images come from four different countries; in addition, the AOLP benchmark database, which contains images from Taiwan, is used for performance evaluation. With these results, we can claim that the proposed approach successfully identifies the LP/LPs in an image when the individual characters are near each other, in line with each other, and similar in height within a particular line. There is no restriction on the number of lines present in the LP of the vehicle, and the characters in each line of the LP can be of a different size.

The performance comparison between a few prominent LP detection methods and the proposed approach is shown in Table 2. The method proposed by Bai and Liu in paper [4] is superior to the proposed approach, with a remarkable LP detection success rate of 99.6%, which surpasses all other methods. The methods proposed by Huang et al. in paper [8], Wen et al. in paper [9], Yang et al. in paper [5], Haneda and Hanaizumi in paper [10], and Anagnostopoulos et al. in paper [6] reported LP detection rates of 97.55%, 97.16%, 95.3%, 97%, and 96.5%, respectively, which are lower than the proposed method’s success rate of 98.74%. Lee et al. in paper [3] and Al-Ghaili et al. in paper [12] reported success rates of 91.25% and 91.65%, but their data sets contain only cars. Faradji et al. in paper [7] reported a lower success rate than the others. Zhou et al. in paper [11] reported a 93.2% success rate on a proprietary data set and 84.8% on the Caltech data set. Abo Samra and Khalefah in paper [14] reported a 98.75% success rate, which is equivalent to the proposed method’s success rate of 98.74%. The success rates of the abovementioned LP detection systems are based on proprietary data sets.

It is impractical to compare the performance of different LP detection systems that were evaluated on proprietary LP data sets. A common, true benchmark LP database should be openly available to assess the performance of proposed LP detection systems. A common, publicly available media-lab benchmark LP database for the research community, containing Greek vehicle LP images, was initiated by Anagnostopoulos et al. in paper [1]. Because the media-lab benchmark LP database does not cover all the plate variations mentioned in this paper, we combined it with images of Israeli and Indian LPs from cars, vans, trucks, and motorcycles to cover all those variations. Table 3 shows the performance comparison between the SCW method [6], the AOLP method [13], and the proposed approach using the media-lab and AOLP benchmark LP databases. On the media-lab benchmark LP database, the proposed method’s success rate of 97.3% is better than the SCW method’s success rate of 96.5% (reported by the SCW method on 1334 images) and higher than the AOLP method’s success rate of 92.1%. On the AOLP benchmark LP database, the success rate of the proposed approach is 93.7%, which is close to the 93.33% success rate of the AOLP approach and better than the 81.67% success rate of the SCW method. The average success rate of the proposed approach over both benchmark LP databases is 94.66%, which is slightly higher than AOLP’s average success rate of 92.72% and better than SCW’s average success rate of 89.09%, as shown in Table 3. Abo Samra and Khalefah in paper [14] also tested their method on the media-lab benchmark LP database and reported a success rate of 97.61% using only 335 of the 741 images, whereas the proposed method used all 741 images from the media-lab benchmark LP data set and achieved a 97.3% success rate, which is almost equivalent to the success rate reported in paper [14].
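As a quick arithmetic check of the averages quoted above, the following Python sketch recomputes them from the per-database success rates. It assumes 741 media-lab images and 2049 AOLP images, as stated in the text; how Table 3 actually forms its averages is not described here, so this is only a consistency check. Under these assumptions, an image-count-weighted mean of the proposed method's two rates gives about 94.66%, while plain (unweighted) means of the AOLP and SCW rates give about 92.72% and 89.09%.

```python
def weighted_mean(rates, counts):
    """Mean of success rates weighted by the number of images per database."""
    return sum(r * n for r, n in zip(rates, counts)) / sum(counts)

def simple_mean(rates):
    """Unweighted mean of success rates."""
    return sum(rates) / len(rates)

# Image counts taken from the text: media-lab (741) and AOLP (2049).
counts = [741, 2049]

# Success rates (%) on media-lab and AOLP, in that order, as quoted above.
proposed = [97.3, 93.7]
aolp = [92.1, 93.33]
scw = [96.5, 81.67]

proposed_avg = weighted_mean(proposed, counts)  # roughly 94.66
aolp_avg = simple_mean(aolp)                    # roughly 92.72
scw_avg = simple_mean(scw)                      # roughly 89.09
```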

The success rate of the proposed approach on the media-lab benchmark LP data set combined with the proprietary LP data set is 97.56%, which is slightly higher than the 96.5% success rate of the SCW approach. The success rate of the SCW method is reported on 1334 images, whereas only 741 images were made available online as the media-lab benchmark LP database. We have no clarification on the remaining (1334 − 741 =) 593 images. During our experimentation, we observed that around 2% of the images in the media-lab LP benchmark database have noisy/missed characters, and the proposed noisy/missed characters extraction method extracted 100% of these noisy characters from the input images.

We observed that most LP detection papers in the literature rely heavily on edge information, template information, morphological operations, and color information of the LPs. Such methodologies have restrictions when detecting the LP/LPs in input images, as explained in Section 2. To overcome the shortcomings of the LP detection methods from the literature that are highlighted in this paper, we have proposed geometry-based clustering techniques that are invariant to the color, scale, and rotation of the LPs. Figure 10 shows that the proposed multiple-LP detection method successfully detects the LPs in input images taken in an open environment, under all the weather conditions and plate variations mentioned in this paper. Hence, the proposed method can detect multiple LPs in an input image, provided they follow the properties of the proposed geometry-based clustering techniques.

7. Conclusion

In this paper, we have proposed a new method for LP/LPs detection and for the extraction of characters that are noisy or missed because of noise between the LP characters and the LP border. The proposed method’s performance is evaluated on the media-lab and AOLP benchmark LP data sets, with success rates of 97.3% and 93.7%, respectively, which are shown in performance comparison Table 2. The average success rate of the proposed approach (94.66%) over both benchmark LP data sets is higher than those of the SCW (89.09%) and AOLP (92.72%) approaches, as shown in Table 3. The proposed approach can detect multiple LPs in an image and is not specific to any country; there is no restriction on the number of characters in the LPs, the number of lines in the LPs, or the size of the characters in each line of the LP. As demonstrated in the results, the proposed approach is less restrictive than most previously published work, and it works for many countries with different plate variations, under different environmental and weather conditions. The proposed approach fails to identify the LP/LPs if the LP characters are missed because the plate is extremely dirty or blurred or because the LP characters touch the LP border.

Competing Interests

The authors declare that there are no competing interests regarding the publication of this paper.