Abstract

Intelligent transportation applications based on the urban Internet of Things can improve the efficiency of government services and promote urban modernization. As smart cameras become more widely deployed in cities, artificial intelligence has become a key enabler of license plate recognition. An efficient license plate recognition algorithm not only improves the efficiency of traffic management but also reduces management costs. This paper proposes a network based on the ShuffleNetV2 dilated convolution (SDC) model, which includes two parts: license plate location and license plate recognition. The SDC model adopts ShuffleNetV2 as the backbone network and combines dilated convolution with global context blocks, which enhances the receptive field and feature expression ability of the model. For license plate location, CIOU loss considers not only the coverage area of the bounding box but also the center distance and aspect ratio. For license plate recognition, CTC loss trains the network on sequences and solves the sample alignment problem, which improves the accuracy of license plate recognition. The experiments show that the precision of the SDC model in license plate location is 98.7%, which is 5.2%, 5.5%, and 4.1% higher than that of Faster-RCNN, YOLOv3, and SSD, respectively. The precision of the SDC model in license plate recognition is 98.2%, which is 5.3%, 3.7%, and 2.9% higher than that of LPRNet, AlexNet, and RPNet, respectively.

1. Introduction

Intelligent transportation is an important foundation and guarantee for national economic and social development [1]. With the innovation of the Internet of Things and big data [2, 3], further promoting the digitization and intelligence of the transportation industry is essential and is conducive to its better and faster development. With the advancement of social modernization and the improvement in people's quality of life, the numbers of intelligent cameras and vehicles are growing rapidly, which poses great challenges for traffic management costs [4, 5]. Therefore, accurately obtaining traffic data and constructing traffic data processing models are prerequisites for building intelligent transportation, and this problem can be addressed by big data technology [6].

Based on the Internet of Things, intelligent transportation can improve not only traffic quality but also the efficiency of traffic management [7]. Advanced video monitoring equipment and intelligent identification methods extend management in time, space, and scope, which continuously improves the fineness of transportation systems. As the basic work of intelligent transportation, license plate recognition technology based on deep learning lays a good foundation for the subsequent analysis and decision-making of the Internet of Things.

In actual scenes, captured images contain many complex objects, such as people and vehicles. Before recognizing the license plate, we need to locate the vehicle in the image. Traditional license plate location methods are based on edge detection [8], color features [9], and mathematical morphology [10]. However, these methods are greatly affected by the external environment and image quality. Du et al. [11] proposed the SSD model with VGG19 as the base network. SSD predicts the position offset between each bounding box and the ground-truth box. However, SSD may predict position offsets that are too large and fall outside the range of the image. Redmon et al. [12] proposed the YOLOv3 model, which is built with Darknet-53 as the backbone. YOLOv3 constrains the predicted position offset to the range of 0 to 1 by applying a sigmoid activation, which solves the problem of excessively large position offsets. These two methods are one-stage methods [13], which have the advantage of fewer calculations and can save time, but they lack accuracy. Ren et al. [14] proposed the Faster-RCNN model, which is a two-stage method. It has great advantages in terms of accuracy, but it requires considerable computation and time [15].

At present, license plate recognition algorithms include template matching [16] and feature analysis matching methods [17]. The template matching method resizes the license plate characters and matches them against the template characters in the sample libraries. The feature analysis matching method extracts features and discriminates results according to the number and shape of the character contours. However, both are obviously affected by illumination, noise, and character occlusion. Zherzdev et al. [18] proposed the LPRNet model, which does not need to segment characters and effectively alleviates the gradient problem. However, its recognition accuracy in complex situations is not high and easily decreases during training. Xiu et al. [19] proposed an end-to-end license plate recognition algorithm based on the AlexNet model. The AlexNet model uses overlapping pooling, local normalization, and dropout to improve accuracy, but its convolution and pooling operations during training cause a loss of features. Xu et al. [20] proposed the RPNet model, in which the feature maps are shared and the loss functions are jointly optimized. However, the model lacks spatial information and a sufficient receptive field, which leads to incomplete feature extraction.

To solve the accuracy problems of the above algorithms, this paper proposes a network based on the ShuffleNetV2 [21] dilated convolution (SDC) model:
(1) The SDC model adopts ShuffleNetV2 as the backbone network and combines dilated convolution with global context blocks. Therefore, the receptive field and feature expression ability of the model are enhanced.
(2) For license plate location, CIOU loss considers not only the coverage area of the bounding box but also the center distance and aspect ratio. For license plate recognition, CTC loss trains the network on sequences and solves the sample alignment problem, which improves the accuracy of license plate recognition.

2. Materials and Methods

To address the problems regarding the accuracy of license plate recognition [22], this paper proposes a network based on the ShuffleNetV2 dilated convolution (SDC) model, which includes two parts: license plate location and license plate recognition.

2.1. Dilated Convolution

Because vehicle recognition systems in practical applications face different scenarios (e.g., road traffic and highway toll stations) [23], the license plate occupies different proportions of the whole picture. In the face of these complex situations, the receptive field of a standard convolution cannot solve our problem.

Dilated convolution adds dilation to an ordinary convolution [24], which increases the size of the receptive field in the calculation process. The dilation rate is used to control the interval between the points of the convolution kernel. In Figure 1, the left figure shows a standard convolution kernel (dilation rate of 1), with the receptive field represented as the blue region. The right figure represents the dilated convolution with a larger dilation rate; the dilation enlarges the receptive field, so the convolution covers a larger area.

The calculation of the dilated convolution receptive field is shown below:

$$F = (k - 1) \times r + 1,$$

where $F$ indicates the receptive field, $r$ indicates the dilation rate, and $k$ is the size of the convolution kernel.
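As an illustration of the relationship above, the following sketch (assuming PyTorch; the channel numbers and input size are arbitrary) shows that raising the dilation rate of a 3 × 3 convolution enlarges its coverage without changing the number of parameters or the output size:

```python
# A minimal sketch (not the authors' code) of how the dilation rate enlarges
# the receptive field of a 3x3 convolution, assuming PyTorch.
import torch
import torch.nn as nn

def effective_receptive_field(kernel_size: int, dilation: int) -> int:
    # Single-layer receptive field: F = (k - 1) * r + 1, as in the formula above.
    return (kernel_size - 1) * dilation + 1

x = torch.randn(1, 3, 32, 32)  # dummy feature map

standard = nn.Conv2d(3, 8, kernel_size=3, padding=1, dilation=1)
dilated = nn.Conv2d(3, 8, kernel_size=3, padding=2, dilation=2)

print(standard(x).shape, effective_receptive_field(3, 1))  # 3x3 coverage
print(dilated(x).shape, effective_receptive_field(3, 2))   # 5x5 coverage, same output size
```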

2.2. License Plate Location

The conventional license plate location algorithms, including the one-stage and two-stage methods, cannot achieve a good balance of speed and accuracy in complex situations. Therefore, this paper uses an anchor-free location method based on the four corner points, and the calculation method is shown below:

$$f(x, y) = A \exp\left( - \left( \frac{(x - x_{gt})^{2}}{2\sigma_{x}^{2}} + \frac{(y - y_{gt})^{2}}{2\sigma_{y}^{2}} \right) \right),$$

where $A$ is the amplitude, $(x, y)$ is the predicted coordinate of the corner, $(x_{gt}, y_{gt})$ is the ground-truth coordinate, and $\sigma_{x}^{2}$ and $\sigma_{y}^{2}$ represent the variance.
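The following sketch is one possible reading of this Gaussian score (not the authors' implementation); the corner coordinates and variances are hypothetical values:

```python
# A minimal sketch of the Gaussian score for one predicted corner, following
# the reconstructed formula above; all numeric values are illustrative.
import math

def gaussian_corner_score(pred, gt, var_x, var_y, amplitude=1.0):
    """2D Gaussian score: close to the amplitude when the predicted corner is near the ground truth."""
    dx2 = (pred[0] - gt[0]) ** 2
    dy2 = (pred[1] - gt[1]) ** 2
    return amplitude * math.exp(-(dx2 / (2.0 * var_x) + dy2 / (2.0 * var_y)))

# Example: a corner predicted 2 px away from its ground-truth location.
print(gaussian_corner_score(pred=(102.0, 50.0), gt=(100.0, 50.0), var_x=4.0, var_y=4.0))
```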

This paper uses the lightweight ShuffleNetV2 network to build the model, which reduces the training time. The lightweight ShuffleNetV2 network contains multiple shuffle blocks. The channel split operation in the shuffle block divides the feature channels into two branches, which reduces the number of parameters in each branch and improves the operation speed. Combining global context blocks and dilated convolution enhances the spatial information of the model and increases its receptive field.
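For illustration, a stride-1 shuffle unit with channel split and channel shuffle might look as follows (a sketch assuming PyTorch; the channel counts are arbitrary, and the exact block configuration of the SDC model may differ):

```python
# A minimal sketch of a stride-1 ShuffleNetV2 unit (channel split + channel shuffle).
import torch
import torch.nn as nn

def channel_shuffle(x, groups=2):
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

class ShuffleUnit(nn.Module):
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False),  # depthwise conv
            nn.BatchNorm2d(half),
            nn.Conv2d(half, half, 1, bias=False), nn.BatchNorm2d(half), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # Channel split: one branch is kept as identity, the other is transformed.
        x1, x2 = x.chunk(2, dim=1)
        out = torch.cat([x1, self.branch(x2)], dim=1)
        return channel_shuffle(out)  # mix information between the two branches

print(ShuffleUnit(64)(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```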

In Figure 2, the gray block represents the stem block with dilated convolution [25]. The stem block uses three small convolutions instead of one large-scale convolution to reduce the loss of information and is combined with dilated convolution to increase the receptive field. The loop module includes three shuffle blocks and a global context-dilated convolution (GC-DC) block [26]. The green block represents the shuffle block with a stride of 2, which is used to compress the width and height of the feature layer; it not only reduces the amount of calculation but also retains more feature information to improve the effect of feature extraction. The yellow block represents the shuffle block with a stride of 1, which is used to deepen the network. The orange block is the GC-DC block, which represents the global context block with dilated convolution. The blue line represents the skip connection, and the red line indicates the loop operation. The loop module is followed by a downsampling operation, which yields three feature maps with different scales. "Add" is the feature fusion operation. The blue block is the residual block, and the white block is the convolution. After applying the residual block and convolution, the located license plate image is obtained.
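The global context part of the GC-DC block can be sketched as follows, in the spirit of the cited global context block (assuming PyTorch; how the dilated convolution is attached inside the authors' GC-DC block is not detailed here, so this is illustrative only):

```python
# A minimal sketch of a global context (GC) block: softmax-attention global
# pooling followed by a bottleneck transform that is added back to the input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalContextBlock(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)   # context attention map
        hidden = max(channels // reduction, 1)
        self.transform = nn.Sequential(                      # bottleneck transform
            nn.Conv2d(channels, hidden, 1),
            nn.LayerNorm([hidden, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1),
        )

    def forward(self, x):
        n, c, h, w = x.size()
        # Context modeling: softmax-weighted global pooling over all positions.
        weights = F.softmax(self.attn(x).view(n, 1, h * w), dim=-1)        # (n, 1, hw)
        context = torch.bmm(x.view(n, c, h * w), weights.transpose(1, 2))  # (n, c, 1)
        context = context.view(n, c, 1, 1)
        # Fusion: add the transformed global context back to every position.
        return x + self.transform(context)

print(GlobalContextBlock(128)(torch.randn(1, 128, 28, 28)).shape)
```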

The loss function used in license plate location is CIOU loss [27]. Compared with IOU loss [28], GIOU loss [29], and DIOU loss [27], it considers the overlapping area, the center point distance, and the aspect ratio, so the model converges faster and achieves higher regression accuracy when selecting bounding boxes:

$$\mathcal{L}_{CIOU} = 1 - IOU(B, B^{gt}) + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha v,$$

$$v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2}, \qquad \alpha = \frac{v}{(1 - IOU) + v},$$

where $B$ and $B^{gt}$ are the two bounding boxes; $IOU(B, B^{gt})$ is the ratio of the intersection and union of these two boxes; $b$ and $b^{gt}$ denote the center points of the predicted and ground-truth boxes, respectively; $\rho(\cdot)$ represents the Euclidean distance between the center point of the bounding box and the center point of the ground truth; $c$ represents the diagonal distance of the minimum circumscribed rectangle of the predicted box and the ground-truth box; $w$ and $h$ are the width and height of the bounding box, respectively; and $w^{gt}$ and $h^{gt}$ are the width and height of the ground-truth box, respectively.
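A minimal sketch of CIOU loss for axis-aligned boxes, following the formula above (assuming PyTorch; the box coordinates at the end are illustrative):

```python
# A sketch of CIoU loss for boxes in (x1, y1, x2, y2) form; not the authors' code.
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    # Intersection over union.
    x1 = torch.max(pred[..., 0], target[..., 0])
    y1 = torch.max(pred[..., 1], target[..., 1])
    x2 = torch.min(pred[..., 2], target[..., 2])
    y2 = torch.min(pred[..., 3], target[..., 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared center distance, normalised by the enclosing-box diagonal.
    cx_p, cy_p = (pred[..., 0] + pred[..., 2]) / 2, (pred[..., 1] + pred[..., 3]) / 2
    cx_t, cy_t = (target[..., 0] + target[..., 2]) / 2, (target[..., 1] + target[..., 3]) / 2
    rho2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term.
    w_p, h_p = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    w_t, h_t = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    v = (4 / math.pi ** 2) * (torch.atan(w_t / (h_t + eps)) - torch.atan(w_p / (h_p + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v

print(ciou_loss(torch.tensor([50., 40., 150., 90.]), torch.tensor([55., 42., 160., 95.])))
```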

2.3. License Plate Recognition

Conventional license plate recognition methods require character segmentation before training, which increases the training time. Therefore, this paper uses the lightweight ShuffleNetV2 network as the backbone, combined with dilated convolution and CTC loss [30], and adopts end-to-end training without character segmentation. In Figure 3, the white block is the convolution, and the blue block is the pooling. The yellow block represents the shuffle block with a stride of 1. The part in the red box is the loop module. The black block represents the downsampling operation, and "Add" is the feature fusion operation. The orange block is the GC-DC block, which represents the global context block with dilated convolution.

The loss function used in the license plate recognition network is CTC loss. A traditional BP neural network is trained frame by frame, whereas CTC loss trains by sequence. Frame-based training requires aligned samples; however, the location and proportion of the license plate differ in each frame, which makes it difficult to align samples in practice. CTC loss does not require sample alignment: it only needs the output to map to the correct character sequence.

The CTC training objective is built from the following probabilities:

$$p(\pi \mid x) = \prod_{t=1}^{T} y_{\pi_t}^{t}, \quad (4)$$

$$p(l \mid x) = \sum_{\pi \in \mathcal{B}^{-1}(l)} p(\pi \mid x), \quad (5)$$

$$p(l \mid x) = \sum_{s=1}^{|l'|} \frac{\alpha_t(s)\,\beta_t(s)}{y_{l'_s}^{t}}. \quad (6)$$

In Formula (4), $x$ is the input data, $y$ is the output data (the per-frame character probabilities), and $\pi$ is an output path; $l$ is the sequence label, and $l'$ is the sequence label with blank characters. In Formula (5), $\mathcal{B}$ represents the set of many-to-one mappings from output paths to label sequences. In Formula (6), $p(l \mid x)$ is the sum of the probabilities that the input is $x$ and the output is $l$, $\alpha_t(s)$ is the forward recursive probability sum, and $\beta_t(s)$ is the reverse recursive probability sum.
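In practice, sequence training with CTC loss can be set up as in the following sketch (assuming PyTorch's nn.CTCLoss; the vocabulary size, time steps, and 7-character plate length are assumptions for illustration, not the paper's settings):

```python
# A minimal sketch of training a recognition head with CTC loss.
import torch
import torch.nn as nn

num_classes = 68          # e.g. provinces + letters + digits + blank (assumed size)
T, N = 18, 4              # time steps per plate, batch size (illustrative)

logits = torch.randn(T, N, num_classes, requires_grad=True)  # stand-in for network output
log_probs = logits.log_softmax(dim=2)
targets = torch.randint(1, num_classes, (N, 7))               # 7-character plates
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 7, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)   # index 0 reserved for the blank
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()   # in practice the log_probs come from the recognition network
print(loss.item())
```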

3. Results and Discussion

The dataset used in our experiments is the CCPD2019 dataset (https://github.com/detectRecog/CCPD). CCPD2019 is an open-source license plate dataset released by the University of Science and Technology of China and is an authoritative dataset in the field of license plate recognition. As shown in Figure 4, the license plate photos in CCPD2019 cover a variety of complex environments. The CPU used in our experiments is an Intel(R) Core(TM) i9-9820X CPU @ 3.30 GHz, and the GPU is an NVIDIA GeForce RTX 2080 Ti.

Precision and recall are used as the evaluation criteria in our experiments [31], as shown in Formulas (9) and (10):

$$Precision = \frac{TP}{TP + FP}, \quad (9)$$

$$Recall = \frac{TP}{TP + FN}, \quad (10)$$

where TP represents the number of license plates correctly determined by the model, FP represents the number of non-license-plates incorrectly determined as license plates by the model, and FN represents the number of license plates incorrectly determined as non-license-plates by the model.
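A minimal sketch of these two criteria computed directly from the counts (the TP, FP, and FN values below are placeholders, not experimental results):

```python
# Precision and recall from the counts defined above.
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)

print(precision(tp=987, fp=13), recall(tp=987, fn=20))  # hypothetical counts
```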

3.1. License Plate Location Experiment

This experiment adopts the anchor-free location method based on the four corner points. In the training stage, it is necessary to locate the four corner positions of the license plate in each image and obtain the corresponding coordinates. We feed the corresponding coordinates and the images into the model for training. The effect of license plate location is expressed by the Gaussian score. When the Gaussian score during training is higher than the set threshold, the model starts training the recognition part. The threshold in this experiment is set to 0.85. The license plate location results are shown in Figure 5.

The located image is not directly input into the recognition network. A cropping operation first removes the other parts of the image and retains only the license plate. Figure 6 compares the location results and the retained license plates.
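One way to realize this cropping step is a perspective warp from the four located corner points, as in the following sketch (assuming OpenCV; the corner coordinates and the 94 × 24 output size are illustrative, not taken from the paper):

```python
# A minimal sketch of cropping and rectifying the located plate from its four corners.
import cv2
import numpy as np

def crop_plate(image, corners, out_w=94, out_h=24):
    """corners: four (x, y) points ordered top-left, top-right, bottom-right, bottom-left."""
    src = np.float32(corners)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    M = cv2.getPerspectiveTransform(src, dst)           # homography from plate to rectangle
    return cv2.warpPerspective(image, M, (out_w, out_h))

frame = np.zeros((720, 1160, 3), dtype=np.uint8)        # placeholder for the located image
plate = crop_plate(frame, [(412, 530), (588, 538), (586, 584), (410, 576)])
print(plate.shape)  # (24, 94, 3)
```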

Our approach is compared with three classical license plate location algorithms (Faster-RCNN, YOLOv3, and SSD). The comparison results on the CCPD2019 dataset are shown in Table 1.

As seen from the results in Table 1 and Figure 7, for the precision evaluation criterion, our approach is 5.2% more precise than Faster-RCNN, 5.5% more precise than YOLOv3, and 4.1% more precise than SSD. For the recall evaluation criterion, the recall of our approach is 8.0% higher than that of Faster-RCNN, 5.1% higher than that of YOLOv3, and 7.1% higher than that of SSD.

3.2. License Plate Recognition Experiment

The image obtained in Section 3.1 is input into the license plate recognition network. The results of license plate recognition are shown in Figure 8.

Our approach is compared with three representative license plate recognition algorithms (LPRNet, AlexNet, and RPNet). The comparison results of the four license plate recognition algorithms are shown in Table 2.

As seen from the results in Table 2 and Figure 9, for the precision evaluation criterion, our approach is 5.3% more precise than LPRNet, 3.7% more precise than AlexNet, and 2.9% more precise than RPNet. For the recall evaluation criterion, the recall of our approach is 7.1% higher than that of LPRNet, 4.0% higher than that of AlexNet, and 1.8% higher than that of RPNet.

4. Conclusions

The combination of big data and the Internet of Things can achieve good results by training models with data obtained from urban intelligent cameras [32]. Big data helps to establish a comprehensive traffic information system. By integrating the "data warehouses" of different regions and fields, an integrated utilization mode of public transport information can be constructed. The Internet of Things and big data technology are research focuses and will have a profound impact on intelligent transportation.

This paper proposes the SDC model, which combines dilated convolution and global context blocks, thereby enhancing the receptive field and feature expression ability of the model. For license plate location, CIOU loss considers not only the coverage area of the bounding box but also the center distance and aspect ratio. For license plate recognition, CTC loss trains the network on sequences and solves the sample alignment problem, which improves the accuracy of license plate recognition.

As a part of intelligent transportation, license plate recognition technology based on deep learning lays a good foundation for the future analysis and decision-making of the Internet of Things [33]. However, the accuracy in complex situations is still not high. For example, on snowy and rainy days, license plates are prone to partial occlusion, which leads to recognition errors. In dim or nighttime environments, the lack of light leaves little color information in the picture, making it difficult to locate and recognize license plates. Therefore, adjusting the parameters of deep learning models for such complex situations remains a difficult problem in the Internet of Things.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the horizontal scientific research project "Campus Network Design Scheme" (HX2021251), in part by the Hubei Natural Science Foundation under Grant 2021CFB156, and in part by the JSPS KAKENHI under Grant JP21K17737.