Abstract
To meet the demand for automatic reading of industrial instruments, this study proposes a method for industrial instrument classification and reading recognition based on YOLOv3. Because industrial meters can be divided into pointer meters and digital meters according to dial type, the method handles each type separately. First, a YOLOv3 model is trained to detect the meters and classify them by type according to the predicted class. For a pointer meter, the dial is detected with the Hough circle transform, the scale and the pointer are extracted, and the angle between the 0 scale line and the pointer is calculated to obtain the reading. For a digital meter, the digits are extracted by finding the contours of the dial and the digit area, and a support vector machine (SVM) then identifies the extracted digits and outputs the reading. In testing, the mean average precision (mAP) of the recognition model is 93.73%. The absolute error of the pointer meter readings is generally less than 0.1, and the maximum relative error is 0.35%. The accuracy of the digital meter readings is 99.7%. The proposed method reads instrument values accurately and meets the needs of industrial production.
1. Introduction
Industrial instruments are becoming increasingly capable and have become an integral part of modern industrial production. With the continuing integration of electronic information technology and modern industry, industrial production is gradually moving towards automation and intelligence. IoT-based instrument reading is widely used, but it applies only to instruments that have communication interfaces. In practice, many traditional industrial meters still lack such interfaces and are mostly read manually. Manual reading, however, is laborious and inefficient, and it easily causes visual fatigue, which leads to reading errors.
Industrial meters can currently be divided into pointer meters and digital meters according to the type of dial. Since both types have advantages and disadvantages, they are often used in combination in industrial production. This study analyzes the advantages and disadvantages of the instrument recognition and reading methods used by Zhang [1] and Lei [2] and proposes an industrial instrument reading recognition system based on YOLOv3, which uses the YOLOv3 target detection algorithm to identify the instrument type. The reading of a pointer instrument is obtained by calculating the included angle between the 0 scale line and the pointer, and the reading of a digital instrument is obtained by identifying its digits with a support vector machine (SVM). Tests of the system show that the method achieves high accuracy in both type identification and reading of industrial instruments.
2. Related Work
With the rapid development of machine vision, instrument image processing has become a mainstream technology in the field of industrial instrumentation and is widely used throughout industrial production systems. In 2018, Kucheruk et al. [3] used template matching to precisely locate the center of the meter and the pointer, calculated the rotation angle of the pointer, and finally obtained the meter reading. In 2019, Chen et al. [4] proposed a fast and direct digit recognition algorithm for digital instruments based on feature extraction. In 2020, Guo et al. [5] used the GrayWorld algorithm to process digital meter images and then trained a variable convolution character recognition algorithm to complete the recognition of digital meters. Based on machine vision, Zhang [1] first identified the type of meter and then processed the pointer and the scale separately to read pointer meters, while digital meters were read with an automatic recognition algorithm based on the MNIST digit set. This method has an accuracy of 93.47% for meter type recognition and 98.5% for digital meter readings. Zhang et al. [6] read the pointer meter in HSV color space, determined the pointer centroid based on the minimum bounding rectangle (MBR), and were able to locate the pointer position quickly. In 2021, Li et al. [7] proposed a method based on the Otsu algorithm and an improved Hough algorithm, which largely avoids the influence of uneven illumination on pointer meter readings. Sowah et al. [8] used computer vision and machine learning to develop a new algorithm for automatic meter reading, which extracts high-level features with a series of image contour filters cascaded with a digit classifier; this method does not rely on any prior information about the instrument being read and obtains the meter reading directly. Tang et al. [9] detected pointer instruments automatically with an image recognition method; the accuracy of reading recognition met the performance requirements, and the operation was efficient and intelligent. Wang et al. [10] combined target detection and computer vision, used fast-RCNN to detect the instrument panel, and then read the pointer indication through the Hough transform. Lei [2] designed a system capable of reading both pointer and digital meters. The system uses a target detection method to locate pointer meters and then combines image processing techniques to recognize the meter readings, while digital meters are read using a deep learning-based method. The system recognizes meters very accurately, but its minimum bounding rectangle-based method identifies and locates pointer meters with an accuracy of only 81.4%, and its PSPNet-based reading method has a high error, with an average error value of 0.771.
3. YOLOv3-Based Recognition Model for Industrial Instruments
A model for industrial instrument type recognition is produced by collecting instrument images, labeling them by type, and feeding them into the YOLOv3 neural network for training. The instrument images to be classified are then fed into the trained model, which identifies the instrument type. Once recognition is complete, each image is stored in the folder corresponding to its type. The main process is shown in Figure 1.
3.1. YOLOv3 Principle
YOLOv3 borrows the idea of ResNet by adding residual modules to the network and replaces the Darknet-19 backbone of YOLOv2 with Darknet-53, forming a deeper network. Darknet-53 uses 53 convolutional layers built from a large number of 3 × 3 and 1 × 1 convolution kernels and takes a 256 × 256 × 3 image as input.
The structure of the YOLOv3 model [11–15] is shown in Figure 2. Each square in Figure 2 is a basic block consisting of Conv2d, BatchNorm2d, and LeakyReLU (except the last layer of each output) called Basic_Block. The resblock in Figure 2 is a residual block consisting of Basic_Block and a shortcut called Residual_Block. The rightmost three outputs in the diagram are called Header_Block and consist of a Basic_Block and a Conv2d. [+] in Figure 2 stands for splicing, i.e., splicing the channel dimensions of two tensors.
Figure 2 shows that the final outputs of the model are y1, y2, and y3. Assuming an input of 1 × 3 × 416 × 416, the output dimensions are 1 × 255 × 13 × 13, 1 × 255 × 26 × 26, and 1 × 255 × 52 × 52; these three scales are used to predict large, medium, and small targets, respectively.
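The three output sizes follow directly from the input size. The sketch below is a minimal illustration, assuming the usual YOLOv3 configuration of downsampling strides 32, 16, and 8, three anchors per scale, and 80 classes (which is consistent with the 255 channels quoted above); the function name is hypothetical.

```python
def yolo_output_shapes(batch, height, width, num_classes=80, anchors_per_scale=3):
    """Return the (N, C, H, W) shapes of the three YOLOv3 outputs y1, y2, y3."""
    # Each anchor predicts 4 box offsets + 1 confidence + one score per class.
    channels = anchors_per_scale * (5 + num_classes)
    # Assumed strides: 32 (large targets), 16 (medium targets), 8 (small targets).
    return [(batch, channels, height // s, width // s) for s in (32, 16, 8)]


print(yolo_output_shapes(1, 416, 416))
# [(1, 255, 13, 13), (1, 255, 26, 26), (1, 255, 52, 52)]
```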
Each training sample consists of a picture and a label. The label contains the bounding box and class of every target of interest in the picture, with the bounding box generally represented by its top-left and bottom-right vertices. Intuitively, it would be best if the network output had the same form as the labels, i.e., (x1, y1), (x2, y2), and class. In practice, however, regressing these values directly does not work well and the network converges very slowly, because their range of variation is large; the network predicts better and converges significantly faster on normalized data. YOLOv3 therefore encodes these values and predicts the normalized, encoded data instead.
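The relationship between the above output and the actual bounding box [16–18], confidence level, and category is as follows (in the standard YOLOv3 formulation):
$$b_x = \sigma(t_x) + c_x, \quad b_y = \sigma(t_y) + c_y, \quad b_w = p_w e^{t_w}, \quad b_h = p_h e^{t_h},$$
where $\sigma$ is the sigmoid function, $(c_x, c_y)$ is the top-left corner of the grid cell, and $(p_w, p_h)$ are the width and height of the anchor box. It can be seen that the predictions $t_x$ and $t_y$ give the offset of the center of the bounding box relative to the top-left corner of its grid cell, $t_w$ and $t_h$ are scaling factors for the width and height of the bounding box, the confidence level is obtained from $t_p$ through the sigmoid, and the category is the one whose index has the maximum value among the last 80 outputs.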
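The decoding described above can be sketched as follows. This is a minimal illustration of the standard YOLOv3 decoding, not the authors' code: the function names, the conversion to pixels through the cell stride, and the example anchor size are assumptions made for the example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(t, cell, anchor, stride):
    """Decode one raw prediction into a box in input-image pixels.

    t      -- (tx, ty, tw, th, tp, class scores...)
    cell   -- (cx, cy): column and row of the grid cell
    anchor -- (pw, ph): anchor box size in pixels
    stride -- side length of one grid cell in pixels
    """
    tx, ty, tw, th, tp = t[:5]
    bx = (sigmoid(tx) + cell[0]) * stride  # box center x
    by = (sigmoid(ty) + cell[1]) * stride  # box center y
    bw = anchor[0] * math.exp(tw)          # box width
    bh = anchor[1] * math.exp(th)          # box height
    confidence = sigmoid(tp)
    scores = t[5:]
    # The category is the index of the largest class score.
    class_id = max(range(len(scores)), key=scores.__getitem__)
    return (bx, by, bw, bh), confidence, class_id


# Hypothetical prediction with two classes (0 = pointer, 1 = digital).
box, conf, cls = decode_box((0, 0, 0, 0, 0, 0.1, 0.9), (6, 6), (116, 90), 32)
print(box, conf, cls)
```

With all offsets equal to zero, the box is centered in its grid cell and has exactly the anchor size, which is a quick sanity check for the decoding.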
3.2. Model Training
The images of pointer and digital meters were collected and labeled using LabelImg image labeling software to classify the images into two categories: pointer meters (labeled as “pointer”) and digital meters (labeled as “digital”). The labeling effect is shown in Figure 3.
After the images are annotated, the annotation file is divided into a training set and a test set, and the training set is fed into the YOLOv3 network [19–22] for training. The number of training rounds (epochs) is set to 200, the batch size of each iteration is 2, and the first 249 layers are unfrozen so that all layers can be trained together.
3.3. Experimental Results and Analysis
Two hundred different meter images were selected for input into the system to test the recognition effect of the model, and some of the recognized images are shown in Figure 4.
The loss during training was recorded and plotted as a loss curve and compared with the model used in [1]. As can be seen from Figure 5, the YOLOv3 model [23–25] used in this study has a lower loss value and is better able to achieve the recognition of industrial meter types.
To test the recognition effect of the model more accurately, the Precision-Recall curve (i.e., P-R curve), with Recall as the horizontal coordinate and Precision as the vertical coordinate, is chosen to evaluate the results after recognition in this study; the results are shown in Figure 6.
As can be seen from Figure 6, the Average Precision (AP) of pointer instrument recognition on the test set is 93.62%, the AP of digital instrument recognition is 93.83%, and the mean Average Precision (mAP) of the industrial instrument recognition model based on YOLOv3 is 93.73%, which is 0.26 percentage points higher than the 93.47% reported in [1].
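For reference, AP is the area under the P-R curve and mAP is the mean of the per-class AP values. The sketch below assumes all-point interpolation of the P-R curve; the text does not state which interpolation was used, and the function names are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under the P-R curve (all-point interpolation).

    recalls must be sorted in ascending order; precisions[i] is the
    precision measured at recalls[i].
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing when read from left to right.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangle areas between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(aps):
    return sum(aps) / len(aps)


print(average_precision([0.5, 1.0], [1.0, 0.5]))  # 0.75
print(mean_average_precision([93.62, 93.83]))     # about 93.725, reported as 93.73
```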
4. Reading Recognition of Pointer Meters
Reading recognition of pointer meters is described in the following sections.
4.1. Image Processing of Pointer Meters
The process of image processing of pointer meters is as follows.
4.1.1. Mean Value Filtering
Mean filtering replaces the value of each pixel (x, y) in an image with the mean of the pixel values in a neighborhood around it, giving the filtered grey value g(x, y); visiting all pixels of the image in turn filters the whole image. The basic principle [26–28] is as follows:
$$g(x, y) = \frac{1}{M} \sum_{(i, j) \in S} f(i, j),$$
where $f$ is the input image, $S$ is the neighborhood (window) centered on $(x, y)$, and $M$ is the number of pixels in $S$.
The dial of a pointer meter carries several kinds of information, such as scale lines, scale values, and the pointer. Mean filtering of the pointer meter image removes much of the irrelevant detail on the dial and facilitates the subsequent extraction of key information.
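A minimal sketch of mean filtering on a greyscale image stored as a list of rows is given below. The 3 × 3 window and the border handling (averaging only the neighbors that lie inside the image) are assumptions chosen for the example.

```python
def mean_filter(image, k=3):
    """Apply a k x k mean filter to a 2-D list of grey values."""
    h, w = len(image), len(image[0])
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window around (x, y), clipped at the image border.
            window = [image[j][i]
                      for j in range(max(0, y - half), min(h, y + half + 1))
                      for i in range(max(0, x - half), min(w, x + half + 1))]
            out[y][x] = sum(window) / len(window)
    return out


noisy = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
print(mean_filter(noisy)[1][1])  # 1.0: the isolated bright pixel is smoothed out
```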
4.1.2. Greyscaling
Greyscaling converts a colour image into a grey-only image. Reducing the three colour channels to a single intensity channel simplifies the image and makes it easier for the computer to process and recognize. The grey value of each pixel is calculated from its R, G, and B components.
After greyscale processing, the result is a greyscale image that is closer to human visual perception and holds a single intensity value per pixel. The processed pointer meter image is shown in Figure 7.
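One common way to obtain a grey value that matches human brightness perception is a weighted average of the three channels. The weights below (0.299, 0.587, 0.114) are the widely used luma weights and are an assumption here, since the exact formula is not reproduced in this text.

```python
def to_grey(pixel):
    """Convert one (R, G, B) pixel to a grey value with perceptual weights."""
    r, g, b = pixel
    # Assumed weights: green contributes most and blue least, as in human vision.
    return 0.299 * r + 0.587 * g + 0.114 * b


def grey_image(image):
    """Convert a 2-D list of (R, G, B) pixels to a 2-D list of grey values."""
    return [[to_grey(p) for p in row] for row in image]


print(grey_image([[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]))
```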
4.2. Locating the Dial of a Pointer Meter
Given that the dials of pointer meters are circular, circular dials in meter images can be detected with the Hough circle transform, a feature-based digital image processing technique that extracts circles from an image. Candidate circles are produced by voting in the parameter space to build an accumulator and then selecting the local maxima of the accumulator matrix.
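The equation of a circle in the Cartesian coordinate system is
$$(x - a)^2 + (y - b)^2 = r^2, \qquad (4)$$
where the coordinates of the center of the circle are (a, b) and the radius is r, so equation (4) can also be expressed as
$$x = a + r\cos\theta, \quad y = b + r\sin\theta. \qquad (5)$$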
So, a circle can be defined by a single point (a, b, r) in the three-dimensional a-b-r parameter space. Conversely, all the circles that pass through a given pixel of the image map to a surface in this parameter space. The surfaces belonging to pixels that lie on the same circle intersect at one point, so votes are accumulated at these intersection points and a threshold is set; points whose vote count exceeds the threshold are identified as circles.
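The voting idea can be sketched with a brute-force accumulator over (a, b, r). This is an illustration only, using plain exhaustive voting rather than any optimized library routine; the function name, the candidate radii, and the angular step are assumptions.

```python
import math
from collections import Counter


def hough_circle(points, radii, angle_steps=360):
    """Vote in (a, b, r) space; return the best cell and its vote count."""
    votes = Counter()
    for x, y in points:
        for r in radii:
            cells = set()
            # Every center (a, b) at distance r from (x, y) is a candidate.
            for k in range(angle_steps):
                theta = 2 * math.pi * k / angle_steps
                a = round(x - r * math.cos(theta))
                b = round(y - r * math.sin(theta))
                cells.add((a, b, r))
            votes.update(cells)  # each edge point votes once per cell
    return votes.most_common(1)[0]


# Twelve edge points lying exactly on a circle with center (10, 10) and radius 5.
offsets = [(5, 0), (-5, 0), (0, 5), (0, -5), (3, 4), (3, -4),
           (-3, 4), (-3, -4), (4, 3), (4, -3), (-4, 3), (-4, -3)]
edge_points = [(10 + dx, 10 + dy) for dx, dy in offsets]
print(hough_circle(edge_points, radii=[4, 5, 6]))  # ((10, 10, 5), 12)
```

In practice the vote count of the winning cell would be compared with the threshold mentioned above before the circle is accepted.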
4.3. Dial Information Extraction
The dial of a pointer meter contains a lot of important information, mainly the scale lines, the scale values, and the pointer. To read the meter, the information on the dial must be identified and extracted step by step, removing the distracting information and retaining the important information on the dial. The reading of the pointer meter is then obtained from the retained information.
4.3.1. Extracting the Scale
The tick marks on the dial can be extracted by tracing their contours, based on the characteristics of the tick marks. The principle is to find a black pixel and define it as the starting pixel; each time a black pixel is hit, the tracer backtracks to the pixel from which it came and then moves clockwise around the black pixel, visiting each pixel in its Moore neighborhood until it hits another black pixel. The effect of extracting the tick marks is shown in Figure 8.
4.3.2. Locating the Center of the Circle
After the scale lines are selected, each of them is fitted with a straight line by the least squares method, which gives the fitted lines on the dial of the instrument. As shown in Figure 9, the center of the dial is obtained by averaging the intersection points of all the fitted scale lines.
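The center estimate can be sketched as follows. The sketch assumes an orthogonal least-squares fit (so that vertical tick marks are handled) and skips nearly parallel pairs of lines; both choices, and all function names, are assumptions for illustration.

```python
import math
from itertools import combinations


def fit_line(points):
    """Orthogonal least-squares fit: returns a point on the line and a unit direction."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)  # direction of largest spread
    return (mx, my), (math.cos(angle), math.sin(angle))


def intersect(l1, l2):
    """Intersection of two lines given as (point, direction), or None if parallel."""
    (x1, y1), (dx1, dy1) = l1
    (x2, y2), (dx2, dy2) = l2
    det = dx1 * dy2 - dy1 * dx2
    if abs(det) < 1e-9:
        return None
    t = ((x2 - x1) * dy2 - (y2 - y1) * dx2) / det
    return (x1 + t * dx1, y1 + t * dy1)


def dial_center(ticks):
    """Average the pairwise intersections of the fitted tick lines."""
    lines = [fit_line(t) for t in ticks]
    pts = [intersect(a, b) for a, b in combinations(lines, 2)]
    pts = [p for p in pts if p is not None]
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))


# Synthetic tick marks radiating from a dial centered at (50, 40).
ticks = [[(50 + r * math.cos(math.radians(a)), 40 + r * math.sin(math.radians(a)))
          for r in range(30, 36)] for a in (30, 90, 150, 200)]
print(dial_center(ticks))
```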
4.3.3. Extracting Pointers
Based on the characteristics of pointer meters, the Hough line transform is used to identify the pointer on the dial, and the pointer is then refined (thinned). In essence, the Hough line transform maps the coordinates of each pixel in the two-dimensional image to a curve in polar (parameter) coordinate space.
If a set of pixels in the image lies on one straight line, their curves in parameter space intersect at a single point. After the transformation, each point (x, y) in image space is mapped to a sinusoidal curve $r = x\cos\theta + y\sin\theta$ in $(r, \theta)$ polar coordinate space. If the curves of the image points all pass through the same point after the transformation, the parameters of that point are the parameters of the line equation, i.e., the parameters of the line on which the pointer is located.
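A minimal accumulator for the line case is sketched below, using the normal form r = x cos θ + y sin θ. The one-degree angular resolution, the integer rounding of r, and the function name are assumptions for the example.

```python
import math
from collections import Counter


def hough_line(points, theta_step_deg=1):
    """Vote in (r, theta) space; return the best (r, theta in degrees) cell and its votes."""
    votes = Counter()
    for x, y in points:
        # Each image point traces a sinusoid r(theta) in parameter space.
        for deg in range(0, 180, theta_step_deg):
            theta = math.radians(deg)
            r = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(r, deg)] += 1
    return votes.most_common(1)[0]


# One hundred pixels on the horizontal line y = 7.
pointer_pixels = [(x, 7) for x in range(100)]
print(hough_line(pointer_pixels))  # ((7, 90), 100)
```

All 100 sinusoids pass through the cell (r = 7, θ = 90°), so that cell collects every vote and identifies the line.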
4.4. Calculation of Angles
The meter reading is obtained by calculating the rotation angle of the pointer and the rotation angle of the total range and taking their ratio. The position of the 0 scale is the vector from the center of the circle O(x0, y0) to the start of the scale A(x1, y1); the position of the pointer is the vector from O to the tip of the pointer B(x2, y2); and the position of the maximum range is the vector from O to the end of the scale C(x3, y3). Let $\theta_1$ be the rotation angle of the pointer (from OA to OB), $\theta_2$ the rotation angle of the total range (from OA to OC), and $R$ the measuring range of the meter; then, the reading $V$ is
$$V = \frac{\theta_1}{\theta_2} \times R.$$
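The angle ratio can be sketched as below. The sketch assumes image coordinates with the y axis pointing down, a scale that increases clockwise, and a scale that starts at zero unless stated otherwise; the function names and the example gauge are hypothetical.

```python
import math


def clockwise_angle(o, p, q):
    """Angle in degrees swept clockwise on screen (y axis down) from OP to OQ."""
    a1 = math.atan2(p[1] - o[1], p[0] - o[0])
    a2 = math.atan2(q[1] - o[1], q[0] - o[0])
    return math.degrees(a2 - a1) % 360.0


def pointer_reading(o, a, b, c, full_scale, zero_value=0.0):
    """Reading from center O, zero mark A, pointer tip B, and full-scale mark C."""
    theta = clockwise_angle(o, a, b)  # rotation of the pointer from the 0 mark
    total = clockwise_angle(o, a, c)  # rotation covered by the whole scale
    return zero_value + theta / total * (full_scale - zero_value)


# Hypothetical 0-1.6 gauge: zero at lower left, full scale at lower right,
# pointer pointing straight up, i.e., halfway along the 270-degree scale.
print(pointer_reading((0, 0), (-1, 1), (0, -1), (1, 1), full_scale=1.6))
```

Measuring the angle as a directed sweep avoids the ambiguity of the plain angle between two vectors, which cannot exceed 180° and so cannot represent scales that span more than half a turn.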
4.5. Experimental Results and Analysis
Pictures of pointer meters with different values were entered into the system to test the pointer meter reading function, and some of the test results are shown in Table 1.
As can be seen from Table 1, the absolute error of the pointer meter reading method used in this paper is generally less than 0.1, and the maximum relative error is 0.35%. By comparison, the maximum relative error of the MBR-based method in [6] is 0.67%, that of the method based on Otsu and the improved Hough transform in [7] is 0.9%, and that of [2] is 1.5%. The error of the pointer meter reading method adopted in this study is therefore smaller and meets the needs of pointer meter reading.
5. Digital Meter Reading Recognition
Digital meter reading recognition is described in the following sections.
5.1. Image Processing of Digital Meters
The process of image processing of digital meters is as follows.
5.1.1. Gaussian Filtering
Gaussian filtering is used to filter out normally distributed noise (Gaussian noise). The value of each pixel is replaced by a weighted average of its own value and the values of neighboring pixels; the closer a neighbor is to the center, the higher its weight. Compared with other filters, Gaussian filtering smooths more gently, and its result is largely insensitive to image changes such as rotation and lighting changes. The weights are calculated as follows:
$$G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}},$$
where (x, y) is the offset from the center of the kernel and $\sigma$ is the standard deviation.
The dial of a digital meter is relatively simple and essentially contains only the digits and the decimal point. The Gaussian-filtered image retains the digits and the decimal point intact, so filtering does not affect the subsequent reading.
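The weights of a discrete Gaussian kernel can be generated as in the sketch below. The 3 × 3 size and σ = 1 are assumed values, and the kernel is normalized by the sum of its entries (instead of the analytic constant) so that the weights add up to exactly one.

```python
import math


def gaussian_kernel(size=3, sigma=1.0):
    """Return a size x size Gaussian kernel whose weights sum to 1."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(map(sum, kernel))
    # Normalize so that filtering preserves the overall image brightness.
    return [[v / total for v in row] for row in kernel]


for row in gaussian_kernel():
    print([round(v, 4) for v in row])
```

The center weight is the largest and the weights fall off with distance, which is the behavior described above.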
5.1.2. Binarisation
Binarisation sets each pixel of an image according to its grey value, converting the original greyscale image into a binary image that contains only black and white, i.e., only the two values 0 and 255. Binarisation is calculated as follows:
$$g(x, y) = \begin{cases} 255, & f(x, y) > T \\ 0, & f(x, y) \le T \end{cases}$$
where f(x, y) is the grey value of the input pixel and T is the threshold.
Image binarisation removes distracting information from the digital dial, so that the digits and the decimal point are retained more clearly and can be extracted more accurately.
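A fixed-threshold binarisation can be sketched in a few lines. The threshold value 127 is an assumption for the example; the threshold actually used is not stated in the text.

```python
def binarise(image, threshold=127):
    """Map grey values above the threshold to 255 and all others to 0."""
    return [[255 if v > threshold else 0 for v in row] for row in image]


print(binarise([[0, 127, 128, 255]]))  # [[0, 0, 255, 255]]
```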
The processed digital meter images are shown in Figure 10.
5.2. Digital Recognition
The SVM dataset was created from images collected on the Internet and contains 400 images of the digits 0 to 9.
This study uses the histogram of oriented gradients (HOG) method, which has been widely used for complex tasks such as face and gesture recognition. First, the Sobel derivatives of the image in the horizontal and vertical directions are calculated to obtain the gradient angle and gradient magnitude of each pixel, and the gradient angle is then quantized to one of 16 integer bins. The image is divided into four 10 × 10 blocks, and the histogram of gradient directions of each block is computed using the gradient magnitude as the weight. Each block is thus represented by a 16-element vector, and the whole image is represented by the concatenated feature vectors of the four blocks.
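The descriptor described above can be sketched in pure Python as follows. The sketch assumes a 20 × 20 digit image (so that the four blocks are 10 × 10) and clamps pixels outside the image to the nearest edge pixel; both are assumptions, and the function names are hypothetical.

```python
import math


def sobel(img, y, x):
    """Sobel derivatives (gx, gy) at row y, column x, with edge clamping."""
    h, w = len(img), len(img[0])

    def px(j, i):
        return img[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]

    gx = (px(y - 1, x + 1) + 2 * px(y, x + 1) + px(y + 1, x + 1)
          - px(y - 1, x - 1) - 2 * px(y, x - 1) - px(y + 1, x - 1))
    gy = (px(y + 1, x - 1) + 2 * px(y + 1, x) + px(y + 1, x + 1)
          - px(y - 1, x - 1) - 2 * px(y - 1, x) - px(y - 1, x + 1))
    return gx, gy


def hog_features(img, bins=16):
    """Four blocks x 16 bins = 64 feature values for one digit image."""
    h, w = len(img), len(img[0])
    feats = []
    for by in (0, h // 2):
        for bx in (0, w // 2):
            hist = [0.0] * bins
            for y in range(by, by + h // 2):
                for x in range(bx, bx + w // 2):
                    gx, gy = sobel(img, y, x)
                    mag = math.hypot(gx, gy)
                    ang = math.atan2(gy, gx) % (2 * math.pi)
                    b = min(int(bins * ang / (2 * math.pi)), bins - 1)
                    hist[b] += mag  # the gradient magnitude is the vote weight
            feats.extend(hist)
    return feats


# Test image: dark left half, bright right half (one vertical edge).
image = [[0] * 10 + [255] * 10 for _ in range(20)]
features = hog_features(image)
print(len(features), features[0])  # 64 10200.0
```

The raw values are large (here 10200 for a single edge), which is why a normalizing transformation is applied before classification.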
The experimental results show that the method effectively eliminates background interference and extracts 64 feature values (4 blocks × 16 bins). However, the extracted feature values are too large in magnitude for good recognition, so the gradient histogram is transformed with the Hellinger kernel function.
As shown in Figure 11, after the Hellinger kernel transformation, the large feature values are mapped to small decimals while their feature information is well preserved, so the transformation can be roughly regarded as a normalization process.
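One common form of the Hellinger transformation is sketched below: the histogram is normalized to unit sum, the square root of each entry is taken, and the result is scaled to unit length. This particular form, the small constant eps, and the function name are assumptions, since the exact formula is not reproduced in this text.

```python
import math


def hellinger_transform(hist, eps=1e-7):
    """Map a non-negative histogram to small values of unit Euclidean length."""
    total = sum(hist) + eps
    rooted = [math.sqrt(v / total) for v in hist]  # L1-normalize, then square root
    norm = math.sqrt(sum(v * v for v in rooted)) + eps
    return [v / norm for v in rooted]


print(hellinger_transform([0.0, 10200.0, 40800.0]))
```

Large raw values are mapped into the range from 0 to 1 and their ordering is preserved, which matches the normalization effect described above.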
5.3. Experimental Results and Analysis
The digital meter reading function was tested by entering pictures of digital meters with different values into the system. Some of the test results are shown in Table 2.
The algorithm used in this study for digital instrument reading is compared with those of [1, 2, 5]. As can be seen from Table 3, the accuracy of the digital instrument reading method adopted in this study is 1.2 percentage points higher than that of [1], 0.3 percentage points higher than that of [2], and 0.25 percentage points higher than that of [5].
6. Conclusions
This study presents a YOLOv3-based classification and reading recognition method for industrial instruments, in which a YOLOv3 neural network model is trained to detect meters and recognize their type. For pointer meters, the Hough transform is used to locate the dial and the pointer, and the angle between the 0 scale mark and the pointer is then calculated to obtain the reading. For digital meters, the dial is located by contour finding, the digit area is located, the digits are segmented and identified using an SVM, and the reading is output.
The biggest advantage of this method is that it identifies the type of an industrial meter, classifies the meter accordingly, and calls the matching reading module to read it. The system can read both pointer meters and digital meters: the mAP of the meter type identification model is 93.73%, the maximum relative error of pointer meter reading is 0.35%, and the accuracy of digital meter reading is 99.7%.
Data Availability
The datasets analyzed during the current study are available in the paddle AI Studio repository (https://aistudio.baidu.com/aistudio/datasetdetail/157981) and public domain resources (https://aistudio.baidu.com/aistudio/datasetdetail/137555 and https://aistudio.baidu.com/aistudio/datasetdetail/124339).
Conflicts of Interest
The authors declare that they have no conflicts of interest to report regarding the present study.
Acknowledgments
This work was sponsored by the following: (1) the Training Project of Top Scientific Research Talents of Nantong Institute of Technology, under Grant no. XBJRC2021005, (2) the Science and Technology Planning Project of Nantong City, under Grant nos. JC2021132, JCZ21058, JCZ20172, JCZ20151, JCZ20148, and MS22021028, and (3) the Universities Natural Science Research Projects of Jiangsu Province, under Grant nos. 17KJB520031, 21KJD210004, 22KJB520032, 22KJD520007, 22KJD520008, and 22KJD520009.