Abstract

Image segmentation is an important image processing technique in machine vision, pedestrian detection, medical imaging, and other fields. However, traditional image segmentation techniques cannot cope with reflection and uneven illumination. This paper presents a local threshold segmentation method based on FPGA, which automatically selects the optimal threshold according to the different gray levels of the image. First, the image is processed by mean filtering to remove noise. Then, the mean value of the local neighborhood block and the Gaussian weighted sum in the local neighborhood are used to deal with reflection and uneven illumination, and the process is designed and implemented on FPGA. Finally, the designed algorithm is verified with the ModelSim simulation software and QT5. The experimental results show that the algorithm effectively handles reflection and uneven illumination on the image surface and that the segmentation quality is significantly better than that of the fixed threshold algorithm and the Otsu algorithm. The method also has reference value in medicine, agriculture, engineering, and other fields.

1. Introduction

With the development of image processing technology, digital image capture and processing are advancing to a higher level. Traditional image processing systems based on software platforms can no longer meet these needs, so new requirements have been placed on image processing systems. Because image processing algorithms are naturally parallel, field-programmable gate array (FPGA) hardware platforms have brought new vitality to image processing [1–6]. Image segmentation is a very important image processing technique, especially in the medical field. For example, lung images have been segmented into cancerous and noncancerous parts by a superpixel algorithm [7]. Image segmentation has therefore received considerable attention. Thousands of image segmentation algorithms have been proposed, and new ones continue to appear. Common methods include threshold-based, edge-based, and region-based segmentation [8–13]. In recent years, Bo et al. [14] proposed a new deformable contour model for ultrasonic image sequence segmentation, which resists the influence of misleading or weak boundaries in ultrasonic image segmentation. Hongyu et al. [15] proposed a rib segmentation framework based on unpaired sample enhancement and a multiscale network, which performs well in multiorgan overlapping regions and fuzzy regions. Zhao et al. [16] proposed an automatic segmentation method for small organs based on limited training data, which outperforms cutting-edge deep learning methods, traditional forest-based methods, and multiatlas methods in small organ segmentation. Li and Zou [17] proposed a portrait image segmentation method that combines an improved genetic algorithm with threshold segmentation, which addresses the unsatisfactory results and low accuracy of traditional algorithms on portrait images. For threshold segmentation, it remains challenging to handle images disturbed by external factors such as reflection and uneven illumination, which produce large differences in gray level across the image surface. Local threshold segmentation can be used for multitarget segmentation and local threshold selection, but the segmented targets are poorly connected and contain noise, so the method is suited to close-range recognition of the target image. Therefore, this paper proposes an FPGA-based local threshold segmentation design that uses the local neighborhood block mean and the Gaussian weighted sum in the local neighborhood to select an adaptive local threshold when the gray-level differences across the image surface are too large. The design is simulated in ModelSim, and a Visual Studio QT5 program is used for debugging and for displaying the results.

2. Local Adaptive Threshold Algorithm

The choice of threshold determines how accurately the image information is preserved. If the threshold is too high, some image details are filtered out and the edges are severely broken. If the threshold is too low, many false edges appear and cause errors in interpreting the image. Local thresholding algorithms give each local region its own optimal threshold according to local characteristics such as brightness, contrast, and texture.

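Common local threshold segmentation algorithms include the mean value of the local neighborhood block and the Gaussian weighted sum of the local neighborhood block. Based on this, the processing window is set as an odd-sized square window. Let μ be the mean value of the pixels in the processing window, σ² the pixel variance in the processing window, f(x, y) the input pixel value, and g(x, y) the output pixel value. The calculation formulas of μ, σ², and g(x, y) are as follows:

$$\mu = \frac{1}{(2r+1)^2} \sum_{i=-r}^{r} \sum_{j=-r}^{r} f(x+i,\, y+j), \tag{1}$$

$$\sigma^2 = \frac{1}{(2r+1)^2} \sum_{i=-r}^{r} \sum_{j=-r}^{r} \left[ f(x+i,\, y+j) - \mu \right]^2, \tag{2}$$

$$g(x, y) = \begin{cases} 0, & \left| f(x, y) - \mu \right| > K\sigma, \\ 255, & \text{otherwise}, \end{cases} \tag{3}$$

where “r” is the radius of the processing window and K is a constant greater than 0. When K = 1, the inequality holds whether the foreground is brighter or darker than its surroundings, which indicates that the result of segmentation is the same for a high-brightness and a low-brightness foreground. The schematic diagram of the segmentation effect is shown in Figure 1.

Before segmentation, the center of each picture represents a foreground of different brightness, and the gray area represents the background. After segmentation, black represents the foreground and white represents the background. The segmentation results for foregrounds of different brightness are consistent. Therefore, the problem caused by uneven illumination can be solved by (3). To make the FPGA code easier to understand and write, the square root and the division are removed by squaring both sides of the inequality in (3) and multiplying by (2r + 1)². With g(x, y) as the current output pixel and f(x, y) as the pixel processed by the algorithm, equation (3) is transformed into

$$g(x, y) = \begin{cases} 0, & (2r+1)^2 \left[ f(x, y) - \mu \right]^2 > K^2 \displaystyle\sum_{i=-r}^{r} \sum_{j=-r}^{r} \left[ f(x+i,\, y+j) - \mu \right]^2, \\ 255, & \text{otherwise}. \end{cases} \tag{4}$$

In this experiment, the radius “r” of the processing window is 7, and the constant K is 1. Substituting them into formula (4) gives

$$g(x, y) = \begin{cases} 0, & 225 \left[ f(x, y) - \mu \right]^2 > \displaystyle\sum_{i=-7}^{7} \sum_{j=-7}^{7} \left[ f(x+i,\, y+j) - \mu \right]^2, \\ 255, & \text{otherwise}. \end{cases} \tag{5}$$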
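The comparison described here, 225 times the squared deviation of the center pixel from the window mean against the sum of squared deviations of all pixels in the window, can be checked in software before it is committed to hardware. The following is a minimal Python sketch and not the FPGA implementation: the names `segment_pixel` and `make_window` are hypothetical, the use of exact fractions for μ is an assumption made to avoid rounding, and the output coding (0 for foreground, 255 for background) follows the black and white convention described above.

```python
from fractions import Fraction


def segment_pixel(window, k=1):
    """Classify the center pixel of a square, odd-sized window.

    Returns 0 (foreground, black) when the center deviates from the window
    mean by more than k standard deviations, otherwise 255 (background).
    """
    size = len(window)
    pixels = [p for row in window for p in row]
    n = len(pixels)                      # 225 for a 15 x 15 window
    mu = Fraction(sum(pixels), n)        # exact window mean (assumption: no rounding)
    center = window[size // 2][size // 2]
    left = n * (center - mu) ** 2        # (2r+1)^2 * (f - mu)^2
    right = k * k * sum((p - mu) ** 2 for p in pixels)
    return 0 if left > right else 255


def make_window(background, center, size=15):
    """Build a flat window whose center pixel differs from the background."""
    w = [[background] * size for _ in range(size)]
    w[size // 2][size // 2] = center
    return w


print(segment_pixel(make_window(100, 200)))  # bright spot on a darker background
print(segment_pixel(make_window(100, 0)))    # dark spot on a brighter background
print(segment_pixel(make_window(100, 100)))  # uniform region
```

In this sketch the bright spot and the dark spot are both classified as foreground, while the uniform region is classified as background, which matches the statement that the result does not depend on whether the foreground is brighter or darker than its surroundings.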

3. FPGA Implementation

3.1. Overall Design

At present, pipelined video positioning systems are based on general-purpose processors and OpenCV machine vision technology, which suffer from long response time, poor real-time performance, high cost, and insufficient flexibility, whereas FPGA offers rich logic and storage resources and inherent parallel processing. Therefore, this paper implements the local adaptive threshold on FPGA in a pipelined manner. According to equation (5), the following steps are required: (1) calculate μ in the processing window; (2) subtract μ from the center pixel of the window and square the difference; (3) multiply the result of step (2) by 225 to complete the left side of the inequality; (4) sum the squared differences between each of the 225 pixel values in the processing window and μ to complete the right side of the inequality; (5) compare the results of steps (3) and (4) to complete the local adaptive threshold segmentation; (6) align the rows and columns and complete the boundary processing. The overall design of the locally adaptive threshold segmentation on FPGA is shown in Figure 2.

The window cache module and the mean filtering module are used inside the data superposition module. The mean of the processing window and the pixels of the current window cache must be subtracted at the same time step, that is, the timing of the mean filtering module and the window cache module must be consistent, so a delay of 11 beats is inserted before the current window cache module. Since the port of the data superposition module does not accept arrays, the squared pixel data are concatenated into a single vector before being fed to the port. The squared value of the center pixel needed on the left side of the inequality does not have to be recalculated; it can be extracted directly from this vector. The multiplication circuit takes only 2 beats, while the data superposition module takes 8 beats, so a delay of 6 beats is added after the multiplication circuit. Once the timing is aligned, the two results are compared to complete the local adaptive threshold segmentation. Finally, the boundary is reset according to the valid data flag bit. Some of the main code is as follows:

    /* The squared pixel data is spliced into a vector */
    assign square[(i + 1) * 2 * BitW - 1 : i * 2 * BitW] = square_pixdiff[i];

    /* Extract the squared difference between the center pixel and the mean */
    parameter med_pix = num_all >> 1;
    assign square_med_pix = square[(med_pix + 1) * 2 * BitW - 1 : med_pix * 2 * BitW];
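The vector assembly and the extraction of the center element can be mirrored in software to check the slice indices. The sketch below is a Python model and not the authors' HDL: the constants `BIT_W = 8` and `NUM_ALL = 225` and the helper names `pack` and `extract` are assumptions for illustration. Each squared difference is given a field of 2 × BIT_W bits, with element 0 in the lowest bits.

```python
BIT_W = 8                 # assumed pixel width in bits
NUM_ALL = 225             # pixels in a 15 x 15 window
SQ_W = 2 * BIT_W          # a squared difference needs twice the pixel width
MED_PIX = NUM_ALL >> 1    # index of the center pixel (112)


def pack(squares):
    """Concatenate the squared values into one wide integer, element 0 lowest."""
    vector = 0
    for i, value in enumerate(squares):
        assert 0 <= value < (1 << SQ_W)
        vector |= value << (i * SQ_W)
    return vector


def extract(vector, index):
    """Return bits [(index + 1) * SQ_W - 1 : index * SQ_W] of the vector."""
    return (vector >> (index * SQ_W)) & ((1 << SQ_W) - 1)


squares = [(i * 7) % 65536 for i in range(NUM_ALL)]   # arbitrary test data
vector = pack(squares)
print(extract(vector, MED_PIX) == squares[MED_PIX])
```

Shifting `NUM_ALL` right by one bit gives 112, the position of the center of a 15 × 15 window in row-major order, so the slice returns the squared difference of the center pixel without a second multiplier.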

3.2. Design of the Window Cache Module

A sliding window cache is a basic operation in FPGA image processing and is well suited to real-time, efficient pipeline processing. In this paper, the sliding window is a 15 × 15 square, which requires delaying the image pixels by 14 rows and 14 columns, that is, 14 line FIFOs for the line cache and 225 registers for the column cache. As each new line of image data arrives, it is written into Line_FIFO1, which caches it. Once Line_FIFO1 holds a full row, it reads out the cached data before the next row arrives and passes it to the next line FIFO, while the new row is cached in Line_FIFO1. This continues until the whole image has been cached. The 15 × 15 window cache structure is shown in Figure 3.
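The behavior of the line cache can be illustrated with a small software model. The following Python sketch is an assumption-laden stand-in for the hardware, not a description of it: the generator name `sliding_windows` is hypothetical, a `deque` of 14 completed lines plays the role of the line FIFOs, and list slicing plays the role of the column registers. The reported coordinates are those of the bottom-right pixel of each window.

```python
from collections import deque


def sliding_windows(pixels, width, size=15):
    """Yield (row, col, window) for every full size x size window.

    `pixels` is an iterable of values in raster (row-major) order and `width`
    is the line length. Only size - 1 completed lines are stored, like the
    line FIFOs; (row, col) is the bottom-right corner of the window.
    """
    lines = deque(maxlen=size - 1)   # the size - 1 previously completed lines
    current = []                     # the line currently being received
    row = 0
    for p in pixels:
        current.append(p)
        col = len(current) - 1
        if len(lines) == size - 1 and col >= size - 1:
            rows = list(lines) + [current]
            window = [r[col - size + 1: col + 1] for r in rows]
            yield row, col, window
        if len(current) == width:    # end of line: push it into the line cache
            lines.append(current)
            current = []
            row += 1


image = list(range(16))              # a 4 x 4 test frame with values 0..15
for row, col, window in sliding_windows(image, width=4, size=3):
    print(row, col, window)
```

The small 3 × 3 case is used only to keep the printed output readable; with the default `size=15` the model stores 14 lines, as in the design described above.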

3.3. Design of the Data Superposition Module

The function of the data superposition module is to add up all the values in the processing window. This experiment uses two data superposition modules: one sums the cached pixel data in the processing window, and the other sums the squared differences between the pixels and the mean in the processing window. However, adding 225 values one after another is laborious and hurts code readability and maintainability. Therefore, this paper describes the data superposition circuit recursively. If the current number of values to be added, n, is odd, (n − 1)/2 adders and 1 register are required, and the number of values to be added in the next stage is (n + 1)/2. The bisection recursion for the number of values n_k at stage k is

$$n_{k+1} = \begin{cases} n_k / 2, & n_k \text{ even}, \\ (n_k + 1)/2, & n_k \text{ odd}. \end{cases}$$

The recursive summation block diagram is shown in Figure 4. In Figure 4, the input din_vector holds the squared differences between each of the 225 pixels in the processing window and the mean. The first_vector is the vector after the first stage of addition and serves as the input to the next stage, and the last_vector is the vector after the last stage, which completes the summation of the 225 values in this cycle.
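The halving scheme can be traced in software to count the stages and adders it needs. The Python sketch below is an illustrative model under stated assumptions, not the authors' recursive HDL: the function name `tree_sum` is hypothetical, each pass of the loop stands for one pipeline stage, and an unpaired last element is simply carried to the next stage to model the register.

```python
def tree_sum(values):
    """Sum a non-empty sequence by repeated pairwise addition.

    Returns (total, stages, adders). With an odd count the last element is
    passed through unchanged, which models the single register in that stage.
    """
    level = list(values)
    stages = 0
    adders = 0
    while len(level) > 1:
        pairs = len(level) // 2
        nxt = [level[2 * i] + level[2 * i + 1] for i in range(pairs)]
        if len(level) % 2:            # odd count: carry the leftover forward
            nxt.append(level[-1])
        adders += pairs
        stages += 1
        level = nxt
    return level[0], stages, adders


total, stages, adders = tree_sum(range(225))
print(total, stages, adders)
```

For 225 inputs the counts shrink as 225, 113, 57, 29, 15, 8, 4, 2, 1, so the model needs 8 stages, which agrees with the 8 beats quoted for the data superposition module in Section 3.1.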

3.4. Design of the Mean Filter Module

Mean filtering is a linear filtering method with a simple algorithm and strong smoothing, and it suppresses periodic interference well. Its main function is to reduce sharp changes in the image gray value and thereby reduce noise. If r is the radius of the processing window, f(x, y) the input pixel value of the current window, and g(x, y) the output pixel value of the processing window, the mean filter can be expressed by

$$g(x, y) = \frac{1}{(2r+1)^2} \sum_{i=-r}^{r} \sum_{j=-r}^{r} f(x+i,\, y+j).$$

In image filtering, when the window reaches the boundary pixels, the convolution kernel extends beyond the image area, which makes the calculation ill-defined, so the boundary must also be handled. Common boundary processing methods include filling the boundary with 0, filling the boundary with the nearest pixel values, and no boundary processing. Filling with 0 expands each boundary of the image and sets the extended pixels to 0. Filling with the nearest pixel value also expands each boundary of the image but sets the extended pixels to the values of the adjacent pixels. In this paper, only the upper boundary is treated, and the rest is left untreated. The block diagram of the mean filter module is shown in Figure 5.
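The two boundary filling methods can be compared with a short software model of the mean filter. This Python sketch makes several assumptions that are not taken from the paper: the function name `mean_filter` and the `border` parameter are hypothetical, the division is an integer floor division, and the paper's own choice of treating only the upper boundary is not modeled.

```python
def mean_filter(image, r=1, border="zero"):
    """Box filter of radius r with zero or nearest-pixel border filling.

    `image` is a list of equal-length rows of integers. The result uses
    integer floor division (an assumption about the fixed-point behavior).
    """
    h, w = len(image), len(image[0])
    n = (2 * r + 1) ** 2

    def pixel(y, x):
        if 0 <= y < h and 0 <= x < w:
            return image[y][x]
        if border == "zero":
            return 0
        # "nearest": clamp the coordinates to the closest valid pixel
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [
        [sum(pixel(y + i, x + j)
             for i in range(-r, r + 1)
             for j in range(-r, r + 1)) // n
         for x in range(w)]
        for y in range(h)
    ]


flat = [[90] * 4 for _ in range(4)]
print(mean_filter(flat, border="zero")[0])     # corners and edges are darkened
print(mean_filter(flat, border="nearest")[0])  # a flat image stays flat
```

On a flat image, zero filling pulls the border values down because the padded zeros enter the average, while nearest-pixel filling leaves the image unchanged.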

4. System Test

4.1. Experimental System Design

To verify the effectiveness of the algorithm, a QT5 program converts between image files and text files, and the designed algorithm is simulated and tested in ModelSim. Because this is a simulation test, a video clock is simulated first, then the acquisition of image data is simulated, and the image data are captured according to the clock requirements. Finally, the captured image data are passed to the designed algorithm for the simulation test. ModelSim outputs the result of the algorithm as text data, which the QT5 program converts back into image data and displays on the host computer. In addition, several commonly used global threshold segmentation algorithms are implemented in VS2015, and their results are compared with those of the algorithm in this paper. The block diagram of the experimental test system is shown in Figure 6.

4.2. Video Streaming Test

In this paper, images with a resolution of and a scanning frequency of 60 Hz are tested. One frame requires 1,050,000 clock cycles, so the total amount of data in 1 second is 63,000,000; that is, the video can be scanned only when the clock frequency is at least 63 MHz. The clock period of the video is therefore about 15.87 ns. The scan time of a single frame is the product of the clock period and the number of clock cycles, about 16,666,666 ns. The simulation diagram of the video stream is shown in Figure 7. As measured from Figure 7, the scan time of one frame is 16,743,739 ns, which is close to the calculated value and is just sufficient for the 60 Hz scanning frequency.
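The timing figures in this section follow from two inputs, the number of clock cycles per frame and the scanning frequency. The short Python sketch below reproduces the arithmetic; the variable names are chosen for illustration only, and the two constants are the values stated in the text.

```python
CYCLES_PER_FRAME = 1_050_000   # clock cycles per frame, from the text
FRAME_RATE_HZ = 60             # scanning frequency

min_clock_hz = CYCLES_PER_FRAME * FRAME_RATE_HZ   # cycles needed per second
period_ns = 1e9 / min_clock_hz                    # clock period at that rate
frame_time_ns = period_ns * CYCLES_PER_FRAME      # time to scan one frame

print(f"minimum pixel clock: {min_clock_hz / 1e6:.0f} MHz")
print(f"clock period:        {period_ns:.2f} ns")
print(f"frame time:          {frame_time_ns:,.0f} ns")
```

The frame time is one sixtieth of a second, which rounds to 16,666,667 ns and matches the value of about 16,666,666 ns quoted above.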

4.3. Video Capture Test

Before the video capture simulation, the local clock, that is, the capture clock, must be determined. Generally speaking, the local clock frequency should be greater than 1/3 of the video clock frequency, so the local clock period should be less than three times the video clock period. The video clock period calculated in the previous section is about 15.87 ns, so the local clock period must be less than 47.61 ns. For convenience of observation, the video clock period is set to 16 ns and the local clock period to 40 ns. An asynchronous FIFO is used to bridge the mismatch between the local clock and the pixel clock. The video capture simulation diagram is shown in Figure 8. The right side of Figure 8 is the image data file processed by QT5, and the left side of Figure 8 is the data captured from the video. A comparison shows that the captured data and the image data are consistent.

5. Analysis of Test Results

This paper selects four images for testing: the paper image, the carving image, the watch image, and the cup cover image, each 640 × 480 pixels in size. The experimental results are shown in Figure 9. The paper image in the first row and the carving image in the second row of Figure 9 are both affected by uneven lighting. The threshold of the fixed threshold algorithm was tuned by repeated testing, and the final threshold is 128. Because the fixed threshold algorithm and the Otsu algorithm apply a single global threshold, a large amount of noise is segmented in the bright part, resulting in blurred images, and the target in the dark part is even harder to recognize. The algorithm in this paper segments both the bright part and the dark part effectively. It not only segments the small objects in the paper image clearly and accurately but also segments the outline of the figure in the dark upper right corner of the carving image. The watch image in the third row and the cup cover image in the fourth row of Figure 9 are both affected by strong reflection. The threshold of the fixed threshold algorithm was again tuned by repeated testing, and the final threshold is 100. During segmentation, the fixed threshold algorithm and the Otsu algorithm lose the digits “5” and “6” on the dial in the reflective lower right corner of the watch image and all of the letters in the cup cover image. The proposed algorithm overcomes the interference of reflection and accurately retains the digits on the dial in the reflective lower right corner of the watch image and the letters in the cup cover image. Compared with the fixed threshold algorithm, the proposed algorithm does not require a manually chosen, experience-based threshold but sets the optimal threshold automatically according to the local brightness of the image. Compared with the Otsu algorithm, the proposed algorithm copes better with uneven illumination. However, the carving image in Figure 9 shows that although the local threshold algorithm recognizes the edge information of the image, it also introduces considerable interference. Local threshold segmentation therefore has certain limitations for foreground images.
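For readers who want to reproduce the comparison, the two global baselines can be sketched in a few lines. The Python code below is a textbook formulation and not the VS2015 implementation used in the experiments, which is not shown in the paper: the function names are hypothetical, pixels are assumed to be 8-bit integers in a flat list, and the output polarity (255 above the threshold) is an assumption.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0          # number of pixels with value <= t
    sum0 = 0        # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue                     # one class is empty: skip this t
        m0 = sum0 / w0
        m1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def global_threshold(pixels, t):
    """Binarize with one threshold for the whole image: 255 above t, else 0."""
    return [255 if p > t else 0 for p in pixels]


pixels = [10] * 50 + [200] * 50          # a synthetic two-level image
t = otsu_threshold(pixels)
print(t, global_threshold(pixels, t).count(255))
```

Both baselines apply one threshold to every pixel, which is why a brightness gradient or a reflective patch can push an entire region to the wrong side of the threshold, whereas the local method compares each pixel only with its own neighborhood.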

6. Conclusion

The main purpose of this paper is to solve the problems of reflection and uneven illumination in image processing. A local threshold segmentation algorithm based on FPGA is proposed. The algorithm adopts the mean value of local neighborhood blocks and the Gaussian weighted sum in the local neighborhood. First, mean filtering is used to remove noise from the image. Then, a local threshold segmentation algorithm is designed on FPGA to handle the interference caused by reflection and uneven illumination. Finally, the designed algorithm is verified with the ModelSim simulation software, and a fixed threshold algorithm and the Otsu algorithm are implemented in VS2015 to test the same images. A comparison of the experimental results shows that the algorithm effectively reduces the image interference caused by uneven illumination and reflection and improves the segmentation quality.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by the Jiangsu Graduate Practical Innovation Project (nos. SJCX21_1517 and SJCX22_1685), Major Project of Natural Science Research of Jiangsu Province Colleges and Universities (no. 19KJA110002), Natural Science Foundation of China under Grant no. 61673108, and Yancheng Institute of Technology High-Level Talent Research Initiation Project (no. XJR2022001).