Research Article | Open Access

Volume 2021 |Article ID 5966463 | https://doi.org/10.1155/2021/5966463

Jiulun Fan, Jipeng Yang, "Mean-Based Breakpoint Selection on Circular Histogram", Mathematical Problems in Engineering, vol. 2021, Article ID 5966463, 13 pages, 2021. https://doi.org/10.1155/2021/5966463

Mean-Based Breakpoint Selection on Circular Histogram

Revised 21 Oct 2021
Accepted 12 Nov 2021
Published 03 Dec 2021

Abstract

Circular histogram represents the statistical distribution of circular data; the H component histogram of HSI color model is a typical example of the circular histogram. When using H component to segment color image, a feasible way is to transform the circular histogram into a linear histogram, and then, the mature gray image thresholding methods are used on the linear histogram to select the threshold value. Thus, the reasonable selection of the breakpoint on circular histogram to linearize the circular histogram is the key. In this paper, based on the angles mean on circular histogram and the line mean on linear histogram, a simple breakpoint selection criterion is proposed, and the suitable range of this method is analyzed. Compared with the existing breakpoint selection criteria based on Lorenz curve and cumulative distribution entropy, the proposed method has the advantages of simple expression and less calculation and does not depend on the direction of rotation.

1. Introduction

The data obtained from actual observations can be expressed in various measurement spaces, and the angles’ space showing the angle change is one of the measurement spaces. Angles’ space data processing belongs to the branch of the discipline of statistics: direction (circular) statistics [1, 2]. Angle-based data are called direction data, and angles are commonly expressed as unit vectors. Different from the measurement based on the scale, the direction data have inherent periodic (cyclic) characteristics, which make the direction data have many unique and novel characteristics in modeling and statistical processing.

The data that show the angle change of a single variable are called circular data [2], and one way to display them is as points on the unit circle or as unit vectors in the plane. A typical example of circular data is the H component in the HSI color model of a color image [3]. The HSI color model is a mathematical image model proposed by the American colorist H. A. Munsell in 1915; it uses H (hue), S (saturation), and I (intensity) to describe color characteristics. The HSI color model is different from the commonly used RGB color model. The three components R (red), G (green), and B (blue) of the RGB color model are linearly dependent, but the three components H, S, and I of the HSI color model are linearly independent. Since the HSI color model represents colors in a way that is close to human perception, color image segmentation in HSI color space has achieved good results [4–10].

As a typical example of circular data, hue (H) represents the basic colors of the image; because of its periodicity, it can be expressed as a circular histogram. At first, some scholars used the hue histogram to segment the color image without considering its periodicity [11, 12]. Tseng et al. [13] were the first to propose a thresholding method on the circular histogram for color image segmentation so that the periodicity of the hue is not lost. Wu et al. [14] gave an iterative Otsu algorithm based on the circular histogram, but this method is not optimal and its convergence is not guaranteed. Dimov et al. [15] gave a method for optimal thresholding and multithresholding of the circular histogram through a symmetry constraint on threshold point pairs, but the calculation is very complicated. Utilizing the cyclic characteristic of a circular histogram, Lai and Rosin [16] showed theoretically that, when the circular histogram is expanded into a linear histogram and the Otsu method is adopted, only half of the points on the circle need to be searched to obtain the optimal threshold point pair, which reduces the time complexity from $O(N^2)$ to $O(N)$. However, this method is not general and can only be applied to two-class thresholding.

Lai and Rosin’s research [16] shows that a feasible approach is to first expand the circular histogram into a linear histogram and then use mature linear histogram thresholding methods (such as Otsu’s [17], fuzzy entropy [18], and context-sensitive [19] thresholding techniques) to obtain the threshold of the circular histogram. Choosing a suitable breakpoint at which to linearize the circular histogram therefore becomes the key step. For this reason, we have previously proposed two breakpoint selection criteria. One is the criterion based on the Lorenz curve [20]; we discussed the relation between the area difference and the expansion direction and gave the optimal breakpoint selection criterion in the anticlockwise or clockwise direction. The other is the criterion based on the cumulative distribution entropy [21]; we built a circular histogram expansion model based on the cumulative distribution entropy and discussed the optimal breakpoint selection criteria under different expansion directions. These two circular histogram expansion methods overcome the randomness of breakpoint selection. However, the computational complexity of the Lorenz curve and the cumulative distribution entropy is relatively high, so selecting the optimal breakpoint takes considerable time.

Circular statistics, as a particular branch of statistics, generally deals with data composed of angles or directions. Because of the obvious periodicity of such data, it is necessary to distinguish between circular data and linear data. In circular statistics, the angle mean represents the average angle of a set of data on a circle; it is a circular statistical invariant of the circular histogram, that is, it does not change its position relative to the data when the circular histogram is rotated. By contrast, the line mean represents the average of a linear set of data and is a linear statistical invariant of the linear histogram. Since the angle mean is a circular statistical invariant and the line mean is a linear statistical invariant, this paper proposes a simple breakpoint selection criterion that minimizes the distance between the angle mean on the circle and the point on the circle corresponding to the line mean of the expanded linear histogram. The proposed criterion can quickly and reasonably find the breakpoint that keeps the distribution unchanged after the circular histogram is expanded.

This paper proposes a fast method for breakpoint selection on the circular histogram, which addresses the low efficiency of expanding a circular histogram into a linear histogram. It is organized as follows. Section 2 describes the angle mean of the circular histogram and the line mean of the linear histogram and gives the optimal breakpoint selection criterion. In Section 3, the suitable range of the proposed method is given by comparing it with the Lorenz curve-based and cumulative distribution entropy-based breakpoint selection criteria. Section 4 summarizes the paper.

2. Criteria for Selection of Breakpoint in Circular Histogram

In this section, we use the H component histogram of the HSI color model to explain the circular histogram. Figure 1(a) shows the H component diagram in the HSI color model. The H component represents the periodic change of color in the anticlockwise direction. For example, red is $0$, green is $2\pi/3$, and blue is $4\pi/3$. Taking into account the periodic changes of the H component, a circular histogram is used to represent the statistical distribution of the H component (Figure 1(b)).

When we use H component to realize color image segmentation, a feasible approach is to transform the circular histogram into a linear histogram, and then, we use the threshold segmentation methods on the linear histogram to select the threshold. The distribution information carried by different linearized histograms produced by the same circular histogram at different cutting points may be different. Figure 2 shows the result of the circular histogram (Figure 1(b)) expanded at two different points. Although they are derived from the same circular distribution, their linearized distributions are not similar. In order to keep the distribution of the linearized histogram as consistent as possible with the distribution of the circular histogram, a new breakpoint selection method is given below.
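The effect of the cutting point can be illustrated with a small sketch. The function name `unroll` and the 8-bin toy histogram are hypothetical and chosen only for illustration; the sketch assumes that cutting at bin `k` and reading anticlockwise places circular bin `(k + j) mod L` at linear position `j`.

```python
import numpy as np

def unroll(hist, k):
    """Cut the circular histogram at bin k and read it anticlockwise.

    Position j of the result holds the value of circular bin (k + j) mod L.
    """
    hist = np.asarray(hist)
    return np.roll(hist, -k)

# Toy 8-bin circular histogram with a single cluster that wraps around bin 0.
circ = np.array([5, 3, 0, 0, 0, 0, 2, 4])

print(unroll(circ, 0))  # the cluster is split across both ends of the line
print(unroll(circ, 4))  # the cluster stays in one piece in the middle
```

Both outputs contain exactly the same counts, yet only the second one looks like the single cluster that is present on the circle.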

2.1. Angles Mean and Linearized Mean of Circular Histogram

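For a circular histogram with $L$ points $i \in \{0, 1, \ldots, L-1\}$, $h_i$ is the frequency of point $i$ on the circle, normalized so that $\sum_{i=0}^{L-1} h_i = 1$, and $\theta_i = 2\pi i / L$ is the corresponding angle. The trigonometric moments on the circular histogram are

$$C = \sum_{i=0}^{L-1} h_i \cos\theta_i, \qquad S = \sum_{i=0}^{L-1} h_i \sin\theta_i. \tag{1}$$

The angles mean on the circular histogram is defined as

$$\bar{\theta} = \begin{cases} \arctan(S/C), & C > 0,\ S \ge 0, \\ \pi/2, & C = 0,\ S > 0, \\ \arctan(S/C) + \pi, & C < 0, \\ \arctan(S/C) + 2\pi, & C > 0,\ S < 0, \\ 3\pi/2, & C = 0,\ S < 0, \end{cases} \tag{2}$$

where $\bar{\theta} \in [0, 2\pi)$; the angles mean is undefined when $C = S = 0$.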
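A minimal sketch of the angles mean follows. It assumes that bin $i$ of an $L$-bin histogram sits at angle $2\pi i / L$, and it uses `np.arctan2` in place of the quadrant case analysis; the function name `angles_mean` and the two-bin example are illustrative choices, not taken from the paper.

```python
import numpy as np

def angles_mean(hist):
    """Mean direction, in radians within [0, 2*pi), of a circular histogram."""
    hist = np.asarray(hist, dtype=float)
    L = hist.size
    theta = 2.0 * np.pi * np.arange(L) / L   # assumed angle of each bin
    C = np.sum(hist * np.cos(theta))         # cosine moment
    S = np.sum(hist * np.sin(theta))         # sine moment
    # arctan2 resolves the quadrant; the modulo maps (-pi, pi] onto [0, 2*pi).
    return np.arctan2(S, C) % (2.0 * np.pi)

# Two equal spikes at 350 and 30 degrees: the cluster straddles 0 degrees.
hist = np.zeros(360)
hist[[350, 30]] = 1.0
print(np.degrees(angles_mean(hist)))  # close to 10, not the naive index mean 190
```

Rotating the histogram by some number of bins rotates the result by the same amount, which is the invariance the criterion relies on.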

The average direction given in definition (2) is a statistic that describes the position state characteristics of the circular histogram. It does not depend on the starting point and the rotation direction, reflecting the center of the circular histogram [1, 2]. The red line in Figure 3 represents the angles mean of the circular histogram. To show more clearly, Figure 3 uses the rose diagram to illustrate the circular histogram.

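Suppose the circular histogram (Figure 1(b)) is expanded into a linear histogram (Figure 4) in the anticlockwise direction at the breakpoint $k$, where $k \in \{0, 1, \ldots, L-1\}$, so that position $j$ of the linear histogram holds the frequency $h_{(k+j) \bmod L}$ of the circular histogram. The line mean of the linear histogram can be obtained as

$$m_k = \sum_{j=0}^{L-1} j \, h_{(k+j) \bmod L}. \tag{3}$$

The corresponding point of the line mean on the circular histogram is formulated as

$$\theta_{m_k} = \frac{2\pi}{L} \bigl( (k + m_k) \bmod L \bigr). \tag{4}$$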
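The line mean and its corresponding point on the circle can be sketched as below. The sketch assumes that cutting at breakpoint `k` places circular bin `(k + j) mod L` at linear position `j`, and that the line mean maps back to the circle as `(k + line mean) mod L`; the function names are hypothetical.

```python
import numpy as np

def line_mean(hist, k):
    """Mean position (in bins) of the histogram unrolled anticlockwise at k."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()              # normalise counts to probabilities
    linear = np.roll(p, -k)      # position j holds circular bin (k + j) mod L
    return float(np.sum(np.arange(p.size) * linear))

def line_mean_on_circle(hist, k):
    """Circular bin position that the line mean maps back to."""
    L = len(hist)
    return (k + line_mean(hist, k)) % L

# Two equal spikes at bins 350 and 30 (a cluster centred on bin 10).
hist = np.zeros(360)
hist[[350, 30]] = 1.0
print(line_mean_on_circle(hist, 0))    # 190.0: the cut at 0 splits the cluster
print(line_mean_on_circle(hist, 180))  # 10.0: the cluster stays in one piece
```

In this toy case the circular mean sits at bin 10, so the cut at 180 reproduces it while the cut at 0 lands on the opposite side of the circle.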

2.2. Breakpoint Selection Criteria

The goal of linearizing the circular histogram is to maintain the complete original distribution. To find the optimal breakpoint, note that the angles mean $\bar{\theta}$ is a circular invariant of the circular histogram and the line mean $m_k$ is a linear invariant of the linear histogram expanded at the breakpoint $k$. It is hoped that the point $\theta_{m_k}$ on the circle corresponding to $m_k$ and the angles mean $\bar{\theta}$ are as close as possible, so that the linear histogram expanded at the breakpoint $k$ retains more of the original information of the circular histogram distribution.

$\bar{\theta}$ and $\theta_{m_k}$ are points on the circle. Because of periodicity, the distance between them is different from the Euclidean distance, and more attention is paid to the difference in the direction of the two values. The cosine of the angle between them can be used to measure this difference in direction. The distance between $\bar{\theta}$ and $\theta_{m_k}$ can therefore be measured by the cosine of the angle between them [1, 2] and expressed as

$$d(\bar{\theta}, \theta_{m_k}) = 1 - \cos(\bar{\theta} - \theta_{m_k}). \tag{5}$$

The value of $d$ depends only on the angles $\bar{\theta}$ and $\theta_{m_k}$, and $d \in [0, 2]$. When the two angles are the same, $d = 0$; when the directions of the two angles are opposite, $d = 2$.

The mean-based selection criterion for the optimal breakpoint is

$$k^{*} = \arg\min_{0 \le k \le L-1} d(\bar{\theta}, \theta_{m_k}). \tag{6}$$

Obviously, when the circular histogram expands in the clockwise direction, the corresponding value of the line mean on the circular histogram is the same as the value obtained in equation (4). Therefore, the method in this paper is unrelated to the expansion direction of the circular histogram.
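The direction independence can be checked with a short derivation. The notation here is an assumption made for illustration: $h_i$ are normalized frequencies with $\sum_i h_i = 1$, the anticlockwise expansion at breakpoint $k$ reads the bins $k, k+1, \ldots$, and the clockwise expansion across the same cut reads the bins $k-1, k-2, \ldots$, so that its position $j$ holds $h_{(k-1-j) \bmod L}$. Writing $m_k$ and $m'_k$ for the two line means and substituting $j' = L-1-j$:

```latex
\begin{aligned}
m_k  &= \sum_{j=0}^{L-1} j \, h_{(k+j) \bmod L}, \\
m'_k &= \sum_{j=0}^{L-1} j \, h_{(k-1-j) \bmod L}
      = \sum_{j'=0}^{L-1} (L-1-j') \, h_{(k+j') \bmod L}
      = (L-1) - m_k, \\
(k - 1 - m'_k) \bmod L &= (k + m_k - L) \bmod L = (k + m_k) \bmod L .
\end{aligned}
```

Under these assumptions the clockwise line mean, mapped back to the circle by stepping $m'_k$ positions clockwise from bin $k-1$, lands on the same circular position as the anticlockwise one, so the cosine distance to the angles mean is identical for both directions.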

It is important to emphasize that the idea of mean-based breakpoint selection criterion is different from the existing Lorenz curve-based and cumulative distribution entropy-based breakpoint selection criteria [20, 21]. The mean-based method uses the invariants of circular statistics and linear statistics. Lorenz curve-based and cumulative distribution entropy-based methods, using the cumulative distribution information of each linearized histogram, are related to the counterclockwise or clockwise direction of rotation.

The algorithm of breakpoint selection on circular histogram is very simple and easy to implement. The algorithm of breakpoint selection with mean is illustrated in Algorithm 1.

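Algorithm 1: Breakpoint selection with mean.
Input: H-histogram Hist, hue magnitude $L$
Output: The optimal breakpoint $k^{*}$
$k^{*} \leftarrow 0$
Calculate the distance $d_{\min}$ with Hist according to equation (5)
for $k = 1$ to $L - 1$ do
    Rotate the histogram to the right or left by $k$
    Calculate the distance $d$ with Hist according to equation (5)
    if $d < d_{\min}$ then
        $d_{\min} \leftarrow d$, $k^{*} \leftarrow k$
    end if
end for
return $k^{*}$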
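The procedure can be sketched in Python as follows. This is an illustrative reading of the algorithm rather than the authors' code: it assumes bin $i$ lies at angle $2\pi i / L$, that the histogram is unrolled anticlockwise at each candidate breakpoint, and that the distance is one minus the cosine of the angle between the angles mean and the line mean mapped back to the circle. The name `select_breakpoint` is hypothetical.

```python
import numpy as np

def select_breakpoint(hist):
    """Return (k, d): the breakpoint with the smallest cosine distance d."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()                          # normalise to probabilities
    L = p.size
    theta = 2.0 * np.pi * np.arange(L) / L   # assumed angle of each bin
    # Angles mean of the circular histogram; it does not depend on k.
    mean_angle = np.arctan2(np.sum(p * np.sin(theta)),
                            np.sum(p * np.cos(theta)))

    positions = np.arange(L)
    best_k, best_d = 0, np.inf
    for k in range(L):
        m_k = np.sum(positions * np.roll(p, -k))         # line mean after cutting at k
        line_angle = 2.0 * np.pi * ((k + m_k) % L) / L   # its point on the circle
        d = 1.0 - np.cos(mean_angle - line_angle)        # cosine distance
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d

# Toy histogram: two equal spikes at bins 350 and 30 (cluster wraps around 0).
hist = np.zeros(360)
hist[[350, 30]] = 1.0
k, d = select_breakpoint(hist)
print(k, d)
```

For a single compact cluster such as this one, every cut that does not split the cluster gives essentially the same distance, so which of those breakpoints is returned depends on scan order and floating-point rounding. The loop costs $O(L^2)$ operations as written, which is small for $L = 360$.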

3. Experiment Results and Analysis

The experiment is divided into two parts to evaluate the proposed method. The experiments are performed using Python 3.8 on a PC with an Intel Core 2.50 GHz CPU and 8 GB RAM, under the Windows 10 operating system. Among circular models, the von Mises distribution (also known as the circular normal distribution) is the most important distribution; its status is equivalent to that of the normal distribution among linear distributions. Many theories and applications in circular statistics are discussed for the von Mises distribution [1, 2]. Therefore, the first part shows the results of selecting the breakpoint for different types of artificial von Mises distributions and discusses the influence of the parameters $\mu$ (the mean direction) and $\kappa$ (the concentration parameter) of the bimodal von Mises distribution [1, 2] on the proposed method. In the second part, the proposed mean-based breakpoint selection criterion is compared with the existing breakpoint selection criteria, including the methods based on the Lorenz curve [20] (Lorenz-based), cumulative distribution entropy [21] (CDFE-based), and artificial bee colony [22] (ABC-based), on the H component circular histograms of 8 images from the Berkeley dataset. For convenience, the number of quantization levels of the H component in the experiments is 360.

3.1. Artificial Circular Histograms

We assume that the target and the background in the circular histogram each follow an ideal von Mises distribution. The linearization effect on bimodal mixture circular histograms composed with different parameters $\mu$ and $\kappa$ is analyzed.
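An artificial bimodal circular histogram of this kind can be generated as sketched below. The equal mixture weight, the helper names, and the parameter values in the usage line are assumptions made for illustration and are not the settings used in the experiments; `np.i0` is NumPy's modified Bessel function of the first kind of order zero, which normalizes the von Mises density.

```python
import numpy as np

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density with mean direction mu and concentration kappa."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))

def mixture_histogram(mu1, kappa1, mu2, kappa2, weight=0.5, L=360):
    """Discretise a two-component von Mises mixture into L circular bins."""
    theta = 2.0 * np.pi * np.arange(L) / L
    density = (weight * von_mises_pdf(theta, mu1, kappa1)
               + (1.0 - weight) * von_mises_pdf(theta, mu2, kappa2))
    return density / density.sum()   # normalise bin heights to sum to 1

# Hypothetical parameters: modes at 60 and 150 degrees, both with kappa = 10.
hist = mixture_histogram(np.radians(60), 10.0, np.radians(150), 10.0)
print(hist.shape, hist.sum())
```

Varying the gap between `mu1` and `mu2` and the pair of concentration values gives a family of test histograms for studying how a breakpoint criterion behaves.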

In Figure 5, the circular histograms (a), (b), and (c) are each a mixture of two von Mises distributions with different parameters $(\mu, \kappa)$. The linear histograms (d)–(f) show the linearization results of Figures 5(a)–5(c) with the mean-based method, respectively.

As demonstrated in Figures 5(d)–5(f), the expansion effect in Figures 5(d) and 5(e) is better than that in Figure 5(f). The linearized histograms in Figures 5(d) and 5(e) maintain the original distribution, whereas the linearized histogram in Figure 5(f) fails to retain the original distribution completely.

To illustrate more specifically the linearization effect of the mean-based method on mixture distributions with the same concentration parameter $\kappa$, Figure 6 shows the relation between the percentage of the broken distribution (see the red box in Figure 5(f)) and the mean direction difference $|\mu_1 - \mu_2|$. Because of the symmetry of the circle, the mean direction difference is only taken from $0^\circ$ to $180^\circ$; the remaining range is equivalent to its symmetric counterpart. As the mean direction difference approaches $180^\circ$, the proportion of the broken distribution increases suddenly, but the maximum does not exceed 0.5%. When the mean direction difference is less than $150^\circ$, the linearization effect is similar and good.

Similarly, Figures 7–9 show the relation between the percentage of the broken distribution and the mean direction difference when the concentration parameters $(\kappa_1, \kappa_2)$ are (5, 10), (10, 15), and (10, 20), respectively.

The maximum percentage of broken distribution is 20% in Figure 7, and it is 14% in Figure 8, which shows the lower the overall concentration parameter, the worse the overall linearization effect.

In Figure 9, the maximum percentage of broken distribution is 16%. Figures 8 and 9 show the small difference in the concentration parameter of the two distributions is conducive to the linearization of the circular histogram.

In Figures 7–9, the percentage of broken distribution for the different concentration parameters is positively related to the mean direction difference, and it increases exponentially around $180^\circ$. This exponential increase greatly reduces the linearization effect near $180^\circ$.

In summary, the results of Figures 6–9 show that the mean-based method is suitable for situations where the target is not far from the background in the circular histogram. The distance between the mean directions should generally not exceed about $150^\circ$.

3.2. Real Circular Histograms

To further illustrate the scope and effect of the mean-based breaking method, the H component circular histograms of 8 color images in the Berkeley dataset are selected for breakpoint selection, and the results are compared with the Lorenz-based [20], CDFE-based [21], and ABC-based [22] breakpoint selection criteria. The linearization results of the 8 images can be seen in Figures 10–17.

The variance and kurtosis have also been computed to compare the effects of the 4 algorithms more fully. Tables 1 and 2 list the variance and kurtosis obtained with the existing Lorenz-based, CDFE-based, and ABC-based techniques and the proposed mean-based technique. Equation (7) defines the variance. Variance represents the dispersion of the data distribution: when the data distribution is relatively scattered, the variance is large, and when the data distribution is relatively concentrated, the variance is small. Equation (8) defines the kurtosis. The kurtosis is never lower than 1 and never higher than the number of data. The greater the kurtosis, the steeper the distribution:

$$\sigma^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 \, p_i, \tag{7}$$

$$K = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4 \, p_i}{\sigma^4}, \tag{8}$$

where $\sigma^2$ is the variance, $K$ is the kurtosis, $x_i$ is the $i$-th value of the variable $x$, $\bar{x}$ is the average of the variable $x$, $n$ is the number of values of the variable $x$, and $p_i$ is the probability of the value $x_i$.
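The two metrics can be sketched for a linearized histogram as follows. The sketch assumes that the variable takes the bin positions $0, 1, \ldots, n-1$, that the probabilities are the normalized bin heights, and that the kurtosis is the non-excess form (fourth central moment divided by the squared variance), which is consistent with the stated lower limit of 1. The function name and the toy histograms are illustrative.

```python
import numpy as np

def variance_and_kurtosis(linear_hist):
    """Variance and (non-excess) kurtosis of a linearized histogram."""
    p = np.asarray(linear_hist, dtype=float)
    p = p / p.sum()                         # probabilities of each bin position
    x = np.arange(p.size)                   # assumed variable values: bin positions
    mean = np.sum(x * p)
    var = np.sum((x - mean) ** 2 * p)
    kurt = np.sum((x - mean) ** 4 * p) / var ** 2
    return var, kurt

# Same toy circular histogram unrolled at two breakpoints.
split = np.array([5, 3, 0, 0, 0, 0, 2, 4])   # cluster cut in two
whole = np.array([0, 0, 2, 4, 5, 3, 0, 0])   # cluster kept in one piece
print(variance_and_kurtosis(split))
print(variance_and_kurtosis(whole))
```

In this toy case the cut that keeps the cluster in one piece gives the smaller variance and the larger kurtosis, which matches the reading of the two metrics used in the comparison.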

Table 1: Variance of the linearized histograms obtained by the four methods.

| Image number | ABC-based | Lorenz-based | CDFE-based | Proposed |
|---|---|---|---|---|
| 10 | 2895.8717 | 1382.4085 | 1001.8067 | 954.1594 |
| 11 | 344.7057 | 3425.8995 | 586.0825 | 343.2546 |
| 12 | 858.3888 | 1616.9806 | 857.7780 | 831.1879 |
| 13 | 19144.1159 | 2243.3291 | 1294.3982 | 1048.8016 |
| 14 | 5758.7597 | 5001.3854 | 3719.9960 | 4699.2499 |
| 15 | 19389.5913 | 4873.6870 | 2155.0256 | 2146.6792 |
| 16 | 4207.6051 | 6436.7250 | 4207.6051 | 3940.4131 |
| 17 | 6837.8419 | 8879.5251 | 6879.6104 | 6836.8612 |
| Average | 7429.6100 | 4232.4925 | 2587.7878 | 2600.0759 |
Bold values indicate the best experimental results.
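Table 2: Kurtosis of the linearized histograms obtained by the four methods.

| Image number | ABC-based | Lorenz-based | CDFE-based | Proposed |
|---|---|---|---|---|
| 10 | 18.0893 | 30.0191 | 20.8513 | 30.4270 |
| 11 | 16.4595 | 23.6096 | 33.9834 | 34.8668 |
| 12 | 3.4539 | 20.4018 | 3.3559 | 22.0457 |
| 13 | 1.5940 | 13.6122 | 1.7440 | 14.7389 |
| 14 | 4.2556 | 7.0680 | 4.3209 | 6.1891 |
| 15 | 1.1194 | 1.8224 | 1.8338 | 1.8579 |
| 16 | 4.3734 | 4.7211 | 4.3734 | 4.8500 |
| 17 | 2.1337 | 2.7392 | 2.4110 | 2.8241 |
| Average | 6.4349 | 12.9992 | 9.1092 | 14.7249 |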
Bold values indicate the best experimental results.

As we can see from Figures 10 and 11, when the distribution is unimodal (bimodal coincidence), the breaking results of Lorenz-based, CDFE-based, ABC-based, and mean-based methods are appropriate, which guarantees the integrity of the distribution.

It can be seen from the H component histograms in Figures 12 and 13 that the distance between the centers of the two distributions of the circular histogram is small. In terms of maintaining the integrity of the circular distribution, the CDFE-based method and the mean-based method show the better effect. The variance and kurtosis of the mean-based method are the best in Tables 1 and 2.

Comparing the results of Figures 14(c)–14(f), the CDFE-based method shows the best effect, followed by the Lorenz-based method. The distance between the centers of the two distributions of the circular histogram (Figure 14(b)) is about $160^\circ$. The mean-based result in Figure 14(f) shows that the small distribution is broken by the breakpoint. This is consistent with the conclusion obtained from the artificial circular histogram analysis: when the centers of the two distributions are far apart, the linearization effect of the mean-based method deteriorates.

From Figure 15, we can see the linearization effect of the CDFE-based and mean-based methods is better than that of the Lorenz-based and ABC-based methods. The linearization result using the Lorenz-based method does not maintain the integrity of the circular distribution; a small part becomes the right part of the linearized histogram. The linearization result using the ABC-based method completely destroys the distribution.

For the complex distributions in Figures 16 and 17, most color types appear, with different frequencies. It can be seen from Tables 1 and 2 that the mean-based method has a slight advantage over the ABC-based, CDFE-based, and Lorenz-based methods in variance and kurtosis.

On the whole, judging from the linearization results of the 8 circular histograms, the mean-based method is superior to the CDFE-based, Lorenz-based, and ABC-based methods when the difference between the center positions of the target and the background is not particularly large. The mean-based method is effective in suitable scenarios. Judging from the averages over the 8 images in Tables 1 and 2, the mean-based method has the highest average kurtosis (14.7249), and its average variance (2600.0759) is within 0.5% of the lowest value (2587.7878, CDFE-based).

Table 3 shows the time spent on the linearization of the above 8 images by the Lorenz-based, CDFE-based, ABC-based, and mean-based methods, respectively. The mean-based method has a great advantage in speed. Compared with the Lorenz-based method, it is about 7 times faster on average (1.4130 versus 0.1976). Compared with the CDFE-based and ABC-based methods, the improvement is even greater: on average it is about 72 and 66 times faster, respectively.

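Table 3: Time spent on linearization by the four methods.

| Image number | ABC-based | Lorenz-based | CDFE-based | Proposed |
|---|---|---|---|---|
| 10 | 13.7141 | 1.4599 | 13.4678 | 0.1849 |
| 11 | 13.0234 | 1.4141 | 14.2149 | 0.1893 |
| 12 | 12.4212 | 1.2345 | 14.2315 | 0.2473 |
| 13 | 12.9586 | 1.4671 | 14.5628 | 0.2004 |
| 14 | 13.5517 | 1.4209 | 14.7226 | 0.2160 |
| 15 | 13.0452 | 1.3908 | 14.3101 | 0.1804 |
| 16 | 13.2638 | 1.5045 | 14.1923 | 0.1702 |
| 17 | 13.0594 | 1.4125 | 14.1845 | 0.1925 |
| Average | 13.1297 | 1.4130 | 14.2358 | 0.1976 |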

4. Conclusions

For the linearization of circular histograms, we propose a new method to select the breakpoint. The new method uses a simple mean operation to give the optimal breakpoint selection criterion. We discuss the applicable scenarios of this breakpoint selection criterion. Experiments show that the new method can guarantee the linearization effect of the circular histogram in suitable scenarios, reduce the computational complexity of breakpoint selection, and provide a better way to linearize the circular histogram. In the future, we will build on the mean-based method in this paper to explore a new breakpoint selection method that has a better linearization effect when the centers of the two distributions are far apart.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 62071378, 62071379, 62071380, and 61901365) and “New Star Team of Xi’an University of Posts and Telecommunications” (no. xyt2016-01).

References

1. K. V. Mardia and P. E. Jupp, Directional Statistics, Wiley, Hoboken, NJ, USA, 2000.
2. S. R. Jammalamadaka and A. Sengupta, Topics in Circular Statistics, World Scientific, Singapore, 2001.
3. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall, Hoboken, NJ, USA, 2002.
4. F. Garcia-Lamont, J. Cervantes, A. López, and L. Rodriguez, “Segmentation of images by color features: a survey,” Neurocomputing, vol. 292, no. 1, pp. 1–27, 2018.
5. S. Ito, M. Yoshioka, S. Omatu, K. Kita, and K. Kugo, “An image segmentation method using histograms and the human characteristics of HSI color space for a scene image,” Artificial Life and Robotics, vol. 10, no. 1, pp. 6–10, 2006.
6. C. Zhang and P. Wang, “A new method of color image segmentation based on intensity and Hue clustering,” in Proceedings of the 15th International Conference on Pattern Recognition (ICPR-2000), pp. 613–616, Barcelona, Spain, September 2000.
7. J. Du, Y. Lu, M. Zhu, K. Zhang, and C. Ding, “A novel algorithm of color tongue image segmentation based on HSI,” in Proceedings of the 2008 International Conference on BioMedical Engineering and Informatics, pp. 733–737, Sanya, China, May 2008.
8. J. Duan and L. Yu, “A WBC segmentation method based on HSI color space,” in Proceedings of the 4th IEEE International Conference on Broadband Network and Multimedia Technology, pp. 629–632, Shenzhen, China, October 2011.
9. N. H. Harun, M. Y. Mashor, N. R. Mokhtar et al., “Comparison of acute leukemia image segmentation using HSI and RGB color space,” in Proceedings of the 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010), pp. 749–752, Kuala Lumpur, Malaysia, May 2010.
10. G. F. Yang, J. Fan, and D. Wang, “Recursive algorithms of maximum entropy thresholding on circular histogram,” Mathematical Problems in Engineering, vol. 2021, p. 6653031, 2021.
11. S. Tominaga, “Expansion of color images using three perceptual attributes,” Pattern Recognition Letters, vol. 6, no. 1, pp. 77–85, 1987.
12. M. Celenk, “A color clustering technique for image segmentation,” Computer Vision, Graphics, and Image Processing, vol. 52, no. 2, pp. 145–170, 1990.
13. D. C. Tseng, Y. F. Li, and C. T. Tung, “Circular histogram thresholding for color image segmentation,” in Proceedings of the 3rd International Conference on Document Analysis and Recognition, pp. 673–676, Montreal, Canada, August 1995.
14. J. Wu, P. Zeng, Y. Zhou, and C. Olivier, “A novel colour image segmentation method and its application to white blood cell image analysis,” in Proceedings of the 3rd International Conference on Signal Processing, pp. 16–20, Beijing, China, November 2006.
15. L. Dimov, “Cyclic histogram thresholding and multithresholding,” International Conference on Computer Systems and Technologies, vol. 2, no. 5, pp. 1–8, 2009.
16. Y. K. Lai and P. L. Rosin, “Efficient circular thresholding,” IEEE Transactions on Image Processing, vol. 23, no. 3, pp. 992–1001, 2014.
17. N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
18. L.-K. Huang and M.-J. Wang, “Image thresholding by minimizing the measures of fuzziness,” Pattern Recognition, vol. 28, no. 1, pp. 41–51, 1995.
19. A. Singla and S. Patra, “A context sensitive thresholding technique for Automatic image segmentation,” in Proceedings of the ICCIDM 2014, pp. 19–25, VSSUT, Odhisa, India, August 2014.
20. C. Kang, C. Wu, and J. Fan, “Lorenz Curve-Based entropy thresholding on circular histogram,” IEEE Access, vol. 8, pp. 17025–17038, 2020.
21. C. Kang, C. Wu, and J. Fan, “Entropy-based circular histogram thresholding for color image segmentation,” Signal Image and Video Processing, vol. 15, no. 41, pp. 129–138, 2020.
22. A. S. Kirti and A. Singla, “Context-sensitive thresholding technique using ABC for Aerial images,” Advances in Intelligent Systems and Computing, vol. 898, pp. 85–93, 2019.

Copyright © 2021 Jiulun Fan and Jipeng Yang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.