Computational Intelligence in Image Processing 2014 (Special Issue)
Sealing Clay Text Segmentation Based on Radon-Like Features and Adaptive Enhancement Filters
Text extraction is a key issue in sealing clay research. The traditional method, based on ink rubbings, increases the risk of damaging the sealing clay and is unfavorable to its protection. This paper therefore proposes a new text segmentation method that works on digital images of sealing clay and is based on Radon-like features and adaptive enhancement filters. First, an adaptive enhancement LM filter bank is used to obtain the maximum energy image; second, the edge image of the maximum energy image is calculated; finally, Radon-like feature images are generated by combining the maximum energy image and its edge image. The average of the Radon-like feature images is then segmented by image thresholding. Experimental results show that, compared with 2D Otsu, GA, and FastFCM, this method performs better in terms of the accuracy and completeness of the extracted text.
1. Introduction
In recent years, with the development of computer, network, and multimedia technology, the digital preservation of cultural relics has broadened in scope: from the initial real-time recording and permanent preservation to multiple levels such as preservation studies, dissemination, and utilization. Digital preservation aims to protect the relics themselves while their historical, artistic, and scientific value is explored and used. Among cultural relics, sealing clay plays an important role in revising and supplementing ancient documents that record official systems and geographic data. As a result, text extraction and identification have become key issues in sealing clay research. As digital preservation of cultural relics advances in China, more and more sealing clays are presented and archived as images. Image-based sealing clay research can be conducted without direct contact with the relics, which balances their preservation against their utilization. Image processing and analysis make many kinds of research possible. Among them, text extraction is the most significant, because it affects the subsequent steps, such as feature quantification and text recognition.
The traditional way to extract sealing clay text is to identify it from an ink rubbing. The rubbing is made on a sheet of paper pressed against the sealing clay, so the object is handled directly, which increases the risk of damage. For seriously incomplete or damaged sealing clay in poor condition, ink rubbings are especially inadvisable from the perspective of cultural relic preservation. Taking the sealing clay image as the research object instead allows the text to be extracted directly from the image, without making an ink rubbing. Since the sealing clay surface is simple, consisting mainly of background and text, thresholding segmentation is applicable. Current thresholding segmentation methods include histogram-shape-based methods, such as peak-valley thresholding and shape modeling thresholding; cluster-based methods, such as minimum error thresholding and fuzzy clustering thresholding; entropy-based methods, such as cross entropy thresholding and fuzzy entropy thresholding; and similarity-based methods, such as fuzzy similarity thresholding and semantic thresholding. Each of these methods works well on the kinds of images it suits. Nevertheless, systematic studies on the segmentation of sealing clay images are rare. Thus, building on previous work in image segmentation and responding to the pressing needs of sealing clay research and preservation, the present study puts forward a new text segmentation method based on Radon-like features and adaptive enhancement filters and compares it with 2D Otsu, GA, and FastFCM.
Experimental results show that the proposed method can not only extract text completely and accurately but also reduce the interference caused by incomplete or damaged surfaces and by the uneven indentation of the sealing clay, which provides an appropriate solution to the problem of sealing clay text segmentation.
2. Radon-Like Features and Adaptive Enhancement Filters
2.1. Radon-Like Features
Radon-like features originate from the Radon transform, which is an integral transform. The Radon transform of a 2D continuous function $f(x, y)$ is
$$R(p, \tau)[f(x, y)] = \int_{-\infty}^{\infty} f(x, \tau + p x)\,dx, \tag{1}$$
where $p$ and $\tau$ are, respectively, the slope and intercept of a straight line; the Radon transform evaluates the integral of $f$ along the straight line determined by $p$ and $\tau$. The Radon transform is widely used in tomography, such as MRI (magnetic resonance imaging), and its inverse is often used to reconstruct the original images. The discrete form of (1) is
$$R(p, \tau)[f(x, y)] = \sum_{x} f(x, \tau + p x). \tag{2}$$
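As a rough illustration of the discrete line sum, the sketch below adds up pixel values along a line given by a slope and an intercept. The function name `radon_line_sum`, the nearest-row rounding, and the convention of skipping samples outside the image are assumptions made for this example, not details taken from the paper.

```python
def radon_line_sum(image, slope, intercept):
    """Sum pixel values along the line y = slope * x + intercept.

    `image` is a list of rows, indexed as image[y][x]. For each integer
    column x the line's y value is rounded to the nearest row; samples
    that fall outside the image are skipped.
    """
    height = len(image)
    width = len(image[0])
    total = 0.0
    for x in range(width):
        y = int(round(slope * x + intercept))
        if 0 <= y < height:
            total += image[y][x]
    return total


# Toy image whose only bright pixels lie on the main diagonal.
image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

diagonal = radon_line_sum(image, slope=1.0, intercept=0.0)  # follows the diagonal
shifted = radon_line_sum(image, slope=1.0, intercept=1.0)   # runs one row below it
```

The line that follows the diagonal collects all four bright pixels, while the shifted line collects none, which is the accumulation behaviour that Radon-like features replace with per-segment values.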
Radon-like features borrow only the main idea of the Radon transform. Given an image $I(x, y)$ and a line $l$ parameterized by a scalar $t$, written $l(t)$, the information of the image along $l$ is not accumulated into a single value; instead, it is distributed over different segments of $l$. The line segments are determined by a set of knots. Supposing that the set of knots is $(t_1, t_2, \ldots, t_n)$, the Radon-like feature at a point $t$ on line $l$ located between $t_i$ and $t_{i+1}$ is given by
$$\Psi(I, l, t_i, t_{i+1})(t) = T(I, l(t)), \quad t \in [t_i, t_{i+1}]. \tag{3}$$
$T$ is an extraction function which determines how the values of the Radon-like features are calculated. When the rotation angle of line $l$ is fixed and its intercept is varied, (3) yields a Radon-like feature image of $I$, which contains the Radon-like feature values and has the same size as $I$. If the rotation angle is also varied, every pixel of image $I$ produces a vector of Radon-like features. As shown in Figure 1(a), two lines at different angles intersect the picture to form knots and line segments. The values of the Radon-like feature image, computed with the extraction function $T$, are highlighted in Figure 1(b).
In Figure 1(a), the blue line intersects the picture to form four line segments, and the red line intersects it to form three.
In Figure 1(b), the values of the Radon-like features are defined by the extraction function.
The usefulness of Radon-like features depends on the knots and the extraction function. For the application at hand, image analysis, knots derived from the edges in the image provide a useful guide to the image's structures. When a line scans through the input image, its intersections with the image's edge map define the knots and the line segments. The choice of extraction function provides enough flexibility to differentiate among the various structures in the image.
2.2. The Adaptive Enhancement Filters
Linear combinations of a Gaussian and its derivatives model the human visual system well and have been applied successfully to image denoising, edge detection, and similar tasks. First and second derivative of Gaussian filters are mainly used to strengthen edges. On this basis, this paper adopts the Leung-Malik (LM) filter bank to strengthen the structural information of images. The LM filter bank is a multiscale, multiorientation bank composed of edge, bar, and spot filters. It comprises 36 first and second derivative of Gaussian filters (6 orientations and 3 scales for each derivative order), shown in the first three rows of Figure 2, together with 8 Laplacian of Gaussian filters and 4 Gaussian filters, shown in the last row of Figure 2.
As Chinese characters are generally built from five primitive strokes, namely, point, horizontal stroke, vertical stroke, left slant, and right slant (as shown in Figure 3), we use 36 second derivative of Gaussian filters and the 8 Laplacian of Gaussian filters of the Leung-Malik filter bank to strengthen the spatial direction information of the image.
The second derivative of Gaussian filters are redefined with 12 orientations (0°–165° at 15° intervals) and 3 scales. The 8 Laplacian of Gaussian filters are defined at eight scales. Filtering the image $I$ with these filters from the Leung-Malik filter bank yields 44 energy images $E_i$:
$$E_i = I * F_i, \quad i = 1, 2, \ldots, 44, \tag{4}$$
where $*$ denotes convolution and $F_i$ is one of the 36 second derivative of Gaussian filters or 8 Laplacian of Gaussian filters.
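The filtering step can be sketched in a few lines of plain Python. The example below builds Laplacian of Gaussian kernels and convolves a toy image with each of them to obtain one response image per filter. The kernel formula is the standard one, but the two scales, the 5 × 5 kernel size, the zero-mean shift, and the zero padding are all assumptions chosen for illustration; the oriented second derivative filters are left out to keep the sketch short.

```python
import math


def log_kernel(sigma, size):
    """Laplacian of Gaussian kernel on a size x size grid (size odd), shifted to zero mean."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            gauss = math.exp(-r2 / (2.0 * sigma ** 2))
            row.append((r2 - 2.0 * sigma ** 2) / sigma ** 4 * gauss)
        kernel.append(row)
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def convolve_same(image, kernel):
    """2D convolution with zero padding; the output has the same size as `image`."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # Index the image with the flipped kernel offset (true convolution).
                    yy, xx = y + oy - j, x + ox - i
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[j][i]
            out[y][x] = acc
    return out


# Toy 7 x 7 image with a single bright pixel in the centre.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0

# Two hypothetical scales; the scales used in the paper are not reproduced here.
filters = [log_kernel(1.0, 5), log_kernel(2.0, 5)]
energy_images = [convolve_same(image, f) for f in filters]
```

Each entry of `energy_images` has the same size as the input, so the responses can later be compared pixel by pixel across filters.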
Constructing the actual LM filter bank requires choosing the maximum window size of the filters, which decides how fine the enhancement is. This paper therefore adopts an adaptive enhancement mechanism that sets the window size from an estimate of the noise in the original image. A sealing clay image contains little intricate detail and is disturbed mainly by damage, stains, and the like, so the fast noise variance estimation method is adopted to keep the system fast. In this method, the noise variance of the image is estimated from the convolution of the image with a 3 × 3 template. Based on the distribution of the noise variance, two threshold parameters are determined, and the window size is then obtained from the estimated noise variance and these thresholds according to (7).
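A minimal sketch of this adaptive step is given below, assuming that the fast noise variance estimation referred to is Immerkaer's method, which filters the image with a fixed 3 × 3 mask and averages the absolute responses. The function `choose_window_size`, its thresholds, its candidate sizes, and the direction of the mapping (more noise, larger window) are placeholders for illustration only; the paper's actual rule and values are not reproduced here.

```python
import math

# 3 x 3 mask of Immerkaer's estimator (assumed to be the template meant in the text).
MASK = [[1, -2, 1],
        [-2, 4, -2],
        [1, -2, 1]]


def estimate_noise_sigma(image):
    """Estimate the noise standard deviation of a grayscale image (list of rows)."""
    h, w = len(image), len(image[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            response = 0.0
            for j in range(3):
                for i in range(3):
                    response += MASK[j][i] * image[y + j - 1][x + i - 1]
            total += abs(response)
    return math.sqrt(math.pi / 2.0) * total / (6.0 * (w - 2) * (h - 2))


def choose_window_size(sigma, low, high, sizes=(9, 15, 21)):
    """Hypothetical three-way rule: pick a window size by comparing sigma with two thresholds."""
    if sigma < low:
        return sizes[0]
    if sigma < high:
        return sizes[1]
    return sizes[2]


# A smooth ramp has no second-order variation, so the estimate is zero.
ramp = [[float(x + 2 * y) for x in range(6)] for y in range(6)]
sigma_ramp = estimate_noise_sigma(ramp)

# A single outlier pixel raises the estimate.
spiky = [row[:] for row in ramp]
spiky[3][3] += 6.0
sigma_spiky = estimate_noise_sigma(spiky)

window = choose_window_size(sigma_spiky, low=1.0, high=3.0)
```

The mask responds to neither constant regions nor linear ramps, so smooth image structure contributes little to the estimate and mostly noise-like variation is measured.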
3. The Process and Description of Algorithm
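When calculating the Radon-like feature values of an image, the extraction function is
$$T(E, l(t)) = \frac{1}{\lVert l(t_{i+1}) - l(t_i) \rVert_2} \int_{t_i}^{t_{i+1}} E(l(t))\,dt, \quad t \in [t_i, t_{i+1}]. \tag{8}$$
This function calculates the mean of the pixel values of image $E$ that lie along line $l$ between knots $t_i$ and $t_{i+1}$. $E$ is the maximum energy image, taken pixelwise over the 44 energy images $E_i$:
$$E(x, y) = \max_{1 \le i \le 44} E_i(x, y). \tag{9}$$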
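The mean-between-knots rule can be illustrated on a single scan line. In the sketch below, every sample is replaced by the mean of the segment it belongs to, where segments are delimited by knot positions. The function name `radon_like_mean_1d`, the treatment of the two ends of the scan line as implicit knots, and the choice to let a knot pixel open the segment to its right are assumptions made for this example.

```python
def radon_like_mean_1d(values, is_knot):
    """Replace each sample by the mean of its segment between consecutive knots."""
    n = len(values)
    # Segment boundaries: both ends of the scan line plus every knot position.
    bounds = [0] + [i for i in range(1, n) if is_knot[i]] + [n]
    out = [0.0] * n
    for start, stop in zip(bounds, bounds[1:]):
        mean = sum(values[start:stop]) / (stop - start)
        for i in range(start, stop):
            out[i] = mean
    return out


# One scan line with knots (edge pixels) at positions 2 and 5.
line = [0, 4, 8, 10, 9, 1, 3]
knots = [False, False, True, False, False, True, False]
features = radon_like_mean_1d(line, knots)

# Horizontal scan (0 degrees) over a whole image: apply the rule row by row.
energy = [[0, 4, 8, 8], [2, 2, 6, 6]]
edges = [[0, 0, 1, 0], [0, 0, 1, 0]]
feature_image = [radon_like_mean_1d(row, edge_row)
                 for row, edge_row in zip(energy, edges)]
```

Other scan directions follow the same rule once the pixels along the rotated line have been collected into a one-dimensional sequence.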
In addition, the edges of the maximum energy image $E$ are extracted with the Canny operator to generate the knots. Scan lines in 12 different directions (0°–360° at 30° intervals) are used to locate the knots and determine the Radon-like features. The basic process of the algorithm is as follows.
Step 1. For an input image $I$, the fast noise variance estimation method is used to calculate the noise variance. The window size parameter is then determined from the noise variance and the two threshold parameters.
Step 2. The image is enhanced with the 36 second derivative of Gaussian filters and 8 Laplacian of Gaussian filters of the LM filter bank. 44 energy images are obtained, and from them the maximum energy image $E$.
Step 3. The Canny operator is used to extract the edges of $E$, giving an edge map.
Step 4. Based on the edge map, scan lines in 12 different directions are used to scan image $E$. The knots are the intersections of each scan line with the edge map. The pixels of $E$ between two knots form the Radon-like features, and their values are calculated with the extraction function defined in formula (8). The 12 scan directions produce 12 different Radon-like feature images.
Step 5. Thresholding is applied to the mean of the 12 Radon-like feature images to obtain the segmentation result.
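Step 5 can be sketched as follows: the feature images are averaged pixel by pixel and the average is binarized. The text does not name the thresholding rule, so Otsu's between-class variance criterion is used here purely as an assumption; the helper names and the toy data are likewise illustrative.

```python
def average_images(images):
    """Pixelwise mean of equally sized images (lists of rows)."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [[sum(img[y][x] for img in images) / n for x in range(width)]
            for y in range(height)]


def otsu_threshold(image, bins=256):
    """Threshold that maximizes the between-class variance of the histogram."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return lo
    scale = (bins - 1) / (hi - lo)
    hist = [0] * bins
    for v in flat:
        hist[min(bins - 1, int((v - lo) * scale))] += 1
    total = len(flat)
    sum_all = sum(k * count for k, count in enumerate(hist))
    best_k, best_var = 0, -1.0
    w0 = sum0 = 0
    for k in range(bins):
        w0 += hist[k]
        sum0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if between > best_var:
            best_k, best_var = k, between
    # Bins 0..best_k are background; return the lower edge of the next bin.
    return lo + (best_k + 1) / scale


def binarize(image, threshold):
    """Mark pixels at or above the threshold as text (1), others as background (0)."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]


# Two toy feature images standing in for the 12 directional ones.
feature_a = [[10, 10, 200], [10, 200, 200]]
feature_b = [[30, 10, 220], [10, 180, 200]]

mean_image = average_images([feature_a, feature_b])
mask = binarize(mean_image, otsu_threshold(mean_image))
```

Averaging before thresholding means a pixel is kept as text only if it is bright across most scan directions, which is one plausible reason the combined image is less sensitive to direction-specific noise.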
In Step 2, the adaptive LM filter bank is used to enhance texture features; that is, a texture is characterized by its responses to a set of filters. To reduce the dimensionality of the filter response space, the maximum energy image over all scales and orientations is selected from the energy images. Figure 4 shows some results of Step 2: the top of (a) is a sealing clay image, and some of its energy images in 12 directions at one scale are listed in (b). The maximum energy image is shown at the bottom of (a). The maximum energy image retains the enhanced main parts as well as local details.
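The reduction from many filter responses to one image is a pixelwise maximum. The sketch below, with the illustrative name `max_energy_image`, also records which filter produced the maximum at each pixel; that second output is an addition for inspection purposes and is not part of the method as described.

```python
def max_energy_image(energy_images):
    """Pixelwise maximum over a list of equally sized response images.

    Returns the maximum image and, for inspection, the index of the
    filter that produced the maximum at each pixel (first one on ties).
    """
    height, width = len(energy_images[0]), len(energy_images[0][0])
    best = [[0.0] * width for _ in range(height)]
    winner = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [img[y][x] for img in energy_images]
            k = max(range(len(values)), key=values.__getitem__)
            best[y][x] = values[k]
            winner[y][x] = k
    return best, winner


# Two toy response images standing in for the 44 filter responses.
responses = [
    [[1, 5], [2, 0]],
    [[3, 4], [2, 7]],
]
best, winner = max_energy_image(responses)
```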
In Step 3, the Canny operator is used to obtain the edges of the maximum energy image. Although the operator can be applied directly to the input image, the result is less complete. Figure 5 shows that the edges obtained from the maximum energy image are more complete.
In Step 4, Radon-like features are found from the edge map and the maximum energy image. Their values are then calculated with the extraction function, producing 12 different feature images. Figure 6 shows the 12 Radon-like feature images of the sealing clay image shown in Figure 4. The text features in all 12 directions are selected and enhanced.
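Locating knots along an oriented scan line can be sketched as below. The line is traced in unit steps from a starting pixel, and every visited pixel that is set in the edge map becomes a knot. The function names, the unit-step tracing with rounding, and the toy edge map are assumptions for illustration; a production implementation might rotate the image or use a line-drawing algorithm such as Bresenham's instead.

```python
import math


def scan_line_pixels(width, height, angle_deg, start):
    """Pixels (x, y) visited by a ray that leaves `start` in direction `angle_deg`."""
    dx = math.cos(math.radians(angle_deg))
    dy = math.sin(math.radians(angle_deg))
    x, y = float(start[0]), float(start[1])
    pixels = []
    while 0 <= round(x) < width and 0 <= round(y) < height:
        p = (round(x), round(y))
        if not pixels or pixels[-1] != p:  # skip repeats caused by rounding
            pixels.append(p)
        x += dx
        y += dy
    return pixels


def knots_on_line(edge_map, pixels):
    """Positions along the scan line where it crosses an edge pixel."""
    return [i for i, (x, y) in enumerate(pixels) if edge_map[y][x]]


# Toy 5 x 5 edge map with a vertical edge in column 2.
edge_map = [[1 if x == 2 else 0 for x in range(5)] for y in range(5)]

horizontal = scan_line_pixels(5, 5, 0, start=(0, 1))
diagonal = scan_line_pixels(5, 5, 45, start=(0, 0))
knots_h = knots_on_line(edge_map, horizontal)
knots_d = knots_on_line(edge_map, diagonal)

# Twelve scan directions at 30 degree steps, as in the text.
directions = [30 * k for k in range(12)]
```

The knot indices returned for each scan line are exactly the segment boundaries between which the extraction function is evaluated.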
4. Experimental Results and Analysis
Based on the above algorithm, 30 sealing clay images were selected for the segmentation experiment. The images come from three types of sealing clay with different materials and colors. Because the proposed algorithm segments the text of a sealing clay image, the original images were preprocessed before the experiment: the central text area was extracted from each image, and the extracted areas were resized to a common size of 128 × 128 pixels. To demonstrate the advantages of the proposed method, we implemented it in Matlab R2010b and compared it with 2D Otsu, Genetic Algorithm (GA), and FastFCM. Figure 7 lists the results of the different segmentation methods on five sealing clay images; the last row is the result of manual segmentation under the guidance of experts.
2D Otsu and GA yield similar results that do not completely separate the text from the noise, with many areas merging into each other, especially in Figures 7(a) and 7(b). In comparison, FastFCM not only extracts the major form of the characters but also suppresses part of the noise, yielding more detail than the former two methods, as shown in Figures 7(a), 7(b), 7(c), and 7(d). However, at the level of individual characters, the specific strokes and the intactness of the structures are not satisfactory, as Figures 7(a), 7(b), 7(c), and 7(d) also illustrate. It should be noted that, among 2D Otsu, GA, and FastFCM, only GA obtains the approximate form of the Chinese character “司” in Figure 7(b), so GA gives better recognizability in this case. For Figure 7(e), GA and FastFCM yield similar results that contain much noise. Although 2D Otsu obtains a better result, the form of the characters is incomplete. The proposed method addresses the problems mentioned above: it not only clearly separates the main body of each character but also performs better in terms of richness and intactness. Moreover, it filters out much of the noise and so yields clearer and more definite results [16–20].
As shown in Table 1, a comparison of running times shows that 2D Otsu requires the least time. FastFCM takes longer than 2D Otsu but less time than the proposed method, and GA requires the longest time. The proposed method has no advantage in running time: it needs twice the time of FastFCM and three times the time of 2D Otsu. Most of the time is spent scanning for and locating the Radon-like features in different directions. The more directions are scanned, the more detailed the Radon-like features that are found and the longer the scan takes. Figure 8 shows the relationship between the average running time and the number of scan directions. Based on several experiments, and with the aim of keeping the running time as short as possible, we found that 12 directions give a preferable result.
To measure the segmentation results and further validate the effectiveness of the method, the ground-truth image, which was manually segmented under the guidance of experts, was used as the standard reference against which the results of the 4 methods were compared. Misclassification error (ME) was adopted to measure the differences and is defined as
$$ME = 1 - \frac{|B_O \cap B_T| + |F_O \cap F_T|}{|B_O| + |F_O|}, \tag{10}$$
where $F_O$ and $F_T$ are the text pixels of the standard reference image and of the test result, respectively, and $B_O$ and $B_T$ are the background pixels of the standard reference image and of the test result, respectively. $|F_O \cap F_T|$ is the number of text pixels that are segmented correctly, while $|B_O \cap B_T|$ is the number of background pixels that are segmented correctly. The value of ME ranges from 0 to 1, with 0 standing for completely correct segmentation and 1 for completely incorrect segmentation. Therefore, the smaller the value, the more precise the segmentation. In addition, the average true positive rate (TPR), false positive rate (FPR), and error probability (EP) are used to measure performance; they are calculated from the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). In this paper, 30 images are tested, and the average values of ME, TPR, FPR, and EP obtained from the 4 methods are shown in Table 2.
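These measures can be computed directly from a reference mask and a result mask. The sketch below uses the misclassification error as described in the text, together with the usual definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN). EP is left out because its formula is not reproduced here. The function name and the toy masks are illustrative.

```python
def segmentation_metrics(reference, result):
    """Compare two binary masks (1 = text, 0 = background) pixel by pixel."""
    tp = tn = fp = fn = 0
    for ref_row, res_row in zip(reference, result):
        for ref, res in zip(ref_row, res_row):
            if ref and res:
                tp += 1          # text correctly kept
            elif ref:
                fn += 1          # text lost
            elif res:
                fp += 1          # background marked as text
            else:
                tn += 1          # background correctly rejected
    total = tp + tn + fp + fn
    return {
        "ME": 1.0 - (tp + tn) / total,
        "TPR": tp / (tp + fn) if tp + fn else 0.0,
        "FPR": fp / (fp + tn) if fp + tn else 0.0,
    }


reference = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
result = [[1, 0, 0, 0],
          [1, 0, 1, 0]]

metrics = segmentation_metrics(reference, result)
```

In this toy case two of the three text pixels are recovered and one of the five background pixels is wrongly marked as text, so two of the eight pixels are misclassified.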
It is apparent from Table 2 that, across all experiments, the proposed method is superior to the other methods, with the lowest ME (0.0982), FPR (4.7%), and EP (4.2%) and the highest TPR (96.5%). FastFCM has slightly lower segmentation accuracy, with ME of 0.1536, TPR of 91.2%, FPR of 7.3%, and EP of 8.6%. By comparison, 2D Otsu and GA have the lowest segmentation accuracy, with higher values of ME (0.2163, 0.2545), FPR (10.1%, 12.7%), and EP (12.3%, 15.2%). These results indicate that the proposed method is the most effective of the four at segmenting sealing clay text in terms of accuracy and stability.
5. Conclusion
The traditional method of extracting sealing clay text requires direct contact with the object, which increases the risk of damage. Building on the maturing digital technology for cultural relics, this paper proposes a text extraction method based on digital images to facilitate sealing clay research. The method combines Radon-like features with an adaptive enhancement LM filter bank and consists of two major steps: enhancement filtering and calculation of the Radon-like feature images. In the experiments, the results of the proposed method were compared with those of 2D Otsu, GA, and FastFCM. The results show that the proposed method not only effectively reduces noise such as breakage and stains but also extracts text with higher definition, completeness, and accuracy. The disadvantage of the method is its long execution time, mainly because the directional scanning that identifies the Radon-like features is relatively time consuming: the more directions are scanned, the longer the execution takes. The scanning strategy should therefore be improved to increase execution speed. In addition, richer local statistical information could be incorporated into the extraction function so that the Radon-like features capture more detail and become more representative. As for applications, the method can be extended to segmentation problems in other types of images, such as blood vessel, cell, and road images.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is partially supported by the National Natural Science Foundation of China (Grant no. 61202198), the Social Science Foundation of Zhejiang Province (Grant no. 11JCWH13YB), the Scientific Research Program Funded by Shaanxi Provincial Education Department (Program no. 2013JK1139), the China Postdoctoral Science Foundation (no. 2013M542370), and the Specialized Research Fund for the Doctoral Program of Higher Education of China (Grant no. 20136118120010).
L. Ma, S. Zhao, L. Zhu et al., “A modified fuzzy C-Means segmentation algorithm for illumination pattern-based images under bias field,” Application Research of Computers, vol. 23, no. 4, pp. 135–136, 2006.
L. Bo, L. Rong, F. Jiulun et al., “Fast algorithm for maximum fuzzy entropy thresholding method,” Pattern Recognition and Artificial Intelligence, vol. 23, no. 6, pp. 867–873, 2010.
J. Li, B. Dai, K. Xiao, and A. E. Hassanien, “Density based fuzzy thresholding for image segmentation,” in Advanced Machine Learning Technologies and Applications, vol. 322 of Communications in Computer and Information Science, pp. 118–127, Springer, Berlin, Germany, 2012.
R. Kumar, A. Vázquez-Reina, and H. Pfister, “Radon-like features and their application to connectomics,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '10), pp. 186–193, 2010.
J. Liu and W. Li, “Automatic thresholding of gray-level pictures via two-dimensional OTSU method,” Acta Automatica Sinica, vol. 19, no. 1, pp. 101–105, 1993.
Y. X. Qing, H. Z. Hua, and X. Qiang, “Histogram based fuzzy C-mean algorithm for image segmentation,” in Proceedings of the 11th IAPR International Conference on Pattern Recognition. Vol.III. Conference C: Image, Speech and Signal Analysis, Proceedings, pp. 704–707, The Hague, The Netherlands, August-September 1992.