Abstract

Degradation diagnosis plays an important role in degraded character processing, as it can tell the recognition difficulty of a given degraded character. In this paper, we present a framework for an automated degraded character recognition system based on a statistical syntactic approach using 3D primitive symbols, which integrates degradation diagnosis to provide accurate and reliable recognition results. Our contribution is to design the framework so that it builds character recognition submodels corresponding to degradations caused by camera vibration or lack of focus. In each character recognition submodel, a statistical syntactic approach using 3D primitive symbols is proposed to improve degraded character recognition performance. The experiments show attractive results on the degraded character dataset, highlighting the efficiency of the system and the recognition performance of the statistical syntactic approach using 3D primitive symbols.

1. Introduction

Degraded character recognition is an important research topic in OCR given the rapid progress of digital imaging technology and the variety of image acquisition conditions. The accuracy of degraded character recognition largely affects the accuracy of the overall document processing system. In the real world, character recognition is complicated by camera vibration, lack of focus, noise, and low contrast against the background. Such degraded characters cannot be recognized reliably by traditional OCR methods because it is very hard to obtain clear character images.

Degradation diagnosis plays an important role in a degraded character recognition system, as it can tell the recognition difficulty of a given degraded character. In this paper, we aim to achieve an adaptive degraded character recognition system that applies a syntactic approach using 3D primitive symbols according to the degradation diagnosis. The system includes degradation diagnosis, 3D primitive symbol extraction, and syntactic classification. We briefly survey the related work in the following subsection.

1.1. Related Work

Several approaches have focused on degraded character recognition. These works can be divided into two groups: statistical approaches and syntactic approaches. The statistical approach has been used widely in practical systems since it is simple and objective. For degraded character recognition, it can be further divided into two categories: binarization based character recognition and gray feature based character recognition.

Binarization based character recognition extracts character features from the binary character image obtained by degradation recovery and advanced binarization [1, 2]. It focuses on how to remove degradation and obtain ideal binary patterns. These processes inevitably result in information loss and introduce many broken strokes, connected strokes, and noise into the binarized image. Taylor and Dance [1] recover text from digital camera images by deblurring with deconvolution and resolution enhancement with linear interpolation, and then apply thresholding to obtain the text. Kavallieratou and Stamatatos [3] combine global and local thresholding to improve the quality of old documents. Banerjee et al. [4] use a probabilistic contextual model to restore degraded document images.

Gray feature based character recognition extracts features directly from the gray scale image, which effectively avoids this information loss. It can be further divided into two types: structural features and frequency features. Structural features try to extract the character structure from the gray scale image, such as direction features, skeleton features, and topological features [5, 6]. Although structural features can precisely describe the character structure, it is difficult to extract invariant structural features because they suffer from various degradations. In contrast to structural features, frequency features, such as the Fourier transform and the wavelet transform, are very effective for the recognition of low resolution gray scale characters. Among these features, the Gabor filter feature has been demonstrated to perform well in degraded character recognition [7–11]. Wang et al. [7] use Gabor filters to extract features directly from gray scale character images, with excellent performance on both low-quality machine-printed character recognition and cursive handwritten character recognition. Hu et al. [8] propose a dominant orientation matrix based on Gabor filters for low resolution gray scale character classification. Hamamoto et al. [9] use a Gabor filter based on multichannel filtering theory for handwritten numeral recognition. Tavsanoglu and Saatci [10] use a CNN Gabor filter and an orientation map to recognize handwritten characters. Yoshimura et al. [11] extract Gabor jet features for the recognition of characters with various font types.

The syntactic approach, in contrast, uses primitive symbols to describe the character structure and a grammar to analyze the category of the character. According to the primitive symbol type, the syntactic approach can be divided into two categories: 1D primitive symbol based and 2D primitive symbol based syntactic recognition. The 1D primitive symbol based syntactic method extracts primitive symbols in one dimension to describe the character structure. It is suited to the recognition of Western languages and to online character recognition.

The 2D primitive symbol based method extracts primitive symbols in two dimensions, generally using chain codes to map each 2D image to a single string. Lee et al. [12] use 2D phonetic symbols and an attribute-dependent programmed grammar to recognize Korean characters. Parizeau and Plamondon [13] present an original approach for modeling cursive script allographs and use exclusively morphological and pragmatic knowledge to recognize them; it separates two distinct problems, the recognition of a set of graphic symbols and the reading of a message coded with those symbols. Lucas and Amiri [14] obtain the chain code string from the character image and build a statistical model for the strings of each class based on a probabilistic version of an n-tuple classifier. Rahman and El Saddik [15] propose a curve-fitting algorithm to extract primitives and a syntactic method to recognize handwritten Bengali characters.

1.2. Framework

The work in this paper presents an adaptive degraded Chinese character recognition system based on a statistical syntactic approach using 3D primitive symbols, which spans a broad range from low level degradation diagnosis to high level syntactic character recognition. The general framework of the adaptive degraded character recognition system is composed of seven character recognition submodels corresponding to seven degradation levels, as shown in Figure 1. Each character recognition submodel includes 3D primitive symbol extraction and a syntactic classifier, which is trained on the specific dataset corresponding to its degradation level. For a given degraded character, each character recognition submodel acts adaptively according to the degradation diagnosis result. Thus, the degraded character recognition system can better exploit the degradation information of the given character to reduce recognition errors and improve system performance.

The first contribution of this paper is the proposal of an adaptive degraded character recognition system composed of degradation diagnosis and character recognition, which integrates degradation cues into every character recognition submodel to improve system performance. The second contribution is the proposal of a 3D primitive symbol that is extracted directly from the gray scale image, together with a statistical syntactic classifier to recognize degraded characters whose degradations include disk blurring caused by lack of focus and motion blurring caused by camera vibration. In contrast to the 2D primitive symbols used in the traditional syntactic approach, the 3D primitive symbol can better describe the character structure and adapt to various degradations, thereby reducing recognition errors. Experimental results demonstrate that the proposed approach greatly improves the performance of the multidegradation character recognition system.

The remainder of this paper is organized as follows. Degradation diagnosis is discussed in Section 2. Section 3 introduces the process of 3D primitive symbol extraction. Section 4 presents the syntactic analysis used to recognize the degraded character. Section 5 presents the experimental studies. Finally, Section 6 summarizes the main contributions of the paper together with discussions of some open issues and future research directions.

2. Degradation Diagnosis

For degraded character recognition, degradation diagnosis is an effective aid. In this paper, two degradation types are considered: disk blurring caused by lack of focus and motion blurring caused by camera vibration. According to the degradation type and degree, we define seven degradation levels: the clear level, the light blur level, the heavy blur level, and four motion blur levels, one for each of the four motion-blur directions. According to these degradation levels, the character dataset is classified into seven subsets. For each degradation level, the corresponding character recognition submodel is designed by the statistical syntactic approach using 3D primitive symbols and is built from the corresponding character subset (see Figure 1). For a given degraded image, the character recognition submodel acts adaptively according to the degradation diagnosis result of the given character image.

For a given degraded character, we use a dual-diagnosis method to diagnose its degradation level, which consists of two parts: disk-blurring diagnosis and motion-blurring diagnosis. The algorithm applies the gray distribution feature to evaluate disk-blurring degradation [16]. It is performed in three steps: preprocessing, gray distribution feature extraction, and classification. After disk-blurring diagnosis, the clear level and the heavy blur level can be identified, while the remaining levels are merged into one coarse level. This coarse level needs to be further resolved by motion-blurring diagnosis.
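As an illustration of the gray distribution feature, the following Python sketch counts the histogram bins whose population exceeds a threshold, a count that tends to grow as blurring spreads gray values over more intermediate bins; the number of bins, the bin threshold, and the function name are illustrative assumptions rather than the exact settings of [16].

import numpy as np

def gray_distribution_feature(image, bin_count=256, bin_threshold=10):
    """Count histogram bins whose population exceeds a threshold.

    A sharp character image concentrates its gray values in few bins
    (background plus strokes), while blurring spreads gray values over
    many intermediate bins, so this count grows with the blur degree.
    (Illustrative parameters; not the exact settings of [16].)"""
    hist, _ = np.histogram(image.ravel(), bins=bin_count)
    return int(np.sum(hist > bin_threshold))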

The gray value distribution of a motion-blurred image differs across directions [17]. Based on this characteristic, we use four edge structure elements to measure the differences among the four motion-blurring directions. Edge density ratio features are extracted to diagnose the motion-blurring levels in the four motion-blur directions. In this paper, the four edge structure elements are applied to obtain the edge images in the four directions [18]. Once the motion-blurring diagnosis is accomplished, the remaining five degradation levels can be distinguished. The detailed algorithm is given in Algorithm 1.

Input: an input character image
Output: motion-blurring level
Step  1. The given character image is preprocessed to a uniform size. The four edge structure elements are applied to the preprocessed image to acquire the four directional edge images.
Step  2. The two edge density ratio features are computed as ratios of the gray distribution features of the four directional edge images; here, a gray distribution feature counts the bins of the gray histogram of the corresponding edge image whose values exceed a defined threshold.
Step  3. The coarse level remaining after disk-blurring diagnosis is further resolved: the two edge density ratio features are compared against the thresholds for motion-blurring diagnosis in the four directions; if one of the four motion-blur conditions holds, the corresponding motion blur level is assigned; otherwise, the image is assigned to the light blur level.
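The following Python sketch illustrates the motion-blurring diagnosis of Algorithm 1 under stated assumptions: the four line-detection kernels stand in for the edge structure elements of [18], and the ratio definitions, decision order, and thresholds are illustrative placeholders, since the exact equations and thresholds are not reproduced here.

import numpy as np
from scipy.ndimage import convolve

# Directional line-detection kernels standing in for the four edge structure
# elements of [18]; the exact elements used in the paper are not reproduced here.
EDGE_KERNELS = {
    0:   np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]], float),  # horizontal
    45:  np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]], float),  # diagonal
    90:  np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]], float),  # vertical
    135: np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]], float),  # anti-diagonal
}

def edge_density(image, kernel, bin_threshold=10):
    """Gray distribution feature of one directional edge image:
    the number of histogram bins whose population exceeds a threshold."""
    edge = np.abs(convolve(image.astype(float), kernel))
    hist, _ = np.histogram(edge.ravel(), bins=256)
    return np.sum(hist > bin_threshold)

def diagnose_motion_blur(image, ratio_threshold=1.5):
    """Assign one of the four motion-blur levels or fall back to light blur.
    Ratio definitions and threshold are illustrative assumptions."""
    d = {a: edge_density(image, k) for a, k in EDGE_KERNELS.items()}
    r_hv = max(d[0], 1) / max(d[90], 1)       # horizontal vs. vertical edge density
    r_diag = max(d[45], 1) / max(d[135], 1)   # diagonal vs. anti-diagonal edge density
    if r_hv > ratio_threshold:
        return "motion blur along the horizontal direction"
    if 1.0 / r_hv > ratio_threshold:
        return "motion blur along the vertical direction"
    if r_diag > ratio_threshold:
        return "motion blur along the 45-degree direction"
    if 1.0 / r_diag > ratio_threshold:
        return "motion blur along the 135-degree direction"
    return "light blur"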

3. 3D Primitive Symbol

We propose a novel primitive symbol to describe the character structure. It is extracted directly from the gray scale space without any binarization, so it avoids the information loss caused by binarization and retains more information to describe the character structure. As shown in Figure 3, 3D primitive symbol extraction is implemented in five steps: normalize the input image to a standard size; design and apply four-directional Gabor filters to extract structure information from the normalized character image; divide the gray scale space in which the Gabor filter outputs lie into three subspaces; in each subspace, obtain a primitive string code based on block sampling; and concatenate all primitive string codes into one primitive string code.

3.1. Gabor Filter

The Gabor filter is a kind of frequency filter that can extract directional structure directly from gray scale images and tolerate certain kinds of degradation. We apply a Gabor filter to a given image $f(x, y)$. The response output is defined through the convolution sum
$$G_\theta(x, y) = \sum_{m=-M}^{M} \sum_{n=-N}^{N} f(x+m, y+n)\, h(m, n; \lambda, \theta),$$
$$h(x, y; \lambda, \theta) = \exp\!\left(-\frac{x'^2 + y'^2}{2\sigma^2}\right)\cos\!\left(\frac{2\pi x'}{\lambda}\right), \qquad x' = x\cos\theta + y\sin\theta, \quad y' = -x\sin\theta + y\cos\theta,$$
where $\lambda$ is the wavelength of the Gabor filter, $\sigma$ is the standard deviation of the Gaussian, $M$ and $N$ determine the dimensions of the Gabor filter, and $\theta$ is the orientation angle of the Gabor filter. As the strokes of Chinese characters mainly have four directions, $0^\circ$, $45^\circ$, $90^\circ$, and $135^\circ$, we use a set of Gabor filters to localize the directional spatial frequency at $\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}$. After convolution with the Gabor filters, the four directional outputs are obtained, as shown in Figure 3.
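For concreteness, the following Python sketch builds the four directional Gabor kernels of the above equation and convolves them with a gray scale character image; the kernel size, wavelength, and standard deviation are illustrative values, not the parameters used in our experiments.

import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(size=15, wavelength=8.0, sigma=3.0, theta_deg=0.0):
    """Real-valued Gabor kernel at one orientation (cf. the equation above).
    Size, wavelength, and sigma are illustrative values."""
    theta = np.deg2rad(theta_deg)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + y_rot**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength)
    return envelope * carrier

def directional_outputs(image):
    """Convolve the gray scale character image with the four stroke-direction filters."""
    return {t: convolve(image.astype(float), gabor_kernel(theta_deg=t))
            for t in (0, 45, 90, 135)}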

3.2. Primitive Symbol

The gray scale space of each directional output is divided into three subspaces by two thresholds. In every gray scale subspace, the directional output is divided into blocks, and the primitive string is coded block by block: the code of each block is determined by searching over all points of the block within the given gray scale subspace. Finally, all block codes are concatenated into one string code as the primitive symbol, as shown in Figure 3.
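A minimal Python sketch of this coding step is given below, assuming a binary per-block code (1 if any response value in the block falls into the given subspace, 0 otherwise), an 8 x 8 block grid, and explicit subspace bounds; the paper's exact coding rule and block layout may differ.

import numpy as np

def block_codes(response, lo, hi, grid=8):
    """Binary code per block: 1 if any response value in the block lies in
    the gray scale subspace [lo, hi), found by searching all block points."""
    h, w = response.shape
    bh, bw = h // grid, w // grid
    codes = []
    for i in range(grid):
        for j in range(grid):
            block = response[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            codes.append('1' if np.any((block >= lo) & (block < hi)) else '0')
    return ''.join(codes)

def primitive_string(directional, subspace_bounds):
    """Concatenate block codes over all directions and gray scale subspaces.
    `directional` maps orientation -> Gabor response (see the previous sketch);
    `subspace_bounds` is a list of (lo, hi) pairs, one per subspace."""
    return ''.join(block_codes(resp, lo, hi)
                   for lo, hi in subspace_bounds
                   for resp in directional.values())

Depending on the diagnosed degradation level, subspace_bounds would contain one, two, or three gray scale subspaces, as discussed next.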

Figure 2 shows 3D primitive symbol extraction for the seven degradation levels. The more subspaces are used, the more information is included. For different degradation levels, we therefore use different numbers of subspaces. For the clear level, it is enough to take the top subspace to generate the string code. For the light blur level, the two top subspaces are considered to form the string code. For the other five degradation levels, the primitive numeric string codes are generated from all three subspaces, as they include more information, which helps to improve the recognition performance.

4. Syntactic Classification

In this section, we provide details on how the proposed adaptive statistical syntactic method based on 3D primitive symbols recognizes degraded Chinese characters. As shown in Figure 1, according to the seven degradation levels, we design seven character recognition submodels trained on character sets with the corresponding degradation levels. In each submodel, a specific 3D primitive symbol and a specific syntactic classifier are designed to recognize characters subject to the corresponding degradation.

In each submodel, as described in Section 3, the 3D primitive string codes are extracted for the corresponding type of degradation: for the clear level, the primitive string code is taken from the top subspace of the Gabor filter responses; for the light blur level, from the two top subspaces; and for the other degradation levels, from all three subspaces. A matching algorithm is applied to recognize the characters: a given degraded character is matched against reference strings. The reference strings are set separately for each degradation level in the same way, namely, from the top subspace for the clear level, from the two top subspaces for the light blur level, and from all three subspaces for the other degradation levels.
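The string comparison measure is not spelled out above; as one hedged possibility, the following Python sketch matches a primitive string code against per-category reference strings using the Levenshtein (edit) distance.

def levenshtein(a, b):
    """Edit distance between two primitive string codes."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def match_category(code, references):
    """Return the category whose reference string is closest to `code`.
    `references` maps category -> reference primitive string of one submodel."""
    return min(references, key=lambda cat: levenshtein(code, references[cat]))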

For a given degraded character, each character recognition submodel acts adaptively according to the degradation diagnosis result. We define adaptive classification weights to control the action of each submodel: if the given character is diagnosed at a particular degradation level, the weight of the corresponding submodel is set large and the other weights are set small.

For the given degraded character, the 3D primitive string is extracted and syntactic matching is implemented as follows: the extracted string is matched against the reference strings of every category in every submodel, each submodel's matching score is scaled by its adaptive classification weight, and the character category with the best weighted matching score is taken as the recognition result of the given character. The detailed character recognition procedure is given in Algorithm 2.

Input: the input character image
Output: character category
Step  1. For the input character image, extract the gray distribution feature to diagnose and obtain the disk-blurring level.
Step  2. If the disk-blurring level is the coarse level that mixes light blur and motion blur, extract the edge gray distribution features to obtain the motion-blurring diagnosis result and set it as the final degradation level. Otherwise, keep the disk-blurring level as the final degradation level.
Step  3. For the input character image, extract the 3D primitive string code.
Step  4. According to the diagnosed degradation level, acquire the character category through the adaptive classification weights, which adaptively adjust the action of every submodel.
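Putting the pieces together, the following Python sketch mirrors Algorithm 2 under the assumptions of the earlier sketches: diagnose() is a hypothetical wrapper around the dual diagnosis of Section 2, the similarity form weight / (1 + distance) and the weight values are illustrative choices, and primitive_string, directional_outputs, and levenshtein refer to the sketches above.

def recognize(image, submodels, diagnose, weight_major=1.0, weight_minor=0.1):
    """End-to-end sketch of Algorithm 2 under the stated assumptions.
    `submodels` maps degradation level -> {'subspace_bounds': [...],
    'references': {category: reference string}}; `diagnose` is a hypothetical
    wrapper around the dual degradation diagnosis of Section 2."""
    level = diagnose(image)                      # Steps 1-2: dual degradation diagnosis
    responses = directional_outputs(image)       # four-directional Gabor responses
    best_cat, best_score = None, -1.0
    for lvl, model in submodels.items():
        # Step 3: primitive string for this submodel's gray scale subspaces.
        code = primitive_string(responses, model['subspace_bounds'])
        weight = weight_major if lvl == level else weight_minor
        # Step 4: weighted syntactic matching against the submodel's references.
        for cat, ref in model['references'].items():
            similarity = weight / (1.0 + levenshtein(code, ref))
            if similarity > best_score:
                best_cat, best_score = cat, similarity
    return best_cat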

5. Experiments

To illustrate the performance of the proposed method, we performed a number of experiments on printed Chinese character datasets covering 3755 character categories. We use a point-spread function (PSF) to generate two types of degraded character sets. One type is disk-blurring degradation caused by lack of focus; we used 20 blurring runs to generate 20 disk-blurring character subsets. The other type is motion-blurring degradation caused by camera vibration; we used 11 blurring runs and 8 blurring directions to generate 88 motion-blurring character subsets.
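For reference, the following Python sketch synthesizes the two degradation types with simple PSFs, a uniform disk for out-of-focus blur and a linear kernel for motion blur; the radius, length, and angle values are illustrative parameters, not the blurring runs used in our experiments.

import numpy as np
from scipy.ndimage import convolve, rotate

def disk_psf(radius):
    """Uniform disk kernel modelling an out-of-focus blur."""
    r = int(np.ceil(radius))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    psf = (x**2 + y**2 <= radius**2).astype(float)
    return psf / psf.sum()

def motion_psf(length, angle_deg):
    """Linear motion kernel modelling camera vibration along one direction."""
    size = int(length) | 1                     # odd size so the kernel has a center
    psf = np.zeros((size, size))
    psf[size // 2, :] = 1.0                    # horizontal motion line
    psf = rotate(psf, angle_deg, reshape=False, order=1)
    return psf / psf.sum()

def degrade(image, psf):
    """Apply a PSF to a clear character image to synthesize a degraded sample."""
    return convolve(image.astype(float), psf)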

In all experiments with the proposed character recognition algorithm, the thresholds for motion-blurring diagnosis are set empirically. For the edge density ratio feature, the gray histogram is computed with a fixed number of bins. The adaptive classification weight of the submodel corresponding to the diagnosed degradation level is set large, and the weights of the other submodels are set small. All of our experiments are implemented in MATLAB.

5.1. Degradation Diagnosis

For the disk-blurring diagnosis part, we randomly select 200 character images from each of the character subsets produced by three blurring runs as training data. The other 106 character subsets are used as testing data to demonstrate the effectiveness of the degradation diagnosis. Figures 4 and 5 show the degradation diagnosis results of the proposed method.

We use the seven degradation levels to diagnose the 21 character subsets generated by 21 disk-blurring runs. As these 21 subsets are degraded by disk blurring, they should theoretically be diagnosed into the clear level, the light blur level, or the heavy blur level. Figure 4 shows the diagnosis results for the seven degradation levels on these 21 subsets; the horizontal axis is the blurring run of the PSF, and the vertical axis is the ratio of diagnoses to each of the 7 levels. It can be seen that the 21 degraded character subsets are distributed over the three disk-blurring levels as expected: character images with a light blurring degree mostly fall into the clear level, and most character images with a heavy blurring degree fall into the heavy blur level. The results demonstrate that the proposed algorithm can effectively diagnose character images degraded by disk blurring.

We perform the same experiment on the other 88 degraded character subsets generated by 11 motion-blurring runs in 8 motion-blurring directions. The final result is shown in Figure 5. The horizontal plane represents the blurring run and blurring direction of the PSF, and the vertical axis is the ratio of diagnoses to the 7 degradation levels. To present the final diagnosis results clearly, only the four motion-blur degradation levels are shown. In Figure 5, it can be seen that character images degraded by a small motion-blurring run are diagnosed into the light blur level, because their gray value distribution is similar to that of character images with light disk blurring; in this condition, the influence of the motion-blurring direction can be neglected. As the motion-blurring run increases, motion-blurring diagnosis plays an increasingly important part in the degradation diagnosis. The proposed edge density ratio features can effectively diagnose the character subsets produced by 11 motion-blurring runs and 8 motion-blurring directions into the 4 degradation levels corresponding to the 4 motion-blurring directions.

The results in Figures 4 and 5 demonstrate that the proposed dual-diagnosis algorithm can be effectively applied to the degradation diagnosis of degraded characters.

5.2. Degraded Character Recognition

We present both quantitative and qualitative results for degraded character recognition, as well as a comparison with a 1D primitive string method (SDCR) trained on all character subsets. More specifically, we tested the algorithms on 109 degraded character subsets with different degradations. In the proposed adaptive algorithm (ADCR), 21 character subsets are used to train the reference strings corresponding to the 7 degradation levels. Each character subset includes 3755 degraded Chinese characters.

Figure 6 shows the comparative results of the SDCR algorithm and the ADCR algorithm. The horizontal axis is the 109 character subsets, and the vertical axis is the degraded character recognition rate. The blue line is the character recognition result of SDCR, and the red line is the result of ADCR. From Figure 6, it can be seen that the ADCR algorithm produces better degraded character recognition results than the SDCR algorithm. The experiment demonstrates that the proposed method greatly improves the performance of the degraded character recognition system.

6. Conclusion

In this paper, we have proposed a framework for an adaptive degraded character recognition system based on a statistical syntactic approach using 3D primitive symbols, which integrates degradation information to provide accurate and reliable recognition results. Experiments have validated that the proposed method can greatly improve the performance of the degraded character recognition system. We believe that the improvements are due to the integration of degradation diagnosis and the information provided by the 3D primitive symbols. We note that the proposed method addresses only degraded character recognition for printed Chinese characters. In the future, we will investigate other effective 3D structure features extracted from the gray scale image to recognize handwritten characters. Furthermore, a good degradation diagnosis algorithm can help in tackling degraded character recognition, and we will examine more features to diagnose more degradation sources in future work.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by National Natural Science Foundation of China (no. 61003102, no. 61103072, and no. 61272271) and the scientific and technological projects of Science and Technology Commission of Shanghai Municipality (no. 11dz1210404).