Research Article

An Efficient Skewed Line Segmentation Technique for Cursive Script OCR

Algorithm 3

Text line segmentation algorithm.
Input: Normalized de-skewed image
Output: Segmented lines.
//Begin.
Step 1. //Preprocessing: image binarization (using adaptive threshold).
Step 2. //De-skew the image (if needed).
Step 3. //Scan the image row by row.
    Identify the intensity of each pixel (0 or 1) and count the black pixels in each row (Black_Pixels).
Step 4. //Calculate the standard deviation of the image (Std), used as the minimum number of black pixels in a text row.
Step 5. If (Black_Pixels > Std)
    Black_Row = row
Step 6. Else
    Space_Row = row
Step 7. For (each row from 1st_Black_Row to Last_Space_Row)
    Step 8. If (Height_Row > Min_Height_Row)
        //Consider these consecutive text rows as one text line until a Space_Row occurs.
        Step 9. If (Space_Row occurs)
            Step 10. If (Space_Row > Min_Height_Row)
                Break text_line and go to Step 11.
            Step 11. Else
                Go to Step 7
        Step 12. Else
            Search for next Black_Row
    Step 13. Else
        Go to Step 10
//End
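
The row classification and line grouping in Algorithm 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the image is already binarized and de-skewed (Steps 1 and 2), represented as a list of rows with 1 for a black pixel and 0 for white. It also assumes that the "standard deviation of the image" means the standard deviation of the per-row black-pixel counts, and that a single threshold (the hypothetical parameter `min_height`) serves both as the minimum text-line height and as the minimum gap that breaks a line. The function name `segment_lines` is likewise illustrative.

```python
from statistics import pstdev


def segment_lines(image, min_height=2):
    """Return (first_row, last_row) pairs, inclusive, for each text line.

    image: list of rows, each a list of 0/1 values (1 = black pixel).
    min_height: assumed stand-in for Min_Height_Row in Algorithm 3.
    """
    # Step 3: count black pixels in every row.
    counts = [sum(row) for row in image]
    # Step 4 (assumption): threshold = std of the per-row counts.
    threshold = pstdev(counts)
    # Steps 5-6: a row is a text row if it has more black pixels than Std.
    is_text = [c > threshold for c in counts]

    lines = []
    start = None  # first row of the text line being built
    gap = 0       # consecutive space rows seen since the last text row
    for y, text in enumerate(is_text):
        if text:
            if start is None:
                start = y
            gap = 0  # a short gap is absorbed into the current line
        elif start is not None:
            gap += 1
            # Step 10: a tall enough run of space rows ends the line.
            if gap > min_height:
                end = y - gap
                # Step 8: keep only lines taller than the minimum height.
                if end - start + 1 > min_height:
                    lines.append((start, end))
                start, gap = None, 0
    # Close a line that runs to the bottom of the image.
    if start is not None:
        end = len(is_text) - 1 - gap
        if end - start + 1 > min_height:
            lines.append((start, end))
    return lines


def make_row(black, width=10):
    """Build a test row with `black` leading black pixels."""
    return [1] * black + [0] * (width - black)


# Two five-row text bands separated by four blank rows.
page = ([make_row(0)] * 3 + [make_row(6)] * 5 + [make_row(0)] * 4
        + [make_row(6)] * 5 + [make_row(0)] * 3)
print(segment_lines(page))  # [(3, 7), (12, 16)]
```

Runs of space rows no taller than `min_height` are merged into the surrounding text line, which is one reading of the "Go to Step 7" branch. Using separate thresholds for line height and gap height would be an equally plausible design.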