IMPROVEMENT OF TELUGU OCR BY SEGMENTATION OF TOUCHING CHARACTERS

J. Bharathi, P. Chandrasekhar Reddy

Abstract: The reported success rates for Telugu OCR systems are 84-87% for font sizes from 12 to 20 and 95.4-98.5% for font sizes from 15 to 35. Issues mentioned in the literature include noise and confusion characters. Studies by the authors indicate that touching characters constitute about 1% to 2% of the total characters in printed books set in normal-size fonts (14 pt). The editable output of an OCR system contains additional errors due to incorrect code selection, which emphasizes the need to identify touching characters. Identifying touching characters is a challenge because the touching may occur at different places, depending on orthography and rules of grammar. A complete strategy for identification, segmentation and recognition is proposed, along with syllable models for segmentation. The effect of normalization methods at the preprocessing stage on the identification of touching characters and on the recognition rates of normal characters is studied. A new algorithm is proposed for segmenting touching conjunct consonants. The use of an augmented database shows a clear improvement in recognition rates. The touching characters are identified and segmented with an 83% success rate, thus improving the overall performance of the OCR system for Telugu.

Keywords: Telugu OCR, Touching characters, Syllable model, Non-linear normalization, Hausdorff distance, Augmented database

DOI: https://doi.org/10.15623/ijret.2014.0310054
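
The keywords list Hausdorff distance, but the abstract does not say how the authors apply it. As a general illustration only, and not a reconstruction of the paper's method, the sketch below computes the symmetric Hausdorff distance between the foreground pixels of two small binary glyph bitmaps. The function names and the list-of-lists bitmap layout are assumptions made for this example.

```python
import math


def foreground_points(bitmap):
    """Return (row, col) coordinates of all non-zero pixels in a 2D bitmap."""
    return [
        (r, c)
        for r, row in enumerate(bitmap)
        for c, value in enumerate(row)
        if value
    ]


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two hypothetical 3x3 glyph bitmaps (1 = ink, 0 = background).
glyph_a = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]
glyph_b = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 0, 1],
]

distance = hausdorff(foreground_points(glyph_a), foreground_points(glyph_b))
print(distance)
```

A smaller distance means the two ink shapes lie closer together point for point, which is why the measure is often used for comparing a candidate shape against a stored template. This brute-force version compares every pair of points, so it is only suitable for small bitmaps.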