
Classification and Segmentation of Telugu Touching Characters

J. Bharathi, Dr. P. Chandrasekhar Reddy

Abstract


Improper binarization, printing defects, the orthography of the script, font type, and low scanning resolution are among the factors that produce touching characters in an OCR system. In this paper, touching characters in Telugu script are classified by the location of the touch, which makes it easier to segment each class with an appropriate strategy. Two algorithms are proposed for segmenting touching characters in the top zone and in the middle zone. Characteristics of projection profiles and side profiles are used to identify and segment the touching characters. Success rates of 92.67% and 91.28% are achieved for segmenting touching characters in the top zone and the middle zone, respectively.
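
As an illustration of the two kinds of features named above, the following is a minimal sketch and not the authors' algorithm. It assumes a binary glyph image stored as a list of rows (1 = ink, 0 = background), takes a side profile to mean the distance from an image edge to the first ink pixel, and flags thin columns as possible touching points. The function names, the SAMPLE image, and the max_ink threshold are all hypothetical.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed encoding)


def vertical_projection(img: Image) -> List[int]:
    """Number of ink pixels in each column."""
    return [sum(col) for col in zip(*img)]


def horizontal_projection(img: Image) -> List[int]:
    """Number of ink pixels in each row."""
    return [sum(row) for row in img]


def top_profile(img: Image) -> List[int]:
    """Per column, distance from the top edge to the first ink pixel.

    Columns with no ink get the image height.
    """
    height = len(img)
    return [
        next((r for r, v in enumerate(col) if v), height)
        for col in zip(*img)
    ]


def left_profile(img: Image) -> List[int]:
    """Per row, distance from the left edge to the first ink pixel.

    Rows with no ink get the image width.
    """
    width = len(img[0])
    return [
        next((c for c, v in enumerate(row) if v), width)
        for row in img
    ]


def candidate_cut_columns(img: Image, max_ink: int) -> List[int]:
    """Columns that contain ink but no more than max_ink pixels.

    Such thin columns are treated here as possible touching points;
    the threshold is a hypothetical parameter, not taken from the paper.
    """
    return [
        c for c, count in enumerate(vertical_projection(img))
        if 0 < count <= max_ink
    ]


# Hypothetical sample: two solid blobs joined by a one-pixel-thick bridge.
SAMPLE: Image = [
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
]
```

On SAMPLE, the vertical projection drops from 4 to 1 at columns 3 and 4, and the top profile rises from 0 to 2 at the same columns, so candidate_cut_columns(SAMPLE, max_ink=1) returns [3, 4]. A real system would still have to decide which candidate to cut and check the result against a recognizer.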

Keywords


Classification of Touching Characters, Side Profiles, Touching Characters in Middle Zone, Touching Characters in Top Zone.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.