
NoolOCR for Printed Tamil Text

Dr. L. Jaba Sheela, Syed Mohammed Yasmine

Abstract


Optical Character Recognition (OCR) is the process of converting printed materials into text or word processing files that can be easily edited and stored. The technology allows such materials to be stored in far less space than the physical originals. OCR has had a major impact on the way information is stored, shared and edited. Before OCR, turning a book into a word processing file meant typing each page word for word. Nowadays many OCR systems are available on the market for different languages, but there is no centralized framework covering all of them. The intention of this paper is to create a framework capable of handling all available languages. This can be achieved through the Eclipse plug-in architecture, with a separate plug-in for each language.

Keywords


Binarization, Bounding Box, GOCR, OCR, Tesseract






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.