Abstract:
Optical Character Recognition (OCR) is the process of extracting text from an image. The main purpose of an OCR system is to produce editable documents from existing paper documents or image files. OCR is an important area of pattern recognition and image processing. Research in this field has been carried out at a number of universities and research institutes since the early days of digital computers, and many solutions to the problem have been proposed. In this paper, I discuss the process of developing an OCR system for the Bengali language. Considerable effort has gone into developing OCR for Bengali. Although some OCR systems have been developed, none of them is completely error free. For my project, I used the Tesseract OCR Engine to develop an OCR system for Bengali. Tesseract is among the most accurate open-source OCR engines currently available. The engine was developed at HP Labs and is currently owned by Google. I used a number of software packages and tools to build the Bangla OCR. I first present the complete methodology for building the Bangla OCR, followed by the implementation strategy (in the Python programming language). I used many image files to test the accuracy of my OCR. I used version 4.00 of Tesseract, the latest at the time, on Ubuntu 19. For clean image files, the accuracy is very high compared to existing work.
Description:
This thesis was submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering at East West University, Dhaka, Bangladesh.