EWU Institutional Repository

Optical Character Recognizer for Bangla (Bangla-OCR)

dc.contributor.author Khandaker, Riyad
dc.date.accessioned 2022-05-25T05:21:59Z
dc.date.available 2022-05-25T05:21:59Z
dc.date.issued 2019-09-12
dc.identifier.uri http://dspace.ewubd.edu:8080/handle/123456789/3559
dc.description This thesis was submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering at East West University, Dhaka, Bangladesh. en_US
dc.description.abstract Optical Character Recognition (OCR) is the process of extracting text from an image. The main purpose of an OCR system is to produce editable documents from existing paper documents or image files. OCR is an important area of pattern recognition and image processing. Universities and research institutes have studied the problem since the early days of digital computers, and many solutions have been proposed. In this paper, I discuss the process of developing an OCR system for the Bengali language. Considerable effort has gone into developing OCR for Bengali, and although several systems exist, none of them is completely error free. For my project, I used the Tesseract OCR engine to develop an OCR system for Bengali. Tesseract is widely regarded as one of the most accurate open-source OCR engines. It was originally developed at HP Labs, and its development was later sponsored by Google. I used a number of software tools to build the Bangla OCR. I first present the complete methodology for building the Bangla OCR, followed by the implementation strategy (in the Python programming language). I tested the accuracy of my OCR on many image files. I used Tesseract version 4.00, the latest at the time, on Ubuntu 19. For clean image files, the accuracy rate is very high compared to existing work. en_US
dc.language.iso en_US en_US
dc.publisher East West University en_US
dc.relation.ispartofseries ;CSE00192
dc.subject Optical Character Recognizer for Bangla en_US
dc.title Optical Character Recognizer for Bangla (Bangla-OCR) en_US
dc.type Thesis en_US
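
The abstract describes running Tesseract on Bengali image files from Python and testing accuracy on them. The thesis's own code and accuracy metric are not part of this record, so the following is only a minimal sketch under stated assumptions: it calls the `tesseract` command-line program with the Bengali language pack (`-l ben`), and it scores output with a character-level accuracy based on edit distance, which is a common choice but not necessarily the one the author used. The function names are hypothetical.

```python
import subprocess
import unicodedata


def run_tesseract(image_path, lang="ben"):
    """Run the tesseract CLI on one image and return the recognized text.

    Assumption: the `tesseract` binary and the Bengali ("ben") language
    data are installed and on PATH.
    """
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "-l", lang],
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout


def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def character_accuracy(recognized, reference):
    """Return 1 - (edit distance / reference length), floored at 0.

    Both strings are stripped and NFC-normalized so that equivalent
    Unicode encodings of the same Bengali text compare equal.
    Assumption: this metric is illustrative, not the thesis's metric.
    """
    rec = unicodedata.normalize("NFC", recognized.strip())
    ref = unicodedata.normalize("NFC", reference.strip())
    if not ref:
        return 1.0 if not rec else 0.0
    return max(0.0, 1.0 - edit_distance(rec, ref) / len(ref))
```

With a scanned page and a hand-typed ground-truth file (both hypothetical here), usage might look like `character_accuracy(run_tesseract("page1.png"), open("page1.txt", encoding="utf-8").read())`.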

