Advisor | Gudivada, Venkat N | |
Author | Pala, Venkatesh Reddy | |
Date Accessioned | 2020-02-04T15:21:31Z | |
Date Available | 2021-12-01T09:01:54Z | |
Date Created | 2019-12 | |
Date of Issue | 2019-09-16 | |
Date Submitted | December 2019 | |
Identifier (URI) | http://hdl.handle.net/10342/7630 | |
Description | Most Western languages have witnessed the power of Artificial Intelligence (AI) in one form or another. This achievement is primarily due to the efforts of researchers contributing to the field of computational linguistics. However, many languages in the world have a great history and abundant literature but see little research activity, owing to factors such as lack of motivation and the non-availability of open-source corpora. Telugu is one such language, with little effort devoted to its digitization. The focus of this research is to extract text from images to produce corpora that enable computational linguistics and help conserve the literature. Deep learning with neural networks has produced proven solutions in this domain. Optical Character Recognition (OCR) is the approach adopted for digitizing Western languages. However, the same approach cannot be applied directly to Telugu because of the complexity of its script and the ambiguity in its dialects. To address this issue, we built a neural network system that can later be adapted to other languages like Telugu. By using neural networks in this research, we achieved an efficiency of 90 percent. Character segmentation is handled by the neural networks, while we specified segmentation only at the word level. A comparative study of our system and commercial APIs was made, and our system proved to be more accurate. | |
Mimetype | application/pdf | |
Language | en | |
Publisher | East Carolina University | |
Subject | OCR | |
Subject | Text Extraction | |
Subject | Indic Scripts | |
Library of Congress Subject Headings | Neural networks (Computer science) | |
Library of Congress Subject Headings | Programming languages (Electronic computers) | |
Medical Subject Headings | Indic Scripts | |
Title | TEXT EXTRACTION FROM IMAGES USING NEURAL NETWORKS | |
Type | Master's Thesis | |
Date Updated | 2020-01-29T14:30:11Z | |
Department | Computer Science | |
Degree Name | M.S. | |
Degree Level | Masters | |
Degree Discipline | MS-Computer Science | |
Degree Grantor | East Carolina University | |
Degree Department | Computer Science | |
Access Option | Restricted Campus Access Only | |
Embargo Lift Date | 2021-12-01 | |
Material Type | text | |