Please use this identifier to cite or link to this item: http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/816
Full metadata record
DC Field | Value | Language
dc.contributor.author | Ramanan, M. | -
dc.contributor.author | Ramanan, A. | -
dc.contributor.author | Charles Eugene, Y. | -
dc.date.accessioned | 2016-01-04T07:25:53Z | -
dc.date.accessioned | 2022-06-28T04:51:44Z | -
dc.date.available | 2016-01-04T07:25:53Z | -
dc.date.available | 2022-06-28T04:51:44Z | -
dc.date.issued | 2015-12-12 | -
dc.identifier.uri | http://repo.lib.jfn.ac.lk/ujrr/handle/123456789/816 | -
dc.description.abstract | An optical character recognition (OCR) system consists of the following phases: pre-processing and segmentation, feature extraction, classification, and post-processing. This paper focuses on the pre-processing and segmentation tasks, which play a major role in the subsequent processes of an OCR system. The objective of pre-processing and segmentation is to improve the quality of the input image. In addition, this phase removes unnecessary portions of the input image that would otherwise complicate the subsequent steps of OCR and reduce the overall recognition rate. The pre-processing and segmentation step consists of many sub-processes, namely image binarisation, noise removal, skew detection and correction, page segmentation, text or non-text classification, line segmentation, word segmentation, and character segmentation. This paper proposes a new approach to calculate the skew angle, segment the document, and classify the blocks as text or non-text. The skew angle of the scanned document is calculated using a Wiener filter, a smearing technique, and the Radon transform. The document image is segmented into blocks using the run length smearing algorithm and connected component analysis. Features such as basic, density, and HOG features are extracted from each block for text and non-text classification. The proposed methods are tested on 54 documents. The testing results show a recognition rate of 96.30% for skew detection and correction, whereas the recognition rate is 99.18% for text or non-text classification with binary SVMs using an RBF kernel. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.subject | Printed Tamil documents; Skew correction; Textual classification | en_US
dc.title | A Pre-processing Method for Printed Tamil Documents: Skew Correction and Textual Classification | en_US
dc.type | Article | en_US
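
The abstract names the run length smearing algorithm as the block segmentation step. The following is a minimal sketch of that general technique, not the authors' implementation. It assumes a binary image stored as a list of rows with 1 for ink and 0 for background, and it fills only background runs bounded by ink on both sides. The function names and the threshold parameters are hypothetical, and the record does not state which thresholds the paper uses.

```python
def smear_run(pixels, threshold):
    """Smear one row or column of a binary image.

    Background runs (0) of length <= threshold that lie between two
    foreground pixels (1) are filled with 1. Runs touching the border
    are left unchanged (an assumption of this sketch).
    """
    out = list(pixels)
    last_fg = None  # index of the most recent foreground pixel
    for i, px in enumerate(pixels):
        if px == 1:
            if last_fg is not None:
                gap = i - last_fg - 1
                if 0 < gap <= threshold:
                    for j in range(last_fg + 1, i):
                        out[j] = 1
            last_fg = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Simplified run length smearing: smear rows and columns, then AND."""
    # Horizontal pass: smear every row.
    horizontal = [smear_run(row, h_threshold) for row in image]
    # Vertical pass: transpose, smear every column, transpose back.
    columns = [list(col) for col in zip(*image)]
    smeared_cols = [smear_run(col, v_threshold) for col in columns]
    vertical = [list(row) for row in zip(*smeared_cols)]
    # A pixel is kept only if both passes marked it as foreground.
    return [
        [h & v for h, v in zip(h_row, v_row)]
        for h_row, v_row in zip(horizontal, vertical)
    ]
```

In a pipeline like the one the abstract outlines, the smeared image would then go to connected component analysis, so that each merged region becomes one candidate block for text or non-text classification.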
Appears in Collections: Computer Science

Files in This Item:
File | Description | Size | Format
Skew Correction and Textual Classification.pdf | | 178.5 kB | Adobe PDF

