RULE BASED PSEUDO N-GRAM MODEL FOR TELUGU SCRIPT

N. Swapna, B. Padmaja Rani

Abstract: With the increasingly widespread use of computers and the internet in India, a large amount of information in Indian languages is becoming available on the web. Automatic information processing and retrieval is therefore becoming an urgent need in the Indian context. This paper presents a new Rule based Pseudo N-gram model for the Telugu language. Pseudo N-gram is the process of stripping a word from the end, and it acts as a preliminary stage for the Rule based Pseudo N-gram. The Rule based Pseudo N-gram adds a set of rules that extract root words by removing the inflections that Pseudo N-gram alone does not recognize. We composed five such rules, written on the basis of the morphology, grammar rules and word derivation structure of Telugu. Telugu is one of the old and traditional languages of India; it belongs to the Dravidian language family and has its own script. It is an official language of the states of Telangana and Andhra Pradesh. Telugu is a morphologically rich language with high word conflation. In view of these complexities, we propose a Rule based Pseudo N-gram that provides a reasonable alternative to word based models and can also be used for text categorization. We conducted experiments on randomly selected Telugu documents and found that the accuracy of the Rule based Pseudo N-gram is up to 97.8%.

Keywords: Rule Based Pseudo N-Gram, Pseudo N-Gram, Text Categorization, Morphology, Grammar Rule.

DOI: https://doi.org/10.15623/ijret.2017.0601002
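
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the five rules, so the suffix rules, the lexicon, the `min_len` cutoff, and all function names below are hypothetical, and the words are romanized toy strings rather than Telugu script. The sketch assumes that Pseudo N-gram candidates are prefixes obtained by stripping one character at a time from the end of a word, and that the rules are applied only when no candidate matches a known root.

```python
# Hypothetical suffix rules as (suffix, replacement) pairs.
# These are placeholders, NOT the five rules from the paper.
SUFFIX_RULES = [("lo", ""), ("ku", ""), ("lu", "")]


def pseudo_ngrams(word, min_len=2):
    """Return prefixes of `word`, longest first, produced by
    stripping one character at a time from the end."""
    return [word[:i] for i in range(len(word), min_len - 1, -1)]


def pseudo_ngram_root(word, lexicon, min_len=2):
    """Return the longest prefix of `word` found in `lexicon`, or None."""
    for candidate in pseudo_ngrams(word, min_len):
        if candidate in lexicon:
            return candidate
    return None


def rule_based_root(word, lexicon, rules=SUFFIX_RULES):
    """Try the Pseudo N-gram stage first; if it finds no root,
    fall back to the first matching suffix rule."""
    root = pseudo_ngram_root(word, lexicon)
    if root is not None:
        return root
    for suffix, replacement in rules:
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)] + replacement
    return word  # no rule applied; leave the word unchanged


# Toy lexicon of known roots (an assumption for illustration).
lexicon = {"pustakam"}
print(rule_based_root("pustakamlo", lexicon))  # root found by prefix stripping
print(rule_based_root("gudilo", lexicon))      # root found by a suffix rule
```

In this sketch the first call is resolved by the stripping stage alone, while the second falls through to a rule because no prefix of the word is in the lexicon. Operating on real Telugu text would also need care with combining vowel signs, which a per-character strip does not handle.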