AUTOMATIC TEXT SUMMARIZATION USING SUPERVISED MACHINE LEARNING TECHNIQUE FOR HINDI LANGUAGE
Nikita Desai, Prachi Shah
Abstract: Automatic text summarization is a technique that compresses a large text into a shorter text that retains the important information. Although Hindi is the most widely used language in India and is also used in a few neighboring countries, there is a lack of a proper summarization system for Hindi text. Hence, in this paper, we present an approach to the design of an automatic text summarizer for Hindi text that generates a summary by extracting sentences. It deals with single-document summarization based on a machine learning approach. Each sentence in the document is represented by a set of features, namely sentence paragraph position, sentence overall position, numeric data, presence of inverted commas, sentence length, and keywords in the sentence. The sentences are classified into one of four classes, namely most important, important, less important, and not important. The classes carry ranks from 4 to 1 respectively, with “4” indicating the most important sentence and “1” the least relevant sentence. Next, a supervised machine learning tool, SVMrank, is used to train the summarizer to extract important sentences based on the feature vector. The sentences are ordered according to the ranking of classes. Then, based on the required compression ratio, sentences are included in the final summary. The experiment was performed on news articles of different categories such as Bollywood, politics, and sports. The performance of the technique is compared with human-generated summaries. The average result of the experiments indicates 72% accuracy at a 50% compression ratio and 60% accuracy at a 25% compression ratio.
Keywords: Hindi Text Summarization; Supervised Machine Learning; SVM; Text Mining; Sentence Extraction; Summary Generation
DOI: https://doi.org/10.15623/ijret.2016.0506065
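The pipeline described in the abstract (per-sentence features, a rank from 4 to 1, ordering by rank, and a cut at the compression ratio) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the feature encodings (reciprocal positions, token counts), the sentence splitter, and the reading of "compression ratio" as the fraction of sentences kept are all assumptions made for illustration. The training-line layout `<rank> qid:<id> <index>:<value> ...` follows the input format commonly used with the SVMrank tool; treat it as an assumption as well and check it against the tool's own documentation.

```python
import math
import re

# Assumed quote characters that count as "inverted commas".
QUOTE_CHARS = set("\"'\u201c\u201d\u2018\u2019")


def split_sentences(paragraph):
    # Assumption: sentences end at the Devanagari danda, '?' or '!'.
    parts = re.split("[\u0964?!]\\s*", paragraph)
    return [p.strip() for p in parts if p.strip()]


def sentence_features(paragraphs, keywords):
    """Return (sentence, features) pairs for the six features in the abstract.

    The numeric encodings are illustrative guesses, not taken from the paper.
    """
    positioned = []
    for para in paragraphs:
        for para_pos, sent in enumerate(split_sentences(para), start=1):
            positioned.append((para_pos, sent))

    rows = []
    for overall_pos, (para_pos, sent) in enumerate(positioned, start=1):
        tokens = sent.split()
        rows.append((sent, [
            1.0 / para_pos,                                      # paragraph position
            1.0 / overall_pos,                                   # overall position
            1.0 if re.search(r"\d", sent) else 0.0,              # numeric data
            1.0 if any(c in QUOTE_CHARS for c in sent) else 0.0,  # inverted commas
            float(len(tokens)),                                  # sentence length
            float(sum(1 for t in tokens if t in keywords)),      # keyword count
        ]))
    return rows


def to_svmrank_line(rank, qid, features):
    # One training line: rank label, query (document) id, 1-based feature indices.
    feats = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    return f"{rank} qid:{qid} {feats}"


def select_summary(sentences, scores, compression_ratio):
    """Keep the top-scoring fraction of sentences.

    Assumption: the kept sentences are emitted in their original document order.
    """
    n_keep = max(1, math.ceil(len(sentences) * compression_ratio))
    by_score = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(by_score[:n_keep])]
```

In this sketch, `scores` would hold the ranks or ranking scores predicted for each sentence by the trained model; the 50% and 25% settings reported in the abstract would correspond to `compression_ratio=0.5` and `compression_ratio=0.25` under the stated interpretation.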