A LEXICON BASED ALGORITHM FOR NOISY TEXT NORMALIZATION AS PRE-PROCESSING FOR SENTIMENT ANALYSIS
Sudipta Roy, Sourish Dhar, Saprativa Bhattacharjee, Anirban Das
Abstract: Sentiment analysis, in the most general sense, refers to the classification of a piece of text into one of three classes (positive, negative or neutral) according to its polarity. The text may be an entire document, a paragraph, a sentence, a phrase or even a single word. Most of the literature on sentiment analysis is dedicated to well-formed text of the kind found in newspapers, journals and magazines. The unprecedented rise in popularity of social media has brought with it a vast sea of user-generated content, much of which conveys subjective opinions on products, services, organizations, public figures and so on. But the textual data obtained from such sources is extremely noisy. It is characterized by numerous spelling and grammatical errors, as well as by heavy use of acronyms, abbreviations, shortened words and slang. Currently available Natural Language Processing (NLP) tools are not designed to handle this type of data. In this report we suggest a number of methods for making the data obtained from social media less noisy and more suitable for sentiment analysis.
Keywords: Sentiment analysis, opinion mining, natural language processing, text mining, noise reduction
DOI: https://doi.org/10.15623/ijret.2013.0214013
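
Illustrative sketch: the abstract does not spell out the algorithm, so the following is only a minimal guess at what a lexicon-based normalization step for noisy text might look like, not the authors' method. The lexicon entries, the function name normalize and the rule for squeezing elongated characters are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon mapping noisy tokens to standard forms.
# These entries are illustrative and are not taken from the paper.
SLANG_LEXICON = {
    "gr8": "great",
    "u": "you",
    "luv": "love",
    "thx": "thanks",
    "pls": "please",
}

# Matches a character repeated three or more times in a row.
_ELONGATION = re.compile(r"(.)\1{2,}")
# Very simple tokenizer: runs of lowercase letters, digits and apostrophes.
_TOKEN = re.compile(r"[a-z0-9']+")


def normalize(text, lexicon=SLANG_LEXICON):
    """Lowercase the text, squeeze elongated characters, expand lexicon hits."""
    normalized = []
    for token in _TOKEN.findall(text.lower()):
        # Reduce runs of 3+ identical characters to 2 (assumed heuristic).
        token = _ELONGATION.sub(r"\1\1", token)
        # Replace the token if the lexicon knows it, otherwise keep it.
        normalized.append(lexicon.get(token, token))
    return " ".join(normalized)


if __name__ == "__main__":
    print(normalize("I luv this phone, thx u!"))
```

A practical system would likely need a much larger lexicon and a dictionary check after squeezing repeated characters, since this sketch leaves forms such as "soo" unresolved and discards punctuation that may itself carry sentiment.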