Empirical Study of Different Classifiers for Sentiment Analysis
Abstract
The use of textual, unstructured data has grown rapidly. Websites, social networks, and many organizations now rely on this kind of data. A major problem arises when we try to determine the sentiment or class of such data, i.e., whether it is good or bad. Analyzing the sentiment of a text, document, or article is a challenging task. Several methods for sentiment analysis have been implemented over the years, but further improvement is still needed. In this paper, several sentiment datasets were used, along with a dataset created from reviews collected from Flipkart, a popular online shopping website. A sentiment-based function was implemented, and classifiers such as Naive Bayes, Support Vector Machines (SVM), decision tree, and k-Nearest Neighbors (k-NN) were then used to predict the sentiment type or class, and their accuracy was measured. The objective of this paper is to analyze the accuracy and performance of the different classifiers.
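The pipeline the abstract outlines (label texts with a sentiment function, train a classifier, measure accuracy) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's implementation: the tiny word list LEXICON, the helper names tokenize, lexicon_label and accuracy, the toy reviews, and the choice to show only a multinomial Naive Bayes with add-one smoothing. The other classifiers named in the abstract would be scored with the same accuracy function.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy valence lexicon; the paper's actual word list is not given here.
LEXICON = {"good": 3, "great": 3, "love": 3, "bad": -3, "poor": -2, "terrible": -3}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def lexicon_label(text):
    """Label a text 'pos' or 'neg' from the sum of its word valences."""
    score = sum(LEXICON.get(tok, 0) for tok in tokenize(text))
    return "pos" if score >= 0 else "neg"


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # class prior
            for tok in tokenize(text):
                if tok in self.vocab:  # words unseen in training are skipped
                    logp += math.log((counts[tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the reference label."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(texts)


# Toy reviews, invented for this sketch.
train_texts = [
    "good phone great battery",
    "love this great camera",
    "bad screen poor battery",
    "terrible phone bad camera",
]
test_texts = ["great phone", "poor camera"]

train_labels = [lexicon_label(t) for t in train_texts]
test_labels = [lexicon_label(t) for t in test_texts]
model = NaiveBayes().fit(train_texts, train_labels)
print(accuracy(model, test_texts, test_labels))
```

On real review data the held-out set would be much larger, and comparing classifiers fairly would call for the same train/test split for each of them.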
This work is licensed under a Creative Commons Attribution 3.0 License.