Improvement of clustering using I-K-means-+ algorithm in phrase-level Sentiment Analysis in Persian texts

Publication year: 1401
Document type: Conference paper
Language: English

Length: 6 pages (PDF)


National scientific document ID:

ITCT16_029

Indexing date: 22 Shahrivar 1401

Abstract:

Sentiment analysis is a subfield of natural language processing and data mining that aims to extract useful information from users' comments on social websites. Most research in sentiment analysis focuses on English, and few works have addressed Persian. This study therefore proposes an unsupervised system for classifying the sentiment of Persian texts. First, all phrases in the text are formulated; then the sentiment score and polarity of each phrase are calculated using the SentiWordNet lexicon and fuzzy linguistic hedges. Finally, a fuzzy entropy filter and the I-K-means-+ clustering algorithm are used to extract the key phrases that are significant for sentiment analysis. The performance of the proposed method is evaluated on polarity classification of a Persian dataset named Taghche. The I-K-means-+ and K-means clustering algorithms are also compared on the IMDB dataset. The results show that I-K-means-+ outperforms K-means: with I-K-means-+, accuracy and F1-score improve by 0.1.
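
The stages named in the abstract (hedge-adjusted phrase scores, a fuzzy entropy filter, then clustering) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the phrase list and its membership values are invented, the hedges are the standard Zadeh concentration and dilation operators applied directly to a positive-membership degree, the entropy is the normalized De Luca-Termini form with a hypothetical threshold of 0.95, and the clustering step is plain K-means (the baseline the paper compares against), not I-K-means-+.

```python
import math

# Zadeh-style hedges applied to a membership degree mu in [0, 1].
HEDGES = {
    "very": lambda mu: mu ** 2,        # concentration
    "somewhat": lambda mu: mu ** 0.5,  # dilation
}

def apply_hedge(mu, hedge=None):
    """Return the membership degree after an optional linguistic hedge."""
    return HEDGES[hedge](mu) if hedge else mu

def fuzzy_entropy(mu):
    """Normalized De Luca-Termini entropy: 0 for crisp values, 1 at mu = 0.5."""
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -(mu * math.log(mu) + (1 - mu) * math.log(1 - mu)) / math.log(2)

def kmeans_1d(values, centers, iterations=20):
    """Plain Lloyd's K-means on scalars (the baseline, not I-K-means-+)."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

# Hypothetical phrases: (text, positive membership from a lexicon, hedge).
phrases = [
    ("very good", 0.9, "very"),
    ("somewhat good", 0.49, "somewhat"),
    ("okay", 0.5, None),
    ("bad", 0.1, None),
    ("terrible", 0.05, None),
]

scored = [(text, apply_hedge(mu, hedge)) for text, mu, hedge in phrases]
# Keep only phrases whose polarity is reasonably unambiguous (low fuzzy entropy).
ENTROPY_THRESHOLD = 0.95  # hypothetical cut-off
kept = [(text, mu) for text, mu in scored if fuzzy_entropy(mu) < ENTROPY_THRESHOLD]
# Cluster the surviving scores into a negative and a positive group.
centers = kmeans_1d([mu for _, mu in kept], centers=[0.0, 1.0])
```

In this toy run the maximally ambiguous phrase ("okay", membership 0.5) is filtered out before clustering, and the remaining scores separate into a low and a high cluster. How I-K-means-+ would refine such a solution is not described in the abstract and is not shown here.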

Authors

Seyedeh Shadab Shahidi

Department of Computer Engineering, Yazd Branch, Islamic Azad University, Yazd, Iran

Sima Emadi

Department of Computer Engineering, Yazd Branch, Islamic Azad University, Yazd, Iran