Sentiment Analysis of User Reviews in Online Stores Using Natural Language Processing and Machine Learning

Publication Year: 1404
Document type: Conference paper
Language: English

This paper is 15 pages long and is available for download in PDF format.

National scientific document ID:

CMELC02_084

Indexing date: 16 Khordad 1404

Abstract:

Users' opinions and perspectives underpin many human activities and decision-making processes, and they play a crucial role in analyzing customer behavior and preferences. User reviews posted on e-commerce websites have become a valuable source for identifying users' needs and interests, and sentiment analysis makes it possible to extract these insights. In this study, sentence-level opinion mining techniques were applied to user reviews collected from two popular Persian websites. Several models, including XGBoost, Naive Bayes, K-Nearest Neighbors (KNN), Logistic Regression, Random Forest, and Support Vector Machine (SVM), were used to classify sentiments into positive and negative categories. On the Digikala dataset, the XGBoost, Logistic Regression, and Random Forest models achieved a score of 1 on every evaluation metric: accuracy, precision, recall, and F1-score. The SVM model also performed strongly, with an accuracy of 0.97 and an F1-score of 0.98. In contrast, the KNN and Naive Bayes models achieved accuracies of 0.85 and 0.76, respectively, indicating weaker performance than the other models. Similar results were obtained on the Fidiboo dataset, where the XGBoost, Logistic Regression, and Random Forest models again achieved a score of 1 across all metrics. These findings suggest that advanced models such as XGBoost and Random Forest, owing to their complex and flexible structures, are highly capable of identifying patterns in Persian data and classifying them accurately.
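The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch for the binary (positive/negative) case. This is not the authors' code: the function name `binary_metrics` and the toy labels are hypothetical, chosen only to show how accuracy, precision, recall, and F1-score are derived from the same confusion counts.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (1 = positive review, 0 = negative review).
# These are illustrative only and are NOT data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Under these definitions, a score of 1 on all four metrics means the classifier made no false-positive and no false-negative predictions on the evaluated samples.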

Authors

Amirhosein Hasani

Department of Computer Engineering, Hadaf Higher Education Institute, Sari, Iran

Fatemeh Ebrahimi

Department of Computer Engineering, Hadaf Higher Education Institute, Sari, Iran