Distributed Online Pre-Processing Framework for Big Data Sentiment Analytics

Publication year: 1401 (Solar Hijri calendar)
Document type: Journal article
Language: English

This paper is 10 pages long and is available in PDF format.

National scientific document ID:

JR_JADM-10-2_005

Indexing date: 28 Khordad 1401

Abstract:

Sentiment analysis of big data from social networks can help research and business projects extract useful insights from text-oriented content. In this paper, we propose a general pre-processing framework for sentiment analysis that uses FastText together with Recurrent Neural Network variants to prepare textual data efficiently. The framework consists of three stages: data cleansing, tweet padding, and extraction of word embeddings from FastText followed by conversion of tweets to these vectors. All stages are implemented using the DataFrame data structure in Apache Spark. The main objective is to improve the performance of online sentiment analysis in terms of pre-processing time and to handle large-scale data volumes. In addition, we propose a distributed intelligent system for online social big data analytics, designed to store, process, and classify huge amounts of information online. The proposed system can adopt any word embedding library, such as FastText, with different distributed deep learning models, such as LSTM or GRU. The evaluation results show that the proposed framework significantly improves on previous RDD-based methods in terms of processing time and data volume.
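To make the three stages concrete, the following is a minimal single-machine sketch in plain Python. It is not the authors' Spark DataFrame implementation: the cleaning rules, the padding token, the fixed length, and the toy vector table standing in for FastText are all assumptions chosen for illustration.

```python
import re

PAD = "<pad>"   # hypothetical padding token
MAX_LEN = 6     # hypothetical fixed tweet length
DIM = 3         # toy embedding dimension

# Toy lookup table standing in for FastText vectors (values are made up).
TOY_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "phone": [0.2, 0.7, 0.1],
    "bad": [-0.8, 0.1, 0.0],
}


def clean(tweet):
    """Stage 1: lowercase, drop URLs, mentions, and non-letters, then tokenize."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()


def pad(tokens, max_len=MAX_LEN):
    """Stage 2: truncate or right-pad the token list to a fixed length."""
    return tokens[:max_len] + [PAD] * max(0, max_len - len(tokens))


def embed(tokens):
    """Stage 3: map each token to a vector; unknown and padding tokens get zeros."""
    return [TOY_VECTORS.get(tok, [0.0] * DIM) for tok in tokens]


def preprocess(tweet):
    """Run the three stages in order and return a MAX_LEN x DIM matrix."""
    return embed(pad(clean(tweet)))


matrix = preprocess("Great phone!! @shop https://t.co/x #great")
```

Each tweet becomes a fixed-size matrix, which is the input shape that recurrent models such as LSTM or GRU expect. In a distributed setting, the same three functions would be applied per row of a distributed table rather than to a single string.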

Authors

M. Molaei

Department of Computer Engineering, University of Zanjan, Iran.

D. Mohamadpur

Department of Computer Engineering, University of Zanjan, Iran.
