Persian SMS Spam Detection using Machine Learning and Deep Learning Techniques

Roya Khorashadizade; Somayyeh Jafarali Jassbi; Alireza Yari

Published in: International Journal of Web Research, Vol. 5, Issue 1

Publication year: 1401 (Solar Hijri calendar)

Document type: Journal article

Language: English

This paper is 10 pages long and is available in PDF format.

Permanent link to this paper:

https://civilica.com/doc/1611866

National scientific document ID:

JR_IJWR-5-1_008

Indexing date: 13 Esfand 1401

Abstract:

Spam messages are unsolicited texts sent by unknown senders, and they cause a range of problems for smartphone users. They inconvenience users, waste network traffic, raise computational cost, occupy storage space on the mobile phone, and expose recipients to abuse and fraud, to name only a few of their downsides. Automated identification of suspicious and spam messages is therefore vitally important. Cleverly composed text messages can be difficult to recognize, and existing methods in this area are further hindered by the lack of adequate Persian datasets. A large body of research has shown that techniques based on deep learning and combined (hybrid) learning are better at identifying unwanted text messages. This work seeks to develop an effective strategy for identifying SMS spam by combining machine learning classification algorithms with deep learning models. After preprocessing the collected dataset, the proposed deep learning approach applies two convolutional neural network layers, followed by an LSTM layer and a fully connected layer, to extract features from the data. On the machine learning side, a support vector machine uses the extracted features to determine the final classification. Results indicate that the proposed model performs better than existing techniques, achieving an accuracy of 97.7%.
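
The pipeline described in the abstract (convolutional layers, an LSTM layer, a fully connected layer, then a support vector machine) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: every dimension, name, and the toy token-id data are hypothetical, biases are omitted, and the feature extractor uses fixed random weights rather than trained ones. It only shows how data would flow through such a hybrid model; a real system would presumably train the extractor with a deep learning framework on a labeled Persian SMS dataset.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def rand_matrix(rows, cols, scale=0.5):
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv1d_relu(seq, kernels, width):
    """Slide each kernel over `width` consecutive vectors, then apply ReLU."""
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]  # flatten window
        out.append([max(0.0, a) for a in matvec(kernels, window)])
    return out

def lstm_last_state(seq, W, hidden):
    """Run one LSTM layer (no biases) and return the final hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        z = matvec(W, x + h)  # stacked gates: input, forget, output, candidate
        i = [sigmoid(v) for v in z[0:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:4 * hidden]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h

# Hypothetical sizes; the source does not state the real hyperparameters.
VOCAB, EMB, FILTERS, HIDDEN, FEATS, WIDTH = 20, 4, 6, 5, 4, 3
EMBED = rand_matrix(VOCAB, EMB)
CONV1 = rand_matrix(FILTERS, WIDTH * EMB)
CONV2 = rand_matrix(FILTERS, WIDTH * FILTERS)
LSTM_W = rand_matrix(4 * HIDDEN, FILTERS + HIDDEN)
DENSE = rand_matrix(FEATS, HIDDEN)

def extract_features(token_ids):
    """Embedding -> conv -> conv -> LSTM -> fully connected layer."""
    x = [EMBED[t] for t in token_ids]
    x = conv1d_relu(x, CONV1, WIDTH)
    x = conv1d_relu(x, CONV2, WIDTH)
    h = lstm_last_state(x, LSTM_W, HIDDEN)
    return [math.tanh(v) for v in matvec(DENSE, h)]

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM trained by subgradient descent on the regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # inside the margin: hinge term contributes
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:           # outside the margin: only regularization
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy, made-up token-id sequences: +1 stands for spam, -1 for legitimate.
messages = [[1, 2, 3, 4, 5, 6, 7, 8], [2, 3, 4, 5, 6, 7, 8, 9],
            [10, 11, 12, 13, 14, 15, 16, 17], [11, 12, 13, 14, 15, 16, 17, 18]]
labels = [1, 1, -1, -1]

features = [extract_features(m) for m in messages]
w, b = train_linear_svm(features, labels)
predictions = [predict(w, b, f) for f in features]
```

The split mirrors the design the abstract describes: the neural layers turn a message into a fixed-length feature vector, and the SVM makes the final spam/non-spam decision on that vector. Each message needs at least five tokens here, because two width-3 convolutions without padding shorten the sequence by four.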

Keywords:

SMS spam, spam detection, Support Vector Machine, convolutional neural network, LSTM

Authors

Roya Khorashadizade

Department of Information Technology, Science and Research Branch, Islamic Azad University, Tehran, Iran

Somayyeh Jafarali Jassbi

Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

Alireza Yari

IT Research Faculty, ICT Research Institute (Iran Telecom Research Center), Tehran, Iran