A Novel Approach to Feature Selection Using PageRank algorithm for Web Page Classification

Publication Year: 1398 (Solar Hijri calendar)
Document type: Journal article
Language: English

This paper is 14 pages long and is available for download in PDF format.

National scientific document ID:

JR_JACR-10-4_001

Indexing date: 13 Ordibehesht 1400

Abstract:

In this paper, a novel filter-based approach is proposed that uses the PageRank algorithm to select the optimal subset of features and to compute their weights for web page classification. To evaluate the proposed approach, multiple experiments are performed on four datasets (WebKB, Reuters-R8, Reuters-R52, and 20NewsGroups), with accuracy as the main criterion. The results show that classifier accuracy on the WebKB, Reuters-R8, and Reuters-R52 datasets improved significantly, from 91% up to 96%, compared to the best result achieved by other feature selection methods such as IG and Chi-2. On the 20NewsGroups dataset, by contrast, accuracy showed no noticeable improvement and remained close to that of most of the compared methods. Overall, the evaluation indicates that the proposed approach obtains higher accuracy scores than the feature sets selected by the other methods.
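
The abstract does not say how the feature graph is built or how the PageRank scores are turned into weights, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes (hypothetically) that terms are nodes, that an edge weight counts the documents in which two terms co-occur, and that the PageRank score of a term serves both as its selection criterion and as its weight. The names `cooccurrence_graph`, `pagerank`, and `select_features` are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(docs):
    """Build an undirected weighted term graph (an assumed construction).

    Nodes are terms; an edge weight counts the documents containing both terms.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        for term in terms:
            graph[term]  # register the term even if it has no neighbours
        for a, b in combinations(terms, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=100, tol=1e-10):
    """Weighted PageRank by power iteration; returns {node: score}."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_weight = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without edges is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if out_weight[v] == 0.0:
                continue
            for u, w in graph[v].items():
                new_rank[u] += damping * rank[v] * w / out_weight[v]
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


def select_features(docs, k):
    """Return the k highest-ranked terms with their scores as weights."""
    scores = pagerank(cooccurrence_graph(docs))
    top = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return {t: scores[t] for t in top}
```

Calling `select_features(["web page link", "web page rank", "web graph"], 2)` on this toy corpus returns the two best-connected terms together with their scores. In a filter-based setup such as the one the abstract describes, the selected terms and weights would then be passed to a separate classifier, which this sketch does not cover.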

Authors

Farhad Rezvani

Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran

Farhad Soleimanian Gharehchopogh

Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran