Using Support Vector Machines Classifier to Improve the Performance of Reinforcement Learning based Web Crawlers
Publish place: 9th Annual Conference of Computer Society of Iran
Publish Year: 1382
Type: Conference paper
Language: English
Pages: 8 (PDF)
Document National Code:
ACCSI09_098
Index date: 24 January 2008
Abstract
The main contribution of this paper is an approach that extends the crawling methods of the Cora spider, a reinforcement learning (RL) based spider. We introduce novel methods for calculating the Q-value in the reinforcement learning module of the spider. The proposed crawlers find the target pages faster and earn more rewards over the crawl than Cora's crawlers. We use a Support Vector Machines (SVMs) classifier for the first time as a text learner in Web crawlers and compare the results with crawlers that use a Naïve Bayes (NB) classifier for this purpose. The results show that crawlers using SVMs outperform crawlers using NB in the first half of crawling a Web site and find the target pages more quickly. The test bed for evaluating our approaches was the Web sites of the computer science departments of four universities, which have been made available offline.
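To illustrate what a Q-value means for a crawler of this kind, the following minimal sketch scores a hyperlink by its discounted future reward. It assumes, beyond what the abstract states, that every target page yields a reward of 1 and that the reward is discounted by a factor gamma for each additional link that must be followed. The function names, the input format, and gamma = 0.5 are hypothetical; the sketch does not reproduce the Q-value calculations proposed in the paper.

```python
def discounted_q_value(steps_to_targets, gamma=0.5):
    """Sum of discounted rewards for following one hyperlink.

    steps_to_targets: for each target page reachable through this
    hyperlink, the number of link hops needed to reach it
    (1 = the link points directly at a target page).
    This input format is a hypothetical simplification.
    """
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must be in (0, 1]")
    # Each target contributes reward 1, discounted once per extra hop.
    return sum(gamma ** (steps - 1) for steps in steps_to_targets)


def rank_links(frontier, gamma=0.5):
    """Return the frontier URLs ordered by descending Q-value."""
    return sorted(
        frontier,
        key=lambda url: discounted_q_value(frontier[url], gamma),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical frontier: URL -> hop counts to reachable target pages.
    frontier = {
        "/people/": [1, 2, 2],
        "/news/": [3],
        "/research/": [1],
    }
    for url in rank_links(frontier):
        print(url, discounted_q_value(frontier[url]))
```

During a real crawl the hop counts are not known in advance. This is where the text learner mentioned in the abstract (NB or SVMs) would come in: it would estimate a link's Q-value from text, and the crawler would follow the highest-scoring links first.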
Authors
Ahmad Abdollahzadeh Barfourosh
Computer Eng. & IT Faculty, Amirkabir University of Technology, Tehran, Iran
Hamid Reza Motahari Nezhad
Computer Eng. & IT Faculty, Amirkabir University of Technology, Tehran, Iran