Discovery of Potential Topics from Blog Articles by Machine Learning

Publication Year: 1393 (Solar Hijri calendar)
Document type: Journal article
Language: English

Pages: 7 (PDF format)

National scientific document ID:

JR_ACSIJ-4-2_010

Indexing date: 7 Azar 1394

Abstract:

This paper presents a method for discovering potential topics in the blogosphere. We define a potential topic as a currently unpopular phrase that has the potential to become a hot topic. To discover potential topics, the method builds a classifier that detects the potentiality of a topic from topic frequency transitions in blog articles. First, the method extracts candidate potential topics from categorized blog articles, because categorization enables us to extract specialists. To extract potential topics from the candidates, a classifier for detecting potential topics is built from topic frequency transition data. For this learning, we propose two types of learning methods: supervised learning and semi-supervised learning. Although supervised learning provides more precise results, it requires an enormous amount of labeled data, which is costly and difficult to create. On the other hand, semi-supervised learning can build a classifier from a small amount of labeled data and a large amount of unlabeled data. Experimental results with real blog data show the effectiveness of the proposed method.
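
The semi-supervised idea in the abstract can be illustrated with a small sketch. The abstract does not specify the features, the classifier, or the semi-supervised algorithm, so everything below is an assumption made for illustration: the two features (mean frequency and average period-to-period change), the nearest-centroid classifier, the self-training loop, and the toy frequency series are hypothetical and are not the authors' method or data.

```python
from statistics import mean

def features(series):
    """Map a frequency series to (mean level, average change per period)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return (mean(series), mean(diffs))

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def fit_centroids(labeled):
    """labeled: list of (feature_vector, label), label 1 = potential topic."""
    return {
        lab: tuple(mean(axis) for axis in zip(*[f for f, l in labeled if l == lab]))
        for lab in (0, 1)
    }

def predict(centroids, f):
    """Return (label, margin); a larger margin means a more confident guess."""
    d0, d1 = sq_dist(f, centroids[0]), sq_dist(f, centroids[1])
    return (1 if d1 < d0 else 0), abs(d0 - d1)

def self_train(labeled, unlabeled, rounds=3, per_round=1):
    """Self-training: repeatedly pseudo-label the most confident unlabeled items."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        cents = fit_centroids(labeled)
        ranked = sorted(pool, key=lambda f: predict(cents, f)[1], reverse=True)
        for f in ranked[:per_round]:
            labeled.append((f, predict(cents, f)[0]))
            pool.remove(f)
    return fit_centroids(labeled)

# Toy data (hypothetical): two labeled series and three unlabeled ones.
labeled = [(features([1, 2, 4, 7]), 1), (features([5, 5, 4, 5]), 0)]
unlabeled = [features(s) for s in ([0, 1, 3, 6], [6, 5, 5, 6], [2, 3, 5, 9])]

model = self_train(labeled, unlabeled)
rising_label = predict(model, features([1, 1, 3, 6]))[0]
flat_label = predict(model, features([4, 4, 4, 4]))[0]
```

In this sketch only two series carry human labels; the remaining series are labeled by the model itself, one per round, starting with the predictions it is most confident about. A real system would need normalized features and a stronger classifier.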

Authors

Yoshiaki YASUMURA

College of Engineering, Shibaura Institute of Technology, Japan Saitama City, Japan

Yuhei KOSAKA

Graduate School of System Informatics, Kobe University, Japan Kobe City, Japan

Hiroyoshi TAKAHASHI

Graduate School of System Informatics, Kobe University, Japan Kobe City, Japan

Kuniaki UEHARA

Graduate School of System Informatics, Kobe University, Japan Kobe City, Japan