Persian Word Sense Disambiguation using LDA topic model

Publication Year: 1394
Document type: Conference paper
Language: English

Length: 9 pages (PDF)

National scientific document ID:

ICESCON01_0495

Indexing date: 25 Bahman 1394

Abstract:

Word sense disambiguation is a prominent problem in natural language processing. This paper proposes a model for Persian word sense disambiguation based on extracting new features. The model uses two groups of features: the words and signs that accompany the ambiguous word, and features derived from topic modeling. A topic model is a probabilistic model for extracting the abstract topics contained in the documents of a corpus. Here, the unsupervised Latent Dirichlet Allocation (LDA) method is used. Experimental results for four common ambiguous Persian words, extracted from the corpus of the Research Center of Intelligent Signal Processing, show a precision of 939. This demonstrates the effect of the method on finding the proper sense of words.
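
The two feature groups described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature definitions, the classifier, or the LDA settings, so everything below is an assumption made for illustration. The toy contexts are English glosses of an invented ambiguous word, the helper names `lda_gibbs` and `build_features` are hypothetical, and the hyperparameters (`alpha`, `beta`, two topics, 200 sweeps) are arbitrary. The sketch fits LDA with collapsed Gibbs sampling in pure Python and concatenates each context's bag-of-words counts with its estimated topic proportions.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, vocab, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens of word v in topic k
    topic_total = [0] * num_topics                              # tokens in topic k
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions; each row sums to 1.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return theta


def build_features(contexts, num_topics=2):
    """Concatenate context-word counts with LDA topic proportions (assumed layout)."""
    vocab = sorted({w for c in contexts for w in c})
    theta = lda_gibbs(contexts, num_topics, vocab)
    features = []
    for context, topics in zip(contexts, theta):
        counts = Counter(context)
        bag_of_words = [counts[w] for w in vocab]
        features.append(bag_of_words + topics)
    return vocab, features


# Hypothetical toy contexts around one ambiguous word (two intended senses).
contexts = [
    ["lion", "jungle", "hunt", "animal"],
    ["animal", "jungle", "lion", "roar"],
    ["milk", "drink", "glass", "cow"],
    ["cow", "milk", "glass", "breakfast"],
]

vocab, features = build_features(contexts, num_topics=2)
for context, row in zip(contexts, features):
    print(context, "topic proportions:", [round(x, 3) for x in row[len(vocab):]])
```

In a full system, vectors like these would be passed to a supervised classifier trained on sense-labelled occurrences of the ambiguous word; which classifier the paper uses is not stated in this abstract.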

Authors

Babak Masoudi

Department of Information Technology, Payame Noor University (PNU), P.O. Box 59391-3993, Tehran, I.R. of Iran

Aboozar Zandvakili

Department of Computer Engineering, College of Engineering, Jiroft Branch, Islamic Azad University, Jiroft, Iran