Improving Persian POS Tagging Using the Maximum Entropy Model

Publication year: 1392 (Solar Hijri)
Document type: Conference paper
Language: English
Length: 5 pages (PDF)

National scientific document ID: ICS12_231

Indexing date: 11 Mordad 1393

Abstract:

Part-of-speech (POS) tagging is a fundamental step in many speech and text processing applications. It is the process of assigning each word in an input sentence its category according to its contextual and grammatical properties. In addition to the general difficulties of POS tagging, such as disambiguating multi-category words and handling unknown words, the Persian language, unlike English, is a free word order language with characteristics of its own. These challenges can greatly affect the quality of part-of-speech tagging. Efficient POS taggers have been developed for some languages, especially English, but only a few studies have addressed Persian. To address these issues and achieve high POS tagging accuracy, we select features that capture the important characteristics of words in a sentence and use maximum entropy as the machine learning classifier. Experimental results show that the proposed Persian POS tagging system outperforms other state-of-the-art Persian taggers.
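To make the approach in the abstract concrete, the following is a minimal sketch of a maximum entropy tagger, that is, a multinomial logistic regression over contextual features, written in plain Python. It is not the authors' implementation: the abstract does not list the features used, so the feature templates in `extract_features` (current word, two-character suffix, previous and next word), the training settings, and the toy English sentences are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def extract_features(words, i):
    """Hypothetical contextual features for the word at position i."""
    w = words[i]
    return [
        "bias",
        "word=" + w,
        "suffix2=" + w[-2:],
        "prev=" + (words[i - 1] if i > 0 else "<s>"),
        "next=" + (words[i + 1] if i < len(words) - 1 else "</s>"),
    ]


def tag_probabilities(weights, tags, feats):
    """Maximum entropy model: p(tag | context) = exp(score(tag)) / Z."""
    scores = {t: sum(weights[(f, t)] for f in feats) for t in tags}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}


def train(sentences, epochs=50, lr=0.5):
    """Fit weights by stochastic gradient ascent on the conditional log-likelihood."""
    tags = sorted({t for sent in sentences for _, t in sent})
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent in sentences:
            words = [w for w, _ in sent]
            for i, (_, gold) in enumerate(sent):
                feats = extract_features(words, i)
                probs = tag_probabilities(weights, tags, feats)
                for t in tags:
                    # Gradient: observed feature count minus expected count.
                    grad = (1.0 if t == gold else 0.0) - probs[t]
                    for f in feats:
                        weights[(f, t)] += lr * grad
    return weights, tags


def tag(weights, tags, words):
    """Assign each word its most probable tag, independently per position."""
    result = []
    for i in range(len(words)):
        probs = tag_probabilities(weights, tags, extract_features(words, i))
        result.append(max(probs, key=probs.get))
    return result


# Toy training data, invented for illustration only (not from the paper).
TOY_SENTENCES = [
    [("the", "DET"), ("dog", "N"), ("runs", "V")],
    [("a", "DET"), ("cat", "N"), ("sleeps", "V")],
    [("the", "DET"), ("cat", "N"), ("runs", "V")],
]

weights, tags = train(TOY_SENTENCES)
print(tag(weights, tags, ["the", "dog", "sleeps"]))
```

The sketch tags each position independently. A practical tagger would typically also condition on previously predicted tags and add features aimed at unknown words, which the abstract names as one of the main difficulties.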

Keywords:

Natural Language Processing, Part of Speech Tagging, Persian Part of Speech Tagging, Maximum Entropy

Authors

Ahmad A. Kardan

Department of Computer Engineering and Information Technology Amirkabir University of Technology Tehran, Iran

Maryam Bahojb Imani

Department of Computer Engineering and Information Technology Amirkabir University of Technology Tehran, Iran
