Mehr: A Persian Coreference Resolution Corpus
Publish Year: 1402
Document type: Journal article
Language: English
Length: 11 pages (PDF)
National scientific document ID:
JR_JADM-11-3_006
Indexing date: 20 Dey 1402
Abstract:
Coreference resolution is one of the essential tasks of natural language processing. This task identifies all in-text expressions that refer to the same entity in the real world. Coreference resolution is used in other fields of natural language processing, such as information extraction, machine translation, and question answering.
This article presents a new Persian coreference resolution corpus named the Mehr corpus. The article's primary goal is to develop a Persian coreference corpus that resolves some of the previous Persian corpus's shortcomings while maintaining high inter-annotator agreement. The corpus annotates coreference relations for noun phrases, named entities, pronouns, and nested named entities. Two baseline pronoun resolution systems are developed, and their results are reported. The corpus comprises 400 documents and about 170k tokens. Annotation was carried out with the WebAnno tool.
Authors
Hassan Haji Mohammadi
Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran.
Alireza Talebpour
Department of Computer Engineering, Shahid Beheshti University, Tehran, Iran.
Ahmad Mahmoudi Aznaveh
Department of Computer Engineering, Shahid Beheshti University, Tehran, Iran.
Samaneh Yazdani
Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran.
References and sources of this paper:
The list below shows the references and sources used in this paper. These references were extracted fully automatically using artificial intelligence and may therefore contain errors; the accuracy of this extraction is expected to improve over time. References whose papers are indexed and found in Civilica are linked to the paper itself: