A Language-Independent Method to Extract the Essence of a Text in the Form of Phrases

Publication Year: 1394 (Solar Hijri)
Document Type: Conference paper
Language: English

Length: 8 pages (PDF)

National Scientific Document ID:

CITCONF03_474

Indexing Date: 12 Tir 1395

Abstract:

In this paper, we present a method to extract the essence of a text using statistical calculations. The essence of a text captures its most important concepts and can be used to retrieve documents in search engines more efficiently and to summarize the text more precisely. Here the essence is expressed as a set of phrases, so detecting the main phrases of a text plays a central role in the method. We propose a language-independent method based on a combination of statistical information extracted from the document itself. The method requires no natural language processing, ontologies, or document corpora, and it identifies the main points of a text in real time without any training data. Several parameters are calculated for each phrase based on its frequency, the frequency of related phrases, and the location of the phrase in the text. Based on these parameters, phrases judged unimportant are removed, and the remaining phrases present the main points of the text. Evaluations show that the proposed approach achieves high precision and accuracy when extracting the essence of a single text in the form of phrases.
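The abstract does not give the actual parameters or thresholds, so the following is only a minimal sketch of the general idea it describes: score candidate phrases from their frequency, the frequency of related phrases, and their location, then discard the unimportant ones. Everything specific here is an assumption made for illustration, not the authors' method: the names `candidate_phrases`, `contains`, and `extract_essence`, the treatment of "related phrases" as longer phrases that contain a shorter one, the `min_freq` cutoff, and the scoring weights. Word splitting uses a Unicode-aware regular expression rather than any language-specific resource, which assumes the script separates words with spaces or punctuation.

```python
import re
from collections import Counter


def candidate_phrases(text, max_len=3):
    """Yield (word_offset, phrase) for each word n-gram inside one sentence."""
    offset = 0
    for sentence in re.split(r"[.!?;:\n]+", text.lower()):
        words = re.findall(r"\w+", sentence)
        for i in range(len(words)):
            for n in range(1, min(max_len, len(words) - i) + 1):
                yield offset + i, " ".join(words[i:i + n])
        offset += len(words)


def contains(longer, shorter):
    """True if `shorter` occurs inside `longer` on word boundaries."""
    return longer != shorter and f" {shorter} " in f" {longer} "


def extract_essence(text, max_len=3, min_freq=2, top_k=5):
    """Return up to `top_k` phrases ranked by an illustrative statistical score."""
    occurrences = list(candidate_phrases(text, max_len))
    if not occurrences:
        return []
    total_words = max(offset for offset, _ in occurrences) + 1
    freq = Counter(phrase for _, phrase in occurrences)
    first_seen = {}
    for offset, phrase in occurrences:
        first_seen.setdefault(phrase, offset)

    # Filter 1 (assumed): discard phrases that occur too rarely.
    frequent = {p: f for p, f in freq.items() if f >= min_freq}

    # Filter 2 (assumed): discard a phrase when a longer frequent phrase
    # contains it and occurs just as often, so it never appears on its own.
    kept = {
        p: f for p, f in frequent.items()
        if not any(contains(q, p) and g >= f for q, g in frequent.items())
    }

    def score(phrase):
        # Hypothetical weighting: frequency x phrase length x earliness bonus.
        earliness = 1.0 - first_seen[phrase] / total_words
        return kept[phrase] * len(phrase.split()) * (1.0 + earliness)

    return sorted(kept, key=score, reverse=True)[:top_k]
```

Calling `extract_essence(document_text)` on a single document returns a ranked list of phrases. No corpus, stop-word list, or training step is involved, which is the property the abstract emphasizes; very frequent function words may still survive the filters in this simplified version.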

Authors

Javad Davoudi Moghaddam

K. N. Toosi University of Technology, Computer Engineering Faculty, Tehran, Iran

Amin Mosallanezhad

K. N. Toosi University of Technology, Computer Engineering Faculty, Tehran, Iran

Ali Ahmadi

K. N. Toosi University of Technology, Computer Engineering Faculty, Tehran, Iran
