A Study on Intelligent Authorship Methods in Persian Language

Publication year: 1394 (Solar Hijri calendar)
Document type: Journal article
Language: English

This paper is 14 pages long and is available in PDF format.

National scientific document ID:

JR_JCSE-2-1_006

Indexing date: 12 Dey 1400

Abstract:

Author identification attempts to characterize the author of a piece of text so that texts written by different people can be reliably distinguished. The rapid development of Internet communication has made anonymous Internet tools, such as emails and weblogs, popular communication channels for the perpetrators of illegal acts, which has raised security concerns. The Persian language is of interest to a great number of individuals and organizations for political, social, artistic, cultural and religious reasons. In this paper, a number of intelligent writeprint methods, which help automatically identify a Persian writer based on his/her writing style, are studied and compared. For this purpose, after collecting two different databases, five feature types, including lexical, syntactic, semantic and application-specific features, were used to extract stylometric characteristics. The KNN, Delta, Neural Network, Decision Tree and Linear Discriminant Analysis classification methods were then applied to these databases. A comparison of the results showed that Linear Discriminant Analysis and KNN ranked first and second, respectively, in accuracy among the studied methods.
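
To make the idea of style-based attribution concrete, the following is a minimal sketch of one of the listed methods, Delta, under the assumption that it refers to the well-known Burrows's Delta (z-scored frequencies of the most frequent words, compared by mean absolute difference). The function names, the `n_words` parameter and the toy English corpus are illustrative assumptions and are not taken from the paper, whose actual features and databases are not described here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # \w is Unicode-aware for str patterns in Python 3, so Persian letters match too.
    return re.findall(r"\w+", text.lower())


def relative_freqs(tokens, vocab):
    # Relative frequency of each vocabulary word in one token list.
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in vocab]


def burrows_delta(known, unknown, n_words=30):
    """Return {author: Delta distance to `unknown`}; smaller means more similar.

    known: dict mapping an author name to a list of texts (hypothetical layout).
    unknown: a single text of disputed authorship.
    """
    # Vocabulary: the n_words most frequent words over all known texts.
    all_tokens = [t for texts in known.values() for txt in texts for t in tokenize(txt)]
    vocab = [w for w, _ in Counter(all_tokens).most_common(n_words)]

    # One frequency profile per author, built from that author's pooled texts.
    profiles = {
        author: relative_freqs([t for txt in texts for t in tokenize(txt)], vocab)
        for author, texts in known.items()
    }

    # Per-word mean and standard deviation across author profiles.
    columns = list(zip(*profiles.values()))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0  # avoid dividing by zero
        for col, m in zip(columns, means)
    ]

    def z_scores(vec):
        return [(x - m) / s for x, m, s in zip(vec, means, stds)]

    z_unknown = z_scores(relative_freqs(tokenize(unknown), vocab))
    return {
        author: sum(abs(p - q) for p, q in zip(z_scores(vec), z_unknown)) / (len(vocab) or 1)
        for author, vec in profiles.items()
    }


# Toy corpus (invented for illustration only).
known_texts = {
    "A": ["the cat and the dog and the bird", "the sun and the moon and the sea"],
    "B": ["a cat or a dog or a bird", "a sun or a moon or a sea"],
}
scores = burrows_delta(known_texts, "the river and the hill and the tree")
print(min(scores, key=scores.get))  # author with the smallest Delta
```

The disputed text is attributed to the author with the smallest distance. Word frequencies are only one lexical feature type; a fuller system would presumably combine them with the other feature types the abstract mentions.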

Authors

Zeinab Farahmandpour

Department of Computer Engineering, Faculty of Engineering, Bu-Ali Sina University, Hamedan, Iran.

Hooman Nikmehr

Department of Computer Architecture, Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran.