A Modified Language Modeling Method for Authorship Attribution
Publication Year: 1395
Document Type: Conference paper
Language: English
This paper has 6 pages and is available for download in PDF format.
National Document ID: ICIKT08_006
Indexing Date: 5 Bahman 1395
Abstract:
This paper presents an approach to the closed-class authorship attribution (AA) problem. It is based on language modeling for classification and is called modified language modeling. Modified language modeling aims to solve the AA problem by combining bigram word weighting with unigram word weighting. It makes the relation between an unseen text and the training documents clearer by giving an extra reward to training documents that contain the bigram words as well as the unigram words. Moreover, instead of removing stop words using a stop-word list, each word's probability is multiplied by its IDF value. We evaluate experimental results for four approaches, unigram, bigram, trigram, and modified language modeling, using two Persian poem corpora: the WMPR-AA2016-A and WMPR-AA2016-B datasets. Results show that modified language modeling attributes authors better than the other approaches. The results on WMPR-AA2016-B, the larger dataset, are much better than on the other dataset for all approaches. This may indicate that, if adequate data is provided to train the language model, modified language modeling can be a good solution to the AA problem.
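The abstract's core idea, scoring an unseen text with IDF-weighted unigram probabilities plus an extra reward when the text's bigrams also appear in an author's training documents, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the smoothing scheme, the `alpha` reward weight, whitespace tokenization, and all function names here are assumptions.

```python
import math
from collections import Counter

def train_author_model(docs):
    """Build unigram and bigram counts from one author's training documents."""
    uni, bi = Counter(), Counter()
    for doc in docs:
        tokens = doc.split()
        uni.update(tokens)
        bi.update(zip(tokens, tokens[1:]))
    return uni, bi

def idf(term, all_docs):
    """Inverse document frequency of a term over the whole training collection."""
    df = sum(1 for d in all_docs if term in d.split())
    return math.log((1 + len(all_docs)) / (1 + df))

def score(text, uni, bi, all_docs, alpha=0.5):
    """Score an unseen text against one author's model: IDF-weighted unigram
    log-probabilities, plus an extra reward for bigrams seen in training."""
    total = sum(uni.values())
    vocab = len(uni)
    tokens = text.split()
    s = 0.0
    for w in tokens:
        p = (uni[w] + 1) / (total + vocab)      # add-one smoothing (an assumption)
        s += idf(w, all_docs) * math.log(p)     # weight by IDF instead of removing stop words
    for pair in zip(tokens, tokens[1:]):
        if pair in bi:                          # extra reward for shared bigrams
            s += alpha * math.log(1 + bi[pair])
    return s
```

At attribution time, the unseen text would be scored against each candidate author's model and assigned to the author with the highest score.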
Authors
Samane Vazirian
Kharazmi International Campus, Shahrood University of Technology Shahrood, Iran
Morteza Zahedi
Kharazmi International Campus, Shahrood University of Technology Shahrood, Iran