Author Masking against Author Verification and a means to strengthen it using different features

Publication year: 1397 (Solar Hijri)
Document type: Conference paper
Language: English

This paper is 16 pages long and is available for download in PDF format.

National scientific document ID:

CEITCONF02_086

Indexing date: 27 Ordibehesht 1398

Abstract:

This survey paper is mainly about author masking, and it also discusses author verification. In author verification, given a pair of documents, the task is to decide whether they were written by the same author. The surveyed verification work proposes a machine learning approach based on a number of different features that characterize documents from widely different points of view. In author masking, given a document and a set of documents by the same author, the task is to change the former so that its author can no longer be identified. Masking one's writing style has been used by novelists who wish to pass unnoticed, as well as by people who want to share information without being linked to it. The two tasks oppose each other: by studying author masking methods, we can detect the weaknesses of an authorship verification system and build a system that can identify a text's author more easily. In this paper, we describe a verification approach that constructs non-overlapping groups of homogeneous features, uses a random forest regressor for each feature group, and combines the outputs of all regressors by their arithmetic mean. We also describe opposing approaches that try to mask a text's author through simple translation, sentence transformation, and changes to word choice distribution, sentence length preference, and so on.
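
The group-wise verification scheme mentioned in the abstract can be illustrated with a minimal sketch. It is not the surveyed implementation: the two feature groups, the `FUNCTION_WORDS` list, and all function names are assumptions made for illustration, and the per-group random forest regressor is replaced by a simple distance-based stand-in (`group_similarity`) so that the example needs only the Python standard library. What it does show is the structure: non-overlapping feature groups, one scorer per group, and an arithmetic mean over the group scores.

```python
import re
from statistics import mean

# Hypothetical function-word list; the source does not specify one.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "it", "is")


def sentence_lengths(text):
    """Return the word count of each sentence, splitting on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def length_features(text):
    """Feature group 1: sentence length preference (mean, max, min)."""
    lengths = sentence_lengths(text) or [0]
    return [mean(lengths), max(lengths), min(lengths)]


def word_choice_features(text):
    """Feature group 2: relative frequency of a few function words."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty input
    return [words.count(w) / total for w in FUNCTION_WORDS]


def group_similarity(features_a, features_b):
    """Stand-in for a trained per-group regressor (assumption).

    Returns 1.0 for identical feature vectors and approaches 0.0
    as the L1 distance between them grows.
    """
    distance = sum(abs(x - y) for x, y in zip(features_a, features_b))
    return 1.0 / (1.0 + distance)


# Non-overlapping groups: each feature belongs to exactly one group.
FEATURE_GROUPS = {
    "sentence_length": length_features,
    "word_choice": word_choice_features,
}


def verification_score(doc_a, doc_b):
    """Combine the per-group scores by their arithmetic mean."""
    scores = [
        group_similarity(extract(doc_a), extract(doc_b))
        for extract in FEATURE_GROUPS.values()
    ]
    return mean(scores)
```

A higher `verification_score` suggests the same author under this toy model. In the same spirit, a masking method that shifts sentence lengths or function-word usage would lower the corresponding group score, which is why the two tasks can be studied against each other.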

Authors

Farnaz Mahan

Assistant Professor, Computer Science Department, University of Tabriz, Tabriz, Iran

Salar Khayat Mirza Rasluli

Master's student in soft computing and artificial intelligence, Computer Science Department, University of Tabriz, Tabriz, Iran

Erfan Mohammadzadeh

Bachelor's student in computer science, Computer Science Department, Ardabil, Iran