Summarization Algorithm for Data Stream to Speed up Outlier Data Detection

Publication year: 1402 (Solar Hijri calendar)
Document type: Journal article
Language: English

This paper is 12 pages long and is available for download in PDF format.

National scientific document ID:

JR_JCSE-10-1_003

Indexing date: 18 Farvardin 1403

Abstract:

Outlier detection in data streams is an essential problem in data processing. With the massive growth of streaming data generated by the spread of the Internet of Things, it has become a significant challenge. Much progress has been made with local outlier detection algorithms, such as density-based local outlier factor algorithms, which are suited to static data; incremental versions of these algorithms are used to detect local outliers in streaming data. However, outlier detection in streaming data faces several challenges: limited memory capacity, high execution time, the inaccessibility of all data at one time, and changes in data distribution (increasing and decreasing input rates, uncertainty, etc.). In this paper, we propose a density-based summarization algorithm that summarizes the data every time the buffer fills. The proposed algorithm maintains the desired shape of the clusters at a low computational cost. To this end, larger clusters are selected and the data in their dense areas are reduced so that the shape of the old clusters is not lost. The proposed summarization algorithm reduces execution time and increases precision, recall, and F1 score compared with the evaluated algorithms.
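The abstract does not give the algorithm's details, so the following is only a rough illustrative sketch of the general idea (thin out dense regions whenever a bounded buffer fills, while leaving sparse regions untouched). It is not the authors' method: it substitutes a simple grid-based density estimate for the paper's cluster selection, and every name in it (`summarize`, `SummarizingBuffer`, `cell_size`, `max_per_cell`, `capacity`) is a hypothetical choice made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def summarize(points: List[Point], cell_size: float, max_per_cell: int) -> List[Point]:
    """Grid-based thinning (an assumed stand-in, not the paper's algorithm).

    Points are grouped into square grid cells. Cells holding more than
    max_per_cell points are treated as dense and subsampled; all other
    cells are kept whole, so every occupied cell stays occupied.
    """
    cells: Dict[Tuple[int, int], List[Point]] = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

    summary: List[Point] = []
    for members in cells.values():
        if len(members) <= max_per_cell:
            # Sparse region: keep every point (possible outliers live here).
            summary.extend(members)
        else:
            # Dense region: keep an evenly spaced subsample of the members.
            step = len(members) / max_per_cell
            summary.extend(members[int(i * step)] for i in range(max_per_cell))
    return summary


class SummarizingBuffer:
    """Bounded buffer that summarizes its contents each time it fills."""

    def __init__(self, capacity: int, cell_size: float, max_per_cell: int) -> None:
        self.capacity = capacity
        self.cell_size = cell_size
        self.max_per_cell = max_per_cell
        self.points: List[Point] = []

    def add(self, point: Point) -> None:
        self.points.append(point)
        if len(self.points) >= self.capacity:
            # Buffer is full: replace its contents with the thinned summary.
            self.points = summarize(self.points, self.cell_size, self.max_per_cell)
```

A stream consumer would call `add` for each arriving point and run its outlier detector over `points`. In this sketch, an isolated point is never dropped, because only cells above the `max_per_cell` threshold are subsampled. If the data are spread so widely that no cell is dense, summarization removes nothing and the buffer can exceed `capacity`; a real implementation would need a policy for that case.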

Authors

Hadid Mollashahi

Faculty of Electrical & Computer Engineering, University of Birjand, Birjand, Iran.

Hamid Saadatfar

Faculty of Electrical & Computer Engineering, University of Birjand, Birjand, Iran.

Hamed Vahdatnejad

Faculty of Electrical & Computer Engineering, University of Birjand, Birjand, Iran.

References and sources of this paper:

The list below shows the references and sources used in this paper. These references were extracted entirely by machine using artificial intelligence and may therefore contain errors; the accuracy of the extraction is expected to improve over time. References whose corresponding papers are indexed and found in Civilica are linked to the paper itself: