CIVILICA We Respect the Science
(Specialized publisher of the country's conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Clustering Large-Scale Data using an Incremental Heap Self-Organizing Map

Article title: Clustering Large-Scale Data using an Incremental Heap Self-Organizing Map
National article ID: JR_ITRC-14-2_005
Published in 1401 (Solar Hijri calendar)
Authors:

Mehdi Fasanghari - Iran Telecommunication Research Center, Tehran, Iran.
Helena Bahrami - School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, New Zealand.
Hamideh Sadat Cheraghchi - Iran Health Insurance Organization, Tehran, Iran.

Abstract:
In machine learning and data analysis, clustering large amounts of data is one of the most challenging tasks. Many fields, including research, health, social life, and commerce, rely on information that is generated every second. The importance of this enormous volume of data in all facets of contemporary life has prompted numerous attempts to develop new methods for analyzing it. In this research, an Incremental Heap Self-Organizing Map (IHSOM) is proposed for clustering a vast and continually growing amount of data. The incremental nature of IHSOM allows it to cope with changing and evolving environments; in other words, IHSOM can quickly adapt to the size of a dataset. The binary heap tree structure of the proposed approach offers several advantages over other structures. First, the topology, or neighborhood relationship, between data in the input space is maintained in the output space. Second, outlier data are routed to the tree's leaf nodes, where they can be managed efficiently. This capability is provided by a probability density function that serves as a threshold for allocating more similar data to a cluster and transferring less similar data to the following node. The process of pruning and expanding nodes makes the algorithm noise-resistant, more precise in clustering, and memory-efficient. In addition, the heap tree structure accelerates node traversal and reorganization following the addition or deletion of nodes. IHSOM's simple user-defined parameters make it a practical unsupervised clustering approach. The performance of the proposed algorithm is evaluated on both synthetic and real-world datasets and compared with existing hierarchical self-organizing maps and clustering algorithms. The results demonstrate IHSOM's proficiency in clustering tasks.
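
The routing idea described in the abstract can be illustrated with a small sketch. This is not the authors' IHSOM: the abstract gives no update rules, so everything here is an assumption made for illustration, including the class name `HeapTreeSketch`, the isotropic Gaussian kernel used as the density-style threshold, the parameters `sigma`, `threshold`, `lr`, and `max_depth`, and the rule for choosing a child. Pruning is omitted. Nodes are stored under binary-heap indices, so the children of node `i` sit at `2*i + 1` and `2*i + 2`.

```python
import math


def similarity(x, w, sigma):
    """Unnormalized isotropic Gaussian kernel in (0, 1] (assumed stand-in for the PDF)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, w))
    return math.exp(-d2 / (2.0 * sigma ** 2))


class HeapTreeSketch:
    """Toy incremental clusterer over a heap-indexed binary tree (hypothetical)."""

    def __init__(self, sigma=1.0, threshold=0.5, lr=0.1, max_depth=3):
        self.sigma = sigma
        self.threshold = threshold
        self.lr = lr
        # Number of slots in a complete binary tree of the given depth.
        self.capacity = 2 ** (max_depth + 1) - 1
        self.weights = {}  # heap index -> prototype vector

    def _absorb(self, i, x):
        # Move the prototype of node i a small step toward the sample.
        w = self.weights[i]
        self.weights[i] = [wi + self.lr * (xi - wi) for wi, xi in zip(w, x)]
        return i

    def insert(self, x):
        """Route one sample from the root and return the index of its node."""
        if 0 not in self.weights:
            self.weights[0] = list(x)
            return 0
        i = 0
        while True:
            # Similar enough: the sample joins this node's cluster.
            if similarity(x, self.weights[i], self.sigma) >= self.threshold:
                return self._absorb(i, x)
            slots = [c for c in (2 * i + 1, 2 * i + 2) if c < self.capacity]
            if not slots:
                # Bottom level: dissimilar samples end up in a leaf.
                return self._absorb(i, x)
            existing = [c for c in slots if c in self.weights]
            best = max(
                existing,
                key=lambda c: similarity(x, self.weights[c], self.sigma),
                default=None,
            )
            if best is not None and (
                similarity(x, self.weights[best], self.sigma) >= self.threshold
            ):
                i = best  # an existing child accepts the sample
                continue
            free = [c for c in slots if c not in self.weights]
            if free:
                # Expand the tree: the sample seeds a new child node.
                self.weights[free[0]] = list(x)
                return free[0]
            i = best  # both children exist, none accepts: pass the sample down


model = HeapTreeSketch(sigma=1.0, threshold=0.5)
samples = [(0, 0), (0.1, 0), (10, 10), (10.1, 10), (0, 0.1), (-10, 10)]
labels = [model.insert(s) for s in samples]
print(labels)  # [0, 0, 1, 1, 0, 2]
```

In this toy version, raising `threshold` or lowering `sigma` makes nodes more selective, so more samples are passed down and the tree grows faster. How the published algorithm sets its threshold and reorganizes nodes is described in the full paper, not here.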

Keywords:
Self-organizing map (SOM), Binary heap tree, Incremental hierarchical structure, Probability density function.

Article page and full-text download: https://civilica.com/doc/1496500/