K-means Clustering Algorithms on MapReduce: A Review

Elias Ameli Bafandeh; Hossein Deldari

Published in: Third International Electronic Conference on Information Technology, Present and Future

Publication year: 1393 (Solar Hijri calendar)

Document type: Conference paper

Language: English

Length: 5 pages (PDF)

Permanent link to this paper:

https://civilica.com/doc/342818

National scientific document ID:

ITPF03_033

Indexing date: 25 Farvardin 1394

Abstract:

As web click, commercial, social network, and scientific data sources grow at an extraordinary rate, it is necessary to analyze this data with powerful clustering algorithms. Current data mining algorithms cannot handle these datasets because of their size and complexity, so new mining techniques are needed to extract useful information from them. K-means is the most widely used partitional clustering algorithm, and it is extremely sensitive to the initial centroid selection. Analysis of large datasets is increasingly carried out with MapReduce jobs. MapReduce is a parallel processing framework used in cloud computing, but K-means is poorly suited to it because of the repetitive, iterative computation it performs over large data. For this reason, several studies in recent years have sought to optimize the algorithm and reduce its dependence on iterative computation. This article reviews that work. The most important improvements to the algorithm aim to reduce the number of iterations and to improve the selection of the initial centroids.
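
To make the iteration cost mentioned in the abstract concrete, the following is a minimal single-process sketch of how K-means is commonly split into a map step (assign each point to its nearest centroid) and a reduce step (average the points per centroid). It is an illustration under stated assumptions, not the method of any paper covered by the review: the names `map_phase`, `reduce_phase`, and `kmeans` are hypothetical, no Hadoop API is used, and each loop round merely stands in for what would be one full MapReduce job.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: replace each centroid by the mean of the points assigned to it."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = list(centroids)  # a centroid with no points stays where it is
    for idx, pts in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return new_centroids


def kmeans(points, centroids, max_iters=20, tol=1e-9):
    """Repeat map/reduce rounds until centroids stop moving.

    Returns (centroids, rounds). In a real deployment every round would be
    a separate MapReduce job that rereads the whole dataset.
    """
    for rounds in range(1, max_iters + 1):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids, rounds


# Toy data (made up for illustration): two obvious clusters.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
good_init = kmeans(points, [(0, 0), (10, 10)])  # one seed per cluster
poor_init = kmeans(points, [(0, 0), (0, 2)])    # both seeds in one cluster
```

On this toy data both runs reach the same centroids, but the run seeded inside a single cluster needs an extra round. Since each round corresponds to a full pass over the data, this is the kind of cost that better initial centroid selection tries to avoid.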

Authors

Elias Ameli Bafandeh

Department of Engineering, Islamic Azad University of Mashhad, Mashhad, Iran

Hossein Deldari

Associate Professor, Islamic Azad University of Mashhad, Mashhad, Iran