Presenting a Model of Data Anonymization in Big Data in the Context of In-Memory Processing Framework

Publication Year: 1403
Document type: Journal article
Language: English

This 20-page paper is available for download in PDF format.

National scientific document ID: JR_JECEI-12-1_006

Indexing date: 5 Dey 1402

Abstract:

Background and Objectives: With the rapid growth of social networks, extracting valuable information from their voluminous data sources while protecting privacy and preventing the disclosure of unique data is among the most challenging objectives. This paper presents a model for preserving privacy in big data.

Methods: The proposed model is implemented with Spark, an in-memory big data tool, in four steps. The first step loads the raw data from HDFS into RDDs. The second step determines m clusters and their cluster heads. The third step places the produced tuples in separate RDDs in parallel. The fourth step releases the anonymized clusters. The model is based on the K-means clustering algorithm, runs within the Spark framework, and uses the capabilities of the RDD and MLlib components. Determining the optimized cluster heads from each tuple's content, taking the data type into account and applying the formula of the suggested solution, leads to releasing the data in the optimized cluster with the lowest rate of data loss and identity disclosure.

Results: Using the features of the Spark framework and optimized clusters in the K-means algorithm, the running time of the algorithm at different data sizes (in megabytes) depends on multiple expiration times and the purposeful elimination of clusters, and the data loss rate depends on two-level clustering. According to the simulation results, as the volume of data increases, the rate of data loss decreases compared with the FADS and FAST clustering algorithms, which is due to the increase of records in the proposed model. With the formula presented in the proposed model, the effort of determining the multiple selected attributes is reduced. According to the presented results and 2-anonymity, the cost factor reaches its lowest value, 0.20, at k=9.

Conclusion: The proposed model provides the right balance between high-speed process execution, minimal data loss, and minimal data disclosure. The model also presents a parallel algorithm that increases the efficiency of anonymizing data streams while decreasing the information loss rate.
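The four steps described under Methods can be illustrated with a small sketch. What follows is a single-machine Python illustration, not the authors' Spark implementation: an in-memory list stands in for the HDFS-to-RDD load, plain Lloyd's K-means stands in for MLlib, all quasi-identifiers are assumed to be numeric, and clusters smaller than k are simply merged into their nearest neighbour. The paper's own formula for choosing cluster heads is not given in the abstract and is not reproduced here. All function names (`kmeans`, `enforce_min_size`, `anonymize`) are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(records: Sequence[Record], idxs: Sequence[int]) -> Record:
    columns = zip(*(records[i] for i in idxs))
    return tuple(sum(col) / len(idxs) for col in columns)


def kmeans(records: Sequence[Record], m: int,
           iterations: int = 20, seed: int = 0) -> List[int]:
    """Plain Lloyd's K-means; returns one cluster index per record."""
    rng = random.Random(seed)
    m = min(m, len(records))
    centers = rng.sample(list(records), m)  # initial cluster heads
    assignment = [0] * len(records)
    for _ in range(iterations):
        assignment = [
            min(range(m), key=lambda c: squared_distance(r, centers[c]))
            for r in records
        ]
        for c in range(m):
            members = [i for i, a in enumerate(assignment) if a == c]
            if members:  # keep the old head if a cluster became empty
                centers[c] = centroid(records, members)
    return assignment


def enforce_min_size(records: Sequence[Record],
                     clusters: List[List[int]], k: int) -> List[List[int]]:
    """Merge clusters with fewer than k records into the nearest cluster."""
    clusters = [c for c in clusters if c]
    while len(clusters) > 1:
        smallest = min(clusters, key=len)
        if len(smallest) >= k:
            break
        clusters.remove(smallest)
        head = centroid(records, smallest)
        target = min(
            clusters,
            key=lambda c: squared_distance(centroid(records, c), head),
        )
        target.extend(smallest)
    return clusters


def anonymize(records: Sequence[Record], m: int, k: int):
    """Cluster the records, then release each record as per-attribute
    (min, max) ranges of its cluster, so that every released tuple is
    shared by at least k records."""
    if len(records) < k:
        raise ValueError("need at least k records to satisfy k-anonymity")
    assignment = kmeans(records, m)
    clusters = [[i for i, a in enumerate(assignment) if a == c]
                for c in range(max(assignment) + 1)]
    clusters = enforce_min_size(records, clusters, k)
    released = [None] * len(records)
    for idxs in clusters:
        columns = list(zip(*(records[i] for i in idxs)))
        generalized = tuple((min(col), max(col)) for col in columns)
        for i in idxs:
            released[i] = generalized
    return released


if __name__ == "__main__":
    # Hypothetical (age, income) quasi-identifiers.
    data = [(25, 50000), (27, 52000), (26, 51000),
            (60, 90000), (62, 95000), (61, 91000), (40, 70000)]
    for original, generalized in zip(data, anonymize(data, m=3, k=2)):
        print(original, "->", generalized)
```

In a Spark setting, the per-cluster generalization in `anonymize` is the part that would naturally run in parallel, since each cluster can be processed independently. The width of the released ranges gives a rough sense of information loss: tighter clusters yield narrower ranges.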

Authors

E. Shamsinejad

Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran.

T. Banirostam

Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran.

M. M. Pedram

Electrical and Computer Engineering Department, Kharazmi University, Tehran, Iran.

A. Rahmani

Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran.
