A Column Encryption-Based Privacy-Preserving Framework for Hadoop Big Data Sets

Hidayath Ali Baig
https://orcid.org/0000-0002-1953-6871

Abstract

The rapid growth of the Internet, the Internet of Things, and cloud computing has led to a significant rise in the volume of data across business and industry sectors. Big data has consequently attracted the attention of academics, corporate leaders, and government officials worldwide, and Hadoop is a commonly adopted framework for processing it. This data expansion has the potential to provide substantial benefits, and some early technical success has been achieved in handling data at this scale. Along with these benefits, big data also raises a number of challenges, including, but not limited to, data storage, exchange, curation, transit, analysis, visualization, security, and privacy. This research investigates the privacy implications of big data analytics. Several publications propose methods to secure big data, each with its own advantages and disadvantages. Regardless of privacy laws, application developers must protect sensitive data. There is therefore a need for innovative methods to protect individuals' privacy in the context of big data. This paper presents a framework for preserving the privacy of data at rest within the Hadoop architecture. The framework employs columnar data storage, data masking, and encryption techniques to address these challenges efficiently.
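To make the combination of columnar storage, masking, and per-column encryption concrete, the following is a minimal sketch and not the paper's implementation. It assumes a table held as a dictionary of columns; every name in it (`mask`, `encrypt_cell`, `protect_columns`, the sample columns) is hypothetical. The cipher is a toy keystream built from HMAC-SHA256 so that the example runs on the standard library alone; a real deployment would presumably rely on a vetted cipher such as AES supplied by the storage layer.

```python
import hashlib
import hmac
import os


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce + counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt_cell(key: bytes, plaintext: str) -> bytes:
    """Encrypt one cell; a fresh random nonce is prepended to the ciphertext."""
    nonce = os.urandom(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt_cell(key: bytes, blob: bytes) -> str:
    """Invert encrypt_cell using the nonce stored in the first 16 bytes."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


def protect_columns(table, column_keys, masked_columns):
    """Encrypt columns that have a key, mask listed columns, copy the rest."""
    protected = {}
    for name, values in table.items():
        if name in column_keys:
            protected[name] = [encrypt_cell(column_keys[name], v) for v in values]
        elif name in masked_columns:
            protected[name] = [mask(v) for v in values]
        else:
            protected[name] = list(values)
    return protected


# Hypothetical sample data in column-oriented form.
table = {
    "name": ["Alice", "Bob"],
    "card_number": ["4111111111111111", "5500000000000004"],
    "ssn": ["123-45-6789", "987-65-4321"],
    "city": ["Pune", "Muscat"],
}
keys = {"ssn": os.urandom(32)}  # one key per sensitive column
protected = protect_columns(table, keys, {"card_number"})
```

Because each sensitive column has its own key, a reader holding only the key for `ssn` could recover that column with `decrypt_cell` while every other column stays readable or masked as stored. This is the kind of selective access that column-level protection is meant to allow.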

How to Cite
1. A Column Encryption-Based Privacy-Preserving Framework for Hadoop Big Data Sets. Baghdad Sci.J [Internet]. 2024 May 25 [cited 2024 Nov. 20];21(5(SI)):1798. Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/10550
Section
Special Issue - (ICCDA) International Conference on Computing and Data Analytics
