Retrieving Encrypted Images Using Convolution Neural Network and Fully Homomorphic Encryption

Abstract: Content-based image retrieval (CBIR) is a technique used to retrieve images from an image database. However, CBIR suffers from low accuracy when retrieving images from a large image database, and it does not by itself ensure the privacy of the images. This paper addresses the accuracy issue with deep learning, specifically a convolutional neural network (CNN), and provides the necessary image privacy using the fully homomorphic encryption scheme of Cheon, Kim, Kim, and Song (CKKS). To achieve these aims, a system named RCNN_CKKS is proposed that includes two parts. The first part (offline processing) automatically extracts high-level features from the flatten layer of a CNN and stores these features in a new dataset. In the second part (online processing), the client sends the encrypted image to the server, which uses the trained CNN model to extract the features of the sent image. Next, the extracted features are compared with the stored features using the Hamming distance to retrieve all similar images. Finally, the server encrypts all retrieved images and sends them to the client. On plain images, the deep-learning results were 97.94% for classification and 98.94% for image retrieval. In addition, the NIST test was used to check the security of CKKS when applied to the Canadian Institute for Advanced Research (CIFAR-10) dataset. From these results, the researchers conclude that deep learning is an effective method for image retrieval and that CKKS is appropriate for image privacy protection.


Introduction:
CBIR is the most practical approach for meaningful image searching since the number of digital images on digital resources has expanded exponentially. Most conventional image searches rely on image metadata, which is retrieved with text queries. However, every image must carry correct metadata for such queries to return relevant images, and choosing the proper metadata for an image is one of the issues that affects search accuracy. As a result, query by image (QBI) is a better option. Numerous research studies have employed CBIR systems for image queries.
The CBIR system takes a sample image as input and searches a vast library of photos for similar images using low-level features 1. In CBIR, the features used for retrieval are classified into high-level and low-level features. Initially, single color, shape, and texture elements were used, resulting in good retrieval results due to the availability of diverse visual qualities. Additionally, machine learning techniques are used, which have a high degree of efficiency in automatically extracting low-level information from images 2. Fully homomorphic encryption (FHE) allows an unbounded number of computations to be carried out on ciphertext, such that decrypting the result gives the same value as performing the computation on the plaintext. Equation (1) illustrates this for a function f() built from arithmetic operations (addition and/or multiplication): applying f() to the ciphertext is equivalent to encrypting the result of applying f() to the plaintext 3.
$$f(\mathrm{Enc}(m)) = \mathrm{Enc}(f(m)) \qquad (1)$$
Here m denotes the plaintext and Enc denotes encryption. Cryptography is a branch of mathematics that uses complex algorithms to ensure the confidentiality of information while it is transmitted and stored. Since the foundational Diffie-Hellman paper was published in 1976, several new public-key cryptosystems have been created. Most of them are based on two hard mathematical problems: the factorization and the discrete logarithm problems (e.g., RSA, ElGamal cryptosystem, ECC, and many others). Even though these cryptosystems are very safe and have a considerable keyspace, they have been regarded as expensive and slow 4,5. Armknecht and Sadeghi developed an algebraically homomorphic approach to cryptography in 2008 6, while Gentry extended foundational work on fully homomorphic schemes in 2009 7. In the same year, Gentry constructed fully homomorphic encryption using ideal lattices 8. Van Dijk et al. developed fully homomorphic encryption over the integers in 2010 9. Gentry et al. (2013) also developed homomorphic encryption based on learning with errors 10. Fig. 1 illustrates the approach of fully homomorphic cryptosystems.

Figure 1. Approach of the FHE
The current state of the art in approximate homomorphic computation over real and complex numbers is the Cheon-Kim-Kim-Song (CKKS) HE scheme. CKKS can already be applied in practice, e.g., in machine learning 11.
Traditional learning algorithms depend on hand-designed features. Deep learning, one of the most powerful techniques of machine learning, can overcome this dependence. There are two steps to deep learning: training to improve the model accuracy and inference to use the model for analysis such as classification or prediction (see the next section for more details). In recent years, deep learning has been applied in many fields, including big data analytics and applications such as pattern identification, speech recognition, and computer vision. Deep learning poses privacy problems, especially when it relies on cloud infrastructure or a collaborative approach. Concerns about privacy arise from sensitive input data used in training or inference and from sharing the learned model. The efficiency of training may be improved by using a powerful remote server or cloud 12. However, both the users and the server may have privacy issues when using these environments. An attacker with complete knowledge of the training process may access model parameters, posing privacy concerns and shifting the problem from data privacy to model privacy 13. In collaborative learning, the leakage of sensitive data among participants should also be considered, as should leakage between users and external infrastructure, especially via the Internet 12.
From the related works [14][15][16][17][18][19][20], the researchers concluded that there are two significant issues relating to CBIR. First, users' privacy must be ensured and data confidentiality increased, yet a trade-off exists between security and efficiency. Approaches based on lightweight encryption, such as permutation or substitution, are efficient but insecure; methods based on heavyweight algorithms are secure but incur impractical computing costs. Second, a trade-off exists between retrieval accuracy and retrieval efficiency. While many approaches for image retrieval use low-level features like color, texture, and shape, their retrieval accuracy often falls short of the needs of practical applications because of the "semantic gap" between visual features and the scope of human semantics. A fully homomorphic encryption algorithm is used in this paper because experiments have demonstrated that it is highly effective at preserving data privacy. For cloud computing applications, homomorphic encryption has become a popular and powerful cryptographic tool, and security analyses against all the known attacks with respect to message expansion and homomorphic operations have been presented 21. Still, it has a high cost in the encryption and decryption processes, so the researchers divided the algorithmic load between the client and the server to solve this problem. Another issue addressed in this research is the balance between accuracy and efficiency, achieved by combining deep learning techniques with Random Forest to reduce and choose the right features. This combination reduced the processing time and increased accuracy. The motivation of the proposed work is to build a secure AI model that balances the accuracy of image retrieval, reduces the processing time of retrieval operations, and preserves the privacy of data transmitted through an insecure medium. The following are the major contributions of this paper:
1. Balancing the trade-off between security and accuracy of the CBIR.
2. Proposing an effective method for image retrieval based on deep learning techniques.
3. Encrypting the retrieved and sent images using FHE-CKKS algorithms to ensure their security.
4. Improving the classification CNN's accuracy by training it on augmented images.
5. Using the Random Forest method to select the best features that increase accuracy and decrease retrieval time.
The rest of this paper is organized as follows. Section 2 reviews related work in homomorphic encryption, image retrieval, and deep learning. Section 3 briefly explains the materials and methods used in this paper. Section 4 presents the proposed system. Experimental results and performance analysis are given in Section 5. Conclusions and future work are given in Section 6.

Related Works:
A large amount of research has been published in the fields of CBIR and homomorphic encryption.
Zhihao Cao et al. 16 combined image retrieval with dimension reduction, using convolutional neural networks to extract high-level image features. In addition, multilinear principal component analysis was used to reduce feature dimensions that were too large and strongly correlated. For efficiency, the features are binary encoded after feature reduction.
Manisha et al. 17 suggested a strategy employing LNDP for local features. This method converts every pixel in an image into a binary pattern depending on its nearby pixels. Thus, both LBP and LNDP extract information using local pixel intensity.
Selvam et al. 18 suggested a method that combines the genetic algorithm (GA) with the HARP aggregation algorithm to increase system retrieval accuracy while using less processing time, and to recover the relevant image and possible resolution using CBIR.
Kuo et al. 22 introduced deep convolutional neural networks in their paper for image retrieval. It uses DL to train the weights of a NN, resulting in high-level image feature extraction.
Hsin et al. 23 proposed an ensemble of CNN models for image retrieval. This image classifier combines AlexNet and Network in Network (NIN), which are both particularly effective deep learning networks, to achieve image feature extraction. It computes weighted average feature vectors for image retrieval.
Umer et al. 24 developed an efficient content-based image retrieval (CBIR) system capable of retrieving semantically correct images. They proposed a hybrid feature descriptor consisting of color and texture features for this purpose.
Xia et al. 15 used block and pixel permutation to offer a privacy-protected LBP extraction approach in the ciphertext domain. These methods are efficient, but their security is compromised.
The paper by Fathala et al. 25 relies on two retrieval procedures. The first extracts image features with the histogram, and the second uses statistical features (mean, standard deviation). The T-test is then used to examine the relationship between a large number of different images.
Challa et al. 26 suggested a modified Reed-Muller code-based symmetric-key fully homomorphic encryption that supports an unlimited number of modulo-2 addition and multiplication operations. The proof of security provides a mathematical analysis and difficulty level. It also analyzes the security of message expansion and homomorphic operations against all known attacks and vulnerabilities.
Syed et al. 27 suggested the use of homomorphic encryption, which allows for the training of deep learning and classical machine learning models while maintaining data privacy and security. The proposed methodology is being evaluated in smart grid applications such as fault diagnosis and localization and load forecasting. The results for fault localization reveal that the proposed privacy-preserving deep learning model's classification accuracy, while utilizing homomorphic encryption, is comparable to the model's classification accuracy on plain data.
Lou et al. 28 described a deep neural network that can process encrypted data using a shift-accumulation-based LHE-enabled network. They develop ReLU activations and max poolings using the binary operation-friendly Leveled Fast Homomorphic Encryption over Torus (LTFHE) encryption method. Instead of expensive LTFHE multiplications, they use cheaper LTFHE shifts to accelerate inferences.
Obla et al. 29 introduced a methodical approach to generating HE-friendly activation functions for CNNs. They began by evaluating commonly used functions such as the rectified linear unit (ReLU) and Sigmoid to find the qualities of a good activation function that contribute to performance. Then, they compared polynomial approximation methods and determined the best range of approximation for polynomial activations. They also proposed a novel weighted polynomial approximation method tailored to the distribution of the batch normalization layer's output. Finally, they demonstrated the efficacy of their strategy on a variety of datasets, including MNIST, FMNIST, and CIFAR-10.
Owusu et al. 30 developed a new framework, MSCryptoNet, that enables MSCryptoNet models to be executed, converted, and scaled in a privacy-preserving manner. The Sigmoid and rectified linear unit activation functions are approximated using low-degree polynomials in homomorphic encryption systems.
Clet et al. 31 comprehensively covered the three most common homomorphic cryptosystems, BFV, CKKS, and TFHE, concerning the training phase of feed-forward neural networks that have been successfully completed on the MNIST dataset.

Materials and Methods:
The CIFAR-10 dataset contains images used to train machine learning and computer vision algorithms, and it is increasingly used in machine learning research. The dataset has 60,000 images, all 32 × 32 in size and in PNG format, belonging to 10 different classes as illustrated in Table 1: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. The dataset is split into 50,000 training images and 10,000 test images, with the first set used for training and the second for testing and evaluating the proposed model. The test batch contains exactly 1,000 randomly selected images from each class. The training batches contain the remaining images in random order, 5,000 from each class in total.

Deep Learning
Deep learning attempts to extract complicated features from high-dimensional data and use them to construct a model that connects inputs to outputs (such as classes). Deep learning architectures are typically built as multi-layered networks, with higher-level features computed as nonlinear functions of lower-level features. The most prevalent type of deep learning architecture is a multi-layer neural network 32. This section presents the layers that make up such a network and how each of them affects homomorphic encryption.

Convolutional Neural Network (CNN):
CNN is commonly used for image classification and is defined by its convolution layers, whose function is to learn features from the dataset. A convolutional layer applies an N x N kernel that computes the dot product between the kernel weights and each neighborhood of input values. As a result, the convolutional layer only involves addition and multiplication, so it can be applied to homomorphically encrypted data without modification 33.
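To make the multiply-and-add structure concrete, the following minimal sketch computes a "valid" 2D convolution (strictly, the cross-correlation used in most CNN libraries) on plaintext lists. The function name `conv2d_valid` and the toy inputs are illustrative assumptions, not part of the proposed system; the point is only that every output value is a sum of products.

```python
def conv2d_valid(image, kernel):
    """Slide an N x N kernel over a 2D image without padding.

    Each output value is a sum of products, so the layer needs
    only additions and multiplications.
    """
    n = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - n + 1):
        row = []
        for j in range(w - n + 1):
            acc = 0
            for a in range(n):
                for b in range(n):
                    acc += image[i + a][j + b] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


# Toy 3 x 3 image and 2 x 2 kernel (hypothetical values).
result = conv2d_valid([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])
```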

Activation Layer:
The activation layer applies a non-linear function to the output of the convolution layer. Because these functions are not linear, the difficulty increases significantly when they are evaluated on homomorphically encrypted (HE) data. As a result, designers must develop a substitute that only requires multiplication and addition 29.
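One common substitute is a low-degree polynomial. The sketch below is an illustration and not necessarily the activation used in this paper: it evaluates the least-squares quadratic approximation of ReLU on the interval [-1, 1], namely 3/32 + x/2 + 15x²/32, using only additions and multiplications. The function names and the choice of interval are assumptions; inputs outside [-1, 1] would need rescaling or a different fit.

```python
# Coefficients of the least-squares degree-2 fit of ReLU on [-1, 1].
C0, C1, C2 = 3 / 32, 1 / 2, 15 / 32


def relu(x):
    """Reference ReLU, which needs a comparison and is not HE-friendly."""
    return x if x > 0 else 0.0


def poly_relu(x):
    """Quadratic stand-in for ReLU in Horner form: additions and multiplications only."""
    return C0 + x * (C1 + x * C2)


def max_abs_error(samples=2001):
    """Largest gap between poly_relu and relu on an even grid over [-1, 1]."""
    worst = 0.0
    for i in range(samples):
        x = -1.0 + 2.0 * i / (samples - 1)
        worst = max(worst, abs(poly_relu(x) - relu(x)))
    return worst
```

On this interval the largest deviation occurs at x = 0, where the polynomial returns 3/32 instead of 0, which gives a sense of the accuracy cost of replacing the activation.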

Pooling Layer:
The purpose of this subsampling layer is to reduce the data size. Pooling can be classified into several types, such as maximum pooling and average (mean) pooling. Max pooling is not used in HE because it requires comparisons, whereas average pooling is suitable because it computes the mean of the values using only operations that are allowed in HE 28.
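A minimal plaintext sketch of average pooling follows. It assumes non-overlapping square windows and a hypothetical helper name `avg_pool2d`; the division is replaced by multiplication with a constant known in advance, so only additions and one multiplication per window remain.

```python
def avg_pool2d(fmap, size=2):
    """Average pooling over non-overlapping size x size windows.

    The window sum uses additions only, and the mean is obtained by
    multiplying with the precomputed constant 1 / (size * size).
    """
    inv = 1.0 / (size * size)
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h - size + 1, size):
        row = []
        for j in range(0, w - size + 1, size):
            acc = 0.0
            for a in range(size):
                for b in range(size):
                    acc += fmap[i + a][j + b]
            row.append(acc * inv)
        out.append(row)
    return out
```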

Fully Connected Layer:
It is described as a "Fully Connected Layer" since each neuron is connected to every neuron in the previous layer. This layer performs only a dot product, which consists of multiplication and addition operations. As a result, it can be employed over encrypted data 34.

Dropout Layer:
This layer is used to avoid overfitting. A machine learning model often achieves excellent classification results on the training data that do not carry over to unseen data, which indicates that the model is biased toward the training set 35.

Homomorphic Encryption (HE)
Various tools are employed to protect privacy, such as differential privacy techniques and homomorphic encryption. HE is a type of encryption that allows various kinds of calculations to be performed on ciphertexts to produce an encrypted output. HE is divided into three categories 36 .

Partially Homomorphic Encryption (PHE):
This supports only one type of operation on encrypted data, either addition or multiplication.
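Textbook (unpadded) RSA is a classic example of a scheme that is homomorphic for multiplication only. The sketch below uses tiny, insecure toy parameters (p = 61, q = 53) purely to show the property; the helper names are illustrative, and this is not the scheme used in the proposed system.

```python
# Toy RSA parameters, far too small for real use.
P, Q = 61, 53
N = P * Q          # modulus 3233
E, D = 17, 2753    # public and private exponents, E * D = 1 mod 3120


def encrypt(m):
    return pow(m, E, N)


def decrypt(c):
    return pow(c, D, N)


# Multiplying two ciphertexts yields a ciphertext of the product of the plaintexts.
c_product = (encrypt(7) * encrypt(3)) % N
recovered = decrypt(c_product)
```

Adding two such ciphertexts does not produce an encryption of the sum, which is why the scheme is only partially homomorphic.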

Somewhat Homomorphic Encryption (SWHE):
This supports more than one type of operation, such as multiplication and addition, but the number of operations is limited.

Fully Homomorphic Encryption (FHE):
This supports multiplication and addition without restriction on the number of operations.

HE schemes include four stages 37. Key Generation (KeyGen): In this stage, security parameters are generated. In a symmetric scheme, a single key is generated, while in an asymmetric scheme, a pair of secret and public keys is generated. The remaining three stages are encryption, evaluation, and decryption.

CKKS Homomorphic Encryption Scheme
The Cheon-Kim-Kim-Song (CKKS) scheme is a leveled homomorphic encryption method whose security depends on the difficulty of the RLWE problem. Unlike other HE systems, CKKS supports approximate arithmetic on real and complex numbers. The CKKS method treats the encryption noise as part of the approximation error in computations over real values. It excels in applications like machine learning, where most computations are approximate. With the use of the bootstrapping technique 38, the CKKS scheme becomes an FHE (fully homomorphic encryption) scheme.
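The approximate-arithmetic idea can be illustrated without any encryption at all. The following sketch is only a plaintext fixed-point analogy, under the assumption of a scaling factor of 2^15; it is not CKKS itself and all names are hypothetical. Real values are scaled and rounded to integers, products land at the squared scale, and a rescaling step brings them back, with a small rounding error that plays the role of the noise.

```python
DELTA = 2 ** 15  # assumed scaling factor for this illustration


def encode(x, delta=DELTA):
    """Scale a real number and round it to an integer."""
    return round(x * delta)


def decode(m, delta=DELTA):
    """Undo the scaling to get an approximate real number back."""
    return m / delta


def add(m1, m2):
    """Sum of two encoded values; the scale stays at delta."""
    return m1 + m2


def mul_rescale(m1, m2, delta=DELTA):
    """Product of two encoded values, rescaled from delta**2 back to delta."""
    return round(m1 * m2 / delta)


a, b = encode(3.14159), encode(2.71828)
approx_sum = decode(add(a, b))
approx_product = decode(mul_rescale(a, b))
```

Both results match the exact values only up to a small error, which is the behaviour CKKS accepts by design.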

Proposed system
As demonstrated in Fig. 2, this protocol has two major phases: the offline phase is implemented on the server side, and the online phase is implemented on both the server and the client. The offline phase, referred to as the generation of the CNN model, includes training stages that consist of three steps performed on plaintext training data to produce a classifier, which is then used with plaintext testing data to produce a trained model. The online phase consists of eight steps on the server side and four steps on the client side, as shown in Fig. 2. The steps on the server side include, first, creating the keys; second, sending them to the client's side; third, receiving the encrypted image; fourth, decrypting the encrypted image; fifth, running inference on the decrypted image with the trained model; and sixth, sending the encrypted retrieved images to the client.

Offline phase
This part has two major stages: the first produces a trained model (M), and the second uses the trained model to extract the features of each image in the training dataset.

Generation CNN model phase
This phase consists of several procedures that are necessary to create the classification model, explained below. The major procedures for obtaining the trained model appear in Fig. 2. The image dataset used in this article is CIFAR-10 (Canadian Institute for Advanced Research), a collection of images used to train machine learning and computer vision algorithms and a popular dataset for machine learning research. It contains 60,000 color images in PNG format with a size of 32x32 in ten different classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. This dataset is divided into two parts, the training dataset and the test dataset. The training dataset consists of 50,000 images for training the network, which accounts for about 83% of the total dataset. The test dataset comprises 10,000 images for testing the network and accounts for about 17% of the entire dataset. Several augmentations are employed in the proposed system, including rotation, horizontal shifting, vertical shifting, and flipping. Rotation is applied to the original training images at an angle of 15 degrees, which generates 50,000 new images using the bilinear interpolation method. Horizontal shift augmentation shifts an image horizontally while retaining its dimensions and without distorting it. Vertical shift augmentation shifts all the pixels vertically while maintaining the image dimensions. With LBP, the neighborhood of each pixel is thresholded and the result is read as a binary number. Other augmentation methods are also used.

Parameters of the algorithms
The proposed system includes three main algorithms: the CKKS algorithm used for encryption, the CNN algorithm used to predict and extract high-level features, and the random forest algorithm used to choose the best features. Each algorithm has a set of parameters that raise the efficiency of the proposed system, as illustrated in Tables 2, 3, and 4. The random forest parameters are:

Parameter | Value | Description
criterion | Gini, entropy | Function used to evaluate the quality of a split; the supported criteria are "gini" for Gini impurity and "entropy" for information gain.
max_depth | 10 | Maximum depth of a tree.
min_samples_split | 2 | Minimum number of samples needed to split an internal node.
min_samples_leaf | 1 | Minimum number of samples required at each leaf node.

Feature extraction phase
The CNN is used as a feature extractor by taking the output of the model's flatten layer, which converts all the resulting 2-dimensional arrays into a single long linear vector. It extracts 2048 features for each training image and stores these feature vectors as a matrix for all the training images.
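The flattening step can be sketched in a few lines. The shape used here, 128 feature maps of size 4 x 4, is an assumption chosen only because it yields 2048 values; the actual shape of the last layer is not stated in the text, and deep learning frameworks may order the flattened values differently from this channel-by-channel traversal.

```python
def flatten(feature_maps):
    """Concatenate a list of 2D feature maps into one flat vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


# Hypothetical output of the last pooling layer: 128 maps of 4 x 4 values.
maps = [[[float(c)] * 4 for _ in range(4)] for c in range(128)]
vector = flatten(maps)
```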

Online processing phase
This phase begins when the client requests the public key from the server to encrypt the image and later sends the encrypted image to the server. With CKKS as the FHE algorithm, the client takes a message (M) and converts it to the cipher vectors ([⟨ct1,pk⟩]%q, [⟨ct2,pk⟩]%q), where ct1 and ct2 are ciphertexts represented in polynomial form, pk is the public key, and q is the ciphertext modulus. This phase is implemented on the server and client sides as follows:

Initial security parameters (server side, client side): This step is carried out on both the server side and the client side. The security parameters are as follows.
The polynomial degree (a power of two): the degree d of the polynomial that determines the quotient ring R.
The ciphertext modulus q: the large modulus. A larger modulus allows more homomorphic operations to be performed before the noise grows too large 36.
The scaling factor ∆: a larger value of ∆ yields more precision but allows fewer homomorphic operations. More specifically, for a plaintext polynomial $m(X) \in R$ with $m(X) = \mathrm{Encode}(z, \Delta)$, the error in the result of decoding $m(X) + e(X)$ can be estimated for some accumulated error $e(X)$. In practice, this estimate is also used to choose ∆. For example, choosing $\Delta = 2^{15}$ gives a final decoded message with 4 bits of precision; if 6 bits of precision are required, choose $\Delta = 2^{17}$. Certain operations must increase the size of the scaling factor if the decoded slots become small 37.

Generating keys (server side, client side): Both public (pk) and private (sk) keys are generated on the server side and on the client side, and each side uses its keys to accomplish the encryption and decryption operations.

Encrypting image (server side, client side): After receiving the public key from the server, the client performs the image encryption procedure. It resizes the image to 32x32 using the bilinear method so that it suits the model produced on the server side, then reads the image pixel by pixel and encodes each pixel using CKKS encoding. In this step, every pixel in the image is converted into a polynomial of degree 8. The server side encrypts the retrieved images in the same manner, but with the client's public key.

Decrypting image (server side, client side): The encrypted image created on the client side depends on the public key sent by the server. This image is sent to the server to retrieve the top five images most similar to it. The server decrypts the image with the private key generated on the server side. The output of the decryption process is in polynomial form, so it is decoded to recover the real pixel values. The client likewise uses its private key to open the images received from the server.

Similarity matching (server side only): In this step, the model (M) extracts the features of the sent image from the flatten layer. The Hamming distance is then used to find the five images most similar to the sent image by comparing the extracted features with the feature vectors stored in the training phase.
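The similarity-matching step can be sketched as follows. Hamming distance is defined on discrete symbols, so this sketch assumes the real-valued features are first binarized by thresholding at zero; that binarization rule, like the function names, is an assumption for illustration and is not described in the text.

```python
def binarize(features, threshold=0.0):
    """Turn a real-valued feature vector into bits (assumed rule: value > threshold)."""
    return [1 if v > threshold else 0 for v in features]


def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def top_k(query_bits, stored_bits, k=5):
    """Indices of the k stored vectors closest to the query (ties keep stored order)."""
    order = sorted(range(len(stored_bits)),
                   key=lambda i: hamming(query_bits, stored_bits[i]))
    return order[:k]


# Tiny hypothetical database of 4-bit feature codes.
stored = [[1, 0, 1, 1], [0, 0, 0, 0], [1, 0, 1, 0], [1, 1, 1, 1]]
nearest = top_k([1, 0, 1, 1], stored, k=3)
```

With the stored feature matrix from the offline phase in place of `stored`, calling `top_k` with `k=5` would return the indices of the five candidate images.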

Results:
The results of the proposed model are presented as follows.

Augmentation data result
Fig. 4 shows an example of the four augmentations used: rotation, horizontal shifting, vertical shifting, and flipping. As indicated in Table 5, applying the rotation procedure at a 15-degree angle to the original training images produces 50,000 new images, and the lost values are compensated for using bilinear interpolation on the new images. Table 5 also shows that 50,000 new images have been generated by applying a horizontal shift to every image in the training data in the x-direction, right to left, by one byte.
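The shift and flip augmentations can be sketched on a single-channel image stored as a list of rows. This is a minimal illustration with hypothetical function names; vacated pixels are filled with a constant, which is an assumption, and rotation with bilinear interpolation is omitted because it needs an image library.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def shift_horizontal(img, dx, fill=0):
    """Shift right by dx pixels (left if dx is negative), keeping the image size."""
    w = len(img[0])
    dx = max(-w, min(w, dx))
    if dx >= 0:
        return [[fill] * dx + row[:w - dx] for row in img]
    return [row[-dx:] + [fill] * (-dx) for row in img]


def shift_vertical(img, dy, fill=0):
    """Shift down by dy pixels (up if dy is negative), keeping the image size."""
    h, w = len(img), len(img[0])
    dy = max(-h, min(h, dy))
    blank = [[fill] * w for _ in range(abs(dy))]
    if dy >= 0:
        return blank + [row[:] for row in img[:h - dy]]
    return [row[:] for row in img[-dy:]] + blank
```

Applying each function once to every training image yields one new image per original, which is how each augmentation adds a further 50,000 samples.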

CNN Layers
The proposed model has been implemented using ten CNN layers. Each image in the training set passes through all these layers. This network consists of six convolutional layers, three maxpooling layers, and one fully connected layer. Fig. 3 shows the results of the input images that have passed through the network layers.

Training Results
The model was trained on 450,000 training images (450,000 R, 450,000 G, 450,000 B channels) using the RMSprop optimization method with an initial learning rate of 0.001.
Training runs for 100 epochs to improve the model. In every epoch, the weights are updated to bring the network output closer to the desired output. Table 6 illustrates the model accuracy and loss with the corresponding hyperparameters through the training stage. The table shows that the training and validation samples were passed through the network 100 times, with 64 batches taken each time. The initial learning rate was 0.01, and after several epochs it reached 0.0003.

Testing Results
The 10,000 testing images, which represent about 17% of the CIFAR-10 dataset, were input into the model architecture and passed through all its layers. Using the saved parameters, including the weights that the network reached, the testing images were classified into ten classes. Table 7 illustrates the testing accuracy, the estimated testing time, and the loss value. The predicted classes of the test images were compared with the actual classes to evaluate the trained network through the confusion matrix, which shows classification accuracy for the CIFAR-10 dataset, and through the recall, precision, and F1-score values shown in Table 8. Table 9 shows the classification result for each class. Practitioners and academics frequently employ visualization to monitor learned parameters and output metrics in order to train and improve their models. Along with graph visualization, the TensorBoard dashboard includes modules for monitoring the distribution of tensors as well as images and sounds. Fig. 5 shows the changes in loss and accuracy after every epoch, where an epoch is one pass of the entire dataset through the neural network, both forward and backward. It is essential to understand how loss and accuracy evolve as training progresses and at what point these metrics become steady. Understanding this scalar graph helps prevent overfitting. In this figure, the loss and accuracy curves of the training and validation samples converge. This indicates that the proposed network does not have a problem with overfitting.
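The recall, precision, and F1-score values mentioned above follow directly from the confusion matrix. The sketch below assumes rows are actual classes and columns are predicted classes; the function name and the 2 x 2 example matrix are hypothetical and do not reproduce the paper's results.

```python
def per_class_metrics(cm):
    """Return a (precision, recall, f1) tuple for each class of a confusion matrix.

    cm[i][j] counts samples of actual class i predicted as class j.
    """
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # actual k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, actually otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Hypothetical two-class confusion matrix.
scores = per_class_metrics([[8, 2], [1, 9]])
```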

Figure 5. Analysis of the CNN model

NIST Tests Results
This part presents the NIST test results for the CKKS algorithm on the CIFAR dataset. In this test, the cipher data, which is in polynomial format, was converted to binary, and then the NIST measurements were applied. Table 10 illustrates the results of this test. Table 11 displays the time results for each encryption algorithm and deep learning method as well as the image retrieval time. All trials were performed on a computer with a dual-core processor, a clock speed of 2.7 GHz, and a memory capacity of 4 GB with pre-installed Windows 7.
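As an illustration of what a NIST randomness check computes, the sketch below implements the frequency (monobit) test from the NIST SP 800-22 suite on a list of bits. Which tests of the suite were run in this work is not stated here, so the choice of the monobit test is an assumption, as are the function names; the 0.01 significance level is the suite's customary default.

```python
import math


def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the balance of ones and zeros."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)  # +1 for each one, -1 for each zero
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def passes_monobit(bits, alpha=0.01):
    """A sequence is accepted as random by this test when p-value >= alpha."""
    return monobit_p_value(bits) >= alpha


# Ten-bit example sequence with six ones and four zeros.
p_value = monobit_p_value([1, 0, 1, 1, 0, 1, 0, 1, 0, 1])
```

A ciphertext converted to a bit string would be passed to `passes_monobit` in the same way, alongside the other tests of the suite.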

Comparison with Previous Studies
Several methods have been proposed to enhance the retrieved images. Table 12 illustrates the results of image classification, while Table 13 shows the retrieval performance on the CIFAR-10 dataset in mean average precision (MAP) for different research projects.

Table 12. Image classification results on CIFAR-10
Method | Accuracy (%)
Kuo et al [21] | 96.9
Hsin et al [22] | 90.19
RCNN_CKKS proposed system | 98.87

Table 13. Image retrieval MAP on CIFAR-10
Method | MAP
Kuo et al [21] | 0.707
Hsin et al [22] | 0.867
Umer et al [23] | 0.913
RCNN_CKKS proposed system | 0.967

These tables show that the proposed method performed better. The analysis of the proposed system is depicted in the section above using TensorBoard, and the best CNN architecture and parameters were chosen based on the results of this analysis. In addition, the random forest method provided valuable features for calculating the distance between the stored feature vectors and the input feature vector.

Discussion:
The related works studied in this research reported lower results than those obtained here. The presented results show that the CNN suggested in this research achieves higher results in classification and retrieval than other research. The flatten layer provides the features used to retrieve similar images because this layer contains properties that depend on the weights learned in the training phase. The Random Forest method was used to choose the best features, which gives better results in terms of accuracy and time. In addition, the proposed data augmentation method increased the number of dataset samples, which helps overcome overfitting and increases training accuracy. Previous research did not take care of the security of the data sent through the network. Most applications that need image retrieval also need to maintain the privacy of the sender's data, so a protocol that maintains data privacy was proposed.

Conclusions:
This paper has presented an effective content-based image retrieval (CBIR) system capable of semantically retrieving the correct images with high retrieval performance. More than one algorithm was used together to achieve the highest results. In the encryption part, the CKKS algorithm, one of the fully homomorphic encryption algorithms, was used. In building the deep learning model, the CNN algorithm was used, the random forest was used to select the best features, and the Hamming distance was used to calculate the distance between the stored and input feature vectors. The researchers propose a method for image retrieval based on a CNN that takes advantage of the flatten layer, which extracts 2048 image features stored in one feature vector. Next, the researchers apply the random forest algorithm after this layer as feature selection to produce 600 features that contribute to increased accuracy compared to previous research. A secure protocol was developed to protect the data communicated through an insecure connection between the client and the server by using CKKS to provide the highest security. The system also overcomes the overfitting issue and improves the model's accuracy by increasing the number of training images with eight different augmentation methods. The researchers note that the CKKS approach is slow and requires more space for cipher images, yet it is a powerful encryption method. Results for classification and retrieval were 97.94 and 98.94 percent, respectively. The security of CKKS was also evaluated using the NIST test.
The first contribution of this paper is balancing the accuracy of image retrieval against the need for a safe environment for transferring images, given the importance of ensuring the security of applications in the image retrieval field. Another contribution is building a CNN model with high classification accuracy, which helped extract the best characteristics from the images. The high classification accuracy comes from a further contribution, namely increasing the number of dataset samples by applying a set of expected effects to the images, as previously shown. The last contribution is using the random forest algorithm to select the best features extracted from the flatten layer. These features are used to calculate the distance between each stored image's features and the input image's features, which increases image retrieval accuracy and reduces retrieval time.
Many practical applications can benefit from this proposal, such as building a safe search engine and applications in health fields, because these areas require a safe environment for communication, for example between the patient and the hospital. One direction for future work is to construct a deep learning model for image retrieval that operates directly on encrypted images by using the capabilities of fully homomorphic encryption algorithms, which include the ability to perform addition and multiplication operations on encrypted data. Another future proposal is to retrieve multimedia by combining deep learning, blockchain, and fully homomorphic encryption algorithms.