A Cryptosystem for Database Security Based on TSFS Algorithm

: Implementation of TSFS (Transposition, Substitution, Folding, and Shifting) algorithm as an encryption algorithm in database security had limitations in character set and the number of keys used. The proposed cryptosystem is based on making some enhancements on the phases of TSFS encryption algorithm by computing the determinant of the keys matrices which affects the implementation of the algorithm phases. These changes showed high security to the database against different types of security attacks by achieving both goals of confusion and diffusion.


Introduction:
Numerous business fields have computerized the majority of their information and applications (1). Unauthorized users may access the system, either for copying that information or rolling out malicious actions to part or the whole database. The security mechanism of a database management system (DBMS) must adopt a security technique of information encryption to ensure the protection of sensitive information (2).
Sensitive information may represent medicinal records, master card numbers, and government secret data. Thus the database administrators should guarantee appropriate insurance of sensitive information. Database cryptosystems can be portrayed by the accompanying properties: confidentiality, integrity, and availability (3).

Figure 1. The components of cryptosystem (4)
TSFS is a symmetric algorithm (the same keys are used in encryption and decryption). It encrypts a block of data as a plain-text or message m under the activity of a secret key k to produce cipher-text c. This is defined as: c = Enc k (m)  Equation 1 And reversed by the decryption process with the user-supplied key that is used in encryption, and defined as (5): Manivannan and Sttjarani (6) suggested TSFS as an encryption algorithm by using the symmetric key, which incorporates transpositions and substitutions as highlights in the procedures that reduces the processing time of encryption and decryption. Afterward proposed an enhanced work on TSFS which can scramble the information that consists of alphanumeric and couple of special characters guaranteeing abnormal state of security to encrypt information however forces a couple of imperatives on the information size and the special characters utilized. So, if someone wants to encrypt the user email id, for example, it is not possible to encrypt by using this algorithm. Al-Souly et al. (7) proposed TSFS algorithm by dividing the dataset to isolate characters, amending shifting and substitution processes, supporting a couple of modulo factors and 4×16 arrays correspondingly to maintain a strategic distance from the incorrectness that emerges in the decryption steps. Initially, a random key generator has been utilized just for obtaining the key values. At that point, the key is replaced by numbers dependent on the succession in the alphabet and saved in 4×4 matrix form. The expansion tasks for the keys are achieved by exchanging positions of the columns utilizing add round key technique. It was a comparative study with well-established encryption algorithms DES (Data Encryption Standard) and AES (Advanced Encryption Standard). So, due to time constraints, the algorithm did not cover all special symbols. Preeti and Khatkar (8) aimed to improve the TSFS algorithm and to give powerful protection to the databases while restricting the additional cost of encryption and decryption time by encoding sensitive information as it were. The ETSFS algorithm can scramble the information that comprises of all numbers, alphabetic characters from A to Z, and some symbols: (*, -. /: @ and _). The improved TSFS algorithm is symmetric, which means every procedure or transformation must be reversible and have a backwards process that can drop its impact. The key additionally should be utilized in an inverse manner. Again, the algorithm in this study did not cover all special symbols.
Jawanjal et al. (9) declared that the significant objective is to try the following issues: 1. Performing an enhancement of TSFS algorithm that will uphold special characters too. 2. A lightweight algorithm is proposed, so it will require minimal time to scramble information. 3. Main key1, key 2 which are arbitrarily utilized in ETSFS may have numbers. 4. Development of an ETSFS Algorithm in WampServer version 2.5 and PHP to estimate. 5. Examination of security by ETSFS Algorithm.
Still, the same number of characters and symbols that were allowed in the previous studies to be encrypted data. Ganta  3. It applies an XOR operation on a final output with the key k1. This is denoted as OTP (One Time Pad) algorithm. The limitation of this algorithm is the character dataset includes (0-25) alphabets, (0-9) numbers, and (0-29) special characters.

Problem Statement
Most of cryptosystems based on TSFS algorithm that are built so far (mentioned in the related work) have the following properties: -The encryption performed inside the DBMS which provides insufficient security against intruders, malicious users, or corrupted database administrators. -The limitation of character set used for either the input text (plaintext) or the encryption keys. The character set used included alphanumeric characters of the ASCII code (alphabetic characters from A to Z and numerical characters from 0 to 9) with some special characters that is used in IP addressing and email (*, -, ., /, :, @, and _) that forms in all of 43 characters. -Beside the encryption keys, there was no other factor affects the encryption (and decryption) algorithm's path in a way that append some novelty. The Objectives 1. To design a cryptosystem based on TSFS algorithm for database security. 2. To increase the character set used in the encryption process for both plaintext and encryption keys. 3. To develop a strong and high performance of encryption keys generated through the use of genetic algorithm. 4. To add sort of modernity to the TSFS algorithm by calculating the determinant of all keys matrices to be a decision factors for using different techniques throughout the algorithm path. Proposed Cryptosystem: The proposed cryptosystem implements TSFS algorithm on plain-text taken from SQLite database through encryption process to produced cipher-text which will be stored in a text file and then by decryption process converts that cipher-text back to plain-text as shown in Fig. 2.

Structure Design:
TSFS algorithm takes plain-text in the shape of a 4 × 4 matrix of any printable characters shown in Table 1 that are ranged from 32-126 ASCII characters (95 characters), any character that is not within the range of the printable set remains unencrypted. Padding is also used if the input data is less than 16 characters to add ' ' space character (Its ASCII is 32) at the end of the plain-text.

Keys Generation and Expansion:
Three keys each of 16 printable characters (128 bits length) are randomly generated using Genetic Algorithm (11) (as shown in Fig. 3) at the beginning of each encryption process that represents k00, k10, and k20. (12) The key generation using the Genetic Algorithm will experience various procedures and primary criteria for key selection will depend on the fitness value of the population, these procedures insure randomness of the keys generated as illustrated in the following algorithm: Do

Figure 3. Basic Model of Genetic Algorithm
Step one Selection: To breed a new generation selected is a portion of the existing population.

Step two
Crossover: An operation of choosing solutions of more than one parent to produce a child solution.
Step three Mutation: Changes one gene value (or more) in a chromosome from its former state. Until getting all elements of the key matrix. (13) Each of the three keys generated is expanded into nine keys by performing Route Transposition (14) of the initial key and subsequently resulted keys to gather three groups as illustrated in the example of Fig. 4. If all of the three determinants are equal then the one of a key from the first group is chosen.
For the example above, the determinants of the initial keys will be: Det0 = 116 Det1 = 117 Det2 = 104 Then Det1 that represents Group1 is selected to be MAXDet of the first round. This factor will play a great role in Transposition and Folding phases, as will be shown, which represents an enhancement to the TSFS algorithm that increased the strength of the encryption and decryption processes.

Methodology:
TSFS algorithm implemented in its encryption process on SQLite database on its rest (stored on some media and no read/write operations are performed while using the cryptosystem). As an experimental database, School.db will be chosen which includes one table named (Trip) of four records of the fields illustrated in Fig. 5:

Figure 5. Experimental Database
The plain-text is read from a SQLite database and entered to the TSFS algorithm in blocks, each block size of 4 × 4 matrix. The first block is 'Winter 15th Feb.', and by using the three keys that are generated in the example above and computing their Det and the MAXDet for the first round (of the ten rounds), then the encryption process of the algorithm is illustrated in Fig. 6: This procedure is used for both encryption and decryption phases.

Substitution:
It is the replacement process of plain-text elements with other elements from the same character set by using three keys in each round. The encryption equation for any given element x is: Proceeding to the example above, at the first round k00, k10, and k20 are considered with E equation with the plain-text resulted from transposition technique to produce substitution ciphered matrix: The matrix resulted from the substitution technique is folded like a paper and the direction of folding depends on the MAXDet of that round which will be: 1) Folding by row if the MAXDet is from Group0.

2) Folding by column if the MAXDet is from
Group1.

3) Folding diagonally if the MAXDet is from
Group2.
In the example, since the MAXDet was from Group1 then it must be folding by column: So Det1 which is equal to 117 in the example will produce the matrix in Fig. 12:

Figure 12. MAXDet XOR folding matrix
In the decryption phase, MAXDet is XORed to the matrix resulted from previous procedure then folded according to MAXDet position rules stated above.

Shifting:
The last technique in the methodology that replaces each element with its corresponding element located in the character set backward by its position in the folded matrix.
The letter (Z) in position 2 of the resulted folded matrix of the example, then its ASCII number (90) is shifted backward two places to be the ASCII of (88) which represent the (X) letter, and so on for all elements taking in concern that the shifting is circulated within the range of the character set as illustrated in Fig. 13. Decryption phase is shifted forward.

Results and Discussion:
The cipher-text resulted from the encryption process is stored in a text file called (School_Trip.txt) and to be used in the decryption process to retrieve to the original data: PXlXO ~ITcIzS`GPrDnsupM|T%4T&p8_ PicoO ~1TbIhS@G"rDnsuC}|TY1T&-I_ Q-_3O%Z17cI}S G:hD"s;L]ER&7KSK&V PX<:OY%#TcIxS`G"rDnsub)|T0_T&KM_ This cipher-text is complex enough to stand against brute-force attack that will need 2 128 alternations to guess each of the initial 3 randomly generated keys (16): K00 = niA.0IYq5`gzu.)' K10 = bb?e5)u8L:hpsK+! K20 = gm1Rj1j\sKU*g,V: These keys are stored in a separated file called (School_Trip.key). The complexity of the cipher-text also comes from the use of the determinant factor that is changed with each run of the cryptosystem and with each round of the 10 rounds of the TSFS algorithm.
The latency for the encryption process = 0:00:00.053387 ms which is concerned fair enough regarding the cost of computation time.

Conclusions:
Database attacks are incremented in the risks of information revelation. Numerous organizations must have an arrangement with regulation and legislation on information confidentiality.
The best arrangement focused on protecting the information is utilizing cryptography, alongside different techniques. In the event that sensitive information is encrypted in the database, risks from security faults can be wiped out by using three