Sunday, July 21, 2019
Database security and encryption
Introduction

Organisations increasingly rely on distributed information systems to gain productivity and efficiency advantages, but at the same time they are becoming more vulnerable to security threats. Database systems are an integral component of such distributed information systems and hold the data that enables the whole system to work. A database can be defined as a shared collection of logically related data, together with a description of this data, designed to meet the information needs of an organisation. A database system comprises the database itself, a database management system (DBMS), which is the software that manages (defines, creates and maintains) and controls access to the database, and a collection of database applications, which are programs that interact with the database at some point in their execution (typically by issuing SQL statements) [1].

Organisations have adopted database systems as the key data management technology for decision-making and day-to-day operations. Databases are designed to hold large amounts of data, and managing that data involves both defining structures for storing information and providing mechanisms for manipulating it. Because the data is shared among several users, the system must avoid anomalous results and keep the stored information safe despite system crashes and attempts at unauthorized access. The data involved can be highly sensitive or confidential, which makes its security even more crucial: a security breach does not affect only a single application or user but can have disastrous consequences for the entire organisation. A number of security techniques have been suggested over time to tackle these issues. They can be classified as access control, inference control, flow control, and encryption.
1.1 A Short History

From the first database applications, which were built on hierarchical and network systems, to today's variety of database systems, such as relational databases (RDBMS), object-oriented databases (OODBMS), object-relational databases (ORDBMS), and XML databases queried through XQuery, one factor has always been of the utmost importance: the security of the data involved. Data has always been a valuable asset for companies and must be protected. Organizations spend millions to achieve the best security standards for their DBMS. Most of an organization's sensitive and proprietary data resides in a DBMS, so the security of the DBMS is a primary concern.

Securing a DBMS concerns both internal and external users. Internal users are employees of the organization, such as database administrators, application developers, and end users who only use an application interface that fetches its data from one of the databases. External users can be employees who do not have access to the database or outsiders who have nothing to do with the organization. Other factors that have made data security more crucial are the recent rapid growth of web-based information systems and applications and the concept of mobile databases.

Any intentional or accidental event that can adversely affect a database system is considered a threat to the database, and database security can be defined as a mechanism that protects the database against such intentional or accidental threats. Security breaches can be classified as unauthorized data observation, incorrect data modification, and data unavailability, which can lead to loss of confidentiality, availability, integrity, and privacy, as well as to theft and fraud. Unauthorized data observation results in the disclosure of information to users who are not entitled to access it.
Incorrect data modification, whether intentional or unintentional, leaves the database in an incorrect state. Data unavailability can keep an entire organization from functioning properly if the data is not available when needed. Security in terms of databases can thus be broadly classified into access security and internal security. Access security refers to the mechanisms implemented to restrict any sort of unauthorized access to the database. An example is authorization by credentials: every user has a unique username and password that establishes him or her as a legitimate user when trying to connect to the database. When the user tries to connect, the login credentials are checked against a set of username and password combinations set up under a security rule by a security administrator. Internal security is an extra level of security that comes into play if someone has already breached the access security, for example by getting hold of a valid username and password and using them to access the database. Security mechanisms implemented within the database, such as encrypting the data stored in it, can therefore be classed as internal security, which prevents the data from being compromised even if someone has gained unauthorized access to the database.

Every organization needs to identify the threats it might be subjected to and then adopt appropriate security plans and countermeasures, taking into consideration their implementation costs and their effects on performance. Addressing these threats helps the enterprise to meet the compliance and risk mitigation requirements of the most regulated industries in the world.

1.2 How Databases are Vulnerable

According to David Knox [2], "Securing the Database may be the single biggest action an organization can take, to protect its assets". The most commonly used type of database in an enterprise organization is the relational database. Data is a valuable resource in an enterprise organization.
Enterprises therefore have a very strong need to control and manage it strictly. As discussed earlier, it is the responsibility of the DBMS to make sure that the data is kept secure and confidential, as it is the element that controls access to the database. Enterprise database infrastructure is subject to an overwhelming range of threats. The most common threats to which an enterprise database is exposed are:

Excessive Privilege Abuse: a user or an application has been granted database access privileges that exceed the requirements of their job function. For example, an employee of an academic institute whose job only requires the ability to change a student's contact information can also change the student's grades.

Legitimate Privilege Abuse: legitimate database access privileges can also be abused for malicious purposes. There are two risks to consider in this situation. The first is that confidential or sensitive information can be copied using the legitimate database access privilege and then sold. The second, and perhaps the more common, is that large amounts of information are retrieved and stored on a client machine for no malicious reason; once the data is available on an endpoint machine rather than only in the database itself, it is more susceptible to Trojans, laptop theft, and similar risks.

Privilege Elevation: software vulnerabilities can be found in stored procedures, built-in functions, protocol implementations or even SQL statements. For example, a software developer can gain database administrative privileges by exploiting a vulnerability in a built-in function.

Database Platform Vulnerabilities: vulnerabilities in the operating system or in any additional services installed on the database server can lead to unauthorized access, data corruption, or denial of service. An example is the Blaster worm, which took advantage of a vulnerability in Windows 2000 to create denial of service conditions.

SQL Injection: among the most common attack techniques.
In a SQL injection attack, the attacker typically inserts unauthorized queries into the database through the input forms of a vulnerable web application, and these queries are executed with the privileges of the application. Internal users can mount the same attack against internal applications or stored procedures. Access to the entire database can be gained using SQL injection.

Weak Audit: a strong database audit is essential in an enterprise organization, as it helps to fulfil government regulatory requirements, provides investigators with forensic evidence that links intruders to a crime, and deters attackers. Database audit is considered the last line of database defense. Audit data can identify the existence of a violation after the fact, can be used to link it to a particular user, and can help repair the system if corruption or a denial of service attack has occurred. The main reasons for a weak audit are: auditing degrades performance by consuming CPU and disk resources; administrators can turn off auditing to hide an attack; and organizations with mixed database environments cannot have a uniform, scalable audit process across the enterprise, as audit processes are unique to each database server platform.

Denial of Service: access to network applications or data is denied to the intended users. A simple example is crashing a database server by exploiting a vulnerability in the database platform. Other common denial of service techniques are data corruption, network flooding, and server resource overload (common in database environments).

Database Protocol Vulnerabilities: the SQL Slammer worm took advantage of a flaw in the Microsoft SQL Server protocol to force denial of service conditions. It affected 75,000 victims in just over 30 minutes, dramatically slowing down general internet traffic. [Analysis of BGP Update Surge during Slammer Worm Attack]

Weak Authentication: weak authentication schemes allow attackers to obtain legitimate login credentials by improper means.
An attacker can gain access to a legitimate user's login details in various ways: by repeatedly entering username/password combinations until one works (common or weak passwords can be guessed easily), by convincing someone to share their login credentials, or by stealing the login credentials by copying password files or notes.

Backup Data Exposure: there have been several security breaches involving the theft of database backup tapes and hard disks, as this media is thought of as least prone to attack and is often completely unprotected from attack [3].

All these security threats result in unauthorized data observation, incorrect data modification, or data unavailability. A complete data security solution must take into consideration the secrecy/confidentiality, integrity and availability of data. Secrecy or confidentiality refers to the protection of data against unauthorized disclosure, integrity refers to the prevention of incorrect data modification, and availability refers to the prevention of hardware/software errors and malicious data access denials that make the database unavailable.

1.3 Security Techniques

As organizations increase their adoption of database systems as the key data management technology for day-to-day operations and decision-making, the security of the data managed by these systems has become crucial. Damage to and misuse of data affect not only a single user or application, but may have disastrous consequences for the entire organization. There are four main control measures that can be used to secure data in databases: access control, inference control, flow control, and data encryption.

Chapter 2 Literature Review

Secure and secret means of communication have always been desired in the field of database systems. Whenever data is transmitted, there is a possibility of interception by a party outside the sender-receiver domain.
Modern digital encryption methods form the basis of today's database security. In its earlier days encryption was used by military and government organizations to keep information secret, but at present it is used to protect information within many kinds of civilian systems. In 2007 the U.S. government reported that 71% of companies surveyed utilized encryption for some of their data in transit [4].

2.1 Encryption

Encryption is defined as the process of transforming information (plaintext) using an encryption algorithm (cipher) into an unreadable form (encrypted information, called ciphertext), making it inaccessible to anyone who does not possess the special knowledge needed to decrypt it. In other words, encryption is the encoding of data by a special algorithm that renders the data unreadable by any program without the decryption key [1].

The code and the cipher are the two methods of encrypting data. The encryption of data or a message is accomplished by one, or both, of the methods of encoding or enciphering. Each involves distinct methodologies, and the two are differentiated by the level at which they are carried out. Encoding is performed at the word or block level and deals with the manipulation of groups of characters. Enciphering works at the character level. It includes transposition, which scrambles the individual characters of a message, and substitution, which replaces characters with others.

Codes are generally designed to replace entire words or blocks of data in a message with other words or blocks of data. Languages can be considered codes, since words and phrases represent ideas, objects, and actions. There are codes that substitute entire phrases or groups of numbers or symbols with others. A single system may employ both levels of encoding. For example, consider a code encryption scheme as follows: the = jam, man = barn, is = fly, dangerous = rest.
Then the message "the man is dangerous" would read in encrypted form as "jam barn fly rest". Although overly simplistic, this example illustrates the basis of codes. With the advent of electrical communications, codes became more sophisticated in answer to the needs of the systems. For example, the invention of Morse code and the telegraph created a need for more sophisticated secure transmission. Codes are very susceptible to breaking and present a large exposure surface with regard to interception and decryption via analysis. Also, there are no easily implemented means by which to detect breaches in the system.

The other method of encryption is the cipher. Instead of replacing words or blocks of numbers or symbols with others, as the code does, the cipher replaces individual letters, numbers, or characters, or small sets of them, with others, based on a certain algorithm and key. Digital data and information, including video, audio, and text, can be separated into groups, or blocks, of bits and then manipulated for encryption by such methods as XOR (exclusive OR), encoding-decoding, and rotation. As an example, let us examine the basics of the XOR method. Here, a group of bits (e.g., a byte) of the data is combined with a digital key, and the exclusive-or operation is performed on the two to produce an encrypted result. Figure 2 illustrates the process.

Figure 2: The XOR process for Encryption

When the exclusive-or operation is performed on the plaintext and the key, the ciphertext emerges and is sent. The receiver performs the exclusive-or operation on the ciphertext and the same key, and the original plaintext is reproduced [5].

Encryption can be reversible or irreversible. Irreversible techniques do not allow the encrypted data to be decrypted, but the encrypted data can still be used to obtain valid statistical information. Irreversible techniques are rarely used compared with reversible ones.
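The XOR method described above can be made concrete with a few lines of Python. This is only a minimal sketch: the function name xor_bytes, the sample key, and the choice to repeat the key over the message are assumptions made for illustration, and a short repeating key like this one offers no real security.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical three-byte key, chosen only for this illustration.
key = b"\x5a\xc3\x17"
plaintext = b"the man is dangerous"

# Sender: plaintext XOR key gives the ciphertext.
ciphertext = xor_bytes(plaintext, key)

# Receiver: ciphertext XOR the same key reproduces the plaintext.
recovered = xor_bytes(ciphertext, key)
```

Because XOR is its own inverse, the same function serves for both encryption and decryption, which is exactly the property the receiver relies on.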
The whole process of transmitting data securely over an insecure network is called a cryptosystem, which includes:
» An encryption key to encrypt the data (plaintext)
» An encryption algorithm that transforms the plaintext into encrypted information (ciphertext) with the encryption key
» A decryption key to decrypt the ciphertext
» A decryption algorithm that transforms the ciphertext back into plaintext using the decryption key [1].

2.2 Encryption Techniques

The goals of digital encryption are no different from those of historical encryption schemes. The difference is found in the methods, not the objectives. Secrecy of the message and of the keys is of paramount importance in any system, whether it is on parchment paper or in an electronic or optical format [5]. Various encryption techniques are available, and they can be broadly classified into two categories: asymmetric and symmetric encryption. In symmetric encryption the sender and the receiver share the same algorithm and key for encryption and decryption, and they depend on a safe communication channel for exchanging the encryption key, whereas asymmetric encryption uses different keys for encryption and decryption. Asymmetric encryption gave birth to the concept of public and private keys and is preferred to symmetric encryption as being more secure [1], [5].

2.2.1 Symmetric Encryption

Symmetric encryption, also known as single-key encryption or conventional encryption, was the only type of encryption in use before the concept of public-key encryption emerged, and it remains by far the more widely used of the two types. The figure below illustrates the symmetric encryption process. The original message (plaintext) is converted into apparently random information (ciphertext) using an algorithm and a key. The key is a value independent of the plaintext. The algorithm produces a different output depending on the specific key in use, i.e. the output of the algorithm changes if the key is changed.
The ciphertext produced is then transmitted and is transformed back into the original plaintext by using a decryption algorithm and the same key that was used for encryption.

Figure: Simplified Model of Conventional Encryption [7, p. 22]

The model can be better understood through the following example. A source produces a message X = [X1, X2, X3, ..., XM] in plaintext. The M elements of X are letters in some finite alphabet. Traditionally the alphabet consisted of the 26 capital letters, but nowadays the binary alphabet {0,1} is used. An encryption key K = [K1, K2, K3, ..., KJ] is generated and shared between the sender and the receiver over a secure channel. Alternatively, a third party can generate the encryption key and securely deliver it to both the sender and the receiver. Using the plaintext X and the encryption key K as input, the encryption algorithm produces the ciphertext Y = [Y1, Y2, Y3, ..., YN] as

Y = E_K(X)

where E is the encryption algorithm; the ciphertext Y is a function of the plaintext X, with the specific function determined by the key K. At the receiver's end the ciphertext is converted back into the plaintext as

X = D_K(Y)

where D is the decryption algorithm.

Figure: Model of Conventional Cryptosystem [7, p. 23]

The common symmetric block ciphers are the Data Encryption Standard (DES), Triple DES, and the Advanced Encryption Standard (AES).

2.2.1.1 The Data Encryption Standard

The Data Encryption Standard has been used in some of the most widely used encryption schemes, including Kerberos 4.0. The National Bureau of Standards adopted it as a standard in 1977 [7]. DES operates on 64-bit blocks using a 56-bit key. As in other encryption schemes, there are two inputs to the DES encryption function: the plaintext to be encrypted and the key. The plaintext must be 64 bits in length, and the 56-bit key is obtained from the given 64-bit key by stripping off the 8 parity bits, that is, by ignoring every eighth bit.
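The step from a 64-bit key to 56 effective key bits can be sketched as follows. This is an illustration under stated assumptions, not part of a DES implementation: the function name strip_parity_bits and the sample key are hypothetical, and the sketch only drops the eighth bit of each byte, whereas the real DES key schedule also permutes the remaining bits.

```python
def strip_parity_bits(key64: bytes) -> int:
    """Drop every eighth bit of an 8-byte key and return the 56 key bits as an int.

    Bits are counted from the most significant bit of each byte, so the
    eighth bit of a byte is its least significant bit.
    """
    if len(key64) != 8:
        raise ValueError("a DES key is supplied as exactly 8 bytes (64 bits)")
    key56 = 0
    for byte in key64:
        # Keep the upper 7 bits of the byte and discard the parity bit.
        key56 = (key56 << 7) | (byte >> 1)
    return key56


# Hypothetical 64-bit key, used only to show the bit count.
sample_key = bytes.fromhex("133457799bbcdff1")
effective_key = strip_parity_bits(sample_key)
effective_bits = effective_key.bit_length()  # never more than 56
```

Whatever the parity bits are set to, they do not influence the result, which is why two 64-bit keys that differ only in those positions behave as the same DES key.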
After 16 rounds of identical operations, the algorithm outputs a 64-bit block of ciphertext. A suitable combination of permutations and substitutions, applied 16 times to the plaintext, is the basic building block of DES. The same algorithm is used for both encryption and decryption, except that the key schedule is processed in reverse order [6], [7].

The 64-bit plaintext is passed through an initial permutation (IP) that produces a permuted input by rearranging the bits. This is followed by 16 rounds of the same function, which involves both permutation and substitution functions. The last round produces a 64-bit output that is a function of the input plaintext and the key. The left and right halves of this output are swapped to produce the preoutput. The preoutput is passed through a final permutation (IP-1), the inverse of the initial permutation function, to obtain the 64-bit ciphertext. The overall process for DES is shown in the diagram below.

Figure: General Depiction of DES Encryption Algorithm [7, p. 67]

The right-hand side of the diagram shows how the 56-bit key is used during the process. The key is initially passed through a permutation function, and then for each of the 16 rounds a subkey (Ki) is generated by combining a left circular shift and a permutation. The permutation function is the same for every round, but the subkey is different because of the repeated shifting of the key bits.

Since the adoption of DES as a standard, there have always been concerns about the level of security it provides. The two areas of concern are the key length and the fact that the design criteria for the internal structure of DES, the S-boxes, were classified. The issue with the key length was that it had been reduced to 56 bits from the 128 bits of the LUCIFER algorithm, on which DES was based; many suspected that this enormous decrease made the key too short to withstand brute-force attacks.
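The scale of the reduction from 128 to 56 key bits is easier to judge with some quick arithmetic. The sketch below only counts keys and estimates an exhaustive search time; the function names and the search rate of one billion keys per second are hypothetical values chosen for illustration and say nothing about any real attacker or hardware.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def keyspace(bits: int) -> int:
    """Number of distinct keys for a key of the given length in bits."""
    return 2 ** bits


def years_to_search(bits: int, keys_per_second: float) -> float:
    """Years needed to try every key once at the assumed search rate."""
    return keyspace(bits) / keys_per_second / SECONDS_PER_YEAR


# Assumed rate, for illustration only.
ASSUMED_RATE = 1e9

des_keys = keyspace(56)        # 72,057,594,037,927,936 keys
lucifer_keys = keyspace(128)
shrink_factor = lucifer_keys // des_keys   # 2**72

des_years = years_to_search(56, ASSUMED_RATE)
lucifer_years = years_to_search(128, ASSUMED_RATE)
```

Each bit removed halves the number of keys an attacker has to try, so dropping 72 bits shrinks the search space by a factor of 2 to the power 72, which is the root of the brute-force concern.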
Also, users could not be sure that the internal structure of DES was free of hidden weak points that would allow the NSA to decipher messages without the benefit of the key. Later work on differential cryptanalysis and subsequent events indicated that the internal structure of DES is very strong.

2.2.1.2 Triple DES

Triple DES was developed as an alternative in response to the potential vulnerability of standard DES to a brute-force attack. It became very popular in Internet-based applications. Triple DES uses multiple encryptions with DES and multiple keys, as shown in the figure below. Triple DES with two keys is preferred to DES, but Triple DES with three keys is preferred overall. The plaintext P is encrypted with the first key K1, then decrypted with the second key K2, and finally encrypted again with the third key K3; this is known as Encrypt-Decrypt-Encrypt (EDE) mode. According to the figure, the ciphertext C is produced as

C = E_K3[D_K2[E_K1[P]]]

The keys need to be applied in reverse order when decrypting. The ciphertext C is decrypted with the third key K3 first, then encrypted with the second key K2, and finally decrypted with the first key K1, producing the plaintext P as

P = D_K1[E_K2[D_K3[C]]]

Figure: Triple DES encryption/decryption [6, p. 72]

2.2.1.3 Advanced Encryption Standard

2.3 Encryption in Database Security

Organizations increasingly rely on information systems, possibly distributed ones, for daily business; hence they become more vulnerable to security breaches even as they gain productivity and efficiency advantages. Database security has gained substantial importance over time. Database security has always been about protecting the data: data in the form of customer information, intellectual property, financial assets, commercial transactions, and any number of other records that are retained, managed and used on the systems.
The confidentiality and integrity of this data need to be protected as it is converted into information and knowledge within the enterprise. Core enterprise data is stored in relational databases and then offered to users via applications. These databases typically store the most valuable information assets of an enterprise and are under constant threat, not only from external users but also from legitimate users such as trusted insiders, super users, consultants and partners, or from their unprotected user accounts, which can be used to compromise the system and take or modify the data for some inappropriate purpose.

Classifying the types of information in the database and the security needs associated with them is the first and most important step. As databases are used in a multitude of ways, it is useful to characterize some of their primary functions in order to understand the different security requirements. A number of security techniques have been developed, and are still being developed, for database security; encryption, as defined in Section 2.1, is one of them.

2.3.1 Access Encryption

Access control to confidential information in enterprise computing environments is challenging for several reasons. First, the number of information services in an enterprise computing environment is huge, which makes the management of access rights essential. Second, a client might not know, before requesting access, which access rights are necessary in order to be granted access to the requested information.
Third, access control must support flexible access rights, including context-sensitive constraints.

Access control schemes can be broadly classified into two types: proof-based and encryption-based access control schemes. In a proof-based scheme, a client needs to assemble some access rights into a proof of access, which demonstrates to a service that the client is authorized to access the requested information. Proof-based access control is preferred for scenarios where the client-specific access rights required are flexible, since flexible access rights make it easy to include support for constraints. However, the same does not hold for covert access requirements: existing designs assume that a service can inform a client of the nature of the required proof of access. In a proof-based access control scheme the service does not need to locate the required access rights, which can be an expensive task. [9]

In an encryption-based access control scheme, the service provides confidential information to any client in encrypted form. Clients who are authorized to access the information have the corresponding decryption key. An encryption-based access control scheme is attractive for scenarios where there are many queries to a service, because it shields the service from having to run client-specific access control. Compared with proof-based access control, it is straightforward to add support for covert access requirements to existing encryption-based architectures. In particular, the service encrypts all the information as usual, but the client is not told which decryption key to use. The client has a set of decryption keys and now needs to search this set for a matching key. On the other hand, considering that key management should remain simple, it is less straightforward to add support for constraints on access rights to the proposed architectures.
[10]

2.3.1.1 Encryption-Based Access Control

Encryption-based access control is attractive when there are many requests for the same information, as it is independent of the individual clients issuing these requests. For example, an information item can be encrypted once and the service can use the ciphertext to answer multiple requests. However, this uniform treatment of requests makes it difficult to deal with constraints on access rights and with granularity-aware access rights. Further challenges arise in cases of covert access requirements and service-independent access rights. The main requirements for encryption-based access control are:
» The encrypted information must not reveal any knowledge about the encryption key used or the decryption key required.
» For decrypting encrypted information, each value of a constraint must require a separate key that is accessible only under the given constraint/value combination. To keep key management simple, the scheme should support hierarchical constraints.
» The decryption key for coarse-grained information should be derivable from the key for fine-grained information, to further simplify key management.
» As implied by service-independent access rights, a single decryption key is used to decrypt the same information offered by multiple services. In a symmetric cryptosystem this means that a service that encrypts a piece of information can also access the same information offered by other services. This problem can be avoided by using an asymmetric cryptosystem. [8]

2.3.1.2 Encryption-Based Access Control Techniques

An access control architecture is ideal if its access rights are simple to manage and the system is constrainable and aware of granularity. The architecture also has to be asymmetric, provide indistinguishability, and be personalizable in the case of proof-based access control.
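The requirement that a coarse-grained key be derivable from a fine-grained one can be illustrated with a one-way hash chain. This is a sketch of the general idea only and is not the scheme used in the cited architectures: the granularity levels, the function names, and the use of SHA-256 as the derivation function are all assumptions made for illustration.

```python
import hashlib

# Hypothetical granularity levels for location data, ordered fine to coarse.
LEVELS = ["room", "floor", "building"]


def derive_coarser_key(key: bytes) -> bytes:
    """One derivation step: hash a key to obtain the key of the next coarser level."""
    return hashlib.sha256(b"coarsen:" + key).digest()


def key_for_level(finest_key: bytes, level: str) -> bytes:
    """Derive the key for the given level from the finest-grained key."""
    key = finest_key
    for _ in range(LEVELS.index(level)):
        key = derive_coarser_key(key)
    return key


# Placeholder key material, for illustration only.
room_key = bytes(32)
floor_key = key_for_level(room_key, "floor")
building_key = key_for_level(room_key, "building")
```

A client holding only room_key can compute the floor and building keys on demand, so it needs to store a single key, while the one-way hash prevents a holder of building_key from working back to the finer-grained keys.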
Some common encryption-based access control techniques are:

Identity Based Encryption

An identity-based encryption scheme is specified by four randomized algorithms:
» Setup: takes a security parameter k and returns the system parameters (params) and the master-key. The system parameters include a description of a finite message space m and a description of a finite ciphertext space c. Intuitively, the system parameters will be publicly known, while the master-key will be known only to the Private Key Generator (PKG).
» Extract: takes as input the system parameters, the master-key, and an arbitrary ID ∈ {0,1}*, and returns a private key d. ID is an arbitrary string that is used as a public key, and d is the corresponding private decryption key. The Extract algorithm extracts a private key from the given public key.
» Encrypt: takes as input the system parameters, ID, and M ∈ m. It returns a ciphertext C ∈ c.
» Decrypt: takes as input the system parameters, C ∈ c, and a private key d. It returns M ∈ m.

These algorithms must satisfy the standard consistency constraint: when d is the private key generated by the algorithm Extract when it is given ID as the public key, then

∀ M ∈ m: Decrypt(params, C, d) = M where C = Encrypt(params, ID, M) [11]

Hierarchical Identity-Based Encryption

One of the first practical IBE schemes was presented by Boneh and Franklin. Gentry and Silverberg [7] introduced a Hierarchical Identity-Based Encryption (HIBE) scheme based on Boneh and Franklin's work. In HIBE, private keys are given out by a root PKG to the sub PKGs, which in turn distribute private keys to the individuals in their domains. IDs are associated with the root PKG, with each sub PKG, and with each individual; the public key of an individual corresponds to the IDs of the root PKG, of any sub PKGs on the path from the root PKG to the individual, and of the individual. Public parameters are required only from the root PKG for encrypting messages. It has the advantage of reducing the amount o