Mathematical Problems in Engineering

Volume 2017 (2017), Article ID 1387375, 14 pages

https://doi.org/10.1155/2017/1387375

## A New Reversible Database Watermarking Approach with Firefly Optimization Algorithm

Department of Computer Engineering, Karadeniz Technical University, 61080 Trabzon, Turkey

Correspondence should be addressed to Mustafa Bilgehan Imamoglu; bilgehan@ktu.edu.tr

Received 5 September 2016; Revised 25 December 2016; Accepted 30 January 2017; Published 6 March 2017

Academic Editor: Lotfi Senhadji

Copyright © 2017 Mustafa Bilgehan Imamoglu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Up-to-date information is crucial in many fields such as medicine, science, and the stock market, where data must be distributed to clients from a centralized database. Shared databases are usually stored in data centers and distributed over an insecure public network, the Internet. Sharing may cause problems such as unauthorized copying, alteration of data, and distribution to unauthorized people for reuse. Researchers have proposed watermarking to prevent these problems and to claim digital rights, and many methods have recently been proposed to watermark databases to protect the digital rights of owners. In particular, optimization-based watermarking techniques have drawn attention because they yield lower distortion and improved watermark capacity. In this work, difference expansion watermarking (DEW) combined with the Firefly Algorithm (FFA), a bioinspired optimization technique, is proposed to embed a watermark into relational databases. The FFA efficiently selects the attribute values that yield lower distortion and increased watermark capacity. Experimental results indicate that the FFA-based method has reduced complexity and results in less distortion and improved watermark capacity compared to similar works reported in the literature.

#### 1. Introduction

The rapid development of Information Technologies (IT) has simplified data sharing through the web and enabled collaboration among people, organizations, and governments worldwide. This collaborative environment necessitates either free or commercial sharing of information, as in medicine, science, and stock quotes. Most of the time, centralized or distributed relational databases hold the shared information, which authorized users access through applications. One of the problems in sharing information is that it can be copied, altered, or modified easily by authorized users and may be distributed to unauthorized users for reuse. Therefore, ensuring copyright and preventing tampering with shared data have become serious problems.

Cryptography can be used to share confidential information without compromising security. However, it does not associate ciphered information with the original content and cannot be used for copyright protection. Watermarking was proposed in [1] and provides solutions for authentication, fingerprinting, copy control, and copyright protection during the distribution of digital data. Many watermarking techniques have been proposed in the literature to share images, video files, audio files, and relational databases without violating copyrights. The robustness, fidelity, and blindness properties of watermarking techniques have recently been used for copyright protection of relational databases.

In 2003, Kiernan et al. showed that watermarking could be used to provide copyright protection of relational databases [2]. Their method embeds watermark data at bit level in numeric data fields. A secure Message Authentication Code (MAC) is used to select some tuples, and attributes on the selected tuples, in which to embed the watermark bits. The method uses LSB embedding, which alters the relational database permanently. A pseudo random number generator is used to generate the watermark bits. When an attacker applies a simple shifting operation to the LSBs of numeric fields in the watermarked database, a significant amount of the watermark is lost. Many other works in the field followed. In 2004, Sion et al. proposed a method to insert a *virtual* watermark [3]. Their method first sorts tuples in ascending order by the most significant bits of normalized values and then places markers at tuples where the MAC value modulo a given parameter yields zero, which creates partitions. The watermark is then embedded by modifying tuple values close to the standard deviation boundary, which alters the partition statistics. Their work requires some extra payload information and depends on a single attribute.
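The keyed tuple selection and LSB embedding described for [2] can be illustrated with a short Python sketch. It is not the implementation from [2]: the function names, the use of HMAC-SHA256 as the MAC, the parameters `gamma` (selection rate) and `num_lsb` (number of candidate low-order bits), and the choice to derive the watermark bit from the MAC itself rather than from a separate pseudo random number generator are all assumptions made for illustration.

```python
import hashlib
import hmac


def mac(key: bytes, primary_key: str) -> int:
    """Keyed MAC of the primary key, returned as a large integer."""
    digest = hmac.new(key, primary_key.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


def is_selected(key: bytes, primary_key: str, gamma: int = 2) -> bool:
    """Roughly one tuple in `gamma` is selected for marking (assumed rule)."""
    return mac(key, primary_key) % gamma == 0


def mark_tuple(key, primary_key, values, gamma=2, num_lsb=2):
    """Return a copy of `values` (non-negative ints) with one LSB overwritten."""
    marked = list(values)
    m = mac(key, primary_key)
    if m % gamma != 0:
        return marked                      # tuple not selected: unchanged
    m //= gamma
    attr = m % len(values)                 # which attribute to mark
    m //= len(values)
    pos = m % num_lsb                      # which low-order bit to overwrite
    m //= num_lsb
    bit = m % 2                            # keyed pseudo-random watermark bit
    marked[attr] = (marked[attr] & ~(1 << pos)) | (bit << pos)
    return marked
```

Because the overwritten bit is not stored anywhere, the original value cannot be restored after marking, which is the irreversibility that the reversible schemes discussed later aim to remove.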

Li et al. used categorical attribute to detect malicious alterations and used fragile watermarking approach [4]. The tuples in the database are grouped and watermarks are embedded into groups. Verification procedure is applied on group level. Guo et al. proposed a fragile watermarking scheme to detect malicious modifications on a database in 2006 [5]. The method divides the database into partitions according to the primary key value of tuples. However, two LSB bits are modified by their approach to embed the watermark information. In 2006, Zhang et al. proposed the first reversible watermarking scheme in their work [6]. Their method utilized histogram expansion approach to reversibly watermark the selected nonzero initial digits of errors. In 2007, Zhou et al. embedded a bitmap image into the relational database for copyright protection [7]. Their method also implements an error correction mechanism to correct the errors in the detected watermarks. Difference Expansion Based Watermarking (DEW) technique is used in [8] to watermark a database in a reversible manner. DEW was proposed by Alattar in 2004 and used average values with expanding or enlarging to embed the watermark bits [9]. The method described in [8] uses distortion tolerance to determine the distortion tolerance of the attribute. Bhattacharya and Cortesi created the watermark after partitioning tuples as a permutation of tuples in 2009 [10]. A hash function is used by their method for grouping purposes. The proposed method is distortion-free due to the ordering of tuples. Hanyurwimfura et al. used nonnumeric attributes that contain multiwords [11]. Their method uses Levenshtein distance during the embedding procedure. Location of a word is shifted horizontally according to the watermark bit. Tuples and attributes that are used for embedding procedure are chosen dynamically. Farfoura et al. [12] used reversible watermarking technique in their method. 
A reversible watermarking approach recovers the original data from the watermarked data after watermark extraction. Primary key dependent techniques cannot resist linear transformation attacks because, when the watermarked tuple is deleted, the watermark cannot be detected. Their work utilized the Prediction Error Expansion watermarking technique proposed in [13]. In 2012, Arif et al. emphasized the rewatermarking attack and used a date stamp with the watermark to overcome this problem [14]. Khan and Husain proposed a database watermarking technique based on fragile watermarking [15]. Their method is a kind of zero watermarking approach and utilizes characteristics of the database relation. The method evaluates numeric attributes and is not resilient to the attribute value substitution attack. Camara et al. suggested a fragile zero watermarking method to authenticate numerical relational data [16]. Their method is distortion-free and generates the watermark from the original database. The method partitions the database into groups and generates some mathematical values from each group. Results show that their method is robust to some modification attacks.

Jawad and Khan combined a genetic algorithm with difference expansion watermarking to propose a robust and reversible database watermarking approach called GADEW [17]. GADEW minimizes the distortion induced by the embedding procedure and increases watermarking capacity. Their method uses the DEW technique to embed the watermark bits into the database in a reversible manner.

In this work, we propose a new reversible database watermarking method, called FFADEW, that uses the Firefly Algorithm to both minimize distortion and improve the robustness of DEW. The method uses the bioinspired Firefly Algorithm proposed in [18] to select the best attributes for watermark embedding. Each firefly in a population is a candidate solution for the watermarking process, and the brightest firefly in the current population is the best solution for the current iteration. Other fireflies are moved towards the best firefly with the proposed moving operation. Generation of new populations ends when the algorithm reaches its stopping condition. The brightest firefly in the last population guides the watermarking process. Each firefly consists of floating-point values that designate the location of the attributes to be watermarked in each tuple. Experimental results show that the method modifies attribute values in a less observable manner (measured by the standard deviation and average values of attributes) and that it is more robust against popular database watermarking attacks than GADEW. The complexity of the proposed method is also lower than that of GADEW, as shown by the run times of both algorithms.

The rest of the article is organized as follows: Section 2 gives background information about DEW method and Firefly Algorithm used in FFADEW. The details of the method are explained in Section 3 and experimental results are given in Section 4. Section 5 concludes the article with suggestions.

#### 2. Related Works

The details of the difference expansion watermarking technique and firefly optimization algorithm used in the proposed method are given in this section.

##### 2.1. Difference Expansion Watermarking Technique

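Alattar proposed a reversible watermarking approach called difference expansion watermarking (DEW) in 2004 for images [9]. The DEW algorithm takes a pair of adjacent pixels and modifies their difference to carry the watermark bit. The algorithm can reconstruct the original image after extracting the watermark. In the proposed method, DEW is used to embed a watermark bit into two numeric attributes, denoted by $A_1$ and $A_2$, instead of pixel values. Their average and difference are calculated as in (1) and denoted by avg and $d$, respectively.

$$\mathrm{avg} = \left\lfloor \frac{A_1 + A_2}{2} \right\rfloor, \qquad d = A_1 - A_2 \tag{1}$$

The symbol $\lfloor x \rfloor$ denotes the floor function, which returns the greatest integer less than or equal to its argument $x$. The difference value $d$ is modified to carry the watermark bit $b$ as in (2). The attribute values are also modified to give the new difference value $d'$.

$$d' = 2d + b, \qquad A_1' = \mathrm{avg} + \left\lfloor \frac{d' + 1}{2} \right\rfloor, \qquad A_2' = \mathrm{avg} - \left\lfloor \frac{d'}{2} \right\rfloor \tag{2}$$

$A_1'$ and $A_2'$ denote the modified values of the attributes. The method uses the following steps to recover the original values of the attributes and the corresponding watermark bit.

*Step 1.* Compute the average of the attributes, $\mathrm{avg} = \lfloor (A_1' + A_2')/2 \rfloor$, and the difference $d' = A_1' - A_2'$.

*Step 2.* Extract the watermark bit from the difference: $b = d' - 2\lfloor d'/2 \rfloor$.

*Step 3.* Reconstruct the original values of $A_1$ and $A_2$ with $d = \lfloor d'/2 \rfloor$: $A_1 = \mathrm{avg} + \lfloor (d + 1)/2 \rfloor$; $A_2 = \mathrm{avg} - \lfloor d/2 \rfloor$.

For example, let $A_1 = 16$ and $A_2 = 24$ be the contents of two numeric attributes in a database. The difference and average values are calculated as $d = -8$ and $\mathrm{avg} = 20$. Assume that the current watermark bit is 1. The new difference value will be $d' = 2 \times (-8) + 1 = -15$ and the modified attribute values will be $A_1' = 20 + \lfloor -14/2 \rfloor = 13$ and $A_2' = 20 - \lfloor -15/2 \rfloor = 28$. The watermark extraction algorithm calculates the new difference value and new average value as −15 and 20, respectively. The watermark bit is calculated as $b = -15 - 2\lfloor -15/2 \rfloor = 1$ and the original attribute values are reconstructed as in

$$A_1 = 20 + \left\lfloor \frac{-8 + 1}{2} \right\rfloor = 16, \qquad A_2 = 20 - \left\lfloor \frac{-8}{2} \right\rfloor = 24 \tag{3}$$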
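The embedding and recovery steps above can be checked with a minimal Python sketch. It assumes integer attribute values and the standard difference-expansion formulas (floor of the mean, difference doubled plus the bit); the function names `dew_embed` and `dew_extract` are illustrative and do not come from the paper. Overflow and distortion-tolerance checks are deliberately left out.

```python
def dew_embed(a1: int, a2: int, bit: int) -> tuple[int, int]:
    """Embed one watermark bit into a pair of integer attribute values."""
    avg = (a1 + a2) // 2          # floor of the mean
    d = a1 - a2                   # original difference
    d_new = 2 * d + bit           # expanded difference carrying the bit
    return avg + (d_new + 1) // 2, avg - d_new // 2


def dew_extract(a1_w: int, a2_w: int) -> tuple[int, int, int]:
    """Recover (original a1, original a2, watermark bit) from a marked pair."""
    avg = (a1_w + a2_w) // 2      # embedding leaves the floor-mean unchanged
    d_new = a1_w - a2_w
    bit = d_new - 2 * (d_new // 2)  # least significant bit, valid for negatives
    d = d_new // 2                # Python's // floors toward minus infinity
    return avg + (d + 1) // 2, avg - d // 2, bit


marked = dew_embed(16, 24, 1)
restored = dew_extract(*marked)
```

With the pair (16, 24) and bit 1, `marked` is `(13, 28)`, whose difference and average are the −15 and 20 quoted in the example, and `restored` is `(16, 24, 1)`. Floor division matters here: truncating division toward zero would break recovery whenever the difference is negative and odd.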

##### 2.2. Firefly Algorithm

Biologically inspired algorithms have become popular in solving global optimization problems such as the Travelling Salesman Problem (TSP) recently. Multiple agents that are affected by each other constitute the base of these algorithms. These algorithms sometimes are referred to as Swarm Intelligence (SI) based algorithms since they simulate the Swarm Intelligence characteristics of biological agents such as fish, birds, and ants. Particle swarm optimization proposed in 1995 used the swarming behavior of fish and birds [19]. Ant colony optimization and artificial bee colony optimization are some examples in this field [20, 21].

Yang proposed the Firefly Algorithm, one of the nature inspired algorithms, to solve NP-hard problems in 2008 [18]. The algorithm uses the flashing patterns and behavior of fireflies. Firefly is a stochastic optimization algorithm in which a global solution is searched using partially randomized movements in order not to get stuck in one of many local solutions. Being a stochastic method, firefly cannot guarantee an optimal solution in deterministic time, but it will eventually converge to a solution in a reasonable amount of time. In fact, firefly is a metaheuristic method in which the trade-off between randomization and local search is controlled by its parameters. Firefly's heuristic depends on the survival of the population, whereas randomized movement avoids local optima. The search for the optimum solution continues as long as improvements in the objective function are possible. The solution in the proposed method corresponds to the selection of both tuples and attributes that minimizes distortion, even though a single objective function is optimized.

There are nearly two thousand firefly species and each has a unique pattern of flashes. Flashing characteristics of the fireflies can be summarized with three rules given below.

*Rule 1.* Each firefly can attract others regardless of its sex. This means that all fireflies are unisex.

*Rule 2.* The brightness of a firefly determines its attractiveness. If two fireflies are flashing, the less bright one will move towards the brighter one. If no firefly is brighter than a particular one, it will move randomly.

*Rule 3.* The objective function determines the brightness of a firefly.

Brightness can be defined in a similar way to the fitness function in a genetic algorithm (GA). For a maximization problem, the brightness is simply proportional to the value of the objective function.

The brightness $I$ of a firefly at a particular location $\mathbf{x}$ could be chosen as $I(\mathbf{x}) \propto f(\mathbf{x})$, where $f$ is the objective function, whereas the attractiveness $\beta$ is judged by other fireflies. The distance $r$ between the fireflies changes the attractiveness. Light intensity decreases with distance from the source. In its simplest form, the light intensity is given by (4): intensity is inversely proportional to the square of the distance. $I_s$ denotes the intensity at the source.

$$I(r) = \frac{I_s}{r^2} \tag{4}$$

When a fixed light absorption coefficient $\gamma$ for a medium is considered, the light intensity changes according to (5). $I_0$ represents the initial light intensity.

$$I(r) = I_0 e^{-\gamma r} \tag{5}$$

Light intensity determines the attractiveness of the firefly for other fireflies. $\beta$ given in (6) defines the attractiveness of the firefly. $\beta_0$ in (6) is the attractiveness at the start position, $r = 0$.

$$\beta(r) = \beta_0 e^{-\gamma r^2} \tag{6}$$

The method assumes that each firefly resides at a point in $D$-dimensional space. Assume that the coordinates of any two fireflies $i$ and $j$ are represented by $\mathbf{x}_i$ and $\mathbf{x}_j$. The Cartesian distance between the two fireflies is calculated as in

$$r_{ij} = \lVert \mathbf{x}_i - \mathbf{x}_j \rVert = \sqrt{\sum_{k=1}^{D} \left( x_{i,k} - x_{j,k} \right)^2} \tag{7}$$

The movement of firefly $i$ in $D$-dimensional space at time $t$ towards another, more attractive firefly $j$ gives its position at time $t + 1$ as defined in (8):

$$\mathbf{x}_i^{t+1} = \mathbf{x}_i^{t} + \beta_0 e^{-\gamma r_{ij}^2} \left( \mathbf{x}_j^{t} - \mathbf{x}_i^{t} \right) + \alpha \boldsymbol{\epsilon}_i^{t} \tag{8}$$

Attractiveness is considered in the second term and random movement is ensured by the third term in (8). $\alpha$ is a randomization factor and $\boldsymbol{\epsilon}_i$ is a vector of random numbers $(\mathrm{rand} - 1/2)$, where rand is a pseudo random number generator that generates uniformly distributed random numbers in $[0, 1]$. Equation (8) shows that fireflies perform a simple random walk for $\beta_0 = 0$.
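The distance, attractiveness, and movement rules can be sketched in Python as below. This is a generic illustration under stated assumptions, not the FFADEW implementation: it uses the commonly cited form of Yang's update, in which attractiveness decays as `beta0 * exp(-gamma * r**2)`, treats lower objective values as brighter (a minimization convention), and leaves the brightest firefly in place instead of moving it randomly. All function and parameter names (`beta0`, `gamma`, `alpha`, `firefly_minimize`) are chosen here for illustration.

```python
import math
import random


def distance(xi, xj):
    """Cartesian distance between two fireflies."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))


def attractiveness(beta0, gamma, r):
    """Attractiveness at distance r; equals beta0 at r = 0."""
    return beta0 * math.exp(-gamma * r * r)


def move_towards(xi, xj, beta0, gamma, alpha, rng):
    """One movement step of firefly xi towards the brighter firefly xj."""
    beta = attractiveness(beta0, gamma, distance(xi, xj))
    # attraction term plus a uniform random step in [-alpha/2, alpha/2]
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]


def firefly_minimize(f, swarm, n_iter=50, beta0=1.0, gamma=1.0,
                     alpha=0.2, rng=None):
    """Run the pairwise attraction loop and return the best position found."""
    rng = rng or random.Random(0)
    swarm = [list(x) for x in swarm]          # do not mutate the caller's list
    for _ in range(n_iter):
        for i in range(len(swarm)):
            for j in range(len(swarm)):
                if f(swarm[j]) < f(swarm[i]):  # j is brighter than i
                    swarm[i] = move_towards(swarm[i], swarm[j],
                                            beta0, gamma, alpha, rng)
    return min(swarm, key=f)
```

Setting `beta0 = 0` removes the attraction term, so each step reduces to the random walk noted above, while `alpha = 0` makes the movement fully deterministic. Since only a firefly that is not currently the brightest ever moves, the best objective value in the swarm cannot get worse from one step to the next in this sketch.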

#### 3. Proposed Method

A new reversible database watermarking algorithm using the Firefly Algorithm, a bioinspired optimization algorithm, is proposed here. The method uses the DEW algorithm to restore the original database after watermark extraction, and it uses the Firefly Algorithm to determine the best candidate attribute pairs in which to embed the watermark. The main advantages of the Firefly Algorithm compared to genetic algorithms are its easy implementation and its run time efficiency. For this reason, we used the Firefly Algorithm in this work.

The method consists of two algorithms: watermark insertion and watermark extraction. The watermark insertion algorithm embeds the watermark information into specially selected tuples with DEW. The Firefly Algorithm determines which attributes on the selected tuples are more appropriate for watermark embedding. The watermark extraction algorithm extracts the embedded watermark information from the database and compares it with the original one. This algorithm also restores the watermarked database to its original state after watermark extraction and verification. The details of the two algorithms are given in the following sections.

##### 3.1. Watermarking Algorithm

The watermarking algorithm consists of three modules, as shown in Figure 1: preprocessing, firefly determination, and watermark embedding. The preprocessing module prepares the database by sorting tuples and columns. The firefly determination module outputs the best firefly for the dataset. The watermark embedding module embeds the watermark data into the dataset according to the best firefly.