Abstract

This paper proposes a blind authentication scheme that identifies duplicated regions in copy-move forgeries using perceptual hashing and a package clustering algorithm. For every fixed-size block of a suspicious image, the discrete cosine transform (DCT) is applied to obtain a DCT coefficient matrix, from which a perceptual hash matrix and then a perceptual hash feature vector are derived in turn. Moreover, a package clustering algorithm is proposed to replace traditional lexicographic sorting in order to improve the detection precision. Similar blocks are identified by matching the perceptual hash feature vectors within each package and its adjacent package. The experimental results show that the proposed scheme can locate irregular tampered regions and multiple duplicated regions in suspicious images even when they are distorted by trace-hiding operations, such as adding white Gaussian noise, Gaussian blurring, adjusting contrast ratio, luminance, and hue, and combinations of these operations.

1. Introduction

Copy-move forgery is a popular digital image tampering technique that is extensively used by forgers. Some regions of a digital image are copied and then pasted into other regions of the same image in order to hide some targets or emphasize some important objects [1], which destroys the authenticity of the image. Since the original regions and the duplicated regions come from the same image, their most important characteristics, such as the color palettes, noises, and dynamic ranges, are compatible with the remainder of the image [2]. One may overlook this malicious operation if forgers deliberately hide the tampering traces. A typical copy-move forgery example is shown in Figure 1, where the traffic flow is exaggerated by duplicating cars. Effective copy-move forgery detection methods are therefore urgently needed to detect and locate the tampered regions in digital images. Blind authentication for copy-move forgery identifies tampered regions in a digital image without any information other than the image itself. Owing to this advantage, it has become a valuable research topic in the image authentication field [3].

Block-based methods and keypoint-based methods are the common techniques for copy-move forgery detection. In block-based methods, a suspicious image is divided into overlapping, fixed-size blocks, and the tampered regions are identified by matching similar feature vectors extracted from the blocks. Fridrich et al. in [1] first proposed a block-based detection scheme using quantized discrete cosine transform (DCT) coefficients, which is one of the landmark methods for copy-move forgery detection. Popescu and Farid in [4] presented a method that uses principal component analysis (PCA) to derive an alternative representation for each image block. However, it is not robust to other attacks and cannot identify small tampered regions. Babak and Stanislav in [5] presented a copy-move forgery detection scheme that extracts image features for overlapping blocks based on blur moment invariants. Cao et al. in [6] exploited the mean of DCT coefficients to propose an algorithm that not only resists blurring and additive noise, but also considers the detection accuracy rate (DAR) and false positive rate (FPR). However, it is weak against hue or contrast ratio adjustments. Thajeel and Sulong in [7] presented an approach that improves the detection precision based on the completed robust local binary pattern. Wang et al. in [8] proposed a copy-move forgery detection scheme that improves the detection precision based on DCT and package clustering algorithms. However, the robustness of its feature vectors received less attention. Zhong et al. in [9] presented a scheme that divides the tampered image into overlapping circular blocks, whose features are extracted by the discrete radial harmonic Fourier moments. This method obtains outstanding performance under geometrical image distortions. Dixit et al. in [10] proposed a method for detecting copy-move forgery using the stationary wavelet transform; the detection accuracy of that method is also considered. Bi and Pun in [11] presented a fast reflective offset guided searching method for image copy-move forgery detection, which aims to reduce the computational complexity. The block-based methods mentioned above need to divide the image into overlapping, fixed-size blocks and then handle each of them. These algorithms can resist some plain postprocessing operations, such as JPEG compression, blurring, and noise interference. However, they do not satisfactorily address the common problem of reducing the number of similar-region matching operations.

Keypoint-based methods rely on the identification of high-entropy image regions [12]. A feature vector is extracted for each keypoint. Since there are far fewer keypoints than blocks, fewer feature vectors have to be computed. Therefore, keypoint-based methods theoretically have lower computational costs for feature vector matching and postprocessing. Amerini et al. in [12, 13] used the scale invariant feature transform (SIFT) to filter, sort, and classify the keypoint pairs for copy-move forgery detection. Li et al. in [14] tried to reduce the number of similar-region matching operations and improve the DAR and FPR by segmenting a suspicious image into nonoverlapping patches. Wang et al. in [15] introduced a keypoint-based passive image detection method based on the Harris detector and region growth technology. It is robust to JPEG compression, gamma adjustment, and luminance enhancement. Li et al. in [16] proposed a hierarchical cluster algorithm based on the maximally stable color region detector and Zernike moments to extract all keypoint features. Wang et al. in [17] presented a method that segments a suspicious image into irregular superpixels, which are classified as smooth, texture, or strong texture. Stable image keypoints can then be extracted from each superpixel. The above-mentioned algorithms have rapidly advanced the copy-move forgery detection field. However, they do not satisfactorily improve the DAR and FPR while reducing the number of matching operations. Resistance to other postprocessing operations, such as adjusting contrast ratio, luminance, and hue, and their hybrid operations, has also received less attention.

Perceptual hashing [18] is a class of one-way mappings from multimedia presentations to perceptual hash values in terms of the perceptual content. It is widely applied to multimedia content identification, retrieval, and authentication. In similar image searching and target tracking, perceptual hash algorithms are applied to generate fingerprints for digital images, which are then compared with each other. In addition, perceptual hash values are robust enough to tolerate transformations or “attacks” on a given input, yet sensitive enough to distinguish between dissimilar files. Such attacks include skew, contrast adjustment, and different compression. Perceptual hash values are analogous if the features are similar [19]. In a copy-move forgery image, the copied regions are similar to the pasted regions. Therefore, perceptual hash algorithms can also be used to generate robust features for detecting the tampered regions.

In this study, a passive authentication scheme based on perceptual hashing is proposed for copy-move forgery detection. The novelty of the proposed scheme includes the following: (1) Using perceptual hashing algorithms, the feature vectors of image blocks are made robust, which improves the DAR and FPR. (2) A package clustering algorithm replaces traditional lexicographic sorting methods to reduce the number of block matching operations, where each package represents a cluster. (3) The proposed method can effectively identify and locate multiple duplicated regions in digital images that may be distorted by adding white Gaussian noise, Gaussian blurring, adjusting contrast ratio, luminance, and hue, and combinations of these operations.

The rest of this paper is organized as follows. Section 2 introduces the proposed method. Section 3 shows the performance of the proposed scheme with a series of experiments. Finally, this paper is concluded in Section 4.

2. The Proposed Scheme

In general, a naturally formed picture does not contain two identical regions unless it contains large smooth areas, such as a blackboard or a patch of blue sky [20]. In this study, we assume that the images do not contain large smooth areas.

If a suspicious image has been tampered with by copy-move forgery, it necessarily contains at least two similar regions, that is, an original region and a copy-move forgery region. As in many existing schemes, the task of passive authentication for copy-move forgery is to detect and locate the tampered regions in suspicious images. The proposed method consists of two main steps, feature extraction and feature matching, which are introduced separately. In the feature extraction step, perceptual hashing algorithms are extended to generate perceptual hash feature vectors that represent the image blocks of a suspicious image. In the feature matching step, a package clustering algorithm replaces the usual lexicographic sorting algorithms to improve the detection precision and reduce the number of feature vector comparisons. Figure 2 shows the framework of the proposed scheme.

2.1. Preprocessing Operation

Let $I$ be a suspicious image. If it is a color image, it should be converted into a gray-scale image by
$$I = 0.299R + 0.587G + 0.114B,$$
where $R$, $G$, and $B$ represent the red, green, and blue components of the color image, respectively, and $I$ represents the pixel value of the gray-scale image.
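The preprocessing step can be illustrated with a minimal Python sketch. It assumes the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114) and represents an image as nested lists of (R, G, B) tuples; both choices, and the name `to_gray`, are illustrative assumptions.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples into rows of gray values.

    The 0.299 / 0.587 / 0.114 weights are the common BT.601 luma weights;
    they are an assumption of this sketch.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


# A 2 x 2 toy image: red, green, blue, and white pixels.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
gray = to_gray(image)
```

Because the three weights sum to one, a white pixel keeps the value 255 and the gray-scale range stays within zero to 255.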

2.2. Feature Extraction Using Perceptual Hashing

In this step, suspicious image is divided into different image blocks. DCT is applied to generate the DCT coefficient matrix for each image block. Finally, perceptual hashing is used to extract a perceptual hash feature vector for each image block according to its generated DCT coefficient matrix. The details of the feature extracting algorithm are shown in Algorithm 1.

Input: A suspicious gray-scale image $I$.
Output: All perceptual hash feature vectors for image blocks in $I$.
Step 1. Suspicious image $I$ of size $M \times N$ is divided into $(M-b+1)(N-b+1)$ overlapping blocks of size $b \times b$, denoted as $B_{i,j}$, where $0 < b \ll M$, $0 < b \ll N$, $1 \le i \le M-b+1$, and $1 \le j \le N-b+1$.
Step 2. For each block $B_{i,j}$
Step 3. The pixel mean of $B_{i,j}$, denoted as $P_{i,j}$, is computed.
Step 4. DCT is applied to generate the coefficient matrix for block $B_{i,j}$, denoted as $C_{i,j}$.
Step 5. The coefficient matrix $C_{i,j}$ is divided into four sub-blocks, denoted as $C^1_{i,j}$, $C^2_{i,j}$, $C^3_{i,j}$, and $C^4_{i,j}$, respectively.
Step 6. The mean of the first sub-block $C^1_{i,j}$ is calculated, denoted as $m_{i,j}$.
Step 7. The perceptual hashing matrix for each sub-block is computed, denoted as $H^k_{i,j}$, where $k \in \{1, 2, 3, 4\}$.
Step 8. Each perceptual hashing matrix $H^k_{i,j}$ is converted into a decimal number, denoted as $d^k_{i,j}$, to represent a feature value for block $B_{i,j}$, where $k \in \{1, 2, 3, 4\}$.
Step 9. The feature vector of block $B_{i,j}$ is created, denoted as $F_{i,j} = (P_{i,j}, d^1_{i,j}, d^2_{i,j}, d^3_{i,j}, d^4_{i,j})$, according to its pixel mean and four feature values.
Step 10. End For

In Step 1, suspicious image $I$ with the size of $M \times N$ pixels is divided into $(M-b+1)(N-b+1)$ overlapping blocks by sliding a square window with the size of $b \times b$ pixels over image $I$ from the upper-left corner down to the lower-right corner, one pixel at a time; that is, adjacent overlapping blocks differ in only one row or column. Each block is denoted as $B_{i,j}$, where $1 \le i \le M-b+1$, $1 \le j \le N-b+1$, and $i$ and $j$ indicate the starting row and column of the block, respectively. Therefore, the original regions and their copy-move forgery regions are also divided into different blocks, among which there is at least one pair of identical or similar blocks. The main task of the proposed scheme is to detect and locate these identical or similar blocks in pairs.
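The sliding-window division can be sketched in a few lines of Python. The function name `overlapping_blocks`, the nested-list image representation, and the 0-based coordinates are assumptions made for illustration.

```python
def overlapping_blocks(gray, b):
    """Yield (i, j, block) for every b x b window, sliding one pixel at a time.

    i and j are the 0-based row and column of the block's upper-left corner.
    """
    rows, cols = len(gray), len(gray[0])
    for i in range(rows - b + 1):
        for j in range(cols - b + 1):
            yield i, j, [row[j:j + b] for row in gray[i:i + b]]


# A 5 x 6 toy image whose pixel values simply enumerate the positions.
gray = [[r * 6 + c for c in range(6)] for r in range(5)]
blocks = list(overlapping_blocks(gray, 4))
# (5 - 4 + 1) * (6 - 4 + 1) = 6 overlapping blocks are produced.
```

The number of blocks grows almost linearly with the number of pixels, which is why the later matching stage tries to limit how many block pairs are compared.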

In Steps 2–10, the feature vector of each block is computed using DCT and perceptual hashing algorithms. Matching similar blocks directly on their pixel values is not ideal, since forgers may distort the content of the tampered images. A better approach is to extract robust features that represent the blocks and then identify similar blocks by matching these features, which strengthens the robustness and improves the detection accuracy of the proposed scheme. In this algorithm, the perceptual hash features that represent the image blocks play this role.

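In Step 3, the pixel mean of image block $B_{i,j}$, denoted as $P_{i,j}$, is calculated as follows:
$$P_{i,j} = \frac{1}{b^2} \sum_{x=1}^{b} \sum_{y=1}^{b} f_{i,j}(x, y),$$
where $b$ is the side length of the block and $f_{i,j}(x, y)$ represents the pixel value of the $x$th row and $y$th column in $B_{i,j}$. For a pair of identical or similar image blocks, the pixel means are also identical or similar.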

In Step 4, DCT is applied to obtain the DCT coefficient matrix of each block $B_{i,j}$, denoted as $C_{i,j}$, where the DCT coefficient matrix $C_{i,j}$ has the same size as block $B_{i,j}$, $1 \le i \le M-b+1$, and $1 \le j \le N-b+1$.

In Step 5, coefficient matrix $C_{i,j}$ is divided into four subblocks. A typical characteristic of DCT is that the energy of an image is concentrated in the low frequency part, while the high frequency coefficients play insignificant roles. This means that not all elements of $C_{i,j}$ are equally important and that the top-left part of $C_{i,j}$ represents most features of block $B_{i,j}$. In the proposed method, each DCT coefficient matrix is equally divided into four subblocks, denoted as $C^1_{i,j}$, $C^2_{i,j}$, $C^3_{i,j}$, and $C^4_{i,j}$, as shown in Figure 3.
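The energy compaction property and the four-way split can be illustrated with a small, dependency-free Python sketch. The orthonormal DCT-II normalization and the ordering of the second to fourth subblocks (top-right, bottom-left, bottom-right) are assumptions, and `dct2` and `split_quadrants` are illustrative names.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of lists."""
    n = len(block)
    # basis[u][x] is the 1-D DCT-II basis function u sampled at position x.
    basis = [
        [math.sqrt((1 if u == 0 else 2) / n)
         * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
         for x in range(n)]
        for u in range(n)
    ]
    # Transform along the rows index first, then along the columns index.
    tmp = [[sum(basis[u][x] * block[x][y] for x in range(n)) for y in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][y] * basis[v][y] for y in range(n)) for v in range(n)]
            for u in range(n)]


def split_quadrants(coeffs):
    """Return four equal subblocks; the first one holds the low frequencies."""
    h = len(coeffs) // 2
    top, bottom = coeffs[:h], coeffs[h:]
    return [
        [row[:h] for row in top],      # top-left (assumed first subblock)
        [row[h:] for row in top],      # top-right
        [row[:h] for row in bottom],   # bottom-left
        [row[h:] for row in bottom],   # bottom-right
    ]


block = [[10] * 8 for _ in range(8)]   # a perfectly flat 8 x 8 block
coeffs = dct2(block)
quads = split_quadrants(coeffs)
```

For the flat block, all of the energy ends up in the single top-left (DC) coefficient, which is the extreme case of the concentration described above.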

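In Steps 6–9, the feature vector of block $B_{i,j}$ is created using the perceptual hashing algorithm. According to the typical characteristic of DCT, the energy of the first subblock $C^1_{i,j}$ can be used to approximately represent the energy of the whole block $B_{i,j}$. This means that the average energy of block $B_{i,j}$ can be approximately represented by the average energy of the first subblock $C^1_{i,j}$. In Step 6, the average energy of subblock $C^1_{i,j}$, denoted as $m_{i,j}$, is calculated as follows:
$$m_{i,j} = \frac{4}{b^2} \sum_{x=1}^{b/2} \sum_{y=1}^{b/2} C^1_{i,j}(x, y),$$
where $C^1_{i,j}(x, y)$ indicates the element of the $x$th row and $y$th column in subblock $C^1_{i,j}$ and $b$ is the side length of the block.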

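In Step 7, the perceptual hash matrices of the four subblocks $C^1_{i,j}$, $C^2_{i,j}$, $C^3_{i,j}$, and $C^4_{i,j}$, denoted as $H^1_{i,j}$, $H^2_{i,j}$, $H^3_{i,j}$, and $H^4_{i,j}$, respectively, are generated according to the average energy of subblock $C^1_{i,j}$, that is, $m_{i,j}$. Therefore, $H^k_{i,j}$ is calculated as follows:
$$H^k_{i,j}(x, y) = \begin{cases} 1, & C^k_{i,j}(x, y) \ge m_{i,j}, \\ 0, & \text{otherwise}, \end{cases}$$
where $k \in \{1, 2, 3, 4\}$, $1 \le x \le b/2$, and $1 \le y \le b/2$. Obviously, the perceptual hash matrix can be considered as a perceptual digest that maps an image block to a binary matrix. It is used to represent the image block.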
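The binarization step can be sketched as follows. The sketch assumes that a coefficient maps to 1 when it is greater than or equal to the mean of the first subblock; the tie-breaking rule and the name `hash_matrices` are assumptions.

```python
def hash_matrices(subblocks):
    """Binarize each subblock against the mean of the first subblock.

    Using >= for the comparison is an assumption of this sketch.
    """
    first = subblocks[0]
    mean = sum(map(sum, first)) / (len(first) * len(first[0]))
    return [[[1 if value >= mean else 0 for value in row] for row in sub]
            for sub in subblocks]


# Four toy 2 x 2 subblocks; the mean of the first one is (8 + 2 + 1 + 1) / 4 = 3.
subblocks = [
    [[8, 2], [1, 1]],
    [[0, 0], [0, 4]],
    [[3, 3], [3, 3]],
    [[-1, 5], [2, 9]],
]
hashes = hash_matrices(subblocks)
```

Note that all four subblocks share one threshold, so the hash of a high-frequency subblock records where its coefficients are unusually large relative to the low-frequency average.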

In Step 8, the perceptual hash matrices are converted into decimal numbers. In practical applications, decimal numbers are easier to calculate with and store than binary matrices. The four perceptual hash matrices $H^1_{i,j}$, $H^2_{i,j}$, $H^3_{i,j}$, and $H^4_{i,j}$ are converted into four decimal numbers, denoted as $d^1_{i,j}$, $d^2_{i,j}$, $d^3_{i,j}$, and $d^4_{i,j}$, respectively, by reading their elements row by row.
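As a minimal sketch of this conversion, the rows of a binary matrix can be concatenated into one bit string and read as an integer. Treating the first element as the most significant bit is an assumption, and `hash_to_int` is an illustrative name.

```python
def hash_to_int(hash_matrix):
    """Concatenate the rows of a binary matrix and read them as one integer."""
    bits = "".join(str(bit) for row in hash_matrix for bit in row)
    return int(bits, 2)


value = hash_to_int([[1, 0],
                     [0, 1]])   # bit string "1001"
```

A 4 × 4 hash matrix fits into 16 bits this way, so a block is summarized by four small integers plus its pixel mean.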

In Step 9, the perceptual hash feature vector of block $B_{i,j}$ is created. Block $B_{i,j}$ has the five values considered above, that is, the pixel mean $P_{i,j}$ and the four decimal numbers $d^1_{i,j}$, $d^2_{i,j}$, $d^3_{i,j}$, and $d^4_{i,j}$. In order to represent block $B_{i,j}$ more accurately, its perceptual hash feature vector, denoted as $F_{i,j}$, is constructed as follows:
$$F_{i,j} = \left(P_{i,j}, d^1_{i,j}, d^2_{i,j}, d^3_{i,j}, d^4_{i,j}\right).$$
Obviously, the perceptual hash feature vector $F_{i,j}$ is both simple and robust. Therefore, it can be considered as the feature vector for image block $B_{i,j}$.

2.3. Similar Region Matching

In the matching stage of existing methods, the feature vectors are first sorted by some sorting algorithm, such as a traditional lexicographic sort, and then block matching is used to detect and locate similar blocks. However, two issues in these existing methods should be improved to achieve better matching results. One is the number of block matching operations, which gives the matching algorithm a high time complexity. The other is the precision of locating duplicated regions, which is unsatisfactory. In our proposed scheme, a package clustering algorithm is proposed to detect and locate the tampered regions with the purpose of improving the detection precision. The details of the proposed similar region matching algorithm are described in Algorithm 2.

Input: All perceptual hash feature vectors $F_{i,j} = (P_{i,j}, d^1_{i,j}, d^2_{i,j}, d^3_{i,j}, d^4_{i,j})$, where $1 \le i \le M-b+1$ and $1 \le j \le N-b+1$, which are the output of Algorithm 1.
Output: A map that includes the detection results.
Step 1. Creating $n$ packages, denoted as $G_1$, $G_2$, $\ldots$, and $G_n$, where $n = \lfloor 255/t \rfloor + 1$ and $t$ is a preset threshold.
Step 2. All perceptual hash feature vectors $F_{i,j}$ are stored into the $n$ packages, respectively, according to the value of $P_{i,j}$.
Step 3. A map is created with the same size as the suspicious image and all its initial pixel values are set to zero.
Step 4. For each package $G_k$ ($1 \le k \le n$)
Step 5. The block pairs contained in $G_k$ will be matched according to their perceptual hash feature vectors and coordinate positions. The values of the corresponding coordinate positions in the map will be set to the same pixel value “255” according to the coordinates of the suspicious image if the block pairs are diagnosed as similar.
Step 6. For each block contained in package $G_k$, it will be matched with all blocks contained in package $G_{k+1}$ if $k < n$ with the same method of Step 5.
Step 7. End For
Step 8. Outputting the map.

In Algorithm 2, Steps 1 and 2 construct a package clustering algorithm that stores all perceptual hash feature vectors into the prepared packages according to the pixel means of blocks. Steps 4–7 compare all perceptual hash feature vectors to detect and locate the similar blocks according to the proposed package matching rule.

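In Step 1, a set of packages is created. Let $t$ be a preset threshold that represents the maximum capacity of all packages, that is, the width of the pixel mean range covered by each package. Therefore, $n = \lfloor 255/t \rfloor + 1$ packages are created, denoted as $G_1$, $G_2$, $\ldots$, and $G_n$, since the suspicious image is a gray-scale image (its pixel range is zero to 255), where $\lfloor \cdot \rfloor$ is a floor function.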

In Step , all perceptual hash feature vectors are stored into the packages. Let be a block and be the perceptual hash feature vector of block . Then, block will be put into package , where . For example, assume , , , and . Block will be put into package , where . This indicates that the pixel mean range of package is .
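Package assignment of this kind can be sketched as follows. The sketch assumes that the package index is obtained by dividing the pixel mean by the package capacity, taking the floor, and adding one; this rule, the capacity value of 4, and the name `build_packages` are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_packages(features, capacity):
    """Group feature vectors by pixel mean into packages of width `capacity`.

    `features` maps block coordinates (i, j) to a feature vector whose first
    entry is the pixel mean. Package indices start at 1; the floor-based
    index is an assumed reading of the package rule.
    """
    packages = defaultdict(list)
    for coord, vector in features.items():
        index = int(vector[0] // capacity) + 1
        packages[index].append((coord, vector))
    return packages


features = {
    (0, 0): (12.4, 9, 0, 0, 0),
    (5, 7): (13.9, 9, 0, 0, 0),
    (2, 2): (200.0, 1, 2, 3, 4),
}
packages = build_packages(features, capacity=4)
```

With a capacity of 4, the two blocks with pixel means 12.4 and 13.9 land in the same package (index 4), while the bright block with mean 200.0 lands in package 51 and is never compared with them.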

For any two image blocks and , their pixel values are similar if the two image blocks are duplicated. Naturally, their average pixel values and are also similar, where or . Therefore, the perceptual hash feature vectors of blocks and will be stored into the same package or two adjacent packages and , where and . Let the average pixel values of blocks and be and , respectively. We have that the pixel mean range of package is and the pixel mean range of package is . The two perceptual hash feature vectors of blocks and will be stored into the two adjacent packages. We need to match the perceptual hash feature vectors to diagnose the similar blocks in the same package and the adjacent package in the proposed similar region matching algorithm.

In Step 3, a map is created to mark the coordinate positions of all duplicated regions. It is the output of Algorithm 2. At the initial state, all of its values are set to zero, which means that no duplicated region has been found yet.

In Step , the similar image blocks that belong to the same package will be located according to their perceptual hash feature vectors and their actual coordinate distance. , all perceptual hash feature vectors contained in package will be compared with each other. Let and be two image blocks such that , , and be their perceptual hash feature vectors, respectively, and be a preset threshold. Blocks and can be considered as similar blocks if such that , where is an exclusive-OR operation for binary strings and and is a function that is used to count the number of “1” in .
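The exclusive-OR comparison of hash values can be sketched as below. The decision rule used here, that every one of the four hash values may differ in at most `max_bits` bits, is an assumption, as are the names `hamming` and `is_similar`.

```python
def hamming(a, b):
    """Number of differing bits between two non-negative integers."""
    return bin(a ^ b).count("1")


def is_similar(vec_a, vec_b, max_bits):
    """Compare the four hash values of two feature vectors.

    The first entry of each vector is the pixel mean and is skipped here.
    Requiring every component to differ in at most `max_bits` bits is an
    assumed decision rule.
    """
    return all(hamming(x, y) <= max_bits
               for x, y in zip(vec_a[1:], vec_b[1:]))


vec_a = (10.0, 0b1001, 0, 15, 0b101)
vec_b = (11.0, 0b1011, 0, 15, 0b100)
close = is_similar(vec_a, vec_b, max_bits=1)
exact = is_similar(vec_a, vec_b, max_bits=0)
```

Counting differing bits rather than comparing the decimal values directly means that a single flipped hash bit counts as a small change, regardless of its position in the binary string.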

In particular, blocks $B_a$ and $B_b$ that are diagnosed as similar blocks should be excluded if their coordinate positions are adjacent in the suspicious image, since adjacent pixels of an image are generally smooth and therefore similar. Therefore, the coordinate distance of the two similar blocks $B_a$ and $B_b$ should be considered. Let $(x_a, y_a)$ and $(x_b, y_b)$ be the coordinates of blocks $B_a$ and $B_b$, respectively. Their actual coordinate distance, denoted as $D$, can be calculated as follows:
$$D = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}.$$
If $D > T_d$, similar blocks $B_a$ and $B_b$ are considered as actual similar blocks, where $T_d$ is a preset threshold. If blocks $B_a$ and $B_b$ are diagnosed as actual similar blocks, the values of the coordinate positions $(x_a, y_a)$ and $(x_b, y_b)$ in the map, which are also the coordinate positions of similar blocks $B_a$ and $B_b$ in the suspicious image, should be marked with the same value, such as 255.
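The distance check and the marking of the map can be sketched as follows. The sketch assumes a Euclidean coordinate distance and a strict "greater than" comparison against the threshold; `confirm_and_mark` and `min_distance` are illustrative names.

```python
import math


def confirm_and_mark(result_map, coord_a, coord_b, min_distance):
    """Mark two similar blocks with 255 if they are far enough apart.

    Returns True when the pair is accepted. Euclidean distance and the
    strict comparison are assumptions of this sketch.
    """
    distance = math.hypot(coord_a[0] - coord_b[0], coord_a[1] - coord_b[1])
    if distance <= min_distance:
        return False  # neighbouring blocks in smooth areas are ignored
    for row, col in (coord_a, coord_b):
        result_map[row][col] = 255
    return True


result_map = [[0] * 6 for _ in range(6)]
far = confirm_and_mark(result_map, (0, 0), (3, 4), min_distance=4)
near = confirm_and_mark(result_map, (1, 1), (1, 2), min_distance=4)
```

Here the first pair is 5 pixels apart and is marked, while the second pair is only 1 pixel apart and is rejected, so the map stays zero at those positions.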

In Step 6, each block that belongs to package $G_k$ will be matched with all blocks that belong to the adjacent package $G_{k+1}$ with the same method of Step 5 if $k < n$, where $n$ is the number of packages.
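The effect of the package matching rule on the number of comparisons can be sketched as follows. The generator below enumerates only pairs from the same package or from two adjacent packages; `candidate_pairs` is a hypothetical helper and the block labels are placeholders.

```python
from itertools import combinations, product


def candidate_pairs(packages):
    """Yield block pairs from the same package or from adjacent packages.

    `packages` maps a package index to a list of block identifiers.
    """
    for k in sorted(packages):
        # Pairs inside package k.
        yield from combinations(packages[k], 2)
        # Pairs between package k and package k + 1 (if it exists).
        yield from product(packages[k], packages.get(k + 1, []))


packages = {1: ["a", "b"], 2: ["c"], 4: ["d", "e"]}
pairs = list(candidate_pairs(packages))
```

For these five blocks only four pairs are compared instead of the ten pairs of an exhaustive comparison, because packages 2 and 4 are not adjacent.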

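Let $B$ and $B'$ be two image blocks and $F$ and $F'$ be the perceptual hash feature vectors of $B$ and $B'$, respectively. We have $F \approx F'$ if $B \approx B'$, where $B = B'$ is a particular case of $B \approx B'$. This can be explained as follows. For blocks $B$ and $B'$, $f(x, y) \approx f'(x, y)$ holds for all $x$ and $y$ if $B \approx B'$, where $f(x, y)$ (resp., $f'(x, y)$) represents the pixel of the $x$th row and $y$th column in $B$ (resp., $B'$). Hence, we have $P \approx P'$, since $P$ and $P'$ are the pixel means of $B$ and $B'$, respectively.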

Let $C$ and $C'$ be the DCT coefficient matrices of $B$ and $B'$, respectively. According to the independence and stability characteristics of DCT [21], we have $C \approx C'$ if $B \approx B'$. Naturally, we also have $C^k \approx C'^k$, where $C^k$ and $C'^k$ are the four subblocks of $C$ and $C'$, respectively, and $k \in \{1, 2, 3, 4\}$. Let $m$ and $m'$ be the mean values of $C^1$ and $C'^1$, respectively. We have $m \approx m'$ since $C^1 \approx C'^1$. Therefore, we have $H^k \approx H'^k$ according to (3), where $H^k$ and $H'^k$ are the perceptual hash matrices of subblocks $C^k$ and $C'^k$, respectively. Note that $H^k$ and $H'^k$ are two binary matrices, which contain only “1” or “0.” It follows that $d^k \approx d'^k$, since the decimal numbers $d^k$ and $d'^k$ are uniquely computed from the two binary matrices $H^k$ and $H'^k$, respectively. Therefore, we have $F \approx F'$ since $P \approx P'$ and $d^k \approx d'^k$, where $k \in \{1, 2, 3, 4\}$. This indicates that image blocks $B$ and $B'$ may be a similar block pair if their perceptual hash feature vectors $F$ and $F'$ are similar. In order to authenticate a suspicious image, we should examine all blocks of the suspicious image by comparing their perceptual hash feature vectors.

3. Experiment and Analysis

In this section, the performance of the proposed scheme is tested and analyzed with many suspicious images drawn from three image databases. The first one is the Columbia photographic images and photorealistic computer graphics database [22], which is open to the passive-blind image authentication research community and contains about 1200 images. We used Photoshop 8.0 to tamper with the images, and all tampered suspicious images form the first experiment database. The second database contains the two datasets MICC-F2000 and MICC-F220 that are introduced by Serra in [23]. The two datasets provide 1110 original images and 1110 tampered suspicious images with copy-move forgery. The original images contain animals, plants, men, artifacts, and natural environment. Moreover, to further evaluate the performance of the proposed scheme, 200 supplementary images were downloaded from the Internet and tampered with by copy-move forgery to form the third database.

3.1. Evaluation Criteria Introduction

To evaluate the performance of copy-move forgery detection methods, researchers usually consider their test results at two different levels, that is, image level and pixel level [3]. The image level mainly focuses on whether an image is detected as tampered or not. Let $T_P$ be the number of tampered images that are correctly detected, $F_P$ be the number of images that are erroneously detected to be tampered images, and $F_N$ be the number of falsely missed forgery images. The precision ratio $p$ and recall ratio $r$ [3] can be calculated by the following formulas:
$$p = \frac{T_P}{T_P + F_P}, \qquad r = \frac{T_P}{T_P + F_N},$$
where the precision ratio $p$ denotes the probability of a detected forgery being truly a forgery and the recall ratio $r$ denotes the probability of a forgery not being missed.
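As a worked example of the image-level criteria, the following Python sketch computes the precision and recall ratios from the three counts. The counts 90, 10, and 30 are made-up illustration values, not experimental results, and `precision_recall` is an illustrative name.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Image-level precision and recall from detection counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Hypothetical counts: 90 forgeries found, 10 false alarms, 30 forgeries missed.
p, r = precision_recall(true_pos=90, false_pos=10, false_neg=30)
```

In this example the precision is 0.9 and the recall is 0.75: most alarms are genuine, but a quarter of the forgeries go undetected.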

The pixel level is used to evaluate the accuracy of the located duplicated regions. Let $\psi_s$ and $\psi_t$ be the pixels of an original region and a copy-move region in a suspicious image, respectively, and $\tilde{\psi}_s$ and $\tilde{\psi}_t$ be the pixels of an original region and a copy-move region in a detected result image, respectively. The detection accuracy rate (DAR) and false positive rate (FPR) are calculated as follows:
$$\mathrm{DAR} = \frac{|\psi_s \cap \tilde{\psi}_s| + |\psi_t \cap \tilde{\psi}_t|}{|\psi_s| + |\psi_t|}, \qquad \mathrm{FPR} = \frac{|\tilde{\psi}_s - \psi_s| + |\tilde{\psi}_t - \psi_t|}{|\tilde{\psi}_s| + |\tilde{\psi}_t|},$$
where “$|\cdot|$” means the area of a region, “$\cap$” means the intersection of two regions, and “−” means the difference of two regions. In this sense, the DAR is the proportion of the really duplicated pixels that are correctly identified, and the FPR is the proportion of the identified pixels that do not actually belong to the really duplicated regions, over all suspicious images. The four criteria indicate how precisely a scheme can locate copy-move regions. With them, the performance of the proposed scheme can be analyzed at the image level and the pixel level.
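The pixel-level criteria can be sketched with Python sets of pixel coordinates. The region contents below are toy values chosen for illustration, and `dar_fpr` is an illustrative name.

```python
def dar_fpr(true_src, true_dst, found_src, found_dst):
    """Pixel-level DAR and FPR from sets of (row, col) coordinates."""
    dar = (len(true_src & found_src) + len(true_dst & found_dst)) / (
        len(true_src) + len(true_dst))
    fpr = (len(found_src - true_src) + len(found_dst - true_dst)) / (
        len(found_src) + len(found_dst))
    return dar, fpr


true_src = {(0, 0), (0, 1), (1, 0), (1, 1)}     # real original region
true_dst = {(5, 5), (5, 6), (6, 5), (6, 6)}     # real copy-move region
found_src = {(0, 0), (0, 1), (1, 0), (2, 2)}    # one miss, one false pixel
found_dst = set(true_dst)                       # detected perfectly
dar, fpr = dar_fpr(true_src, true_dst, found_src, found_dst)
```

Here 7 of the 8 really duplicated pixels are identified (DAR = 0.875), and 1 of the 8 identified pixels lies outside the duplicated regions (FPR = 0.125).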

3.2. Effectiveness and Accuracy

In this experiment, 400 color images are selected to test the effectiveness and accuracy of the proposed scheme, including 100 original images, 100 forgery images, and another 200 images that are tampered with Photoshop 8.0. None of the tampered images undergoes any postprocessing operation. Owing to space constraints, only a part of the experimental results is shown in Figure 4. The DAR and FPR are calculated to illustrate the performance of the proposed scheme. In Figure 4, the DAR is generally greater than 0.85 and the FPR remains low. This indicates that the duplicated regions can be detected using the proposed scheme even when they are irregular. Table 1 shows the comparison between the proposed scheme and the existing schemes presented in [6, 7, 9]. It indicates that the precision ratio $p$ and recall ratio $r$ of the proposed scheme are better.

3.3. Robustness Test

In addition to plain copy-move forgery, the detection of tampered images that are attacked by postprocessing operations is also considered. Therefore, a series of experiments was carried out to analyze the overall performance of the proposed scheme, involving 1000 different suspicious images drawn from the three databases. Five kinds of attacks are considered, that is, adding white Gaussian noise (AWGN), adjusting contrast ratio (ACR), adjusting luminance (AL), adjusting hue (AH), and Gaussian blurring (GB), as well as their hybrid operations. Table 2 presents the parameters for the five kinds of attacks and Figure 5 shows a part of the experimental results for the proposed scheme.

In this experiment, the proposed scheme is evaluated by DAR and FPR at the pixel level. The results indicate that the proposed scheme can locate multiple duplication regions although the suspicious images are attacked with different postprocessing operations.

In order to quantitatively evaluate the robustness of the proposed algorithm and analyze its ability to resist different image distortions, 100 tampered images are selected from the three databases. These tampered images are distorted by the five kinds of attacks shown in Table 2, which yields 500 tampered images to be detected in this experiment, 100 for each kind of attack. Tables 3–7 show the detection results with the overall averages of the precision ratio $p$, the recall ratio $r$, DAR, and FPR. The robustness of the proposed scheme is thus evaluated at both the image level and the pixel level.

Tables 3 and 4 show that the detection results of the proposed scheme are satisfactory for suspicious images attacked by adding white Gaussian noise and Gaussian blurring, even when the suspicious images have poor quality (SNR = 45 or ). Only 14 of the 600 tampered images fail to be detected. The detection results for tampered images that are distorted by adjusting contrast ratio, luminance, and hue with different parameters are shown in Tables 5, 6, and 7, respectively. From these three tables we can conclude that the proposed scheme also performs well under attacks that adjust contrast ratio, luminance, and hue.

3.4. Performances Comparison

In the last experiment, the performance of the proposed scheme is compared with the schemes presented in [6, 7, 9]. In this experiment, 400 tampered images are randomly selected from the three databases and tested with the proposed scheme and with the schemes provided in [6, 7, 9], respectively. Figure 6 shows the performance comparison of these schemes with the overall averages of DAR and FPR for the 400 tampered images. We can see that the scheme proposed in [9] has the best detection results for the two kinds of attacks that add white Gaussian noise and Gaussian blurring. However, its performance clearly drops as the intensity of these attacks is gradually increased. In contrast, the proposed scheme is more robust against the various attacks. In most cases, the proposed scheme also achieves better results for the other three kinds of attacks, which adjust contrast ratio, luminance, and hue. Moreover, the proposed scheme has the lowest FPR, which means that few pixels outside the duplicated regions are falsely marked in the selected suspicious images. The precision of the proposed scheme is higher than that obtained in [6, 7, 9].

The experimental results show that the proposed method can locate the tampered regions in a tampered image even when it is distorted by trace-hiding operations, such as adding white Gaussian noise, Gaussian blurring, adjusting contrast ratio, luminance, and hue, and their hybrid operations. The proposed forensic technique can be used in politics, military, jurisprudence, and academic research. For example, suppose that a journalist takes a photo of a traffic accident and decides that the photo would have more impact if a crowd appeared in it. He can use image processing tools to copy some people from another part of the photo, paste them into the scene, and add white Gaussian noise to conceal the tampering traces. As a result, the authenticity of the photo is broken. The news organization can examine this photo with the proposed scheme to verify its authenticity before the news is reported.

4. Conclusion

In this study, a passive authentication scheme is proposed based on perceptual hashing and package clustering algorithms to detect and locate the duplicated regions for copy-move forgery. The experimental results show that the proposed scheme based on perceptual hashing algorithms is robust to some special attacks, such as adjusting contrast ratio, luminance, and hue. Representing each image block by a feature vector built from perceptual hash strings also resists some conventional attacks, such as adding white Gaussian noise and Gaussian blurring. The proposed package clustering algorithm, which replaces traditional lexicographic sorting algorithms, improves the performance of the proposed scheme. The evaluation criteria from the experiments, that is, the precision ratio $p$, the recall ratio $r$, DAR, and FPR, show that the proposed scheme performs better than the compared schemes, but it also has some weaknesses. For example, the time complexity is still unsatisfactory because the image must first be divided into overlapping blocks. Furthermore, the proposed scheme cannot resist some complex attacks, such as block rotation and scaling. In future work, we will focus on reducing the time complexity and extending the robustness to more kinds of complex attacks, such as rotation and scaling.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) under Grant no. U1536110.