Abstract

The demand for verifying the authenticity of images has grown considerably now that advanced image editing software is widely used. Region duplication forgery is one of the most common and straightforward tampering attacks. Several methods have been developed to detect and locate the tampered region, but most of them fail when the duplicated region is rotated or flipped before being pasted. In this paper, an efficient method based on Harris feature points and local binary patterns is proposed. First, the image is filtered with a pixelwise adaptive Wiener method, and then dense Harris feature points are employed in order to obtain a sufficient number of feature points with approximately uniform distribution. Feature vectors for a circle patch around each feature point are extracted using local binary pattern operators, and similar Harris points are matched based on their feature vectors using the best-bin-first (BBF) algorithm. Finally, the RANSAC algorithm is employed to eliminate possible erroneous matches. Experimental results demonstrate that the proposed method can effectively detect region duplication forgery, even when an image is distorted by rotation, flipping, blurring, additive white Gaussian noise (AWGN), JPEG compression, or combinations of these operations, and that it is especially effective for forgeries in flat areas with little visual structure.

1. Introduction

Nowadays, with the development of digital image technologies and the widespread use of powerful image editing software, even people who are not experts in image processing can easily fake an image without leaving any visible tampering clues. Digital image forgeries, which seriously undermine the credibility of photographic images as definite records of events, have become a widespread problem that affects social and legal systems, forensic investigations, intelligence services, and security and surveillance systems. In order to restore people’s confidence in the authenticity of digital images, image forensics, which aims to reveal forgery operations in digital images, is receiving more and more attention.

In recent years, many image forgery detection techniques have been proposed, which can be broadly classified into two categories: active approaches and passive approaches. Active image forensic techniques, represented by digital watermarking [1, 2], require prior knowledge about the original image and are therefore not automatic. In addition, digital watermarking has the drawback that an imperceptible digital code (a watermark) must be inserted at the time of recording, which restricts this approach to specially equipped digital cameras. In contrast, passive forensics aims at identifying the authenticity of an image without prior knowledge and in the absence of watermarks. It works on the assumption that, even though tampered images may not reveal any visual artifacts, their underlying statistics differ from those of the original images. Owing to this advantage, passive image forensics is regarded as a promising research direction in the field of image forensics.

Among forgery techniques that use typical image processing tools, region duplication, also called copy-move, is the most common type of image forgery: a region of an image is copied and then pasted to another nonintersecting region of the same image in order to conceal an important element or to emphasize a particular object. By its nature, region duplication forgery leaves at least two similar regions in the tampered image, which is uncommon in natural images and can therefore be used to detect this specific artifact. In [3], Weiqi et al. proposed the model of region duplication forgery on which most existing detection methods are based. Since duplicated regions come from the same image, they have similar properties such as texture, color, and noise. In practical situations, however, several intermediate operations and postprocessing operations may be involved. The intermediate operations may be rotation, flipping, scaling, or illumination modification. The postprocessing operations include noise addition, JPEG compression, and blurring. A faked image may combine two or more of these operations, which is a direct challenge to most existing techniques.

In this paper, we propose a passive detection scheme for region duplication image forgery based on Harris corner points and local binary patterns. Experimental results show that the proposed method can effectively detect region duplication forgery, even when an image is distorted by rotation, flipping, blurring, AWGN, JPEG compression, or combinations of these operations, and that it is especially effective for forgeries in flat areas with little visual structure.

The rest of the paper is organized as follows. In Section 2, the related works on region duplication forgery detection are introduced. Section 3 briefly reviews Harris corner points and local binary patterns. In Section 4, the proposed algorithm is described in detail. The experimental results are given and the corresponding analysis is discussed in Section 5. The conclusion is drawn in Section 6.

2. Related Work

In the last decade, many passive techniques for region duplication forgery detection have been proposed, which can be grouped into two categories: block-based methods [3–10] and keypoint-based methods [11–15]. Fridrich et al. [4] first analyzed the exhaustive search and then proposed a block matching detection scheme based on quantized Discrete Cosine Transform (DCT) coefficients. In order to make this algorithm more robust and efficient, Huang et al. [9] and Cao et al. [10] each proposed an improved DCT-based detection method that reduced the dimension of the feature vector. Popescu and Farid [5] proposed a similar method which represented image blocks using Principal Component Analysis (PCA) instead of DCT. Weiqi et al. [3] extracted color features as well as a special intensity ratio to form a block characteristics vector. A different approach was presented by Xiaobing and Shengmin [6], in which the features were represented by the Singular Value Decomposition (SVD). Guohui et al. [7] proposed to decompose the image into four subbands using the Discrete Wavelet Transform (DWT) and then apply SVD to the blocks. However, when images are manipulated through geometric transforms such as rotation, flipping, or scaling, all of the above-mentioned methods cease to be effective. To address this problem, Bayram et al. [8] applied the Fourier-Mellin Transform (FMT) to each block and projected the FMT values to one dimension to form the feature vector. According to their experimental results, however, the FMT-based method can only detect duplicated regions with slight rotation. Bravo-Solorio and Nandi [16] proposed a scheme based on log-polar coordinates to detect forgery regions, even when the duplicated regions have undergone flipping, rotation, and scaling. Nevertheless, since the method depends on the pixel values, it is sensitive to changes in them. Almost all of the above-mentioned methods are block-based: they attempt to find an effective and robust representation of each block that is insensitive to common postprocessing and intermediate operations.

In contrast to block-based methods, keypoint-based methods rely on the identification and selection of high-entropy image regions. In [11–13], approaches that extract keypoints with the Scale-Invariant Feature Transform (SIFT) were proposed to detect the forgery, owing to the robustness of SIFT to several geometric transforms such as rotation and scaling. However, SIFT-based schemes are still limited in detection performance: keypoints can only be extracted from peculiar points of the image, and, according to our experimental results, the schemes are not robust to some operations such as blurring and flipping. Xu et al. [14] and Shivakumar and Baboo [15] proposed another keypoint-based method which uses Speeded Up Robust Features (SURF) to approximately show the duplicated regions in forged images. The main drawback of most keypoint-based methods is that copied regions are often only sparsely covered by matched keypoints. Thus they do not provide the exact extent and location of the detected duplicated region but only display the matched keypoints. Furthermore, if the copied region exhibits little structure, the region may be missed completely [17].

Most existing methods are typically evaluated on simple forgeries in which human viewers have no trouble identifying the duplicated regions, or on low resolution images, which are a far cry from realistic high resolution tampered images. Their detection performance on challenging realistic forgeries is far from certain.

3. Theoretical Background

3.1. Harris Corner Detector

The Harris corner detector [18] is a widely used interest point detector which has been applied successfully in several image processing [19, 20] and robotic vision [21, 22] applications, since Harris feature points are stable under most attacks, such as rotation, added noise, and illumination change. The Harris corner detector is based on the underlying assumption that feature points are associated with maxima of the local autocorrelation function.

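For a given image $I$, its autocorrelation matrix $M$ at point $(x, y)$ can be calculated as follows:

$$M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix},$$

where $I_x$ and $I_y$ are the respective derivatives of pixel intensity in the $x$ and $y$ directions at point $(x, y)$. $w(x, y)$ is the weighting function, usually of circular Gaussian form:

$$w(x, y) = \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right).$$

Harris proposed a measure response to detect corners of an image:

$$R = \det(M) - k \cdot \operatorname{tr}(M)^2,$$

where $\det(M)$ is the determinant, $\operatorname{tr}(M)$ is the trace, and $k$ is a scalar value empirically chosen from the range $[0.04, 0.06]$. Corner points are identified as local maxima of the Harris measure response that exceed a specified threshold:

$$C = \{(x_c, y_c) \mid R(x_c, y_c) > R(x_i, y_i)\ \forall (x_i, y_i) \in W(x_c, y_c),\ R(x_c, y_c) > T\},$$

where $C$ is the set of all corner points, $R(x, y)$ is the Harris measure response calculated at point $(x, y)$, $W(x_c, y_c)$ is an 8-neighbor set centered around the point $(x_c, y_c)$, and $T$ is a specified threshold.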
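As an illustration of the measure response and the local-maximum rule, the following is a minimal pure-Python sketch, not the authors' MATLAB implementation. The central-difference gradients, the window radius of 2, $\sigma = 1$, and $k = 0.04$ are assumptions chosen for the example, and the function names are hypothetical.

```python
import math

def harris_response(img, sigma=1.0, k=0.04, radius=2):
    """Harris measure R = det(M) - k * tr(M)^2 for every interior pixel.

    img is a list of rows of gray values. Gradients use central differences
    and the window is a (2*radius+1)^2 Gaussian; both are assumptions.
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    # Circular Gaussian weights w(x, y) = exp(-(x^2 + y^2) / (2 sigma^2)).
    window = [(dy, dx, math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)))
              for dy in range(-radius, radius + 1)
              for dx in range(-radius, radius + 1)]
    resp = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            a = b = c = 0.0  # entries of M: [[a, b], [b, c]]
            for dy, dx, wt in window:
                gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                a += wt * gx * gx
                b += wt * gx * gy
                c += wt * gy * gy
            resp[y][x] = (a * c - b * b) - k * (a + c) ** 2
    return resp

def corner_points(resp, threshold=0.0):
    """Pixels whose response exceeds the threshold and all 8 neighbors."""
    h, w = len(resp), len(resp[0])
    neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                 (0, 1), (1, -1), (1, 0), (1, 1)]
    corners = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            r = resp[y][x]
            if r > threshold and all(r > resp[y + dy][x + dx]
                                     for dy, dx in neighbors):
                corners.append((x, y))
    return corners

# Synthetic 16x16 image: dark background, bright square in the lower right.
image = [[1.0 if (x >= 8 and y >= 8) else 0.0 for x in range(16)]
         for y in range(16)]
response = harris_response(image)
corners = corner_points(response)  # threshold 0 corresponds to "dense" points
```

On this synthetic image the response is positive at the square's corner, negative along its straight edge, and zero in the flat background, which is the behavior the measure is designed to have.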

In the process of Harris feature point extraction, the threshold $T$ determines the number of Harris feature points. The larger the threshold is, the fewer feature points are obtained. Conversely, the smaller the threshold is, the more feature points are obtained and the more densely they are distributed. In order to make the proposed algorithm effective even when the duplicated region lies in a flat area with little visual structure or is small, we propose to employ dense Harris feature points, that is, to set the threshold to zero, so that a large number of Harris feature points with approximately uniform distribution are obtained, which enhances the robustness of the algorithm.

3.2. Local Binary Pattern

Local Binary Pattern (LBP), proposed by Ojala et al. [23], is a powerful means of texture description which has gained increasing attention in many image analysis applications by virtue of its low computational complexity, invariance to monotonic grayscale changes, and texture description ability. The LBP operator can be seen as a unified approach to statistical and structural texture analysis, since it describes each pixel by the relative gray levels of its neighboring pixels. Figure 1 illustrates the calculation of the original LBP for one pixel with a $3 \times 3$ neighboring block. The eight neighbors are labeled by thresholding with the central pixel value, weighted with powers of two, and then summed to obtain a new value assigned to the central pixel.
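The thresholding and weighting just described can be sketched in a few lines of Python. The visiting order of the eight neighbors (clockwise from the top-left pixel) and the sample block are assumptions made for this example; any fixed order yields a valid LBP code.

```python
def lbp_3x3(block):
    """Original LBP code of the center pixel of a 3x3 block of gray values."""
    center = block[1][1]
    # Neighbor order is an assumption: clockwise, starting at the top-left.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (row, col) in enumerate(order):
        if block[row][col] >= center:  # threshold against the center value
            code += 1 << p             # weight the bit with 2^p
    return code

# Hypothetical sample block with center value 6.
sample_block = [[6, 5, 2],
                [7, 6, 1],
                [9, 8, 7]]
sample_code = lbp_3x3(sample_block)  # bits 0, 4, 5, 6, 7 are set -> 241
```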

Using circular neighborhoods and linear interpolation, LBP can be extended to allow the choice of any radius $R$ and any number of pixels $P$ in the neighborhood, forming a $(P, R)$ neighborhood as illustrated in Figure 2. Denote the central pixel at position $(x_c, y_c)$. With $P$ equally spaced neighborhood pixels on a circle of radius $R$, LBP is calculated by

$$\mathrm{LBP}_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0, \end{cases}$$

where $g_c$ and $g_p$ correspond to the gray values of the central pixel and the $p$th neighboring pixel, respectively.
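A minimal sketch of the circular operator follows, assuming bilinear interpolation for sampling points that do not fall on pixel centers and a counterclockwise sampling order starting on the positive $x$ axis. Both conventions, and the helper names, are assumptions of this example rather than details taken from the paper.

```python
import math

def bilinear(img, x, y):
    """Bilinearly interpolated gray value at the real-valued position (x, y)."""
    h, w = len(img), len(img[0])
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom

def lbp_bits(img, xc, yc, P=8, R=1.0):
    """Thresholded neighbors s(g_p - g_c), p = 0..P-1, on a circle of radius R."""
    gc = img[yc][xc]
    bits = []
    for p in range(P):
        angle = 2.0 * math.pi * p / P
        # Rounding removes tiny trigonometric errors such as cos(pi/2) != 0.
        x = round(xc + R * math.cos(angle), 9)
        y = round(yc - R * math.sin(angle), 9)
        bits.append(1 if bilinear(img, x, y) >= gc else 0)
    return bits

def lbp_code(img, xc, yc, P=8, R=1.0):
    """LBP_{P,R} code: thresholded neighbors weighted by powers of two."""
    return sum(bit << p for p, bit in enumerate(lbp_bits(img, xc, yc, P, R)))

# Two hypothetical 5x5 patches: a dark center on a bright background and
# a bright center on a dark background.
bright = [[10.0] * 5 for _ in range(5)]
bright[2][2] = 0.0
dark = [[0.0] * 5 for _ in range(5)]
dark[2][2] = 10.0
```

The caller is expected to keep the circle inside the image, that is, `xc` and `yc` at least `R` pixels away from the border.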

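For a given image of size $m \times n$, the normalized histogram of LBP codes is commonly used as a feature vector, which is computed over the whole image by

$$H(k) = \frac{1}{m\,n} \sum_{i=1}^{m} \sum_{j=1}^{n} f\big(\mathrm{LBP}_{P,R}(i, j), k\big), \quad k \in [0, K], \qquad f(x, y) = \begin{cases} 1, & x = y, \\ 0, & \text{otherwise}, \end{cases}$$

where $K$ is the maximal LBP pattern value.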

The final LBP feature of an image is obtained by computing the LBP code for each pixel within the image and building a histogram of these codes. The LBP feature is a good local image region descriptor, since it is very fast to calculate and is invariant to monotonic illumination changes. However, the drawback of the LBP feature lies in the high dimensionality of the histograms produced by the LBP codes [24]. Let $P$ be the total number of neighboring pixels; then the LBP feature has $2^P$ distinct values, resulting in a $2^P$-dimensional histogram. A popular dimensionality reduction method for LBP is “uniform patterns,” proposed by Ojala et al. in [23], which are considered to convey some fundamental properties of texture. A local binary pattern is called uniform, denoted as $\mathrm{LBP}^{u2}_{P,R}$, if it contains at most two bitwise transitions from 0 to 1 or vice versa when the binary string is considered circularly. For $P$ neighboring pixels, uniform patterns lead to a histogram of $P(P-1)+3$ dimensions. When “uniform patterns” codes are rotated to their minimum values, denoted by $\mathrm{LBP}^{riu2}_{P,R}$ [23], the total number of patterns reduces to $P+2$:

$$\mathrm{LBP}^{riu2}_{P,R} = \begin{cases} \sum_{p=0}^{P-1} s(g_p - g_c), & U(\mathrm{LBP}_{P,R}) \le 2, \\ P + 1, & \text{otherwise}, \end{cases}$$

where the $U$ value of an LBP pattern is defined as the number of spatial transitions (bitwise 0 and 1 changes) in that pattern. Therefore,

$$U(\mathrm{LBP}_{P,R}) = \left| s(g_{P-1} - g_c) - s(g_0 - g_c) \right| + \sum_{p=1}^{P-1} \left| s(g_p - g_c) - s(g_{p-1} - g_c) \right|.$$
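The pattern counts quoted above can be checked by enumeration. The sketch below takes a list of thresholded bits $s(g_p - g_c)$ as input; the function names are hypothetical, and $P = 8$ is used only as an example.

```python
from itertools import product

def transitions(bits):
    """U value: number of circular 0/1 transitions in the bit string."""
    # bits[p - 1] wraps around to bits[P - 1] when p == 0.
    return sum(bits[p] != bits[p - 1] for p in range(len(bits)))

def riu2(bits):
    """Rotation invariant uniform code: number of ones, or P + 1 if non-uniform."""
    P = len(bits)
    return sum(bits) if transitions(bits) <= 2 else P + 1

# Enumerate all 2^P patterns for P = 8.
P = 8
all_patterns = list(product([0, 1], repeat=P))
uniform_count = sum(1 for bits in all_patterns if transitions(bits) <= 2)
riu2_values = sorted({riu2(bits) for bits in all_patterns})
```

For $P = 8$ there are $P(P-1)+2 = 58$ uniform patterns (the $u2$ histogram adds one bin for all non-uniform patterns, giving 59), and the $riu2$ mapping takes exactly $P + 2 = 10$ distinct values.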

4. The Proposed Method

In this section, the proposed method for region duplication image forgery based on Harris feature points and local binary patterns is described in detail. The flow diagram of our algorithm is shown in Figure 3. The whole detection steps are given as follows.

Step 1 (preprocessing the input image). Our algorithm operates on gray level images. A color image in the RGB model is first converted to a grayscale image using the standard formula $Y = 0.299R + 0.587G + 0.114B$, where $R$, $G$, and $B$ are the three channels of the input color image and $Y$ is its luminance component.
As mentioned before, several postprocessing operations may be involved in practical region duplication forgery, such as noise addition, JPEG compression, or blurring. It is well known that high frequency components are not stable when the image is distorted by these operations, while low frequency features are more resistant to them. Thus, in the preprocessing stage, we filter the input image with a pixelwise adaptive Wiener method based on statistics estimated from a local neighborhood of each pixel. Our experiments show that this filtering significantly improves detection performance. They also show that repeated filtering brings further improvement, especially when the input image suffers from severe AWGN or JPEG compression: for strong JPEG compression or AWGN with low SNR, filtering five times works best, whereas for moderate distortions a single pass is sufficient.
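The preprocessing step can be sketched as follows. The Wiener part follows the usual local-statistics formulation of a pixelwise adaptive Wiener filter; the $3 \times 3$ window, the edge replication at the borders, and the estimation of the noise power as the mean of the local variances are assumptions of this sketch, not parameters taken from the paper.

```python
def to_gray(rgb):
    """Luminance Y = 0.299 R + 0.587 G + 0.114 B for rows of (R, G, B) tuples."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]

def adaptive_wiener(img, half=1):
    """One pass of a pixelwise adaptive Wiener filter with a (2*half+1)^2 window."""
    h, w = len(img), len(img[0])

    def window(y, x):
        # Border pixels are replicated (an assumption of this sketch).
        return [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)]

    means = [[0.0] * w for _ in range(h)]
    variances = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = window(y, x)
            mu = sum(vals) / len(vals)
            means[y][x] = mu
            variances[y][x] = sum((v - mu) ** 2 for v in vals) / len(vals)

    # Noise power estimated as the average local variance (assumption).
    noise = sum(sum(row) for row in variances) / (h * w)

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            mu, var = means[y][x], variances[y][x]
            denom = max(var, noise)
            gain = max(var - noise, 0.0) / denom if denom > 0 else 0.0
            out[y][x] = mu + gain * (img[y][x] - mu)
    return out

def preprocess(rgb, passes=1):
    """Grayscale conversion followed by one or more Wiener passes."""
    gray = to_gray(rgb)
    for _ in range(passes):
        gray = adaptive_wiener(gray)
    return gray
```

Calling `preprocess(rgb, passes=5)` corresponds to the repeated filtering described above for severe distortions, and `passes=1` to the moderate case.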

Step 2 (Harris feature point detection and feature extraction). Harris feature points are detected in the filtered image. As described in Section 3.1, dense Harris feature points are employed in order to obtain a sufficient number of feature points with approximately uniform distribution. After the location coordinates of each feature point are obtained, LBP is applied to each pixel in a circle patch with a radius of 10 around each feature point. In the proposed algorithm, by using rotation invariant uniform LBP ($\mathrm{LBP}^{riu2}_{P,R}$) with various combinations of $P$/$R$ values, we can realize operators for any quantization of angular space and for any spatial resolution, and combine the information provided by multiple LBP operators.
In our experiments, three variants of rotation invariant uniform LBP, each with a different $(P, R)$ setting, are applied to the circle patch around each feature point to extract the features. For the circle patch with a radius of 10 around the $i$th feature point, three histograms of rotation invariant uniform LBP codes, one per operator, are used as feature vectors. Each feature vector is normalized to unit length. The extracted feature vectors are stored in separate feature matrices: if the total number of Harris feature points is $N$, we obtain three feature matrices with $N$ rows each, where the number of columns of each matrix is the histogram length of the corresponding operator ($P + 2$ bins for an operator with $P$ neighbors).
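As a sketch of the per-point feature extraction, the function below builds the histogram of precomputed $riu2$ codes inside a circular patch and normalizes it to unit (Euclidean) length. It assumes that `codes` is a 2D list holding the $riu2$ code of every pixel for one operator, and that the patch is simply clipped at the image border; both are assumptions of this example.

```python
import math

def patch_histogram(codes, cx, cy, radius, n_bins):
    """Unit-length histogram of LBP codes in the disc of given radius at (cx, cy)."""
    h, w = len(codes), len(codes[0])
    hist = [0.0] * n_bins
    for y in range(max(0, cy - radius), min(h, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(w, cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                hist[codes[y][x]] += 1.0
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Hypothetical 25x25 code map with values 0..9 (P = 8, so P + 2 = 10 bins).
code_map = [[(x + y) % 10 for x in range(25)] for y in range(25)]
feature = patch_histogram(code_map, cx=12, cy=12, radius=10, n_bins=10)
```

Running this once per feature point and per operator, and stacking the resulting vectors row by row, yields the three feature matrices described above.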

Step 3 (feature matching). In the feature matching step, similar Harris points are matched based on their feature vectors using the best-bin-first (BBF) algorithm [25] in order to determine the duplicated regions correctly. For a Harris feature point at location $\mathbf{x}_i$ with feature vector $\mathbf{f}_i$, we match it with the point at location $\mathbf{x}_j$ whose feature vector $\mathbf{f}_j$ is the nearest neighbor of $\mathbf{f}_i$ measured with the $L_2$ (Euclidean) distance. It is well known that, due to the smoothness of natural images, the best match of a feature point usually lies within its close spatial adjacency. Thus, in order to avoid matching a feature point with neighbors from the same region, we perform the search outside a circular window of fixed size centered at the feature point. Only pairs of points with distinctive similarity are kept in the matching step. Specifically, we require that, for any feature vector $\mathbf{f}_k$ other than $\mathbf{f}_i$ and $\mathbf{f}_j$, the distance between $\mathbf{f}_i$ and $\mathbf{f}_j$ be smaller than that between $\mathbf{f}_i$ and $\mathbf{f}_k$ by at least a threshold $\tau$:

$$\|\mathbf{f}_i - \mathbf{f}_j\|_2 \le \tau \cdot \|\mathbf{f}_i - \mathbf{f}_k\|_2, \tag{11}$$

where $\tau$ is a preset threshold controlling the distinctiveness of feature matching.
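The matching rule can be sketched as below. For clarity this sketch replaces the BBF approximate search with a brute-force nearest neighbor search, which returns the exact neighbors at a higher cost. The distinctiveness test is written as a ratio between the best and second-best distances, and the exclusion radius `min_dist` and the sample data are hypothetical values chosen for the example.

```python
import math

def match_points(points, feats, min_dist, ratio):
    """Match each point to its nearest feature neighbor outside a spatial window.

    points: list of (x, y); feats: list of equal-length feature vectors.
    A match (i, j) is kept only if d(i, j) <= ratio * d(i, second best).
    """
    matches = set()
    n = len(points)
    for i in range(n):
        candidates = []
        for j in range(n):
            if j == i:
                continue
            # Skip spatially adjacent points (inside the exclusion window).
            if math.dist(points[i], points[j]) <= min_dist:
                continue
            candidates.append((math.dist(feats[i], feats[j]), j))
        if len(candidates) < 2:
            continue
        candidates.sort()
        (best, j_best), (second, _) = candidates[0], candidates[1]
        if best <= ratio * second:
            matches.add((min(i, j_best), max(i, j_best)))
    return sorted(matches)

# Hypothetical data: points 0 and 3 are spatial neighbors with identical
# features; point 1 is a distant near-copy; point 2 is unrelated.
sample_points = [(0, 0), (60, 0), (0, 60), (3, 0)]
sample_feats = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.01],
                [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
sample_matches = match_points(sample_points, sample_feats,
                              min_dist=10, ratio=0.5)
```

In this example points 0 and 3 are never paired even though their features are identical, because they lie inside each other's exclusion window.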
For each of the three feature matrices, we record the indexes of every pair of matching points satisfying (11). Formally, let $(i, j)$ be the index pair of two matching feature points, each of which is represented by a row of the feature matrix. Since the order within an index pair makes no difference, the index pair is normalized, if necessary, by interchanging its entries so that $i < j$. For each index pair $(i, j)$, we increment a matching frequency counter $C$ by one as follows:

$$C(i, j) = C(i, j) + 1.$$

The matching frequency counter is initialized to zero before the algorithm starts. At the end of the matching process, the counter indicates how often each index pair of matching points occurs among the results obtained from the three feature matrices. To determine the candidate matching points, the majority rule is used: a pair of matching points is retained only if it occurs in at least two of the three matching results. This matching strategy is applied to all Harris feature points in the three feature matrices, and the final matching results are stored in a similar points matrix, which records the spatial coordinates of the matching points.
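The frequency counter and the majority rule can be sketched with a `Counter`. Interpreting the majority rule as "at least two of the three operators agree" is an assumption of this example, as are the sample match lists.

```python
from collections import Counter

def majority_matches(match_lists, min_votes=2):
    """Keep index pairs reported by at least min_votes of the match lists."""
    counter = Counter()
    for matches in match_lists:
        # Normalize each pair so that i < j, and count it once per list.
        normalized = {(min(i, j), max(i, j)) for i, j in matches}
        for pair in normalized:
            counter[pair] += 1
    return sorted(pair for pair, votes in counter.items() if votes >= min_votes)

# Hypothetical results from the three LBP operators.
matches_a = [(0, 5), (2, 7)]
matches_b = [(5, 0), (3, 9)]
matches_c = [(2, 7), (0, 5)]
candidates = majority_matches([matches_a, matches_b, matches_c])
```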

Step 4 (removing false matches and outputting the detection result map). Since a portion of the matched feature points are mismatches, we employ a widely used robust estimation method, the Random Sample Consensus (RANSAC) algorithm [26], to remove false matches from the similar points matrix. The final detection result map is output with colored lines connecting all the matching points to identify the duplicated region and the forged region.
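The paper does not state which geometric model RANSAC fits, so the following sketch assumes a similarity transform (rotation, uniform scaling, and translation), which can be written compactly with complex numbers as $w = a z + b$ and estimated from two point pairs. A flipped copy would need a model that also allows reflection. The tolerance, iteration count, seed, and sample data are hypothetical.

```python
import random

def ransac_similarity(src, dst, iters=200, tol=2.0, seed=0):
    """Return indexes of matches consistent with one transform w = a*z + b."""
    rng = random.Random(seed)
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    best = []
    for _ in range(iters):
        i, j = rng.sample(range(len(zs)), 2)
        if zs[i] == zs[j]:
            continue  # degenerate sample, cannot estimate the model
        a = (ws[i] - ws[j]) / (zs[i] - zs[j])
        b = ws[i] - a * zs[i]
        inliers = [k for k in range(len(zs))
                   if abs(a * zs[k] + b - ws[k]) <= tol]
        if len(inliers) > len(best):
            best = inliers
    return best

# Six hypothetical matches related by a 90 degree rotation plus a shift,
# (x, y) -> (100 - y, 50 + x), followed by two false matches.
src_pts = [(10, 10), (20, 10), (20, 25), (12, 30), (30, 18), (25, 5),
           (40, 40), (50, 8)]
dst_pts = [(90, 60), (90, 70), (75, 70), (70, 62), (82, 80), (95, 75),
           (5, 5), (200, 10)]
kept = ransac_similarity(src_pts, dst_pts)
```

Only the matches in `kept` would then be drawn as connecting lines in the detection result map.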

5. Experimental Results and Analysis

In our experiments, the tampered images were created with Adobe Photoshop CS3 from the following two datasets. The first one contains 24 uncompressed PNG true color images of size 768 × 512 pixels released by Kodak Corporation for unrestricted research usage [27]. The second one consists of 50 high resolution color images of size 1024 × 768 pixels that we collected from Google image search [28]. Based on a large number of repeated experiments, the matching threshold in Step 3 is fixed to 0.5, and the size of the Wiener filter window is kept fixed as well. All the experiments were carried out on a platform with an Intel Pentium 2.13 GHz processor and MATLAB R2010b. With our method, locating the tampered regions takes about 9 s for each 768 × 512 image and about 13 s for each 1024 × 768 image. An implementation in a compiled language such as C++ or Java would be expected to run faster.

5.1. Performance Evaluation

For practical applications, the most important aspect of a detection method is its ability to distinguish tampered images from original ones. Thus we adopt the evaluation indexes defined in [17] to evaluate the performance of our algorithm at image level. We record the number of correctly detected forged images $T_P$, the number of images that have been erroneously detected as forged $F_P$, and the number of falsely missed forged images $F_N$. From these we obtain two evaluation indexes, precision $p$ and recall $r$, as follows:

$$p = \frac{T_P}{T_P + F_P}, \qquad r = \frac{T_P}{T_P + F_N}.$$

Precision denotes the probability that a detected forgery is truly a forgery, while recall denotes the probability that a forged image is detected.
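A short worked example of the two indexes, using made-up counts that are not results from the paper:

```python
def precision_recall(tp, fp, fn):
    """Image-level precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts: 90 forged images detected, 10 authentic images
# flagged by mistake, and 30 forged images missed.
example_p, example_r = precision_recall(tp=90, fp=10, fn=30)
```

With these counts the precision is 90/100 = 0.9 and the recall is 90/120 = 0.75.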

5.2. Effectiveness Test

In the following experiment, we select some original images from the two above-mentioned datasets to test the effectiveness of our algorithm. All the duplicated regions are irregular and meaningful objects, as is common in realistic high resolution tampered images. None of the doctored images in this experiment underwent any postprocessing operation, and the corresponding detection results are illustrated in Figure 4. The first column shows the original images, the second column the tampered images, and the third column the detection results. Owing to space constraints, only a part of the experimental results is given here. Figure 4(a) illustrates the case of hiding specific objects and Figure 4(b) the case of adding specific objects, which indicates that our algorithm can expose duplicated regions effectively. The images in Figure 4(b) also demonstrate that our algorithm works well even when the tampered images contain multiple duplicated regions. The doctored image in Figure 4(c) shows the specific scenario in which the image contains large similar or flat regions, such as large areas of water, sky, or grass. Due to the homogeneous background in such images, it is a challenge to discern the forgery. To the best of our knowledge, a number of existing methods cease to be effective under these circumstances, whereas the detection results of our algorithm are satisfactory. The proposed method outputs detection result maps with colored lines connecting all the matching points to identify the duplicated region and the forged region. Although the forged region cannot be localized precisely at pixel level, the tampered region can easily be identified from the colored lines, which is sufficient for practical detection requirements.

5.3. Robustness Test

Since forgers usually do their utmost to create an imperceptible tampered image, various kinds of intermediate and postprocessing operations are carried out, such as rotation, flipping, additive Gaussian noise, Gaussian blurring, JPEG compression, or combinations of them. In this section, we conduct a series of experiments to test the robustness of the proposed method. Figure 5 indicates that our algorithm can identify duplicated regions satisfactorily under different angles of rotation and under horizontal and vertical flipping. The images in Figure 6 illustrate that the proposed algorithm can effectively locate the duplicated regions under common postprocessing operations including Gaussian blurring, AWGN, and JPEG compression, even when the quality of the distorted image is poor. It is particularly worth mentioning that our method remains robust when tampered images are distorted by combinations of rotation or flipping transformations and postprocessing operations.

Furthermore, in order to quantitatively evaluate the robustness of our algorithm to different image distortions, we randomly selected 50 original images from the two datasets and generated forged images by copying a square region at a random location and pasting it onto a nonoverlapping region. The square regions were 60 × 60 pixels, 80 × 80 pixels, and 100 × 100 pixels in size. For each size, three versions were produced (plain translation and two different intermediate operations), which gives 50 × 3 × 3 = 450 tampered images. The intermediate operations were flipping (horizontal and vertical) and rotation (90/180/270 degrees), respectively. These tampered images were then distorted by commonly used postprocessing operations with different parameters, such as Gaussian blurring, AWGN, and JPEG compression. In order to obtain more credible evaluation indexes, 300 authentic images randomly chosen from CASIA V2.0 [29] were combined with the 50 original images and all the distorted images to compose a robustness testing image set. The experimental results are given in Figure 7. In general, they indicate that the larger the duplicated region is, the better the detection performance is, regardless of the postprocessing operation applied. As can be seen from Figure 7(a), the proposed method has a high detection performance when the images are distorted by Gaussian blurring, even when the image has poor quality and the forged region is small: the precision rate is larger than 89% and the recall rate is larger than 80% for all tested parameters of the Gaussian blurring filter. Figure 7(b) shows that our method also performs well on AWGN distorted images. The precision rate stays over 80% until the SNR drops to 15 dB, even though there is a slight decay in the recall rate as the SNR drops. Results for tampered images distorted by JPEG compression with different quality factors are shown in Figure 7(c), which indicate that our method also performs well under JPEG compression.
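The construction of such test forgeries can be sketched as follows. The function copies a square block from a random position, optionally flips or rotates it by a multiple of 90 degrees, and pastes it onto a nonoverlapping position. The operation names, the seed handling, and the list-of-rows image representation are assumptions of this example; the paper's images were produced with Photoshop.

```python
import random

def rotate_cw(block, quarter_turns):
    """Rotate a square block clockwise by quarter_turns * 90 degrees."""
    for _ in range(quarter_turns % 4):
        block = [list(row) for row in zip(*block[::-1])]
    return block

def make_forgery(img, size, op="copy", seed=0):
    """Copy a size x size block and paste it onto a nonoverlapping location.

    op is one of "copy", "hflip", "vflip", "rot90", "rot180", "rot270".
    Returns the forged image and the top-left corners (x, y) of the source
    and target blocks.
    """
    rng = random.Random(seed)
    h, w = len(img), len(img[0])
    while True:
        sy, sx = rng.randrange(h - size + 1), rng.randrange(w - size + 1)
        ty, tx = rng.randrange(h - size + 1), rng.randrange(w - size + 1)
        if abs(sy - ty) >= size or abs(sx - tx) >= size:
            break  # the two blocks do not overlap
    block = [list(row[sx:sx + size]) for row in img[sy:sy + size]]
    if op == "hflip":
        block = [row[::-1] for row in block]
    elif op == "vflip":
        block = block[::-1]
    elif op.startswith("rot"):
        block = rotate_cw(block, int(op[3:]) // 90)
    forged = [list(row) for row in img]
    for dy in range(size):
        forged[ty + dy][tx:tx + size] = block[dy]
    return forged, (sx, sy), (tx, ty)

# Hypothetical 20x20 image whose pixel values encode their own position.
base = [[y * 20 + x for x in range(20)] for y in range(20)]
```

Postprocessing such as blurring, AWGN, or JPEG compression would then be applied to the returned image before detection.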

5.4. Comparison of Detection Performance

In this section, we first compare our method with the method in [11], a typical keypoint-based scheme built on SIFT keypoint detection and feature matching, which is robust to rotation, scaling, and some postprocessing operations including AWGN and JPEG compression. The method in [11] requires the duplicated region to contain more than 50 SIFT keypoints, which is unrealistic in many practical cases, since a duplicated region cannot be guaranteed to contain so many SIFT keypoints, especially when it exhibits little structure. Furthermore, according to our experiments, the SIFT method [11] is sensitive to blurring artifacts, and if the copied region exhibits little structure or is small, the region may be missed completely [17].

In contrast to the popular keypoint-based scheme based on SIFT keypoint detection in [11, 12], the proposed method has good robustness against flipping and Gaussian blurring. As mentioned before, the method in [11, 12] can only extract keypoints from peculiar points of the image and, according to our experimental results, is not robust to some operations such as blurring and flipping. Moreover, if the copied region exhibits little visual structure or is small, the region may be missed completely. Our method, however, performs well in this kind of scenario. The main reason is that it employs dense Harris feature points, which cover the image far more densely than the SIFT feature points used in [11, 12]. Figure 8 gives an example of the two feature detection methods applied to the same image and shows the number and distribution of the feature points. As seen in Figure 8, 987 feature points are detected by the SIFT algorithm, while 5175 dense Harris feature points are detected by our method. What is more, the distributions of the feature points differ widely. As shown in Figure 8(a), the SIFT algorithm cannot find reliable feature points in regions with little visual structure, and it also has difficulty in small regions. In contrast, the dense Harris feature points employed in our method are distributed almost uniformly over the image, as shown in Figure 8(b). Consequently, our method can effectively detect forged regions with little visual structure, such as large areas of sky, grass, or water. One example is shown in Figure 9, where an obvious duplicated region is not detected by the SIFT method [11, 12] because the SIFT algorithm cannot find reliable feature points in the forged region. The detection result of our method is shown in the third column of Figure 9, which demonstrates that the proposed method can effectively detect forged regions with little visual structure.

In the last experiment, we compared our method with two typical approaches, the FMT-based method [8] and the SIFT-based method [11], which belong to the block-based and keypoint-based detection methods, respectively. Here we again randomly selected 50 original images from the two datasets and generated forged images by copying a square region at a random location and pasting it onto a nonoverlapping region. The square regions were 60 × 60 pixels, 80 × 80 pixels, and 100 × 100 pixels in size and were pasted at three different relative locations, which gives 450 tampered images. In the first scenario, we evaluated the three algorithms on rotation duplication forgery, where the duplicated region was copied and then rotated by a random angle before pasting. In the second scenario, horizontal and vertical flipping were considered. In the third scenario, the tampered images were distorted by commonly used postprocessing operations with a random parameter, as in Section 5.3, including Gaussian blurring, AWGN, and JPEG compression. For each of the three scenarios, the distorted images together with their original versions and 300 authentic images from CASIA V2.0 composed a test image set. The corresponding experimental results are shown in Figure 10. As can be seen from Figure 10, compared with the SIFT-based [11] and FMT-based [8] methods, the proposed method has a high detection performance when the images are distorted by Gaussian blurring, AWGN, and JPEG compression. There are two main reasons for this. On the one hand, we first filter the input image with a pixelwise adaptive Wiener method based on statistics estimated from a local neighborhood of each pixel, which significantly improves detection performance, especially when the input image suffers from severe AWGN or JPEG compression. On the other hand, most of the doctored images that are not detected by the SIFT method [11] are missed because the duplicated region has little visual structure and therefore lacks reliable SIFT keypoints. Figure 10 also shows that the proposed method has a comparative advantage in the detection of flipping forgery, while the SIFT method [11] is slightly superior to the proposed method in rotation detection. The main reason is that the detection performance of the proposed method is slightly inferior to that of the SIFT method [11] at rotation angles that are not multiples of $90^\circ$.

6. Conclusion

In this paper, a passive forensic method based on Harris feature points and local binary patterns for detecting region duplication image forgery is proposed. We demonstrate the effectiveness of our detection method with a series of experiments on many realistic high resolution forgery images. Experimental results show that the proposed method can effectively detect region duplication forgery, even when an image is distorted by rotation, flipping, blurring, AWGN, JPEG compression, or combinations of these operations, and that it is especially effective for forgeries in flat areas with little visual structure. Although it achieves promising detection performance, the proposed method fails to detect region duplication forgery with scaling, because Harris corners and LBP are not computed on multiscale image layers and are therefore sensitive to image scaling. Addressing this limitation is an important part of our future work.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by Higher School Science & Technology Fund Planning Project of Tianjin City (no. 20120712), China.