Abstract

An image recaptured from a high-resolution LED screen or a good-quality printer is difficult to distinguish from its original counterpart. The forensic community has paid less attention to this type of forgery than to other image alterations such as splicing, copy-move, removal, or retouching. Recapturing can hide the traces of image manipulation: an attacker can recapture a manipulated image to fool an image forensic system. It is therefore important to develop secure and automatic techniques that distinguish real from recaptured images without prior knowledge, and recaptured image detection has become an active research topic for forensic analysts. As far as we know, no prior work has examined the pros and cons of up-to-date recaptured image detection techniques. The main objective of this survey is to succinctly review recent results in recaptured image detection and to investigate the limitations of existing approaches and datasets. We also discuss the existing recaptured image datasets, their limitations, and the challenges of dataset collection. Finally, we propose several promising directions for future research on recaptured image detection and show how the identified difficulties might be turned into research opportunities.

1. Introduction

Recaptured image forensics has received considerable interest in the field of multimedia forensics during the last few years. With the easy availability of high-resolution cameras, liquid crystal displays (LCDs), and high-resolution printers, an image can easily be reproduced by photographing an LCD panel or a print with a digital camera. An attacker may recapture a forged image in order to conceal its flaws and increase its apparent legitimacy. Recaptured image detection began to attract attention in the literature about a decade ago. The detection of images recaptured from printed versions of single-capture photographs has been addressed in a number of investigations [14]. In [5], the specularity of recaptured photographs is investigated, since ambient light reflects off the photographic paper during the recapturing procedure. The authors presented an approach based on a colour vision reflectance concept that models the diffuse and specular components of a scene’s surface reflectance, and developed a cascaded dichromatic framework for the identification of printed recaptured photographs. When an image is recaptured from a printed photograph, the specularity component varies with the surface structure and composition of the paper on which the photograph is printed. The original image exhibits a Laplacian-type distribution, whereas recaptured photographs exhibit a Rayleigh-type distribution. The Markov-based strategy described in [4] is another method for distinguishing printed recaptured photographs from genuine ones. Kose and Dugelay [4] present a method for telling captured and recaptured images apart using diverse features such as noise, texture, histogram, and colour features. In recent years, image forgery detection [6] has received a lot of attention as a way to validate the authenticity of digital photographs by detecting intrinsic image regularities or common modification anomalies. The way recaptured images are acquired is shown in Figure 1.

Almost all current solutions in this field are based on features designed to detect one or more specific alterations introduced into a photograph during the recapturing process. One of the most important alterations is aliasing-like distortion. The feature most commonly used to detect this distortion is the local binary pattern (LBP) descriptor and its variants [7–9]. In [7, 10], the authors recognize this type of distortion using the two-dimensional discrete Fourier transform of the noise residual and the cyclostationarity hypothesis. Another important alteration is blurriness. Existing features for this alteration include multiscale wavelet statistics [7, 9], the histogram of image local differences [7], and the dictionary approximation error of edge profiles [11]. Colour distortion has also been considered; the proposed features include colour moments [7, 11] and the chromatic covariance matrix [12]. The studies in [7, 13] are founded on the idea that real-scene and recaptured images have different noise distributions, so noise-level features derived from these distributions are used to identify recaptured images. Compression artefacts have also been used for recaptured image detection [14–16]; however, these methods work only on JPEG-compressed photographs. Further artefact-independent features based on image statistics, such as correlation coefficients [14, 17] and co-occurrence matrices [18] extracted from the residual image, are also used for detection. In [12], the authors used a local ternary count histogram of residual maps for recaptured image detection. Finally, some deep learning-based methods have also been proposed; they are explained in detail in Section 3.

As far as we are aware, this is the first comprehensive review of recaptured image detection techniques in multimedia forensics. In this study, we examine the existing approaches, their performance, strengths, and drawbacks. We also outline current research trends, list potential future research directions, and point out how the open issues could be turned into research opportunities.

The remainder of this paper is organized as follows. Prior work is discussed in Section 2. Section 3 discusses machine learning and deep learning-based approaches and their limitations. Section 4 explores some of the most common features introduced into recaptured images. Section 5 describes the datasets, the limitations of the current datasets, and the challenges in collecting or generating recaptured image datasets. Section 6 explores the limitations of present approaches as well as prospective research directions.

Over the past few years, numerous recaptured image detection techniques have been developed. In [4, 19], the authors discuss the detection of images recaptured from printed material by exploiting dithering and specularity effects, respectively. These methods recognize a recaptured print based on its specularity or on the printer’s dithering characteristics.

Some researchers have developed different techniques to detect images recaptured from both printers and LCD screens [7, 12, 20]. They provide a framework based on numerous aspects associated with recaptured images, including aliasing, noise, blurriness, chromaticity, colour, image contrast, and sharpness, as well as the nonlinearity of the tone response curve.

The identification of photographs recaptured from an LCD monitor has also been investigated. Cao and Kot [7] proposed a recaptured image detector based on specific characteristics of recaptured images. Local binary pattern features were computed at several levels to capture the fine texture pattern that sometimes appears in recaptured photographs. The loss of detail in recaptured photographs, which results from the monitor’s low display resolution compared with that of the camera’s image sensor, is measured using a multiscale wavelet decomposition, with the mean and standard deviation of the absolute wavelet coefficients as features. Twenty-one colour features are used to detect the apparent increase in colour saturation in the recaptured image. Finally, the outputs of the individual detectors are fed into a trained conditional SVM classifier.
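As a rough illustration of the local binary pattern features mentioned above, the following Python sketch computes the basic 8-neighbour LBP code of each interior pixel and a normalized 256-bin histogram. It is not the multi-level descriptor of [7]; the function names `lbp_codes` and `lbp_histogram`, the neighbour ordering, and the `>=` thresholding convention are assumptions made for illustration.

```python
def lbp_codes(img):
    """Basic 3x3 LBP code for every interior pixel of a grey-level image (list of rows)."""
    # Clockwise neighbour offsets starting at the top-left pixel (an assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(img), len(img[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit (assumed convention).
                if img[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(img):
    """Normalized 256-bin histogram of the LBP codes, usable as a texture feature vector."""
    hist = [0] * 256
    for row in lbp_codes(img):
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

A periodic fine texture, such as the one a display pixel grid can leave in a recaptured image, would concentrate the histogram mass in a few codes, which is what makes such a histogram a candidate input to a classifier.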

Later, in 2012, Yin and Fang [13] used the MBFDF technique to detect double JPEG compression in images recaptured from LCD panels. Three different wavelet transforms were used to denoise the image, and the mean, variance, skewness, and kurtosis were computed from the histogram of the recovered noise residuals. Their findings demonstrate that the suggested features are useful for detecting photographs recaptured from LCD monitors. In [7], the authors train an SVM classifier to categorize photographs recaptured from LCD monitors using a 136-dimensional feature set. Their descriptors are built from blurriness, texture, noise, and colour features. With all features combined, they report a detection accuracy of 97.2% on a dataset of recaptured photographs taken with smartphone cameras; the dataset uses the photographs from [12]. The images are low in resolution and quality because of the smartphone cameras used for the recapture. Ng et al. [20] classified photographic images and photorealistic computer graphics (PRCG) images. They found that PRCG images recaptured from LCD monitors were harder to identify than recaptured versions of original photographs.

In a face spoofing attack, a person impersonates an authorized user in order to get around a face authentication mechanism. A tablet computer can display an authorized person’s image or video, which is then presented to the authentication system and captured by its digital camera. Face spoofing was investigated in [22].

Some approaches use deep learning-based techniques to address the problem of recaptured image detection in multimedia forensics. Most of the conventional approaches discussed above rely on handcrafted features that try to detect one or more specific changes introduced during the recapture process; they require prior knowledge on the part of the designer and cannot take advantage of big data. Deep learning approaches, in contrast, extract features and classify images automatically within a single network architecture. The evolution of traditional and deep learning frameworks for recaptured image detection is illustrated in Figure 2.
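The four noise-residual statistics attributed to Yin and Fang [13] earlier in this section (mean, variance, skewness, and kurtosis) can be sketched in Python as follows. This is only an illustration under stated assumptions: a 3 × 3 mean filter stands in for the wavelet denoisers, the moments are computed directly from the residual samples rather than from a histogram, and the names `mean_filter` and `residual_moments` are hypothetical.

```python
def mean_filter(img):
    """3x3 mean filter with edge replication; an assumed stand-in for a wavelet denoiser."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp to the image border
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += img[rr][cc]
            out_row.append(acc / 9.0)
        out.append(out_row)
    return out


def residual_moments(img):
    """Return (mean, variance, skewness, kurtosis) of the residual img - denoised(img)."""
    den = mean_filter(img)
    res = [img[r][c] - den[r][c] for r in range(len(img)) for c in range(len(img[0]))]
    n = len(res)
    mean = sum(res) / n
    var = sum((x - mean) ** 2 for x in res) / n
    if var == 0:
        # A flat residual has no shape; report zeros instead of dividing by zero.
        return mean, 0.0, 0.0, 0.0
    std = var ** 0.5
    skew = sum(((x - mean) / std) ** 3 for x in res) / n
    kurt = sum(((x - mean) / std) ** 4 for x in res) / n
    return mean, var, skew, kurt
```

The resulting four numbers form a small feature vector that could be passed to a classifier such as an SVM; in practice a real denoiser and one feature set per colour channel or wavelet would replace the simplifications made here.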

3. Deep Learning and Machine Learning-Based Approaches

Recently, deep learning has attracted significant attention in many fields [11, 23–28]. The majority of earlier solutions attempted to capture the statistical properties of recaptured images with handcrafted features. A deep learning framework, by contrast, learns the attributes of the images and can recognize photographs with distinct properties automatically, without human instruction. Only a few studies have addressed the problem of distinguishing a recaptured image from its original using learning-based algorithms. In [4], the authors proposed a CNN framework with a Laplacian filter layer as a signal enhancement layer to distinguish recaptured images from original ones. The proposed method achieved an accuracy of more than 95% on databases of various image sizes. In their network, the signal enhancement layer plays an important role in strengthening the difference between recaptured and original images. In the same year, another work was carried out by Choi et al. [29], in which the authors examine the traces introduced during the recapture process to distinguish recaptured images from originally acquired images using a basic CNN. Despite these successful approaches, various practical applications and prospects require further examination with deep learning-based approaches for real-world use, which we discuss in our future research directions section. Some of the pros and cons of existing machine learning-based techniques are explored below.
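To make the idea of a Laplacian signal enhancement step more concrete, the following Python sketch applies a fixed 3 × 3 Laplacian kernel to a grey-level image. The 4-neighbour kernel, the valid-mode border handling, and the name `laplacian_filter` are assumptions for illustration; the exact filter used in the cited network is not specified here.

```python
# Assumed 4-neighbour Laplacian kernel; other variants (e.g. 8-neighbour) exist.
LAPLACIAN_KERNEL = [[0, 1, 0],
                    [1, -4, 1],
                    [0, 1, 0]]


def laplacian_filter(img):
    """Valid-mode 3x3 Laplacian response of a grey-level image (list of rows)."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += LAPLACIAN_KERNEL[i][j] * img[r + i - 1][c + j - 1]
            out_row.append(acc)
        out.append(out_row)
    return out
```

Because the kernel sums to zero, flat regions and linear intensity ramps produce a zero response, so only high-frequency structure such as edges and fine texture reaches the subsequent layers.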

In 2021, Hussain et al. [22] introduced a deep learning method for detecting double JPEG compression that uses raw DCT coefficients as input to the CNN. The authors placed a DCT layer at the input of the CNN to extract the DCT coefficients from the input images and feed them to the network. The proposed method achieved state-of-the-art performance in detecting small blocks, especially in the scenario where the first quality factor exceeds the second (QF1 > QF2). The number of approaches proposed over the past few years is shown in Figure 3.
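The block transform that produces such DCT coefficients can be sketched in plain Python as below. This is only the orthonormal 2D DCT-II of a single block, which for 8 × 8 blocks has the same scaling as the JPEG forward DCT; the level shift, quantization, and the network layer itself are omitted, and the name `dct2_block` is hypothetical.

```python
import math


def dct2_block(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for x in range(n):
                for y in range(n):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * acc
    return coeffs
```

For a constant 8 × 8 block of value 10, all energy lands in the DC coefficient (80 under this normalization) and the AC coefficients vanish up to rounding; double compression is usually studied through the statistics of such coefficients after quantization.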

Yang et al. [31] developed a deep learning algorithm for recaptured image detection. The proposed algorithm cannot fully extract the useful information: the recapturing process induces not only high-frequency information loss but also colour distortions, and directly applying a Laplacian filter may remove such information, which is very helpful for recaptured image detection. The architecture of the proposed network is shown in Figure 4.

Li et al. [36] proposed a deep learning framework that applies a preprocessing filter at the input of the CNN; as a result, the CNN lacks end-to-end feature extraction from the input images. In addition, the suggested network is too shallow, with only three convolutional layers, and it is slow because the fully connected layers are too large, as shown in Figure 5.

Zhu et al. [37] computed LBP coding maps of all input patches before feeding them into the CNN and claimed better detection accuracy than other cutting-edge approaches. However, using LBP as a preprocessing step at the input, rather than integrating LBP feature extraction into the CNN architecture as a learnable layer, degrades the end-to-end feature learning capability of the CNN. The proposed network architecture with detailed parameters is shown in Figure 6.

Choi et al. [29] presented the first CNN-based technique for recaptured image detection (illustrated in Figure 7). The proposed CNN checks each image block by block to determine whether it has been recaptured, and a voting scheme over the block-level results makes the final decision. As a result, the approach is comparatively slow and complex. In addition, some erroneous detections occur in hazy and flat areas, because the blurring introduced by the recapture operation is not apparent in these regions.
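The block-by-block decision with a final vote described for [29] can be sketched as follows. The block classifier is passed in as a plain function, so this Python sketch shows only the splitting and majority-vote logic; the names `split_into_blocks` and `vote_recaptured`, the non-overlapping tiling, and the strict-majority rule are assumptions rather than details taken from the cited work.

```python
def split_into_blocks(img, block_size):
    """Non-overlapping block_size x block_size blocks; partial edge blocks are dropped."""
    rows, cols = len(img), len(img[0])
    blocks = []
    for r in range(0, rows - block_size + 1, block_size):
        for c in range(0, cols - block_size + 1, block_size):
            blocks.append([row[c:c + block_size] for row in img[r:r + block_size]])
    return blocks


def vote_recaptured(img, block_size, block_classifier):
    """Return (decision, votes, total): decision is True on a strict majority of positive blocks."""
    blocks = split_into_blocks(img, block_size)
    if not blocks:
        raise ValueError("image is smaller than one block")
    votes = sum(1 for b in blocks if block_classifier(b))
    return votes * 2 > len(blocks), votes, len(blocks)
```

Since the classifier runs once per block, the cost grows with the number of blocks, which is consistent with the speed concern raised above; flat blocks that carry little blur evidence could also be excluded from the vote, although that refinement is not shown.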

4. Features of Recaptured Images

This section describes some of the most common features introduced into images during the recapture process. Recapturing an image from an LCD panel or from a print introduces unexpected artefacts, which fall into four categories: aliasing, blurriness, noise, and illumination nonuniformity.

4.1. Aliasing

Aliasing is one of the most common image distortion effects and is caused by sampling an image at too low a rate. It is most visible on lines with gentle or sharp slopes, as shown in Figure 8, and appears in digital camera images when the scene contains very high-frequency detail or is insufficiently band-limited [18, 34]. One approach to detecting images recaptured from an LCD screen is to look for the aliasing artefacts caused by sampling the monitor’s pixel grid. In digital cameras, aliasing is commonly referred to as chroma moiré. Such cameras are equipped with a colour filter array, or colour filter mosaic [35], so each colour channel is typically sampled at a lower frequency than the sensor’s native sampling frequency. Post-processing approaches are ineffective at removing aliasing artefacts, so aliasing can be used as a feature for detecting recaptured images. When aliasing artefacts are present in a recaptured image, the 2D discrete Fourier transform (DFT) of the noise residual shows peaks in the 2D spectrum. Identifying these peaks can therefore help to identify recaptured images.
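The effect of sampling at too low a rate, and the spectral peak it leaves behind, can be illustrated with a one-dimensional Python sketch. It is a deliberate simplification of the 2D noise-residual analysis discussed above: a cosine stands in for a periodic pixel grid, the DFT is computed naively, and the names `dft_magnitudes` and `dominant_frequency` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n)]


def dominant_frequency(signal_freq, sample_rate, num_samples):
    """Frequency of the strongest DFT peak (up to Nyquist) of a sampled cosine."""
    samples = [math.cos(2 * math.pi * signal_freq * t / sample_rate)
               for t in range(num_samples)]
    mags = dft_magnitudes(samples)
    half = mags[: num_samples // 2 + 1]  # bins from 0 up to the Nyquist frequency
    peak_bin = max(range(len(half)), key=lambda k: half[k])
    return peak_bin * sample_rate / num_samples
```

With a pattern of 9 cycles per unit length sampled 12 times per unit length, the Nyquist limit is 6 cycles, and `dominant_frequency(9, 12, 48)` reports a peak at 3 cycles, the alias at |9 − 12|. A pattern of 3 cycles sampled at the same rate gives the same peak, which is why the aliased pattern cannot be separated from genuine content after sampling.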

4.2. Blurriness

Natural scenes contain many different edges with a wide range of complexity, sharpness, and contrast. The acquisition process in a digital camera imparts a certain degree of blur, or distortion, to the image when it captures a scene, as shown in Figure 9. This happens even if the camera is perfectly focused at the moment of capture. When the lens aperture diameter is extremely small, diffraction adds to this blur. Internal camera processing such as sharpening, contrast enhancement, and CFA demosaicing can cause further distortion. For the most part, the blur characteristics are specific to the camera at the moment of acquisition. One way to characterize the blur is to use the capture device’s point spread function (PSF). In practice, determining a device’s PSF is difficult, so the line spread function (LSF) is used instead. The LSF is a one-dimensional function that corresponds to the first derivative of the edge spread function [37]. The first derivatives of the edge profiles of a slanted-edge test target photographed with the camera can be measured and statistically combined to model the blurring pattern caused by the camera [38].
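The relationship between the edge profile and the line spread function can be sketched numerically in Python. The forward difference below approximates the first derivative of a sampled edge spread function; the width measure `lsf_width` is an illustrative assumption and not a feature taken from the cited works, and both function names are hypothetical.

```python
def line_spread_function(edge_profile):
    """Forward-difference approximation of the first derivative of a sampled edge profile."""
    return [edge_profile[i + 1] - edge_profile[i] for i in range(len(edge_profile) - 1)]


def lsf_width(lsf, fraction=0.5):
    """Count the samples whose magnitude reaches `fraction` of the peak (a crude blur measure)."""
    peak = max(abs(v) for v in lsf)
    if peak == 0:
        return 0  # no edge present in the profile
    return sum(1 for v in lsf if abs(v) >= fraction * peak)
```

For an ideal step edge such as `[0, 0, 0, 1, 1, 1]` the LSF is a single spike of width 1, whereas a smeared edge such as `[0, 0, 0.25, 0.75, 1, 1]` gives an LSF spread over three samples. A recapture step adds a second stage of blur, so wider LSFs are the kind of evidence a blurriness feature tries to capture.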

4.3. Noise

The two main types of noise in photographs acquired with a digital camera under normal and high levels of scene light are temporal noise, which is dominated by shot noise, and fixed pattern noise, which is dominated by photo-response nonuniformity (PRNU) noise. The distribution of image noise in a recaptured image is influenced by the noise characteristics of the recapture camera, the brightness level of the LCD, the capture distance, and the scene content. The noise characteristics of the camera used to photograph the original image are likely to be present in the recaptured image, but they will be band-limited because of the blurring effect induced by the recapture process. The unique PRNU fingerprint of the image sensor has been found to be highly effective for identifying the source camera from a single shot or a series of photographs [39]. Figure 10 shows an example of noise artefacts due to bad stitching.

4.4. Contrast, Colour, and Illumination Nonuniformity

The light emitted by the LCD screen can significantly reduce the detail, contrast, and saturation of a recaptured image, and even the colours of finely recaptured images appear to differ from those of the original images. The contrast and colour properties of a picture can therefore be computed as distinguishing features. To avoid colour balance issues in a recaptured image, the LCD monitor is calibrated before recapture and the white point of the recapture camera is adapted to the white point of the LCD monitor. In recaptured images with significant areas of low texture, a brightness gradient may be visible.

5. Existing Databases and Dataset Collection Challenges

It is important to compare the numerous proposed methods fairly. A fair comparison allows researchers to quickly learn the performance, benefits, and drawbacks of each approach and to fully comprehend the research problem. A public image dataset is an important means of providing such a comparison platform for image processing and machine learning research. Carelessly recapturing an image displayed on a digital LCD screen often results in poor image quality. One of the big challenges in recaptured image detection is to collect a good-quality image dataset, because obvious artefacts such as texture patterns, loss of small details, and colour degradation are easily seen in poor-quality samples. Such low-quality recaptured images are of little use, since they can easily be identified by the human eye. There are just a handful of publicly available datasets, each with a limited number of samples.

5.1. The Dataset Used in [40]

The dataset consists of 160 digital images in TIF format with a fixed width of 2048 pixels: 60 single captured and 100 recaptured images. Three distinct digital cameras were used, each providing 20 photographs for the single capture group. Five distinct permutations of recapture chains were used to create the recaptured photographs, resulting in 20 images for each group.

5.2. Recaptured Image Dataset (ROSE) [7]

In this dataset, six distinct camera brands were used to record 2000 natural photographs and 2700 finely recaptured images. The image sizes range from 2272 × 1704 pixels to 4256 × 2832 pixels, and the images are in JPG and BMP formats. In the ROSE dataset, we observed dominant artefacts caused by aliasing. Table 1 summarizes the ROSE dataset.

5.3. ICL-COMMSP Database [11]

The images in the ICL dataset were taken under strictly controlled conditions with a certain distance between the LCD screen and the camera. One of the limitations of the ICL dataset [11] is that loss-of-detail artefacts occur in its images. A detailed description of the captured and recaptured images is given in Table 1.

5.4. ASTAR Image Dataset [41]

Because smartphones represent the medium to low end of the consumer photography market, this dataset consists of pictures captured with smartphone devices. It contains real-scene and recaptured images in pairs, and the recaptured images are taken from both printed paper and digital screens. A limitation of the ASTAR database is that loss-of-detail artefacts, texture artefacts, and even artefacts caused by illumination and light reflection appear because of the uncontrolled capture conditions.

5.5. Recaptured Image Dataset [14]

In this dataset, 608 real-world photographs and 589 recaptured images were collected using five separate cameras with front and back cameras. The dataset comprises indoor (offices and homes) and outdoor scenes. One of the dataset’s limitations is that the recaptured photographs are smaller in scale than the genuine images.

5.6. Dataset in [12]

The dataset consists of 1035 single captured images captured using nine separate cameras; a total of 115 photographs were obtained by each camera. A total of 2520 recaptured images are collected by recapturing every single image in the collection of single captured photographs using eight separate cameras.

5.7. Dataset in [31]

This dataset is built from the image databases available in [6, 7] by central cropping. The images are 512 × 512 pixels in size. The database contains a total of 20,000 photographs: 10,000 originals and 10,000 recaptures. This dataset is not publicly available.

5.8. Dataset Collection Challenges

One of the big challenges in collecting recaptured image data is the huge number of adjustable variables in the capture environment, including the camera settings, the LCD settings, and the surrounding conditions. If these controllable settings are not all taken into account, obvious artefacts are easily noticed in the recaptured images. Such low-quality images are of little value, since the human eye can clearly discern them. The limitations of the existing datasets have been addressed in the previous subsections. Some of the existing datasets and the number of images in each are shown in Figure 11. The evolution of deep learning-based analysis frameworks is also presented in Figure 2.

6. Conclusion, Challenges, and Future Directions

To the best of our knowledge, this is the first comprehensive review of recaptured image detection techniques. In this paper, we provide a complete assessment of existing recaptured image detection frameworks. We examined the technical foundations of the techniques to see how well they perform, as well as their strengths and weaknesses. We also compared existing recaptured image detection methods to show how the open issues might be turned into promising future research opportunities, and we outlined current research trends and directions.

Prior techniques have the disadvantage of being difficult to adapt to environmental changes. The introduction of new technology in capturing or displaying devices, for example, may alter the properties of recaptured images. The procedures should be redesigned to cope with the changes in such a scenario or a new environment. Table 2 provides a clear comparison and analysis of the advantages and disadvantages of the existing recaptured image detection approaches.

Machine learning and deep learning-based recaptured image detection methods offer solutions to problems that are tough to solve with conventional approaches. However, the algorithms discussed above have limitations and challenges that are not well addressed at the moment. In this study, we conducted an extensive survey to identify current issues in recaptured image detection and the benefits and drawbacks of current methods, for the benefit of researchers concerned with developing deep learning frameworks for recaptured image detection.

These issues therefore need to be investigated using machine learning and deep learning methodologies. We also highlight several future research areas and developments in machine learning and deep learning that could be useful for prospective solutions to the present difficulties.

(i) Only a few machine learning and deep learning-based approaches have addressed the problem of recaptured image detection, as mentioned in Section 3. More deep learning algorithms with theoretical support, and other neural network families such as Bayesian networks and RNNs, are still needed and could then be applied to recaptured image detection.

(ii) The development of deep learning-based recaptured image detection methods that do not require human involvement should be investigated.

Data Availability

The data used in this research can be obtained from the corresponding authors upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are grateful to the Taif University researchers supporting project no. TURSP-2020/36, Taif University, Taif, Saudi Arabia.