13 Papers You Should Read to Understand Image Forgery Detection

While researching image forgery detection and forensic analysis, we came across several papers that provide valuable insights into the field. We have listed below 13 important papers that every student or researcher should explore.


1. Using MD5 and OpenCV

Mohammad Shahnawaz Shaikh and others, in their paper titled “Image Forgery Detection Using MD5 & Open CV,” propose the use of a dual-layered hybrid framework that integrates cryptographic integrity verification with advanced image feature analysis to detect digital manipulations. The technique first uses the MD5 hashing algorithm to generate a unique “digital fingerprint” of an original image, enabling identification of unauthorized alterations by hash-value discrepancies. This initial screening is then complemented by a more thorough examination using the OpenCV library, which analyzes pixel patterns, texture distribution, color profiling, and shape recognition to reveal forgeries such as copy-move, splicing, and retouching.
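The hash-screening layer described above can be sketched with Python's standard `hashlib` module. This is a minimal illustration rather than the authors' implementation: the function names and the stand-in byte strings are assumptions made for the example, and in practice the bytes would be read from image files.

```python
import hashlib


def md5_fingerprint(data: bytes) -> str:
    """Return the hex MD5 digest of raw image bytes."""
    return hashlib.md5(data).hexdigest()


def is_unaltered(reference_digest: str, candidate: bytes) -> bool:
    """True when the candidate's digest matches the stored reference digest."""
    return md5_fingerprint(candidate) == reference_digest


# Stand-in bytes (hypothetical); real use would read the image files instead.
original = b"\x89PNG fake image payload"
tampered = original.replace(b"fake", b"f4ke")

reference = md5_fingerprint(original)
print(is_unaltered(reference, original))  # True
print(is_unaltered(reference, tampered))  # False
```

A matching digest only shows that a file is byte-identical to the reference. It cannot say where or how an image was changed, which is why the paper pairs the hash check with OpenCV-based feature analysis.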

2. Using Error Level Analysis Technique and MobileNetV2

Muhamad Nur Baihaqi, Aris Sugiharto, and Henri Tantyoko, in their paper titled “Classification of Real and Fake Images using Error Level Analysis Technique and MobileNetV2 Architecture,” propose the use of Error Level Analysis (ELA) as a feature extraction technique combined with the MobileNetV2 deep learning architecture to enhance the detection of digital image forgeries. The method focuses on identifying localized compression artifacts in JPEG images by calculating the pixel-wise difference between an original image and its resaved version, which is then visualized through brightness enhancement to highlight inconsistencies in manipulated regions. By feeding these ELA error maps into a fine-tuned, lightweight MobileNetV2 model, using k-fold cross-validation and optimizers such as RMSprop, the researchers report an accuracy of 93.10%, an improvement over models trained on raw image data alone.
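The difference-and-brighten step of ELA can be sketched in plain Python. The sketch assumes the image has already been resaved as JPEG and decoded by an imaging library, so both inputs are equally sized grayscale grids. The function name and the tiny sample grids are hypothetical, and scaling the largest error to 255 is one common choice, not necessarily the one used in the paper.

```python
def ela_map(original, resaved):
    """Pixel-wise absolute difference, brightened so the largest error is 255.

    Both arguments are equally sized grids (lists of rows) of 0-255 values.
    """
    diff = [
        [abs(a - b) for a, b in zip(row_o, row_r)]
        for row_o, row_r in zip(original, resaved)
    ]
    peak = max(max(row) for row in diff)
    if peak == 0:
        return diff  # identical images: nothing to enhance
    factor = 255 / peak
    return [[min(255, round(v * factor)) for v in row] for row in diff]


# Hypothetical 3x3 example: the centre pixel reacts to recompression far more
# strongly than its neighbours, as a pasted-in region might.
original = [[100, 100, 100], [100, 180, 100], [100, 100, 100]]
resaved = [[101, 99, 100], [100, 160, 101], [99, 100, 100]]

for row in ela_map(original, resaved):
    print(row)
```

The bright centre value in the output is the kind of localized inconsistency that the classifier is then trained to pick up.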

3. Using Machine Learning

Alzamil Lubna, in her paper titled “Image Forgery Detection with Machine Learning,” proposes the use of a combination of image pre-processing and machine learning techniques to enhance the accuracy of detecting manipulated images. Specifically, the research employs a patch-based pre-processing approach that divides images into 100 x 100 patches, which are then passed through a patch classifier to distinguish between raw and computer-generated content before the final results are generated. While the study initially explores the Error Level Analysis (ELA) algorithm paired with various classifiers like Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and Random Forests (RFs), it concludes that utilizing a VGG16 pre-trained model following this patch-based pre-processing yields a superior detection accuracy of 94.5%.
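The patch-splitting step can be sketched as follows. The summary above does not say whether patches overlap or how image borders are handled, so this example assumes non-overlapping patches and drops partial patches at the right and bottom edges. The function name is hypothetical.

```python
def patch_boxes(width, height, patch=100):
    """Return (left, top, right, bottom) boxes for full, non-overlapping patches.

    Assumption: partial patches at the right and bottom edges are discarded.
    """
    return [
        (x, y, x + patch, y + patch)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]


boxes = patch_boxes(250, 320)
print(len(boxes))  # 6 full patches: 2 across, 3 down
print(boxes[0])    # (0, 0, 100, 100)
print(boxes[-1])   # (100, 200, 200, 300)
```

Each box could then be cropped from the image and passed to the patch classifier.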

4. Using Convolutional Neural Networks

Afshine Amidi and Shervine Amidi, in their teaching material for Stanford University’s CS 230 – Deep Learning course titled “Convolutional Neural Networks,” present a comprehensive architectural overview of Convolutional Neural Networks (CNNs), which combine Convolutional (CONV), Pooling (POOL), and Fully Connected (FC) layers to process visual data. Their material emphasizes the tuning of hyperparameters, including filter size, stride, and zero-padding, to manage feature extraction and spatial invariance. Additionally, it details specialized methods for advanced computer vision tasks, such as the YOLO and R-CNN algorithms for object detection, Siamese Networks for one-shot learning in face recognition, and Generative Adversarial Networks (GANs) for synthesizing new content.
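How filter size, stride, and zero-padding interact is easiest to see through the standard output-size formula for a convolutional layer, O = floor((I - F + 2P) / S) + 1, where I is the input size, F the filter size, P the padding on each side, and S the stride. The helper below is a small sketch of that formula. Its name is hypothetical and it assumes symmetric padding.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension.

    Assumes the same zero-padding on both sides of the input.
    """
    span = input_size - filter_size + 2 * padding
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    return span // stride + 1


print(conv_output_size(32, 5))         # 28: no padding shrinks the map
print(conv_output_size(224, 3, 1, 1))  # 224: padding of 1 preserves the size
print(conv_output_size(224, 7, 2, 3))  # 112: stride 2 halves the size
```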

5. Using Error Level Analysis

Adarsh N and H.P. Mohan Kumar, in their paper titled “Error Level Analysis in Image Forgery Detection,” propose the use of a hybrid system that combines Error Level Analysis (ELA) with Convolutional Neural Networks (CNN), specifically utilizing the VGG-16 architecture, to identify digital forgeries. The method functions by intentionally reducing an image’s quality at varying compression levels, such as 10%, 50%, and 90%, to generate ELA images that highlight discrepancies in error levels caused by localized manipulations like splicing or retouching. These processed ELA images are then fed into a fine-tuned deep learning model to automate the classification of images as authentic or forged, a combination that reportedly increases validation accuracy by approximately 2.7% compared to standard methods.

6. Using Federated Multi-Modal Deepfake Detection and Cryptographic Evidence

Pranav Bhatnagar, in his paper titled “Federated Multi-Modal Deepfake Detection and Cryptographic Evidence Infrastructure for Law Enforcement,” proposes the use of a distributed network of Media Verification Nodes that combine multi-modal deepfake detection with a federated intelligence exchange. The system uniquely integrates automated visual, audio, and metadata forensic analysis, utilizing methods such as CNN-based GAN artifact detection and quantization inconsistency analysis, to generate a standardized risk score. To ensure judicial integrity, the architecture employs a national cryptographic registry using SHA-256 hashing and append-only databases to maintain an immutable chain-of-custody for digital evidence. This federated approach allows jurisdictions to broadcast novel anomaly signatures across the grid, enabling rapid, coordinated national defense against evolving synthetic media threats.
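One way to picture an append-only, SHA-256-backed chain of custody is a hash chain in which every entry commits to the one before it. The sketch below is an assumption about how such a log could work, not the registry design from the paper. The record fields, function names, and the all-zero genesis value are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, record):
    """SHA-256 over the previous entry's hash plus the serialized record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(log, record):
    """Add a record that is linked to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log):
    """Recompute every link; any edited record or broken link returns False."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


evidence = hashlib.sha256(b"stand-in media bytes").hexdigest()
log = []
append(log, {"evidence_sha256": evidence, "action": "ingested"})
append(log, {"evidence_sha256": evidence, "action": "analyzed"})
print(verify(log))  # True

log[0]["record"]["action"] = "edited"  # simulate tampering with history
print(verify(log))  # False
```

Because each hash depends on all earlier entries, rewriting any past record invalidates every later link, which is the property an immutable custody trail relies on.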

7. Using Vision Transformer and SVM

Mohamed Abdelmaksoud, Basheer Youssef, Khaled Wassif, and Reda A. El-Khoribi, in their paper titled “Hybrid framework for image forgery detection and robustness against adversarial attacks using vision transformer and SVM,” propose the use of a novel, two-phased deep learning approach that combines the feature extraction capabilities of a pre-trained Vision Transformer (ViT) with the classification efficiency of a Support Vector Machine (SVM). This hybrid framework is designed to detect passive image forgeries, specifically copy-move and splicing, while simultaneously enhancing resilience against malicious adversarial attacks, such as Patch-Fool perturbations, through an integrated adversarial training stage. To address the challenge of limited forensic training data, the method employs a comprehensive dataset preparation phase that merges five standard forensic benchmarks into a unified “design domain dictionary” expanded via both geometric data augmentation and adversarial samples. This methodology avoids the computational intensity of end-to-end transformer fine-tuning by freezing the pre-trained ViT weights, resulting in a system that is both computationally efficient and robust for real-world image forensic applications.

8. Using an Enhanced Dual-Branch Deep Learning Model for Image Forgery Classification

Rupali M. Bora and Mahesh R. Sanghavi, in their paper titled “Error Level Analysis (ELA)-Enhanced Dual-Branch Deep Learning Model for Image Forgery Classification and Binary Mask Generation,” propose the use of a novel dual-branch convolutional neural network (CNN) architecture that integrates handcrafted Error Level Analysis (ELA) features with raw RGB image data to simultaneously classify and localize image forgeries. The model consists of two specialized branches: a top branch that uses a pretrained ResNet50 backbone to extract high-level semantic and global structural features from raw RGB images, and a bottom branch that employs a custom lightweight CNN to capture fine-grained local texture anomalies from ELA-processed versions of the same images. By fusing these diverse feature sets, the framework can categorize images into three distinct classes, Authentic, Copy-Move, and Spliced, while concurrently generating a pixel-wise binary mask for forgery localization. A key advantage of this technique is its ability to learn these localization masks through weak supervision based on ELA artifacts, eliminating the need for manually labeled pixel-wise ground truth annotations during training.

9. Using an Adaptive Compression Factor

Abdulqadir Hamza, Mustapha Aminu Bagiwa and Salisu Aliyu, in their paper titled “An Adaptive Compression Factor Error Level Analysis for Image Forgery Classification,” propose the use of an adaptive compression factor within the Error Level Analysis (ELA) framework to improve forgery detection. Unlike traditional methods that rely on a static compression factor, this technique dynamically assigns a JPEG quality factor based on an image’s file size: smaller, previously compressed images receive a higher quality factor to preserve subtle traces, while larger images receive a lower quality factor to amplify compression artifacts. These generated ELA maps are then processed by a lightweight two-layer Convolutional Neural Network (CNN), which achieved a 96.6% accuracy on an expanded CASIA V2 dataset.
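The idea of tying the JPEG quality factor to file size can be sketched as a simple mapping. The size limits, the quality values, and the linear interpolation between them are all assumptions chosen for illustration. The summary above only states the direction of the relationship (smaller files get a higher quality factor, larger files a lower one), not the actual rule used in the paper.

```python
def adaptive_quality(file_size_bytes, small_limit=100_000, large_limit=1_000_000,
                     high_q=95, low_q=75):
    """Pick a JPEG quality factor for ELA from the image's file size.

    All thresholds and quality values are hypothetical defaults.
    """
    if file_size_bytes <= small_limit:
        return high_q  # small, already compressed: preserve subtle traces
    if file_size_bytes >= large_limit:
        return low_q   # large: compress harder to amplify artifacts
    # In between, slide linearly from high_q down to low_q (an assumption).
    fraction = (file_size_bytes - small_limit) / (large_limit - small_limit)
    return round(high_q - fraction * (high_q - low_q))


print(adaptive_quality(50_000))     # 95
print(adaptive_quality(550_000))    # 85
print(adaptive_quality(2_000_000))  # 75
```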

10. Using Biorthogonal Wavelet Transform (BWT) and Singular Value Decomposition

Syed Afnan Hashir, Dhruv Raj Kashyap, Shivansh Tripathi, and Bansidhar Joshi, in their paper titled “An Effective Approach for Image-Based Forgery Detection,” propose the use of a hybrid forensic framework that combines Biorthogonal Wavelet Transform (BWT) with Singular Value Decomposition (SVD) for feature extraction, followed by an Improved Relevance Vector Machine (IRVM) for classification. By dividing images into overlapping blocks and applying BWT-SVD, the method extracts robust feature vectors that are then processed by the IRVM to distinguish between authentic and tampered images. This approach specifically targets copy-move forgery detection (CMFD), leveraging the dimensionality reduction of SVD and the classification efficiency of the IRVM to improve accuracy and reliability when tested against standard datasets like CoMoFoD.

11. Using Fractional Beta Chaotic Maps

Rabha Ibrahim, Hayder Natiq, Ahmed Alkhayyat, Alaa Kadhim Farhan, Nadia Al-Saidi and Dumitru Baleanu, in their paper titled “Image Encryption Algorithm Based on New Fractional Beta Chaotic Maps,” propose the use of an encryption technique that utilizes fractional beta chaotic maps to generate highly complex pseudo-random sequences for shuffling and altering image pixels. This method specifically aims to enhance security against decryption attacks by blurring the correlation between the original and encrypted images while maintaining a sophisticated level of randomness and low correlation coefficients. Through the application of these fractional-order chaotic maps, the technique offers an improved resistance to various cryptanalytic threats, as demonstrated by superior entropy and signal-to-noise ratio performance in experimental assessments.

12. Using a 10-layer Convolutional Neural Network (CNN)

Mamdouh Gomaa, Eman Mohamed, Alaa Zaki, and Alaa Elnashar, in their paper titled “Deep Learning to Detect Image Forgery Based on Image Classification,” propose the use of a unique method for image forgery detection that integrates a custom 10-layer Convolutional Neural Network (CNN) with traditional machine learning classifiers. The technique involves three fundamental steps: feature learning through mask extraction and patch sampling, feature extraction using a pre-trained version of their CNN, and final classification using Support Vector Machine (SVM) or K-Nearest Neighbor (KNN). Unlike typical CNN architectures that utilize multiple fully connected layers, this model employs only a single fully connected layer to mitigate overfitting, especially when working with smaller training datasets. By training on labeled patches drawn from the boundaries of cloned areas, the model effectively learns hierarchical representations of local artifacts caused by tampering, ultimately achieving high detection accuracy on benchmark datasets like CASIA and Columbia.
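The final classification stage, where CNN features are handed to a traditional classifier, can be illustrated with a bare-bones K-Nearest Neighbor vote. The two-dimensional feature vectors below are toy stand-ins invented for the example, since real CNN features would have many more dimensions. The function name and the choice of k = 3 are likewise assumptions.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Label a feature vector by majority vote among its k nearest neighbours."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy (feature vector, label) pairs standing in for CNN-extracted features.
train = [
    ((0.10, 0.20), "authentic"),
    ((0.20, 0.10), "authentic"),
    ((0.15, 0.25), "authentic"),
    ((0.90, 0.80), "forged"),
    ((0.80, 0.90), "forged"),
    ((0.85, 0.75), "forged"),
]

print(knn_predict(train, (0.82, 0.88)))  # forged
print(knn_predict(train, (0.12, 0.18)))  # authentic
```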

13. Using Deep Learning with Block and Keypoint Methods

Fatemeh Zare Mehrjardi and Mohsen Sardari Zarchi, in their paper titled “A hybrid model for image forgery detection using deep learning with block and keypoint methods,” propose the use of a multi-stage framework that integrates a triple-architecture ensemble of deep learning networks with enhanced block-based and keypoint-based techniques. The method begins by employing an ensemble of deep learning models to identify forged images and generate localized heatmaps of suspected regions; this is followed by a block-based analysis optimized via a Genetic Algorithm (GA), which uses a fitness function based on the maximum number of matched keypoints between candidate blocks to precisely isolate tampered areas. By combining the global feature extraction of neural networks with the granular accuracy of block and keypoint matching, the approach achieves high performance in detecting both image-level and pixel-level manipulations.

Featured Image Source: ResearchGate.