Method | VIPP | IMD2020 | DSO-1 | OpenForensics | FaceSwap | Coverage | NC16 | Columbia | CASIA
---|---|---|---|---|---|---|---|---|---
ADQ1 | 0.50 | 0.29 | 0.42 | 0.48 | 0.28 | 0.21 | 0.21 | 0.40 | 0.49
ADQ2 | 0.57 | 0.45 | 0.53 | 0.68 | 0.43 | - | - | - | -
BLK | 0.43 | 0.26 | 0.46 | 0.26 | 0.11 | 0.24 | 0.23 | - | -
CAGI | 0.44 | 0.30 | 0.51 | 0.29 | 0.18 | 0.30 | 0.29 | - | -
DCT | 0.43 | 0.31 | 0.35 | 0.42 | 0.19 | 0.22 | 0.18 | - | -
Comprint | 0.50 | 0.30 | 0.76 | 0.63 | 0.35 | 0.35 | 0.40 | - | -
Noiseprint | 0.56 | 0.40 | 0.81 | 0.67 | 0.35 | 0.33 | 0.41 | 0.84 | 0.21
Comprint+Noiseprint | 0.58 | 0.44 | 0.81 | 0.71 | 0.41 | 0.37 | 0.44 | - | -
CAT-Net | 0.72 | 0.85 | 0.68 | 0.95 | 0.45 | 0.57 | 0.49 | 0.92 | 0.85
TruFor | 0.75 | - | 0.97 | 0.90 | - | 0.74 | 0.47 | 0.91 | 0.82
FusionIDLab | 0.73 | - | 0.75 | - | - | 0.54 | 0.51 | - | -
This table reports performance per dataset as the F1 score, which balances precision and recall. A higher score is better, and a perfect score would be 1.0.
In general, newer AI-based methods (at the bottom of the table, e.g., CAT-Net, TruFor & FusionIDLab) demonstrate better performance than older methods (at the top of the table, e.g., BLK, CAGI & DCT). However, there are large performance variations across datasets. Thus, in practice, no method is perfect, and there is still a lot of room for improvement.
Source: the Comprint, TruFor, and FusionIDLab papers.
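For reference, the F1 score is the harmonic mean of precision and recall. A minimal pixel-level computation for binary localization masks could look like the following sketch (the exact evaluation protocols in the papers above, e.g. heatmap thresholding or best-threshold F1, may differ):

```python
import numpy as np

def f1_score(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Pixel-level F1 between a binary predicted mask and a binary
    ground-truth mask (True/1 = manipulated pixel)."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / pred.sum() if pred.any() else 0.0
    recall = tp / gt.sum() if gt.any() else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```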
ADQ1 (2009) is based on Aligned Double Quantization detection, using the DCT coefficient distribution.
ADQ2 (2011) is based on Aligned Double Quantization detection, and first estimates the quantization table of the previous compression. Works for JPEG files only.
ADQ3 (2014) is based on Aligned Double Quantization detection, and works using an SVM on the distribution of DCT coefficients (single vs. double compression). Works for JPEG files only.
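For intuition on the ADQ family: quantizing the DCT coefficients twice with different step sizes leaves periodic peaks and gaps in their histograms. The sketch below (function and parameter names are my own) extracts such a histogram from a decoded grayscale image; the actual detectors model these histograms statistically and per region:

```python
import numpy as np
from scipy.fft import dctn

def dct_coeff_histogram(gray: np.ndarray, coeff=(0, 1), max_val=50):
    """Histogram of one DCT coefficient over all 8x8 blocks. After
    double JPEG compression with different quality factors, this
    histogram shows the periodic pattern that ADQ-style methods exploit."""
    h, w = gray.shape
    values = []
    for y in range(0, h - 7, 8):
        for x in range(0, w - 7, 8):
            block = gray[y:y + 8, x:x + 8].astype(float) - 128.0
            coeffs = dctn(block, norm='ortho')  # 2D DCT-II over the block
            values.append(int(round(coeffs[coeff])))
    # Integer-centered bins from -max_val to +max_val.
    return np.histogram(values, bins=np.arange(-max_val, max_val + 2) - 0.5)
```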
BLK (2008) looks for mismatches in the block artifact grid of JPEG compression.
CAGI (2018) stands for Content-Aware detection of Grid Inconsistencies. It looks for mismatches in the blocking artifacts of JPEG compression and applies content-aware filtering of false activations.
DCT (2007) looks for inconsistencies in JPEG blocking artifacts, using a quantization table estimated from the power spectrum of the DCT coefficient histogram.
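The cue these blocking-artifact methods share is that JPEG's 8x8 grid leaves slightly stronger pixel discontinuities at block boundaries. A crude, global version of that cue (my own simplification; BLK, CAGI and DCT compute local maps and more robust statistics) could be sketched as:

```python
import numpy as np

def grid_phase_energy(gray: np.ndarray, period: int = 8):
    """Mean absolute second difference per horizontal/vertical offset
    modulo 8. On a JPEG image, one phase (the block boundaries) stands
    out; a region pasted off-grid shifts that phase locally."""
    gray = gray.astype(float)
    d2h = np.abs(2 * gray[:, 1:-1] - gray[:, :-2] - gray[:, 2:])
    d2v = np.abs(2 * gray[1:-1, :] - gray[:-2, :] - gray[2:, :])
    phase_h = np.array([d2h[:, i::period].mean() for i in range(period)])
    phase_v = np.array([d2v[i::period, :].mean() for i in range(period)])
    return phase_h, phase_v
```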
Comprint (2022) is an image manipulation detection and localization method that uses the comprint, a compression fingerprint representing the JPEG compression artifacts.
Noiseprint (2019) is an image manipulation detection and localization method that uses the noiseprint, a camera model fingerprint representing the image acquisition artifacts.
Comprint+Noiseprint (2022) combines the fingerprints of Comprint and Noiseprint (see above), and generates a combined heatmap.
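As a simplified illustration of fusing two fingerprint-based heatmaps into one (a naive sketch; the actual Comprint+Noiseprint fusion may differ):

```python
import numpy as np

def fuse_heatmaps(h1: np.ndarray, h2: np.ndarray) -> np.ndarray:
    """Naively fuse two localization heatmaps: min-max normalize
    each to [0, 1], then average them."""
    def norm(h):
        lo, hi = float(h.min()), float(h.max())
        return (h - lo) / (hi - lo) if hi > lo else np.zeros_like(h, dtype=float)
    return 0.5 * (norm(h1) + norm(h2))
```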
CAT-Net (v2, 2022) is an image manipulation detection and localization method that jointly uses image acquisition artifacts and compression artifacts. CAT-Net stands for Compression Artifact Tracing Network. It significantly outperforms traditional and deep neural network-based methods in detecting and localizing tampered regions.
TruFor (2023) is a forensic framework that can be applied to a large variety of image manipulation methods, from classic cheapfakes to more recent manipulations based on deep learning. It is based on both high-level (RGB) and low-level (Noiseprint++) features.
Noiseprint++ is a learned noise residual and an improvement of the earlier Noiseprint (see above). It is a fingerprint that captures traces related to both the camera model and the editing history of the image. Inconsistencies between authentic and tampered regions may become visible in the Noiseprint++.
To reduce the impact of false alarms, TruFor additionally estimates a confidence map. Errors in the anomaly map are corrected by the confidence map, drastically improving the final detection score. In the confidence map, dark areas signify low confidence and bright areas signify high confidence.
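To illustrate the role of the confidence map (this is not TruFor's exact, learned pooling): low-confidence pixels can be down-weighted when turning the anomaly map into a single image-level detection score:

```python
import numpy as np

def confidence_weighted_score(anomaly: np.ndarray, confidence: np.ndarray,
                              eps: float = 1e-8) -> float:
    """Pool a pixel-wise anomaly map into one detection score,
    weighting each pixel by its confidence so that low-confidence
    false alarms contribute less. Illustrative only."""
    weights = confidence / (confidence.sum() + eps)
    return float((anomaly * weights).sum())
```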
FOCAL (2023) stands for FOrensic ContrAstive cLustering. Specifically, FOCAL extracts features from an image (trained using contrastive learning) and then clusters them in an unsupervised way (hence not inheriting bias from the training set). Additionally, detection performance is boosted by fusing two versions of FOCAL (i.e., combining ViT and HRNet). FOCAL demonstrated significantly better performance than state-of-the-art methods in 2023.
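A toy version of FOCAL's clustering step could look as follows, with scikit-learn's k-means as a stand-in (FOCAL's own clustering algorithm, and the minority-cluster heuristic below, are assumptions of this sketch):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_to_mask(features: np.ndarray) -> np.ndarray:
    """Cluster per-pixel forensic features (H x W x C) into two groups
    without supervision, returning a binary mask where the minority
    cluster is treated as the suspected forged region (a heuristic)."""
    h, w, c = features.shape
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(features.reshape(-1, c))
    mask = labels.reshape(h, w)
    if (mask == 1).sum() > mask.size / 2:
        mask = 1 - mask  # flip so that 1 marks the minority cluster
    return mask
```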
Note: the FOCAL method may give incorrect results when a bug on the server disables the GPU. This is reported as a warning in the log ("Warning: FOCAL is run on CPU, which may lead to incorrect results."), which you can see by clicking 'Show more info' at the top of a result page. Incorrect FOCAL heatmaps are recognizable by the right and bottom borders of the image being highlighted in red, while the rest of the heatmap is blue.
FusionIDLab (2023) combines the outputs of ADQ1, BLK, CAGI, DCT, Comprint, Noiseprint, Comprint+Noiseprint, and CAT-Net. By combining these methods into a single heatmap, it may be easier to draw conclusions. The fusion is learned with a machine-learning approach based on the Pix2Pix architecture.
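As a sketch of how such a fusion network might consume the individual outputs (the channel stacking and normalization here are assumptions, not the documented FusionIDLab pipeline):

```python
import numpy as np

def stack_heatmaps(heatmaps: list) -> np.ndarray:
    """Stack the individual methods' heatmaps (each H x W) as input
    channels for an image-to-image (Pix2Pix-style) fusion network."""
    normed = []
    for h in heatmaps:
        rng = float(h.max() - h.min())
        normed.append((h - h.min()) / rng if rng > 0
                      else np.zeros_like(h, dtype=float))
    return np.stack(normed, axis=-1)  # shape: (H, W, num_methods)
```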