Tutorial

Heatmap

The output of the image manipulation detection methods is a heatmap for forgery localization. The heatmap uses colors ranging from blue over green and orange to red, as shown in the legend below.


A red color means that the method found inconsistent traces in this area, compared to the blue area. In other words, red is an indication of manipulation in that area. Each method may rely on a different type of manipulation trace (for example, compression artifacts or camera noise artifacts).

The colors in between, such as green and orange, also indicate inconsistencies, yet to a lesser extent. In other words, they are a weak indication of manipulation.

Sometimes, the blue and red areas are swapped - i.e., the manipulated region is blue, and the real region is red. In general, this simply means that the blue and red areas are inconsistent with each other. The human interpreter has to decide which region is real and which is fake.
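To make the color coding concrete, the sketch below maps per-pixel scores in [0, 1] to a rough blue-to-red ramp and blends the result over the photo. All array names and the ramp formula are illustrative assumptions; the tool's exact colormap may differ.

```python
import numpy as np

def blue_to_red(score):
    """Rough blue -> green -> red ramp (illustrative, not the exact legend)."""
    score = np.asarray(score, dtype=float)
    r = np.clip(1.5 * score - 0.5, 0.0, 1.0)
    g = 1.0 - np.abs(2.0 * score - 1.0)
    b = np.clip(1.0 - 1.5 * score, 0.0, 1.0)
    return np.stack([r, g, b], axis=-1)

# Blend the colored heatmap over the photo (stand-in arrays).
image = np.random.rand(64, 64, 3)      # the photo, RGB in [0, 1]
heatmap = np.random.rand(64, 64)       # per-pixel forgery scores in [0, 1]
overlay = 0.5 * image + 0.5 * blue_to_red(heatmap)
```

A score of 0 maps to blue (no inconsistency found) and a score of 1 to red (strong inconsistency), matching how the legend reads.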


Fingerprint

Some methods, such as Comprint, Noiseprint, and TruFor, also output a fingerprint (in addition to the heatmap). This fingerprint is an intermediate result that was used to create the final heatmap. It can give the human interpreter extra insights as well.

In general, areas with a different fingerprint pattern showcase inconsistent traces, which suggests manipulation. Each method uses a different type of trace to create its fingerprint. For example, Comprint uses compression artifacts, Noiseprint uses camera noise artifacts, and TruFor uses both of those artifact types.

When the fingerprint shows no pattern but rather a smooth color, this is typically because the corresponding area in the image under investigation is an (overexposed) area with a smooth color. Since there is little to no detail in this area, we also cannot detect any traces to be used for manipulation detection.


Confidence map

TruFor also outputs a confidence map, whereas the other methods do not have this functionality. This confidence map helps to interpret how certain the method is about its heatmap prediction.


As can be seen in the legend above, a white or lighter color means that the method is more certain or confident, whereas a black or darker color means that the method is not certain. It is crucial to consult the confidence map when interpreting the corresponding heatmap: if the confidence is low (dark), we cannot draw any notable conclusions.
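This principle can be sketched by suppressing heatmap regions whose confidence falls below a cutoff before interpreting them. The array names and the 0.5 threshold are assumptions chosen for illustration only:

```python
import numpy as np

heatmap = np.random.rand(64, 64)        # forgery localization in [0, 1]
confidence = np.random.rand(64, 64)     # TruFor-style confidence in [0, 1]

CONF_CUTOFF = 0.5                       # arbitrary threshold for this sketch
reliable = confidence >= CONF_CUTOFF
# Zero out detections in low-confidence (dark) regions.
masked_heatmap = np.where(reliable, heatmap, 0.0)
```

Only the regions that survive this masking should feed into any conclusion about manipulation.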


Forgery score

TruFor also outputs a forgery score, whereas the other methods do not have this functionality. This forgery score takes into account both the entire heatmap and the entire confidence map. A score close to 1.0 means that the method is more confident that a manipulation occurred, whereas a score close to 0.0 means it cannot (confidently) detect a manipulation.
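The idea of condensing both maps into a single number can be sketched as a confidence-weighted average of the heatmap. This formula is an illustrative assumption: TruFor computes its real score with a learned pooling network, not this simple weighted mean.

```python
import numpy as np

def global_forgery_score(heatmap, confidence, eps=1e-8):
    """Confidence-weighted mean of the heatmap.

    Illustrative only: TruFor derives its actual score with a trained
    pooling network, not this simple weighted average.
    """
    weights = confidence + eps             # avoid division by zero
    return float((heatmap * weights).sum() / weights.sum())

# A heatmap that confidently flags everything yields a score near 1.0.
score = global_forgery_score(np.ones((8, 8)), np.ones((8, 8)))
```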


Performance of methods

Not all methods perform equally well. An overview of all methods and their performance can be found here. The best-performing methods are typically the newest AI-based ones, such as CATNet, TruFor, and FusionIDLab, although their performance varies. No method is perfect. In general, however, the power lies in the combination of multiple (complementary) methods. When multiple methods agree, you can be more certain of their detection. When they disagree, we should be more careful before drawing any conclusions.
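A minimal way to exploit this complementarity is a pixel-wise agreement check across methods. This is a naive sketch with assumed array names and thresholds; real fusion approaches such as FusionIDLab use a trained model rather than a simple vote:

```python
import numpy as np

# Hypothetical heatmaps from three methods, aligned and normalized to [0, 1].
heatmaps = [np.random.rand(64, 64) for _ in range(3)]

FLAG_LEVEL = 0.5                         # per-method "suspicious" cutoff
votes = np.sum([h > FLAG_LEVEL for h in heatmaps], axis=0)
consensus = votes >= 2                   # at least 2 of 3 methods agree
```

Regions where most methods agree warrant more confidence than regions flagged by a single method.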


Loss of performance due to compression

Most methods will be able to detect manipulations right after the manipulated image comes out of the editing software. In practice, however, we often find these images on social media. That means they have been compressed to a lower quality, or perhaps even underwent further editing (such as resizing). As a result, many of the traces used to detect manipulations are destroyed. This makes the detection of manipulated images very challenging in practice.
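This effect can be reproduced by re-encoding a manipulated image at a lower JPEG quality before running detection. The quality value of 60 is an arbitrary stand-in; each social media platform uses its own compression settings:

```python
import io

import numpy as np
from PIL import Image

# Stand-in for a manipulated image fresh out of the editing software.
pristine = Image.fromarray(
    (np.random.rand(64, 64, 3) * 255).astype("uint8"), mode="RGB"
)

# Simulate a social-media share: lossy JPEG re-encode at low quality.
buffer = io.BytesIO()
pristine.save(buffer, format="JPEG", quality=60)
buffer.seek(0)
recompressed = Image.open(buffer)
```

Running a detection method on `recompressed` instead of `pristine` typically shows how much of the forensic signal the extra compression destroys.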


Featured Examples

Several examples of heatmaps and their interpretation can be found on the COM-PRESS Examples page. A few of these are featured below.

Correct detection

The image below is an example heatmap using TruFor, overlaid on an image in which the face was manipulated. One can see that the face is highlighted in red, which correctly suggests that it was manipulated.

Forgery analysis using TruFor heatmap.

TruFor also outputs a fingerprint, which it internally used as an intermediate step. When closely inspecting the face region, one can indeed see that the fingerprint in that region shows a different pattern than the rest of the image.

Forgery analysis using TruFor fingerprint.

TruFor also outputs a confidence map, which indicates how confident it is about the heatmap detection in each region. In the confidence map, the facial region is white, which means TruFor is confident about flagging this area as manipulated. The borders of the face are darker, which means TruFor is not certain about that area.

Forgery analysis using TruFor confidence map.

TruFor also outputs a forgery score. For this image, the forgery score is 0.988. Since this value is very close to 1, the method is confident that it detected a manipulation.


Incorrect detection

The image below is an example heatmap using ADQ1, overlaid on an image in which the samurai statue was manipulated. One can see that almost the entire image region is highlighted, except for some areas with smooth colors. In general, it is very difficult to detect manipulations in such (overexposed) areas with smooth colors, because they typically do not contain any traces that can be used to detect forgeries. Therefore, from this heatmap, we cannot draw any conclusions about manipulation. Note that we can draw conclusions using multiple other methods, as shown in the examples below.

Forgery analysis using ADQ1 heatmap.

The power of multiple (complementary) methods

In the example above, the ADQ1 heatmap showed an incorrect detection. However, other methods do correctly detect the samurai statue as fake, such as Noiseprint and CATNet in the examples below, or Comprint in the swapped-detection example further below.

Forgery analysis using Noiseprint heatmap.
Forgery analysis using CATNet heatmap.

Combining multiple methods is also done automatically by the FusionIDLab method (shown below). Using AI, FusionIDLab automatically combines the heatmaps of ADQ1, BLK, CAGI, DCT, Comprint, Noiseprint, Comprint+Noiseprint, and CATNet into a single heatmap. As such, it harnesses the power of multiple (complementary) methods.

Forgery analysis using FusionIDLab heatmap.

Swapped detection

The image below is an example heatmap using Comprint, overlaid on an image in which the samurai statue was manipulated. One can see that the samurai statue is blue, whereas the rest of the image is highlighted in red. This is because the Comprint method does not know which part is fake - it only knows that the samurai region shows traces inconsistent with the rest of the image. The human interpreter has to decide which area is real and which is fake.

Forgery analysis using Comprint heatmap.

Comprint also outputs a fingerprint, which it internally used as an intermediate step. We can clearly observe that the samurai area has a fingerprint pattern inconsistent with the rest of the image, correctly suggesting that it was manipulated.

We also notice that the sky, an overexposed area with a smooth color, does not contain any fingerprint. This area does not contain any traces that can be used for forgery detection. Hence, we cannot draw any manipulation conclusions about the sky.

Forgery analysis using Comprint fingerprint.

Unconfident detection

The image below is an example heatmap using TruFor, overlaid on an image in which only the face on the left was manipulated (and not the other faces in the image). In the heatmap, not only the face of the man on the left is highlighted, but also all the other faces, as well as the arm of the second man from the left and the hand holding the cup.

Forgery analysis using TruFor heatmap.

In the corresponding confidence map, we can see that the face on the left is white (i.e., a confident detection), which indicates that it was indeed manipulated. In contrast, the faces of the other people, the arm of the second man from the left, and the hand holding the cup are darker in the confidence map (i.e., uncertain detections). In other words, even though these areas were (incorrectly) highlighted in red in the heatmap, the corresponding detections are not confident, and hence we should not draw conclusions about those areas anyway.

Forgery analysis using TruFor confidence.

TruFor also outputs a forgery score. For this image, the forgery score is 0.988. Since this value is very close to 1, the method is confident that it detected a manipulation.


Loss of performance due to compression

The image below is an example heatmap using ADQ1, overlaid on an image in which the cat walking on the sidewalk in the background was artificially added using the Generative Fill feature in Adobe Photoshop. The image comes straight out of Photoshop. In the heatmap, the cat is correctly highlighted in red. Most other detection methods also detect this manipulation with ease.

Forgery analysis using ADQ1 heatmap, on edited image coming straight out of Adobe Photoshop.

When the same manipulated image is shared on social media, it is compressed to a lower quality. In the example below, the same manipulated image was transferred (and compressed) on Telegram. This makes the detection much more challenging, as can be seen in the ADQ1 heatmap below. The other methods can no longer detect the manipulation after this compression either.

Forgery analysis using ADQ1 heatmap, on edited image and additionally shared (and compressed) on Telegram.
COM-PRESS

Combating disinformation by equipping journalists with new image manipulation insights and detection methods.

© 2022-2024 Copyright IDLab-MEDIA