The ability to take one person’s face or expression and superimpose it onto a video of another person has recently become possible. In particular, pornographic videos called “deepfakes” have emerged on websites such as Reddit and 4Chan showing famous individuals’ faces superimposed onto the bodies of actors.
This phenomenon has significant implications. At the very least, it has the potential to undermine the reputation of people who are victims of this kind of forgery. It poses problems for biometric ID systems. And it threatens to undermine public trust in videos of any kind.
So a quick and accurate way to spot these videos is desperately needed.
Enter Andreas Rossler at the Technical University of Munich in Germany and colleagues, who have developed a deep-learning system that can automatically spot face-swap videos. The new technique could help identify forged videos as they are posted to the web.
But the work also has a sting in the tail. The same deep-learning technique that can spot face-swap videos can also be used to improve the quality of face swaps in the first place—and that could make them harder to detect.
The new technique relies on a deep-learning algorithm that Rossler and co have trained to spot face swaps. These algorithms can only learn from huge annotated data sets of good examples, which simply have not existed until now.
So the team began by creating a large data set of face-swap videos and their originals. They used two types of face swaps that can easily be made with software called Face2Face. (This software was created by some members of the same team.)
The first type of face swap superimposes one person’s face on another’s body so that it takes on their expressions. The second takes the expressions from one face and modifies a second face to show them.
The team has done this with over 1,000 videos, creating a database of about half a million images in which the faces have been manipulated with state-of-the-art face-editing software. They call this the FaceForensics database.
The size of this database is a significant improvement over what had been previously available. “We introduce a novel data set of manipulated videos that exceeds all existing publicly available forensic data sets by orders of magnitude,” say Rossler and co.
Next, the team uses the database to train a deep-learning algorithm to recognize the difference between face swaps and their unadulterated originals. They call the resulting algorithm XceptionNet.
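The core workflow here is supervised binary classification: show the model labeled examples of real and manipulated faces, and let it learn to separate the two. XceptionNet itself is a deep convolutional network trained on raw pixels; as a minimal stand-in, the sketch below runs the same train-then-evaluate loop with a toy logistic classifier on a single hypothetical per-image feature (imagine, say, a manipulation-artifact score). The feature, its distributions, and all numbers are invented for illustration and are not from the paper.

```python
import math
import random

random.seed(0)

def make_dataset(n):
    """Synthetic stand-in for FaceForensics frames: each sample is a
    (feature, label) pair, label 0 = real, label 1 = face-swapped.
    Fakes are assumed to have a higher artifact score on average."""
    data = []
    for _ in range(n):
        if random.random() < 0.5:
            data.append((random.gauss(0.3, 0.1), 0))  # real frame
        else:
            data.append((random.gauss(0.7, 0.1), 1))  # manipulated frame
    return data

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=200, lr=0.5):
    """Fit a one-feature logistic classifier by stochastic gradient
    descent on the log loss -- the same learning recipe, minus the
    deep network."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(w * x + b)
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

def accuracy(data, w, b):
    hits = sum(1 for x, y in data if (sigmoid(w * x + b) > 0.5) == (y == 1))
    return hits / len(data)

train_set, test_set = make_dataset(500), make_dataset(200)
w, b = train(train_set)
print("test accuracy:", round(accuracy(test_set, w, b), 2))
```

Because the two classes are well separated in this toy feature, the classifier reaches high held-out accuracy; the hard part in practice is that real detectors must learn such separating features from pixels, under compression.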
Finally, they compare the new approach to other forgery detection techniques.
The results are impressive. XceptionNet clearly outperforms other techniques in spotting videos that have been manipulated, even when the videos have been compressed, which makes the task significantly harder. “We set a strong baseline of results for detecting a facial manipulation with modern deep-learning architectures,” say Rossler and co.
That should make it easier to spot forged videos as they are uploaded to the web. But the team is well aware of the cat-and-mouse nature of forgery detection: as soon as a new detection technique emerges, the race begins to find a way to fool it.
Rossler and co have a natural head start since they developed XceptionNet. So they use it to spot the telltale signs that a video has been manipulated and then use this information to refine the forgery, making it even harder to detect.
It turns out that this process improves the visual quality of the forgery but does not have much effect on XceptionNet’s ability to detect it. “Our refiner mainly improves visual quality, but it only slightly encumbers forgery detection for deep-learning method trained exactly on the forged output data,” they say.
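The refinement idea is adversarial: treat the detector's output as a signal and nudge the forgery in the direction that lowers its "fake" score. The sketch below does this for a fixed, hypothetical one-feature detector (the weights are invented, not from the paper), stepping the fake sample down the gradient of the detector's probability. As the paper notes, this mainly helps against the fixed detector; a detector retrained on the refined output still catches it.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical frozen detector: higher feature value -> more likely fake.
# In the real system this role is played by the trained XceptionNet.
w, b = 8.0, -4.0

def refine(x, steps=50, lr=0.05):
    """Gradient-based refinement: repeatedly move the forgery's feature
    down the slope of the detector's fake-probability, mimicking the use
    of the detector's telltale signals to polish the forgery."""
    for _ in range(steps):
        p = sigmoid(w * x + b)        # detector's fake-probability
        x -= lr * p * (1 - p) * w     # d(p)/d(x) for a logistic detector
    return x

fake = 0.8
refined = refine(fake)
print("before:", round(sigmoid(w * fake + b), 3),
      "after:", round(sigmoid(w * refined + b), 3))
```

The refined sample now scores below the fixed detector's 0.5 decision threshold, which is exactly why detection becomes a cat-and-mouse game: the defense is to retrain on the attacker's refined output.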
That’s interesting work since it introduces an entirely new way of improving the process of image manipulation. “We believe that this interplay between tampering and detection is an extremely exciting avenue for follow-up work,” they say.
Ref: arxiv.org/abs/1803.09179: FaceForensics: A Large-scale Video Data Set for Forgery Detection in Human Faces