Being products of deep learning and generative neural network architectures, deepfakes are far more sophisticated than traditional fake images or videos – and often impossible to identify. The resulting harm can be far-reaching, and social-media platforms are gradually waking up to the threat of such refined fakes spreading doctored information to manipulate public opinion on a massive scale. In search of remedial action, Facebook is taking decisive steps – one of which is its recent release of the largest ever dataset of deepfake videos. Comprising over 100,000 clips, the dataset covers the widest available range of face-swapping techniques, which Facebook hopes will serve as training data for developing AI tools to identify deepfakes.
In a parallel development, Facebook organised the Deepfake Detection Challenge, inviting deepfake-detection models trained on its released dataset. The response was overwhelming, with 2,114 participants submitting 35,000 models.
While one might expect such models to use forensic methods – for example, tracing the residual digital trail left by the deepfake creation process – it was interesting to note that the top five entries in the Challenge took a different approach. All five aimed to identify whether anything in the fakes appeared abnormal – much as we humans would when comparing pictures for authenticity.
The winning model, developed by machine-learning engineer Selim Seferbekov, identified deepfakes with 65% accuracy. It was tested on 10,000 unseen video clips, a mix of new and existing footage. Hurdles built into the testing process included confusing video content – such as individuals receiving a makeup overhaul that altered the basic look of the on-screen characters. Other test diversions included changes to video orientation, resolution and playback speed, and even obscuring on-screen faces with text boxes and similar superimposed shapes.
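The kinds of test perturbations described above can be sketched with simple array operations. The helper functions below are illustrative assumptions of this author, not the challenge's actual evaluation code: each treats a video as a NumPy array of frames and applies one distractor (rotated orientation, reduced resolution, altered playback speed).

```python
import numpy as np

def rotate_frames(frames: np.ndarray) -> np.ndarray:
    """Rotate every frame 90 degrees (orientation change)."""
    return np.rot90(frames, k=1, axes=(1, 2))

def downscale_frames(frames: np.ndarray, factor: int = 2) -> np.ndarray:
    """Crudely reduce resolution by keeping every `factor`-th pixel."""
    return frames[:, ::factor, ::factor]

def change_speed(frames: np.ndarray, factor: int = 2) -> np.ndarray:
    """Simulate faster playback by keeping every `factor`-th frame."""
    return frames[::factor]

# A toy "video": 8 frames of 64x64 grayscale noise.
video = np.random.rand(8, 64, 64)
print(rotate_frames(video).shape)     # (8, 64, 64) - square frames keep their shape
print(downscale_frames(video).shape)  # (8, 32, 32)
print(change_speed(video).shape)      # (4, 64, 64)
```

A robust detector has to stay accurate under all such transformations, which is precisely what made the hidden test set hard.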
It was observed that all the winners had used EfficientNets – a Google innovation built on convolutional neural networks (CNNs). These networks are efficient at recognition and identification tasks, and allow structured fine-tuning to enhance accuracy.
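The "structured fine-tuning" behind EfficientNets rests on compound scaling: the network's depth, width and input resolution grow together by fixed coefficients raised to a single exponent φ. The sketch below illustrates that rule using the α = 1.2, β = 1.1, γ = 1.15 coefficients reported in the original EfficientNet paper; it is a toy illustration, not detection code.

```python
# Compound scaling rule from the EfficientNet paper (Tan & Le, 2019):
# depth scales by ALPHA**phi, width by BETA**phi, resolution by GAMMA**phi,
# with the coefficients chosen so ALPHA * BETA**2 * GAMMA**2 ~= 2
# (i.e. compute cost roughly doubles with each step of phi).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi: int) -> dict:
    """Return the depth/width/resolution multipliers for exponent phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }

# phi = 0 is the EfficientNet-B0 baseline; larger phi gives B1, B2, ...
for phi in range(3):
    s = compound_scale(phi)
    print(f"phi={phi}: depth x{s['depth']:.2f}, "
          f"width x{s['width']:.2f}, resolution x{s['resolution']:.2f}")
```

This single-knob scaling is what lets practitioners pick a network size matched to their compute budget without redesigning the architecture.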
Seferbekov knows well that 65% accuracy is not yet good enough for any real-world use. He is looking to improve his model by training it to identify the subtle transitions between frames of a deepfake video. Although the human eye notices such minute shifts instinctively, an automated system would require extremely powerful computing support. Facebook believes accuracy could also be improved by focusing on context analysis – rather than directly on the video content.
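The frame-transition idea can be illustrated with a toy detector: measure the mean absolute pixel change between consecutive frames and flag any transition that deviates sharply from the video's typical motion. This is a simplified sketch of the general idea, not Seferbekov's actual model.

```python
import numpy as np

def flag_abrupt_transitions(frames: np.ndarray, z_thresh: float = 3.0) -> list:
    """Return indices of transitions whose inter-frame change is anomalous.

    frames: array of shape (num_frames, height, width).
    Transition i covers frames[i] -> frames[i + 1].
    """
    # Mean absolute pixel difference between each pair of consecutive frames.
    diffs = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))
    mu, sigma = diffs.mean(), diffs.std()
    if sigma == 0:
        return []
    # Flag transitions more than z_thresh standard deviations above typical.
    return [int(i) for i in np.where((diffs - mu) / sigma > z_thresh)[0]]

# Toy video: smooth random drift, with one artificial glitch injected.
rng = np.random.default_rng(0)
video = np.cumsum(rng.normal(0, 0.01, (20, 32, 32)), axis=0)
video[10:] += 5.0  # abrupt, deepfake-like discontinuity at transition 9
print(flag_abrupt_transitions(video))  # [9]
```

A real system would compare learned face features rather than raw pixels, and at video scale that is exactly the kind of workload that demands the heavy computing support mentioned above.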
Deepfakes are still an emerging threat – but they could become unmanageable if countermeasures are not explored now. With the technology growing more refined each day, human visual acuity will soon be unable to differentiate between deepfakes and originals. At that point, only AI-based solutions will be smart enough to call out fakes built on deep learning and neural networks.