Sensitive online documents, such as certificates that vouch for banking sites, bear “digital fingerprints” that identify them without revealing their contents. The fingerprints are produced from the documents’ contents by algorithms that are supposed to be irreversible and to make it impractical to find two documents with the same fingerprint. But recently, older varieties of the algorithms have been weakened. The venerable MD5, for example, has been broken, making it easy to introduce a forgery. Marc Stevens, a PhD student in cryptology at the Centrum Wiskunde & Informatica in Amsterdam, the Netherlands, has created a series of demonstrations of how MD5 can fail. One is shown here: though the two faces are different, their digital fingerprints are the same. This is a harmless example, but it has serious implications for digital forensics.
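To make the idea of a fingerprint concrete, here is a minimal Python sketch using the standard `hashlib` module. The sample byte strings and the helper name `fingerprint` are made up for illustration; they do not come from Stevens's demonstrations.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "md5") -> str:
    """Return the hex digest of `data` under the named hash algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


# Hypothetical document contents that differ by two swapped letters.
original = b"certificate for bank.example"
altered = b"certificate for bank.exampel"

# An MD5 fingerprint is always 128 bits (32 hex characters), whatever the
# size of the input, and it does not expose the document's contents.
print(fingerprint(original))
print(fingerprint(altered))

# A newer algorithm such as SHA-256 yields a longer, 256-bit fingerprint.
print(fingerprint(original, "sha256"))
```

For ordinary inputs like these, even a tiny change produces a completely different fingerprint. The forgeries described here are special because the files are constructed so that this does not happen.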
A. Two Documents
Digital fingerprints are sometimes used to filter out known files among the thousands on a suspected criminal’s computer, helping investigators to focus on files that might contain evidence or contraband. But Stevens can exploit the broken MD5 hashing algorithm to give two files the same fingerprint, as with the two images shown here. If a harmless manipulated file gets its fingerprint listed in a commonly used library, malicious files sharing its fingerprint could fly under the radar.
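The filtering step can be sketched roughly as follows. This is an illustration only: the function names and the idea of holding the library as a plain set of hex strings are assumptions, not a description of any particular forensic tool.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Compute a file's MD5 fingerprint without loading it all into memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unknown_files(root: Path, known_fingerprints: set[str]) -> list[Path]:
    """Return the files under `root` whose fingerprints are NOT in the library.

    `known_fingerprints` is a hypothetical library of MD5 hex digests for
    files already judged harmless. Anything matching it is skipped.
    """
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and md5_of_file(path) not in known_fingerprints
    )
```

The weakness is visible in the last line of `unknown_files`: the filter trusts the fingerprint alone, so a malicious file built to share a listed fingerprint would be skipped along with the harmless one.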
B. Adding Data
Stevens starts by adding junk data to the files to make them the same size. (MD5 incorporates a file’s length into its fingerprint.) He then figures out the difference between the two files’ fingerprints. He continues to add data to both files, now calculated to reduce the differences between their fingerprints. This image, read from left to right, illustrates the approach: the colored bits represent the differences that result as Stevens’s process is applied again and again, until it finally yields identical fingerprints.
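The first two steps, equalizing the sizes and measuring how far apart the fingerprints are, can be sketched in Python. This is not Stevens's method: the text does not say how the added data is calculated, so the sketch makes no attempt at that step. The helper names and the zero-byte filler are assumptions chosen for illustration.

```python
import hashlib


def pad_to_same_length(a: bytes, b: bytes, filler: bytes = b"\x00") -> tuple[bytes, bytes]:
    """Append filler bytes to the shorter input so both have equal length."""
    size = max(len(a), len(b))
    return a.ljust(size, filler), b.ljust(size, filler)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bits (out of 128) in which the two MD5 fingerprints differ."""
    digest_a = hashlib.md5(a).digest()
    digest_b = hashlib.md5(b).digest()
    return sum(bin(x ^ y).count("1") for x, y in zip(digest_a, digest_b))


# Hypothetical stand-ins for the two image files.
file_a, file_b = pad_to_same_length(b"first face", b"second, different face")
print(len(file_a) == len(file_b))      # sizes now match
print(differing_bits(file_a, file_b))  # 0 would mean identical fingerprints
```

Appending arbitrary data leaves this count hovering around half of the 128 bits. The attack works only because each added piece is calculated specifically to shrink the difference, until the count reaches zero.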