Over the past few years there have been numerous cases in which classified information leaked to the public because it was redacted using Adobe Acrobat’s “black box” feature, which covers the text with a black rectangle but leaves the text itself in the file.
Well, you won’t be able to find any text under the image in the version of the report handed out by the Senate Intelligence Committee. That’s because the report on their home page was scanned, and the scan was put up for download!
This is one way to make sure that nobody can recover the underlying material. Unfortunately, it also produces a report that’s 23.4MB in size, probably 10x larger than it needs to be. And, even worse, the report isn’t searchable.
As a public service, I have OCR’ed the report and put up two versions for download.
http://web.mit.edu/simsong/www/iraqreport2-textunder.pdf is a copy of the scan but with OCR applied, with the text underneath the original images. It has all the fidelity of the original report but you can search it. No clue why this version of the report is half the size of the original.
http://web.mit.edu/simsong/www/iraqreport2-ocr.pdf is just the OCR’ed text. It’s 4.3MB in size. There are many random OCR errors, including occasional bold text that should be something else, but it’s pretty reasonable, easy to search, and quick to download.