Senate Iraq Report Censored… With a Scanner?

Over the past few years there have been numerous cases in which classified information has leaked to the public domain because it was censored using Adobe Acrobat’s “black box” feature.

Well, you won’t be able to find text-under-the-image on the version of the report handed out by the Senate intelligence committee. That’s because the report on their home page was scanned and the scan was put up for download!

This is one way to make sure that nobody can recover the underlying material. Unfortunately, it also produces a report that’s 23.4MB in size, probably 10x larger than it needs to be. And, even worse, the report isn’t searchable.
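
For readers who want to check whether a given PDF carries any text at all, here is a minimal sketch in Python. It is only a heuristic and not the method used for this post: it assumes the page content sits in uncompressed or Flate-compressed streams, and it simply looks for a PDF text object (`BT ... Tj ... ET`). The function name `has_text_layer` is made up for illustration. It can miss text stored with other filters, and binary data can occasionally produce a false match, so a real PDF library would be more reliable.

```python
import re
import zlib

# Raw stream bodies sit between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# A text object starts with BT, ends with ET, and shows text with Tj or TJ.
TEXT_OBJECT_RE = re.compile(rb"\bBT\b.*?\bT[jJ]\b.*?\bET\b", re.DOTALL)


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to contain a text object."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            data = zlib.decompress(raw)  # the common FlateDecode case
        except zlib.error:
            data = raw  # not Flate-compressed; scan the bytes as they are
        if TEXT_OBJECT_RE.search(data):
            return True
    return False
```

To try it on a file, read the bytes with `open(path, "rb").read()` and pass them in. A pure scan would be expected to return `False`, while a copy with OCR text placed under the images would be expected to return `True`, provided the assumptions above hold for that file.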

As a public service, I have OCR’ed the report and put up two versions for download.

http://web.mit.edu/simsong/www/iraqreport2-textunder.pdf is a copy of the scan with OCR applied and the recognized text placed underneath the original images. It has all the fidelity of the original report, but you can search it. I have no clue why this version of the report is half the size of the original.

http://web.mit.edu/simsong/www/iraqreport2-ocr.pdf is just the OCR’ed text. It’s 4.3MB in size. There are many random OCR errors, including occasional bold text that should be something else, but it’s pretty reasonable, easy to search, and quick to download.
