Kent Anderson, an academic publisher and former executive at the New England Journal of Medicine, has written a comprehensive defense of the idea that digital goods are not inherently free.
There is a persistent conceit stemming from the IT arrogance we continue to see around us, but it’s one that most IT professionals are finding real problems with — the notion that storing and distributing digital goods is a trivial, simple matter, adds nothing to their cost, and can be effectively done by amateurs.
While Anderson runs through a number of cogent points – everything from the need to secure digital goods to the demands of cataloging them – it struck me that the real issue with digital goods is that they have become so easy to create that their sheer volume makes their management costly.
What makes the endeavor challenging, if not the size of the archive, is its composition: billions and billions and billions of tweets. […]
Each tweet is a JSON file, containing an immense amount of metadata in addition to the contents of the tweet itself: date and time, number of followers, account creation date, geodata, and so on. To add another layer of complexity, many tweets contain shortened URLs, and the Library of Congress is in discussions with many of these providers as well as with the Internet Archive and its 301works project to help resolve and map the links.
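To make the point about metadata concrete, here is a minimal Python sketch of what such a record might look like and how little of it is the tweet itself. The field names and values are assumptions chosen for illustration; they are not taken from the Library of Congress archive or from Twitter's actual schema.

```python
import json

# Hypothetical record: field names and values are illustrative only,
# not the real schema used by Twitter or the Library of Congress.
tweet = {
    "text": "Just landed in Washington, D.C.",
    "created_at": "2013-01-04T15:30:00Z",
    "user": {
        "followers_count": 250,
        "account_created_at": "2009-06-01T12:00:00Z",
    },
    "geo": {"lat": 38.8887, "lon": -77.0047},
    "urls": ["http://short.example/abc123"],  # shortened link, still unresolved
}


def metadata_share(record: dict) -> float:
    """Return the fraction of the serialized record that is not tweet text."""
    total_bytes = len(json.dumps(record).encode("utf-8"))
    text_bytes = len(record["text"].encode("utf-8"))
    return 1 - text_bytes / total_bytes


share = metadata_share(tweet)
print(f"{share:.0%} of this record is metadata and JSON structure")
```

Even in this toy record, most of the bytes are something other than the message. Multiplied across billions of tweets, that overhead is a large part of what has to be stored, indexed, and kept resolvable.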
With the advent of the Internet of Things, data is becoming “an effect of just living,” which further accelerates our need to store and manage it.
It seems that the apparent cheapness of information was only a temporary effect of the beginning of the Internet age. As we transferred analogue media produced through labor-intensive processes to the web, we discovered that the web was many times larger than we needed it to be, and that it was also many times cheaper for getting data to end users. Now that the machines themselves are throwing off so much data, and all of us are producing many times what we once did, information once again has a cost.