
MIT’s DSpace Explained

Electronic repositories stretch to meet scholars’ needs.

In 1978, Loren Kohnfelder invented digital certificates while working on his MIT undergraduate thesis. Today, digital certificates are widely used to distribute the public keys that are the basis of the Internet’s encryption system. This is important stuff! But when I tried to find an online copy of Kohnfelder’s 1978 manuscript, I came up blank. According to the MIT Libraries’ catalog, there were just two copies in the system: a microfiche somewhere in Barker Engineering Library, and a “noncirculating” copy in the Institute Archives.

Google couldn’t find anything. Nor could CiteSeer, an online database of scholarly papers in computer science. Finally I found an e-mail address for Kohnfelder himself in MIT’s online alumni database. A few hours later, he informed me that a scanned copy of his thesis could be downloaded from the website theses.mit.edu. And as it turns out, a copy of Kohnfelder’s thesis has also been entered into DSpace, the big digital-repository project that MIT Libraries and Hewlett-Packard started back in 2002. That copy is indexed by Google Scholar, Google’s academic search engine. But I hadn’t thought to check there.

DSpace is a long-term, searchable digital archive. It creates unchanging URLs for stored materials and automatically backs up one institution’s archives to another’s. Today, DSpace is being used by 79 institutions, with more on the way. But as my little story about Kohnfelder’s thesis demonstrates, archiving data is only half the problem. In order to be useful, archives must also enable researchers to find what they are looking for. Sending e-mail to the author worked for me, but it’s not a good solution for the masses.

Long-term funding is another problem that DSpace needs to solve. “The libraries are seeking ways of stabilizing support for DSpace to make it easier to sustain as it gets bigger over time,” says MacKenzie Smith, the Libraries’ associate director for technology. Today, development on the DSpace system is funded by short-term grants. That’s great for doing research, but it’s not a good model for a facility that’s destined to be the long-term memory of the Institute’s research output. Says Smith: “We need to know how to support an operation like this in very lean times.”

(1) Submitter uses a Web-based interface to deposit files. DSpace handles any format from simple text documents to datasets and digital video.

(2) Data files are organized together into related sets. “Metadata,” technical information about the data, is kept to support preservation.

(3) An item is an “archival atom” consisting of grouped, related content and metadata, which is indexed for browsing and searching.

(4) Items are organized into “communities” corresponding to parts of the organization such as departments, labs, and schools.

(5) DSpace’s modular architecture allows for expansion across disciplinary as well as institutional boundaries.

(6) In functional preservation, files are kept accessible as technology formats, media, and paradigms evolve over time.

(7) The end-user interface supports searching and browsing the archives. Items can be opened in either a Web browser or a suitable application program.
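The seven steps above can be read as a simple data model. What follows is a minimal Python sketch of that model, not DSpace's actual code: the class names (DataFile, Item, Community), the search function, and the sample thesis record are all hypothetical, chosen only to mirror the vocabulary of the steps.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """One deposited file in any format (step 1). Hypothetical name."""
    name: str
    mime_type: str


@dataclass
class Item:
    """An "archival atom": related files plus their metadata (steps 2-3)."""
    title: str
    metadata: dict[str, str]
    files: list[DataFile] = field(default_factory=list)


@dataclass
class Community:
    """A department, lab, or school that holds a set of items (step 4)."""
    name: str
    items: list[Item] = field(default_factory=list)


def search(communities: list[Community], term: str) -> list[Item]:
    """Return items whose title or metadata values contain the term (step 7)."""
    needle = term.lower()
    hits = []
    for community in communities:
        for item in community.items:
            haystack = [item.title, *item.metadata.values()]
            if any(needle in text.lower() for text in haystack):
                hits.append(item)
    return hits


# Illustrative record only; these values are made up for the example.
thesis = Item(
    title="Example undergraduate thesis",
    metadata={"author": "A. Student", "year": "1978"},
    files=[DataFile("thesis.pdf", "application/pdf")],
)
department = Community("Example Department", items=[thesis])
results = search([department], "1978")
```

The point of the sketch is that searching runs over the metadata rather than the files themselves, which is why the quality of the metadata decides whether a researcher can find a stored thesis at all.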
