Given that I spend much of my time working on digital preservation, I thought it might be fun to talk a little bit about what digital preservation actually is and why it’s an important part of the work we do in this department. But when I started writing, I realized that “a little bit” isn’t enough to cover everything about the massive subject that is digital preservation. So I’m breaking it down into a series, and here for your reading pleasure is the series debut.
As I mentioned a few weeks ago in my post on the e-Resources Fair, Preservation Services is heavily involved in the process of digitizing materials from the Dartmouth College Library collections. I talked about the conservation work that is often associated with digitization, and I talked about the actual scanning and publishing process, in which Preservation staff also participate. What I failed to mention before was the other preservation aspect of digitization, which is actually the one I’m most involved in: digital preservation. Yes, there is another whole stage of the digitization process, and it’s the one that tends to get noticed least, because if it works correctly, it is invisible to the researcher viewing this material on our website.
But why is digital preservation important? Well, let’s continue looking at the digitization process as an example. When we create digital versions of our physical collections, we end up with a lot of files. These include image files (like JPEGs, TIFFs, and PDFs), text files (usually in the form of PDF and XML documents), and even audio and video files (such as MP3s). And when I say a lot, I mean a lot. Digitizing a single book can produce hundreds of image files plus one or more text files. Digitizing a whole collection of manuscripts or photographs produces thousands of images.
These thousands of files not only take a lot of time and hard work to create, they also become a valuable part of the library collections. Preservation Services is responsible for preserving them, just as we are responsible for preserving the physical materials they are derived from. And the products of digitization aren’t the only digital files we’re concerned with. As I mentioned in my e-Resources post, the Library owns and subscribes to a huge variety of electronic resources, from e-journals to databases to streaming music. The sum total of these resources amounts to millions of files, each of which contains information that we don’t want to lose.
So the big question (from a preservation perspective) is what happens to all the bits, bytes, and files that make up these digital collections? How do we make sure all of this information remains accessible to our student and faculty researchers as technology changes over the next 5, 10, or even 50 years? To understand this problem, we have to understand what the potential risks are to our digital collections. But that’s a whole post in and of itself, so it will have to wait until next time. Stay tuned for Part 2…
Written by Helen K. Bailey