21 December 2006

Preserving Our Memories (Part 1)

I have heard many strategies for preserving genealogical research over the years. Because this is an important topic in genealogy, I intend to write a series of posts on various aspects of preservation. For now, here are a couple of in-depth papers on digital data preservation from the National Institute of Standards and Technology (NIST) web site.

The reality for all of us is that media will change over time and file formats will change over time. If you REALLY want your research to last:

- Store your data in simple, open, industry-standard file formats (nothing beats Plain Text)
- Avoid proprietary formats tied to a particular company (e.g., Microsoft Word) or tied to a particular computer platform
- Store your data on multiple media: acid-free paper, CDs and/or DVDs, hard disk(s), flash drive(s)
- Upload your data to a central server (like Rootsweb.com)
- Regularly verify your stored archives and refresh your data onto new media
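To make the "regularly verify" step concrete, here is a minimal Python sketch of one common approach: record a SHA-256 checksum for every file when you create the archive, then recompute and compare the checksums each time you check or refresh the media. The function names (`sha256_of`, `build_manifest`, `verify`) are my own illustrative choices, not part of any genealogy tool, and the sketch assumes your archive is an ordinary folder of files.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return (name, problem) pairs for files that are missing or changed."""
    root = Path(root)
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems.append((name, "missing"))
        elif sha256_of(target) != expected:
            problems.append((name, "changed"))
    return problems
```

In practice you would save the manifest (as a plain text or JSON file, for example) alongside each copy of the archive, and run `verify` against every copy before refreshing it onto new media. An empty result means every file still matches its recorded checksum.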

And last but not least, the single most important way to get your research to last: SHARE IT !!


Anonymous said...

I noted your post on Dick Eastman's site this morning and am very curious as to what you consider "archival" DVDs?? I've not been able to find any reliable-looking evaluations and the NIST survey is not public.

Infinite Ancestors said...

Unfortunately, there are no manufacturer guarantees beyond replacing defective media. And as you note, NIST hasn't published any results yet.

But given the choice between spending $0.25-$0.50 per disc in bulk for a generally reputable name-brand recordable DVD (for which I'd feel very lucky if my data lasted a decade), or spending $1.50-$2.00 for "archival-grade" media, I'll choose the latter for precious content.

Personally, I have been using MAM-A (Mitsui) gold DVDs, but there is similarly good anecdotal evidence for some of the Taiyo Yuden grades. There are a number of companies selling "archival" gold DVDs now, but I'm not aware of more than these two manufacturers (manufacturers sell to many labels).

Just as important as the media is how you care for it. Keep unburned and burned discs away from dust, sunlight, and extremes or swings in temperature and humidity. Store each disc in a proper acid-free plastic DVD case, with a spindle that keeps the DVD away from the case and doesn't put stress on the DVD.

Of course the real problem is data volume...

If the data size is small (a few gigabytes or less), it's much easier to devise a multi-media strategy (hard disk, CD/DVD, flash, online) and keep multiple offsite and/or online backups.

For tens or hundreds of gigabytes, it's simple enough to keep multiple hard drives or a RAID, or even to rotate external drives offsite. But backing up that much data to DVDs on top of that takes an awful lot of time and money. Online storage is getting less expensive, but upload times are a killer for many people, and there are still few guarantees of long-term data integrity.

So for genealogy purposes especially, I'll make the case again that on top of whatever backup strategy is in place, public sharing is the best bet for survival of research.