My colleagues from the National Digital Stewardship Alliance working group on Infrastructure and I were pleased to lead a session on bit-level preservation at the 2012 annual Digital Preservation conference, hosted by the Library of Congress.
Bit-level preservation is far from a solved problem. While we know roughly what transformations and processes can be used to mitigate risk from major threats, there is considerable (and largely applied) research to be done to determine optimal, cost-effective levels and strategies for replication, diversification, compression, and auditing, and to develop better (more reliable and valid) measures of risk.
The talk summarized the major risk factors and mitigation strategies, and noted some inter-relationships:
You may also be interested in other presentations from the conference. Bill LeFurgy has an informative blog post with highlights.
David Weinberger’s talk was particularly provocative, drawing on themes from his recent book Too Big to Know. He claims, essentially, that the increase in data, and even more, the networking of information and people, changes the nature of knowledge itself. Traditionally, knowledge has comprised a series of stopping points (authoritative sources, authoritative texts); a corresponding set of institutions and practices to “filter out” bad information; and a physical representation constrained by the form, length, and relative unalterability of printed books. Now, Weinberger claims, knowledge is increasingly a process of filtering forward: of providing summaries and links to a much larger and more dynamic knowledge base. This redefines knowledge, changes the role of institutions (which cannot hope to contain all knowledge in an area), and implies that (a) filters are content; and (b) we are forced into awareness of the contingent and limited nature of filters, since there is always bad information and contradictory information available. Changes in knowledge also change the nature of expertise and science, both becoming less hierarchical, more diverse, and more linked.
If Weinberger is right, and I suspect he is (in large part), there are undiscussed implications for digital preservation. First, our estimates of the expected value of long-term access should be going up if the overall value of knowledge is increased by the total context of knowledge available. Second, we need to go beyond preserving individual information objects, or even “complete” collections: value resides in the network as a whole, and in the filters being used. Maintaining our cultural heritage and scientific evidence base requires enabling historic access to this dynamic network of information.