Add your voice to the original UPF Survey

UPF Survey : Follow-up Questions

The UPF specifies that machine-independent algorithms be encapsulated within the stored media. This "Rosetta stone," inscribed with maps for reassembling the digital information archived on the storage media, would serve as a universal translator.
Should the UPF require that the source code used to read data be encapsulated in the storage format? Or should a Recommended Practice allow the option of storing the Rosetta stone as a separate file that might be executable code but would be required to operate on a set of specified platforms?
Should the UPF recommend a list of specific file formats that would be defined by the Rosetta stone?
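
To make the first option above concrete, here is a minimal sketch of a wrapper that stores a format description alongside the data it describes. It is an illustration only, not part of any UPF specification: the marker `UPFX`, the length-prefixed layout, and the function names `wrap` and `unwrap` are all assumptions made for this example, and the description is plain JSON rather than executable code.

```python
import json
import struct
from typing import Tuple

# Hypothetical 4-byte marker; not defined by the UPF or any standard.
MAGIC = b"UPFX"


def wrap(payload: bytes, description: dict) -> bytes:
    """Prepend a self-describing header to the payload.

    Layout (assumed for this sketch):
    marker, 4-byte big-endian header length, JSON header, payload.
    """
    header = json.dumps(description, sort_keys=True).encode("utf-8")
    return MAGIC + struct.pack(">I", len(header)) + header + payload


def unwrap(blob: bytes) -> Tuple[dict, bytes]:
    """Recover the description and the original payload from a wrapped blob."""
    if blob[:4] != MAGIC:
        raise ValueError("missing wrapper marker")
    (header_len,) = struct.unpack(">I", blob[4:8])
    header = blob[8:8 + header_len]
    return json.loads(header.decode("utf-8")), blob[8 + header_len:]
```

A reader that knows only this small layout can recover the description first and then decide how to interpret the payload. Storing the description as a separate file, the second option above, would use the same description but keep it outside the blob.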

Related to the Rosetta stone is the issue of platform independence. With respect to digital archives, how important is platform independence to the concept of a digital preservation format? Is this a "wait and see" issue, or should the UPF make a definitive statement on the requirement of platform independence?

Several initiatives or digital projects include the concept of a universal identifier, which is analogous to the relational database concept of a unique identifier or an Internet URL. The UPF takes for granted the requirement that each object within a digital archive contain a unique identifier. Key questions remain: how should this identifier be used for archives? Should there be a central registration body for these identifiers? At what level should material be tagged? Or should there be a standard code representing multiple cataloged levels, followed by more specific or individual codes? Should the unique identifiers be allocated to a standardized area of storage, such as a specific part of the storage media header or TOC?
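
One way to picture the "standard code followed by more specific codes" option is a multi-level identifier. The sketch below is an assumption for discussion, not a proposed UPF scheme: the three levels (`registry`, `collection`, `item`), the colon separator, and the class name `ArchiveId` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveId:
    """A hypothetical three-level identifier: registry, collection, item."""

    registry: str    # code assigned by an assumed registration body
    collection: str  # code for a cataloged level, such as a collection
    item: str        # code for the individual object

    def __str__(self) -> str:
        return f"{self.registry}:{self.collection}:{self.item}"

    @classmethod
    def parse(cls, text: str) -> "ArchiveId":
        """Split 'registry:collection:item' into its three levels."""
        parts = text.split(":")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected registry:collection:item, got {text!r}")
        return cls(*parts)

    def shares_collection(self, other: "ArchiveId") -> bool:
        """True when both identifiers fall under the same cataloged level."""
        return (self.registry, self.collection) == (other.registry, other.collection)
```

For example, `ArchiveId.parse("ORG:COLL:0001")` and `ArchiveId.parse("ORG:COLL:0002")` would share a collection, while each full string remains unique to one object. Where such a string would live on the media (header, TOC, or elsewhere) is left open here, as it is in the question above.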

In our first user survey, reliance on technical staff and the expense of maintaining such a staff were specifically mentioned as major concerns. Tight budgets may require that new staff have skill sets that stretch across domains. What is the solution for organizations with limited human resources to maintain a wide range of skills? What are the implications of technical expertise upon a universal preservation format, or upon any technical standard adopted for digital materials? What would be the process for "updating" the Rosetta stone? Another way of asking this question: Should a Recommended Practice advocate technological "Ease-of-Use"?

As we presented our initiative to groups concerned with preserving analog media, we grew aware of a crippling gap between the language used by archivists and the language adopted by digital communities. Words casually chosen by digital initiatives often have deep analog roots. As a result, we wish to preface any Recommended Practice with a glossary of terms that might help bridge the analog-digital domains. Terminology might include the following:

	Access                  Indexing
	Header Information      Storage Tape
	Migration               Data Tape
	Preservation            Platform
	Format                  Operating System
	File                    Algorithm
	Data Stream             Program
	Metadata                Information vs Data
	Database                Query Language
	Source code
	
What other terms should we include, and what should be the basis or sourcebook for defining them?
Take a look at the following sites for related glossaries:

Because this initiative has become a SMPTE study group, some professionals working primarily with electronic records feel that the UPF is strictly an initiative dealing with the archiving of moving images. We try to stress that the adoption of a UPF framework would benefit all types of digital information. Do you think that electronic records are inherently different from multimedia in terms of a storage format or framework?

One of the points we try to emphasize in our presentations to archivist groups is that the UPF would help carry on the traditional practices of archivists and librarians by providing a robust, easily expandable framework for digital materials. For example, by incorporating such concepts as unique identifiers and by "gluing" certain kinds of cataloging information to the stored media, we hope to perpetuate the two principles that are the foundation of standard archival practice: Provenance and Original File Order. We also realize that many variations of these practices have evolved, and that no two archives will follow the same guidelines. So the question here relates to the natural migration of practices: are archiving practices standardized enough to be used as a model or metaphor for designing digital archiving practices? Or is it permissible to explore entirely new methods?

Storage technology evolves at a dizzying pace. Recently, for example, we have looked at PaperDisk from Cobblestone Software, technology that prints digital information on plain paper through a common laser or inkjet printer, then reads it back into a computer through a standard flatbed or hand-held scanner. The potential for this type of technology is enormous. Paper lasts a long time, but imagine using this technology with material that has an even longer lifespan. We have also heard about using DVD as an archival storage medium. Perhaps most exciting is the technology called "HD-ROM," developed by Los Alamos National Laboratory and licensed and sold by Norsam Technologies, which holds "650 GB, 47,000 images on a 2-inch disk."
With planned enhancements, the Norsam HD-ROM may permit up to 12 terabytes of storage per 4 3/4-inch disk.
http://www.entmag.com/archive/1997/may07/050705.html-ssi


That stated, do you feel that the available technology is adequate to address the problem of digital storage? Or might we adopt a straw dog strategy to illustrate how the components of a UPF might be employed over various storage technologies? Also, can the same standard be applied to multiple storage types on a single medium as well as to simple tape-to-tape storage?


When the problem of physical storage is resolved, does compression cease to be an issue for archival storage? In other words, if a physical storage medium can be virtually limitless, then why not store everything in uncompressed format? Or are there other issues to consider, such as speed of retrieval?


Send your commentaries and questions to:
thom_shepard@wgbh.org
In the SUBJECT of your email, please type:
UPF SURVEY FOLLOW UP