Universal Preservation Format: User Survey

Add your voice to the UPF Survey

The success of any initiative dealing with digital preservation hinges on the quality of communication between professionals who manage media collections and leaders within the digital storage industry. Finding common ground between these camps will have a positive and dramatic impact upon all aspects of long-term digital preservation.

We devised this survey as a way to gather information about the needs of organizations involved with the challenge of long-term audio-video digital preservation.



Challenges for the long-term storage and preservation of media assets:

Although the challenges of storage media obsolescence and reliability got the most votes in our survey, those who made additional comments mentioned the challenge of formulating a strategy for migrating materials to digital storage, dealing with budgetary constraints and, in some cases, the apathy of administrators. One point to consider is the hierarchy of institutions that hold archives. Establishing new strategies must factor in costs, and a recommended practice might look at the costs of migration to digital formats, perhaps looking at how a UPF might lower certain costs. This material must be presented in plain English. It is one thing to convince archivists of the need for a UPF; it may take more to give archivists the resources for convincing administrators and boards of directors of this need. Some concepts, key words and quotes:

Concerning storage formats:

The vast majority of those who answered our survey said they had at least some knowledge of data storage formats. Due to the technical nature of some of our questions, the survey may have scared off many who do not have this level of knowledge.

Concerning metadata:

In contrast to the storage format question, metadata as a concept seemed to be understood at least on some level by almost all of those who responded. The term has been somewhat popularized by various search engines or services on the World Wide Web, but many archivists know the concept through finding aids initiatives.

Concerning SMPTE:

Considering that we were appealing to archivists in this survey, it should not be too surprising that so few people knew about the Society of Motion Picture and Television Engineers (SMPTE). Until now, SMPTE has not sought input from the archival or library science communities. One of our missions is to make these groups aware of this technical standards organization.

Concerning wrappers or "bento containers":

Interestingly, only two people thought they had a good understanding of wrappers. The rest of the wrapper responses were fairly evenly divided among the "some" to "no knowledge" choices. This may be attributed to the elusive nature of the concept. More likely, archivists and librarians understand the concept without quite connecting it to the new terminology. For example, anyone who has used Lotus Notes or a presentation program such as PowerPoint or Persuasion has had some experience with the wrapper format. Computer operating systems, including Windows and the Mac OS, that allow for cutting and pasting among different applications also employ a kind of wrapper technology.
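For readers who find the term elusive, the idea can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `Wrapper` and `WrappedItem`, the metadata keys, and the placeholder payloads are hypothetical and are not taken from the UPF proposal, from Bento, or from any SMPTE document. The point is simply that one container carries descriptive metadata alongside content of several different types.

```python
from dataclasses import dataclass, field


@dataclass
class WrappedItem:
    """One piece of content inside the wrapper, tagged with its media type."""
    media_type: str   # e.g. "audio/wav" or "text/plain"
    payload: bytes    # the content itself, carried unchanged


@dataclass
class Wrapper:
    """Hypothetical container: descriptive metadata plus any number of items."""
    metadata: dict = field(default_factory=dict)
    items: list = field(default_factory=list)

    def add(self, media_type: str, payload: bytes) -> None:
        # The wrapper does not interpret the payload; it only records its type.
        self.items.append(WrappedItem(media_type, payload))

    def media_types(self) -> list:
        # List the kinds of content held, in the order they were added.
        return [item.media_type for item in self.items]


# Usage: one wrapper holding a recording and its transcript side by side.
wrapper = Wrapper(metadata={"title": "Oral history interview"})
wrapper.add("audio/wav", b"placeholder audio bytes")
wrapper.add("text/plain", b"placeholder transcript text")
print(wrapper.media_types())
```

A presentation file that holds text, images and sound in a single document behaves, loosely speaking, in the same way.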

Ultimate goals concerning digital preservation of media collections:

This question evoked a commentary similar to the earlier "challenges" question. Some of the same issues were raised, particularly budgets. The majority stated that they intended to let public demand determine which analog materials would be converted to digital form. Also mentioned throughout the commentary was the determination to continue all analog preservation efforts, even while dipping toes into the digital waters. Implicit throughout many of these commentaries was a "wait and see" approach to digital archiving. Establishing a standard for a universal preservation format through SMPTE and other engineering groups would be like placing a lifeguard before these troubling waters. Some concepts, key words and phrases:

Thinking about migration:

Options for this question had already been touched upon in the commentaries of earlier questions. For example, the majority of respondents saw the need to maintain a dual storage strategy, though we must add that the "no opinion" choice received the second highest number of votes, which -- considering the thoughtfulness of the commentaries -- may be seen more as further evidence of a "wait and see" strategy than as indifference. Some concepts, key words and phrases:

Compression for long-term storage:

With any luck, this question will be rendered moot as digital storage technology catches up with computer processor technologies. For example, technologies are already being developed that are capable of storing terabytes on media no larger than a floppy diskette. And yet even when affordable storage space is no longer a problem, compression will still be an issue for access and perhaps for transport. Appropriately, most people answered this question with a "Well, it depends..." Commentaries linked this issue with an institution's budget and with the types of media to be archived.

Metadata initiatives:

The Dublin Core Workshop and Berkeley EAD (Encoded Archival Description) Projects were the two metadata initiatives that received the most recognition votes, though it is quite possible that two text initiatives, the Text Encoding Initiative (TEI) and Recordkeeping Functional Requirements Project (U of Pittsburgh), would have received more votes if we had included them from the very beginning. These omissions resulted in the criticism that the UPF initiative was "too narrowly focused." We are striving to address that criticism as we formulate follow-up questions.

Cataloging media collections:

Overwhelmingly, the institutions represented in our survey use some form of commercial computer application to catalog their media collections. Popular programs like FileMaker and Access were championed, as were specialized packages, such as Endeavor, Cuadra, Cairs, and Tinlib. Of course, some form of MARC cataloging software was mentioned throughout the commentary. It was interesting that, except for MAVIS, no one mentioned the software known as "asset management products," which is designed "to effectively catalogue, index, access, search and retrieve video clips, digital libraries." (See: http://www.cmis.csiro.au/DMIS/VideoTalk/software.html for more information about this type of application.)

Using descriptive information or metadata:

This was one of the survey's "duh" questions. Asked how metadata might be used, respondents said in effect that the use of metadata was only as limited as the imagination, though granting "easy accessibility to the material" by way of search and retrieval got the most votes. Copyright figured in a couple of the commentaries, including the need for identifying sources and "terms for usage," and for preserving data integrity.

Media collections on the World Wide Web:

Those who knew how they would use the World Wide Web were evenly divided among the limited choices for this question. Commentaries suggested additional possibilities. And though the question used the word "envisioned," it is clear that several institutions are already implementing Web options. A recommended practice might affect how archivists use the World Wide Web. For example, a common browser might access materials within an archive stored in a platform-independent UPF format, either through an intermediary HTML page or through a browser plug-in.

Distributing and storing metadata:

This first of two "blue sky" questions was perhaps too narrowly focused to generate feedback from most of our respondents, especially those concerned primarily with electronic records. Nonetheless, a few of the commentaries saw the application of metadata streaming for still image collections. Many other commentaries mentioned the problem of cost versus value in establishing detailed metadata streaming.

Metadata streaming:

This second "blue sky" question generated a wider commentary. How deep should descriptive cataloging go? If one picture is worth a thousand words, how many of those words do you include in your metadata? The UPF is not designed to answer these questions specifically, but it would establish a foundation that would help archivists and other information professionals to resolve metadata questions for their own institutions.

One area with which the UPF might deal explicitly is the placement of copyright information, and the identification of "original works" received the highest number of votes. The UPF might also grapple with "versioning": assigning a unique identifier to the first generation object or compound document, then giving each subsequent generation its own unique identifier along with a reference back to its source.
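To make the "versioning" idea concrete, here is a minimal sketch in Python of one way generations might be linked. It reflects our reading of the idea rather than anything the UPF specifies: the `Generation` record, its field names, the `derive` and `lineage` helpers, and the choice of random UUIDs as identifiers are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Generation:
    """Hypothetical record for one generation of an archived object."""
    description: str
    parent_id: Optional[str] = None  # None marks the first generation
    object_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def derive(parent: Generation, description: str) -> Generation:
    """Create the next generation, pointing back at the one it was made from."""
    return Generation(description=description, parent_id=parent.object_id)


def lineage(item: Generation, catalog: Dict[str, Generation]) -> List[str]:
    """Follow parent references back to the first generation."""
    chain = [item.object_id]
    while item.parent_id is not None:
        item = catalog[item.parent_id]
        chain.append(item.object_id)
    return chain


# Usage: a first generation object and a copy made from it.
original = Generation("first generation object")
access_copy = derive(original, "second generation copy")
catalog = {g.object_id: g for g in (original, access_copy)}
print(lineage(access_copy, catalog))  # the copy's identifier, then the original's
```

Because every record carries its own identifier and at most one reference to its source, any copy can be traced back to the first generation object without relying on file names or storage locations.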
Closing thoughts:

In case there was any doubt, closing commentaries reassured us that a standard is imperative to any digital migration initiative. Areas discussed in this section that were not singled out in the survey questions include the need for a preservation format to include a common lexicon, a recommended practice or guideline, and even a call for standardizing digital equipment. Education was also stressed throughout the commentaries; archivist organizations should consider establishing online tutorials for their members.

Whatever one feels about the specifics of the Universal Preservation Format, it cannot be denied that many archivists are reluctant to invest their time and money in migration projects that are perceived to be short-term or intermediary solutions. Because our project focuses on such crucial issues as platform independence and embedded metadata, we feel we have the start of a workable solution that will last well into the next century.

List of Contributors

Follow-up questions