TOWARD A UNIVERSAL DATA FORMAT FOR THE PRESERVATION OF MEDIA.
Dave MacCarn
WGBH Educational Foundation
Boston, MA.
Abstract
There is a significant need for a Universal Preservation Format (UPF), designed specifically for digital technologies, that can store compound content (not only media itself but also information about it) so that the content can be accessed easily both today and into the indefinite future. The UPF could break the bond between the recording format and the machine through which the format is accessed.
This paper points out some existing technology that could be used, and is intended to provide a basis for industry discussion toward producing a universal preservation format.
Introduction
"Film, video and sound recordings are vital components of our collective memory... This vast source of information, inspiration and creativity--the most known contemporary archive of our society--is threatened. ...[W]e are losing large parts of our recorded past."[1]
According to a recent Library of Congress report, video materials in the public and private sector are estimated to exceed several hundred thousand recorded hours. The same report estimates that the news film and other film used to record television programming totals several million feet. Much of this historical material is in danger of being lost.[2]
[T]he preservation of television and video materials faces enormous obstacles, in particular, the vulnerability of videotape to adverse storage conditions, abusive handling, and technological obsolescence.
William T. Murphy, Coordinator
Report on the State of American
Television and Video Preservation
In a letter dated January 16, 1996
Technological obsolescence, in particular, has hindered the preservation of film and video materials by contributing to the enormous expense of accessing stored materials. The standard format for recording television programs has taken several forms over the past fifty years, including kinescope, 2" videotape, 1" videotape and digital tape. As the standard for recorded programming continues to evolve, the equipment used to access materials produced in earlier formats has become increasingly difficult to find and, accordingly, more expensive to use.
Archivists and industry members have addressed the problem by transferring older formats to digital tape, thus attempting to maintain the quality of the original material. However, this costly process simply puts off--rather than solves--the preservation problem. The enormous and rapid changes taking place in digital technology have resulted in a veritable explosion of formats. Thirteen different digital tape formats are available at present (D-1, D-1 SP, D-2, D-3, D-5, D-6, Digital Betacam, Betacam SX, Ampex DCT, Consumer DV, DVCAM, DVCPRO and Digital S), with several more in development (for High Definition Television). With the format wars heating up, many of these formats may soon become obsolete, making them unsuitable for preserving media information. In addition, digital non-linear editing systems have their own internal proprietary media formats.
From an archivist's perspective this is a nightmare. On one side of the room you store the tapes, and on the other side the tape machines and spare parts.
As with the videotape materials produced during the last fifty years, technical obsolescence may make digital formats that are common today inaccessible tomorrow. There is a significant need for a Universal Preservation Format (UPF), designed specifically for digital technologies, that can store compound content (not only the media itself but also information about it) so that it can be accessed easily both today and into the indefinite future. Note that this is not a call for a universal acquisition format.
This thinking is reflected in a national plan for redefining film preservation put forth by the Librarian of Congress in August of 1994. As one of his eight recommendations, he suggested that:
[the new electronic technologies] are already transforming film access but archives should insist that certain stringent criteria be met before new technologies are adopted as preservation media.[3]
Paul Messier, Conservator of Photographs and Works on Paper for the Boston Art Conservation, also called for the establishment of criteria for assessing digital video as a preservation medium in a paper that he presented at Playback '96: A Round Table on Video Preservation. His criteria were adapted from those proposed for still images by Basil Manns, Research Scientist at the Library of Congress, in his article "The Electronic Document Image Preservation Format." These criteria anticipate the technical specifications necessary for the selection and description of data to be preserved through a UPF. However, to date, no further action has been taken within the field of archives.
Use of existing technology
Why can we address this problem now? Along with the surge of digital formats have come technologies that are designed to handle digital media of all types. Apple Computer's "Bento Specification"[4] and Avid Technology's "Open Media Framework Interchange Specification"[5] are both media technologies that approach the UPF concept.
Bento
Apple Computer's "Bento Specification" is the underlying technology of Apple Computer's OpenDoc Standard Interchange Format.[6] Bento is a specification for storage and interchange of compound content. Bento defines a standard format for storing multiple different types of objects and an API to access these objects. An object container is just some form of data storage (such as a file.) This storage is used to hold one or more objects (values) and information about the objects (metadata.) Bento containers are defined by a set of rules for storing multiple objects, so that software that understands the rules can find the objects, figure out what kind of objects they are, and use them correctly.
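The container idea described above can be illustrated with a small sketch. This is not the Bento API or its on-disk layout; the class names (StoredObject, Container), the method names and the type labels are all hypothetical, chosen only to show objects stored as values together with metadata, and software locating objects by their declared type.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object in the container: a value plus information about it."""
    type_name: str                                 # what kind of object this is
    value: bytes                                   # the object's data
    metadata: dict = field(default_factory=dict)   # information about the object


class Container:
    """Hypothetical stand-in for a Bento-style container (not the Bento API)."""

    def __init__(self):
        self._objects = {}
        self._next_id = 1

    def add(self, type_name, value, **metadata):
        """Store a value with its type and metadata; return its object id."""
        object_id = self._next_id
        self._next_id += 1
        self._objects[object_id] = StoredObject(type_name, value, dict(metadata))
        return object_id

    def get(self, object_id):
        """Return the stored object for a given id."""
        return self._objects[object_id]

    def find_by_type(self, type_name):
        """Return the ids of all objects declared with the given type."""
        return [oid for oid, obj in self._objects.items()
                if obj.type_name == type_name]


# Illustrative content only: a few bytes standing in for video, plus a note.
container = Container()
video_id = container.add("video/4:2:2", b"\x80\x10\x80\x10",
                         source="1-inch videotape")
notes_id = container.add("text/plain", b"Evening news broadcast",
                         role="description")
```

Because every object carries its own type label and metadata, a reader that knows the container rules can discover what is inside without knowing in advance which application wrote it.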
Bento objects can be simple or complex, small (a few bytes) or large (up to 2^64 bytes, approximately 2^27 hours of D-1 video). Bento is designed to be platform and content neutral, so that it provides a convenient container for transporting any type of compound content between multiple platforms. The Bento code currently runs on Macintosh, MS DOS, Microsoft Windows, OS/2 and several varieties of Unix.[7]
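The size figure can be checked with some simple arithmetic, reading the maximum object size as 2 to the 64th power bytes. The data rate used below is an assumption, not a figure from the Bento specification: it takes CCIR 601 4:2:2 sampling at 27 million 8-bit samples per second and ignores audio, blanking and any recording overhead. The function name hours_of_video is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def hours_of_video(capacity_bytes, bytes_per_second):
    """Hours of continuous video that fit in a given capacity."""
    return capacity_bytes / (bytes_per_second * SECONDS_PER_HOUR)


# Maximum object size, read as 2 to the 64th power bytes.
max_object_bytes = 2 ** 64

# Assumed rate: 13.5 MHz luma + 2 x 6.75 MHz chroma = 27 million samples
# per second, one byte per sample (8-bit CCIR 601 4:2:2, video only).
assumed_rate = 27_000_000

hours = hours_of_video(max_object_bytes, assumed_rate)

# Data rate implied if the same capacity is taken to hold exactly 2**27 hours.
implied_rate = max_object_bytes / (2 ** 27 * SECONDS_PER_HOUR)

print(f"{hours:.3e} hours at the assumed rate")
print(f"{implied_rate / 1e6:.1f} million bytes per second implied by 2**27 hours")
```

Under this assumption the result falls between 2^27 and 2^28 hours, so a single object is far larger than any programme an archive would need to hold.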
OMF
Avid Technology's "Open Media Framework" (OMF) Interchange, now a standard format for the interchange of digital media data among different platforms, has adopted the use of Bento containers. Additionally, the OMF format encapsulates all the information required to transport a variety of digital media such as audio, video, graphics, and still images, as well as the rules for combining and presenting the media. The format includes rules for identifying the original sources of the digital media data, and it can encapsulate both compressed and uncompressed digital media data.
OMF Interchange provides for a variety of existing digital media types and the ability to easily support new types in the future. A single OMF Interchange file can encapsulate all the information required to create, edit, and play digital media presentations.
While OMF Interchange is designed primarily for data interchange, it is structured to facilitate playback directly from an interchanged file when being used on platforms with characteristics and hardware similar to those of the source platform, without the need for expensive translation or duplication of the sample data. OMF Interchange provides for the development and integration of new media and composition types.[8]
The Interactive Multimedia Association (IMA) recently released its Recommended Practice for data exchange, which is also based on Bento and OMF.[9] The Society of Motion Picture and Television Engineers (SMPTE) is currently voting on a similar Recommended Practice which uses a subset of the IMA Recommended Practice. However, to date, no one has explored the use of these technologies as a preservation container.
Bento and OMF as a Preservation format
Preservation requires the handling of many different recording formats--such as 2" videotape, 1" videotape, D-1, D-2, D-3, and others--which can be thought of as having data types (e.g. 4:2:2, 4fsc). Although Bento allows for any data type, the OMF Interchange and the IMA's Recommended Practice define only a minimum number of data types (e.g. TIFF, RGBA and AIFF). A new Recommended Practice that adds further standard data types (e.g. CCIR 601, 4:2:0, 4:1:1, etc.) to the Open Media Framework Interchange and the IMA's Recommended Practice for data exchange would yield a storage container format able to encompass all present recording forms and allow for all future forms. In moving from the raw recording format (e.g. videotape) to a data tape (or other media) format that incorporates the UPF, the number of formats that archivists need to preserve will be substantially reduced. The UPF breaks the bond between the recording format and the machine through which the format is accessed. In addition, since the UPF can be used not just for videotape but also for text, still image and other digitizable or data materials, commercial technology manufacturers will be able to extend the usefulness of media created on their products.
The storage and retrieval of media could be handled by a so-called "media compiler" (figure 1). This media compiler would remove the acquisition format from the archive requirement. One has to acknowledge that media will always have to be stored on some format, and that this format will eventually become obsolete. The UPF does not solve this problem. It does, however, allow archivists to choose the format best suited to their needs and the one that they feel has the best long-term storage potential (one storage manufacturer has created a 50TB near-line storage system using D-3 tape[10]).
Additional header information that defines the formats stored in the UPF could be included. In archival terms, it may be necessary to record these definitions to provide decoding of the original material.
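A rough sketch of this idea follows: a payload is wrapped with a header that names its data type and carries a definition of that type, so that a later reader can decode the material without the original machine. The byte layout (a 4-byte length, a JSON header, then the payload), the function names compile_media and decompile_media, and the fields of the definition are all assumptions made for illustration; none of them comes from Bento, OMF or the IMA Recommended Practice.

```python
import json
import struct


def compile_media(payload: bytes, data_type: str, definition: dict) -> bytes:
    """Wrap a payload with a self-describing header (hypothetical layout)."""
    header = json.dumps({
        "data_type": data_type,          # name of the stored format
        "definition": definition,        # how to decode that format
        "payload_length": len(payload),  # number of payload bytes that follow
    }).encode("utf-8")
    # 4-byte big-endian header length, then the header, then the payload.
    return struct.pack(">I", len(header)) + header + payload


def decompile_media(blob: bytes):
    """Recover the header and payload from a blob made by compile_media."""
    (header_length,) = struct.unpack(">I", blob[:4])
    header = json.loads(blob[4:4 + header_length].decode("utf-8"))
    start = 4 + header_length
    payload = blob[start:start + header["payload_length"]]
    return header, payload


# Illustrative definition only; a real one would need far more detail.
definition = {"sampling": "4:2:2", "bits_per_sample": 8,
              "sample_order": "Cb Y Cr Y"}
blob = compile_media(b"\x80\x10\x80\x10", "CCIR 601", definition)
header, payload = decompile_media(blob)
```

Keeping the definition next to the data means the stored file does not depend on the tape format or device it happens to be written to, which is the separation the UPF is meant to provide.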
figure 1.
Summary
Clearly, it is time to bring together technology manufacturers, archivists and standards organizations to advance the creation of a Universal Preservation Format that fits the needs of all.
The goal of the Universal Preservation Format is to produce a Recommended Practice that, if adopted, would benefit a broad range of users of archival records--including archivists and technology manufacturers, distributors and producers--by making it efficient, cost-effective and relatively simple to access records originally created in (or transferred into) a variety of digital formats.
Remember that we are talking about a universal archive format and not a universal acquisition format.
Bento and its use by OMF and the IMA are a good step toward a Universal Preservation Format.
Acknowledgments
Mary Ide, Archivist and Director, Media Archives & Preservation Center, WGBH Educational Foundation, Boston, MA.