
Universal Preservation Format: User Survey

3. SMPTE defines compression as "the process of reducing the number of bits required to represent information by removing redundancy." There are two basic kinds of data compression: lossy and lossless. Lossless compression allows the original data to be reconstructed exactly. Lossy compression discards some data and is best suited to graphics and sound, where some loss is acceptable. Lossy compression algorithms can often be adjusted to vary the degree of compression: the more you compress, the less accurate your decompressed data.
Considering that compression of data files is potentially a trade-off between file size and a loss of digital information, what degree of compression do you feel is acceptable for long-term physical storage?
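The trade-off in the question can be illustrated with a minimal Python sketch. It is an illustration only, not a method proposed by the survey: zlib stands in for lossless compression in general, and the hypothetical helper `quantize` (simple rounding down to a multiple of `step`) stands in for the much more sophisticated lossy codecs used for images and sound.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the result should equal the input."""
    return zlib.decompress(zlib.compress(data, 9))


def quantize(samples, step):
    """Hypothetical lossy step: round each sample down to a multiple of step.

    A larger step discards more detail (an assumption-level stand-in for
    the quality setting of a real lossy codec).
    """
    return [step * (s // step) for s in samples]


def max_error(original, restored):
    """Largest absolute difference between original and restored samples."""
    return max(abs(a - b) for a, b in zip(original, restored))


def compressed_size(samples) -> int:
    """Size in bytes of the samples after zlib compression."""
    return len(zlib.compress(bytes(samples), 9))


# Illustrative 8-bit samples that cover every value from 0 to 255.
samples = [(i * 37) % 256 for i in range(512)]

if __name__ == "__main__":
    print("lossless exact:", lossless_round_trip(bytes(samples)) == bytes(samples))
    for step in (1, 4, 16):
        coarse = quantize(samples, step)
        print("step", step,
              "max error", max_error(samples, coarse),
              "compressed bytes", compressed_size(coarse))
```

With lossless compression the decompressed data matches the original exactly. With the lossy stand-in, a larger `step` introduces a larger maximum error, which is the loss the question asks respondents to weigh against storage savings.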

There is so much variety of function in the data to be stored that it seems best to begin with as high a degree of flexibility as possible in terms of standard-setting. It's pointless, I think, for any one organization (or generation, for that matter) to spend a great deal of time determining precise standards where judgment depends so much on so many variables. - Carli

...lossless compression is the most agreeable, but our member institutions also realize that the function and purpose of creating the copy should drive the creation. As with digital image creation, if the function is to totally replace the original, no compression or lossless compression is the *only* option. But, if the copy is only meant as an access/temporary surrogate copy or to transmit files over the internet, a derivative digital copy with lossy compression could be acceptable. - Dale

...a moderate lossless compression with plenty of error correction capability. - Gaustad

...Just as with artifacts, we will make both general and ad hoc decisions; we need tools to do both. - Graham

For critical records, all original data must be preserved, but the standard should recognize various categories of preservation importance. - Hadley

The issue of compression [...] is highly dependent upon the type of collection and [its] use... We typically do not compress files for storage, but there are some types of information where compression is logical and sensible. - Hamal

Lossless compression is only acceptable if the algorithms for recovering the full information are so established or simple that there is no danger that the compressed information will be lost. Also, the media must be such that an uncorrectable error will not corrupt the remainder of the file (as can happen with a JPEG-compressed image). Lossless compression is the only acceptable solution for preservation [...] because we do not know what kinds of analysis or manipulation we may care to use on the files in the future. An automatic indexing program, for example, might work on a lossless file, but not on a file which has been compressed so that humans cannot detect the loss.

These concerns must be balanced against the medium of the original. It is probably better to compress a VHS videotape using MPEG than it is to allow the original to continue to deteriorate. It is just not ideal. - Hirtle
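The concern above about a single uncorrectable error spoiling the rest of a compressed file can be sketched in Python. This is a rough illustration under stated assumptions: zlib (DEFLATE) is used as a stand-in for compressed formats in general rather than JPEG specifically, and the sample text and the helpers `make_sample`, `damage`, `count_differences`, and `try_decompress` are hypothetical names invented for this example.

```python
import random
import zlib


def make_sample(n_words: int = 2000, seed: int = 0) -> bytes:
    """Build some repeatable, compressible sample text (illustrative only)."""
    rng = random.Random(seed)
    words = [b"archive", b"tape", b"film", b"record",
             b"image", b"sound", b"copy", b"master"]
    return b" ".join(rng.choice(words) for _ in range(n_words))


def damage(data: bytes, position: int) -> bytes:
    """Return a copy of data with the byte at position inverted."""
    damaged = bytearray(data)
    damaged[position] ^= 0xFF
    return bytes(damaged)


def count_differences(a: bytes, b: bytes) -> int:
    """Number of byte positions at which a and b differ."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def try_decompress(blob: bytes):
    """Return the decompressed bytes, or None if zlib rejects the stream."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return None


if __name__ == "__main__":
    original = make_sample()
    packed = zlib.compress(original)

    # One damaged byte in the uncompressed file affects only that byte.
    raw_damaged = damage(original, len(original) // 2)
    print("uncompressed: bytes affected =",
          count_differences(original, raw_damaged))

    # One damaged byte in the compressed file affects everything after it.
    recovered = try_decompress(damage(packed, len(packed) // 2))
    if recovered is None:
        print("compressed: stream rejected, nothing recovered")
    else:
        print("compressed: bytes affected =",
              count_differences(original, recovered))
```

In the uncompressed copy the damage stays confined to one byte. In the compressed copy the same damage typically makes the decoder reject the stream or produce wrong data from that point on, which is why the respondent pairs compression with demands on the storage medium.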

Given the probable improvement of future technologies, and the possible loss of original information before more accurate reproduction is possible, I'd rather use lossless. But economic or other practical considerations tend to intervene, and archivists often have to settle for less than the best. Perhaps a hierarchy of desirables could express the standard. - Lucas

If we are talking about compression then I guess we are talking about migrating the original content from its "native" file format to a compressed format. For many records, this transition away from the native format will be required for the purposes of preservation. If this is the case, then I think compression is acceptable since the link between the preserved information and its original, "native" format has already been broken. Therefore, since the preservation copy is already a "facsimile" there is little lost in leveraging compression techniques. Methods of compression and guidelines for use should be developed. - Messier

Compression depends on the object. Lossy compression is very acceptable for thumbnail images used for browsing, but perhaps not for the larger resolutions that will be sent to the printer. - Ogle

I favor lossless because I cannot predict whether there will be some future cause of loss. - Skarstad

Standards should take into consideration the type of media, for example some loss may be acceptable for audio but not for video. - Vetter

...by allowing choices of compression, the curators and archivists will presumably have storage cost choices. This will allow some items to be preserved with various levels of compression depending upon the "value" of the item. For example, some home videos, collected to document family life, could be converted with quite a lot of compression if we were collecting them for the cultural information about home video production. - Wilson

...Generally speaking we favour lossless compression or no compression for our digital conversion files: we have invested in high resolution capture and want to be able to take advantage of technological advances that will make it feasible to transmit and view high resolution images or recordings. But we have accepted that lossy compression has a place for some material that needs to be kept but where the quality is not a critical preservation factor. - Webb

Some material arrives in the archive already compressed (eg feeds via satellite into newsgathering). The digital transmission will be compressed. The broadcast archive needs to MANAGE compressed materials (cf EC-sponsored project Atlantic), and ensure they are labelled and handled so that the compression does not lead to avoidable deleterious effects. - Wright
