The development of the OAIS reference model was pioneered through the Consultative Committee for Space Data Systems (CCSDS), of which NASA is a founding member, and the model has been accepted as an ISO standard (ISO 14721:2003)[1]. A technical recommendation is also available for consultation on the CCSDS website[2]. As a reference model, OAIS provides a conceptual framework within which to consider the functional requirements for an archival system suited to the long-term management and preservation of digital data. The framework can be applied to both proposed and existing archival systems, and it offers a way of comparing systems by mapping discipline-specific jargon to OAIS terminology. Once mapped, such terminology becomes clear and unambiguous enough to be understood by people other than dedicated archival staff. The OAIS core entities and workflows within the model are shown in Fig. 1 below.
[Fig. 1: OAIS core entities and workflows]
Data producers create Submission Information Packages (SIPs). A SIP equates to a deposit of digital data plus any documentation and metadata the archive needs in order to preserve the data over the long term and to provide access for consumers (i.e. reuse). The SIP provides the basis for an Archival Information Package (AIP) and a Dissemination Information Package (DIP), both generated by the archive. Where necessary, the process involves generating preservation and dissemination versions of the deposited data. For example, a Microsoft Word .doc file might be converted to an XML-based format such as an OpenOffice text document for long-term preservation, and to PDF for dissemination. Metadata documenting this processing is added to the AIP, as is any relevant information from the SIP. Similarly, any resource discovery metadata and reuse documentation in the SIP is added to the DIP.
Consequently, the metadata and documentation supplied as part of a SIP assume major importance in data deposition. The OAIS standard notes of the SIP that 'Its form and detailed content are typically negotiated between the Producer and the OAIS'. In practice, most repositories offer guidelines to depositors about acceptable formats, delivery media, copyright issues, and the necessary documentation and metadata.
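The derivation of an AIP and a DIP from a SIP can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not something prescribed by the OAIS standard: the `InformationPackage` class, the `derive_packages` function, the format mappings and the metadata keys are all hypothetical names chosen for the example, and no real file conversion takes place (only the target file names are computed).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import PurePath

# Hypothetical archive policy, following the .doc example in the text.
PRESERVATION_FORMAT = {".doc": ".odt"}
DISSEMINATION_FORMAT = {".doc": ".pdf"}
# Hypothetical metadata keys treated as resource discovery / reuse information.
DISCOVERY_KEYS = {"title", "description", "reuse_notes"}


@dataclass
class InformationPackage:
    """A bundle of file names plus descriptive metadata (illustrative only)."""
    files: list[str]
    metadata: dict[str, str] = field(default_factory=dict)


def _retarget(files, mapping):
    """Swap file extensions according to mapping; leave other files unchanged."""
    result = []
    for name in files:
        path = PurePath(name)
        target = mapping.get(path.suffix.lower())
        result.append(str(path.with_suffix(target)) if target else name)
    return result


def derive_packages(sip):
    """Return an (AIP, DIP) pair derived from the given SIP."""
    aip_files = _retarget(sip.files, PRESERVATION_FORMAT)
    dip_files = _retarget(sip.files, DISSEMINATION_FORMAT)

    # The AIP keeps all SIP metadata and records what processing was applied.
    changes = [f"{old} -> {new}" for old, new in zip(sip.files, aip_files) if old != new]
    aip_metadata = dict(sip.metadata)
    aip_metadata["processing"] = "; ".join(changes) or "none"

    # The DIP carries only resource discovery and reuse information.
    dip_metadata = {k: v for k, v in sip.metadata.items() if k in DISCOVERY_KEYS}

    return (InformationPackage(aip_files, aip_metadata),
            InformationPackage(dip_files, dip_metadata))
```

Calling `derive_packages` on a SIP that contains `report.doc` would yield an AIP listing `report.odt` and a DIP listing `report.pdf`, with the conversion recorded under the AIP's `processing` entry. Files with no mapped format pass through unchanged, which mirrors the case where the submitted format serves for both preservation and dissemination.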
In general, the archival community is actively seeking to become compliant with the reference model through the process of certification (see Archival Strategies). It should, however, be noted that the audit checklists used for certification are a very recent development and, for the time being, a state of trust needs to exist between creator and archive.
Data in the Submission Information Package (SIP) should be in, or have clear migration paths to, suitable preservation formats and, together with the associated documentation, should be sufficient to support the creation of an Archival Information Package (AIP). The SIP therefore assumes major importance in the relationship between the data producer and an OAIS-compliant archive: alongside the data itself, documentation and metadata play important roles in informing preservation and reuse.
The AIP should consist 'of the Content Information and the associated Preservation Description Information (PDI), which is preserved within an OAIS'.
Given a well-formed SIP, an archive should have minimal problems in generating the AIP. It is the rich metadata that supports the ongoing management of the data it describes, for example through the automated audit of data using fixity or checksum values, or through migration as a batch process.
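An automated fixity audit of the kind mentioned above might look like the following minimal sketch. It assumes that the archive holds a manifest mapping each file name to a SHA-256 checksum recorded at ingest; the function names, the manifest layout and the choice of SHA-256 are assumptions made for the illustration rather than requirements of OAIS.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_fixity(manifest, root):
    """Compare stored checksums against the files currently under root.

    manifest maps relative file names to the checksum recorded at ingest.
    Returns a dict mapping each name to 'ok', 'changed' or 'missing'.
    """
    report = {}
    for name, expected in manifest.items():
        path = Path(root) / name
        if not path.is_file():
            report[name] = "missing"
        elif sha256_of(path) == expected:
            report[name] = "ok"
        else:
            report[name] = "changed"
    return report
```

Run on a schedule, a check like this flags files that have been altered or lost since ingest, which is only possible if the checksums were captured as part of the package metadata in the first place.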
Data in the Submission Information Package (SIP) should also be in, or have migration paths to, formats suitable for dissemination and reuse. In many cases the submitted format can serve for both preservation and dissemination. The SIP needs to contain any documentation that facilitates reuse, including metadata relating to resource discovery, fitness for use, access, transfer and use. A well-formed SIP will facilitate the generation of the Dissemination Information Package (DIP).
Many of the formats noted as suitable for preservation are also suitable for dissemination and, in general, this is the ideal situation, as datasets need only be stored once. However, as already noted, there is a tension: archivists generally prefer simple file formats such as ASCII, whilst users prefer the smaller file sizes of binary files.
[1] http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=24683&ICS1=49&ICS2=140&ICS3
[2] http://public.ccsds.org/publications/archive/650x0b1.pdf
[3] http://public.ccsds.org/publications/archive/650x0b1.pdf
[4] See Section 1.7.2 Terminology of http://public.ccsds.org/publications/archive/650x0b1.pdf