Wiki-based archival description and storage

This wikibook is a practical manual for organizations and individuals who have archives that they need to arrange, describe, digitize, and store, and who want some of that material to become part of "the sum of all knowledge" in the Wikimedia movement. It explains a system of safe, permanent storage for archival items that covers everything from initial accessioning and evaluation; through arrangement, description, and digitization (or digital conversion for digitally-created items); to techniques of shelving, boxing, and conservation. It also details how to manage collaborative (crowd-sourced) transcriptions on Wikisource and how to integrate knowledge from the archives into Wikipedia articles.

If you are using the guidelines in this wikibook to manage your archives, we invite you to also edit these pages as you learn what works and what doesn't, in order to share your knowledge with the rest of the community. You may also like to share your specific experiences by creating learning resources on Wikiversity.

Wikis

What is an archive?

An archive is a body of primary source records that has accumulated over the lifetime of an individual or organization and that is kept to show the function of that person or organization. The records have been naturally and necessarily generated as a product of regular activities, and may also have been collected and selected intentionally in order to represent the history of the archive owner. Read more on Wikipedia…

What do we mean by 'wiki-based'? Simply that the system described here makes use of the network of wiki sites operated by the Wikimedia movement, as well as the MediaWiki software that these sites run on. Being wiki-based in this way implies a few characteristics of an archival management system: it is web-based and spread across a number of different web sites; many people can contribute to it, and all their contributions are visible to each other and traceable; and the supporting software is less prescriptive about how the system works than a traditional archival database would be.

Your archive will exist across a number of wikis. Firstly, for material that is old enough to be public domain or that is freely licensed, the following Wikimedia sites will be used:

  • Wikipedia, for encyclopedic articles about items or their subjects;
  • Wikisource, for transcriptions of text;
  • Wikimedia Commons, for digital files (scans, photographs, etc.); and
  • Wikibooks, for the manual that you are now reading.

Secondly, you will have a number of your own wikis, completely under your control and responsibility:

  • a public wiki for material that you are allowed to publish but which is not suitable for any of the Wikimedia sites (primarily due to copyright restrictions); and
  • some number of closed wikis, one for each broad group of users you need to give access to (often, there is just one, called the 'staff wiki' or 'family only wiki' or similar).

For these wikis, we outline the technical infrastructure and administrative processes that you will require.
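
The choice of wiki for a given item can be sketched as a small decision function. This is only an illustration of the split described above, not part of any real tool: the names `Destination` and `choose_destination` and the two boolean inputs are assumptions, and a real decision will usually involve more than two yes/no questions.

```python
from enum import Enum


class Destination(Enum):
    """Hypothetical labels for the three kinds of wiki described above."""
    WIKIMEDIA = "Wikimedia sites"
    PUBLIC_WIKI = "your own public wiki"
    CLOSED_WIKI = "one of your closed wikis"


def choose_destination(free_or_public_domain: bool, may_publish: bool) -> Destination:
    """Pick where an item's description and files should live.

    Simplifying assumption: anything you are not allowed to publish goes to
    a closed wiki, even if it would otherwise be freely licensed.
    """
    if not may_publish:
        return Destination.CLOSED_WIKI
    if free_or_public_domain:
        return Destination.WIKIMEDIA
    # Publishable, but not suitable for Wikimedia (e.g. copyright restrictions).
    return Destination.PUBLIC_WIKI
```

For example, a letter that is still in copyright but that the owner has permission to publish would go to the public wiki, while a public-domain photograph would go to the Wikimedia sites.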

Read more about archival wikis.

Description


In the archival profession, 'description' usually refers to what is done with records while they are being arranged for storage. It is when detailed metadata is recorded about each item, as well as the item's place in the broader context of the whole archive (and society). In this wikibook, we take a more general view and say that description takes place at every stage, from acquisition, through accessioning, to the traditional stages of item- or series-level description.

Each act of description is, in one sense, the same: a single wiki page is created, and on it as much information as possible is recorded about the item. A page can also contain images, photographs, and videos, as well as links to more detailed descriptions. The most important things to describe are where the item has come from (provenance) and how it was originally stored (original order), as well as what has been done with it now and where it can be found (both online and off).

One advantage of doing every level of description in this way is that it creates a unique and meaningful identifier for whatever is being described: the URL of the wiki page. For instance, the first page created is one for the archive as a whole, so the archive immediately has a home page on the internet and a globally unique way of being referenced.
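
As a rough illustration of how a page title becomes an identifier, the sketch below builds a URL from a wiki's base address and a page title. The function name `page_url` and the address `https://archive.example.org/wiki` are hypothetical; the only MediaWiki behaviour assumed is the common convention that spaces in titles appear as underscores in page URLs.

```python
from urllib.parse import quote


def page_url(base_url: str, title: str) -> str:
    """Build the URL that identifies a description page.

    Assumes the wiki serves pages at <base_url>/<Title>, with spaces in the
    title written as underscores. Other characters are percent-encoded,
    except '/' and ':' which are kept for subpages and namespaces.
    """
    normalized = title.strip().replace(" ", "_")
    return base_url.rstrip("/") + "/" + quote(normalized, safe="/:")


# The page for the archive as a whole (hypothetical wiki address and title).
print(page_url("https://archive.example.org/wiki", "Smith family archive"))
```

The same function can be reused for every level of description, so a box, a folder, and a single photograph each get an identifier of the same form.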

Read more about description.

Storage


The storage of physical items, their digital representations, and digitally-created items matters as much as their description. In the wiki system, storage is done progressively alongside description, because that way each item can be placed in its final storage location and given an identifier as it is described.

Here, we outline a system for storing physical items that not only ensures (as far as is possible) their long-term preservation, but also ties in with digitizing those items and making the digital representations available online. The system is also shaped by access control: for example, the contents of a single folder or box should all be at the same level of access, so that they can be described together on one wiki. Storing digitally-created items is easier in that no physical storage facility is required, but the preservation and long-term access requirements are just the same. The approach here is to treat digitizations and digitally-created files as effectively equivalent and to store them in the same way in MediaWiki.

Broadly speaking, the storage process is as follows:

  1. Physical items:
    1. Physical items are scanned and/or photographed (the original boxes etc. that they occupy are included in this).
    2. The items are added in accession order to boxes and folders.
  2. Digital items are converted to appropriate long-term formats.
  3. The resulting files (both digitizations and digitally-created) are uploaded to Commons or one of the archives' wikis.
  4. A page is created for each item on its wiki, where all metadata is recorded, including the box or folder identifier. Some items will have just a single page; some will comprise multiple files (e.g. the front and back of a photograph if there's an inscription on the back) and so will have a page for each of these and an index page where the metadata is stored.
  5. The item's page is printed and stored alongside the item. This print contains the full URL of the item, to serve as a unique identifier.
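
The last two steps can be sketched as a small record type. This is a minimal illustration under stated assumptions, not a prescribed data model: the class `ItemRecord`, its field names, the slip layout, and the example wiki address are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ItemRecord:
    """Hypothetical record for one archival item and where it is stored."""
    title: str           # title of the item's wiki page
    wiki_base_url: str   # e.g. the address of your public or closed wiki
    box_id: str          # identifier of the box or folder holding the item
    files: List[str] = field(default_factory=list)  # uploaded file names

    def url(self) -> str:
        # Assumes the convention that spaces in titles become underscores.
        return self.wiki_base_url.rstrip("/") + "/" + self.title.replace(" ", "_")

    def needs_index_page(self) -> bool:
        # Items made up of several files get one page per file plus an index page.
        return len(self.files) > 1

    def print_slip(self) -> str:
        # Text to print and store alongside the physical item.
        return "\n".join([
            f"Item: {self.title}",
            f"Stored in: {self.box_id}",
            f"URL: {self.url()}",
        ])


record = ItemRecord(
    title="Photograph 12",
    wiki_base_url="https://archive.example.org/wiki",
    box_id="Box 3, Folder 2",
    files=["Photograph 12 front.tif", "Photograph 12 back.tif"],
)
print(record.print_slip())
```

Because the slip carries the full URL, anyone who finds the physical item can reach its description, and anyone reading the description can find the box or folder.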

Read more about storage.

  • Pederson, Ann E; McCausland, Sigrid (1987), Keeping archives, Australian Society of Archivists