Duke University
University of North Carolina at Chapel Hill
North Carolina State University
North Carolina State Archives
Stephen Miller
Rare Book, Manuscript, and Special Collections Library
Duke University
April 2000
Last Updated by Joshua McKim
November 30, 2001
The sections highlighted in red have been modified from the previous version of these Guidelines. Please review these sections for approval.
2.1 Automating the Creation of Finding Aids Using EAD
2.2 Granularity of Encoding
2.2.1 Subject Tagging
2.2.2 Linking Elements
2.3 File Naming of Document Instances
2.4 Using Upper and Lower Case Tags
2.5 Headings
2.6 Special Characters
2.7 Spacing and Punctuation
2.8 Titles and Emphasis
2.9 Attribute Values
2.10 <note> Element
3. Encoding Guidelines
3.1 XML Declaration, DOCTYPE Declaration, and Declaration Subset
3.2 Overview of High-Level Components
3.3 EAD Header
3.4 Frontmatter
3.5 Archdesc
3.5.1 Did
3.5.2 Admininfo
3.5.3 Bioghist
3.5.4 Scopecontent, Arrangement, Organization, and ODD
3.5.5 Controlaccess
3.5.6 DSC (Description of Subordinate Components)
3.5.6.1 Components
3.5.6.2 Box, Folder, and Other Containers in <container> Element
3.5.6.3 Unitdates and Unittitles in Components
3.5.6.4 Physdesc in Components
3.5.6.5 Scopecontent, Organization, Arrangement in Components
3.5.6.6 Admininfo in Components
3.5.6.7 <head> Elements in Components
3.5.7 ADD
4. Additional Issues
4.1 Oversize Materials
4.2 Later Accessions and Additions
References and Resources
Appendix A. Entities Reference Examples.
Appendix B. UC Berkeley Filing Title Syntax.
Appendix C. NCEAD Guidelines for XML EAD.
Appendix D. Values for Attribute "findaidstatus."
Appendix E. Examples of Different Types of.
Appendix F. Inclusion of Container Information.
Appendix G. Physdesc.
Appendix H. Add.
The NCEAD Best Practice Guidelines for EAD Encoding provides a guide to the implementation of Encoded Archival Description (EAD) in the multi-institutional NCEAD Project. It is intended to be used as a supplement to the EAD Application Guidelines and Tag Library, in order to further define current best practices for the use of EAD.
Because different institutions have different approaches to archival description, and those approaches have changed and evolved over time, there will be many variations to contend with in a multi-institutional retrospective encoding project utilizing existing finding aids and descriptive information. As stipulated by the design principles of EAD, the DTD itself purposefully supports an extremely wide range of possible practices. As a result, the development of guidelines for an acceptable range of uniform practices agreed upon by all participants in a multi-institutional project is necessary.
As the development of EAD progresses and more institutions implement it as part of their descriptive resources, best practices will continue to develop. At present, it is necessary to consider previous work on best practices so that encoding remains compatible with the practices of other institutions and union databases.
Whenever working with EAD, XML, SGML, and digital data in general, make a strong effort to ensure uniform and consistent encoding and data structures. Uniform and consistent encoding allows for:
This document provides a framework of best practices based on several documents: the EAD Tag Library 1.0, EAD 1.0 Application Guidelines, prior work to create union standards for EAD encoding in the form of "The Encoded Archival Description Retrospective Conversion Guidelines: A Supplement to the EAD Tag Library and EAD Guidelines" used in the American Heritage Virtual Archive Project and the Online Archive of California, and the "RLG Encoded Archival Description: Recommended Application Guidelines" developed for RLG libraries and several other projects that are listed at the end of this document. The document also takes into account Duke University's experience building an archive of encoded finding aids.
The encoding described within this document refers to XML implementations of EAD only. Although XML standards are continuing to develop, the use of XML is becoming more prevalent on the internet as software to support its creation, storage, and display is written. The differences between SGML and XML encoding are few and summarized in Appendix C. Scripts that convert SGML to XML are also available.
A good way to standardize encoding is to create a set of EAD templates and macros and to use them consistently during finding aid creation. Because the introductory and high-level descriptive elements of the finding aid are quite regular, they lend themselves to automatic processing. Information entered into forms or dialog boxes can be processed into valid EAD markup. By requiring the inclusion of agreed-upon information in all fields of these markup devices, EAD finding aids become more consistent. NCEAD will not prescribe any one set of templates or macros. Each institution will choose the system that works best for it, as long as the information and structure outlined in these guidelines are maintained. Sharing and discussing these systems will benefit all.
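As an illustration of how form input might be processed into markup, here is a minimal Python sketch. It is not an NCEAD-prescribed tool: the template text follows the <titleproper> pattern used in these guidelines, while the function name and field names (title, dates) are hypothetical.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical template for a finding aid <titleproper>; the field names
# (title, dates) are illustrative and are not prescribed by NCEAD.
TITLEPROPER = Template(
    "<titleproper>Inventory of the $title, <date>$dates</date></titleproper>"
)


def fill_titleproper(title, dates):
    """Build a <titleproper> element from two form fields."""
    title, dates = title.strip(), dates.strip()
    if not title or not dates:
        # Mirrors the idea that every template field must be filled in.
        raise ValueError("title and dates are both required")
    # Escape &, <, and > so that form input cannot break the markup.
    return TITLEPROPER.substitute(title=escape(title), dates=escape(dates))


print(fill_titleproper("John Doe Papers", "1932-1986"))
```

Requiring every field and escaping the input are the two design choices that matter here: the first keeps finding aids consistent, and the second keeps the generated markup well-formed.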
The "granularity of encoding" of a finding aid refers to the amount of effort expended in the application of subject terms, linking, and other elements which, while not necessary for a complete and valid structured EAD document, may be applied to enhance the encoding and searchability.
Thorough tagging of subjects within container lists is an important but time-consuming and expensive endeavor. The benefits of tagging to this level of granularity are unclear. NCEAD will strive to tag each applicable subject term once outside the high-level <controlaccess> section. Each term tagged should be considered integral to understanding the breadth and context of the collection. At a minimum, subject tagging MUST be used in the high-level <controlaccess> as explained in Section 3.5.5 of these guidelines. These elements include:
High-level <scopecontent>, <bioghist>, <organization>, <arrangement>, and <odd> may include detailed subject tagging if desired by the participating institution. If an institution wishes to include subject tagging in these high-level elements, such tagging should be done consistently for all encoded finding aids from the institution.
The goal of subject tagging is to assist future searching and indexing of finding aids. It is foreseeable that a search protocol would rank a finding aid as more relevant if a subject term were tagged multiple times within it. Therefore we must be careful not to overstate the significance of any particular finding aid. If subject tagging is employed consistently for all finding aids from an institution or institutions, researchers will encounter fewer false leads. At this point, there is no consensus on best practice for subject terms, and many guidelines do not even mention this issue outside of <controlaccess>.
Linking elements may be used to facilitate internal navigation of the EAD document and to provide access to external digital archival objects and other resources. Linking can make encoding a complex process. This section provides only a basic guide to simple linking techniques. This should not discourage experimentation with linking elements as described in the EAD Application Guidelines Chapter 7. However, not all linking possibilities are supported by current software and stylesheets. Further, any links used should be tested when the finding aid has been stored in its final location.
Unlike SGML, XML requires that all elements be closed as well as opened. However, the EAD DTD makes an exception for five elements, namely <lb>, <extptr>, <extptrloc>, <ptr>, and <ptrloc>. None of these elements may contain nested elements or PCDATA (parsed character data). Rather than requiring a second tag to close the open tag, XML allows these tags to be closed automatically by including a "/" immediately before the ">". See section 3.4 for more examples.
<extptr show="embed" entityref="dukeseal"/>
<lb/>
Simple internal links may be made using the <ref> element and the id attribute, as in the following example:
<scopecontent><p>The <ref target="corr">Correspondence Series</ref> contains...</p></scopecontent>
[...]
<c01><did>
<unittitle id="corr">Correspondence Series, <unitdate type="inclusive">1913-1915</unitdate></unittitle>
</did></c01>
Identifiers in the target and id attributes must be identical. Within each document identifiers must be unique. When possible, convert only the first four characters of the text contained within the <ref> element to lowercase and use these as the attribute content. Otherwise, attempt to only use 4-character references. References may consist of both letters and numbers.
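The two rules above (every target must match an id, and every id must be unique) can be checked mechanically. The following is a minimal Python sketch using the standard library; the function name and sample fragment are hypothetical, and it assumes a fragment without a DOCTYPE or external entity references, which this parser would not resolve.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment modeled on the example above.
SAMPLE = """<archdesc level="collection">
<scopecontent><p>The <ref target="corr">Correspondence Series</ref> contains...</p></scopecontent>
<dsc><c01><did><unittitle id="corr">Correspondence Series</unittitle></did></c01></dsc>
</archdesc>"""


def check_internal_links(xml_text):
    """Return a list of problems with id and target attributes."""
    root = ET.fromstring(xml_text)
    # Count every id value in the document to detect duplicates.
    ids = Counter(el.get("id") for el in root.iter() if el.get("id") is not None)
    problems = ["duplicate id: %s" % i for i, n in ids.items() if n > 1]
    # Every <ref target="..."> must point at an existing id.
    for ref in root.iter("ref"):
        target = ref.get("target")
        if target not in ids:
            problems.append("ref target without matching id: %s" % target)
    return problems


print(check_internal_links(SAMPLE))
```

An empty list means the fragment passed both checks. A check of this kind does not replace testing the links once the finding aid is stored in its final location.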
2.2.2.2. External Linking
External links may be used to:
Institutional seals may be included within <titlepage> by first declaring an entity in the declaration subset of the current document instance. This entity name is then referenced in the <extptr> element to include the contents of the entity file. See sections 3.1 and 3.4 for templates and examples.
A digital archival object may be included within the finding aid. For most current practical purposes the DAO appears as a thumbnail image embedded within the webpage. To use a DAO, an entity must be declared in the declaration subset of the current document instance. The entity is then referenced from the <dao> element. Descriptive information about the object is typically contained within the <unittitle> and other <did> elements at the level where the <dao> appears. Example of entity declaration and <dao> element:
<!ENTITY TF0021 PUBLIC "-//Duke University::Rare Book, Manuscript, and Special Collections Library//NONSGML (US::NDD::Gedney::TF0021)//EN" "TF0021-small.jpeg" NDATA jpeg>
[...]
<dao entityref="TF0021"></dao>
You may also have the <dao> element refer to an external web page using the "href" attribute.
<dao href="http://scriptorium.lib.duke.edu/">
http://scriptorium.lib.duke.edu/</dao>
The <dao> element is only to be used for materials described by the finding aid and included in the collection but in this case accessed through another web site. Please see the EAD Application Guidelines for more information.
External links may be created to resources or files outside the finding aid. These may be web pages containing additional information or information that is not part of the materials being described. Use <dao> for linking to materials described by the finding aid and included in the collection.
Example of <extref> using href attribute to make a link to a web page:
<extref href="http://scriptorium.lib.duke.edu/">
http://scriptorium.lib.duke.edu/</extref>
As the number of finding aids grows, it is important that each repository maintain a system for creating unique names for its EAD XML instances. Filenames should be made up of lowercase letters and numbers only and must include .xml as the suffix. There is no single valid system for creating filenames, although using collection names or numbers is good practice. Including an institutional identifier, such as an NUC code, and department or section codes may further help storage of EAD instances, although the full necessity of this is not yet known. For repositories that use numbers or other codes for filenames, a crosswalk document will help indicate the name of the collection to which the filename refers.
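The filename rule above (lowercase letters and numbers only, with .xml as the suffix) can be expressed as a short check. This Python sketch is illustrative only; the function name and sample filenames are hypothetical.

```python
import re

# Lowercase letters and digits only, followed by the required .xml suffix.
FILENAME_RE = re.compile(r"[a-z0-9]+\.xml")


def is_valid_filename(name):
    """Return True if the name follows the filename rule described above."""
    return FILENAME_RE.fullmatch(name) is not None


for candidate in ("doejohn.xml", "DoeJohn.xml", "doe_john.xml", "doejohn.sgm"):
    print(candidate, is_valid_filename(candidate))
```

Uniqueness across a repository is a separate matter that a pattern cannot verify; it still depends on the naming system and crosswalk document each repository maintains.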
All opening and closing tags and attribute names in EAD 1.0 document instances MUST be in lowercase characters only to ensure XML compliance.
Because XML is case sensitive the tags <EAD> <Ead> <eaD> <ead> are all interpreted as being different tags by an XML-aware system. This is the result of XML being designed to be compatible with Unicode character set specifications. Declarations for elements and attributes in the EAD DTD are specified in lowercase, making it necessary for tags (element and attribute names) in the document instance to be in lowercase as well. Using XML authoring software, macros, and template forms will prevent many mistakes. Attribute values may contain uppercase characters, and should follow the templates in these guidelines.
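Where authoring software is not available, a small script can flag element or attribute names that are not entirely lowercase. The Python sketch below is one possible approach, not a validator: it checks case only, and it assumes a well-formed fragment without external entity references. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def find_non_lowercase_names(xml_text):
    """List element and attribute names that contain uppercase characters."""
    root = ET.fromstring(xml_text)
    bad = []
    for el in root.iter():
        if el.tag != el.tag.lower():
            bad.append(el.tag)
        # Attribute values may contain uppercase; only names are checked.
        for attr in el.attrib:
            if attr != attr.lower():
                bad.append("%s/@%s" % (el.tag, attr))
    return bad


print(find_non_lowercase_names('<ead><Archdesc LEVEL="collection"/></ead>'))
```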
Maintain absolute consistency in headings for sections of finding aids so as not to confuse users or create undue difficulties when creating EAD. For example, a container list heading should always be "Container List" as indicated in the recommended templates, never "Container Listing" or "List of Containers." Exceptions may be made depending on the structure of the finding aid being encoded. The <head> is a generic element. Information in the <head> element helps readers identify major sections of the finding aid, but it does not provide information that is unique to the collection being described, nor does it provide a useful search term. Do not use all UPPERCASE lettering for headings.
A generic example of a heading should look like this:
<organization><head>User's Pathfinder Note</head>
<p>…</p>
</organization>
Special characters are frequently found in EAD-encoded XML documents, especially for finding aids with significant non-English content. Previously, SGML-encoded EAD finding aids used ISO 8879 character references. Now, XML-encoded EAD finding aids require the use of Unicode hexadecimal references. As seen below, Unicode hexadecimal references are not nearly as intuitive as ISO 8879 references. See the EAD Application Guidelines, section 6.5.2.1, for further discussion of special character entities in EAD. Software and scripts that convert characters to Unicode hexadecimal references may exist and could ease retrospective conversion of digital files.
Previously in ISO 8879, these characters would be replaced with the following equivalents:
Currently these characters must be replaced with Unicode hexadecimal equivalents:
For a list of commonly used special characters and their ISO 8879 and Unicode hexadecimal equivalents visit http://scriptorium.lib.duke.edu/ncead/iso2uni.html or http://hotwired.lycos.com/webmonkey/reference/special_characters/.
Whether using ISO 8879 or Unicode Hexadecimal References all entities should be checked for proper display. This is not always an easy task. Popular web browsers support special character display differently from each other and from different versions of the same software. Unicode Hexadecimal References appear to be the developing standard for XML and these guidelines encourage their use, although for the purposes of proper display, ISO 8879 references are an alternative. As noted above, there may be a need for retrospective conversion at a later date.
While many characters may be represented with ISO or Unicode entities, it is not worthwhile to use them for common ASCII characters such as punctuation, hyphens, quotes, or parentheses. Focus should be placed instead on providing correct representation of alphabetic characters and diacritics for English and other languages.
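The replacement of special characters with Unicode hexadecimal references can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended tool: it leaves every ASCII character untouched, in keeping with the advice above, and it writes references in the unpadded uppercase form (for example &#xE9;), which is one valid form among several. The function name is hypothetical.

```python
def to_hex_references(text):
    """Replace each non-ASCII character with a hexadecimal character reference."""
    return "".join(
        # ASCII characters (code points below 128) are left as they are.
        ch if ord(ch) < 128 else "&#x%X;" % ord(ch)
        for ch in text
    )


print(to_hex_references("B\u00e9la Bart\u00f3k"))
```

Whatever form is chosen, the resulting references should still be checked for proper display in the browsers and stylesheets in use.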
Spacing and punctuation in EAD instances are very important. While an EAD instance may be perfectly valid according to the DTD, spacing and other errors may crop up throughout a document, especially surrounding inline elements. Inline tags, such as date and subject, are used to identify smaller segments of text or words in a line of text. Common mistakes include lack of spaces and spaces in the wrong place:
Correspondence1934
Bill T. Jones , 1899-1976
The EAD Application Guidelines addresses the issues surrounding spacing and punctuation in sections 4.3.5.1 and 4.3.5.2. Here are some suggestions to follow:
There has been some discussion of the possible implications of this practice for the extraction of subject elements from EAD instances when building catalogs and indexes. While it is true that a simplistic extraction of the contents of such elements would include any trailing punctuation and spaces they contain, it is a trivial matter from a programming standpoint to "clean up" extracted data by removing the trailing punctuation and spaces.
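To illustrate the kind of cleanup described above, here is a minimal Python sketch. The element list, function name, and sample text are assumptions made for the example; a real index would decide for itself which elements to extract and which punctuation to trim.

```python
import xml.etree.ElementTree as ET

# Assumed set of subject-type elements to extract for an index.
SUBJECT_TAGS = ("persname", "corpname", "famname", "geogname", "subject")


def extract_terms(xml_text):
    """Extract subject terms, trimming trailing spaces and punctuation."""
    root = ET.fromstring(xml_text)
    terms = []
    for el in root.iter():
        if el.tag in SUBJECT_TAGS:
            # Collapse internal runs of whitespace, including line breaks.
            text = " ".join("".join(el.itertext()).split())
            # Periods are kept because they may end an initial ("Bill T.").
            terms.append(text.rstrip(" ,;:"))
    return terms


SAMPLE = (
    "<p>Letters from <persname>Jones, Bill T., </persname>and "
    "<corpname>Duke University; </corpname>among others.</p>"
)
print(extract_terms(SAMPLE))
```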
The <title> and <emph> tags use the render attribute to specify appearance when transformed by a stylesheet. The render attribute accepts values such as quoted, bold, italic, and underline. Here is an example of an italicized title.
<title render="italic">Darkness Visible</title>
Some more suggestions:
It is important to be careful with spacing and punctuation in the <title> and <emph> tags, especially when using the quoted or underline values of the render attribute. Be sure that the closing tag immediately follows all affected text. Otherwise quotation marks or underlines may extend beyond the intended characters.
Correct:
<scopecontent><p>The poem <title render="quoted">Trees,</title> which was written...</p></scopecontent>
Which displays as: "The poem "Trees," which was written..."
Incorrect:
<scopecontent><p>The poem <title render="quoted">Trees, </title>which was written...</p></scopecontent>
Which displays as: "The poem "Trees, "which was written..."
Always use quotes surrounding attribute values:
<archdesc level="collection">
The <note> element is considered to be too vague for adequate description of contextual information about the collection. Information within <note> tags may be useful for specific purposes but will generally not be useful for advanced searching. There are appropriate uses for this tag but only after a careful review of applicable guidelines and consultation with colleagues has identified no more suitable tags.
Template for the XML Declaration, DOCTYPE Declaration, and Declaration Subset:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="[nameofstylesheet].xsl"?>
<!DOCTYPE ead PUBLIC "-//Society of American Archivists//DTD ead.dtd (Encoded Archival Description (EAD) Version 1.0)//EN" "ead.dtd" [
<!ENTITY % eadnotat PUBLIC "-//Society of American Archivists//DTD eadnotat.ent (Encoded Archival Description (EAD) Notation Declarations Version 1.0)//EN" "eadnotat.ent">
%eadnotat;
<!ENTITY [imagename] PUBLIC "-//[NAME OF OWNER]::[SUBORDINATE NAMED DIVISION OF OWNER]//NONSGML ([brief image description])//EN" "[imagename].gif" NDATA gif>
<!ENTITY [entityname] PUBLIC "-//[NAME OF OWNER]::[SUBORDINATE NAMED DIVISION OF OWNER]//TEXT ([brief entity description])//EN" "[entityname].xml">
...other entities...
]>
(Note that lines should not break within the declarations as they do above)
The XML Declaration describes the version of XML and the stylesheet that will be used in rendering the document. The DOCTYPE Declaration identifies the XML document as conforming to the EAD DTD. The Declaration Subset is the information between the open and close brackets [ ]. The Declaration Subset contains declarations of entities or elements that are made in addition to those found in the DTD. Because they are declared in the individual document instance, they may be referenced only in that instance, and likewise must be declared in each instance.
The XML Declaration and the Doctype Declaration should appear identical to what is shown above except the name and relative path of the files. Entities, however, should reflect the form shown above but each institution will create and maintain their own sets of entities. Entities are best used to display common information that is repeated across an institution's finding aids. This eliminates the need to retype this information, and if the information changes it only needs to be changed in one location. Make sure your entities contain valid EAD/XML code. In Appendix A, there is an example of an entity subset and entity files that Duke University uses for its EAD finding aids.
<ead>
<eadheader>Describes the finding aid and EAD file itself. (not displayed)
<frontmatter>Information about the finding aid, including electronic title page. (displayed)
<archdesc>The description of the materials in the archival collection.
Possible template for <eadheader>:
<eadheader audience="internal" findaidstatus="unverified-full-draft" langencoding="ISO 639-2">
<!-- One possible eadid -->
<eadid type="SGML catalog">PUBLIC "-//[NAME OF OWNER]::[SUBORDINATE DIVISION]//TEXT (US::[NUC ID]::[LOCAL REFERENCE CODE]::[TITLE OF COLLECTION: John Doe Papers, yyyy-yyyy])//EN" "[FILENAME].xml"</eadid>
<filedesc>
<titlestmt>
<titleproper>Inventory of the [title of collection: John Doe Papers], <date>[inclusive dates of collection: yyyy-yyyy]</date></titleproper>
<author>Processed by: [person(s) who processed the collection]; machine-readable finding aid created by: [person marking up finding aid]</author></titlestmt>
<publicationstmt>
[entity containing name and address: &hdr-[NUC]-[name];]
<!-- Copyright Information Not Required -->
<p><date>© [yyyy]</date> [Copyright holder]. All Rights Reserved.</p></publicationstmt></filedesc>
<profiledesc>
<creation>Machine-readable finding aid derived from
[paper by means of scanning and OCR; OCR file edited for typographical errors before encoding.] [[application name] text editing software.] [typescript by rekeying.] [database name.] [automated markup system.]
<lb/>Date of source: [date of processing completion: month dd, yyyy]
<lb/>Processed by [person who processed the collection] [date of processing completion: month dd, yyyy]; Finding Aid encoded by [person marking up finding aid], [repository], <date>[date of markup: month dd, yyyy]</date></creation>
<langusage>Description is in <language>English.</language>
</langusage>
</profiledesc>
</eadheader>
All elements listed in the template should be considered required unless otherwise noted, and all data should be included even if it is a placeholder. For instance, if the name of the processor is not known, "Library Staff" would be an appropriate placeholder.
<eadheader audience="internal" findaidstatus="unverified-full-draft" langencoding="ISO 639-2">
This attribute and its values may be used for internal administrative purposes. Each of the four values should be assigned a specific meaning. If no meaning is assigned, the attribute should be set to "unverified-full-draft." One such example is in Appendix D.
<eadid type="SGML catalog">PUBLIC "-//[NAME OF OWNER]::[SUBORDINATE DIVISION]//TEXT (US::[NUC ID]::[LOCAL REFERENCE CODE]::[TITLE OF COLLECTION: John Doe Papers, yyyy-yyyy])//EN" "[FILENAME].xml"</eadid>
<titlestmt>
<titleproper>Inventory of the [title of collection: John Doe Papers], <date>[inclusive dates of collection: yyyy-yyyy]</date></titleproper>
<author>Processed by: [person(s) who processed the collection]; machine-readable finding aid created by: [person marking up finding aid]</author></titlestmt>
<publicationstmt>
&hdr-ndd-spcoll;
<p><date>© [yyyy]</date> [Copyright holder]. All Rights Reserved.</p></publicationstmt></filedesc>
<profiledesc> <creation>Machine-readable finding aid derived from [paper by means of scanning and OCR; OCR file edited for typographical errors before encoding.] [[application name] text editing software.] [typescript by rekeying.] [database name.] [automated markup system.]
Template for <frontmatter>:
<frontmatter>
<titlepage>
<titleproper>Inventory of the [title of collection: John Doe Papers], <date>[inclusive dates of collection: yyyy-yyyy]</date></titleproper>
<publisher>[repository name][subordinate division]
<lb/>
<extptr show="embed" entityref="[entityname]"/>
<lb/>[institution]
<lb/>[city, state, zip, USA]</publisher>
&tp-[NUC ID]-[name];
<list>
<defitem>
<label>Processed by </label>
<item>[person who processed the collection]</item></defitem>
<defitem>
<label>Date Completed </label>
<item><date>[date of processing completion: month dd, yyyy]</date></item></defitem>
<defitem>
<label>Encoded by </label>
<item>[person marking up finding aid]</item></defitem></list>
<!--copyright inclusion is optional--><p>© [yyyy] [repository]. All rights reserved.</p>
</titlepage>
</frontmatter>
The frontmatter section provides an "electronic title page" for the finding aid. It is this section that creates the title page as displayed to the user by an XML system. The information displayed in the <frontmatter> is important in identifying the finding aid to the user. Like the <eadheader>, this section will usually be generated automatically. The finding aid title, institutional information, and processing and encoding information should be included even if only as a placeholder; for example, if the name of the processor is not known, "Library Staff" would be an appropriate placeholder. However, the specifics of tag organization and the depth of information provided will be decided by the institution.
<titlepage>
<titleproper>Inventory of the [title of collection: John Doe Papers], <date>[inclusive dates of collection: yyyy-yyyy]</date></titleproper>
"Inventory of the" may be used as a default.
Note that the <titleproper> element provides a title for the finding aid rather than the collection proper. Because <frontmatter> occurs before the <archdesc> we are not yet describing the archival collection, but rather the finding aid itself and XML file that contains it.
John Q. Doe Papers, 1932-1986
Rather than:
Papers of John Q. Doe, 1932-1986
Notice potential confusion between dates of the collection materials and birth and death dates. This may require the revision of existing collection names.
<publisher>[repository name][subordinate division]
<lb/>
<extptr show="embed" entityref="[entityname]"/>
<lb/>[institution]
<lb/>[city, state, zip, USA]</publisher>
&tp-[NUC ID]-[name];
The entity is declared in the declaration subset of the EAD instance.
The entity is referenced within the document instance through the entityref attribute of <extptr>:
<list>
<defitem>
<label>Processed by </label>
<item>[person who processed the collection]</item></defitem>
<defitem>
<label>Date Completed </label>
<item><date>[date of processing completion: month dd, yyyy]</date></item></defitem>
<defitem>
<label>Encoded by </label>
<item>[person marking up finding aid]</item></defitem></list>
<p>© [yyyy] [repository]. All rights reserved.</p>
</titlepage>
</frontmatter>
<archdesc level="collection" langmaterial="eng">
[ ... ]
</archdesc>
Discussion
The <archdesc> element contains the archival description proper. It contains all information related to the collection being described, whereas elements leading up to <archdesc> describe the finding aid document itself.
The level attribute of <archdesc> is required and represents the highest level of description contained within the finding aid. Although the default is "collection," other possibilities are: series, file, fonds, item, otherlevel, recordgrp, subgrp, and subseries.
The langmaterial attribute describes the language of the material being described using 3-character ISO 639-2 codes. These are available at http://etext.lib.virginia.edu/tei/iso639.html (about halfway down the page). When two or more language codes are required, they may be separated with a space:
<archdesc level="collection" langmaterial="eng fre spa ger">
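The shape of a langmaterial value (space-separated, three-letter codes) can be checked with a short Python sketch. The function name is hypothetical, and the check covers form only: it does not confirm that a code actually appears in the ISO 639-2 list, which would require a copy of that list.

```python
import re

# Three lowercase letters, the general shape of an ISO 639-2 code.
CODE_RE = re.compile(r"[a-z]{3}")


def malformed_language_codes(value):
    """Return the space-separated codes that are not three lowercase letters."""
    return [code for code in value.split() if not CODE_RE.fullmatch(code)]


print(malformed_language_codes("eng fre spa ger"))
print(malformed_language_codes("eng, fre EN"))
```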
The <archdesc> contains several important elements:
While the DTD allows freedom in the ordering of some of these elements, the listing above is the recommended order.
Some other elements may be included within <archdesc> as needed:
Other elements such as <note> are not recommended for inclusion at this level unless careful consideration indicates otherwise. See section 2.10 of these guidelines.
Elements contained by <archdesc> and their contents will be examined in turn.
Template for <did>:
<did>
<head>Descriptive Summary</head>
<unittitle label="Title">[title of collection: John Doe Papers],
<unitdate type="inclusive">[inclusive dates of collection: yyyy-yyyy]</unitdate></unittitle>
<unitid>[Collection number]</unitid>
<origination label="Creator"><persname>[authority name of creator]</persname></origination>
<!-- Do not include langmaterial until the next version of EAD DTD is released -->
<langmaterial>Language the Finding Aid is written in</langmaterial>
<physdesc label="Extent">
<extent>[number of linear feet] Linear Feet</extent>
<extent>[number of items] Items</extent></physdesc>
<repository label="Repository">
<corpname>[Repository]</corpname></repository>
<physloc label="Location">Contact reference services for access to these materials.</physloc>
<abstract label="Abstract">[Abstract]</abstract>
</did>
The <did> is one of the most important elements in EAD because it groups together key descriptive information about the collection or unit that is being described. At the highest level, located directly within the <archdesc> and often called the "high-level <did>," the <did> contains several elements, indicated in the template, which are considered to be required for adequate archival description. These elements should always be included, and placeholder text inserted if information does not exist or is not to be shared. For retrospective conversion, it may be necessary to consult additional sources of information such as MARC records.
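A simple script can confirm that a high-level <did> contains the elements shown in the template. The Python sketch below is illustrative: the list of required names is taken from the template above (with <langmaterial> omitted, since the template says not to include it yet), the function name and sample values are hypothetical, and only direct children of <did> are checked, so a nested <unitdate> is not verified.

```python
import xml.etree.ElementTree as ET

# Direct children of the high-level <did> shown in the template above.
REQUIRED = (
    "unittitle", "unitid", "origination", "physdesc",
    "repository", "physloc", "abstract",
)


def missing_did_elements(did_xml):
    """Return the required element names absent from a <did> fragment."""
    did = ET.fromstring(did_xml)
    return [name for name in REQUIRED if did.find(name) is None]


SAMPLE_DID = """<did>
<head>Descriptive Summary</head>
<unittitle label="Title">John Doe Papers,
<unitdate type="inclusive">1932-1986</unitdate></unittitle>
<unitid>[Consult Repository]</unitid>
<origination label="Creator"><persname>Doe, John</persname></origination>
<physdesc label="Extent"><extent>2 Linear Feet</extent></physdesc>
<repository label="Repository"><corpname>Example Repository</corpname></repository>
<physloc label="Location">Contact reference services for access to these materials.</physloc>
<abstract label="Abstract">Placeholder abstract.</abstract>
</did>"""

print(missing_did_elements(SAMPLE_DID))
```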
<unittitle label="Title">[title of collection: John Doe Papers],
<unitdate type="inclusive">[inclusive dates of collection: yyyy-yyyy]</unitdate></unittitle>
<unitid countrycode="US" repositorycode="[NUC code]" label="Collection Number">[Consult Repository]</unitid>
<origination label="Creator">
<persname>[name of creator of collection in authority form]</persname></origination>
--IN 2002--<langmaterial>[to be determined]</langmaterial>
<physdesc label="Extent">
<extent>[number of linear feet] Linear Feet</extent>
<extent>[number of items] Items</extent></physdesc>
<physloc label="Location">Contact reference services for access to these materials.</physloc>
<abstract label="Abstract">[Abstract]</abstract>
This section of the <archdesc> will undergo substantial changes when the new version of the EAD DTD is released. See below.
<admininfo>
<head>Administrative Information</head>
<accessrestrict>
<head>Access Restrictions</head>
<p>[restrictions]</p></accessrestrict>
<userestrict>
<head>Usage Restrictions</head>
<p>[copyright notice]</p></userestrict>
<!-- Note: "[Identification of item]" will always appear in brackets -->
<prefercite><head>Preferred Citation</head><p>[Identification of item], [title of collection: John Doe Papers], [institution].</p></prefercite>
<acqinfo>
<head>Acquisition Information</head>
<p>[gift, purchase, etc.] </p></acqinfo>
<processinfo>
<head>Processing Information</head>
<p>Processed by [person who processed the collection]</p>
<p>Completed [date of processing completion: month dd, yyyy]</p></processinfo></admininfo>
All elements in the <admininfo> will be unbundled in the next version of the EAD DTD, due out in early 2002. This section will have to be changed to reflect the update. It would be best to have this conversion in place as soon as the DTD is released to avoid any further retrospective conversion of finding aids. EAD 1.0 will be supported for the foreseeable future, but legacy data conforming to EAD 1.0 may not be worth maintaining if it can be converted easily. This section will change substantially when the new release is available.
The <admininfo> element is another important required element within <archdesc>. It provides administrative information about the collection helpful to both users who wish access to the collection and to archivists in managing the collection. Items included in the template should be considered to be required. Other available elements within <admininfo> may be added as needed, provided care is taken to ensure that they are applied consistently. Default language should be created wherever possible for headers, information, and restrictions contained within <admininfo> elements, although the uniqueness of many collections will defy these attempts.
<accessrestrict>
<head>Access Restrictions</head>
<p>[restrictions]</p></accessrestrict>
<userestrict>
<head>Usage Restrictions</head>
<p>[copyright notice]</p></userestrict>
The copyright interests in the Edgar Hatcher Papers have not been transferred to Duke University. For further information, see the section on copyright in the Regulations and Procedures of the Rare Book, Manuscript and Special Collections Library or consult a reference archivist.
<!-- Note: [Identification of item] always in brackets -->
<prefercite><head>Preferred Citation</head><p>[Identification of item], [title of collection: John Doe Papers], [institution].</p></prefercite>
<acqinfo>
<head>Acquisition Information</head>
<p>[gift, purchase, etc.]</p></acqinfo>
<processinfo>
<head>Processing Information</head>
<p>[Noteworthy information about the processing of the collection]</p></processinfo>
<bioghist>
<head>Biographical Note</head>
<chronlist>
<chronitem><date>[yyyy]</date>
<event>[event]</event></chronitem>
<chronitem><date>[yyyy]</date>
<event>[event]</event></chronitem>
<chronitem><date>[yyyy]</date>
<event>[event]</event></chronitem></chronlist>
<p>[Biographical/historical note paragraph describing the creator of the collection.]</p>
<p>[Biographical/historical note paragraph describing the creator of the collection.]</p>
</bioghist>
The <bioghist> element contains the biographical and/or historical note for the collection. Generally a biographical note will contain descriptive paragraphs and/or a chronology of dates and events. See the EAD Tag Library and the EAD Application Guidelines 3.5.1.5. for more information on <bioghist>.
<bioghist>
<bioghist>
<head>Biographical Sketch of John Doe</head>
<p>…</p>
<p>…</p>
</bioghist>
<bioghist>
<head>Biographical Sketch of Jane Doe</head>
<p>…</p>
</bioghist>
</bioghist>
Encoding subject elements such as <persname>, <corpname>, etc. within the text of the <bioghist> can be useful if there are sufficient time and resources to do so. Since the <bioghist> may contain broader, contextual information, be careful not to encode information that describes people or events that are not represented in the collection.
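As a minimal sketch of such inline encoding, assuming a creator and an employer that are both represented in the collection (the sentences are hypothetical; the names are borrowed from other examples in these guidelines):

```xml
<bioghist>
<head>Biographical Note</head>
<!-- Hypothetical text: only names represented in the collection are encoded -->
<p><persname>John Doe</persname> was employed by the <corpname>Duke Power Company</corpname> for most of his career.</p>
<!-- Broader context that is not represented in the collection is left as plain text -->
<p>He was born during the Great Depression.</p>
</bioghist>
```

The second paragraph is deliberately left without subject elements, since it supplies context rather than describing material held in the collection.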
<head>Biographical Note</head>
- A note containing only biographical information about a person
<head>Historical Note</head>
- A note containing historical information, usually about a corporation, place, family, etc.
<head>Biographical and Historical Note</head>
- A note combining biographical and historical information
<scopecontent>
<head>Collection Overview</head>
<p>[Scope and Content Note paragraph describing the collection]</p>
<p>[Scope and Content Note paragraph describing the collection]</p>
</scopecontent>
<p>Also included are:</p>
<list type="simple">
<item>Moses Abramowitz</item>
<item>Jess Benhabib</item>
<item>Clive Bull</item>
<item>David Colander</item>
</list>
Depending upon the finding aid being encoded, different options are available for the use of <scopecontent>, <arrangement>, and <organization>. Check the EAD Tag Library for an explanation of the subtle differences between Arrangement and Organization.
If a legacy finding aid lists arrangement and organization information as part of the scope and content note, and it is easily discernible as separate paragraphs or passages of text, these elements may be included directly in the <scopecontent>. If a finding aid has a completely separate section under a different heading, it is preferable to place the information in its own element on the same level as <scopecontent>. For example:
<scopecontent>
<head>Collection Overview</head>
<p>[Scope and Content Note paragraph describing the collection]</p>
<p>[Scope and Content Note paragraph describing the collection]</p>
</scopecontent>
<arrangement>
<head>Arrangement Information</head>
<p>[Arrangement Note paragraph describing the arrangement of the collection]</p>
</arrangement>
<organization>
<head>Organization Information</head>
<p>[Organization Note paragraph describing the organization of the collection]</p>
</organization>
The <odd> element is very useful when retrospectively encoding legacy finding aids. Older finding aids often contain a long prose description in which biographical and historical, arrangement, organization, and scope and content information are all mixed together. In these cases, rather than revising the text to tease apart the various kinds of information, using the <odd> tag to contain this data is very helpful and time-saving. "Collection Overview" may be used as a <head> for <odd> in the same manner as for <scopecontent>.
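A sketch of how such a mixed legacy description might be wrapped, following the placeholder style of the templates in these guidelines (the bracketed text is illustrative only):

```xml
<odd>
<head>Collection Overview</head>
<p>[Legacy prose paragraph mixing biographical, scope and content, and arrangement information]</p>
<p>[Legacy prose paragraph mixing biographical, scope and content, and arrangement information]</p>
</odd>
```

The legacy paragraphs are carried over as they stand; no attempt is made to sort sentences into <bioghist>, <scopecontent>, or <arrangement>.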
<controlaccess>
<head>Online Catalog Headings</head>
<p>[Boilerplate Text: These and related materials may be found under these subject headings in online catalogs.]</p>
<list type="simple">
<item>[subject term]--[subdivision]</item>
<item>[subject term]--[subdivision]</item>
<item>[subject term]--[subdivision]</item>
<item>[subject term]--[subdivision]</item>
<item>[subject term]--[subdivision]</item>
<item>[subject term]--[subdivision]</item>
</list></controlaccess>
The basic collection-level <controlaccess> element contains a list of subject terms and is required for adequate collection-level description. These should be identical to the MARC record for the collection. Some legacy finding aids contain a list of subject terms which should also be included in this section.
Although not required, it may be of use to researchers to explain where to find other sources that use these controlaccess terms.
Each subject term within <item> elements will be further surrounded by one of the elements listed below:
- <subject>
- <persname>
- <corpname>
- <geogname>
- <famname>
- <genreform>
- <occupation>
- <title>
- <name>
- <function>
See the EAD Tag Library for definitions of these elements.
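Filled in, the collection-level template might look something like the following sketch. The headings are borrowed from other examples in these guidelines and stand in for whatever terms appear in the collection's MARC record:

```xml
<controlaccess>
<head>Online Catalog Headings</head>
<p>These and related materials may be found under these subject headings in online catalogs.</p>
<list type="simple">
<!-- Each term sits inside an <item> and is wrapped in the element matching its type -->
<item><persname>Doe, John, 1932-1989.</persname></item>
<item><corpname>Duke Power Company.</corpname></item>
<item><subject>Education--United States.</subject></item>
<item><genreform>Photographs.</genreform></item>
</list></controlaccess>
```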
Encodinganalog and source attributes may be used if desired to map these elements to their corresponding fields in MARC format:
<corpname encodinganalog="610" source="lcnaf">Board of Game and Fish Commissioners of Minnesota.</corpname>
<subject encodinganalog="650" source="lcsh">Law enforcement.</subject>
Subject Subdivisions
Separate subdivisions with two short dashes or hyphens (with no spaces) as in the example.
Enclose the entire subject and subdivision within a single element rather than attempting to mark each subdivision individually. For example:
<subject>Education--United States.</subject>
is correct, rather than:
<subject>Education</subject>--<geogname>United States.</geogname>
Because the subject and its subdivisions in the example would occur within a 650 field in the MARC record, the entire subject string is placed within the equivalent <subject> element in EAD. If there is a question as to which element to use for a particular subject, refer to its field within the MARC record and compare this against the USMARC to EAD Crosswalks found in Appendix B of the EAD Application Guidelines.
While the supplied template combined with added subject elements represents a minimum practice, other more complex implementations of <controlaccess> are possible. <controlaccess> terms may be grouped by type. Example:
<controlaccess>
<head>Online Catalog Headings</head>
<controlaccess>
<head>Personal Names</head>
<persname>Doe, John, 1932-1989.</persname>
</controlaccess>
<controlaccess>
<head>Organizations</head>
<corpname>Duke Power Company.</corpname>
</controlaccess>
<controlaccess>
<head>Forms of Material</head>
<genreform>Photographs.</genreform>
</controlaccess>
</controlaccess>
See the EAD Application Guidelines 3.5.3 for another example of subject groupings by type.
The <dsc> element is used for container lists, series descriptions, and other descriptive lists, and is the most complex and time-consuming section of the encoded finding aid. When dealing with retrospective conversion of legacy material there will be many variants in container lists to contend with. Based on analysis of current and past descriptive practices, EAD indicates three basic types of container lists (specified in the type attribute of the <dsc> element).
Because of the complexity and number of potential variants in container lists, the <dsc> templates can serve as a general guide only. For more extensive examples of <dsc> types see Appendix E. In the following pages some methods of encoding will be discussed but much will depend on the contents of a particular collection, its research value, and staff resources. In many cases, further suggestions and best practices will be placed in the appendices of this document to guide you when there is a need for more complex encoding. Also see the EAD Tag Library and EAD Application Guidelines for extensive examples and greater illumination on specific topics.
<dsc type="analyticover">
<head>List of Series</head>
<dsc type="in-depth">
<head>Container List</head>
<dsc type="combined">
<head>Description/Container List</head>
Some institutions encode finding aids with series descriptions in one section (analyticover) followed by a complete container listing (in-depth) in another. A user is thus able to review the series descriptions in summary form, or as a table of contents, and then turn to the container listing for an in-depth box listing. This type of encoding is generally considered a legacy form of archival description but is helpful for the retrospective conversion of finding aids. New finding aids integrate the series descriptions with the container list, enabling the use of the "combined" type of <dsc>. XSL stylesheets are able to replicate the two-part effect from a fully encoded combined <dsc>, negating the need to use multiple <dsc> elements for this purpose.
Note that the <dsc> element is a wrapper element meant to provide different views of the same data, rather than two sets of data within one finding aid. For example, an analyticover <dsc> table of contents describes the same series as the in-depth <dsc>, but at a different level. See the sections on Oversize Materials (4.1) and Later Accessions and Additions (4.2) for information on how to include these within the <dsc> element. The Online Archive of California considers it best practice to not use multiple <dsc> in any circumstance. Using multiple <dsc>, while permitted, should be done only after careful consideration and consultation.
Non Tabular vs. Tabular Layout
All finding aids encoded for NCEAD will follow non-tabular layout. Any tabular layout should be generated by a specially designed stylesheet.
EAD includes tags such as <drow> and <dentry> which may be embedded within container lists to specify tabular layout. While these will continue to be a part of EAD 1.0, their use has greatly decreased; encoders now prefer to let stylesheets create any desired tabular layout. Tabular tagging is in fact no longer a default feature of EAD 1.0 and must be intentionally "turned on" by the user. It greatly increases both the complexity of the markup and the time needed to produce it, and is not recommended under any circumstances.
The basic required element within each component <c0x> is <did> with <unittitle> within. For example:
<c02><did>
<unittitle>[collection content]</unittitle>
</did></c02>
Because of the hierarchical nature of EAD, all elements within <did> are also available at this level in addition to such elements as <scopecontent>, <admininfo>, <arrangement>, <organization>, etc. See below for discussion of the use of these elements.
The highest level components should be identified with "series" attributes when what is being described is a true series, usually occurring with the word "Series" in the title, i.e. "Correspondence Series." In this case, set the level attribute for the <c01> to "series." In the case of a box listing without hierarchy in which materials are not arranged into series, <c01> elements are used without the attribute. For example:
<dsc type="in-depth">
<head>Container List</head>
<c01><did><unittitle>Office Files</unittitle>
</did></c01>
<c01><did><unittitle>Correspondence</unittitle>
</did></c01>
</dsc>
The <c02> element is often used without its level attribute set to "subseries." Like <c01> it is normally only set if the component being described contains the word "subseries" or is easily discerned as a subseries. It is a common occurrence to find a long list of components on the <c02> level which are not subseries.
In summary, <c01> is not always a series and <c02> is not always a subseries; they are both simply components of description. It is common to have a finding aid in which <c02 level="subseries"> (or even <c02 level="series"> in the case of additions; see Section 4.2) is used where appropriate, while <c02> without an attribute is used in other areas of the finding aid.
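The following sketch shows both uses side by side in one finding aid. The series and folder titles are hypothetical and chosen only to illustrate where the level attribute is and is not set:

```xml
<dsc type="combined">
<head>Description/Container List</head>
<c01 level="series"><did><unittitle>Correspondence Series</unittitle></did>
<!-- A true subseries carries the level attribute -->
<c02 level="subseries"><did><unittitle>Incoming Correspondence Subseries</unittitle></did></c02>
</c01>
<c01 level="series"><did><unittitle>Office Files Series</unittitle></did>
<!-- A component that is not a subseries carries no level attribute -->
<c02><did><unittitle>Budget reports</unittitle></did></c02>
</c01>
</dsc>
```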
Repositories that number component levels should include the numbers in the <unittitle> element:
<c01><did>
<unittitle>1. Correspondence</unittitle></did>
<c02><did><unittitle>1.1. Early, 1966</unittitle></did></c02>
</c01>
Information about the physical location (box, folder, item, reel, etc.) of materials within the collection is encoded within the <container> element. Primarily, this information is used by researchers to request materials and by reference services to retrieve them. It may also help users to understand the amount of material in a section of the collection. Since EAD is primarily focused on the structure of the intellectual content, information about the physical organization is of secondary concern. There are several methods of encoding container information, ranging from more general to more specific. Examples of three of these methods, each more explicit about container information than the last, appear in Appendix F, although they are not meant to prescribe the correct way to encode <container> tags. NCEAD encourages each institution to encode container information consistently within its finding aids using one of these methods. There is also a wealth of information to be found in sections 3.5.2.4 and 7.2.5 of the EAD Application Guidelines. For retrospective encoding of existing finding aids, encode container information as employed in the original, especially if you are not double-checking the original collection.
There are advantages and disadvantages to each of these methods of encoding. It is perfectly acceptable to exclude container information from the finding aid altogether. However, most institutions find that this information does assist users and researchers to some degree. NCEAD cannot require a certain method of encoding because the time and effort spent on the <dsc> is the prerogative of the institution. Sections 3.5.2.4 and 7.2.5 of the EAD Application Guidelines describe some of the factors and rules that go into making this decision. Additionally, Appendix F of these guidelines attempts to explore some of the issues. Each institution must decide how and when to use container tags (and other optional tags) and consistently apply those decisions.
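As one possible illustration at the more explicit end of that range, a repository that records both box and folder for every component might encode a folder as follows. This is a sketch only; the box and folder numbers are hypothetical, and Appendix F remains the reference for the methods themselves:

```xml
<c02><did>
<!-- Hypothetical location: box 1, folder 3 -->
<container type="box">1</container>
<container type="folder">3</container>
<unittitle>Correspondence, <unitdate type="inclusive">1956</unitdate></unittitle>
</did></c02>
```

A less explicit approach would record the box once at a higher component and omit folder numbers entirely.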
It is important that every described <c0x> level include a <unittitle> element. The unittitle is what describes the component level and more broadly describes all components nested within. A component without a unittitle should be carefully evaluated to discover what purpose it serves before including it in the EAD encoding.
Correspondence
1956
1957-1958
Always place the date within a <unittitle>, as the date constitutes the title of the unit as well as a date:
<unittitle>
<unitdate type="inclusive">1956</unitdate>
</unittitle>
<unittitle>Correspondence, <unitdate type="inclusive">(1956-1965), </unitdate><unitdate type="bulk">1961-1962</unitdate></unittitle>
The <physdesc> element may provide general and detailed information about the presence and nature of the formats of material present within a collection (e.g. photographs, nitrate film, computer disks, etc.). The tag may be used by itself for narrative description or in conjunction with further tags for more specific description of the materials. Of primary use are the <extent>, <dimensions>, <genreform>, and <physfacet> tags. Inclusion of this information is optional and will be determined by each institution based on time and resources, researcher needs, and preservation concerns. Note that the <physdesc> element must be outside <unittitle> but within <did>. The <physdesc> is not the same as the information encoded within the <container> element.
For more information about use of the <physdesc> tag please see Appendix G.
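A minimal sketch of <physdesc> at the component level, placed within <did> but outside <unittitle>. The title, dates, and extent are hypothetical:

```xml
<c02><did>
<unittitle>Family photographs, <unitdate type="inclusive">1920-1945</unitdate></unittitle>
<!-- <physdesc> is a sibling of <unittitle>, not a child of it -->
<physdesc><extent>45 items</extent> <genreform>Photographs.</genreform></physdesc>
</did></c02>
```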
In "analyticover" and "combined" type <dsc>s the <scopecontent> element will be frequently used. This tag is used outside of the <did> tag for narrative information limited to the component level it appears in.
<c01 level="series">
<did>
<container type="box">1</container>
<unittitle>[series], <unitdate type="inclusive">[yyyy-yyyy]</unitdate></unittitle>
</did>
<scopecontent><p>[scope and content note about this series]</p></scopecontent>
</c01>
<arrangement> and <organization> may also be used as needed; however, if a descriptive narrative blends such information with scope and content, use <scopecontent> only. If such a narrative blends scope and content, biographical, arrangement, and/or organization information, use <odd>.
For short scope and content notes in an "in-depth" type <dsc> and "combined" <dsc>s, you may avoid the use of <scopecontent> by simply encoding the description as part of the <unittitle>. This was developed as a standard practice in the American Heritage Project. For example:
Correspondence, 1989
Includes notes and clippings
EAD encoding:
<c02><did>
<unittitle>Correspondence, <unitdate>1989</unitdate> Includes notes and clippings</unittitle>
</did></c02>
This section will change substantially when the new version of EAD is released early in 2002. Primarily, elements such as <accessrestrict> once bundled within <admininfo> will now be available directly through the <c0x>.
<admininfo> is also available in components. This is most often used when a series, subseries, folder, or other level component is restricted, but may also be used as needed for other administrative information such as preservation concerns. Example:
<c01 level="series">
<did>
<container type="box">1</container>
<unittitle>[series], <unitdate type="inclusive">[yyyy-yyyy]</unitdate></unittitle>
</did>
<admininfo>
<accessrestrict><p>Restricted</p></accessrestrict>
</admininfo>
</c01>
Do not use the <head> element for <c0x> elements. <head> elements usually repeat the contents of <unittitle> and thus should not be included.
This section will change substantially when the new version of EAD is released early in 2002. Primarily, elements such as <index> once bundled within <add> will now be available directly through the <c0x>.
The <add> element may be used in a variety of ways and is very useful for retrospective encoding of existing finding aids. Primarily structures such as indexes, bibliographies, separated materials, and related material notes may be encoded using <add> and its associated elements. The <add> tag allows for the description of navigational, reference, and other information that does not have a natural place within other <dsc> tags.
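A sketch of two of these structures, a related material note and a separated material note, grouped within <add>. The headings and bracketed text are illustrative placeholders in the style of the templates above:

```xml
<add>
<relatedmaterial>
<head>Related Material</head>
<p>[Note describing related collections held by this or another repository]</p>
</relatedmaterial>
<separatedmaterial>
<head>Separated Material</head>
<p>[Note describing materials removed from the collection and where they are now located]</p>
</separatedmaterial>
</add>
```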
An Oversize Materials listing is one method of describing materials that are associated with a collection by provenance but that, because of abnormal size or format, cannot be stored with the rest of the collection. These oversize materials may be described in a separate section of the finding aid which appears at the end of the container list. (Alternatively, they may be described in an <add><separatedmaterial> (see section 3.5.7) or not encoded at all.)
Encode oversize materials as a series with the title "Oversize Materials" as a <c01> component, as if it were another series in the collection. When the series from which oversize materials are drawn are listed, they will fall at the <c02> level, as in the example below. Since these items are separated from the rest of the collection, detailed container information should be included.
Example:
<c01 level="series"><did><unittitle>Oversize Materials</unittitle></did>
<c02 level="subseries"><did><unittitle>Advertisements Series</unittitle></did>
<c03><did><container type="box">Hartman Center Ovsz. Box 1 (20x24)</container>
<unittitle><corpname>Beechcraft (McCann-Erickson), </corpname><unitdate type="inclusive">
1983-1987</unitdate></unittitle></did></c03>
[...]
View the finding aid containing this example at:
Full SGML source is available at:
A challenge frequently faced by encoders is that of highly active collections to which there are many additions over a period of time. Often these are the result of living donors who donate parts of their collections over time, resulting in a series of several accessions at the repository. Other times they are archival collections with records management schedules which dictate periodic additions. Each addition to a collection necessitates a review to determine whether the acquisition establishes a new collection or is added to the end of the existing collection. In the former case, the effect on EAD encoding is simply to create another finding aid, since a new body of materials is being described. In the latter case, the information about the new addition must be added to an existing finding aid. These additions will be added as a <c0x> element, creating an artificial series at the end of the document. Adding an artificial series is preferable to encoding the addition as a new <dsc>. Although there are situations where multiple <dsc>s are allowed, in those cases each <dsc> describes the same body of materials in a different way or at a different level.
In addition to adding the accession description to the end of the finding aid, the new accession will usually mandate changes to collection level description in the <archdesc> to include the content and context of the new accession. Additionally, the <revisiondesc> tag in the <eadheader> should be used to retain information about changes to the finding aid.
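A sketch of how such a change might be recorded in <revisiondesc>. The bracketed values are placeholders, and the wording of the <item> is only a suggestion:

```xml
<revisiondesc>
<change>
<date>[month dd, yyyy]</date>
<!-- Briefly state what was changed and why -->
<item>Added description of Accession [number] and revised the collection-level description.</item>
</change>
</revisiondesc>
```

Each later accession would add a further <change> element, so the header keeps a running history of revisions to the finding aid.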
The heading "Accession [number]" is used as the <unittitle> on the <c01> level, with the series making up the accession falling at the <c02> level. The <unitid> element is used to further distinguish this accession from previous or subsequent accessions.
Example:
<c01 level="series">
<did>
<unitid>2001-0022</unitid>
<unittitle>Accession 2001-0022</unittitle>
</did>
<scopecontent>
<p>This accession includes correspondence files; author files; galley proofs of books; sales and marketing files, including journals in which advertisements or reviews of Sarabande Books'</p>
</scopecontent>
<c02>
<did>
<container type="box">1</container>
<unittitle>Nonprofit Files, <unitdate>1997</unitdate></unittitle>
</did>
<c03>
<did>
<unittitle>Board Meeting</unittitle>
</did>
</c03>
<c03>
<did>
<unittitle>Nonprofit Activity</unittitle>
</did>
</c03>
The finding aid from which this example is drawn may be found at:
Full XML source is available at:
Dooley, Jackie, ed. Encoded Archival Description Part 1 - Context and Theory. American Archivist, Vol. 60 No. 3, Summer 1997.
Dooley, Jackie, ed. Encoded Archival Description Part 2 - Case Studies. American Archivist, Vol. 60 No. 4, Fall 1997.
EAD at Duke: http://scriptorium.lib.duke.edu/findaids/ead/
EAD Help Pages: http://jefferson.village.virginia.edu/ead/
EAD Working Group of the Society of American Archivists. Encoded Archival Description Application Guidelines Version 1.0. Society of American Archivists, 1999.
EAD Working Group of the Society of American Archivists. Encoded Archival Description Tag Library Version 1.0. Society of American Archivists, 1998.
Hoyer, Timothy P., Pollock, Alvin, and Miller, Stephen. Consortial Approaches to the Implementation of Encoded Archival Description (EAD): The American Heritage Virtual Archive Project and the Online Archive of California (OAC). Journal of Internet Cataloging, forthcoming 2000.
Official EAD Site: http://www.loc.gov/ead/
Pitti, Daniel. Encoded Archival Description: An Introduction and Overview. D-Lib Magazine, Vol. 5 No. 11, November 1999. http://www.dlib.org/dlib/november99/11pitti.html
Pitti, Daniel, et al. The Encoded Archival Description Retrospective Conversion Guidelines. Available from http://sunsite.berkeley.edu/amher/ (Note that these guidelines were developed in the Beta Version of EAD and have not been updated to reflect EAD 1.0.)
RLG EAD Advisory Group. RLG Encoded Archival Description Recommended Application Guidelines. http://www.rlg.org/rlgead/guidelines.html