5. METADATA
2007 Revised Edition
Metadata is like interest --- it accrues over time. To stretch the metaphor further, wise investments generate the best return on intellectual capital. Carefully designed metadata results in the best information management in the short- and long-term.
Anne J. Gilliland, "Setting the Stage," in Introduction to Metadata (Getty Standards Program).

5.1 Overview
How do we find the materials in our libraries, archives, museums, and historical societies? The descriptive tools that allow special collections to be accessed take myriad forms. Yet libraries, archives, museums, and historical societies, and indeed all cultural heritage institutions, depend on these tools of access to remain viable.

Among the many information repositories, libraries have the longest history of providing accessibility in a standardized format. The broad acceptance of cataloging conventions such as the Anglo-American Cataloging Rules and the MARC21 format allows users to move easily from library to library. In contrast, historical societies, museums, and archives have often used locally developed cataloging and access tools, reflecting the special nature of their holdings. Archives, for example, hold materials in many different formats (e.g., manuscripts, oral histories, photographs, objects, and films). Historical museums are even more idiosyncratic, and art museums are hybrids, combining many objects, archives, and library materials. From institution to institution (and individual collection to individual collection), their access tools vary in descriptive elements and formats.

The uniqueness of special collections has made the development and implementation of broad, uniform practices difficult, preventing broad cross-collection access. Recent advances, however, offer hope for greater, more uniform access in the near future. Digitization has been a clear part of those efforts, and every digital project must address metadata issues to provide the best access to its materials and to ensure that its collection information is available in the larger arena of digital access.

Today, there are several good descriptive systems available for use in cultural institutions. The most widely adopted is Dublin Core, a general descriptive system used by many multi-partner digitization projects to manage their electronic resources.
Other systems include Encoded Archival Description (EAD), which is a system of encoding finding aids for the Web, and the Text Encoding Initiative (TEI), a system for encoding textual documents primarily from the humanities and social sciences. These systems are generally favored by large, established institutions. Other descriptive systems have been developed for specific formats. Often, these individual systems can be related to each other through the descriptive elements that they share (e.g., creator/author or subject). This process is often referred to as "crosswalking."

Shared collection access methods (e.g., searching by subject across the holdings of several archives or across an archive, museum, and library) were difficult to accomplish in the pre-digital age. With technology, the dream of shared access is rapidly becoming a reality. The uniform description of resources (what librarians have always called cataloging) in an electronic form is one of the first steps in creating shared access.

Describing a resource is a difficult process, but an important one if the resource is to be accessible to the user. The more conformity to uniform practices, the more likely the resource will be located and used. The choice of a "cataloging system" is actually a choice of "metadata" formats.

What Is Metadata?

Metadata is informally defined as "information about information" or any data associated with a resource that describes that particular resource. A more general definition that is useful for cultural institutions is "structured information about any information resource of any media type or format."[1] In this context, an information object is anything that can be addressed and manipulated by a human or a system as a discrete entity. The essential aspect of a metadata system that describes an object, then, is its ability to provide a structured format for information about that object.
Metadata itself is essentially a modern term for the bibliographic information that libraries traditionally entered into their catalogs or the registry information on collections that museums have entered into their systems; however, the term metadata is most commonly used to refer to descriptive information about electronic resources. Cultural heritage institutions have been creating metadata for as long as they have been collecting cultural materials for preservation and presentation to the public. The impact of the digital environment on metadata has been the creation of electronic information in structured formats.

The creation of metadata for digital resources is an important part of any digitization project and must be incorporated into a project's workflow. Metadata should be created and associated with the digital resource to support the discovery, use, management, reusability, and sustainability of that resource. Metadata relating to digital resources is most often divided into five conceptual types (with some overlap among the five):

Descriptive metadata: information used for the indexing, discovery, and identification of a digital resource.
Analytical metadata: information about the subject and context of a digital resource.
Structural metadata: information used to display and navigate a digital resource; also includes information on the internal organization of the digital resource. Structural metadata might include information such as the structural divisions of a resource (e.g., chapters in a book) or sub-object relationships (such as individual diary entries in a diary section).
Administrative metadata: information needed for the management of the digital resource, including information regarding access, display, and rights management.
Preservation metadata: information about the digital image for preservation purposes, including the resolution at which the images were scanned, the hardware and software used to produce the image, compression information, pixel dimensions, etc. This information is important for migration and long-term sustainability of the digital resource. It can also be referred to as technical metadata.

"Finding" or "accessing" holdings is the most visible role of metadata in the electronic environment. Today's users come to the digital resource from home, work, school, etc., at any time of the day, and often without the assistance of a librarian, archivist, curator, museum educator, or other cultural heritage professional. In addition, digital resources have their own unique characteristics, and cultural institutions need to consider these characteristics as they try to integrate management of these resources into their traditional holdings. Metadata for digital resources needs to provide information that:
Unfortunately, there is no uniform metadata solution for all cultural materials. The metadata for text is different from the metadata for visual images. Further, the elements used to describe an object can change and grow as more becomes known about that object. Metadata should be thought of as a dynamic process. New metadata schemes continue to emerge for different formats of cultural materials and for different needs in managing those materials. It is important to stay current as the field of metadata grows and changes.

[1] Priscilla Caplan, Metadata Fundamentals for All Librarians (Chicago: American Library Association, 2003), p. 3.

How Do I Select the Best Metadata Standard for my Materials?

As indicated above, there is a wide variety of metadata standards available to cultural institutions. Selection of a standard should be based on the needs of the repository and its users. Deciding which metadata system to use for a collection can be a very individualized process and a daunting one. Here are some general guidelines that can be followed while making choices about metadata systems:
Recommended Metadata Standards for North Carolina

After a review of the most prominent metadata systems, consortial requirements, the descriptive tools being used by the state's largest digitization projects, and the types and holdings of institutions throughout the state, NC ECHO issued the following policy on metadata:

North Carolina ECHO recommends that North Carolina institutions wishing to participate in the statewide digitization project follow the metadata standards of at least Dublin Core, while acknowledging that some participating institutions may additionally employ the more robust descriptive systems such as EAD, TEI and others.

NC ECHO chose Dublin Core because it can be used to describe a wide variety of digital resources. It is the baseline of metadata standards. In its simplest form, as implemented by NC ECHO, it provides a basic level of access that involves the completion of only seventeen fields of information. In addition, Dublin Core is relatively easy to crosswalk to other metadata systems, so existing descriptive systems (even minimal ones) can be made to conform to the Dublin Core fields, which are extremely basic. To learn more about the Dublin Core Metadata Initiative, consult its web site (http://www.dublincore.org/).

North Carolina Dublin Core Elements

The current NC Dublin Core element set is composed of 17 elements (see table below for brief summary). They are familiar points of description and access to most workers in and users of cultural institutions.

The 17 NC Dublin Core Metadata Element Set*
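As a rough illustration of how little is involved in a basic Dublin Core record, the sketch below renders a handful of elements as HTML meta tags. The record values and the function name are invented for this example, the element names follow the "DC.Publisher" style used elsewhere in this chapter, and the output is not claimed to match what any NC ECHO tool produces.

```python
from html import escape

# Hypothetical record for illustration only; every value is invented.
SAMPLE_RECORD = {
    "Title": "Letter from a mill worker",
    "Creator": "Unknown",
    "Date": "1912-03-04",
    "Publisher": "Example County Historical Society",
}


def dublin_core_meta_tags(record):
    """Return a list of HTML lines describing a resource in Dublin Core.

    The first line declares the Dublin Core namespace; each following
    line carries one element as <meta name="DC.Element" content="...">.
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        # Escape quotes and angle brackets so the attribute stays well-formed.
        safe_value = escape(value, quote=True)
        lines.append(f'<meta name="DC.{element}" content="{safe_value}">')
    return lines


if __name__ == "__main__":
    print("\n".join(dublin_core_meta_tags(SAMPLE_RECORD)))
```

Pasting the printed lines into the head of a web page is one common way to attach a simple record to the page it describes; a project database or XML file would serve equally well.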
NC ECHO has a working group that examines the Dublin Core standard and provides implementation guidelines for NC ECHO participating institutions. These guidelines provide a general introduction to the Dublin Core standard and should assist institutions in analyzing their existing descriptive systems and adapting them to at least the minimal requirements of Dublin Core. Each element has been examined, and the guidelines include specific implementation advice for it.

In addition, the NC ECHO Dublin Core template provides an online tool for the creation of Dublin Core metadata. This web form helps with syntax and assures uniformity in the creation of HTML-coded Dublin Core, so that institutions can concentrate on the content of the metadata rather than its computerized structure. The template and use documentation are available at http://www.ncecho.org/ncdc/index.htm

Other Metadata Standards

While Dublin Core is the baseline of metadata standards, there are other standards that provide richer descriptive tools, retrieval possibilities, and other management capabilities for specific types of cultural materials. For example, Dublin Core is not as efficient as some systems at describing relationships between materials and hierarchies of information. This matters, for example, when describing manuscript and archival collections. Typically, an individual collection of manuscripts is composed of series, a series is composed of subseries, a subseries is composed of files or further subseries, and a file is composed of individual items. Another metadata standard, Encoded Archival Description (EAD), has been developed to address the need to describe relationships between materials and is discussed here in more detail. A brief list of other metadata standards follows.
EAD (Encoded Archival Description)

Encoded Archival Description (EAD) is a metadata system that leverages the structure of archival description found in archival finding aids through its encoding standard. It is an Extensible Markup Language (XML) document type definition (DTD) that enables EAD-encoded finding aids to be searched, retrieved, displayed, and exchanged. EAD is platform-independent and is maintained by the Society of American Archivists. It is a recognized international standard.

EAD is especially helpful in information retrieval because of its ability to identify particular areas of description in the finding aid and its ability to present information in a hierarchical fashion. By marking up a finding aid in EAD, the relationships between the series and subseries are maintained in the retrieval of the information about the collection.

NCEAD is NC ECHO's working group on the implementation of EAD in North Carolina. NCEAD has generated Best Practice Guidelines, tools, and supporting documentation to ease the implementation of EAD for North Carolina institutions. See http://www.ncecho.org/ncead/index.htm.

Society of American Archivists EAD Resources
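The way EAD keeps series, subseries, files, and items nested inside one another can be sketched with Python's standard XML library. The fragment built below uses the numbered component tags (c01, c02, ...) with did and unittitle children, but it is only a fragment, not a complete or valid EAD finding aid; the titles and the helper names are invented for illustration.

```python
import xml.etree.ElementTree as ET


def add_component(parent, tag, level, title):
    """Append one descriptive component (with its title) and return it."""
    component = ET.SubElement(parent, tag, {"level": level})
    did = ET.SubElement(component, "did")
    ET.SubElement(did, "unittitle").text = title
    return component


def build_fragment():
    """Build a small, hypothetical description of subordinate components."""
    dsc = ET.Element("dsc")
    series = add_component(dsc, "c01", "series", "Correspondence")
    subseries = add_component(series, "c02", "subseries", "Family letters")
    folder = add_component(subseries, "c03", "file", "Letters, 1912")
    add_component(folder, "c04", "item", "Letter, 4 March 1912")
    return dsc


def outline(element, depth=0):
    """Yield (depth, level, title) for every nested component, top down."""
    for child in element:
        if child.tag.startswith("c0"):
            yield depth, child.get("level"), child.findtext("did/unittitle")
            yield from outline(child, depth + 1)


if __name__ == "__main__":
    for depth, level, title in outline(build_fragment()):
        print("  " * depth + f"{level}: {title}")
```

Because each component sits inside its parent, a search hit on the item can always be shown together with the file, subseries, and series it belongs to.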
EAC (Encoded Archival Context)

EAC is an emerging standard for the description of record creators. It provides sections on identity, description (both formal and informal), relationships, and record maintenance. The standard approaches cultural heritage materials from a new perspective. Rather than describing materials, it describes the creators and provides connections to the materials relevant to those creators.

NC ECHO has a working group, NCEAC, that has examined the beta standard and adopted a union model for the NC ECHO project, entitled "North Carolina Biographical and Historical Information Online" (http://digitalnc.org/ncbhio/index.htm). This project includes content guidelines, input forms, and browse capabilities for existing records. Most importantly, the project relies on partner institutions contributing information about the people, families, and corporate bodies that have created the state's cultural heritage materials.

EAC Standards Documentation
MODS (Metadata Object Description Schema)

MODS is an XML schema developed by the Library of Congress. It is described as a bibliographic element set, but it may be used for a variety of different types of resources. MODS should be considered a richer metadata set than Dublin Core, with the advantages of the XML platform. It is derived from the MARC standard but provides a flexible platform for the description of digital objects.

Library of Congress MODS site
Visual Resources & Objects Standards

Categories for the Description of Works of Art (CDWA)
Cataloging Cultural Objects (CCO)
Visual Resources Association Core Categories (VRA Core)

To address the issues of metadata for visual and object resources, NC ECHO has collaborated with the North Carolina Museums Council (NCMC) to create a Metadata Working Group. This group analyzed existing metadata standards, primarily CCO, to create basic content guidelines for the description of visual resources and objects. These guidelines, along with recommendations for implementation and systems, will be available soon.

Text Encoding Standards

Text Encoding Initiative (TEI)

Oral Histories

Oral histories present interesting issues for metadata. NC ECHO is working with an Oral History Metadata Group to provide guidance on metadata for institutions that maintain oral history collections. The group will produce recommended guidelines for collection description as well as item-level oral history description.

Preservation Metadata

Maintaining information about the creation and maintenance of your digital images is an important aspect of digitization because it ensures the longevity of your work. NC ECHO has constructed a preservation metadata standard to aid in the long-term sustainability of the digital content created in digitization projects. The tools developed include a content standard as well as a Microsoft Access database tool available for institutions that might need it. See http://www.ncecho.org/presmet/index.htm for more information.

"Crosswalking"

"Crosswalking," the ability to move data across several different platforms, may be thought of as translating an element set in one metadata system to a related element set in another metadata system. This translation allows a user to search across the two systems. Crosswalking is also referred to as "mapping."
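The idea of translating one element set into another can be sketched in a few lines. The local field names below are hypothetical (they stand in for whatever an institution's own catalog uses), and the mapping to Dublin Core elements is illustrative rather than an official crosswalk.

```python
# Hypothetical local field names mapped to Dublin Core element names.
LOCAL_TO_DC = {
    "object_name": "Title",
    "maker": "Creator",
    "date_made": "Date",
    "topics": "Subject",
}


def crosswalk(local_record, mapping=LOCAL_TO_DC):
    """Translate a local record into target elements.

    Returns two dicts: the translated record (each element holds a list,
    since several local fields may map to one element) and the local
    fields that have no counterpart in the mapping.
    """
    translated = {}
    unmapped = {}
    for field, value in local_record.items():
        element = mapping.get(field)
        if element is None:
            unmapped[field] = value
        else:
            translated.setdefault(element, []).append(value)
    return translated, unmapped


if __name__ == "__main__":
    record = {"object_name": "Quilt", "maker": "Jane Doe", "shelf": "B4"}
    dc_record, leftovers = crosswalk(record)
    print(dc_record)
    print(leftovers)
```

Reporting the unmapped fields separately is a deliberate choice in this sketch: it shows at a glance which local information would be lost in the translation and may need a home in a richer standard.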
As defined by a NISO White Paper, October 1998, a crosswalk is "a set of transformations applied to the content of elements in a source metadata standard that results in the storage of appropriately modified content in the analogous elements of a target metadata standard." For more detailed information on crosswalking, see "Issues in Crosswalking Content Metadata Standards."

The crosswalking chart below demonstrates that many metadata systems share the same conceptual fields, even if those fields are not called the same thing in different systems. It is NC ECHO's goal to use crosswalks to tie together the different metadata standards employed by the state's cultural institutions. By creating consistent and standardized metadata throughout digitization projects and representations of your collections, you are contributing to this goal.

Crosswalking summary
Shareable Metadata

The principle of shareable metadata goes to the heart of metadata for digitization projects that are published on the Web. Shareable metadata is metadata that conforms to standards and includes the data elements that allow for contextual understanding. Fields such as "repository" (DC.Publisher) and conformance to technical standards are all components of shareable metadata. NC ECHO promotes the creation of shareable metadata by its partner institutions through its various implementation and best practice guidelines. For more information about the concept of shareable metadata and the reasons for its application, see "Moving toward shareable metadata" by Sarah Shreeves, Jenn Riley, and Liz Milewicz, available at http://www.firstmonday.org/issues/issue11_8/shreeves/index.html.

Controlled Vocabularies

A controlled vocabulary is a set of terms used consistently and defined very carefully. It helps little if archivists, museum professionals, and librarians recognize the same metadata fields, but then choose to fill them with their own descriptive phrasing. That is where controlled vocabularies enter the picture. A controlled vocabulary is used when the search results need to be consistent. If indexing is to work, a controlled vocabulary is a must.

Several different descriptive elements lend themselves to controlled vocabularies. Names of creators or contributors, genres or mediums, and subject listings all benefit from controlled vocabularies. Other fields, such as Date and Language, rely on data content standards that dictate the way that information is entered. NC ECHO metadata guidelines provide instructions on these data content standards wherever possible.

The best practice is to select terms from controlled vocabularies, thesauri, and subject heading lists to use as subject elements, rather than just using keywords.
Employing terminology from controlled vocabularies ensures consistency and can improve the quality of search results. It can also reduce the likelihood of spelling errors when inputting metadata records. In recognition of the diverse nature of the statewide initiatives and the involvement of a broad range of cultural heritage institutions, controlled vocabularies have been expanded to include subject discipline taxonomies and thesauri. Several states are developing geographic-based lists of terms that may be helpful in achieving a level of consistency in terminology. Many of the thesauri, subject heading lists, and taxonomies are currently available via the web; online links are provided in the list below wherever possible.
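The consistency benefit described above can be shown with a small sketch that checks entered subject terms against a controlled list. The three-term list and the function names are invented for this example; a real project would draw its terms from an established thesaurus or subject heading list.

```python
# Hypothetical controlled list of preferred subject terms.
SUBJECT_TERMS = {"Textile industry", "Agriculture", "Railroads"}


def normalize(term):
    """Collapse extra spaces and ignore case so near-identical input matches."""
    return " ".join(term.split()).casefold()


# Lookup table from normalized form to the preferred spelling of each term.
PREFERRED = {normalize(term): term for term in SUBJECT_TERMS}


def check_subjects(entered_terms):
    """Split entered terms into accepted (preferred form) and rejected."""
    accepted = []
    rejected = []
    for term in entered_terms:
        preferred = PREFERRED.get(normalize(term))
        if preferred is None:
            rejected.append(term)
        else:
            accepted.append(preferred)
    return accepted, rejected


if __name__ == "__main__":
    ok, needs_review = check_subjects(["textile  industry", "Railraods"])
    print("accepted:", ok)
    print("needs review:", needs_review)
```

In this sketch a misspelled entry is set aside for review rather than stored, which is the practical reason a controlled list reduces spelling errors in finished records.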
Describing your digital project

While metadata is essential to facilitate the use of the materials within your digital project, you should also consider providing an overall description of the project itself. Project-level Dublin Core, primarily associated with the project's homepage, will greatly facilitate the location and inclusion of your digital project in consortial and aggregated online resources. The NC ECHO Dublin Core Implementation Guidelines provide an appendix (http://www.ncecho.org/ncdc/ncdublincore2007.htm) that outlines the application of Dublin Core to a digital project as a whole. Creating this metadata makes it easy to include your digital project in NC ECHO's Catalog of Online Collections and Exhibits.

Conclusion

North Carolina's cultural institutions could scan their entire holdings. They could post on the Internet a digital image of every item sitting on their shelves and in storage cases. They could fill computer server after server with good information, but if it takes a researcher six weeks of scrolling through screens to find what they want, all of that scanning will have been performed in vain. Metadata, information about information, helps researchers find what they are looking for. If institutions use standard systems of metadata and apply them in standardized ways, they provide their researchers with tools that will help them identify resources within their institutions and will lead to the ability to search across repositories.

Further Reading

Caplan, Priscilla. Metadata Fundamentals for All Librarians. Chicago: American Library Association, 2003.
Duval, Erik, et al. "Metadata Principles and Practicalities." D-Lib Magazine 8, no. 4 (April 2002).
Hodge, Gail. Metadata Made Simpler. Annapolis: NISO Press, 2001.
Hudgins, Jean, Grace Agnew, and Elizabeth Brown. Getting Mileage out of Metadata: Applications for the Library. Chicago: American Library Association, 1999.
Introduction to Metadata: Pathways to Digital Information. Murtha Baca, ed. California: Getty Information Institute, 1998. http://www.getty.edu/research/conducting_research/standards/intrometadata/
Smith, Terence R. "The Meta-Information Environment of Digital Libraries." D-Lib Magazine, July/August 1996. http://dlib.ukoln.ac.uk/dlib/july96/new/07smith.html
St. Pierre, Margaret, and William P. LaPlant, Jr. Issues in Crosswalking Content Metadata Standards. 1998. http://www.niso.org/press/whitepapers/crosswalk.html
Taylor, Arlene. The Organization of Information. Englewood, CO: Libraries Unlimited, Inc., 1999.
Weibel, Stuart. "Metadata: The Foundations of Resource Description." D-Lib Magazine, July 1995. http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/July95/07weibel.html
Zeng, Marcia Lei. "Metadata Elements for Object Description and Representation: A Case Report from a Historical Fashion Collection Project." Journal of the American Society for Information Science 50, no. 13 (1999): 1193-1208.
Selected Metadata Schemes

Categories for the Description of Works of Art (CDWA)
  http://www.getty.edu/research/conducting_research/standards/cdwa/
Dublin Core
  NC Dublin Core. http://www.ncecho.org/ncdc/index.htm
Encoded Archival Description
  NCEAD. http://www.ncecho.org/ncead/
MARC
  Furrie, Betty. Understanding MARC Bibliographic: Machine-Readable Cataloging. The Library of Congress, 2005. http://www.loc.gov/marc/umb
METS
  http://www.loc.gov/standards/mets/
  METS: An Overview and Tutorial. http://www.loc.gov/standards/mets/METSOverview.html
TEI: Text Encoding Initiative
  Teach Yourself TEI. http://www.tei-c.org/Tutorials/index.html
  Seaman, David. The Electronic Text Center Introduction to TEI and Guide to Document Preparation. http://etext.lib.virginia.edu/tei/uvatei.html
VRA Core
  http://www.vraweb.org/vracore3.htm