North Carolina Dublin Core Implementation Guidelines
The NC ECHO Dublin Core Implementation Guidelines seek to assist North Carolina institutions in creating practical and useful in-house rules for constructing Dublin Core metadata records. NC ECHO has adopted Dublin Core because it adequately describes resources found in the library, archival, museum, and other cultural heritage institutions that form the collective NC ECHO community. The standard is open and amenable to involving all of these communities, without excluding groups of users. These guidelines and best practices are based upon the Dublin Core Metadata Initiative's discourse on the Dublin Core metadata standard. These guidelines are designed to be helpful to institutions as they are creating Dublin Core applications; however, they are not meant to give an institution a direct interpretation of their digital materials. Rather, these guidelines provide an easy Dublin Core framework for institutions to apply to their specific uses. The guidelines are necessarily broad to allow application across a variety of types of institutions that will use them - libraries, archives, museums, historic sites, etc. Dublin Core represents the lowest common denominator for creating metadata to facilitate maximum accessibility of resources across a broad spectrum of institution types. Format-specific metadata standards exist that may be more appropriate for an institution's digital material. Institutions are encouraged to use other metadata systems as appropriate. Crosswalking (mapping from one metadata system to another) helps to provide multiple metadata expressions for digital objects. However, it is recommended that in envisioning a crosswalking system, Dublin Core be generated from the more specific standard rather than mapping the other way. This is a logical way to proceed as Dublin Core is more general than most other standards. 
These other standards either have established or have in-process interpretations for North Carolina, and the crosswalks are embedded in all standards produced.

Purpose and Scope

These best practices offer assistance in creating metadata records for digitized resources, including those that are born digital as well as those that are reformatted from existing physical resources (photographs, text, audio, video, three-dimensional artifacts, etc.). Creators of these metadata records may include catalogers, curators, archivists, librarians, web site developers, database administrators, volunteers, and other persons working in cultural heritage institutions. Application of these best practices in the creation of metadata records will result in standardized records that:
This document uses the Dublin Core element set as defined by the Dublin Core Metadata Initiative (DCMI), http://www.dublincore.org/. Because it addresses a diverse audience of cultural heritage institutions, including museums, libraries, historical societies, and archives, this document seeks to accommodate different backgrounds and metadata skill levels by explaining terms and concepts as needed and by providing examples describing diverse resources. Terminology can be confusing in guidelines of this kind, so terms have been defined where possible, and a supplementary glossary is included. In addition, a great deal of literature regarding the Dublin Core metadata standard has been generated, and a Resources section lists some of the most important documents for further information. While this listing is by no means comprehensive, it gives those interested a starting point in a rich area of research and discussion in the field of information science.

NCDC is a working group constituted at each new edition of the NCDC Implementation Guidelines that oversees the establishment of Dublin Core standards for the NC ECHO project. The working group consists of metadata specialists from throughout North Carolina's cultural heritage institutions and is facilitated by the NC ECHO Metadata Coordinator. It is the aim of the group to provide a broad interpretation of the Dublin Core standard and to devise best practices for implementation in North Carolina. This interpretation is intended to be flexible enough for the wide variety of uses that Dublin Core has while providing enough guidance to ensure that quality metadata is created by institutions using the standard. For more information about the working group structure, please see http://www.ncecho.org/ncdc/ncdcworkinggroup.htm.

Metadata is informally defined as "information about information" or any data associated with a resource that describes that particular resource.
A more general definition that is useful for us is "structured information about any information resource of any media type or format." In this context, an information object is anything that can be addressed and manipulated by a human or a system as a discrete entity. The essential aspect of the metadata system, then, is the structured format for that information. Metadata itself is essentially a modern term for the bibliographic information that libraries traditionally entered into their catalogs or registry information on collections that museums have entered into their systems; however, the term metadata is most commonly used to refer to descriptive information about World Wide Web resources. The creation of metadata for digital resources is an important part of a digitization project, and must be incorporated into a project's workflow. Metadata should be created and associated with the digital resource to support the discovery, use, management, reusability, and sustainability of the resources. Metadata is most often divided into three conceptual types (with some overlap among the three):
Recognizing that today's users are accessing digital resources from their home, work, school, etc., at any time of the day, and often without the assistance of a librarian, archivist, curator, museum educator, or other cultural heritage professional, metadata needs to provide information that:
WHAT IS DUBLIN CORE AND WHY USE IT?

The Dublin Core metadata standard is a set of elements used to describe a variety of networked resources. The semantics of these elements have been established through consensus by an international, cross-disciplinary group of professionals from the library, museum, publishing, computer science, and text encoding communities, as well as from other related fields of scholarship. The Dublin Core Metadata Initiative (DCMI) Element Set has been approved by the American National Standards Institute (ANSI) and assigned the number Z39.85. The Dublin Core metadata standard embodies the following characteristics:
NC ECHO has adopted Dublin Core because it adequately describes resources found in the library, archival, museum, and other cultural heritage institutions that form the collective NC ECHO community. The standard is open and amenable to involving all of these communities, without excluding groups of users. Other metadata standards, such as MARC, have historically been difficult for non-library communities, such as museums or historical societies, to adopt for their non-library collections. While more robust metadata standards exist and are encouraged by NC ECHO, Dublin Core provides a minimum standard that is internationally accepted. It provides a framework for metadata expression and defines the minimum amount of information that should be included. Dublin Core is relatively simple to learn and easy to use for those institutions that might not have a professional cataloger on staff, and its elements cover the most essential information about a resource. These implementation guidelines focus on defining Dublin Core fields (content) and provide examples using HTML <meta> elements (syntax). The Dublin Core standard is independent of syntax, though, and these implementation guidelines can be used to construct data in a variety of different systems.

Overview of Dublin Core Elements and Metadata Types
Interoperability: "Shareable" Metadata

Traditionally, as cultural heritage institutions automated collections information, each sector developed unique practices, procedures, and semantics for describing its objects. Interoperability is a set of hardware, software, policies, and procedures that allows for the exchange and re-use of information across a collaborative network. This network aims to encompass the entire state through the broad cultural heritage framework using a variety of technical initiatives. In order to share data effectively, institutions need to be aware of the impact of their semantic choices (particularly when describing similar concepts, such as "author" or "creator"). In addition, accurate syntactic information enables automated operations to work effectively. By adopting a common set of best practices, controlled vocabularies, and input tools, and by participating in interoperable networks, institutions can increase their visibility and provide opportunities to create new connections with other cultural heritage institutions. These efforts better serve the needs of constituent communities and have the potential to create new user communities.

Controlled Vocabularies

For many fields in the NCDC implementation, best practice is to select terms from controlled vocabularies, thesauri, and subject heading lists for completion of the subject elements, rather than just using keywords. Employing terminology from controlled vocabularies ensures consistency and can improve the quality of search results, while reducing the likelihood of spelling errors when inputting metadata records. It also allows metadata from multiple institutions to be pulled together and to provide meaningful results. Recognizing the diverse nature of the statewide initiatives and the involvement of a broad range of cultural heritage institutions, controlled vocabularies have been expanded to include subject discipline taxonomies and thesauri.
Several states are developing geographic-based lists of terms that can be helpful in achieving a level of consistency in terminology. Many thesauri, subject heading lists, and taxonomies are currently available via the web, and online links are provided wherever possible. There are also several standards for expressing particular kinds of information, such as dates (these are referred to as data value standards). These data value standards are also used throughout these guidelines, and links are provided to promote better understanding of the different standards that contribute to quality metadata.

Crosswalks

Crosswalks involve the mapping of the elements of one metadata standard to the corresponding elements or fields of another metadata standard. A fully specified crosswalk contains a semantic mapping as well as a conversion specification. Crosswalks provide the ability to create and maintain a set of metadata and to map that metadata into any number of related content metadata standards. In order to build successful crosswalks and mapping schemes, it is important to maintain consistency across metadata standards. NC ECHO is striving to construct such consistent applications to assist in the crosswalk process using metadata best practice guidelines created for the variety of metadata standards being implemented throughout the state. In addition, NC ECHO seeks to promote the creation of metadata that will be consistent with national standards and interpretations.

General Input Guidelines

The best practice is to follow the general grammatical rules of the language involved when entering descriptive information about resources. In addition, it may be useful to consult the Anglo-American Cataloging Rules, 2nd Edition or Describing Archives: A Content Standard for more information and details on general rules and guidelines for data entry. Dublin Core metadata also involves syntax that makes it easier for the computer to understand the metadata information.
Separating content and input is important in understanding the relationship between metadata and traditional modes of description. These guidelines include information about content and about an input structure for Dublin Core in HTML. Below are a few general input guidelines, followed by a discussion of the HTML input structure that may be used.

Punctuation

Avoid complicated punctuation in describing your resource. Use consistent English punctuation rules. In transcribing information from the resource itself, follow the punctuation present in the resource.

Abbreviations

In general, the following abbreviations are allowed: common or accepted abbreviations (such as "St." for "Saint"); designations of function (such as "ed." for "Editor"); terms used with dates (such as "b." for "born" or "fl." for "flourished"); and distinguishing terms added to names of persons, if they are abbreviated on the item (such as "Mrs."). These are particularly important when part of a controlled vocabulary. We suggest, however, that abbreviations not be used if they would make the record unclear. In most instances, spell out words rather than using abbreviations. For example, use "circa" rather than "ca." This general rule provides greater interoperability of metadata and increased potential for understanding by users. Abbreviations assume a familiarity with the language that cannot be taken for granted among the worldwide audience of the World Wide Web.

Capitalization

In general, capitalize the first word (of a title, for example) and proper nouns (place, personal, and organization names) as capitalization is used in the English language. If a resource is in another language, follow the capitalization rules for the language of the resource (e.g., capitalizing all nouns in German). Capitalize content in the description element according to normal rules of English language writing. For all other elements, enter content in lower case except for acronyms, which should be entered in capital letters.
Initial Articles

Omit initial articles at the beginning of the title, such as: the, a, an, le, la, los, el, der, die, das, etc.

Keywords versus Subject Terms

Best practice recommends that subject terms be taken from a controlled vocabulary whenever possible for more accurate retrieval and collocation of resources. However, other non-controlled terms or keywords that identify the resource with some precision can be added to a record to enhance resource retrieval and discovery, especially in cases where such terms are too new to be included in controlled vocabularies. The Description field provides a free-text area in which to include keywords that will enhance retrieval.

Authorities

Personal names, corporate names, and geographic names should follow the controlled vocabulary of the Library of Congress Name Authority File (http://authorities.loc.gov/) or other controlled vocabularies for authorized forms of names. While not all names are available in these resources, the Anglo-American Cataloging Rules, 2nd Edition (AACR2) or Describing Archives: A Content Standard (DACS) provide guidelines to establish the authoritative form of names that may be associated with any resource. Personal names should be entered last name first, followed by a comma, then first name, then middle name or initial. If birth and death dates are known, enter them after the final element of the name. Separate these dates with a hyphen. For corporate names, enter the highest level of the hierarchy and any intermediate levels necessary to understand the role of the subdivision. Separate these levels with a period (full stop). Other levels in the hierarchy may be omitted if they are not essential for distinguishing the function of the group being described. Corporate body names can be very complex to establish.
Chapter 24 of AACR2 provides guidance on the varieties of corporate bodies that can be encountered and solutions for establishing authorized forms of names.

Mandatory Elements

The NC ECHO application guidelines specify 11 mandatory elements. These are considered essential in the description of any resource and are critical in supporting an interoperable environment. Some of these elements are mandatory only if applicable. For instance, language is required only if a language is represented. For a three-dimensional artifact, language may not be applicable. A chair does not have language information associated with it unless there is text of some sort on the chair. The same would be true for a photograph of an object or a non-textual image.
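The initial-article and authority rules in the general input guidelines above can be sketched in Python. This is only an illustration, not part of the NC ECHO guidelines: the function names are hypothetical, the article list is the partial list given above (which ends in "etc."), and placing life dates after the full name is an assumption based on the form commonly seen in Library of Congress name headings. The sample names are invented.

```python
# Initial articles named in the guidelines; the original list ends with "etc.",
# so this tuple is a partial, illustrative set.
INITIAL_ARTICLES = ("the", "a", "an", "le", "la", "los", "el", "der", "die", "das")


def strip_initial_article(title: str) -> str:
    """Drop one leading article and capitalize the new first word.

    The check is purely mechanical and language-blind, so an English title
    that begins with the word "Die" would also be trimmed; review the results.
    """
    words = title.strip().split(None, 1)
    if len(words) == 2 and words[0].lower() in INITIAL_ARTICLES:
        rest = words[1]
        return rest[0].upper() + rest[1:]
    return title.strip()


def personal_name(last: str, first: str, middle: str = "",
                  birth: str = "", death: str = "") -> str:
    """Build 'Last, First Middle, birth-death' (date placement is an assumption)."""
    name = f"{last}, {first}"
    if middle:
        name += f" {middle}"
    if birth or death:
        # Dates are separated with a hyphen, as the guidelines require.
        name += f", {birth}-{death}"
    return name


def corporate_name(*levels: str) -> str:
    """Join hierarchy levels, highest first, with a period (full stop)."""
    return ". ".join(level.strip() for level in levels)


if __name__ == "__main__":
    print(strip_initial_article("The history of Wake County"))
    print(personal_name("Doe", "Jane", "Q.", "1850", "1920"))
    print(corporate_name("Example University", "Library"))
```

Names established this way still need to be checked against the Library of Congress Name Authority File; the functions only apply the punctuation pattern.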
Guidelines for the application of each element appear below.

Qualifiers

The basic elements available in the Dublin Core metadata set are intended to cover most of the information needed to give an adequate description of the digital resource. However, there is often a need to be more specific about a resource than is possible using the basic element set. To help remedy this situation, the NC ECHO Dublin Core application is a "qualified" Dublin Core that consists of the elements and also their official qualifiers. These qualifiers are defined as refinements or schemes. Specific refinements and schemes are discussed within the individual element sections of the guidelines.

Refinements

Qualifiers defined as refinements serve to refine or specify the meaning of the content of an element.

Schemes

Qualifiers defined as schemes define rules for constructing a term, date, or other type of data in accordance with a controlled list of terms or a specific format for representing a type of data (e.g., dates). Values for schemes are represented through a coding system used in the description of the resource. The purpose of the scheme qualifier is to introduce a degree of consistency and standardization into the Dublin Core record and to communicate the standards or controlled vocabularies used in constructing the content of a given element.

These guidelines primarily focus on the implementation of Dublin Core in a World Wide Web environment, although Dublin Core may be effectively applied in other environments. In fact, these implementation guidelines can guide the creation of information in various collection management systems such as CONTENTdm, PastPerfect, Re:discovery, and so on. While the examples included in the guidelines are expressed in HTML <meta> elements, it should not be assumed that the guidelines apply only to HTML. Below are procedures for the syntactic expression of Dublin Core in the HTML 4.0 standard.
The correct syntactic expression is important for the harvesting of metadata information both locally and consortially, but the syntactic requirements of Dublin Core are not complex. Here is the way a Dublin Core element is written in HTML:

<meta name="DC.[Dublin Core element].[Dublin Core refinement]" scheme="[scheme code]" content="[textual content of the element]">

The <meta> tag is used for all Dublin Core elements. All of the information relating to each DC element goes within the <meta> tag.

name refers to the name of the element. This can include just an element's name or an element's name with a refinement. The element name and refinement are separated by a period (full stop). The DC preceding the element name indicates that a Dublin Core element is being used; prefacing all name values with DC is required. Examples of values: DC.Title

scheme refers to the scheme used in formulating the content of the element. Codes are used to identify the schemes. Examples of this are lcsh (Library of Congress Subject Headings) and lcnaf (Library of Congress Name Authority File). Scheme codes are listed in appropriate elements. If no controlled vocabulary or data value standard is in use, the scheme attribute should be left out entirely. It should not be entered with an empty value. Examples of values: iso8601

content refers to the textual content of the element. This may include alphanumeric expressions or uniform resource locators (URLs) to indicate where the metadata is located. Directions on the construction of content are included in the Input Guidelines for each element below.

Syntax requires an equal sign (=) between the attribute (name, scheme, or content) and the value. The value is always enclosed in quotation marks. All of the DC <meta> tags are located in the <head> section of the HTML document:

<html>

DUBLIN CORE METADATA ELEMENT SET

Each element is described in detail below, including its mandatory status and whether or not it can be repeated.
The Dublin Core standard as promoted by the Dublin Core Metadata Initiative (DCMI, http://www.dublincore.org/) does not specify required elements and labels every field as repeatable. In an effort to promote quality metadata creation, however, the NCDC working group considered how essential each of the DC elements is and whether the ability to repeat a given field is important. The result is that, in the NC ECHO Implementation of Dublin Core, certain fields have been identified as required and/or repeatable. Following the description of each element, available refinements and schemes (with corresponding codes) are listed. Finally, input guidelines are detailed for creating the content of the particular element. These guidelines are a work in progress, and input from a variety of institutions on how the guidelines are or are not meeting their individual needs is requested. If an institution is using a particular scheme, NC ECHO would like to present that scheme with these guidelines so that other institutions can consider it for use. Institutions willing to share their scheme implementations should forward the schemes and a description of how they are used to NC ECHO for inclusion in these guidelines. Please contact the NC ECHO Metadata Coordinator with suggested additions to the scheme lists. These input guidelines are meant to be helpful rather than confusing; therefore, if the application of a particular element is unclear at any point, please contact the NC ECHO Metadata Coordinator immediately. This will result not only in better metadata application in your institution, but also in more clearly written guidelines and improved metadata creation overall. In many cases, we have relied upon well-established content guidelines from the library cataloging community as represented by the Anglo-American Cataloging Rules, 2nd Edition (AACR2) or archival description standards as represented by Describing Archives: A Content Standard (DACS).
Museum metadata content (as represented by the related document NC ECHO Museum Core) is based upon the new content standard Cataloging Cultural Objects (CCO). A crosswalk to NCDC elements is included in those guidelines. If there are content guidelines currently in use in a particular institution, please convey that information to the NC ECHO Metadata Coordinator. As new content guidelines are formulated or received from institutions, they will be incorporated into these Dublin Core Implementation Guidelines. Complete record examples appear at the end of the element guidelines. These examples come from cultural institutions in North Carolina and are illustrative of the principles covered in the individual element structures. It should be noted that all examples include the required fields, but optional fields are only included where appropriate. An ancillary document has been created (http://www.ncecho.org/ncdc/COCEancillarystandard.htm) for the description of digital projects at the project level. Included are element-specific guidelines and an example of that description for a digital project.
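The HTML <meta> syntax described earlier (DC prefix, optional refinement joined with a period, scheme attribute omitted when no scheme applies, every value in quotation marks) can be sketched as a small Python helper. This is a minimal illustration under stated assumptions: the function name dc_meta and all sample content values are hypothetical, and only the scheme codes iso8601 and lcsh come from the text above.

```python
from html import escape


def dc_meta(element: str, content: str, refinement: str = "", scheme: str = "") -> str:
    """Render one Dublin Core element as an HTML <meta> tag."""
    # The name is always prefixed with "DC"; a refinement follows after a period.
    name = f"DC.{element}"
    if refinement:
        name += f".{refinement}"
    attrs = [f'name="{escape(name)}"']
    if scheme:
        # When no scheme applies, the attribute is left out, never written as scheme="".
        attrs.append(f'scheme="{escape(scheme)}"')
    # escape() converts quotation marks and angle brackets so the value stays
    # inside its attribute.
    attrs.append(f'content="{escape(content)}"')
    return "<meta " + " ".join(attrs) + ">"


if __name__ == "__main__":
    # Sample values are invented for illustration.
    print(dc_meta("Title", "Main Street, looking north"))
    print(dc_meta("Date", "1923-05-14", refinement="Created", scheme="iso8601"))
    print(dc_meta("Subject", "Tobacco industry", scheme="lcsh"))
```

The resulting tags would be placed inside the <head> section of the HTML page that presents the resource.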
Title

Description: Name or label given to the resource by the creator or publisher; may also be an identifying phrase or name of the resource supplied by the holding institution. DACS provides a useful distinction between formal and supplied titles that will help in filling out this information. In DACS a formal title is defined as a title "that appears prominently on or in the materials being described," whereas a supplied title is provided by the metadata creator when there is no formal title (DACS, 2.3, p. 17). In many cases, the metadata creator will be supplying the title using a brief identifying phrase or name of the resource.
Refinements: Title.Alternative
Schemes: none
Input Guidelines:
Creator

Description: Entity or entities primarily responsible for creating the intellectual content of the resource, including individuals, families, and corporate bodies. Examples include authors of written documents, artists, illustrators, photographers, collectors, organizations, etc. For archival collections, the creator is the entity that is responsible for the collection's existence. This can include authorship but also focuses on the reason that the collection is brought together.
Refinements: none
Schemes: Library of Congress Name Authority File (lcnaf)
Input Guidelines:
Subject

Description: The topic of the content of the resource reflecting what the resource is about or what it is. Subjects can be expressed by topical, personal, family, or corporate body terms for significant people, places, organizations, events, and topics reflected. For geographic or temporal topics, see the element Coverage.
Refinements: none
Schemes: REQUIRED. Use codes to indicate from which controlled vocabulary the term was derived.

Controlled Vocabularies List Including Code Scheme

* Be wary of using locally created vocabularies. The purpose of controlled vocabularies is to provide an environment where terms assigned by various metadata creators are represented in the same way.
** If you are using a locally created vocabulary, you'll need to create a thesaurus so that approved terms are uniformly applied.

Submitting New Vocabularies: The vocabularies listed here are those most likely to be in use in North Carolina. In order to ensure that this list remains updated, useful, and consistent across all cultural heritage institutions in North Carolina, please contact the NC ECHO Metadata Coordinator if a vocabulary you are using is not on this list so that it can be added and a code can be devised.

Input Guidelines:
Description

Description: A textual description of the content of the resource, such as an abstract, table of contents, or a free-text account of the resource. The description element allows for the inclusion of natural language descriptors (keywords) as well as narrative explanation of the content of the resource.
Refinements: none
Schemes: none
Input Guidelines:
Publisher

Description: Entity or entities that make the resource available. Publisher is the institution that published the digital resource and/or the institution that is hosting the digital resource.
Refinements: none
Schemes: Library of Congress Name Authority File (lcnaf)
Input Guidelines:
Contributor

Description: Person(s), family(ies), or organization(s) who made significant intellectual contributions to the resource, but whose contribution is secondary to the person(s), family(ies), or organization(s) specified in the Creator element(s). Examples include editor, transcriber, translator, illustrator, etc.
Refinements: none
Schemes: Library of Congress Name Authority File (lcnaf)
Input Guidelines:
Date

Description: Creation date(s) for the original resource.
Refinements: Date.Created
Schemes: ISO 8601 W3C Date Time Format: http://www.w3.org/TR/NOTE-datetime
Input Guidelines:
Date Format Examples
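As an illustration of the W3C Date Time Format named in the schemes above (this is a sketch, not the guidelines' own example table), the three date-only granularities of that note are YYYY, YYYY-MM, and YYYY-MM-DD. The checker below is a hypothetical helper; it deliberately ignores the time-of-day forms the W3C note also defines.

```python
import re
from datetime import date

# Date-only granularities of the W3C Date and Time Formats note:
# YYYY, YYYY-MM, and YYYY-MM-DD. Time-of-day forms are not handled here.
W3CDTF_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_w3cdtf_date(value: str) -> bool:
    """Return True if value is a valid year, year-month, or full calendar date."""
    match = W3CDTF_DATE.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing parts default to 1 so that real calendar limits are still
        # checked for the parts that are present (no month 13, no February 30).
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("1923", "1923-05", "1923-05-14", "05/14/1923", "1923-02-30"):
        print(candidate, is_w3cdtf_date(candidate))
```

A check like this catches common entry errors such as slashes or unpadded months before records are shared.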
Type

Description: A broad term drawn from a controlled vocabulary that describes the genre or nature of the resource.
Refinements: none
Schemes: DCMI Type Vocabulary
Input Guidelines:
DCMI Type Vocabulary List
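Because the Type element draws on a closed list, a value can be checked mechanically. The sketch below hard-codes the twelve terms of the DCMI Type Vocabulary as published by DCMI; the function name is hypothetical, and the tolerant matching (ignoring case and spaces) is an assumption about how a local input tool might behave, not a rule from these guidelines.

```python
# The twelve terms of the DCMI Type Vocabulary, in DCMI's own spelling.
DCMI_TYPES = frozenset({
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
})

# Lookup keyed on a lower-case form so that "still image" finds "StillImage".
_LOOKUP = {term.lower(): term for term in DCMI_TYPES}


def normalize_dcmi_type(value: str) -> str:
    """Return the DCMI spelling of a type term, or raise ValueError."""
    key = "".join(value.split()).lower()
    try:
        return _LOOKUP[key]
    except KeyError:
        raise ValueError(f"{value!r} is not a DCMI Type Vocabulary term") from None


if __name__ == "__main__":
    print(normalize_dcmi_type("still image"))
    print(normalize_dcmi_type("Text"))
```

More specific genre terms, such as "photograph", are rejected here because they are not part of this vocabulary.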
Format

Description: The physical or electronic format of the resource being described. Format may include the physical extent, dimensions, or media-type of the original resource, or the electronic media-type or extent of the digital resource, such as file format, file size, or playtime. This element can be used to identify the software and hardware needed to load and to use the digital resource.
Refinements: Format.Extent
Schemes: see Subject for Format.Medium (Original) schemes
Input Guidelines:
Examples from Internet Media Types
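For digital files, the Internet media type can often be derived from the file extension. The sketch below uses Python's standard mimetypes module; the function name and file names are hypothetical, and the lookup relies on the extension only, so it says nothing about the actual contents of a file.

```python
import mimetypes
from typing import Optional


def media_type_for(filename: str) -> Optional[str]:
    """Guess the Internet media type from a file name, or None if unknown."""
    # guess_type returns a (type, encoding) pair; only the type is needed here.
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type


if __name__ == "__main__":
    # File names are invented for illustration.
    for name in ("letter_001.jpg", "map.tif", "finding_aid.pdf"):
        print(name, media_type_for(name))
```

The returned string is the kind of value that could be recorded as the electronic media-type of the digital resource.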
Identifier

Description: An unambiguous character string or record number that clearly and uniquely identifies a digital resource. The Identifier element ensures that individual digital resources can be managed, stored, recalled, and used with reliability.
Refinements: none
Schemes: URI, ISBN, Public Identifier
Input Guidelines:
Source

Description: When applicable, use the Source element to cite any aggregated resource from which the digital resource was derived. For instance, a digital resource could represent a letter from an archival collection. The collection then becomes the source for this digital resource. Some digital resources are "born digital" and may derive from no pre-existing resource; in these cases, the Source element is not used. Note the relationship between the Source element and the Relation element. Source is a specific kind of relationship. Because the Source element shows a derivative relationship with another resource, do not repeat that information in a Relation element. See Relation for more detail on the role of that element.
Refinements: none
Schemes: none
Input Guidelines:
Language

Description: Indicates the language(s) of the intellectual content of the resource. This is the language(s) in which a text is written or the spoken language(s) of an audio or video resource. Visual images do not usually have a language unless there is significant text in a caption or in the image itself.
Refinements: none
Schemes: ISO 639-2 (http://www.loc.gov/standards/iso639-2/englangn.html)
Input Guidelines:
Example language codes
Relation

Description: Contains information necessary to find or to link to a related resource. Content of this element may include an identifier, such as the title, a URL, or a URI; the physical location of the related resource, if important; information about the nature of the relationship between the two resources; and so on. A resource may be related to other resources in a variety of ways that require more than one Relation element to describe. The same resource can be part of a larger resource while simultaneously containing a smaller resource within itself; it can be a different version of another resource; or it can contain the same intellectual content as another resource, but in a different format. Note that Source is a specific kind of relationship (see Source above for more information). Relation elements are less common in the description of digital surrogates than in other applications of Dublin Core. The options below are included in case they are relevant to the description being created. In particular, many of the relation elements refer to aspects of description for objects that are "born digital" rather than for the digital surrogates that typically form a digitization project.
Refinements: REQUIRED. Use one of the following refinements to explain the nature of the relationship between the described resource and the related resource described in the Relation element. Include the refinement in the label name, not the element text. The "described resource" is the resource for which you are creating Dublin Core. The "related resource" is what you are referring to in the Relation or Source elements.
Schemes: none
Input Guidelines:
Description: Describes the spatial or temporal characteristics of the intellectual content of the resource. Spatial refers to the location(s) covered by the intellectual content of the resource, not the place of publication. Temporal coverage refers to the time period or era covered by the intellectual content of the resource, not the publication date. For artifacts or art objects, the spatial characteristics usually refer to the place where the artifact/object originated while the temporal characteristics refer to the date or time period during which the artifact/object was made. Refinements: REQUIRED. There are two refinements for the Coverage element in order to distinguish either the spatial or temporal characteristics of the element: Coverage.Spatial: describes geographical/place information using controlled vocabularies or conventions, such as coordinates in a defined grid system. Coverage.Temporal: describes a date/time period according to accepted standards and controlled vocabularies. Schemes: Spatial Getty Thesaurus of Geographic Names (tgn) (http://www.getty.edu/research/tools/vocabulary/tgn/) Temporal DCMI Period (http://dublincore.org.documents/dcmi-period/). A subset of eras have been created from the Library of Congress that NC ECHO encourages you to use: Era terms for American History
*Note that the dates for these eras are flexible. They are provided to give you a general range for the era, and should not be considered definitive from a historical perspective. Aside from the date ranges for wars (which do begin and end on certain dates), eras are fluid occurrences and subject to a multitude of interpretations. We are attempting to tie cultural heritage materials, particular people, corporate bodies, and families to social, cultural, economic, and intellectual climates, and we are using the era labels to achieve that. These labels are not meant to convey any historical interpretation beyond the general understandings of American history.
Input Guidelines:
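As a rough sketch of how a Coverage.Temporal value might be written with the DCMI Period scheme named above, the Python below assembles the scheme's name, start, end, and scheme components into a single string and wraps it in a meta tag. The function names, the "DC." label prefix, and the sample era with its dates are illustrative assumptions; the era is not taken from the NC ECHO era list, and its dates are no more definitive than the note above allows.

```python
from html import escape


def dcmi_period(name: str = "", start: str = "", end: str = "", scheme: str = "") -> str:
    """Build a DCMI Period string from whichever components are supplied."""
    components = [("name", name), ("start", start), ("end", end), ("scheme", scheme)]
    # Each supplied component is written as "key=value;" and empty ones are skipped.
    return " ".join(f"{key}={value};" for key, value in components if value)


def coverage_temporal_meta(period: str) -> str:
    """Wrap a period string in a meta tag labeled with the Temporal refinement."""
    # Assumption: the "DC." prefix is one common labeling convention.
    return f'<meta name="DC.Coverage.Temporal" content="{escape(period, quote=True)}">'


if __name__ == "__main__":
    # Illustrative era only; dates give a general range, not a definitive one.
    period = dcmi_period(name="The Great Depression", start="1929", end="1939")
    print(period)
    print(coverage_temporal_meta(period))
```

Keeping the era name and the date range in one value lets a display use the label while a search system can still use the dates.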
Description: Contains a rights management or usage statement, a URL that links to a rights management statement, or a URL that links to a service providing information on rights management for the resource. A rights management statement may contain information concerning accessibility, reproduction of images, copyright holder, restrictions, securing permissions for use of text or images, etc.
Refinements: none
Schemes: none
Input Guidelines:
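Since the Rights element may hold either a statement or a URL, a small Python sketch can show both forms ending up in the same kind of tag. The helper names, the "DC.Rights" label, the example.org address, and the sample statement are all hypothetical and serve only to illustrate the two options described above.

```python
from html import escape
from urllib.parse import urlsplit


def looks_like_url(value: str) -> bool:
    """Report whether the value is an http(s) URL rather than a free-text statement."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def rights_meta(value: str) -> str:
    """Return a Rights meta tag for either a statement or a URL."""
    value = value.strip()
    if not value:
        raise ValueError("Rights needs a statement or a URL")
    # No refinement or scheme applies, so the label is the bare element name.
    return f'<meta name="DC.Rights" content="{escape(value, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for value in ("Copyright held by the Example Historical Society.",
                  "http://www.example.org/rights.html"):
        kind = "URL" if looks_like_url(value) else "statement"
        print(kind, rights_meta(value))
```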
Below is the compilation of examples used throughout the implementation guidelines.

Example 1: Photograph
<html>
Example 2: Letter
<html>
Example 3: Collection of Papers
<html>

AACR2: Anglo-American Cataloguing Rules, 2nd Edition. Content rules used in the creation of cataloging records.
AAT: Art and Architecture Thesaurus, a publication of the Getty Information Institute; a thesaurus of terms used to describe art and architecture.
abstract: information relating to the general contents, nature, and scope of the described materials. One way to write an abstract is to consider four elements: 1) specific types and forms of material present, noting the presence of graphic or other non-textual materials; 2) the dates within which the material bulks largest; 3) the functions or activities resulting in the creation of the records; 4) the most significant topics, events, persons, places, etc.
access point: a name, term, phrase, or code that is used to search, identify, or locate a file, document, record, or object.
acquisitions information: information about the acquisition of the collection or objects by the repository.
administrative information: information regarding the administration of the collection or object. May include acquisitions information, provenance, user restrictions, access restrictions, copyright ownership, citation information, and general processing information. Administrative information can refer to all or part of a collection.
administrative metadata: metadata primarily intended to facilitate the management of resources.
angled brackets: < >, an SGML/XML syntax convention used to set apart a tag.
APPM: Archives, Personal Papers, and Manuscripts: A Cataloging Manual for Archival Repositories, Historical Societies, and Manuscript Libraries by Steven Hensen. Archival cataloging rules published by SAA as a supplement to AACR2 and superseded by Describing Archives: A Content Standard (DACS).
attribute: a modifier for the meaning of an element; a named property of an element that may carry different values depending upon the context in which it occurs.
authority control: the process of verifying and authorizing the choice of unique access points, such as names, subjects, and forms, and assuring that the access points are consistently applied and maintained in an information retrieval system. See also controlled vocabulary.
authority file: a group of authority records searchable by all established headings and cross-references.
authority record: an entry that contains information about an access point. An authority record establishes the form of the heading and determines cross-references and relationships of the heading to other headings.
biographical/historical note: description of the life and activities of a person, family, or corporate body that generated the document described therein. A biographical/historical note is intended to provide contextual information for researchers.
boilerplate text: standardized text used for labels and other information supplied for all of an institution's digital files (e.g., copyright notice, citation format, etc.).
CCO: Cataloging Cultural Objects, a content standard for the description of cultural objects and their images sponsored by the Visual Resources Association.
CDWA: Categories for the Description of Works of Art, a metadata schema for describing works of art for the purpose of art historical scholarship.
close tag: the tag that closes an element, also called end tag.
controlled access: a list of index terms for a finding aid.
controlled vocabulary: formal limits on a vocabulary, useful for consistent use of vocabulary terms.
crosswalk: an authoritative mapping from the metadata elements of one scheme to the elements of another.
DACS: Describing Archives: A Content Standard, a content standard for archival description, including single- and multiple-level description. Maintained by SAA, the standard was first published in 2004 and supersedes APPM.
DCMI: Dublin Core Metadata Initiative, an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models; responsible for the maintenance of the Dublin Core metadata schema.
descriptive metadata: metadata primarily intended to promote discovery, identification, and selection of information resources.
dtd: Document Type Definition. Documentation of the XML markup language that lists constraints and instructions for the markup language. The dtd is used in the validation of XML.
Dublin Core: metadata schema created for the World Wide Web. Consists of 15 elements typically used in conjunction with HTML. Maintained by the DCMI.
EAD: Encoded Archival Description. An SGML/XML dtd for the construction of archival finding aids that reflect the hierarchical arrangement of archival materials. EAD provides a framework for information retrieval and display on the World Wide Web. Maintained by SAA with support from the Library of Congress.
element: an essential building block of metadata schemas that serves to identify and surround the content of sections of the metadata. Elements are constructed of an open tag (start tag) and a close tag (end tag). Elements may contain other elements, attributes and values, or PCDATA. They can also be empty.
encoding rules: the syntax or prescribed order of the elements contained in a metadata description.
end tag: See close tag.
HTML: Hypertext Markup Language, the most common markup language found on the web. Used for display manipulation only. An international standard for coding text to make it appear with formatting on web pages. HTML includes the structure of documents (title, headings, etc.) and the formatting (bold, fonts, and font size). For example, <i>Headline</i> would make the word Headline appear in italics.
instance: the text and tags (excluding the dtd and related files) of an individual SGML/XML-encoded document, such as a single EAD-encoded finding aid.
interoperability: the ability of multiple systems using different hardware and software platforms, data structures, and interfaces to exchange and share data.
ISAD(G): General International Standard Archival Description, a general framework for archival description developed by the International Council on Archives.
ISBN: International Standard Book Number, an identifier for nonserial print publications.
LCNAF: Library of Congress Name Authority File. A controlled vocabulary used for the names of persons, corporations, uniform titles, and series titles.
LCSH: Library of Congress Subject Headings. A controlled vocabulary used for creating subject terms and geographical terms.
MARC: Machine-Readable Cataloging. Data structure standard used in Integrated Library Systems (ILS) for Online Public Access Catalogs (OPACs).
metadata: structured information that describes, explains, locates, and otherwise makes it easier to retrieve and use an information resource.
metadata harvesting: a technique for extracting metadata from individual repositories and collecting it in a central catalog to facilitate search interoperability.
metadata schema: a set of metadata elements and rules for their use that has been defined for a particular purpose.
metalanguage: a language used to describe other languages. SGML and XML are examples of metalanguages.
METS: Metadata Encoding and Transmission Standard, a specification for structural metadata.
OAI: Open Archives Initiative, an organization that maintains a protocol for harvesting metadata from distributed repositories.
open tag: the tag that opens an element, also called start tag.
preservation metadata: metadata primarily intended to help manage the process of ensuring the long-term preservation and usability of information resources.
provenance: history of ownership of materials prior to acquisition by the current institution.
qualifier: in Dublin Core and other metadata schemas, a term that restricts the meaning of an element or identifies the encoding scheme used in representing the value of the element.
rights metadata: metadata primarily intended to enable the management of rights related to information resources; a type of administrative metadata.
SAA: Society of American Archivists.
schema: a formally defined metadata scheme. In XML, a way of defining a document type used as an alternative to a dtd.
semantics: the definitions of metadata elements, as opposed to the rules for encoding or representing the values of the elements (syntax).
SGML: Standard Generalized Markup Language. XML (eXtensible Markup Language) is a subset of SGML and has been widely implemented on the World Wide Web.
source code: the code (usually HTML) behind any web page viewed in a browser. To see the source code of a page in Internet Explorer, right click on the page and select View Source, or click on View on the tool bar and then select View Source. In Netscape, it is referred to as Page Source.
start tag: See open tag.
structural metadata: metadata that describes the internal organization of a resource and its place in an external organization.
surrogate: a secondary object meant to substitute for the original, such as a photograph of an artwork used in place of the artwork.
syntax: how a metadata schema is structured for exchange in machine-readable form, including the rules regarding that structure, as opposed to the definitions of metadata elements (semantics). Common syntaxes include MARC, SGML, and XML.
tag: another term for element; it refers to the syntactic structure of expressing elements. For all intents and purposes, these terms are used interchangeably, although tag refers to the actual representation of the element, while element refers to the intellectual content of the tag.
tag library: a document that lists the names of the SGML or XML elements and attributes alphabetically, along with their definitions and rules for their use.
technical metadata: metadata primarily intended to document the creation and characteristics of digital files.
thesaurus: an arrangement of a controlled vocabulary in which all allowable terms are given and relationships among terms are shown.
URL: Uniform Resource Locator, the "address" for a web site (also referred to as "URI"). Example: http://www.ncecho.org/Guide/index.htm. HTTP is the method of connection; www.ncecho.org is the name of the host computer or server, also known as the domain name; /Guide/ is the particular folder on that computer; index is the specific file; and .htm is the kind of file that index is.
vocabulary: the universe of values that can be used in a particular metadata element.
VRA Core: Visual Resources Association Core Categories, a metadata schema for representing visual resources.
W3C: World Wide Web Consortium, an international consortium that provides vision and standards for the World Wide Web.
XML: eXtensible Markup Language, a subset of SGML consisting of a set of rules that allow for the definition of tags that separate a document into individual parts and subparts. XML documents are plain text files that store structured information.

Caplan, Priscilla. Metadata Fundamentals for All Librarians. Chicago: American Library Association, 2003.
Dublin Core Metadata Initiative, http://www.dublincore.org/
Duval, Erik, et al. "Metadata Principles and Practicalities" in D-Lib Magazine. v.8(4), April 2002.
Hodge, Gail. Metadata Made Simpler. Annapolis: NISO Press, 2001.
Hudgins, Jean, Grace Agnew, and Elizabeth Brown. Getting Mileage out of Metadata: Applications for the Library. Chicago: American Library Association, 1999.
Introduction to Metadata: Pathways to Digital Information. Murtha Baca, ed. California: Getty Information Institute, 1998. Also available at: http://www.getty.edu/research/institute/standards/intrometadata/.
NC ECHO Guidelines to Digitization, Chapter 5: http://www.ncecho.org/Guide/metadata.htm