The Library of Congress >> Especially for Librarians and Archivists >> Standards

MARC Standards

HOME >> MARC Development >> Discussion Paper List


MARC DISCUSSION PAPER NO. 2016-DP18

DATE: May 27, 2016
REVISED:

NAME: Redefining Subfield $0 to Remove the Use of Parenthetical Prefix "(uri)" in the MARC 21 Authority, Bibliographic, and Holdings Formats

SOURCE: PCC Task Group on URI in MARC in consultation with the British Library

SUMMARY: This paper discusses modifying $0 (Authority record control number or standard number) in the Authority, Bibliographic, and Holdings formats so that dereferenceable HTTP URIs may be recorded without the parenthetical Standard Identifier source code prefix “(uri).”

KEYWORDS: Authority record control number or standard number (AD, BD, HD); Subfield $0 (AD, BD, HD); URI (AD, BD, HD)

RELATED: 2007‐06/1, 2010‐06, 2015‐07, 2016‐DP04, 2016‐DP05

STATUS/COMMENTS:
05/27/16 – Made available to the MARC community for discussion.

06/25/16 – Results of MARC Advisory Committee discussion: Concerns were raised that the omission of the parenthetical prefix "(uri)" from subfield $0 "Authority Record Control Number or Standard Number" would introduce a syntactical inconsistency into the MARC format. Many tags within MARC data exchanged in the German-speaking MARC community currently contain three $0 subfields: two with control numbers and one with an actionable URI. All of them contain parenthetical prefixes to define the source. However, the paper’s recommendation is limited to the removal of the parenthetical prefix when subfield $0 contains a URI. The paper was converted to a proposal and approved with the following amendment: The final sentence of the proposed redefinition of subfield $0 will be amended to make it more explicit by adding the phrase “and should not be included” as follows: “In the latter case, the parenthetical "(uri)" is redundant and should not be included if the identifier is given in the form of a web retrieval protocol, e.g., HTTP URI, which is self-identifying.”

08/10/16 - Results of MARC Steering Group review - Agreed with the MAC decision to convert to and approve as a proposal.


Discussion Paper No. 2016-DP18: Redefining Subfield $0 to Remove the Use of Parenthetical Prefix "(uri)"

1. BACKGROUND

The MARC Authority and Bibliographic formats make provision for recording identifiers associated with various entities in subfield $0 (Authority record control number or standard number). Currently, the identifier must be preceded by the standard identifier source code or MARC organization code in parentheses directly following the $0.

Subfield $0 may also be used to record dereferenceable HTTP URIs that identify entities described in a MARC 21 record. Following the established convention of supplying a source code in parentheses, the code “(uri)” was instituted for HTTP URIs recorded in subfield $0. Dereferenceable HTTP URIs embedded in MARC 21 records as identifiers have proven useful in disambiguating identities, maintaining authority data, and transforming MARC 21 data to various linked data formats. This transformation, however, requires the removal of the parenthetical (uri) in order to make the URI actionable in applications according to the corresponding protocols, which adds programming requirements to the transformation routine.
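The extra step described above can be illustrated with a minimal Python sketch. It is not part of any MARC tool; the function name strip_uri_prefix is hypothetical, and the sketch assumes the $0 value has already been extracted from the record as a plain string.

```python
URI_PREFIX = "(uri)"


def strip_uri_prefix(value: str) -> str:
    """Return a $0 value with a leading "(uri)" removed, if present.

    Values with any other parenthetical prefix are returned unchanged.
    """
    text = value.strip()
    if text.startswith(URI_PREFIX):
        # Drop the source code so that the remainder is an actionable URI.
        return text[len(URI_PREFIX):].lstrip()
    return text


if __name__ == "__main__":
    print(strip_uri_prefix("(uri)http://isni.org/isni/0000000121358464"))
    print(strip_uri_prefix("(isni)0000000121358464"))
```

Every transformation routine that consumes current-practice data needs some equivalent of this step before the URI can be dereferenced.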

The PCC Task Group on URI in MARC was charged by the Program for Cooperative Cataloging (PCC) to “develop a work plan for the implementation of identifiers in $0 and other fields/subfields” in legacy MARC 21 data. One of the Task Group’s goals is for library data to function and serve on Web‐based services that may be extensible via API, connecting users to resources in a distributed environment with no additional programming need. After internal testing, and in consultation with a number of ILS vendors, programmers, systems engineers, and discovery designers, the Task Group has come to consider the parenthetical (uri) for actionable URIs to be both redundant, in that a URI is self‐referential, and an impediment to the use of library data in a distributed environment. The Task Group therefore suggests that linked data applications would be more capable of exploiting URIs embedded in MARC records if the requirement for the parenthetical (uri) in subfield $0 were removed.

Note: In this paper, the term URI also covers its internationalized form, the Internationalized Resource Identifier (IRI).

2. DISCUSSION

Current Definition of $0

In the Bibliographic, Authority, and Holdings formats, subfield $0 is defined as follows (from Appendix A of the Bibliographic format, “Control Subfields”: http://www.loc.gov/marc/bibliographic/ecbdcntf.html):

$0 - Authority record control number or standard number
Subfield $0 contains the system control number of the related authority record, or a standard identifier such as an International Standard Name Identifier (ISNI). The control number or identifier is preceded by the appropriate MARC Organization code (for a related authority record) or the Standard Identifier source code (for a standard identifier scheme), enclosed in parentheses. See MARC Code List for Organizations for a listing of organization codes and Standard Identifier Source Codes for code systems for standard identifiers. Subfield $0 is repeatable for different control numbers or identifiers.

NOTE: Subfield $0 is sometimes referred to in the field descriptions as Authority record control number.

100 1#$aBach, Johann Sebastian.$4aut$0(DE-101c)310008891

100 1#$aTrollope, Anthony,$d1815-1882.$0(isni)0000000121358464

MARC subfield $0 “Authority record control number” was first defined in 1997, was extended widely across the MARC formats in 2007, and was redefined as “Authority record control number or standard number” in 2010. Because the value “uri” for a “Uniform Resource Identifier” is on the list of “Standard Identifier Source Codes”, subfield $0 is capable of carrying a URI. In 2015, $0 was added to fields 336 (Content Type), 337 (Media Type), and 338 (Carrier Type), the first case in which only URIs and not authority record control numbers were available to populate this subfield.

The syntactical instruction that “the control number or identifier is preceded by the appropriate MARC Organization code (for a related authority record) or the Standard Identifier source code (for a standard identifier scheme), enclosed in parentheses” ensures that the identifier given in the subfield is unambiguous by specifying the applicable organization or type of standard identifier. As MARC covers a variety of numbers and identifiers in $0, this specificity is essential. An algorithm that can read MARC should be able to interpret the parenthetical qualifier, decide whether it is a MARC Organization code or a Standard Identifier source code, and use the identifier accordingly. Without the context of the parenthetical prefix, the control or standard number in $0 is meaningless. A URI, by contrast, is already unambiguous without the parenthetical prefix. It should therefore be possible to parse $0 and read the value as a URI when the parenthetical prefix is absent.
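The parsing logic described above might look like the following Python sketch. It is an illustration under stated assumptions, not a reference implementation: the names Identifier and parse_subfield_0 are hypothetical, the set of Standard Identifier source codes is limited to the two codes mentioned in this paper, and any other parenthetical code is assumed to be a MARC Organization code rather than being checked against the MARC Code List for Organizations.

```python
import re
from typing import NamedTuple, Optional

# Illustrative subset only; a real parser would load the full
# "Standard Identifier Source Codes" list.
STANDARD_IDENTIFIER_CODES = {"isni", "uri"}

PREFIXED = re.compile(r"^\(([^)]+)\)(.+)$")
WEB_URI = re.compile(r"^https?://", re.IGNORECASE)


class Identifier(NamedTuple):
    kind: str               # "uri", "standard", or "organization"
    source: Optional[str]   # code found in parentheses, or None
    value: str              # the identifier without its prefix


def parse_subfield_0(raw: str) -> Identifier:
    """Classify a $0 value by its parenthetical prefix, if it has one."""
    text = raw.strip()
    if WEB_URI.match(text):
        # Self-identifying HTTP(S) URI: no prefix is needed.
        return Identifier("uri", None, text)
    match = PREFIXED.match(text)
    if match is None:
        raise ValueError(f"$0 value has no source prefix and is not a URI: {raw!r}")
    source, value = match.group(1), match.group(2)
    if source.lower() == "uri":
        return Identifier("uri", source, value)
    if source.lower() in STANDARD_IDENTIFIER_CODES:
        return Identifier("standard", source, value)
    # Assumption: every remaining code is a MARC Organization code.
    return Identifier("organization", source, value)


if __name__ == "__main__":
    for raw in ("(DE-588)11850553X",
                "(isni)0000000121358464",
                "http://d-nb.info/gnd/11850553X"):
        print(parse_subfield_0(raw))
```

A bare number with neither prefix nor URI scheme raises an error in this sketch, which mirrors the point that such a value is meaningless without its context.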

Based on the results of tests by the PCC Task Group on URI in MARC and on similar efforts, dropping the parenthetical prefix in the case of a dereferenceable URI, such as an HTTP URI, is clearly advantageous in a linked data environment. Parsing MARC data for URIs is significantly easier when a subfield $0 contains a URI and nothing else. In addition, the $u subfields specifically defined for carrying a “Uniform Resource Identifier” in many fields of the MARC formats already follow the syntax of containing a URI and nothing else.

3. PROPOSED CHANGES

Redefine subfield $0 as follows:

$0 - Authority record control number or standard number
Subfield $0 contains the system control number of the related authority record, or a standard identifier such as an International Standard Name Identifier (ISNI). The control number or identifier is preceded by the appropriate MARC Organization code (for a related authority record) or the Standard Identifier source code (for a standard identifier scheme), enclosed in parentheses. In the latter case, the parenthetical (uri) is redundant if the identifier is given in the form of a Web retrieval protocol, e.g., HTTP URI, which is self-identifying.

See MARC Code List for Organizations for a listing of organization codes and Standard Identifier Source Codes for code systems for standard identifiers. Subfield $0 is repeatable for different control numbers or identifiers.

Subfield $0 is to be used for recording URIs which represent objects in RDF triple statements.
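As a rough illustration of a $0 URI serving as the object of an RDF triple, the Python sketch below serializes one statement in N-Triples syntax. The subject URI (http://example.org/bib/123) is hypothetical, the Dublin Core predicate is only one possible choice and is not prescribed by this paper, and the helper name n_triple is an assumption.

```python
def n_triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement whose three terms are all URIs as N-Triples."""
    return f"<{subject}> <{predicate}> <{obj}> ."


if __name__ == "__main__":
    # Hypothetical URI for the bibliographic resource being described.
    record_uri = "http://example.org/bib/123"
    # Illustrative predicate; a real conversion would choose its own vocabulary.
    predicate = "http://purl.org/dc/terms/publisher"
    # URI taken as-is from $0, with no "(uri)" prefix to strip.
    object_uri = "http://id.loc.gov/authorities/names/n85319780"
    print(n_triple(record_uri, predicate, object_uri))
```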

Fields affected: Numbers and Codes (033, 034, 043), Heading Fields (X00, X10, X11, X30), Physical Description, etc. (336, 337, 338, 340, 344, 345, 346, 347, 348, 370, 380, 381, 382, 385, 386, 388), Date/Time and Place of an Event Note (518), Subject access (648, 650, 651, 654, 655, 656, 657, 662), Added entry (751, 752, 754), Series Added entry (80X), and Machine‐generated Metadata Provenance (883) in the Bibliographic Format.

Control Fields (034, 043), Headings General Information (336, 348, 368, 385, 386, 388), See also Tracings (500, 510, 511, 530, 550, 551, 562), Notes (672, 673), Heading Linking Entries (700, 710, 711, 730, 748, 750, 751, 755, 762, 780, 781, 782, 785), Machine‐generated Metadata Provenance (883) in the Authority Format.

4. EXAMPLES

The following examples show various fields in which $0 is currently defined and where an HTTP URI can be recorded, either on its own or in addition to a control number, to facilitate data transformation and support linked data applications.

Example 1. (Bibliographic Format) Modified from current practice

A URI version of an existing control number is appended in a second $0.

a) Entry includes control number from Deutsche Nationalbibliothek. URI version of the number added in separate $0.

100 1# $aBach, Johann Sebastian$d1685-1750$4aut$0(DE-588)11850553X$0http://d-nb.info/gnd/11850553X$0(DE-101)11850553X

Same example that abides by current guidelines, including (uri) prefix:

100 1# $aBach, Johann Sebastian$d1685-1750$4aut$0(DE-588)11850553X$0(uri)http://d-nb.info/gnd/11850553X$0(DE-101)11850553X

b) Entry includes control number from ISNI. URI version of the number added in separate $0.

100 1# $aTrollope, Anthony,$d1815‐1882.$0(isni)0000000121358464$0http://isni.org/isni/0000000121358464

Same example that abides by current guidelines, including (uri) prefix:

100 1# $aTrollope, Anthony,$d1815‐1882.$0(isni)0000000121358464$0(uri)http://isni.org/isni/0000000121358464
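To show how the two forms of Example 1 differ for a consuming application, the following Python sketch splits a field written in the display notation used in this paper and collects the $0 values that are directly usable as URIs. It assumes that "$" never occurs inside subfield data (real MARC records use a dedicated delimiter character instead), that hyphens are plain ASCII, and the function names are hypothetical.

```python
import re
from typing import List, Tuple

# "$" followed by a one-character subfield code, then data up to the next "$".
SUBFIELD = re.compile(r"\$([a-z0-9])([^$]*)")


def split_subfields(field: str) -> List[Tuple[str, str]]:
    """Split a display-form field into (code, data) pairs."""
    return [(code, data.strip()) for code, data in SUBFIELD.findall(field)]


def actionable_uris(field: str) -> List[str]:
    """Return $0 values that begin with an HTTP(S) scheme and nothing else."""
    return [data for code, data in split_subfields(field)
            if code == "0" and data.lower().startswith(("http://", "https://"))]


if __name__ == "__main__":
    proposed = ("100 1# $aBach, Johann Sebastian$d1685-1750$4aut"
                "$0(DE-588)11850553X$0http://d-nb.info/gnd/11850553X"
                "$0(DE-101)11850553X")
    current = proposed.replace("$0http://", "$0(uri)http://")
    print(actionable_uris(proposed))
    print(actionable_uris(current))
```

With the proposed form the URI is found directly; with the current form the same check finds nothing until the "(uri)" prefix is removed.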

Example 2. (Bibliographic Format)

A URI is added to an existing text string through a vocabulary service lookup, e.g. LC Linked Data Service, Getty Vocabularies, etc.

a) This example shows a URI added from the LC Name Authority File from the LC Linked Data Service.

710 2# $aCalifornia Poets in the Schools (Project),$eissuing body,$epublisher.$0http://id.loc.gov/authorities/names/n85319780

Same example that abides by current guidelines, including (uri) prefix:

710 2# $aCalifornia Poets in the Schools (Project),$eissuing body,$epublisher.$0(uri)http://id.loc.gov/authorities/names/n85319780

b) This example looks up the geographic area code recorded in field 043 in the LC Linked Data Service and adds the URI for that code.

043 ## $aa-cc---$0http://id.loc.gov/vocabulary/geographicAreas/a-cc

Same example that abides by current guidelines, including (uri) prefix:

043 ## $aa-cc---$0(uri)http://id.loc.gov/vocabulary/geographicAreas/a-cc

c) This example shows a URI for an Index Term (Genre/Form), architectural drawings (visual works) from the Getty Vocabularies.

655  #7 $aArchitectural drawings (visual works)$2aat$0http://vocab.getty.edu/aat/300034787

Same example that abides by current guidelines, including (uri) prefix:

655  #7 $aArchitectural drawings (visual works)$2aat$0(uri)http://vocab.getty.edu/aat/300034787

Example 3. (Authority format)

a) This example shows the URI for the genre/form term Diaries, from the LC Genre/Form Term list in the LC Linked Data Service.

380 ## $aDiaries$2lcgft$0http://id.loc.gov/authorities/genreForms/gf2014026085

Same example that abides by current guidelines, including (uri) prefix:

380 ## $aDiaries$2lcgft$0(uri)http://id.loc.gov/authorities/genreForms/gf2014026085

b) This example shows a URI for audience characteristics from the LC Demographic Group Term list from the LC Linked Data Service.

385 ## $nage$aPreteens$2lcdgt$0http://id.loc.gov/authorities/demographicTerms/dg2015060394

Same example that abides by current guidelines, including (uri) prefix:

385 ## $nage$aPreteens$2lcdgt$0(uri)http://id.loc.gov/authorities/demographicTerms/dg2015060394

5. BIBFRAME

A $0 containing an actionable URI will be highly desirable for the transformation of MARC data to BIBFRAME.

6. QUESTIONS FOR DISCUSSION

6.1. Is it permissible to allow data to be recorded in $0 without a parenthetical prefix?

6.2. What impact does the removal of “(uri)” have on existing MARC implementations that libraries provide?


( 09/02/2016 )