NAME: Coding Digital Items in Leader/06 (Type of Record) in the USMARC Bibliographic Format
SOURCE: Library of Congress
SUMMARY: This paper explores issues concerning coding the Leader/06 when the item being cataloged is in digital form. It reviews the use of the code in Leader/06 in library systems and its impact since the completion of format integration. A revised definition for code "m" (Computer file) is proposed to allow for more flexibility in the creation and coding of records for digital items. Issues concerning digital reproductions are raised.
KEYWORDS: Leader/06 (Bibliographic); Type of record; Digital reproductions; Computer files
RELATED: DP92 (January 1996); 95-9 (June 1995)
STATUS/COMMENTS:
5/6/96 - Forwarded to USMARC Advisory Group for discussion at the July 1996 MARBI meetings.
7/6/96 - Results of USMARC Advisory Group discussion - Early in the discussion it was decided to focus on digital materials, not on other combinations of materials for which the distinction between content vs. carrier is relevant. The consensus was that it was desirable to change the definition of code m so that one does not have to code everything digital that way, although clear guidelines are needed. It was requested that a proposal be written for the Midwinter MARBI meetings with a redefinition of code "m" with the coding of Leader/06 for digital items dependent upon the content of the item, rather than how it is represented. Two options should be included: 1) a narrow definition that includes only executables; and 2) a broader definition that includes executables, data sets, and raw data that is not numeric. It was also recommended that the definition of code o (Kit) be clarified and an order of precedence established for deciding how to code.
DISCUSSION PAPER NO. 97: Coding Digital Items in Leader/06 (Type of Record)

1. BACKGROUND

With the completion of the last phase of Format Integration in early 1996, MARC bibliographic records may contain coding for more than one set of characteristics in the new field 006 (Fixed-Length Data Elements--Additional Material Characteristics). Leader/06 (Type of record) contains a code that is used to determine what type of 008 (Fixed-Length Data Elements) is included in the record; the 008 character positions vary in 008/18-34 depending upon the type of material as coded in the Leader. Field 006 includes applicable codes that would otherwise be coded in 008/18-34, so that additional information may be given for other aspects of the item. A choice must be made as to which form of material the field 008 should be coded for. In terms of description, the decision as to which form is primary and which is secondary does not have much impact, since all characteristics can also be given in the record. However, the Leader/06 code is used for many purposes, particularly for retrieval of records. Format Integration opens up the opportunity to supply more information about the item than in the past, but it also brings up many questions about how to apply this new flexibility.

In our current environment, distinctions between types of material have become blurred. With the advent of the personal computer and the growth of the Internet, it becomes questionable whether categorizing all digital material as a computer file is useful for retrieval and manipulation of bibliographic records. If all digital material were coded as a computer file, the record for a computerized version of, for example, an original photograph would be coded differently from the record for the original (if separate records are created). This may cause problems for retrieval, particularly in systems that separate records by form of material. Also, because of economic considerations, many users are adding information about the digital item to the MARC record for the original, rather than creating a separate record.

The coding in Leader/06 was discussed on two previous occasions at meetings of the USMARC Advisory Group.

Proposal No. 95-9 (Encoding of Digital Maps in the USMARC Bibliographic Format) was considered by the USMARC Advisory Group in June 1995. It proposed renaming code "e" in Leader/06 from "Printed map" to "Cartographic material" so that all maps, whether digital or print, could be coded the same (there is also a code for "manuscript map"). Because of the increasing number of digital map images becoming available (resulting partly from digital library projects and the Content Standards for Geospatial Metadata), this change was considered necessary for the map community. In many cases the bibliographic record for the paper copy will contain information about the location of the digital image. That proposal brought up the issue of coding for content rather than for physical carrier. Although the portion of the proposal concerning Leader/06 was approved, it was suggested that a broader discussion paper be presented.

Discussion Paper No. 92 was presented to the USMARC Advisory Group in January 1996. It explored changing the definition of code "m" in Leader/06 so that it is used only for executable software. Questions were raised concerning the use of the Leader/06 after format integration with the availability of 006. Participants listed the many uses made of the code in Leader/06 by library systems. There was general agreement that, in cases where the content of the electronic material is clear, identifying the primary record type in the Leader/06 by its content rather than its carrier better serves users. These cases include electronic text, music CDs, digital maps, digital photographs, etc.
Participants felt that defining code "m" as only executable software was too restrictive for several reasons: the growing number of hybrids that include pictures, graphics, text, and software; files that do not fit into a category, e.g. survey data; and the fact that defining "m" only as executable software would not allow input of an 006 for computer file characteristics for electronic text, since its secondary characteristic is not an executable. The group thought it likely that each constituency would need to issue guidelines. Another discussion paper was requested for the USMARC Advisory Group meeting in July.

2. LEADER/06

Systems use the Leader/06 for various purposes:

- to separate databases based on form of material
- to determine workform displays for keying the 008
- to sort records
- to validate correct use of fixed fields
- to support boolean searching
- to match records for duplicate detection
- to display labels that identify fields
- to select subsets of records for distributed products
- to generate icons that show format when searching multiple databases

Because of the many uses of this coded element, the decision as to which characteristic to consider primary has great impact on retrieval and manipulation of the record. The 006 can give the additional descriptive information for the secondary form of material, but most systems are not currently using it for retrieval in the same way that the 008 is used.

The current definition of Leader/06 code "m" (Computer file) in the USMARC Format for Bibliographic Data is:

m - Computer file

Code m indicates that the content of the record is for a body of information encoded in a manner which allows it to be processed by a computer. The information in the computer file may be numeric or textual data, computer software, or a combination of these types. Although a file may be stored on a variety of media (such as magnetic tape or disk, punched cards, or optical character recognition font documents), the file itself is independent of the medium on which it is stored.

This definition implies that any digitized item needs to be identified in the Leader/06 as a Computer file. It also implies that a separate record would need to be made for a digitized reproduction, since the record for the original would be coded according to the original carrier/content. For systems that rigidly separate records by Leader/06 values this could be a problem, since the user may not be able to bring together the record for the original with the record for the reproduction. Opening up this definition so that the institution is not mandated to categorize all digitized items as computer files allows for more flexibility.

3. TYPES OF DIGITIZED ITEMS

For purposes of record creation, digitized items may fall into several different categories:

- Item exists only in digitized form. This would apply to computer software, databases, Internet resources, CD-ROMs with software, images, text, etc., or a journal issued only electronically.

- Item issued simultaneously in more than one form, one of which is digital. This would mainly apply to textual items that might be published in printed form and in digital form at the same time.

- Digitized reproductions. This would apply to items that were digitized from the original and are intended to substitute for it. The content of the item is essentially the same, although there may be some differences because of the technology of digitizing.

- A portion of an item is digitized. An example might be an item that is in print form, but a portion of it, such as its table of contents, has been made available digitally.

- A digitized collection that exists only digitally. Specific items might be digitized and put together as a collection with some unifying elements. In this case the items do not exist together as a collection in their original form, but only in the digital form.

With the definition of Field 856 (Electronic Location and Access), a bibliographic record can be created with a link to the electronic location of the item. Subfield $3 (Materials specified) allows for electronic location information to be given for a subset of the item in the record. Electronic location and access information was assigned to a field in the holdings block because it was considered equivalent to the location information in field 852 (Location), but for an electronic location. Thus, as with field 852, it could be embedded in the bibliographic record to give copy specific information or theoretically communicated in a separate holdings record. So far it appears that most uses of field 856 have been in the bibliographic record to point to the electronic location.

Technological advances have resulted in numerous projects to digitize existing material. Librarians want to provide descriptive information and access to these materials through catalog records. In special collections, particularly in visual materials, institutions have often chosen to include field 856 in the record for the original to give information about the digitized item. In these cases, additional descriptive information about the item in its digitized form has not been needed, but only information about the location of and access to it. This technique has been attractive because of the retrieval problems that arise when the digitized item is cataloged separately as a computer file, and because of the economic considerations of creating a separate record. It allows the focus of the record to be on the content of the item, rather than its carrier.

4. CODING FOR CONTENT

Coding for the content of the item for digitized materials would be consistent with the method used for handling microforms.
The USMARC bibliographic format in Leader/06 says the following: "Microforms, whether original or reproductions, are not identified by a distinctive Type of record code. The type of material characteristics described by the codes take precedence over the microform characteristics of the item." There is not a separate Leader/06 code for microform, although there is one for computer file.

Handling digitized items in Leader/06 as language, cartographic, music, etc., material would keep records for digital reproductions from being separated from the originals, and would allow flexibility in record creation (i.e., using one record and adding a field 856 with location information for the digital reproduction, or creating a separate record). The statement above about the treatment of microforms could be revised to include digitized materials. Additional information about the computer file aspects can be given in an 006 for computer file fixed field data and in 007 for physical description. A general material designator (GMD) would be given in 245$h to indicate that the physical format is computer file; it is not necessary for the 008 to agree with the GMD, so the 008 for the content of the item would be given.

5. REVISED DEFINITION OF CODE "M"

In order to allow records for digital items to be coded for their content, or intent, rather than as a computer file, the following definition might be considered:

m - Computer file

Code m indicates that the content of the record is for a body of information encoded in a manner which allows it to be processed by a computer. Code m is used in Leader/06 when the computer file characteristics are the primary aspect of the item. The information in the computer file may be numeric data, computer software, a combination of these types, or a mixture of various types of computer files, none of which predominate. Although a file may be stored on a variety of media, the file itself is independent of the medium on which it is stored. In case of doubt, consider a computer file.

This definition does not mandate whether or not to use a separate record for the digitized item, but leaves it up to the cataloger.

After format integration, since all variable fields are available for all types of material, the Leader/06 code no longer determines field validity. Once the electronic aspects are moved out of the Leader/06, the format issues can be divorced from the cataloging rules. Since field 006 can give characteristics of a second form of material, the choice of code in Leader/06 is not dependent upon the choice of AACR2 chapter used. The cataloger still needs to choose which chapter of AACR2 is appropriate for description, which will determine, among other things, which fields are needed in the record. The cataloger is no longer constrained by the format, since any USMARC defined fields will be valid.

6. QUESTIONS

Consideration needs to be given to the following questions:

1. Of the types of digitized items in section 3, how should each be encoded in Leader/06?

2. For digital reproductions (three categories: digitized reproductions, a portion is digitized, a collection of digitized items that only exists digitally), how should each be represented in the bibliographic record? In separate records, in the record for the original, or should it be a local decision? If in the record for the original, what fields should be added in addition to field 856 (e.g. 530, 533)? Should an 006 for computer files be required if the record for the original is used?

3. How important is it that the choice of Leader/06 be mandated, or could users emphasize the aspect they want to, depending upon individual needs? Should there be options in determining the code in Leader/06?

4. Is it practical to consider the solution of using a cluster of holdings fields for information about the digital reproduction? This would include field 856 (a holdings field), possibly field 007 and/or field 843 (the same as 533 in bibliographic), and any other appropriate holdings fields (e.g. for serials, 853-868 for holdings data).

5. Does the revised definition give users the flexibility to decide whether or not to consider the digitized item under code "m"? Is it clear how to code?

6. In terms of selecting the Leader/06 code, is further clarification needed for nontextual materials having two or more attributes when one is not computer file (e.g. music for "America, the Beautiful" on a wall chart)?

7. Do the GMD and the existence of a computer file 006 suffice to determine the actual physical form of the item? For instance, if the Leader/06 were coded "e" for cartographic material and there is a field 006 for computer file, could one determine whether this record represents a paper map with accompanying computer disk or a computer file that displays maps? Note that the GMD is not a required data element. Does an additional data element need to be defined (perhaps in the Leader) for carrier? If an element for carrier is defined, might it be used instead of field 006 if there is no information in 006 that needs to be conveyed? Note that the only element in the computer file 008 (and hence in 006) that might be useful is 008/26 (Type of computer file), which may be redundant with the Leader/06 if the record is coded for content. There has also been some discussion about no longer using GMDs, which may make a new code for carrier necessary.

8. Should we define a new value in the Leader/06 for entities that contain multiple modes of expression where no particular one predominates? If so, how would we distinguish among kits, mixed material, and this new value, and how would its 008 be defined?
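
The ambiguity raised in question 7 can be illustrated with a short sketch. It is not part of any USMARC specification: the helper names (make_leader, summarize_forms) and the sample values are hypothetical, and the 006 data is a placeholder with only position 00 filled in. The sketch assumes only that Leader/06 is character position 06 of the 24-character leader and that 006/00 carries the form-of-material code of each 006 field.

```python
# Illustrative sketch only. The helper names and the sample records are
# hypothetical; they are not taken from the USMARC documentation.


def make_leader(type_of_record):
    """Build a placeholder 24-character leader with the given Leader/06 code."""
    leader = "00000n" + type_of_record + "m  2200000 a 4500"
    if len(leader) != 24:
        raise ValueError("a MARC leader is 24 characters long")
    return leader


def summarize_forms(leader, fields_006):
    """Return (primary, secondary) form-of-material codes for one record.

    leader     -- the 24-character leader; index 6 is Leader/06
    fields_006 -- the data of each 006 field; index 0 is 006/00
    """
    if len(leader) != 24:
        raise ValueError("a MARC leader is 24 characters long")
    primary = leader[6]
    secondary = [data[0] for data in fields_006 if data]
    return primary, secondary


# Placeholder computer file 006: position 00 is "m", the remaining
# 17 positions are left blank for brevity.
COMPUTER_FILE_006 = "m" + " " * 17

# Case A (hypothetical): a paper map with an accompanying computer disk.
paper_map_with_disk = summarize_forms(make_leader("e"), [COMPUTER_FILE_006])

# Case B (hypothetical): a computer file that displays maps, coded for content.
digital_map = summarize_forms(make_leader("e"), [COMPUTER_FILE_006])

if __name__ == "__main__":
    print(paper_map_with_disk)
    print(digital_map)
```

Under these assumptions both cases reduce to the same pair of codes, so a system that reads only Leader/06 and 006/00 cannot tell them apart. Some further element, such as the GMD, field 007, or a new code for carrier, would have to supply the distinction.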