Z39.50 International Standard Maintenance Agency - Library of Congress, Network Development and MARC Standards Office

Z39.50 Text

Part 1: Title, Abstract, Foreword, Section 1



Information Retrieval (Z39.50-1995): Application Service Definition and Protocol Specification

Abstract: This standard specifies a client/server based protocol for Information Retrieval. It specifies procedures and structures for a client to search a database provided by a server, retrieve database records identified by a search, scan a term list, and sort a result set. Access control, resource control, extended services, and a "help" facility are also supported. The protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and the end-user.

Foreword
(This foreword is not a formal part of American National Standard ANSI/NISO Z39.50-1995, but is included for information only.)

ANSI Z39.50-1995, Information Retrieval (Z39.50) Application Service Definition and Protocol Specification, is a revision of ANSI Z39.50-1992. Draft versions of this standard were referred to as Z39.50-1994. This was changed to Z39.50-1995 as part of the approval and publication process. There is no approved 1994 version of Z39.50. Z39.50-1995 is the final, approved version of the standard for which the various drafts were referred to as Z39.50-1994. Implementors should take note that any earlier draft, referred to as Z39.50-1994, is not the latest version of this standard.

The 1992 version was a revision of Z39.50-1988, which was prepared by a NISO (National Information Standards Organization) committee that was disbanded after Z39.50-1988 was approved. In its place the Z39.50 Maintenance Agency was established in 1989, administered at the Library of Congress.

The protocol was originally proposed (in 1984) for use with bibliographic information. As interest in Z39.50 broadened, the Z39.50 Implementors Group (ZIG) was established, in 1990. Members include manufacturers, vendors, consultants, information providers, and universities, who wish to access or provide access to various types of information, including bibliographic, text, image, financial, public utility, chemical, and news. ZIG membership is open to all interested parties.

Various enhancements were proposed by implementors for the 1992 version, to support a wide range of information retrieval activities. But those features were not yet fully developed, and their incorporation into the 1992 standard would have caused significant delay. The Z39.50 Maintenance Agency had been assigned, as top priority, to revise Z39.50-1988 to achieve bit-compatibility with the international standard, ISO 10162/10163, Search and Retrieve, SR. (Z39.50-1992 replaced and superseded Z39.50-1988, and is a compatible superset of SR.) The proposed new features were therefore deferred, with a commitment to implementors that development of the required features would proceed, and that the resultant subsequent version would be a compatible superset of the 1992 standard.

In 1992 the maintenance agency conducted a formal survey among Z39.50 implementors to determine the relative importance of proposed new features. The survey's purposes were to begin to narrow the list to a manageable set, to determine whether the proposed features were adequately specified and understood, and to gauge their perceived cost and complexity. The survey results revealed that certain features were indispensable, and that certain other features could be eliminated from further consideration. For a third set of features, the survey was inconclusive and the disposition of those features was eventually determined by consensus.

Development of Z39.50-1995 began in late 1991. For each meeting of the ZIG, from December 1991, through April 1994, a revised draft was developed by the Z39.50 Maintenance Agency. Each draft underwent careful scrutiny by implementors, and was discussed at length both over the ZIG Internet mail list, and at the ZIG meeting. Comments and discussion for each draft, and agreements reached at each ZIG meeting, were incorporated into the subsequent draft. In April 1994, the ZIG recommended that the draft be finalized.

The 1992 version came to be known as "version 2", and the 1995 version, "version 3". However, although these version designations do have specific protocol significance, they do not refer to versions of the standard. Z39.50-1992 specifies protocol version 2; Z39.50-1995 specifies protocol versions 2 and 3.

Although Z39.50-1992 replaced and superseded Z39.50-1988 (and Z39.50-1988 is obsolete), the relationship between Z39.50-1992 and Z39.50-1995 is quite different: Z39.50-1995 is a compatible superset of the 1992 version. An implementor may obtain complete details of version 2 from the Z39.50-1995 document, and build an implementation compatible with Z39.50-1992.

Z39.50-1995 represents a consensus of the ZIG, which has in effect acted in an advisory role to the maintenance agency, in the effort to develop both Z39.50-1992 and Z39.50-1995.

Basics of the Protocol
The protocol specifies formats and procedures governing the exchange of messages between a client and server enabling the client to request that the server search a database and identify records which meet specified criteria, and to retrieve some or all of the identified records.

The client may initiate requests on behalf of a user; the protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and user.

Z39.50-1992 provides the following basic capabilities, all of which are supported in Z39.50-1995 as well. The client may send a search, indicating one or more databases, and including a query as well as parameters which determine whether records identified by the search should be returned as part of the response. The server responds with a count of records identified and possibly some or all of the records. The client may then retrieve selected records. The client assumes that records selected by the search form a "result set" (an ordered set, order determined by the server), and records may be referenced by position within the set. Optional capabilities include:

Query Formulation
This standard fully specifies and mandates support of the type-1 query, expressed by individual search terms, each with a set of attributes, specifying, for example, type of term (subject, name, etc.), whether it is truncated, and its structure. The server is responsible for mapping attributes to the logical design of the database. Terms may be combined in a type-1 query, linked by boolean operators. Terms and operators are expressed in Reverse Polish Notation.
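
As a rough illustration of the Reverse Polish Notation ordering described above, the following Python sketch evaluates a small query against an in-memory record set. It is not the ASN.1 structure defined by the standard: the Term and Operator classes, the "use" attribute label, the toy RECORDS table, and the operator spellings are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    value: str
    attributes: tuple  # pairs such as (("use", "title"),); labels are hypothetical

@dataclass(frozen=True)
class Operator:
    name: str  # "and", "or", or "and-not" in this sketch

# Toy database (assumption): record id -> field -> set of indexed words.
RECORDS = {
    1: {"title": {"information", "retrieval"}, "author": {"smith"}},
    2: {"title": {"library", "networks"}, "author": {"smith"}},
    3: {"title": {"information", "networks"}, "author": {"jones"}},
}

def match_term(term):
    """Map the term's attributes onto the toy database and return matching ids."""
    field = dict(term.attributes)["use"]
    return {rid for rid, rec in RECORDS.items() if term.value in rec.get(field, set())}

def evaluate_rpn(query):
    """Evaluate operands and operators given in Reverse Polish order."""
    stack = []
    for item in query:
        if isinstance(item, Term):
            stack.append(match_term(item))
            continue
        right = stack.pop()
        left = stack.pop()
        if item.name == "and":
            stack.append(left & right)
        elif item.name == "or":
            stack.append(left | right)
        elif item.name == "and-not":
            stack.append(left - right)
        else:
            raise ValueError(f"unknown operator: {item.name}")
    if len(stack) != 1:
        raise ValueError("malformed query")
    return sorted(stack[0])

# title = "information" AND author = "smith", written operand, operand, operator.
query = [
    Term("information", (("use", "title"),)),
    Term("smith", (("use", "author"),)),
    Operator("and"),
]
result = evaluate_rpn(query)
```

In this sketch the mapping from attributes to database fields happens inside match_term, which mirrors the division of responsibility in the text: the client names attributes, and the server decides how they apply to its own database design.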

Attribute Sets
The attributes associated with a search term belong to a particular attribute set, whose definition is registered, that is, assigned a unique and globally recognized attribute-set-id, an Object Identifier, which is included within the query.

Appendix ATR defines and registers the attribute-set bib-1, which specifies various attributes useful for bibliographic queries. Additional attribute sets may be registered outside of the standard. The bib-1 attribute set was developed by the bibliographic community; it is intended that attribute sets will be developed and registered as needed by other communities.

Response Records
The protocol distinguishes two types of records that may occur in response messages from the server: database and diagnostic records.

Appendix REC registers object identifiers for various MARC formats, including USMARC, UKMARC, Norway MARC and CANMARC; these object identifiers accompany database records returned by the server. There are several other types of record formats defined, and there is a provision for registration of additional record formats.

Diagnostic records are similarly accompanied by an object identifier which identifies their format. Appendix ERR defines and registers two diagnostic record formats (one of which was defined in Z39.50-1992), which include various diagnostic codes useful for bibliographic applications. Additional diagnostic record formats may be registered.

New Features
Provided below is a summary of the enhancements in Z39.50-1995. The designations "version 2" and "version 3" refer to protocol version; "Z39.50-1992" and "Z39.50-1995" refer to the respective standards. Thus where a particular feature is described as "new in Z39.50-1995", that generally means it applies in either protocol version. An example is Scan: an implementor may add the Scan service to an existing implementation of Z39.50-1992 without incorporating any other new features.

The enhancements described below fall into four categories: search, retrieval, new services and facilities, and miscellaneous enhancements.

Search
Attributes. There are a number of enhancements pertaining to attributes and attribute sets. In version 3, attributes may be combined from different attribute sets, within a single query (even for a single search term). This presents two advantages: First, it is useful when searching multiple databases. (Although version 2 supports multiple-database searches, all attributes within a query must belong to a single attribute set, which inhibits the ability to search multiple databases, unless those databases are similar.) Second, new attribute sets may now be defined with less replication.
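
The difference between the two versions can be sketched as a simple check on the attribute sets a query draws from. The following Python fragment is illustrative only: the Attribute class, the identifiers "set-A" and "set-B", and the numeric type and value codes are placeholders invented for this example, not registered values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attribute:
    set_id: str     # identifier of the attribute set (placeholder strings here)
    attr_type: int  # arbitrary example code
    value: int      # arbitrary example code

def attribute_sets_used(terms):
    """Collect every attribute set referenced by any term of the query."""
    return {attr.set_id for attrs in terms for attr in attrs}

def allowed_in_version(terms, version):
    """Version 2: one attribute set per query. Version 3: sets may be mixed."""
    if version >= 3:
        return True
    return len(attribute_sets_used(terms)) <= 1

# Each term is represented only by its tuple of attributes.
same_set = [
    (Attribute("set-A", 1, 4),),
    (Attribute("set-A", 1, 1003), Attribute("set-A", 5, 1)),
]
mixed = [
    (Attribute("set-A", 1, 4), Attribute("set-B", 1, 7)),
]
```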

Version 3 provides two further enhancements allowing flexibility in the definition of attribute sets. First, new data types for attribute values are defined (in version 2 only numeric values are allowed). Second, an attribute set definition may now list alternative sets of evaluation rules (for example, whether the server is allowed to substitute an attribute that it thinks is more appropriate), and the query may select one of the alternatives. The enhanced bib-1 attribute set definition exploits this new feature.

The bib-1 definition in Z39.50-1995 also includes many new attributes (as well as all of the attributes in Z39.50-1992).

Extended Result Set Model. The basic model of a result set is developed in Z39.50-1992; the 1995 version describes an "extended result set model", which supports extended proximity searching.

The extended model also supports a new version 3 search function, restriction, which is (in effect) an operation on a result set. It permits selection of records from a result set, based on specified attributes.

Search Term. The search term for a query may take on a variety of data types in version 3. (In version 2 a search term is binary and thus essentially has no data type, so the type is often described by a structure attribute.) This enhancement will simplify queries (as well as attribute set definitions) by reducing the need for structure attributes.

Intermediate Results. In Z39.50-1995 the server may provide information per query component (i.e. per sub-query, per database), as part of the Search response (version 3 only), or as part of resource-control when the server reports on the progress of the search. The server may also create and provide access to a result set for individual query components.

Retrieval
Segmentation. In version 2, a retrieval response is limited to a single message; the server attempts to fit the requested records into the message, and if it cannot, it simply fits as many as it can. The client might want to retrieve, for example, ten thousand records, knowing it cannot retrieve them in a single message. Typically the client will request all ten thousand records, wait for the response, determine how many records were retrieved, and then send another request for the remaining records. This works well in many environments but is unacceptably slow for high-speed networks. The server must await a request before sending each set of records, which introduces a delay; the delay may be negligible for conventional networks, but is intolerable for high-speed networks. In version 3 a server may respond to a retrieval request with multiple consecutive response messages without intervening requests.
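
The contrast between the two behaviours can be sketched in Python as follows. This is a simplified model under stated assumptions: records are byte strings, the message limit is a plain byte count, and the function names are hypothetical. It does not reproduce the protocol messages themselves.

```python
def fit_records(records, start, max_message_size):
    """Return as many consecutive records from `start` as fit in one message."""
    batch, used = [], 0
    for rec in records[start:]:
        if used + len(rec) > max_message_size:
            break
        batch.append(rec)
        used += len(rec)
    return batch

def retrieve_v2_style(records, max_message_size):
    """Client loop: every response message costs one request round trip."""
    got, requests = [], 0
    while len(got) < len(records):
        batch = fit_records(records, len(got), max_message_size)
        requests += 1
        if not batch:  # the next record does not fit in any single message
            break
        got.extend(batch)
    return got, requests

def serve_v3_style(records, max_message_size):
    """Server side: emit consecutive segments in answer to a single request."""
    pos = 0
    while pos < len(records):
        batch = fit_records(records, pos, max_message_size)
        if not batch:
            break
        yield batch
        pos += len(batch)

records = [b"x" * 40 for _ in range(10)]
got, requests = retrieve_v2_style(records, max_message_size=100)
segments = list(serve_v3_style(records, max_message_size=100))
```

With ten 40-byte records and a 100-byte limit, the version 2 style loop needs five requests, while the version 3 style generator delivers the same five segments in answer to one request.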

A more serious segmentation problem occurs when a single record is too large to fit in a single message. Version 3 thus introduces a second level of segmentation: an individual record may span response messages. A client or server may choose to support either level of segmentation, or no segmentation (in which case version 2 rules apply).
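
The second level of segmentation can be pictured as splitting one oversized record into labelled fragments that the receiver joins back together. The Python sketch below is an illustration under assumptions: the labels "whole", "starting", "intermediate" and "final", and both function names, are chosen for this example rather than taken from the protocol definition.

```python
def fragment_record(record, max_fragment_size):
    """Split one record into labelled fragments no larger than the limit."""
    if max_fragment_size <= 0:
        raise ValueError("max_fragment_size must be positive")
    chunks = [record[i:i + max_fragment_size]
              for i in range(0, len(record), max_fragment_size)]
    if len(chunks) <= 1:
        return [("whole", record)]
    labelled = []
    for i, chunk in enumerate(chunks):
        if i == 0:
            kind = "starting"
        elif i == len(chunks) - 1:
            kind = "final"
        else:
            kind = "intermediate"
        labelled.append((kind, chunk))
    return labelled

def reassemble(fragments):
    """Join fragments back into a record, checking the label sequence."""
    kinds = [kind for kind, _ in fragments]
    if kinds == ["whole"]:
        return fragments[0][1]
    well_formed = (
        len(kinds) >= 2
        and kinds[0] == "starting"
        and kinds[-1] == "final"
        and all(k == "intermediate" for k in kinds[1:-1])
    )
    if not well_formed:
        raise ValueError("fragments are missing or out of order")
    return b"".join(chunk for _, chunk in fragments)

record = bytes(range(250))
fragments = fragment_record(record, 100)
```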

Retrieval Tools. The ZIG has worked intensively over two years to develop an extensive model and suite of tools for a wide range of retrieval functions to support various retrieval applications, in particular, document retrieval. The model is detailed in Appendix RET. Several new object classes are designated in Z39.50-1995 (schemas, tagSets, variants) and specific objects from these and other classes are defined. Appendix RET provides detailed semantics for these objects and describes how they are used together to provide a variety of document retrieval capabilities. Following are a few examples:

New Services and Facilities
Scan and Sort. Scan and Sort are new services in Z39.50-1995. These are used respectively to scan terms in a list or index, and to sort a result set.

Scan is currently the only service in the Z39.50 Browse facility, but it is intended that various other browse capabilities will be added in future versions.

Extended Services. Extended Services is a new facility in Z39.50-1995. It includes a new Z39.50 service, the Extended Services service, used to initiate a specific extended service task, which is executed outside of the Z39.50 session and whose progress may be monitored using Z39.50 services. Specific extended services include: save a result set, set a periodic query schedule, export a document, order a document, and update a database.

Explain. The new Explain facility allows a client to retrieve details of the server implementation: general features (description, contact information, hours of operation, restrictions, usage cost, etc.), databases available for searching, indexes, attribute sets, attribute details, schemas, record syntaxes, sort capabilities, and extended services. The server maintains Explain information in a special database that may be accessed by the client using the Z39.50 search and retrieval facilities. The format of the Explain information is detailed in the standard.

Some Explain information is transparent to the client, intended for direct display to the client-user, and is so designated (e.g. "general features"). Some Explain information is intended to be shared by client and user. For example, the client may retrieve a list of searchable databases; for each database in the list the client might display an informal name, an icon, and a brief description. Meanwhile the client would retain the actual database name to be used in a protocol message, which probably would not be displayed. Some Explain information may be completely transparent to the user. For example, the client may retrieve information about attributes supported for a database and use that information when formulating a query (when converting a user-supplied query to a Z39.50 type-1 query).
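
The database-list case might look like the following on the client side. This Python sketch assumes the Explain information has already been retrieved and decoded into dictionaries; the field names "name", "title" and "description" and the sample entries are hypothetical and do not reflect the Explain record format defined in the standard.

```python
# Hypothetical, already-decoded Explain entries for two databases.
explain_databases = [
    {"name": "BOOKS01", "title": "Book catalogue",
     "description": "Monographs held by the library"},
    {"name": "SER02", "title": "Serials",
     "description": "Journals and newspapers"},
]

def build_menu(entries):
    """Text shown to the user: informal title and description only."""
    return [f"{e['title']}: {e['description']}" for e in entries]

def protocol_name(entries, choice_index):
    """Actual database name the client places in a protocol message."""
    return entries[choice_index]["name"]

menu = build_menu(explain_databases)
selected = protocol_name(explain_databases, 1)
```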

Miscellaneous Enhancements
Termination and Re-initialization. Version 3 includes a more flexible approach to termination of a Z39.50 session, to allow, in effect, re-initialization without taking down the network connection.

Concurrent Operations. Multiple concurrent operations are allowed in version 3. In version 2, operations are strictly serial.

Diagnostics. Most Z39.50 services include diagnostic capability. In version 2 a diagnostic must conform to a specific format defined within the standard. In version 3, diagnostic formats may be externally defined and registered. One such (new) format is defined, along with a comprehensive set of diagnostics.

Access Control Formats. Z39.50-1992 provides access control, but does not define any access control formats. Z39.50-1995 defines formats for encryption and authentication, and a format allowing the server to prompt the client for arbitrary information.

Character Set Support. A new data type, "International String", has been introduced for character strings. Its definition allows greater flexibility for a client and server to agree to the use of a particular language and one or more character sets during a session.

Units. New data types are introduced for support of units. These definitions allow standard representations to be used to represent unit type and unit. For example, unit type might be "mass", and unit, "kilogram".

Extensibility and Negotiation. Version 3 provides a powerful extensibility feature. Each protocol message includes a field designated for information whose format is to be defined externally. These externally defined formats will be registered and maintained by the Z39.50 Maintenance Agency, as provisional extensions to the standard, for experimental use and possible consolidation into a subsequent version.

In Z39.50-1995 the concept of a "negotiation record" is introduced. The client may include a negotiation record within the initialization message to propose that some condition be in effect for the session (for example, the use of a particular language and one or more character sets). The server may respond, indicating whether the proposal is accepted, or indicate a counter-proposal.

The negotiation record is an application of the new extensibility feature. Negotiation records will be defined externally and maintained by the Z39.50 Maintenance Agency.
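
The proposal and counter-proposal exchange described above can be sketched as follows. Because negotiation records are defined externally, everything in this Python fragment is an assumption for illustration: the dictionary layout, the function name, the language codes, and the placeholder character set labels.

```python
def respond_to_proposal(proposal, supported):
    """Server side: accept the proposal as is, or answer with a counter-proposal."""
    language_ok = proposal["language"] in supported["languages"]
    common = [c for c in proposal["charsets"] if c in supported["charsets"]]
    if language_ok and len(common) == len(proposal["charsets"]):
        return {"accepted": True, "in_effect": proposal}
    counter = {
        # Keep what the server can honour; fall back to its first preference.
        "language": proposal["language"] if language_ok else supported["languages"][0],
        "charsets": common or supported["charsets"][:1],
    }
    return {"accepted": False, "counter_proposal": counter}

supported = {
    "languages": ["eng", "fre"],
    "charsets": ["charset-A", "charset-B"],
}
accepted = respond_to_proposal(
    {"language": "eng", "charsets": ["charset-A"]}, supported)
countered = respond_to_proposal(
    {"language": "ger", "charsets": ["charset-A", "charset-C"]}, supported)
```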

1. Introduction
This standard, ANSI/NISO Z39.50-1995, Information Retrieval (Z39.50) Application Service Definition and Protocol Specification, is one of a set of standards produced to facilitate the interconnection of computer systems. It is positioned with respect to other related standards by the Open Systems Interconnection (OSI) basic reference model (ISO 7498). This standard defines a protocol within the application layer of the reference model, and is concerned in particular with the search and retrieval of information in databases.

1.1 Scope and Field of Application
This standard describes the Information Retrieval Application Service (section 3) and specifies the Information Retrieval Application Protocol (section 4). The service definition describes services that support capabilities within an application; the services are in turn supported by the Z39.50 protocol. The description neither specifies nor constrains the implementation within a computer system. The protocol specification includes the definition of the protocol control information, the rules for exchanging this information, and the conformance requirements to be met by implementation of this protocol.

This standard is intended for systems supporting information retrieval services, for organizations such as information services, universities, libraries, and union catalogue centers. It addresses connection-oriented, program-to-program communication. It does not address the interchange of information with terminals or via other physical media.

1.2 Version
This standard, Z39.50-1995, specifies versions 2 and 3 for the Z39.50 service and protocol. Note the following:

  1. ANSI Z39.50-1992 specifies version 2 only.
  2. For compatibility with version 1 of the Search and Retrieve Protocol (ISO 10163-1991), version 2 of Z39.50 is assumed identical to version 1 of Z39.50; thus implementations that support version 2 automatically support version 1. (Version 1 of ANSI Z39.50-1992 should not be confused with ANSI Z39.50-1988.)

Certain procedures specified within the standard apply specifically to version 2 or version 3 and are noted as such.

[Note: For minimum requirements beyond version 2, for a Z39.50 implementation to claim conformance to version 3, see Z39.50 Version 3 Baseline Requirements.]

1.3 Referenced Standards
ANSI/NISO Z39.53-1994 -- Codes for the Representation of Languages for Information Interchange.
ANSI/NISO Z39.58-1992 -- Common Command Language for Online Interactive Information Retrieval.
ISO 2709 -- Documentation - Format for Bibliographic Information Interchange on Magnetic Tape, 1981.
ISO 4217 -- Codes for the representation of currencies and funds, 1990.
ISO 7498 -- Information Processing Systems - Open Systems Interconnection - Basic Reference Model, 1984.
ISO 8649 -- Information Processing Systems - Open Systems Interconnection - Service Definition for the Association Control Service Element, 1987.
ISO 8650 -- Information Processing Systems - Open Systems Interconnection - Protocol Specification for the Association Control Service Element, 1987.
ISO 8777 -- Information and Documentation - Commands for Interactive Text Searching.
ISO 8822 -- Information Processing Systems - Open Systems Interconnection - Connection Oriented Presentation Service Definition, 1988.
ISO 8824 -- Information Processing Systems - Open Systems Interconnection - Specification of Abstract Syntax Notation One (ASN.1), 1990.
ISO 8825 -- Information Processing Systems - Open Systems Interconnection - Specification of Basic Encoding Rules for Abstract Syntax Notation One (ASN.1), 1990.
ISO 10160 -- Information and Documentation - Interlibrary Loan Application Service Definition for Open Systems Interconnection, 1991.
ISO 10161 -- Information and Documentation - Interlibrary Loan Application Protocol Specification for Open Systems Interconnection, 1991.
ISO 10163 -- Information and Documentation - Search and Retrieve Application Protocol Specification for Open Systems Interconnection, 1991.
ISO -- International Register of Coded Character Sets To Be Used with Escape Sequences, 1992.
