Z39.50 Attribute Architecture -- Open Issues

Z39.50 Attribute Architecture

List of Open Issues

October 15, 1998

The Z39.50 Attribute Architecture will be finalized following discussion at the October ZIG meeting. The meeting discussion will be confined to the following open issues that have been identified.

Distinction Between Abstract and Field Name Attribute Types
Should the architecture continue to maintain a formal distinction between abstract and field-oriented access points, by defining these as different attribute types? Or should these two types be merged?
Nesting of Abstract Access Points
(This question becomes moot if we merge Abstract and Field Name types.) Should searching be allowed, for example, on "address within author", where 'address' and 'author' are defined as Abstract access points? Or should the architecture continue to prohibit nesting of abstract access points, in which case it would be suggested, for this example, that either address and author be modeled as Field Name access points, or authorAddress be defined as a separate attribute?
Normalized value of Weight
The architecture defines the Normalized Weight Attribute to be: "the weight of an operand in a weighted boolean query, based on a normalization value (e.g. 1000)" and further mandates that "An attribute set definition that includes this type should specify a normalization value." Should the architecture continue to mandate that an attribute set definition including Normalized Weight specify a normalization value?
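To illustrate why the normalization value matters, here is a minimal sketch that interprets a Normalized Weight value as a fraction of a normalization value. The function name and the fraction interpretation are assumptions made for this example; the architecture text quoted above does not prescribe how a server uses the value.

```python
from fractions import Fraction

def relative_weight(weight: int, normalization: int) -> Fraction:
    """Interpret a weight as a fraction of the normalization value (assumed reading)."""
    if normalization <= 0:
        raise ValueError("normalization value must be positive")
    return Fraction(weight, normalization)

# The same attribute value means different things under different
# normalization values, so the value alone is ambiguous.
print(relative_weight(500, 1000))  # 1/2
print(relative_weight(500, 100))   # 5
```

Under this reading, a weight of 500 cannot be compared across attribute sets unless each set states its normalization value.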
Stopwords
The issue (apparently) is that there are many potential functional requirements related to the specification of how a server is to parse a language string to derive search terms. Why single out stopwords? As an example, if you can tell a server not to drop stopwords, why can't you tell the server not to drop punctuation (so that a search on 'c++' is not reduced to a search on 'c')?
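The 'c++' example can be made concrete with a small sketch of a term parser that has two independent switches, one for stopwords and one for punctuation. The function, its parameters, and the stopword list are hypothetical and chosen only for illustration; they do not describe any actual server behavior.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "and"}  # illustrative list, not from any standard

def parse_terms(text, drop_stopwords=True, drop_punctuation=True):
    """Split a language string into search terms under two independent switches."""
    if drop_punctuation:
        # Keep only runs of letters and digits, so 'c++' becomes 'c'.
        tokens = re.findall(r"[A-Za-z0-9]+", text.lower())
    else:
        # Split on whitespace only, so 'c++' survives intact.
        tokens = text.lower().split()
    if drop_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens

print(parse_terms("The C++ Programming Language"))
# ['c', 'programming', 'language']
print(parse_terms("The C++ Programming Language", drop_punctuation=False))
# ['c++', 'programming', 'language']
print(parse_terms("The C++ Programming Language", drop_stopwords=False))
# ['the', 'c', 'programming', 'language']
```

The two switches are equally easy to express, which is the substance of the question: a stopword control is only one of several parsing controls a client might want.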
Language Strings vs. Character Strings
There are some who appear to be dissatisfied with or unclear about the formulation of the distinction between these two types of structures. It may simply be that the architecture document isn't adequately articulating the difference. Or, should we reconsider this whole distinction?
Character Strings vs. Numeric Quantities
Do we really need all those comparison attributes for character strings (e.g. lexically less than)? That is, when comparing strings in terms of lexical order, why can't they be treated as numbers, as we decided to do with dates?
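One way to see what "treating strings as numbers" could mean is the following sketch, which maps each string to a rational number whose numeric order matches its lexical order. Byte-wise ordering of UTF-8 text (ignoring case and collation rules) and the base-257 encoding are assumptions made for this example, not something the architecture specifies.

```python
from fractions import Fraction

def lexical_key(s: str) -> Fraction:
    """Map a string to a number whose numeric order matches byte-wise lexical order."""
    key = Fraction(0)
    for i, b in enumerate(s.encode("utf-8"), start=1):
        # Digits run 1..256 in base 257, so a prefix always sorts before its extensions.
        key += Fraction(b + 1, 257 ** i)
    return key

words = ["smith", "smythe", "smit", "Smith", "10", "9"]
by_number = sorted(words, key=lexical_key)
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
print(by_number == by_bytes)                  # True
print(lexical_key("10") < lexical_key("9"))   # True
```

The last line shows that the result is still lexical order ("10" sorts before "9"), so numeric comparison operators could carry lexical semantics only if the encoding is agreed on.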
Searching According to Document Identifier
In particular, what is the new-architecture analogue of the Z39.50R URL, where the doc id is extracted from the URL and used as the search term, with bib-1 Use attribute DocId and structure urx?
Cross Domain Searching
According to the proposed plan, there will be a bibliographic set developed under the new architecture (tentatively called "bib-2") as well as a cross-domain set to include Dublin Core access points (tentatively called the "xdc set"). However, according to an alternative suggestion:

Instead of a "bib-2", the "successor to bib-1" should be developed for document-like-objects, not bibliographic objects. Support for bibliographic requirements should be based on profiling this set.
There should be not just a single cross-domain set, but different cross-domain sets corresponding to different object types. There should be a (single) cross-object set.
Formal Vs. Informal Approach to Attribute Set Design
The attribute architecture seems to emphasize a more formal approach to attribute set design, while there is a need to continue to develop attribute sets on an informal basis. While the architecture does not need to address the informal approach, it should not preclude it. The more formal approach is necessary when semantic interoperability is paramount, that is, for people who want semantic predictability for cross-database searching. The less-formal approach applies when an attribute set is developed, say, for a single database.
Organizational Vs. Structural Formalism
Two levels of formalism seem to be addressed by the architecture:

Formal data structures related to the description and execution of searching.
Organizational formalism: who owns and/or develops an attribute set, and how they lay it out for developers/systems to use.
But the document does not make a clear, explicit distinction between these. A developer should be able to adhere to the first level without necessarily adhering to the second.

The following open issues concerning the Utility attribute set have been identified.

Masking
Should the Utility set include a masking attribute? (It already has a regular expression.) If so, should it have the same semantics as the Z39.58 masking in bib-1 (although the Z39.58 standard has been withdrawn, so the attribute should not refer to Z39.58)? Or should it be based on ISO 8777 masking?
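Since the Utility set already has a regular expression, it may help to see how a mask could be expressed as one. The sketch below assumes a hypothetical masking convention in which '#' stands for exactly one character and '?' for zero or more characters; these symbols are assumptions for illustration and are not taken from the text of Z39.58 or ISO 8777.

```python
import re

def mask_to_regex(mask: str) -> str:
    """Translate a mask (assumed convention: '#' one char, '?' any run) to a regex."""
    parts = []
    for ch in mask:
        if ch == "#":
            parts.append(".")       # exactly one character
        elif ch == "?":
            parts.append(".*")      # zero or more characters
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return "".join(parts)

print(bool(re.fullmatch(mask_to_regex("wom#n"), "woman")))           # True
print(bool(re.fullmatch(mask_to_regex("wom#n"), "women")))           # True
print(bool(re.fullmatch(mask_to_regex("catalog?"), "cataloguing")))  # True
print(bool(re.fullmatch(mask_to_regex("wom#n"), "womn")))            # False
```

If every mask can be translated this way, a separate masking attribute adds convenience rather than expressive power, which is one consideration for the question above.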
Content Authority
What value(s) should the Utility set define for Content Authority?
Indirection
What value(s), besides URL, should the Utility set define for Indirection?

The following issue pertains to Dublin Core/Cross Domain (XDC) searching using bib-1.
The addition to bib-1 of 15 Use attributes corresponding to DC elements was approved at the June 98 ZIG meeting.
A number of those (9 or so) are perceived to be redundant; for example, DC-title is thought by many to be unnecessary given the existing bib-1 title (4).
Alternatives are:

Reverse the decision to add the 15 elements. Define a mapping (or adopt the existing mapping proposed at Copenhagen, August 97) from existing bib-1 Use attributes to DC attributes, for the 9 or so that are appropriate, and add only those for which there is no appropriate mapping.
Reverse the decision to add the 15 elements, don't define any mapping, but do identify and add the necessary attributes for which there would not be any appropriate corresponding pre-existing attribute.
Reverse the decision to add the 15 elements, don't define any mapping, don't add any attributes.
Keep the 15 elements, use them for xdc searching. Pre-existing attributes like title (4) would be used to restrict a search to the bibliographic domain. An xdc attribute would be defined as a superset of its corresponding pre-existing bib-1 attribute; thus DC-title would retrieve all records that title (4) would retrieve, and more.
Keep the 15 elements and define the redundant ones to be identical to their pre-existing corresponding attributes; both sets to be used for xdc searching.
Keep the 15 elements. Use the pre-existing bib-1 attributes for xdc searching and the new DC attributes to search DC records only.
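The superset alternative in the list above (DC-title retrieves everything title (4) retrieves, and more) can be sketched as follows. The records, the 'domain' and 'title' field names, and the string labels used for the two Use attributes are all hypothetical, chosen only to illustrate the intended relationship.

```python
# Hypothetical records; 'domain' and 'title' are illustrative field names.
RECORDS = [
    {"id": 1, "domain": "bibliographic", "title": "Moby Dick"},
    {"id": 2, "domain": "museum", "title": "Moby Dick (scrimshaw)"},
    {"id": 3, "domain": "bibliographic", "title": "Walden"},
]

def search_title(term, use):
    """Return ids of records whose title contains 'term'.

    use="title" restricts to the bibliographic domain;
    use="dc-title" searches every domain.
    """
    if use not in ("title", "dc-title"):
        raise ValueError("unknown Use attribute label: " + use)
    term = term.lower()
    hits = set()
    for rec in RECORDS:
        if term not in rec["title"].lower():
            continue
        if use == "title" and rec["domain"] != "bibliographic":
            continue
        hits.add(rec["id"])
    return hits

bib_hits = search_title("moby", "title")
xdc_hits = search_title("moby", "dc-title")
print(sorted(bib_hits))      # [1]
print(sorted(xdc_hits))      # [1, 2]
print(bib_hits <= xdc_hits)  # True
```

Under the "identical" alternative, by contrast, both labels would return the same set, and under the "DC records only" alternative the two sets could be disjoint.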

Library of Congress