ZIG meeting - June 2, 1998 -- Day 2

Washington DC

6. Attribute Sets and Attribute Architecture

The issue is version 3, not backward compatibility with v2. (The presumption has been a gradual move to v3.)

Doing it right is important for the long-term health of Z39.50. If you believe the working life of Z39.50 is only two years, we should abandon the attribute architecture work because it will take us two years to do the attribute design work and another two years for that work to be implemented across vendors. Is the investment worth it to us? We must recognize that the new attribute architecture and classes will require a re-working of existing attributes like STAS -- a significant investment. We need to make an economic case to do this work.

See handout "Attribute Architecture Issues for discussion at the June ZIG meeting." What attributes do we need? What do we do about the Dublin Core (DC)? It is clear that people want to use the DC access points. What is not clear, and a topic of substantial controversy, is whether we should pick attributes in v2 that sound like DC semantics and add the others that can't be mapped to existing attributes. Or should we simply embed all DC elements into Bib-1, as proposed, and treat them like GILS (i.e., they take their semantics from the group that developed them)? Maybe we should not adopt DC, but instead charter work on a lowest common denominator Cross Domain attribute set that is similar to DC but based on ZIG experience. (C Lynch doesn't like this option personally, but others do.)

How many attribute sets do there need to be and how are they parceled out per discipline? The number is growing, but in practice we are perpetually inheriting each other's sets. This is different from the neat mosaic of the world that we held historically where different communities would manage sets for their communities and overlaps would be negotiated. This is a reality check problem.

Suppose we move ahead with the attribute architecture work and keep Bib-1 as the community sewer (treat it with benign neglect). If we were going to develop a new bibliographic attribute set, what's the scope of the work? What about DC and discipline-specific attribute sets? Should the attribute work extend into the abstracting and indexing world? (A&I databases need the bib attribute set, but they also require specialized access points for different disciplines, so the boundary conditions are unclear.) How many different attribute sets do we need? There seems to be consensus that we need a set at the gross semantic level, but it would also be useful to have an attribute set tied closely to MARC fields and subfields (semantics well understood and real applications that want control over searching MARC fields and subfields).

Denenberg: we can soften the dichotomies just presented. We're not talking about a massive upheaval in the standard vs. maintaining the status quo. Adopting a new architecture does not necessarily mean the death of v2 systems or the death of Bib-1. He thinks C Lynch overstated the case. We do not need to terminate Bib-1 to go to the new architecture -- we should challenge this operating assumption. It's not just a question of the life span of Z39.50 determining whether we do this work, but also the prospect that doing this work will extend the life span of Z39.50 if we continue to be ambitious and keep the standard up with the needs of the community. Also, the arguments that have been made that no users or vendors are asking for a new architecture miss the point. (Users don't know about attribute architecture, and vendors are happy with the status quo.) The new architecture was intended to give those developing new attribute sets some guidelines (thus attribute set developers, not users or vendors, want an architecture). The discussions within CIMI, ZBIG, etc. are somewhat confused about what to do with Bib-1, DC, etc.

LeVan: DC is missing semantics and structure-based, position-based queries which are not supported in the current architecture. C Lynch: the new architecture is to provide guidelines and a framework in which groups can position their work and build on the work of others. This is a different problem or set of needs from those the bibliographic community is having with interoperability. Hinnebusch: the problem is not the current architecture but the expediency of mapping attributes. The integrated library system community can continue to not do it right for a long time with Bib-1. On the other hand, they could start doing it right with Bib-1 and increase interoperability. This is a red herring. Communities developing new attribute sets need to integrate with library systems; this will force vendors to do it right -- to integrate catalog searching, government information searching, museum information searching, etc. He's not sure whether vendors are looking at this in their product line.

Christian: Lots of other people are now tuned in to semantic interoperability problems and trying solutions. When we build these other systems (like GILS), how do we make sure that Z39.50 work will nicely interoperate with other developments? Sutton is dealing with freedom of information documents and their work must be interoperable with other systems and information.

Reich: how will the new attribute architecture help with interoperability? He worked to make GILS and GEO interoperable. If you don't have a wonderful registry where people can walk through an inheritance tree, a new architecture may only confuse matters. How would the process occur if you want to use the biology core attributes and attributes from related attribute sets? Denenberg: how does the current process enable that interoperability? Nothing in the new architecture precludes the mechanism that you describe. C Lynch agrees. The new attribute architecture does not do a great deal to improve over the current state of affairs where different communities maintain their attribute sets with rational boundaries and no overlap. It may help disentangle things, but the fundamental problem of how different communities have practiced and coordinated to not do redundant work is conserved across the new architecture.

So what are the problems? Stovel gave a presentation on Bib-1 at the NISO attribute workshop in March. She described the growth of Bib-1. For example, there were 23 Use attributes in 1988, 79 in 1992, 99 in 1995; another 60 were added in February 1998, with 15 DC and others currently pending.

C Lynch: the working group made no assumption about the presence or absence of Explain. The new attribute architecture would require some fiddling with Explain but can be accommodated by Explain. The group did look at the ambiguous status of Use attributes in existing practice. Is a Use attribute a field or an intellectual thing? They also looked at data elements and attribute duality. There is a significant group of developers interested in more direct equivalence between data fields and search points. Pedersen: if you look at implementations and profiles describing the use of attributes, only part of the attribute set is actually used. Hinnebusch's survey of CIC sites looked at attribute use; the results will be available soon, though a preliminary look indicates that most vendors use only a small set of attributes. LeVan: many communities are using different attributes for different purposes; some communities are not interested in interoperability with other communities. We have not only intersection problems but union problems. It may be interesting to know how many attributes are broadly used, but of what use is this information for specialized communities?

Skipped to ZBIG presentation. (See next agenda item.)

Continuation of attribute architecture discussion:

Hinnebusch: when you use attributes from different attribute sets, how much of the semantics cross over? The big problem is combinations of attributes from the same or different attribute sets. If you want to give the server choices, how do you do that? Our different practices evolved because there was no architecture or guidelines for how to do things. Nothing's ever been said about what it means when you send a term without attributes. Guidelines must indicate the kinds of things people have to think about when they develop attributes.

C Lynch: in 1994-95 a significant number of systems with similar content tried to interoperate and we had a big mess. Everyone made different assumptions about what a term without attributes meant, what terms with multiple or repeated attributes meant, etc. In theory you could fix this by re-doing Bib-1, but by this time additional attribute sets had appeared and people wanted to combine attributes from multiple sets in the same query. The problem of indeterminate semantics was complicated by the different kinds of attributes in different sets. Where do the assumptions come from when you combine attributes from different sets? We need a broad framework.
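The indeterminate-semantics problem described here can be illustrated with a minimal Python sketch. It is not taken from any implementation: the function name and the two sets of server defaults are assumptions made up for illustration. The only values drawn from Bib-1 are attribute types 1 (Use) and 4 (Structure), Use values 4 (Title) and 1016 (Any), and Structure values 1 (phrase) and 2 (word).

```python
# Sketch only: how two servers with different (assumed) defaults
# can interpret the same attribute-less term differently.

def effective_attributes(sent: dict, server_defaults: dict) -> dict:
    """Fill in every attribute type the client omitted with the server's default.

    Keys are attribute type numbers, values are attribute values.
    """
    merged = dict(server_defaults)  # start from the server's assumptions
    merged.update(sent)             # anything the client sent explicitly wins
    return merged

# Hypothetical defaults; no standard text fixes these, which is the problem.
SERVER_A_DEFAULTS = {1: 1016, 4: 2}  # Use=Any,   Structure=word
SERVER_B_DEFAULTS = {1: 4, 4: 1}     # Use=Title, Structure=phrase

if __name__ == "__main__":
    bare_term = {}  # a term sent with no attributes at all
    print(effective_attributes(bare_term, SERVER_A_DEFAULTS))
    print(effective_attributes(bare_term, SERVER_B_DEFAULTS))
```

Under these assumptions the same bare term becomes a word search on any field at one server and a phrase search on title at the other; only a fully specified attribute list yields the same interpretation at both.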

Reich: so how do we proceed? Does this new architecture work scare people because of the complexity of attributes and attribute combinations? He previously thought the issue was interoperability. Denenberg: all of the issues are related. Smith: we need to see the structure or coupling of data structures and documents or records. LeVan: structure-based queries are missing from the current architecture. Zeeman: the problem with bibliographic data is that there is nothing inherent in the attribute set or record set that's returned that would allow the system to know without prior knowledge how to search for an element in a record that's returned; you need external knowledge. Denenberg: that is supported by the new attribute architecture. Zeeman: no, it's not. Denenberg: all of the attribute sets that we've speculated about conform to the new architecture. Hinnebusch: but that doesn't address Zeeman's concern about how to move from an element, for example, to a MARC tag. D Lynch agrees that this is important to do, though maybe we should do it in a more general way, but that's not the current topic. What are we trying to accomplish today other than hand wringing? Are we supposed to craft something?

C Lynch: there has been a proposal for about six months that suggests we adopt a template for future attribute sets and possibly a strategy for migrating existing sets into this framework. He is fairly confident that if we agree on the framework, we can thrash through the remaining technical issues. Do we agree to adopt this framework and develop definitional strategies for new attribute sets and migration strategies for old attribute sets? If we don't want to do that, what do we want to do?

Stovel: the original work was trying to figure out the technical framework for attributes. That's done. Now we're struggling with the implications of this framework. Does this technical approach give us what we need? Denenberg: what would be useful at this point is considering Percivall's reservations about the new architecture. Percivall: we have not contemplated moving to the new architecture, so we don't know the implications of doing so. Perhaps instead we should formalize our ad hoc agreements. It is not clear to him that the list of "rather tentative" issues on the handout warrants overthrowing current work.

Hinnebusch: what would be the impact on the CIP attribute set if we adopted this architecture? Would it make a difference? C Lynch hasn't looked specifically at the CIP attribute set in this context. Where he suspects most folks will have trouble is the way the non-Use attributes are done. There's arbitrariness in how the non-Use characteristics are sliced up into different attribute types or classes. The group did look at STAS and the new architecture would require some work. The intellectual work is at the level of the data and the access points. The difficulty is with the non-Use attributes, but he's not sure how extensive the work would be for different communities. Reich: the investment is not that large (e.g., adding only some non-Use attributes), but if this is to facilitate interoperability, then let's do a walk-through of what's involved in combining Use attributes from multiple sets under the new architecture. What is the scenario about Use attribute interoperability? Will Use attributes get dispersed even further than they are now?

Denenberg: what advice can we give developers about whether to use an existing attribute or create their own? Turner: what is the vendor commitment to this, especially the library system vendors? We need to hear from vendor product managers and marketing people. D Lynch: the ILS question is irrelevant; if the only purpose of the standard is library catalog interoperability, then we've been doing lots of work well beyond that (and wasting resources). Turner: libraries would like to be able to search across different systems, e.g., GILS, geo-spatial, etc. D Lynch: the European community is using middleware with Z39.50 to search across different museums, archives, geo-spatial, A&I databases, library catalogs. Hinnebusch: we can't ignore ILS vendors, but they probably won't do any better with the new attribute architecture than the old. Nonetheless, do we know what the impact of the new architecture will be on ILS vendors? They use a relatively small set of attributes, so the impact may be relatively small. For example, they could use a new OID but continue to do everything else the same. Randall is not going to push ILS vendors to use the new attribute architecture, but rather to clean up what they're doing now. We cannot make a business case for ILS vendors to migrate to the new architecture if migrating won't sell more systems.

Waldstein: some of our problem may be presentation. The set of questions is scary, but it's unclear whether the work will really be disruptive. St Gelais: from the vendors perspective, we don't object to moving to a new architecture, but the benefits, the attribute classes, and how to build them are unclear. C Lynch: we can talk in terms of a strategic direction of adopting the new attribute architecture, but that's a meaningless statement for an implementor. An implementor will end up implementing new or revised attribute sets that conform with the new attribute architecture. It's at that point that we can talk about the work involved and tangible benefits. The issue is not whether ILS vendors will adopt this, but can we spec one or several attribute sets within that architecture that will provide a higher level of interoperability and better retrieval of Bib-1 -- then we can talk about cost benefits to ILS vendors. One problem is the interoperability (or lack thereof) of bibliographic systems; there the issue is purely how good is the successor to Bib-1 and can we write a spec for it that is sufficiently rigorous to discuss conformance. The other set of issues in the ILS community is when these vendors begin to expand to A&I and other kinds of databases. But it is unclear whether this is part of the strategic direction of ILS vendors or only of the people who purchase their systems.

Stovel: C Lynch's characterization of the ILS world captures the whole problem. We need to be more specific (e.g., to have Domain Specific attribute sets), but this need conflicts with those who need broader searching (across disciplines, data types, etc.). LeVan: as we implement these new architectures there is a constant requirement to slip new functionality into existing systems. Well, if we can do this in the existing stuff, what makes doing it right preferable to a kluge? Hinnebusch: we don't know until we really try to do this in different communities. It may turn out to not be a significant enough difference to make it worthwhile. LeVan: some people are worried about existing stuff becoming destabilized by the new architecture. Maybe we need to reassure them that we won't destabilize them and that we understand the need to support existing stuff.

Denenberg agrees that we are talking about developing new attribute sets and that the existing installed base will not be abandoned -- unless we begin to talk about Bib-2, which seems to hit raw nerves. We should take particular care in the development of Bib-2 (if we do it) to be sure that we don't abandon Bib-1. Christian: if we don't do this, Z39.50 will be left behind by other work outside the ZIG. Reich is concerned about putting another new thing on top of previously existing "new things" that have not been implemented yet (e.g., persistent result sets, Explain). D Lynch suggested the partitioning of attribute sets.

Stovel: Reich's argument is an argument for proceeding with the new attribute architecture. Once it's there, people doing implementations at that point will begin there and the rest of us have to catch up. LeVan: a similar argument was used with version 3, but experience didn't demonstrate it. Denenberg: adopting an attribute architecture in itself doesn't accomplish anything (like C Lynch said); what you implement is an attribute set developed under this architecture and agreed-upon guidelines. Presumably attribute sets are developed because there's an application or need. Does anyone here still have reservations about what we're doing? Do we agree that we should move forward with this?

Moen: will the implementor community or vendors buy into this? FCLA, NLC and Blue Angel are doing work that raises the awareness of consumers who can pressure the ILS vendors. CIMI is looking for guidance and wants to be able to make assumptions about what other people are doing in different domains. We need agreements to move forward. For example, do we incorporate Dublin Core?

Hinnebusch: this is a sociological issue: how do you discover what's going on elsewhere? C Lynch: the attribute set architecture does provide more hospitable ground for such coordination and gives you some hope of success. Percivall: maybe we only need a set of business rules or guidelines for constructing attribute sets, not a new attribute architecture. Turner: the proposal does not address the higher level management issue of how attribute set developers are going to communicate with or "discover" one another. Denenberg: the serious problem that we have not addressed is whether or not to define a particular attribute in your set or to profile a different set? There was a difference of opinion about whether this is a maintenance or registry issue (i.e., register your intent to develop an attribute set in a particular area) or something more.

Reich raised the issue of compatibility with version 2; years ago we thought everyone would eventually move to a full implementation of v3. If that doesn't happen, what does it mean? C Lynch: this was analyzed in some detail a while back. There is only one critical v3 feature, which is the ability to intermix attributes from multiple sets. But if you look at what you have to do technically to move from v2 to minimal v3, it's a moderate nuisance, not a big investment. The move to full-fledged v3, however (e.g., Extended Services, Explain), is a big investment -- none of which is required by the attribute architecture.
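As an illustration of the one critical v3 feature named here, the following Python sketch models a query term whose attributes may each name their own attribute set. The class and function names are hypothetical and do not come from any toolkit. The values taken from the standard are the Bib-1 OID (1.2.840.10003.3.1), attribute types 1 (Use) and 2 (Relation), Use value 4 (Title) and Relation value 3 (equal); the second OID and the Use value 7 paired with it are placeholders, not a registered set.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

BIB1_OID = "1.2.840.10003.3.1"  # Bib-1 attribute set
OTHER_SET_OID = "1.2.3.4.5"     # placeholder OID for illustration, NOT a registered set


@dataclass(frozen=True)
class Attribute:
    attr_type: int                  # e.g. 1 = Use, 2 = Relation in Bib-1
    value: int
    attr_set: Optional[str] = None  # None = inherit the query-level attribute set


@dataclass
class Term:
    text: str
    attributes: List[Attribute]


def sets_used(term: Term, default_set: str) -> Set[str]:
    """Return every attribute set OID the term's attributes draw on."""
    return {a.attr_set or default_set for a in term.attributes}


def expressible_in_v2(term: Term, default_set: str) -> bool:
    """A v2 query names one attribute set for the whole query, so a term
    fits v2 only if all of its attributes come from that single set."""
    return sets_used(term, default_set) <= {default_set}


if __name__ == "__main__":
    single = Term("hamlet", [Attribute(1, 4), Attribute(2, 3)])
    mixed = Term("hamlet", [Attribute(1, 4), Attribute(1, 7, OTHER_SET_OID)])
    print(expressible_in_v2(single, BIB1_OID))
    print(expressible_in_v2(mixed, BIB1_OID))
```

The optional per-attribute set identifier is the whole of the difference modeled here, which is consistent with the "moderate nuisance" assessment above; nothing in the sketch touches Extended Services or Explain.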

Pedersen wants to move to the new architecture and v3; arguments about legacy v2 systems are irrelevant. The problems are not technical. There are toolkits to help you develop v3. The important issue is that libraries want to query across different kinds of information (not just bib info) because that's what libraries do. The architecture group must address this problem. Many European projects had trouble trying to create interoperability not because of different attribute sets but because of the semantics of attributes across sets. C Lynch: the problem was not that vendors didn't implement Bib-1, but that they interpreted Bib-1 differently. D Lynch: we don't have a reasonable partition of attributes into areas of reasonable interest, which is why we have (for example) six different titles.

Hinnebusch: is anyone opposed to moving ahead with the new attribute architecture? Reservations, yes. Opposition, no. We have minimal consensus to move forward. People who work in the library community need to do some public relations with vendors and customers about this.

C Lynch grouped the issues:

There are technical issues like nesting, occurrence, etc.

Where do we go from here? What do we need to do to make attribute sets happen (based on the architecture blueprint)?

What about the Dublin Core and how does it relate to ZIG activities?

Agreed

Stovel provided an overview of the attribute architecture workshop. The group talked about the different kinds of attribute sets that were needed. Denenberg also talked about protocol attribute sets for Explain and Extended Services. In terms of Use attributes, people wanted to have one set of Use attributes for cross-domain searching. They also wanted sets of Use+ attributes for specific areas like bibliographic and geo-spatial information, and a ZIG-defined Mechanical attribute set (e.g., Relational attributes). This implies mixed attribute sets. Does this make sense to the ZIG?

Denenberg provided further explanation of the Use attribute set for cross-domain searching. The group examined the premise that this was the thrust of Dublin Core and rejected it because Dublin Core does not define or address searching. Even those who do think it addresses searching don't think it's for cross-domain searching but least-common-denominator searching. It is not the ZIG's business to tell DC what they should be doing with searching. Nonetheless, we need Use attributes for searching across different disciplines (perhaps call it "CD" for "cross-domain" set). These could have the same semantics as Dublin Core.

LeVan disagreed strongly with the assertion that Dublin Core was not intended for cross-domain work. DC folks took significant effort with multiple domains (communities) when doing their semantics. He agreed that they did not deal with searching. The DC folks conceded to LeVan the responsibility to see that DC is merged into Z39.50. The ZIG wants to do DC searching. Who would maintain a Cross Domain attribute set? LeVan likes the idea of a Cross Domain attribute set and thinks the ZIG should be influential in its creation, but that the work should be done explicitly in liaison with the DC folks because they have credibility in cross-domain work and the ZIG does not.

Denenberg: who is the DC community? LeVan: those who read the DC list and attend the meetings. A technical advisory committee was recently established for the DC community to thresh out the difficult issues and find direction. They are finalizing an RFC for a simple DC. While we may disagree on the details and motivation, we need to support cross-domain searching. Denenberg would like the ZIG to appoint a technical group to liaise with the DC folks. LeVan agreed that this is the way to go. Moen: if we needed to expand the 15 DC attributes, could we do so on the basis of what's needed for searching (rather than tagging data)? Yes, as long as we're explicit. ??: what's the benefit of calling it "cross-domain" rather than Dublin Core? It's an issue of ownership as well as being a more informative name. C Lynch would hate to see us waste time reinventing DC. Casting DC into a functioning attribute set will need some other stuff (like non-Use attributes or an Any attribute), but the notion here is not to recapitulate the work that went into DC.

The ZIG agreed that we need to liaise with the DC group and that LeVan will be our point of contact.

Stovel: is there a place in DC for "Any," "AnyWhere" and "ServerChoice"? In the new Utility attribute set? Or the new Mechanical attribute set? LeVan prefers putting them in the Mechanical attribute set rather than arguing with DC folks to adopt Any as a Use attribute. C Lynch agrees. Initially we thought the Utility attribute set would be bigger, but upon investigation discovered it was small and can be handled in the Mechanical attribute set. The new Mechanical attribute set may include relational attributes. Lynch presumes that the Mechanical set would be defined and managed by the ZIG through the Maintenance Agency (like things in the base standard). Do folks agree? LeVan would like to see non-Use attributes evaluated and added to the Mechanical set. Lynch wants the opposite -- let's be parsimonious with this Mechanical set. Stovel doesn't think LeVan's and Lynch's views are mutually exclusive. There was some debate about whether we're talking about a small number of non-Use attributes. Hinnebusch: if we move non-Use attributes into the Mechanical attribute set, do we need to revisit the non-Use types? Waldstein: are we still talking about Relation, Position and Structure making up the Mechanical attribute set -- especially if we're no longer saying the server must pick only one? Zeeman: the combining rules become more complicated. Waldstein must be able to specify multiple Relation attributes. Denenberg: if we want to lump a bunch of orthogonal attributes into the Mechanical attribute set, then we must allow multiple attributes of the same type. C Lynch wants someone to do something about the Mechanical attribute set, the sooner the better. Denenberg will solicit help to do a draft of the Mechanical set.

Hinnebusch summarized: LeVan owns the Cross Domain attribute set. Denenberg owns the Mechanical attribute set.

Stovel: how will we set up Domain Specific sets? Are they useful? How do we define domains? How do they work together? (See number 9 on the handout on "Attribute Architecture Issues.") What about "interdisciplinary domain-specific" attribute sets? What does that mean? LeVan: this is not a two-layer problem, but multiple layers, such as domain-specific and cross-domain interoperability. Hinnebusch: yes, but if we leave it up to communities to define the domain, some domains will include other domains. Do we need to recognize this fact in some document (that domains can contain multiple other domains)? Reich: we're making this harder than we have to. A nice hierarchical structure of domains is at best a short-term fiction. There are indeed many layers, but the new architecture handles this. Hinnebusch: there are many specific domains and one universal Cross Domain attribute set. C Lynch recommends using the word "community" rather than "domain." ??: even within a community there are many domains (e.g., the "-ologies" in ZBIG's Darwin Core) and lots of combinations of attribute sets; he hopes that the new architecture guidelines will help with semantics in the different domains. LeVan: this raises an important question -- the immediate step after agreeing on the architecture is to profile this. This new profile will be a model for others, so we need to support enthusiastically whoever steps up to do this.

Moen: if we recast CIMI attributes in the new architecture, does this involve only Use attribute work since the Mechanical set has everything else? C Lynch: this must be looked at on a case-by-case basis; some communities may need to develop other Relation attributes. Non-Use attributes will probably be incorporated into the Mechanical attribute set unless they are conspicuously domain specific. If an attribute takes a lot of explanation about discipline-specific practice, then it probably will NOT go into the Mechanical attribute set.

Hinnebusch: we also need to consider the Collections profile, which has Use attributes that are really collection-navigation attributes. Is that stuff domain-specific? LeVan is considering protocol-related attribute sets.

LeVan: how do communities self-identify themselves? Who is the "bibliographic community?" Hinnebusch: other than the bib community, the ZIG knows who's doing GILS, CIP, ZBIG, etc. We don't know who's doing bib. C Lynch: the other thing that makes the bib community different is that we packaged the bib community standards with the protocol standard. We need to explicitly decompose this and show that the bib community is not coterminous with the ZIG. We may need multiple communities to deal with the bibliographic stuff.

Reich: all communities want to interoperate at a level above Dublin Core but below the precise searching of union catalogs. LeVan: the scope statement of the bibliographic community or communities will define the group. Needleman: who develops the scope statement for the bibliographic community? Hinnebusch: if someone uses it, Bib-1 Use attributes will probably migrate to Bib-2.

Percivall: we have a Cross Domain set, a DC set, a Bib-2 set, and the possibility of additional attribute sets for different communities defined by a community profile. Denenberg: we're jumping the gun. We haven't reached consensus that there will be a Bib-2 set. We agreed on a Mechanical set and a Cross Domain set (though we're quibbling about what to call it). We can either create Bib-2 or fix Bib-1. LeVan: the new attribute architecture was motivated by the need to fix problems in Bib-1. Zeeman: the decision to do Bib-2 should not come from the ZIG, but from the bibliographic community, which is not coterminous with the ZIG. What group has the formal standing to speak for the "bibliographic community"? None?

Hinnebusch: bibliographic attribute sets will be developed by a group (call it the "bib group" for now) chaired by Stovel. Anyone who wants to can work with this group. The first task of the group is to define a scope statement.

Zeeman: there is no difference between fixing Bib-1 and doing Bib-2. We cannot retroactively change the semantics of Bib-1. Denenberg: there is a big difference between Bib-1 and Bib-2. Bib-1 is maintained by the Z39.50 Maintenance Agency and is part of the standard. Bib-2 is not. We can, however, change the semantics of Bib-1 if we want. The bib group can recommend that there be no Bib-2 and recommend that the ZIG do such-and-such to fix Bib-1.

Pedersen: how can we ensure that the bib group is responsive to the international community? LeVan: the library community is important to the ZIG, but the ZIG does not dictate the activities of the library community. The library community should behave towards the ZIG the same way that other communities relate to the ZIG. Stovel: we are successfully keeping certain work inside the ZIG and putting community-specific work outside of the ZIG. Hinnebusch: though attribute sets have historically grown out of the ZIG, they may be defined for use outside of Z39.50. Denenberg: when we first started this effort, we wanted to get domain-specific attributes outside of the ZIG, but we were unclear on the bibliographic attributes. Waldstein is uncertain here; every community has its bibliographic content databases and interests. It does feel like bibliographic information is different.

Reich: there's a portion of bibliographic information that will be common to all domains; other portions will be unique to libraries and union catalogs, for example. There is a space between Dublin Core and domain-specific communities. Pedersen: since the chair of the new bib group is in the ZIG, can the meetings overlap with the ZIG? Hinnebusch: Z39.50 has been denigrated on occasion because of its close ties with the library community.

Stovel wants the bib group to work outside of the ZIG as the first example of the new architecture. This would leave the ZIG free to focus on the guidelines. LeVan: there is a scary part to this that must be acknowledged. If the geo-spatial community is doing attribute sets, the ZBIG doing attribute sets, etc. -- the ZIG cannot say "no" to what they've done or to the semantics of their attributes. C Lynch hopes that those who are going to fight and bitch about this join the new bib group. Hinnebusch: it's important to see that divorcing the bib group from the ZIG does not mean divorcing it from the bibliographic community or ZIG people interested in Bib-2. If the ZIG constitutes the group, then the ZIG has responsibility for the group. Pedersen is concerned about international involvement, but Stovel is very interested in international participation. Denenberg would like to keep the group in the ZIG, but he's willing to let it go ahead on its own. Agreed to live with an ambiguous relationship between the bib group and the ZIG.

Stovel: how is the ZIG going to develop best practices and guidelines (the absence of which led to the failure of Bib-1)? Denenberg: the Maintenance Agency will maintain the documents and guidelines if people agree. Agreed. But how do we do it? Reich: developing best practices and guidelines is a big effort. The attribute architecture proposal doesn't provide much guidance. Hinnebusch: would it be useful for the ZIG to discuss the proposal in some detail, including the interaction of different attribute sets? Moen: this will be an iterative process as groups migrate from Bib-1, CIMI, whatever to the new attribute architecture. Denenberg: the ZIG will go through an iterative and evolutionary process in developing the guidelines. Percivall's view of the guidelines is more along the lines of a core set (whatever it's called), a bib set (whatever it's called) and Domain Specific sets that guidelines would indicate how to combine. Hinnebusch: you're dangerously close to being assigned to write guidelines. Turner agrees that this is the kind of guidelines that we need. Van Lierop wants some Use attributes to be mandatory and others optional as defined in the attribute set. Hinnebusch: this is covered in the attribute architecture. LeVan and Denenberg: no, only a profile can say support for a particular Use attribute value is mandatory (not an attribute set or guideline document). Evidently we need a profile for the library community.

D Lynch: there is a parallel (almost) between GRS schemas, attribute sets and inheritance. In GRS-1 schemas we reference tagsets by tagtypes. So the tagset is this vocabulary we've been talking about. You can have as many tagsets as you want. A schema incorporates those along with rules for how they go together, etc. We may want to do the same thing with attributes -- attribute schemas and values. So if you wanted to build the biological attribute schema, you would say attribute type 1 is the Use attributes from the Mechanical schema, attribute type 2 is X, attribute type 4 is Y. Same OID as schema. C Lynch: how does this relate to the previous discussion? D Lynch: it answers the question of how you have some stuff mandatory in some attribute sets but not others. It's partly a question of how inheritance works. For this attribute list, all the rules come from this schema. Zeeman: this means carrying the profile identifier into the protocol. So? Denenberg: is what is mandatory simply server recognition of the attribute or server support of the attribute? Reich: how does the attribute set differ from the profile?
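
D Lynch's suggestion can be pictured with a minimal sketch. It is an illustration only, not anything the ZIG specified: the class name AttributeSchema and its resolve method are hypothetical, and the set names "Mechanical", "X" and "Y" are the placeholders used in the discussion above. The idea is that a schema binds each attribute type number to the attribute set that defines it, much as a GRS-1 schema binds tagtypes to tagsets.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AttributeSchema:
    """Hypothetical 'attribute schema': one rule source for a whole attribute list."""
    name: str
    # attribute type number -> name of the attribute set that defines that type
    type_bindings: Dict[int, str]

    def resolve(self, attr_type: int) -> str:
        """Return the attribute set that defines this attribute type in the schema."""
        if attr_type not in self.type_bindings:
            raise KeyError(
                f"schema {self.name!r} does not bind attribute type {attr_type}"
            )
        return self.type_bindings[attr_type]


# The example from the discussion: type 1 comes from the Mechanical schema,
# type 2 from X, type 4 from Y.
biological = AttributeSchema("Biological", {1: "Mechanical", 2: "X", 4: "Y"})
```

With this shape, biological.resolve(1) names the set whose rules govern type 1, and an unbound type (3 here) is an error rather than a guess.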

C Lynch sides with LeVan on this. We have confounded attribute sets and profiles. He always thought of attribute sets as agreed-upon vocabulary and semantics, nothing more, which is why guidelines and best practices are needed. Profiles are where groups agree on what part of that attribute set they support mandatorily or will gracefully reject. Zeeman: we started talking about attribute sets and profiles together because groups doing profiles are finding the vocabulary insufficient. C Lynch: it is useful to discourage groups and separate these two so that work can be shared among groups. Denenberg: what does it mean to say that a server must support an attribute? We were all over the map with this and shouldn't go there again. Hinnebusch: it's important that we recognize that the problem Van Lierop raised is serious. The solution we've adopted is that this is addressed in profiling. Vendors know and care nothing about profiles, and there's no place in the PDU to put a profile ID.

LeVan: profiles are about tasks, which is where you make things mandatory. Attributes are not about tasks but information. Hinnebusch: there is no vendor who will politically and economically buy into mandatory attributes. Zeeman: our standard does not require conformance to an attribute set but to combinations of attribute values. Hinnebusch: it's the consensus of the ZIG that profiles are where things are made mandatory; attribute sets are vocabulary. Moen: in our efforts at profiling, because it was a new approach for dealing with emerging interoperability problems, the definition of attribute sets went right along with profiling -- but we do need to separate them as D Lynch suggested. Hinnebusch: the ATS profile does not say that you must support author, title, subject searching, only that you must return a diagnostic if you don't (you won't return something else). Van Lierop: but the profiles aren't being implemented. That's a separate problem. Hinnebusch: the only way to do what Van Lierop wants is to create market pressure. Some vendors do not have the indexing structure to do this, or maybe the semantics are unclear.
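
The distinction drawn above (attribute sets are vocabulary; a profile only obliges a server to fail cleanly) might look like this in a server, as a rough sketch. The function name search and the shape of the indexes argument are assumptions made for illustration. The Use values and the diagnostic code are taken from the published Bib-1 lists and should be checked against the Maintenance Agency's registry.

```python
# Attribute set as pure vocabulary: Bib-1 Use attribute value -> meaning.
BIB1_USE = {4: "Title", 21: "Subject heading", 1003: "Author"}

# Bib-1 diagnostic 114: "Unsupported Use attribute".
UNSUPPORTED_USE_ATTRIBUTE = 114


def search(use_value, term, indexes):
    """Search one access point.

    indexes maps a Use value to {term: [record ids]} for the access points
    this (hypothetical) server actually indexes.
    Returns (record_ids, diagnostic); exactly one of the two is None.
    """
    if use_value not in indexes:
        # Do not quietly substitute another index: report the diagnostic.
        return None, UNSUPPORTED_USE_ATTRIBUTE
    return indexes[use_value].get(term, []), None


# A server that only indexes titles still behaves as Hinnebusch describes.
title_only = {4: {"moby dick": [17]}}
```

Here search(4, "moby dick", title_only) returns the matching records, while search(1003, "melville", title_only) returns the diagnostic rather than results from some other index.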

Specific architectural questions:

Should nesting be permitted for Use attributes? We agreed that nesting must be permitted for Fieldname attributes, but it's unclear if the new architecture should support nesting of Use attributes. C Lynch: the architecture proposal says you don't nest Use attributes. If a server gets a query with nested Use attributes, it should return "malformed query" or "you idiot, you're nesting Use attributes; don't do that." This applies only to class 1 attributes. Stovel: the proposal allows semantic indicators which can accomplish nesting. Reich: all of their current Use attributes are database Fieldname attributes (if migrated to the new architecture); he does not condone nesting of Use attributes.
Denenberg: should class 1 explicitly preclude the nesting of Use attributes or should it simply discourage it? Hinnebusch: the new architecture allows for the development of a new class of Use attributes that would allow nesting. C Lynch: let's disallow it now because allowing it later will not screw things up too badly. Agreed. (Note that nesting of Fieldname attributes is allowed, just not the nesting of Use attributes.)

Should specification of occurrence be permitted for Use attributes? C Lynch: there is no debate about occurrences for FieldNames (we need those). The question is really a data modeling issue. If we believe that Use attributes are ordered in a record, then it makes sense to talk about occurrence for Use attributes. Otherwise, this doesn't make sense. Waldstein wants to be able to specify occurrence of Use attributes. LeVan: do you have intimate knowledge about the structure of the record? Denenberg: let's adopt the same strategy here as we did with nesting of Use attributes. Agreed.
Backup: "occurrence" is a single Occurrence attribute, which is consistent with the architecture, but we could have multiple Occurrence attributes, which gets ugly. Nonetheless it can be solved syntactically. Hinnebusch wants the architecture document to say that, using the type 1 query, here is how you syntactically construct this kind of thing. Denenberg will draft the text. Hinnebusch is concerned that we not do ugly stuff in the architecture document just because the type 1 query is inadequate.
D Lynch: this business about fields within fields and where they are really means there's a missing operator in the query. LeVan: this is wrong because you'd end up with a list of no term values. Waldstein and D Lynch: perhaps it's time to revisit the type 1 query, rather than base the new architecture on (assume) the type 1 query. Denenberg: this will slow us down. C Lynch is concerned about a two-year process of working on a new query. The architecture group accomplished what they did because they constrained the problem. Redoing the query is another degree of freedom and scoping it will be difficult. Remember the type 102 query never had broad implementation. Hinnebusch thinks scope can be contained; the type 102 folks contained their work (e.g., they didn't want full NLP in their first pass). Hinnebusch and LeVan argued about whether there is a real need to redo the query here. D Lynch was not suggesting redoing the query, just expressing his discomfort with putting operator-type activity in an attribute set.
Denenberg: is the proposal here to scratch structured searching? LeVan: no; let's just scope out what we can and cannot do today or with the new architecture. Denenberg is concerned about doing this work now, too late (two years) in the game. Hinnebusch: the concept of what's reasonable and what people want to do with Z39.50 has changed in the past two years. D Lynch: let's restrict what we're allowing, e.g., multiple occurrences of multiple things in a single list is over the line. Hinnebusch is concerned that a kluge now will haunt us later; let's do it right if we do it.
Stovel: so we'll allow nesting or simple occurrence, but not both together. Agreed. This should cover real world applications for the next year or two.
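
Read together, the agreements above amount to three checks a server could apply to an access point. The sketch below is one reading of these minutes, not text from the architecture document: the AccessPoint structure, the check function and the message wording are all hypothetical, and it assumes that "the same strategy" for occurrence on Use attributes means disallowing it for now.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AccessPoint:
    kind: str                          # "use" or "fieldname"
    path: List[str]                    # more than one element means nesting
    occurrence: Optional[int] = None   # None means no Occurrence attribute


def check(ap: AccessPoint) -> Optional[str]:
    """Return None if the access point is acceptable, else an error message."""
    nested = len(ap.path) > 1
    has_occurrence = ap.occurrence is not None
    if ap.kind == "use":
        if nested:
            return "malformed query: nested Use attributes"
        if has_occurrence:
            return "malformed query: occurrence on a Use attribute"
        return None
    if ap.kind == "fieldname":
        # Nesting or a simple occurrence is allowed, but not both together.
        if nested and has_occurrence:
            return "malformed query: nesting combined with occurrence"
        return None
    return f"malformed query: unknown access point kind {ap.kind!r}"
```

For example, a Fieldname path of two levels passes, a Fieldname with a single level and occurrence 2 passes, and a two-level Fieldname path with an occurrence is rejected.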

Is anchoring sufficiently specified? And does it do everything that people need? (See section 3.1.2 of the attribute architecture document.) C Lynch: the idea is whether the context (path) is clear or not. LeVan: that's insufficient; maybe what we need in our Mechanical attribute set is a wildcard. C Lynch: the current proposal is more restrictive; there is a single wildcard that can be used anywhere in the path with some basic rules. Denenberg: do we want to allow anchoring? LeVan proposed dropping "anchor" as a separate attribute type and do something else, like AnyOne and AnyNumber (?) which will do what anchoring was intended to do without worrying about interactions.
D Lynch: assume that the thing is at the top unless stated otherwise. Waldstein assumed the default was floating; something feels wrong here. This feature only applies to Fieldname attributes (which relate to the physical structure of the record) in the Mechanical attribute set, not Use attributes (which are abstract constructs). Agreed: repeating FieldNames cannot have occurrences.
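
LeVan's AnyOne / AnyNumber suggestion was left open (note the question mark above), so the following is only a guess at how such wildcards could behave in a Fieldname path. It assumes AnyOne stands for exactly one level and AnyNumber for zero or more levels, and it follows D Lynch's default that a path is anchored at the top unless stated otherwise. The marker strings and the element names in the examples are made up.

```python
ANY_ONE = "?"      # assumed meaning: exactly one path level
ANY_NUMBER = "*"   # assumed meaning: zero or more path levels


def path_matches(pattern, path):
    """True if a Fieldname path in a record matches the query's path pattern.

    Both arguments are lists of element names, outermost first. A pattern
    with no wildcards must match the whole path starting from the top.
    """
    if not pattern:
        return not path
    head, rest = pattern[0], pattern[1:]
    if head == ANY_NUMBER:
        # Let the wildcard absorb 0, 1, ... levels and try the remainder.
        return any(path_matches(rest, path[i:]) for i in range(len(path) + 1))
    if not path:
        return False
    if head == ANY_ONE or head == path[0]:
        return path_matches(rest, path[1:])
    return False
```

Under these assumptions, ["name"] does not match ["author", "name"] because the pattern is anchored at the top, whereas ["*", "name"] gives the floating behavior Waldstein expected, without a separate anchor attribute type.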

Is the rule that "a class 1 attribute set may not define any attribute types not defined for class 1" overly restrictive? LeVan: there is no need to say this. C Lynch: maybe what we need is some guidelines that tell people not to create new attribute types without first talking to folks about their interactions with existing attributes. Agreed to strike the rule and replace it with a statement about guidance.
There was some discussion about whether we would ever have class 2 attributes and, if so, (St Gelais) what are the rules or guidelines for mixing classes and setting precedence? Then there was talk of eliminating the whole notion of class since none of us can think of what a class 2 attribute type would be. C Lynch: without the notion of class the architecture document sounds very restrictive about what can be done with Z39.50 attributes. Agreed.

agreed

New Category        New Type                     Old Bib-1
Access Point        Database Fieldname           Use
Access Point        Abstract                     Use
Query Management    Weight
                    Hit Count
                    Stopwording
Qualifying          Language
                    Content Authority
                    Expansion / Interpretation   Relation, Truncation
Comparison          Comparison                   Relation
Format / Structure  Format / Structure           Structure
Occurrence          Occurrence
Indirection         Indirection                  Use

C Lynch, Denenberg and Stovel will revise the architecture document. The revised doc will be posted on the web for a one-month review period. Denenberg will also work on the Mechanical attribute set. Stovel will draft a scope statement for the new bib group. Drafting guidelines for creating attribute sets will be an agenda topic for the Madrid ZIG in October.

7. ZBIG (Z39.50 Biological Implementors Group)

ZBIG is interested in exploiting Z39.50. All of their databases are relational, but they have semantic problems within their community and integration problems with other communities.

ZBIG is comprised of natural history museums world-wide. Their items are analogous to archives, in that every specimen is unique. People want to bring these specimens together for analysis. Natural history has a breakdown of "-ologies," each of which has its own collection of databases and way of doing business. Most catalogs have been done independently. Data partitioning within this community has peculiarities as well as similarities. The data will become useful and relevant to the world and decision-making when we can create synthetic data sets. We need to integrate.

There was a workshop in 1992 of the different "-ologies" people to create a high-level information model for biology in 1993. In 1997, they tried to further develop the model using an object-oriented framework of complex structured attributes, including collection objects, collection events, and localities. The "Darwin core" attribute set will include top-level attributes across the -ologies. They want to be able to publish collection objects and authority records for taxon (the names, full reference of where it came from, etc.). They also have a controlled vocabulary.

Initial work with Z39.50 began in 1998 with the following constraints:

Data is in relational databases with different schemas.

Information is used primarily for analysis, but browsing is also important.

They must handle large result sets.

The system must be low cost and low maintenance.

The system must be easy to use and stable.

The performance of the prototype is adequate. Z39.50 and extensions to analysis applications provide powerful IR. GRS-1 is an acceptable record syntax and the translation from RPN to SQL is adequate. They did not use the ZSQL query, but wrote their own RPN to SQL translator. There is still extensive work to be done on attribute sets and profiles.
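
The ZBIG translator itself was not described, so the following is only a generic sketch of what an RPN to SQL translation involves. It assumes the query arrives as a postfix token list, that each operand is a (Use value, term) pair, and that each Use value maps to one column. The function name, the Use values 1 and 2, and the column names are all hypothetical.

```python
# Type-1 query operators mapped to their SQL equivalents.
OPERATORS = {"and": "AND", "or": "OR", "and-not": "AND NOT"}


def rpn_to_sql(tokens, columns):
    """Translate a postfix query into a parameterized SQL WHERE clause.

    tokens  -- operands are (use_value, term) tuples, operators are strings
    columns -- hypothetical mapping from Use value to column name
    Returns (where_clause, parameters).
    """
    stack = []  # each entry is (sql_fragment, parameters_for_that_fragment)
    for tok in tokens:
        if isinstance(tok, tuple):
            use_value, term = tok
            if use_value not in columns:
                raise ValueError(f"no column mapped for Use attribute {use_value}")
            stack.append((f"{columns[use_value]} = ?", [term]))
        elif tok in OPERATORS:
            if len(stack) < 2:
                raise ValueError("malformed query: operator lacks operands")
            right_sql, right_params = stack.pop()
            left_sql, left_params = stack.pop()
            stack.append((f"({left_sql} {OPERATORS[tok]} {right_sql})",
                          left_params + right_params))
        else:
            raise ValueError(f"unknown operator {tok!r}")
    if len(stack) != 1:
        raise ValueError("malformed query: leftover operands")
    return stack[0]


# Hypothetical mapping for a specimen table.
columns = {1: "taxon", 2: "locality"}
where, params = rpn_to_sql([(1, "Quercus"), (2, "Ohio"), "and"], columns)
```

Keeping the terms as parameters rather than splicing them into the SQL text avoids quoting problems when the same clause is run against databases with different schemas.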

They do have plans to search across bibliographic and geo-spatial communities.
