ZIG Meeting
12/8/2000
Library of Congress
Z39.50 URL Session
Chair/Moderator: Kevin Gamiel
Notes provided primarily by Les Wibberley.
- Background and history of z39.50 URLs
- Issues: How is it being used today, is current form okay, if not, what
is needed, etc.
- Kevin's URL need: specify query, result set, and specific record from result
set
- goal: set in motion a process to determine the requirements, develop a
definition, and reach agreement
Ray: Background of the ZURL
- September 1994 - ZIG meeting at CNIDR - defined session and fetch url
- when users click on this url, it will open a z client, point to existing
service, construct query, and return result set with one or more items
- potential problems cited for this proposal were
- DocIDs change over time
- what about result set name
- transient or persistent result, etc.
- issue: want two separate URLs?
- search and retrieve url were defined
- resulted in RFC 2056 via the IETF in 1996
Dave Vieglais
- presentation on how University of Kansas is using Z39.50 URLs
- 150 million natural history museum records available from around the
world via Z39.50
- treat URLs as subset identifier - identifies subset of a database
- format used: z39.50/machine/database/subsetofinformation
- subset of information is operation + query
- used LDAP URL as an example
- point in hierarchical tree instead of database, scope of search, keyword=value
- applied this concept to z39.50 url
- z39.50url = protocol/machine/database/query
- uses Index Data's search syntax
- includes record syntax, etc.
- Z client is implemented as a protocol handler in Internet Explorer
- brings back XML, which is displayed by IE
- structure of the XML document generated is defined
- response envelope
- header section
- error section
- records section - contains records, translated from GRS-1 or USMARC
- uses GRS-1 tags as XML tags
- easy to transform this XML/GRS-1 record into whatever you want
- easy to transform into nicely formatted HTML, for example
- can embed z39.50 url into the result, and have IE submit a broadcast query
to resolve
- native z39.50 protocol support in Internet Explorer
- one issue: need for escape characters in url makes it somewhat messy
- architecture
- protocol handler is an internet to z39.50 client, registered with Explorer,
loads the interface, and invokes client
- nice architecture diagrams shown
- easy to use alternative approaches, like SOAP over HTTP to go to http server
- this is used as alternative approach, when no z39.50 server available (e.g.
http server to oracle)
- this model works well with z39.50 and other protocols
- does GRS-1 to XML translation for returning records, similar approach for
MARC records
- conclusions
- z url combined with XML provides extremely flexible IR solution
- use of common result set structure facilitates interop. between alternate
protocols (http, ldap, sql,...)
- benefit of z39.50 to biological community has been the concept of registered
attribute sets and schemas
- use these Z39.50 URLs heavily
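As a rough illustration of the protocol/machine/database/query shape and of
the escape-character issue noted in the presentation, here is a minimal Python
sketch. The scheme string, the "//" and "?" separators, the host and database
names, and the helper names build_zurl and parse_zurl are all assumptions made
for this example; they are not taken from the presentation or from RFC 2056.
The query string is sample data only.

```python
from urllib.parse import quote, unquote

SCHEME = "z39.50s"    # assumption: scheme label chosen for this sketch
DEFAULT_PORT = 210    # the registered Z39.50 port


def build_zurl(host, database, query, port=DEFAULT_PORT):
    """Compose scheme://host:port/database?query, escaping reserved characters."""
    # safe="" forces every reserved character (space, quote, '=', '@', '/')
    # to be percent-escaped, which is what makes the resulting URL look messy.
    return f"{SCHEME}://{host}:{port}/{quote(database, safe='')}?{quote(query, safe='')}"


def parse_zurl(url):
    """Split a URL produced by build_zurl back into its parts."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("missing '://' separator")
    authority, _, tail = rest.partition("/")
    host, _, port = authority.partition(":")
    database, _, query = tail.partition("?")
    return {
        "scheme": scheme,
        "host": host,
        "port": int(port) if port else DEFAULT_PORT,
        "database": unquote(database),
        "query": unquote(query),
    }


if __name__ == "__main__":
    url = build_zurl("example.org", "specimens", '@attr 1=4 "origin of species"')
    print(url)
    print(parse_zurl(url))
```

Escaping the whole query as one opaque string keeps the parser trivial, at the
cost of the readability problem raised above.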
Questions
- are there other current uses we didn't list?
- z39.50 URLs are used extensively
- is current usage so minimal that we can simply start over?
- if we generalize it, is this a job for the zig or who?
- can we encode entire z39.50 PDUs?
- define what follows the question mark in the url, to be broader than a
query
- how transient is a url?
- should we have result set offsets in a url?
- how do we address a specific result record for a system without a docID
concept
- can we stringify the query structure
- what about the content of the response?
- is it possible to provide a stateful implementation without need for alternative
z protocol identifiers?
Discussion
- parameters included in http URLs to do z39.50 searches are a good indicator
of what is needed in Z39.50 URLs
- there is no standardization of the query parms used in http queries - each
search engine is different
- what is the benefit of changing the current definition of the ZURL?
- the current spec doesn't go far enough, broaden the scope of what the url
can do
- major difference is that this will require taking the fields and defining
a boolean query on the client side
- current extensions would be defined as needed - stated in the IETF rfc
- Is this an extension, or a redefinition of the url?
- extension of the url would allow for backwards compatibility for existing
URLs
- there are several users of the current zurls, both the retrieval and search
versions
- Mike would like to see a new zurl, which defines equivalency to existing
url formats
- need to relax the restrictions in the current url for database and docID
- change to a keyword=value format for all parms beyond the ? in the url
- How about defining a URI scheme, rather than a URL?
- would remove need to conform to syntax of url
- want to remain consistent with url parsers
- Ray: proposal that a subset meet to gather requirements, factor in concerns,
and draft up a new url scheme
- not certain whether it will be fully compatible with current scheme
- Ralph: we need to carefully protect the namespace for the z39.50 url, and
not end up with a plethora of URLs
- Dave has implemented this in an experimental form, could adapt to change
- Dave would like group to focus on IR in general, rather than just Z39.50
- There is a crying need for a standard format for IR
- Agree, but make sure that the z39.50 url needs are fully addressed
- We need to be sure to address this correctly
- First, define a z39url which works well for us, then generalize it for
IR
- Is the IETF still the right place to go to address this new URL?
- there is a w3c interest group formed to address URIs, starting up in a
few weeks
- Mark: IETF is still the appropriate place to register new URLs
- This will be posted to the list, with invitation for participation
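For anyone drafting requirements, the keyword=value suggestion from the
discussion could look roughly like the Python sketch below. The parameter
names (query, resultset, start, count) and the helper names encode_params and
decode_params are hypothetical; no such names were agreed at the session, and
nothing here reflects the current RFC 2056 syntax.

```python
from urllib.parse import parse_qsl, urlencode


def encode_params(params):
    """Serialize keyword=value pairs for the part of the URL after the '?'."""
    # urlencode escapes reserved characters inside values, so a query that
    # itself contains '=' or '&' does not collide with the pair separators.
    return urlencode(params)


def decode_params(text):
    """Parse the part after '?' back into a dict of strings."""
    return dict(parse_qsl(text, keep_blank_values=True))


if __name__ == "__main__":
    # Hypothetical parameters covering a query, a result set name, and an
    # offset into the result set.
    params = {
        "query": "title=darwin and year>1850",
        "resultset": "default",
        "start": "11",
        "count": "10",
    }
    encoded = encode_params(params)
    print(encoded)
    print(decode_params(encoded))
```

Because the pairs follow the same convention as http query strings, existing
url parsers can handle them unchanged, which was one of the concerns raised.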