SRU (Search/Retrieval Using URL)

SRU VERSION 1.2

SRU Search/Retrieve Operation

SECTIONS: Parameters | Records | Result Sets | What's new in 1.2?

The searchRetrieve operation is the main operation in SRU. It allows the client to submit a search and retrieve request for matching records from the server.

PARAMETERS

Request Parameters

Name Mandatory/Optional? Description
operation mandatory The string: 'searchRetrieve'.
version mandatory The version of the request, and a statement by the client that it wants the response to be less than, or preferably equal to, that version. See Version.
query mandatory Contains a query expressed in CQL to be processed by the server. See CQL.
startRecord optional The position within the sequence of matched records of the first record to be returned. The first position in the sequence is 1. The value supplied MUST be greater than 0. The default value if not supplied is 1.
maximumRecords optional The number of records requested to be returned. The value must be 0 or greater. Default value if not supplied is determined by the server. The server MAY return less than this number of records, for example if there are fewer matching records than requested, but MUST NOT return more than this number of records.
recordPacking optional A string to determine how the record should be escaped in the response. Defined values are 'string' and 'xml'. The default is 'xml'. See Records.
recordSchema optional The schema in which the records MUST be returned. The value is the URI identifier for the schema or the short name for it published by the server. The default value if not supplied is determined by the server. See Record Schemas.

For version 1.1: If the recordXPath parameter is included, it is the abstract schema for purposes of evaluation by the XPath expression.
recordXPath which was in 1.1 is excluded in 1.2.
   
resultSetTTL optional The number of seconds for which the client requests that the result set created should be maintained. The server MAY choose not to fulfil this request, and may respond with a different number of seconds. If resultSetTTL is not supplied then the server will determine the value. See Result Sets.
sortKeys which was in 1.1 is excluded in 1.2.    
stylesheet optional A URL for a stylesheet. The client requests that the server simply return this URL in the response. See Stylesheets.
extraRequestData optional Provides additional information for the server to process. See Extensions.

Example:

This example is a search for the term "dinosaur", requesting that at most one record be returned, in the 'mods' schema.

Response Parameters

The response to an SRU searchRetrieve request is an XML document. The table below provides a summary and description of the elements provided by the XML document. The "Type" column indicates either an XML Schema type ("xsd:") or a type defined within the schema.

Name Type Mandatory/Opetional? Description
version xsd:string Mandatory The version of the response. This MUST be less than or equal to the version requested by the client. See Versions.
numberOfRecords xsd:integer Mandatory The number of records matched by the query. If the query fails this MUST be 0.
resultSetId xsd:string Optional The identifier for a result set that was created through the execution of the query. See Result Sets.
resultSetIdleTime xsd:integer Optional The number of seconds after which the created result set will be deleted. The result set may also become unavailable before this. See Result Sets.
records sequence of <record> Optional A sequence of records matched by the query, or surrogate diagnostics. See Records.
nextRecordPosition xsd:integer Optional The next position within the result set following the final returned record. If there are no remaining records, this field MUST be omitted.
diagnostics sequence of <diagnostic> Optional A sequence of non surrogate diagnostics generated during execution. See Diagnostics.
extraResponseData <xmlFragment> Optional Additional information returned by the server. See Extensions.
echoedSearch
RetrieveRequest
<echoedSearch
RetrieveRequest>
Optional The request parameters echoed back to the client in a simple XML form. See Echoing the Request.

RECORDS

All records in SRU are transferred in XML. Records are not assumed to be stored in XML. Records which are not natively XML must be first transformed into XML before being transferred. Records in the response may be expressed as a single string, or as embedded XML. If a record is transferred as embedded XML, it must be well-formed and should be validatable against the record schema.

The records parameter in the response is a sequence of record elements, each of which contains either a record or a surrogate diagnostic explaining why that particular record could not be transferred. If the record schema is unknown or the record cannot be rendered in that schema, then the server MUST return a diagnostic.

Each record element is structured into the following elements:

Record Parameters

Name Type Mandatory/Optional? Description
recordSchema xsd:string mandatory The URI identifier of the XML schema in which the record is encoded. Although the request may use the server's assigned short name, the response must always be the full URI. See Record Schemas.
recordPacking xsd:string mandatory The packing used in recordData, as requested by the client or the default. See below.
recordData <stringOrXmlFragment> mandatory The record itself, either as a string or embedded XML.
recordIdentifier
(new in 1.2)
xsd:string optional

An identifier for the record by which it can unambiguously be retrieved in a subsequent operation. For example via the 'rec.identifier' index in CQL.
recordPosition xsd:positiveInteger optional The position of the record within the result set. See Result Sets.
extraRecordData <xmlFragment> optional Any additional information to be transferred with the record. See Extensions.

An example record, in the simple Dublin Core schema, packed as XML:

<record>
  <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
  <recordPacking>xml</recordPacking>
  <recordData>
    <srw_dc:dc xmlns:srw_dc="info:srw/schema/1/dc-v1.1">
     <dc:title>This is a Sample Record</dc:title>
    </srw_dc:dc>
  </recordData>
  <recordPosition>1</recordPosition>
  <extraRecordData>
    <rel:score xmlns:rel="info:srw/extensions/2/rel-1.0">
      0.965
    </rel:rank>
   </extraRecordData>
</record>

Record Packing

In order that records which are not well formed do not break the entire message, it is possible to request that they be transferred as a single string with the <, > and & characters escaped to their entity forms. Moreover some toolkits may not be able to distinguish record XML from the XML which forms the response. However, some clients may prefer that the records be transferred as XML in order to manipulate them directly with a stylesheet which renders the records and potentially also the user interface.

This distinction is made via the recordPacking parameter in the request. If the value of the parameter is 'string', then the server should escape the record before transfering it. If the value is 'xml', then it should embed the XML directly into the response. Either way, the data is transfered within the 'recordData' field. If the server cannot comply with this packing request, then it must return a diagnostic.

RESULT SETS

SRU does not assume support of persistent result sets -- that a result set created by one request may necessarily be accessed by a client in a subsequent request. SRU does expect the server to state whether or not it supports persistent result sets, and if so the result set model described below is required.

There are applications in which result sets are critical; on the other hand there are applications in which result sets are not viable. An example of the first might be scientific investigation of a database with comparison of data sets produced at different times. An example of the latter might be a very frequently used database of web pages in which persistent result sets would be an impossible burden on the infrastructure due to the frequency of use.

Even if the server does not make result sets available for public manipulation, the following model is also important to understand in order to allow a single request to both match records and then sort them.

Result Set Model

Processing of a query results in the selection of a set of records, represented by a result set maintained at the server; logically it is an ordered list of references to the records. Once created, a result set cannot be modified. Any operation which would somehow change a result set instead creates a new result set. Each result set is referenced via a unique identifying string, generated by the server when the result set is created.

From the client's point of view, the result set is a set of records each referenced by an ordinal number, beginning at 1. The client may request a given record from a result set according to a specific schema. For example the client may request record 1 in Dublin Core, and subsequently request record 1 in MODS. The requested schema is not a property of the result set (nor of the requested records as a member of the result set); the result set is simply the ordered list of records.

A record might be deleted or otherwise become unavailable while a result sets which references that record still exists. If a client then requests that record, the server is expected to supply a surrogate diagnostic in place of the record. For example, if the record at position 2 in a result set is deleted and then a client requests records 1 through 3, the server should supply, in order: record 1, a surrogate diagnostic for record 2, record 3.

The records in a result set are not necessarily ordered according to any specific or predictable scheme, unless it has been created with a request that contains a sort specification as part of the query. See sorting in CQL for more information regarding the specifics of sorting. If search and sort specifications are supplied on the same request then only the final sorted result set is considered to exist, even if the server internally creates a result set and then sorts it.

resultSetId

If the server supports result sets, it may include a resultSetId in the searchRetrieve response, along with an idle time described below. If another query is submitted then the server will again supply a result set id. If the result of the query would modify an existing result set (for example, a request to sort an existing result set), then the server must supply a new id for this new set. The server should maintain unique names for each result set created, even if the result sets no longer exist, such that clients do not mistakenly request records from the new set when meaning to refer to the previous set with the same identifier.

resultSetIdleTime

The server may supply an idle time along with a result set. The server is making a good-faith estimate that the result set will remain available and unchanged (both in content and order) until a timeout (a period of inactivity exceeding the idle time). The idle time is an integer representing seconds; it must be a positive integer, and should not be so small that a client cannot realistically reference the result set again. If the server does not intend that the result set be referenced, it should omit the result set identifier in the response.

VERSION

In any actively developed protocol or piece of software, there is a concern about interoperability between different versions. In SRU, there is an explicit interoperability mechanism, with precisely defined semantics. The mechanism defined allows for clients and servers using different versions to interact without protocol level errors. Versions of SRU will always be recorded as strings of the format 'major.minor' where major and minor are independent integers.

Operations: All SRU 1.2 operations have a version parameter, with the exception of the parameterless form of the explain request.

For example:

http://lx2.loc.gov:210/LCDB?version=1.2&operation=searchRetrieve&query=dinosaur

The version parameter on a request both indicates the version of the request and is a statement by the client that it wants the response to be less than, or preferably equal to, that version. The version parameter in the response message is the version of the response. If the server cannot supply a response in that version or lower, then it must return a diagnostic. If possible this diagnostic would be in the version requested or lower, but that is not a requirement. Here are some examples of how this works in practice. If a 2.0 client asks a 1.1 server for a 2.0 response, then the server is able to respond with a 1.1 response as it is lower than version 2.0. If a 1.1 client asks a 2.0 server for a 1.1 response then the server is able to reduce its response version to accommodate the client. If a 1.1 client asks a 1.1 server for a 1.1 response, then there is no version mismatch and the server is able to accommodate the request. Version 1.0 Version 1.0 was an experiment, and has been officially deprecated. The version 1.0 implementation trial was very useful informing the development of version 1.1, the first non-experimental version. Version 1.0 does not have a version parameter in any of the requests or responses and hence cannot be considered to be part of this version interoperability system. If a client requests version 1.0, then the server may return a 1.0 response but is under no obligation to do so.Version Documentation and Changes Documentation for all versions of SRU will be maintained such that server and client authors are able to track changes between versions. A summary document of changes between versions will also be maintained. This version interoperability solution will only work so long as there are no additional mandatory parameters added to a request, and as such the SRU editorial board will endeavor to only ever add optional elements. If there is a requirement to add a mandatory parameter to a request, then this will be announced with as much prior warning as possible. While only the documentation for the most recent version of the protocol is maintained, the changes between versions are listed below. Note that although version 1.0 is officially abandoned, the changes from 1.0 to 1.1 are listed, for the benefit of those who have already implemented 1.0. » Changes between Versions

STYLESHEETS

In order to render the response, "thin" clients may provide a stylesheet to turn the response XML into a natively renderable format, often HTML or XHTML. This allows a web browser, or other application capable of rendering stylesheets, to act as a dedicated client without requiring any further application logic. The echoedRequest parameter on the response enables a client to use this stylesheet to also have the request it just made available without any client side logic.

Operations: All operations, other than the parameterless explain request, have the stylesheet parameter.

The value of the parameter is the URL of the stylesheet to be included in the response. This URL is to be included in the href attribute of the xml-stylesheet processing instruction before the response xml. It is likely that the type will be XSL, but not necessarily. If the server cannot fulfill this request it must supply a diagnostic. This parameter may not be used with SRU via SOAP. It is a SOAP error to return a stylesheet, and hence an error to request one. If this parameter is not supplied, then the server can, at its discretion, include a default stylesheet. The default stylesheet URL may be included in the explain document. For example, upon receiving the request . . .

http://lx2.loc.gov:210/LCDB?version=1.2&operation=searchRetrieve
&stylesheet=/master.xsl&query=dinosaur

. . . the server must include the following as beginning of the response:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="/master.xsl"?>
<sru:searchRetrieveResponse ...

ECHOING THE REQUEST

Very thin clients, such as a web browser with a stylesheet as above, may not have the facility to record the query that generated the response it has just received. In order to prevent clients having to maintain this information, the server may echo the request back to the client along with the response. There are no request elements associated with this functionality.

There is one response element per parameter for the operation in which the request is echoed. The name is the name of the response element prefixed by echoed. The parameters are rendered into XML, as per the SRU over SOAP request.

xQuery
xQuery is an additional parameter for searchRetrieve and scan, which has the query rendered in XCQL. This has two benefits:

  1. The client can use XSLT or other XML manipulation to modify the query without having a CQL query parser.
  2. The server can return extra information specific to the clauses within the query. See the next section on extensions for more information.

baseUrl
A server can include is own base URL in the echoed request. This allows the client to easily reconstruct queries by simple concatenation, or retrieve the explain document to fetch additional information such as the title and description to include in the results presented to the user. For example:

<echoedSearchRetrieveRequest>
<version>1.2</version>
<query>dc.title = dinosaur</query>
<recordSchema>mods</recordSchema>
<xQuery>
<searchClause xmlns="//www.loc.gov/zing/cql/xcql/">
<index>dc.title</index>
<relation>
<value>=</value>
</relation>
<term>dinosaur</term>
</searchClause>
</xQuery>
<baseUrl>http://lx2.loc.gov:210/lcdb</baseUrl>
</echoedSearchRetrieveRequest>


 

What's New in 1.2

  1. Record identifier (optional) added to record structure.
  2. XCQL parameter becomes optional.
  3. XPath parameter dropped (becomes an extension).
  4. base url added to response.
  5. record hits incorporated into xcql.
  6. Z39.92 replaces explain.
  7. Sorting is no longer a protocol function.