SRU (Search/Retrieval Using URL)

Implementation Stories

Ralph LeVan, April 2005

Pears is easy to use and there are lots of record handlers. SRW/U makes building interfaces trivial. So, I decided to provide searching for the SiteSearch documentation.

The database is built with Pears and a spidering record handler that returns records with the URL of the page in one field, the title from the page in another and the body in yet another. The database description made a phrase index for the URL, a phrase and keyword index for the title, a keyword index for the body and a keyword index combining all fields. This created a database with 724 records and 36K index terms from 371K nips.

I exposed the database via SRW/U. The URL for it is http://alcme.oclc.org/srw/search/SiteSearchDocumentation. You'll get back an Explain record with a stylesheet reference. The stylesheet renders a user interface. (It's not a very elegant interface and needs a little work.) The '=' relation gets you adjacency searches and the 'exact' relation gets you phrase searches. Try dc.title=pears.
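As a minimal sketch of the two relations described above, the clauses can be assembled as CQL strings. The helper name `cql_clause` and the sample search terms are hypothetical and are not part of the SRW/U code; only the `dc.title` index and the `=` and `exact` relations come from the text.

```python
def cql_clause(index: str, relation: str, term: str) -> str:
    """Build one CQL search clause, e.g. dc.title = "pears"."""
    # CQL needs double quotes around terms that contain whitespace.
    # Quoting every term is always permitted, so do it unconditionally.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{index} {relation} "{escaped}"'


# '=' relation: adjacency search on the title keyword index.
adjacency = cql_clause("dc.title", "=", "record handler")

# 'exact' relation: phrase search on the title phrase index.
phrase = cql_clause("dc.title", "exact", "pears")

print(adjacency)
print(phrase)
```

Either string can be typed into the search form or passed as the `query` parameter of a search URL.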

Now, one of the beauties of SRU is that you generate good URLs. So, here's that Pears search embedded in a URL:
http://alcme.oclc.org/srw/search/SiteSearchDocumentation?query=dc.title=pears&version=1.1&maximumRecords=10

There's a parameter on the search screen that controls how many records you get back. My default is 1.
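A search URL of this shape can also be built programmatically. The following is a rough sketch using only the Python standard library; the function name `search_url` and its defaults are assumptions made for illustration, while the base URL and the `query`, `version` and `maximumRecords` parameters are taken from the URL above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL of the database as given in the text.
BASE_URL = "http://alcme.oclc.org/srw/search/SiteSearchDocumentation"


def search_url(query: str, maximum_records: int = 10, version: str = "1.1") -> str:
    """Return an SRU search URL for the given CQL query."""
    params = {
        "query": query,
        "version": version,
        "maximumRecords": maximum_records,
    }
    # urlencode percent-encodes reserved characters, so the '=' inside
    # the CQL query becomes %3D; the server sees the same query once
    # the parameter has been decoded.
    return BASE_URL + "?" + urlencode(params)


url = search_url("dc.title=pears")
print(url)

# Decoding the query string recovers the original CQL query.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Passing a different `maximum_records` value changes how many records come back in one response, just as the parameter on the search screen does.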

All this code is checked into my CVS repository, if you want to pull it yourself. Otherwise, I'll make a new Pears jar soon.


Janifer Gattenby, October 2005

The DBNG (Digital Bibliography of Dutch History - Digitale Bibliographie voor de Nederlandse Geschiedenis) is a new database realized as a joint project between OCLC PICA and the Koninklijke Bibliotheek (KB), the Dutch Royal Library. The database was formed by combining four separate databases, de-duplicating them and harmonising names, classifications and subject headings. It now has more than 200,000 titles covering books, periodical titles, articles and some abstracts and summaries. Whilst OCLC PICA created the database using PSI (Pica Search and Index Engine), which is SRU-enabled, the KB developed a web-based user interface that includes an SRU client. Via SRU, there are more than 20 search access points and search limiters, most of them also enabled for scanning. The searching is rich: it includes date-range searching, keyword truncation, Boolean and proximity searching, and sorting by year of publication and by relevance.

Result data can be returned in one of six XML schemas: short or full versions of Dublin Core (DC), UNIMARC or PicaMARC.

The database is available at: http://www.dbng.nl/ .