Authorities in Evergreen 2.0
============================
:author: Dan Scott
:copyright: 2011 Laurentian University
:backend: slidy
:data-uri:
:max-width: 45em
:icons:
:duration: 45

This talk is licensed under a http://creativecommons.org/licenses/by-sa/2.5/ca/[Creative Commons, Attribution, Share Alike license].

image::images/cc_by_sa_360.png[]

Presentation source: http://bzr.coffeecode.net/eg2011_authorities/

Quick overview of authorities
-----------------------------
image::images/william_kate.png[]

Quick overview of authorities
-----------------------------
image::images/william_kate_xout.png[]

Quick overview of authorities
-----------------------------
* Supports a controlled vocabulary for names, subjects, and titles
** Also 'floating subdivisions' (form, chronological, geographical, general)
* Implemented as a separate set of MARC records, with one authorized heading per authority record
* Each authority record can include tracings
** 'See reference' leads from an unauthorized heading to an authorized heading
** 'See also' reference leads from one authorized heading to another authorized heading
* As an authorized heading evolves over time ('Cookery' -> 'Cooking', or addition of death dates to names), changes should propagate automatically from the authority record to the bibliographic records using that heading

Authorities in Evergreen 1.x
----------------------------
* Features:
** Import and forget
** MARC editor: *Validate* button and context menu
** Search assistance (facets)
* Limitations:
** No way to edit or delete authority records
** No linkage between authorities and bibs
** No way to search authorities
** No way to publish authorities

Authorities in Evergreen 2.0 / 2.1
----------------------------------
* Features:
** Import, edit, and delete authorities
** Bibs are linked to authority records instead of just string matches
** Edits to authority records automatically apply to linked bibs
** MARC editor: select and apply, including 'see from' and 'see also' entries
** Duplicate authorities are prevented by default
** Search and publish authorities via SRU and Z39.50 ('2.1')
* Limitations:
** Controlled fields are hard-coded
** No linkage between authorities (4xx 'see from', 5xx 'see also', 7xx 'reciprocal' entries)
** Floating subdivisions are just another subfield
** No real concept of thesauri (LCSH vs. MeSH vs. LAC)

Getting authorities into Evergreen 2.0 / 2.1
--------------------------------------------
* Largely unchanged since 1.6.x; two options:
** *MARC Batch Import*: best for batches of 5,000 records or fewer
** `marc2are.pl` script: best for bulk loads

image::images/staff_client_vandelay_import.png[]

Don't fear the command line: import
-----------------------------------
[source, bash]
------------------------------------------------------------------------------
perl marc2are.pl --user admin --pass secret authorities.mrc |
    perl pg_loader.pl --auto are --order are |
    psql -U evergreen -h localhost -d evergreen
------------------------------------------------------------------------------

* Advantages:
** Fast - no overhead
** With piped input, uses parallel processing
* Disadvantages:
** Requires command-line and database access

Managing authority records
--------------------------
* Once authority records have been loaded, the **Cataloging** -> **Manage Authorities** menu entry becomes useful

image::images/authority_manager.png[]

ifndef::backend-slidy[]
** *Search* each authority type
** Results display in alphabetical order, along with the number of linked bibliographic records for each
** Page through results one at a time, or jump directly to a different page of results
** *Actions* for each record are `Edit`, `Delete`, and `Merge`
endif::backend-slidy[]

Editing an authority record
---------------------------
* **Actions -> Edit** opens the authority record in a MARC editor:

image::images/authority_edit.png[]

Editing an authority record: result
-----------------------------------
image::images/authority_edit_result.png[]

Merging authority records
-------------------------
* **Actions -> Merge** enables you to merge two or more authority records
* Bibliographic records are linked to the master authority record

image::images/authority_merge.png[]

Controlling fields manually
---------------------------
* Fields in bibliographic records that are controlled by an authority record will be linked via a ‡0 subfield per MARC rules
* Right-clicking a field in the MARC Editor triggers an alphabetical browse of available authority records, including 4xx and 5xx matches
* Ability to create an authority record on the fly

Browsing authority records
--------------------------
* The classic MARC editor offers an authority browser on controllable fields

image::images/authority_browse.png[]

Browsing authority records (raw)
--------------------------------
* Browse interface exposes MARCXML records to the Web
** Template: http://'hostname'/opac/extras/'browse-type'/marcxml/'index'/'scope'/'term'/'page'/'per-page'
*** 'browse-type' can be `browse` to place closest match in the middle of results, or `startwith` to place closest match at start of results
*** 'index' can be `authority.author`, `authority.subject`, `authority.title`, `authority.topic` to search `1xx ‡a`
**** `.refs` index variants such as `authority.author.refs` include 4xx and 5xx fields in the search criteria
*** 'scope' is effectively always `1` (entire consortium); authority records have an owning library, but no interface is exposed to manipulate that
*** 'term' is the string to match against the `‡a` field
*** 'page' is the page of results to display; this can be a positive or negative integer
*** 'per-page' is the number of results to display per page
* Example: http://'hostname'/opac/extras/browse/marcxml/authority.title.refs/1/rowling/0/10

Authority search / browse indexes (2.0/2.1)
-------------------------------------------
* Authority categories map to main entry fields as follows:
** Author: `100‡a`, `110‡a`, `111‡a`
** Subject: `148‡a`, `150‡a`, `151‡a`, `155‡a`
** Title: `130‡a`
** Topic: `150‡a`
* When you launch the authority browser from a bibliographic field, the bib field's tag is replaced with a `1` and the browse is launched according to the preceding categories
** For example, the editor converts `100`, `600`, `700`, `800` to `100` and an 'author' browse is launched
** More sophisticated mapping exists, covering the case of bib `240` to auth `130`, but is not currently in use

Controlling uncontrolled bibliographic records
----------------------------------------------
* A http://svn.open-ils.org/trac/ILS/browser/trunk/Open-ILS/src/support-scripts/authority_control_fields.pl[script] is available to link bibliographic records against authorities
* De-duplicate your authority records first (you're on your own for that)
* This script is **slow**; run it in parallel against subsets of your database:

[source,bash]
------------------------------------------------------------------------------
perl authority_control_fields.pl --start 1 --end 50000
perl authority_control_fields.pl --start 50001 --end 100000
perl authority_control_fields.pl --start 100001 --end 150000
------------------------------------------------------------------------------

* 'Slow' = ~4 days to control 1,000,000 records with no parallelism on our slow test server

Controlling every subfield in a field
-------------------------------------
* Controlling everything other than subfield `0` is a simple matter of changing `/openils/var/web/xul/server/cat/marcedit.js`:

[source,javascript]
------------------------------------------------------------------------------
/* Filter out subfields that are not controlled for this tag */
if (!control_map[source_f.tag][sf_iter.@code.toString()]) {
    continue;
}
------------------------------------------------------------------------------

to:

[source,javascript]
------------------------------------------------------------------------------
/* Filter out only subfield 0; every other subfield is controlled */
if (sf_iter.@code.toString() === '0') {
    continue;
}
------------------------------------------------------------------------------

Publishing authorities
----------------------
* As of Evergreen 2.1, authorities will be exposed via SRU with the following indices:
** `id`: the numeric ID of the authority record
** `author`: browses name headings ('100', '110', '111')
** `subject`: browses all subjects ('148', '150', '151', '155')
** `title`: browses uniform titles ('130')
** `topic`: browses just topical terms ('150')
* Example: http://localhost/opac/extras/sru_auth?version=1.1&operation=searchRetrieve&query=author%3Dwill browses the 'author' index for entries beginning with 'will'
** *Limitation*: as there is no 'all' index for all authorities, simple queries that do not define an index currently fail
* SRU leaves us just one step away from offering a Z39.50 server, thanks to Simple2ZOOM

Modifying field/subfield control maps sanely
--------------------------------------------
* Mike Rylander has a set of patches in the works that define field/subfield control sets in the database, rather than in code
** Still needs a configuration interface for modifying the defaults (although the current defaults are much more extensive)
** Still needs to be hooked into the *MARC Editor* interface
** Likely landing point will be *Evergreen 2.2*
* Limitations in 2.2 (currently):
** No linkage between authorities ('see from', 'see also', 'reciprocal' entries)
** Floating subdivisions are still just another subfield
** No real concept of thesauri (LCSH vs. MeSH vs. LAC)

Acknowledgements
----------------
* Laurentian University, University of Windsor, and the Conifer consortium
* International Institute of Social History
* Mike Rylander
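
Appendix: chunking IDs for parallel control runs
------------------------------------------------
The `--start` / `--end` invocations shown under 'Controlling uncontrolled bibliographic records' follow a regular pattern, so they can be generated rather than typed by hand. The following is a minimal sketch and not part of Evergreen: the function name `print_control_commands` and its two arguments (highest record ID and chunk size) are hypothetical, and the function only prints the commands. How the printed commands are launched in parallel is left to the operator.

[source,bash]
------------------------------------------------------------------------------
# Print one authority_control_fields.pl command per ID range.
# Hypothetical helper, not shipped with Evergreen.
#   $1 = highest bibliographic record ID to process (assumed known)
#   $2 = number of record IDs per chunk
print_control_commands() {
    local max_id=$1 chunk=$2 start=1 end
    while [ "$start" -le "$max_id" ]; do
        end=$((start + chunk - 1))
        # Clamp the final chunk so it never runs past the highest ID
        if [ "$end" -gt "$max_id" ]; then
            end=$max_id
        fi
        echo "perl authority_control_fields.pl --start $start --end $end"
        start=$((end + 1))
    done
}

print_control_commands 150000 50000
------------------------------------------------------------------------------

With the arguments shown, the function prints the same three commands listed on the earlier slide; each printed line can then be run in its own shell session.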