SRU VERSION 1.1 ARCHIVE
The CQL Context Set
Indexes - Relations - Relation Modifiers - Relation Qualifiers - Boolean Modifiers
The CQL context set defines a set of indexes, relations and relation
modifiers. The indexes supplied are 'utility' indexes which do not directly
reference any data. These utility indexes are for instances when CQL
is required to express a concept not directly related to the records.
Historical note: In CQL version 1.0, this was the 'srw' index set. Implementers
may wish to accept 'srw' as a reserved name for the identifier 'http://www.loc.gov/zing/cql/srw-indexes/v1.0/'
with the same semantics as below. srw.resultSetName has been renamed
to cql.resultSetId for consistency.
The reserved name for this context set is: cql
The identifier for this context set is: info:srw/cql-context-set/1/cql-v1.1
Indexes
-
resultSetId
A search clause may be a result set id. This is a special case, where
the index and relation are expressed as "cql.resultSetId =" and
the term is the result set id returned by the server in the 'resultSetId'
parameter of the response. It may be used by itself in a query
to refer to an existing result set from which records are desired.
It may also be used in conjunction with other resultSetId clauses
or other indexes, combined by boolean operators. The semantics
of resultSetId with relations other than "=" are undefined.
-
serverChoice
This is the default when the index and relation are omitted from a
search clause. 'cql.serverChoice' means that the server will choose
an index for the given term. The relation used is 'scr', hence
'cql.serverChoice scr "term"' is an equivalent search clause to
'"term"'.
-
anywhere
This means "search all indexes from all context sets you know". (By
contrast, cql.serverChoice means essentially "search any index --
your choice -- from any context set you know".)
-
allRecords
A special index which matches every record available. Every
record is matched no matter what values are provided for the relation
and term, but the recommended syntax is: cql.allRecords = 1.
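The utility indexes above can be illustrated with a few string-building helpers. This is a minimal sketch, not part of the specification: the function names, the result set id "rs42" and the use of parentheses around each clause are assumptions made for illustration, and terms are assumed to be already escaped.

```python
# Hypothetical helpers for building clauses that use the cql utility indexes.
# Terms are assumed to be already escaped; no escaping is done here.

ALL_RECORDS = "cql.allRecords = 1"  # recommended syntax for matching every record


def quote(term: str) -> str:
    """Wrap an already-escaped term in double quotes."""
    return f'"{term}"'


def expand_bare_term(term: str) -> str:
    """Spell out the defaults implied when index and relation are omitted."""
    return f"cql.serverChoice scr {quote(term)}"


def result_set_clause(result_set_id: str) -> str:
    """Refer to an existing result set; only '=' has defined semantics."""
    return f"cql.resultSetId = {quote(result_set_id)}"


def combine(operator: str, *clauses: str) -> str:
    """Join clauses with one boolean operator, parenthesising each clause."""
    if operator not in ("and", "or", "not"):
        raise ValueError(f"unsupported boolean operator: {operator}")
    return f" {operator} ".join(f"({clause})" for clause in clauses)


query = combine("and", result_set_clause("rs42"), 'dc.title any "fish"')
```

Here `query` narrows a previously returned result set by a further search clause, which is the combined use described for resultSetId.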
Relations
Implicit Relations
These relations are defined as such in the grammar of CQL. The cql context
set only defines their meaning, rather than their existence.
Default Relations
These relations are defined as being widely useful as part of a default
context set.
-
scr is used to mean "server choice relation". It
is used when the client wishes the server to choose the most appropriate
relation for the index or term. It is assumed when relation is omitted.
-
exact is used for exact string matching, when the
term is a character string. =/cql.string is synonymous.
-
all and any may be used when the
term contains multiple items to indicate "all of these items" or "any
of these items". These queries could be expressed using boolean AND
and OR respectively. These relations have an implicit relation modifier
of 'cql.word'.
-
within may be used with a search term that has
multiple dimensions. It matches if the database's term falls completely
within the range, area or volume described by the search term. For
example: dc.date within "2002 2003"
-
encloses may be used when the index's data has
multiple dimensions. It matches if the database's term fully encloses
the search term. For example: xxx.dateRange encloses 2002
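The relations above can be sketched in code. The following is an illustration under stated assumptions rather than a definition: words are split on whitespace (the server's own definition of a word may differ), each word is searched with '=', ranges are one-dimensional, and range bounds are treated as inclusive. All function names are hypothetical.

```python
def expand_word_relation(index: str, relation: str, term: str) -> str:
    """Rewrite an 'all' or 'any' clause as a boolean query over single words."""
    operators = {"all": "and", "any": "or"}
    if relation not in operators:
        raise ValueError("only 'all' and 'any' can be expanded this way")
    words = term.split()  # assumption: whitespace-separated words
    clauses = [f'{index} = "{word}"' for word in words]
    return f" {operators[relation]} ".join(clauses)


def within(db_value: float, low: float, high: float) -> bool:
    """'within': the database value lies inside the range given by the term."""
    return low <= db_value <= high  # assumption: inclusive bounds


def encloses(db_low: float, db_high: float, value: float) -> bool:
    """'encloses': the database range fully contains the search term."""
    return db_low <= value <= db_high  # assumption: inclusive bounds
```

For instance, `within(2002, 2002, 2003)` mirrors dc.date within "2002 2003", and `encloses(2000, 2005, 2002)` mirrors xxx.dateRange encloses 2002 for a record whose range is 2000 to 2005.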
Relation Modifiers
Term Functions
These relation modifiers request that the server perform some algorithm
on each item within the term before processing. If named algorithms
are required, then further context sets should define relation modifiers
for these.
-
stem
The server should apply a stemming algorithm to the words within
the term, for example so that 'computing' and 'computer' both match
the stem 'compute'.
-
relevant
The server should use a relevancy algorithm for determining matches
and the order of the result set.
-
phonetic
The server should use a phonetic algorithm for determining words
which sound like the term.
-
fuzzy
The server should be liberal in what it counts as a match. The exact
details of this are left up to the server, but might include permutations
of character order, off-by-one for numerical terms and so forth.
Relation Qualifiers
These modifiers qualify the relation to more precisely determine its
semantics.
-
partial
When used with within or encloses, there may be some section which
extends outside the term. This permits the database term to
be partially enclosed, or to fall partially within the search term.
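One way to read the 'partial' qualifier is as an overlap test. The sketch below assumes one-dimensional ranges with inclusive ends, and the names `Interval`, `fully_within` and `partially_within` are hypothetical. A server may define partial matching differently, particularly for areas or volumes.

```python
from typing import Tuple

# (low, high); both ends are treated as inclusive, which is an assumption.
Interval = Tuple[float, float]


def fully_within(db: Interval, term: Interval) -> bool:
    """Plain 'within': the database range lies entirely inside the term."""
    return term[0] <= db[0] and db[1] <= term[1]


def partially_within(db: Interval, term: Interval) -> bool:
    """'within' qualified by 'partial': any overlap counts as a match."""
    return db[0] <= term[1] and term[0] <= db[1]
```

A database range of 2001 to 2002 tested against the term "2002 2003" fails `fully_within` but passes `partially_within`, because only the 2002 end falls inside the search range.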
Term Format
These relation modifiers describe the format or structure of the
term in some fashion.
-
word
The term should be broken into words, according to the server's definition
of a 'word'.
-
string
The term is a single item, and should not be broken up.
-
isoDate
Each item within the term conforms to the ISO 8601 specification
for expressing dates.
-
number
Each item within the term is a number.
-
uri
Each item within the term is a URI.
-
masked (default modifier)
The following masking rules and special characters apply for search
terms, unless overridden in a profile via a relation modifier.
To explicitly request this functionality, add 'cql.masked' as a
relation modifier.
-
A single asterisk (*) is used to mask zero or more characters.
-
A single question mark (?) is used to mask a single character,
thus N consecutive question-marks means mask N characters.
-
Caret/hat (^) is used as an anchor character for terms that
are word lists, that is, where the relation is 'all' or 'any',
or '=' when used for word adjacency. It may not be used to anchor
a string, that is, when the relation is 'exact' (string matches are,
by default, anchored). It may occur at the beginning or end of
a word (with no intervening space) to mean right or left anchored. "^" has
no special meaning when it occurs within a word (not at the beginning
or end) or string but must be escaped nevertheless.
- Backslash (\) is used to escape '*', '?', quote (") and '^',
as well as itself. Backslash not followed immediately by one
of these characters is an error.
See masking examples below.
- unmasked
Do not apply masking rules.
- oid
The term is an ISO object identifier, dot-separated format. Example
'zeerex.set exact/cql.oid "1.2.840.10003.3.1"'
Masking examples:
-
dc.title = c*t (matches cat and coast etc.)
dc.title = "*fish food*" (matches unanchored 'fish food')
-
dc.title = c?t (matches cat and cot, not
coast or ct)
" ?" (matches any single character)
-
dc.title = "^cat in the hat" (matches 'cat
in the hat' where it is at the beginning of the field)
dc.title any "^cat ^dog eats rat" (matches 'cat eats rat', 'dog
eats cat', 'cat loves bat', but not 'bat loves cat')
-
dc.title = "\"Of Course\" she said"
dc.identifier exact "\\\"\^\*\?andSomeMoreCharacters"
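The masking and escaping rules can be sketched as a translation from a masked word to a regular expression. This is a minimal illustration, not a reference implementation: the names `escape_literal` and `mask_to_regex` are hypothetical, the sketch handles a single word or string, and an unescaped '^' at either end is simply dropped because anchoring against the field is a separate concern not modelled here.

```python
import re

SPECIAL = '*?^"\\'  # characters that the masking rules require to be escaped


def escape_literal(text: str) -> str:
    """Backslash-escape every special character so the text matches literally."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)


def mask_to_regex(term: str) -> re.Pattern:
    """Translate one masked word into a regex meant for use with fullmatch()."""
    parts = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "\\":
            # A backslash must be followed by one of the special characters.
            if i + 1 >= len(term) or term[i + 1] not in SPECIAL:
                raise ValueError("backslash not followed by an escapable character")
            parts.append(re.escape(term[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")  # zero or more characters
        elif ch == "?":
            parts.append(".")   # exactly one character
        elif ch == "^":
            # Anchor at the start or end of the word: dropped in this sketch.
            if i not in (0, len(term) - 1):
                raise ValueError("'^' inside a word must be escaped")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)
```

With these helpers, `mask_to_regex("c?t").fullmatch("cot")` succeeds while `mask_to_regex("c?t").fullmatch("coast")` does not, and `escape_literal` reproduces the escaped identifier term shown in the last example above.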
Boolean Modifiers
The CQL context set defines four boolean modifiers, which are only used
with the prox boolean operator.