The CQL context set defines a set of indexes, relations and relation modifiers. The indexes supplied are 'utility' indexes which do not directly reference any data. These utility indexes are for instances when CQL is required to express a concept not directly related to the records.
Historical note: In CQL version 1.0, this was the 'srw' index set. Implementers may wish to accept 'srw' as a reserved name for the identifier 'http://www.loc.gov/zing/cql/srw-indexes/v1.0/' with the same semantics as below. srw.resultSetName has been renamed to cql.resultSetId for consistency.
The reserved name for this context set is: cql
The identifier for this context set is: info:srw/cql-context-set/1/cql-v1.1
resultSetId
A search clause may be a result set id. This is a special case, where
the index and relation are expressed as "cql.resultSetId =" and the
term is the result set id returned by the server in the 'resultSetId'
parameter of the response. It may be used by itself in a query to
refer to an existing result set from which records are desired. It
may also be used in conjunction with other resultSetId clauses or
other indexes, combined by boolean operators. The semantics of resultSetId
with relations other than "=" are undefined.
serverChoice
This is the default when the index and relation are omitted from a search
clause. 'cql.serverChoice' means that the server will choose an index
for the given term. The relation used is 'scr', hence 'cql.serverChoice
scr "term"' is an equivalent search clause to '"term"'.
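The defaulting rule above can be sketched in Python. The SearchClause type and the normalize and to_cql helpers are hypothetical names chosen for illustration; they are not part of CQL or of any toolkit. The sketch assumes the term is already escaped for use inside double quotes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SearchClause:
    """A parsed search clause; index and relation are None when omitted."""
    term: str
    index: Optional[str] = None
    relation: Optional[str] = None


def normalize(clause: SearchClause) -> SearchClause:
    """Fill in cql.serverChoice and scr when both index and relation are absent."""
    if clause.index is None and clause.relation is None:
        return SearchClause(clause.term, "cql.serverChoice", "scr")
    return clause


def to_cql(clause: SearchClause) -> str:
    """Serialize a fully specified clause (assumes the term is already escaped)."""
    return f'{clause.index} {clause.relation} "{clause.term}"'
```

Under these assumptions, a bare term and its explicit form normalize to the same clause, so a server could apply one code path to both.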
anywhere
This means "search all indexes from all context sets you know". (By
contrast, cql.serverChoice means essentially "search any index -- your
choice -- from any context set you know".)
allRecords
A special index which matches every record available. Every
record is matched no matter what values are provided for the relation
and term, but the recommended syntax is: cql.allRecords = 1.
Implicit Relations
These relations are defined as such in the grammar of CQL. The cql context
set only defines their meaning, rather than their existence.
<, >, <=, and >= retain their regular meanings as relations pertaining to ordered terms
= is used:
For word adjacency, when the term is a list of words. That is to say that the words appear in that order with no others intervening.
Otherwise, for exact equality of value.
<> is 'not equal to'.
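One possible reading of the two uses of = is sketched below in Python. The function name equals_matches is hypothetical, and the sketch assumes whitespace-separated words, case-sensitive comparison and no masking characters; a real server would apply its own definition of a word.

```python
def equals_matches(field_value: str, term: str) -> bool:
    """Sketch of '=': word adjacency for multi-word terms, else exact equality."""
    term_words = term.split()
    if len(term_words) > 1:
        # Word adjacency: the term's words must appear in the field
        # in the same order with no other words intervening.
        field_words = field_value.split()
        n = len(term_words)
        return any(
            field_words[i:i + n] == term_words
            for i in range(len(field_words) - n + 1)
        )
    # Single item: exact equality of value.
    return field_value == term
```

For instance, under these assumptions the term "cat in" matches a field containing "the cat in the hat" but not "the cat sat in the hat".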
Default Relations
These relations are defined as being widely useful as part of a default
context set.
scr is used to mean "server choice relation". It is used when the client wishes the server to choose the most appropriate relation for the index or term. It is assumed when relation is omitted.
exact is used for exact string matching, when the term is a character string. The relation =/cql.string is synonymous.
all and any may be used when the term contains multiple items to indicate "all of these items" or "any of these items". These queries could be expressed using boolean AND and OR respectively. These relations have an implicit relation modifier of 'cql.word'.
within may be used with a search term that has multiple dimensions. It matches if the database's term falls completely within the range, area or volume described by the search term. For example: dc.date within "2002 2003"
encloses may be used when the index's data has multiple dimensions. It matches if the database's term fully encloses the search term. For example: xxx.dateRange encloses 2002
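The equivalence between all/any and boolean AND/OR noted above can be illustrated with a small Python rewrite. The expand helper is a hypothetical name, and the sketch assumes that words are separated by whitespace (the server's definition of a word may differ) and that no word contains a double quote.

```python
def expand(index: str, relation: str, term: str) -> str:
    """Rewrite an 'all' or 'any' clause as single-word clauses joined by and/or."""
    operators = {"all": "and", "any": "or"}
    if relation not in operators:
        raise ValueError(f"expected 'all' or 'any', got {relation!r}")
    words = term.split()
    if not words:
        raise ValueError("term contains no words")
    clauses = [f'{index} = "{word}"' for word in words]
    joined = f" {operators[relation]} ".join(clauses)
    # Parenthesize only when there is more than one clause to combine.
    return f"({joined})" if len(clauses) > 1 else joined
```

With these assumptions, expand("dc.title", "any", "cat dog") yields a query that ORs two single-word clauses on dc.title.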
Term Functions
These relation modifiers request that the server perform some algorithm
on each item within the term before processing. If named algorithms are
required, then further context sets should define relation modifiers
for these.
stem
The server should apply a stemming algorithm to the words within the
term, for example so that 'computing' and 'computer' both match the
stem 'compute'.
relevant
The server should use a relevancy algorithm for determining matches
and the order of the result set.
phonetic
The server should use a phonetic algorithm for determining words which
sound like the term.
fuzzy
The server should be liberal in what it counts as a match. The exact
details of this are left up to the server, but might include permutations
of character order, off-by-one for numerical terms and so forth.
Relation Qualifiers
These modifiers qualify the relation to more precisely determine its semantics.
partial
When used with within or encloses, there may be some section which
extends outside the term. This permits the database term to be
partially enclosed, or to fall partially within the search term.
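For one-dimensional data such as date ranges, within, encloses and the partial qualifier might be sketched in Python as follows. This is an illustration under stated assumptions, not a definition: terms are modelled as closed intervals (low, high), the function names are hypothetical, and partial is interpreted here as "the two intervals overlap", which is only one plausible reading of the text above.

```python
from typing import Tuple

Interval = Tuple[float, float]  # closed interval (low, high), with low <= high


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def within(db: Interval, search: Interval, partial: bool = False) -> bool:
    """True if the database term falls within the search term's range."""
    if partial:
        # Assumption: partial containment is treated as any overlap.
        return overlaps(db, search)
    return search[0] <= db[0] and db[1] <= search[1]


def encloses(db: Interval, search: Interval, partial: bool = False) -> bool:
    """True if the database term encloses the search term."""
    if partial:
        # Assumption: partial enclosure is treated as any overlap.
        return overlaps(db, search)
    return db[0] <= search[0] and search[1] <= db[1]
```

Modelling a single year as an interval with equal endpoints, a database date of 2002 falls within the search range 2002 to 2003, and a database range of 2000 to 2005 encloses the search term 2002.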
Term Format
These relation modifiers describe the format or structure of the term
in some fashion.
word
The term should be broken into words, according to the server's definition
of a 'word'.
string
The term is a single item, and should not be broken up.
isoDate
Each item within the term conforms to the ISO 8601 specification for
expressing dates.
number
Each item within the term is a number.
uri
Each item within the term is a URI.
masked (default modifier)
The following masking rules and special characters apply for search
terms, unless overridden in a profile via a relation modifier. To
explicitly request this functionality, add 'cql.masked' as a relation
modifier.
A single asterisk (*) is used to mask zero or more characters.
A single question mark (?) is used to mask a single character, thus N consecutive question marks mask N characters.
Caret/hat (^) is used as an anchor character for terms that are word lists, that is, where the relation is 'all' or 'any', or '=' when used for word adjacency. It may not be used to anchor a string, that is, when the relation is 'exact' (string matches are, by default, anchored). It may occur at the beginning or end of a word (with no intervening space) to mean right or left anchored. "^" has no special meaning when it occurs within a word (not at the beginning or end) or string, but must be escaped nevertheless.
Masking examples:
dc.title = c*t (matches cat and coast etc.)
dc.title = "*fish food*" (matches unanchored 'fish food')
dc.title = c?t (matches cat and cot, not coast or ct)
" ?" (matches any single character)
dc.title = "^cat in the hat" (matches 'cat in the hat' where it is at the beginning of the field)
dc.title any "^cat ^dog eats rat" (matches 'cat eats rat', 'dog eats cat', 'cat loves bat', but not 'bat loves cat')
dc.title = "\"Of Course\" she said"
dc.identifier exact "\\\"\^\*\?andSomeMoreCharacters"
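The masking rules for * and ?, together with backslash escaping, can be approximated by translating a single masked item into a regular expression. The Python sketch below is illustrative only: mask_to_regex and masked_match are hypothetical names, the whole value must match the pattern, and unescaped ^ anchors are deliberately rejected because their meaning depends on word position within the field, which this sketch does not model.

```python
import re


def mask_to_regex(term: str) -> re.Pattern:
    """Translate a masked item into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "\\":
            # A backslash makes the following character literal.
            if i + 1 >= len(term):
                raise ValueError("dangling backslash at end of term")
            parts.append(re.escape(term[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        elif ch == "^":
            raise ValueError("unescaped '^' anchors are outside this sketch")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def masked_match(term: str, value: str) -> bool:
    """True if the whole value matches the masked term."""
    return mask_to_regex(term).fullmatch(value) is not None
```

Under these assumptions the behaviour agrees with the masking examples: c*t matches cat and coast, while c?t matches cat and cot but not coast or ct.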
Boolean Modifiers
The CQL context set defines four boolean modifiers, which are only used with the prox boolean operator.
unordered
The order of the two terms is unimportant. This is the default.