CQL: Contextual Query Language
Specification
See Also CQL Context Sets
Query Syntax
- CQL Query
A CQL query consists of either a single search clause [example 1], or
multiple search clauses connected by boolean operators [example 2].
It may have a sort specification at the end, following the 'sortBy'
keyword [example 3]. In addition it may include prefix assignments which
assign short names to context set identifiers [example 4].
Examples:
- dc.title any fish
- dc.title any fish or dc.creator any sanderson
- dc.title any fish sortBy dc.date/sort.ascending
- > dc = "info:srw/context-sets/1/dc-v1.1" dc.title any fish
- Search Clause
A search clause consists of either an index, relation and a search term
[example 1], or a search term by itself [example 2]. If the clause consists
of just a term, then the index is treated as 'cql.serverChoice', and
the relation is treated as '=' [example 3]. (Treated differently in
versions 1.1 and 1.2. See note 1.)
Examples:
- dc.title any fish
- fish
- cql.serverChoice = fish
- Search Term
Search terms MAY be enclosed in double quotes [example 1], though need
not be [example 2]. Search terms MUST be enclosed in double quotes if
they contain any of the following characters: < > = / ( ) and
whitespace [example 3]. The search term may be an empty string [example
4], but must be present in a search clause. The empty search term has
no defined semantics.
Examples:
- "fish"
- fish
- "squirrels fish"
- ""
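The quoting rule above can be applied mechanically when building queries. The following is a minimal Python sketch, not part of the specification. The function name quote_term is hypothetical, and escaping an embedded double quote with a backslash is an assumption drawn from the charString2 definition in the BNF. Terms that themselves contain backslashes are not handled.

```python
import re

# Characters that force quoting under the rule above (< > = / ( ) and
# whitespace), plus the double quote itself, which is an assumption here.
_NEEDS_QUOTES = re.compile(r'[\s<>=/()"]')


def quote_term(term: str) -> str:
    """Return `term` in a form usable as a CQL search term (sketch only)."""
    if term == "" or _NEEDS_QUOTES.search(term):
        # Escape embedded double quotes with a backslash, then wrap the term.
        return '"' + term.replace('"', '\\"') + '"'
    # No special characters: quoting is optional, so leave the term bare.
    return term


if __name__ == "__main__":
    for raw in ["fish", "squirrels fish", "", "a=b"]:
        print(repr(raw), "->", quote_term(raw))
```

The empty string is always quoted because a search clause must contain a term, even an empty one.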
- Index Name
An index name always includes a base name [example 1] and may also include
a prefix [example 2], which determines the context set of which the
index is a part. The base name and the prefix are separated by a dot
character ('.'). If multiple '.' characters are present, then the first
should be treated as the prefix/base name delimiter. If the prefix is
not supplied, it is determined by the server.
Examples:
- title any fish
- dc.title any fish
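The "first dot is the delimiter" rule can be sketched in a few lines of Python. The function name split_index is hypothetical, and returning None for a missing prefix is an assumption made for illustration. A real server would then substitute its own default context set.

```python
from typing import Optional, Tuple


def split_index(name: str) -> Tuple[Optional[str], str]:
    """Split an index name into (prefix, base name) at the first '.'."""
    prefix, dot, base = name.partition(".")
    if not dot:
        # No prefix supplied: the context set is determined by the server.
        return None, name
    # Any further '.' characters stay inside the base name.
    return prefix, base


if __name__ == "__main__":
    for name in ["title", "dc.title", "dc.title.main"]:
        print(name, "->", split_index(name))
```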
- Relation
The relation in a search clause specifies the relationship between the
index and search term. It also always includes a base name [example
1] and may also include a prefix providing a context for the relation
[example 2]. If a relation does not have a prefix, the context set is
'cql'. If no relation is supplied in a search clause, then = is assumed,
which means that the relation is determined by the server. See note 1 regarding
version differences.
Examples:
- dc.title any fish
- dc.title cql.any fish
- Relation Modifiers
Relations may be modified by one or more relation modifiers. Relation
modifiers always include a base name, and may include a prefix for a
context set as above [example 1]. If a prefix is not supplied, the context
set is 'cql'. Relation modifiers are separated from each other and from
the relation by forward slash characters ('/'). Whitespace may be present
on either side of a '/' character, but the relation plus modifiers group
may not end in a '/' [example 2]. Relation modifiers may also have a
comparison symbol and a value. The comparison symbol is any of = <
<= > >= <>. The value must obey the same rules for quoting
as search terms, above [example 3].
Examples:
- dc.title any/relevant fish
- dc.title any/ relevant /cql.string fish
- dc.title any/rel.algorithm=cori fish
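As an illustration of the rules above, the following Python sketch splits a relation and its modifiers. It is a simplified reading, not a full parser: the names Modifier and parse_relation are hypothetical, and modifier values are assumed to be unquoted and free of '/' characters.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Modifier(NamedTuple):
    name: str
    comparison: Optional[str]  # one of = < <= > >= <>, or None
    value: Optional[str]


# Two-character symbols come first so '<=' is not read as '<' then '='.
_COMPARISON = re.compile(r"(<=|>=|<>|=|<|>)")


def parse_relation(text: str) -> Tuple[str, List[Modifier]]:
    """Split 'relation/mod/mod=value' into the relation and its modifiers."""
    # Whitespace is allowed on either side of each '/'.
    parts = [p.strip() for p in text.split("/")]
    if any(p == "" for p in parts):
        # Covers a trailing '/', which the rules above do not allow.
        raise ValueError("empty relation or modifier in %r" % text)
    modifiers = []
    for raw in parts[1:]:
        pieces = _COMPARISON.split(raw, maxsplit=1)
        if len(pieces) == 3:
            name, symbol, value = (p.strip() for p in pieces)
            modifiers.append(Modifier(name, symbol, value))
        else:
            modifiers.append(Modifier(raw, None, None))
    return parts[0], modifiers


if __name__ == "__main__":
    print(parse_relation("any/ relevant /cql.string"))
    print(parse_relation("any/rel.algorithm=cori"))
```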
- Boolean Operators
Search clauses may be linked by boolean operators. These are: and, or,
not and prox [example 1]. Note that not is 'and-not' and must not be
used as a unary operator. Boolean operators all have the same
precedence; they are evaluated left-to-right. Parentheses may be used
to override left-to-right evaluation [example 2].
Examples:
- dc.title any fish or dc.creator any sanderson
- dc.title any fish or (dc.creator any sanderson and dc.identifier = "id:1234567")
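Because all boolean operators share one precedence, grouping changes results. The Python sketch below models each search clause as a set of matching record identifiers and folds the operators left to right. The function names and sample data are hypothetical, a nested list stands in for a parenthesized sub-query, and prox is omitted because it needs positional information that plain sets do not carry.

```python
from typing import List, Set, Union

Operand = Union[Set[str], list]


def combine(left: Set[str], op: str, right: Set[str]) -> Set[str]:
    """Apply one boolean operator; 'not' is binary and means and-not."""
    op = op.lower()  # operators are case insensitive
    if op == "and":
        return left & right
    if op == "or":
        return left | right
    if op == "not":
        return left - right
    raise ValueError("unsupported operator: %r" % op)


def evaluate(parts: List[Union[Operand, str]]) -> Set[str]:
    """Evaluate [operand, op, operand, op, operand, ...] left to right."""
    def value(p: Operand) -> Set[str]:
        return evaluate(p) if isinstance(p, list) else p

    result = value(parts[0])
    for i in range(1, len(parts), 2):
        result = combine(result, parts[i], value(parts[i + 1]))
    return result


if __name__ == "__main__":
    fish = {"r1", "r2"}
    sanderson = {"r2", "r3"}
    ident = {"r3"}
    # Without parentheses: (fish or sanderson) and ident
    print(sorted(evaluate([fish, "or", sanderson, "and", ident])))
    # With parentheses: fish or (sanderson and ident)
    print(sorted(evaluate([fish, "or", [sanderson, "and", ident]])))
```

With this sample data the unparenthesized form matches only r3, while the parenthesized form matches r1, r2 and r3.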
- Boolean Modifiers
Booleans may be modified by one or more boolean modifiers, separated
as per relation modifiers with '/' characters. Again, boolean modifiers
consist of a base name and may include a prefix determining the modifier's
context set [example 1]. If not supplied, then the context set is 'cql'.
As per relation modifiers, they may also have a comparison symbol and
a value [example 2].
Examples:
- dc.title any fish or/rel.combine=sum dc.creator any sanderson
- dc.title any fish prox/unit=word/distance>3 dc.title any squirrel
- Proximity Modifiers
Basic proximity modifiers are defined in the CQL context set. Proximity
units 'word', 'sentence', 'paragraph', and 'element' are defined in the
CQL context set, and may also be defined in other context sets. Within
the CQL set their meaning is explicitly left undefined. When defined in
another context set they may be assigned specific meaning.
Thus compare "prox/unit=word" with "prox/xyz.unit=word".
In the first, 'unit' is a prox modifier from the CQL set, and as
such its values are undefined, so 'word' is subject to interpretation
by the server. In the second, 'unit' is a prox modifier defined by
the xyz context set, which may assign the unit 'word' a specific
meaning.
The context set xyz may define additional units, for example, 'street':
prox/xyz.unit="street"
Note that this approach, 'prox/xyz.unit="street"', is preferable to
'prox/unit=xyz.street'. In the first case, 'unit' is a modifier defined
in the xyz context set, and 'street' is a value defined for that modifier.
In the second, 'unit' is a modifier from the cql context set, so its
value would have to be one that is defined in the cql context set, yet
the value is taken from a different set. Pairing a modifier from
one set with a value from another is not good practice.
- Sorting (See note 2 regarding version differences.)
Queries may include explicit information on how to sort the result set
generated by the search. The sort specification is included at the end,
and is separated by a 'sortBy' keyword. The specification consists of
an ordered list of indexes, potentially with modifiers, to use as keys
on which to sort the result set. If multiple keys are given, then the
second and subsequent keys should be used to determine the order of
items that would otherwise sort together. Each index used as a sort
key has the same semantics as when it is used to search.
Modifiers may be attached to the index in the same way as to booleans
and relations in the main part of the query. These modifiers may be
part of any context set, but the CQL context set and the Sort context
set are especially important. Whether a modifier may be used in this way
should be stated in the description of its semantics; sorting is the
only context in which modifiers may be attached to indexes. As many types
of search also require specification of term order (for example the
<, > and within relations), these modifiers are often specified
as relation modifiers.
Examples:
- "cat" sortBy dc.title
- "dinosaur" sortBy dc.date/sort.descending dc.title/sort.ascending
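The multi-key behaviour described above, where later keys only break ties left by earlier ones, can be sketched in Python. The record layout and the function name sort_records are hypothetical, and the (field, descending) pairs are an assumed stand-in for a parsed sort specification such as 'dc.date/sort.descending dc.title/sort.ascending'.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def sort_records(records: List[Record],
                 keys: List[Tuple[str, bool]]) -> List[Record]:
    """Sort by (field, descending) pairs given in priority order."""
    result = list(records)
    # list.sort is stable, so sorting by the least significant key first
    # lets each earlier key keep the order set by later keys among ties.
    for field, descending in reversed(keys):
        result.sort(key=lambda r: r[field], reverse=descending)
    return result


if __name__ == "__main__":
    records = [
        {"date": "2001", "title": "B"},
        {"date": "2003", "title": "Z"},
        {"date": "2001", "title": "A"},
    ]
    # Date descending first, then title ascending to break ties.
    for rec in sort_records(records, [("date", True), ("title", False)]):
        print(rec["date"], rec["title"])
```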
- Prefix Assignment
Warning: The use of Prefix Maps is very uncommon.
A Prefix Map may be used to assign context set names to specific identifiers
in order to be sure that the server maps them in a desired fashion.
It may occur at any place in the query and applies to anything below
the map in the query tree. A prefix assignment is specified by: '>'
shortname '=' identifier [example 1]. The shortname and '=' sign may
be omitted, in which case it sets a default context set for indexes
[example 2].
Examples:
- > dc = "http://deepcustard.org/" dc.custardDepth > 10
- > "http://deepcustard.org/" custardDepth > 10
- Case Insensitive
All parts of CQL are case insensitive apart from user supplied search
terms, values for modifiers and prefix map identifiers, which may or
may not be case sensitive. If any case insensitive part of CQL is specified
with both upper and lower case, it is for aesthetic purposes only.
Examples:
- dC.tiTlE any fish
- dc.TitlE Any/rEl.algOriThm=cori fish soRtbY Dc.TitlE
Notes:
- Note 1: In version 1.2 the default relation is '=', while in version
1.1 the default relation is 'scr'. In version 1.1 the '=' relation
means "adjacency". In version 1.2 the '=' relation from version 1.1 is
replaced by the new relation 'adj'.
- Note 2: In version 1.1, a sort parameter is included in the
searchRetrieve operation. That parameter is dropped in version 1.2 and
instead the sort specification becomes part of the CQL query.
BNF
Following is the Backus Naur Form (BNF) definition for CQL. ["::=" represents
"is defined as"]
sortedQuery      ::= prefixAssignment sortedQuery
                   | scopedClause ['sortby' sortSpec]
sortSpec         ::= sortSpec singleSpec | singleSpec
singleSpec       ::= index [modifierList]

Note: The above three assignments are new in version 1.2 to accommodate
the sortSpec.

cqlQuery         ::= prefixAssignment cqlQuery
                   | scopedClause
prefixAssignment ::= '>' prefix '=' uri
                   | '>' uri
scopedClause     ::= scopedClause booleanGroup searchClause
                   | searchClause
booleanGroup     ::= boolean [modifierList]
boolean          ::= 'and' | 'or' | 'not' | 'prox'
searchClause     ::= '(' cqlQuery ')'
                   | index relation searchTerm
                   | searchTerm
relation         ::= comparitor [modifierList]
comparitor       ::= comparitorSymbol | namedComparitor
comparitorSymbol ::= '=' | '>' | '<' | '>=' | '<=' | '<>' | '=='
namedComparitor  ::= identifier
modifierList     ::= modifierList modifier | modifier
modifier         ::= '/' modifierName [comparitorSymbol modifierValue]
prefix, uri, modifierName, modifierValue, searchTerm, index
                 ::= term
term             ::= identifier | 'and' | 'or' | 'not' | 'prox' | 'sortby'
identifier       ::= charString1 | charString2
charString1      ::= Any sequence of characters that does not include any
                     of the following:
                       whitespace
                       ( (open parenthesis)
                       ) (close parenthesis)
                       =
                       <
                       >
                       '"' (double quote)
                       /
                     If the final sequence is a reserved word, that token
                     is returned instead. Note that '.' (period) may be
                     included, and a sequence of digits is also permitted.
                     Reserved words are 'and', 'or', 'not', and 'prox'
                     (case insensitive). When a reserved word is used in a
                     search term, case is preserved.
charString2      ::= Double quotes enclosing a sequence of any characters
                     except double quote (unless preceded by backslash
                     (\)). Backslash escapes the character following it.
                     The resultant value includes all backslash characters
                     except those releasing a double quote (this allows
                     other systems to interpret the backslash character).
                     The surrounding double quotes are not included.
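The charString1 and charString2 definitions describe, in effect, a tokenizer. The Python sketch below is one possible reading of them, not a reference implementation. The token kinds 'word', 'quoted' and 'symbol' and the function name tokenize are hypothetical, and matching two-character comparison symbols before single characters is an assumption. Classifying words as reserved words is left to a later parsing step.

```python
from typing import List, Tuple

# Characters that end an unquoted string (charString1), besides whitespace.
_DELIMS = set('()=<>/"')
_TWO_CHAR = ("<=", ">=", "<>", "==")


def tokenize(query: str) -> List[Tuple[str, str]]:
    """Split a CQL query into (kind, text) tokens."""
    tokens: List[Tuple[str, str]] = []
    i, n = 0, len(query)
    while i < n:
        ch = query[i]
        if ch.isspace():
            i += 1
        elif ch == '"':
            # charString2: read up to the next unescaped double quote.
            i += 1
            value = []
            while i < n and query[i] != '"':
                if query[i] == "\\" and i + 1 < n:
                    # Keep the backslash unless it releases a double quote.
                    if query[i + 1] != '"':
                        value.append("\\")
                    value.append(query[i + 1])
                    i += 2
                else:
                    value.append(query[i])
                    i += 1
            if i >= n:
                raise ValueError("unterminated quoted string")
            i += 1  # skip the closing quote, which is not part of the value
            tokens.append(("quoted", "".join(value)))
        elif query[i:i + 2] in _TWO_CHAR:
            tokens.append(("symbol", query[i:i + 2]))
            i += 2
        elif ch in _DELIMS:
            tokens.append(("symbol", ch))
            i += 1
        else:
            # charString1: run until whitespace or a delimiter character.
            start = i
            while i < n and not query[i].isspace() and query[i] not in _DELIMS:
                i += 1
            tokens.append(("word", query[start:i]))
    return tokens


if __name__ == "__main__":
    print(tokenize('dc.title any/relevant "squirrels fish"'))
```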