SRU (Search/Retrieval Using URL)

CQL: Contextual Query Language

Specification

See Also CQL Context Sets



Query Syntax

  1. CQL Query
    A CQL query consists of either a single search clause [example 1], or multiple search clauses connected by boolean operators [example 2]. It may have a sort specification at the end, following the 'sortBy' keyword [example 3]. In addition it may include prefix assignments which assign short names to context set identifiers [example 4].

    Examples:
    1. dc.title any fish
    2. dc.title any fish or dc.creator any sanderson
    3. dc.title any fish sortBy dc.date/sort.ascending
    4. > dc = "info:srw/context-sets/1/dc-v1.1" dc.title any fish
  2. Search Clause
    A search clause consists of either an index, relation and a search term [example 1], or a search term by itself [example 2]. If the clause consists of just a term, then the index is treated as 'cql.serverChoice', and the relation is treated as '=' [example 3]. (Treated differently in versions 1.1 and 1.2. See note 1.)

    Examples:
    1. dc.title any fish
    2. fish
    3. cql.serverChoice = fish
  3. Search Term
    Search terms MAY be enclosed in double quotes [example 1], though need not be [example 2]. Search terms MUST be enclosed in double quotes if they contain any of the following characters: < > = / ( ) and whitespace [example 3]. The search term may be an empty string [example 4], but must be present in a search clause. The empty search term has no defined semantics.

    Examples:
    1. "fish"
    2. fish
    3. "squirrels fish"
    4. ""
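
    The quoting rule above can be sketched as a small helper. This is an illustrative sketch only: 'quote_term' is a hypothetical name, and treating an embedded double quote as a character that forces quoting (escaped with a backslash) is an assumption drawn from the BNF rather than from the list of characters above. Terms that already contain backslashes are not handled.

```python
import re

# Characters that force quoting per the rule above: < > = / ( ) and whitespace.
# The double quote is added as an assumption (see the lead-in).
_NEEDS_QUOTES = re.compile(r'[<>=/()\s"]')

def quote_term(term: str) -> str:
    """Return `term` in a form usable as a CQL search term (hypothetical helper)."""
    if term == "" or _NEEDS_QUOTES.search(term):
        # Escape embedded double quotes with a backslash, then wrap in quotes.
        return '"' + term.replace('"', '\\"') + '"'
    return term

print(quote_term("fish"))            # fish
print(quote_term("squirrels fish"))  # "squirrels fish"
print(quote_term(""))                # ""
```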
  4. Index Name
    An index name always includes a base name [example 1] and may also include a prefix [example 2], which determines the context set of which the index is a part. The base name and the prefix are separated by a dot character ('.'). If multiple '.' characters are present, then the first should be treated as the prefix/base name delimiter. If the prefix is not supplied, it is determined by the server.
    Examples:
    1. title any fish
    2. dc.title any fish
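
    The "first dot is the delimiter" rule can be illustrated as follows. The function name 'split_index' and the use of None for a missing prefix are assumptions made for this sketch.

```python
from typing import Optional, Tuple

def split_index(name: str) -> Tuple[Optional[str], str]:
    """Split an index name into (prefix, base name) at the first '.'."""
    prefix, dot, base = name.partition(".")
    if not dot:
        # No prefix supplied: the context set is determined by the server.
        return None, name
    return prefix, base

print(split_index("title"))           # (None, 'title')
print(split_index("dc.title"))        # ('dc', 'title')
print(split_index("xyz.name.first"))  # ('xyz', 'name.first')
```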
  5. Relation
    The relation in a search clause specifies the relationship between the index and search term. It also always includes a base name [example 1] and may also include a prefix providing a context for the relation [example 2]. If a relation does not have a prefix, the context set is 'cql'. If no relation is supplied in a search clause, then = is assumed, which means that the relation is determined by the server. See note 1 regarding version differences.

    Examples:
    1. dc.title any fish
    2. dc.title cql.any fish
  6. Relation Modifiers
    Relations may be modified by one or more relation modifiers. Relation modifiers always include a base name, and may include a prefix for a context set as above [example 1]. If a prefix is not supplied, the context set is 'cql'. Relation modifiers are separated from each other and from the relation by forward slash characters ('/'). Whitespace may be present on either side of a '/' character, but the relation plus modifiers group may not end in a '/' [example 2]. Relation modifiers may also have a comparison symbol and a value. The comparison symbol is any of = < <= > >= <>. The value must obey the same rules for quoting as search terms, above [example 3].
    Examples:
    1. dc.title any/relevant fish
    2. dc.title any/ relevant /cql.string fish
    3. dc.title any/rel.algorithm=cori fish
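
    One way to read a relation together with its modifiers is sketched below. The name 'parse_relation' and the tuple layout are assumptions for illustration, and the sketch only handles unquoted modifier values; a quoted value containing '/' would need a real tokenizer.

```python
import re

# modifierName, optionally followed by a comparison symbol and a value.
_MODIFIER = re.compile(r'^([^\s<>=]+)\s*(?:(<=|>=|<>|=|<|>)\s*(.+))?$')

def parse_relation(text):
    """Return (relation, [(name, symbol, value), ...]) for e.g. 'any/relevant'."""
    # Whitespace is allowed on either side of each '/'.
    parts = [part.strip() for part in text.split("/")]
    if any(part == "" for part in parts):
        # Covers a group that ends in '/' as well as doubled slashes.
        raise ValueError("empty relation or modifier in %r" % text)
    relation, modifiers = parts[0], []
    for part in parts[1:]:
        match = _MODIFIER.match(part)
        if match is None:
            raise ValueError("malformed modifier %r" % part)
        modifiers.append(match.groups())  # symbol and value are None if absent
    return relation, modifiers

print(parse_relation("any/ relevant /cql.string"))
print(parse_relation("any/rel.algorithm=cori"))
```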
  7. Boolean Operators
    Search clauses may be linked by boolean operators. These are: and, or, not and prox [example 1]. Note that not is 'and-not' and must not be used as a unary operator. Boolean operators all have the same precedence; they are evaluated left-to-right. Parentheses may be used to override left-to-right evaluation [example 2].
    Examples:
    1. dc.title any fish or dc.creator any sanderson
    2. dc.title any fish or (dc.creator any sanderson and dc.identifier = "id:1234567")
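
    The effect of equal precedence with left-to-right evaluation can be shown by modelling each clause's result as a set of record identifiers. This is a sketch under that assumption; the record numbers are made up, 'evaluate' is a hypothetical helper, and prox is left out because it needs positional information.

```python
from functools import reduce

OPS = {
    "and": lambda left, right: left & right,
    "or":  lambda left, right: left | right,
    "not": lambda left, right: left - right,  # 'and-not', never unary
}

def evaluate(first, rest):
    """Fold (operator, result set) pairs onto `first`, strictly left to right."""
    return reduce(lambda acc, pair: OPS[pair[0]](acc, pair[1]), rest, first)

fish = {1, 2, 3}        # records matching: dc.title any fish
sanderson = {3, 4}      # records matching: dc.creator any sanderson
identifier = {4, 5}     # records matching: dc.identifier = "id:1234567"

# Without parentheses: (fish or sanderson) and identifier
flat = evaluate(fish, [("or", sanderson), ("and", identifier)])
# With parentheses: fish or (sanderson and identifier)
grouped = evaluate(fish, [("or", evaluate(sanderson, [("and", identifier)]))])

print(sorted(flat))     # [4]
print(sorted(grouped))  # [1, 2, 3, 4]
```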
  8. Boolean Modifiers
    Booleans may be modified by one or more boolean modifiers, separated as per relation modifiers with '/' characters. Again, boolean modifiers consist of a base name and may include a prefix determining the modifier's context set [example 1]. If not supplied, then the context set is 'cql'. As per relation modifiers, they may also have a comparison symbol and a value [example 2].
    Examples:
    1. dc.title any fish or/rel.combine=sum dc.creator any sanderson
    2. dc.title any fish prox/unit=word/distance>3 dc.title any squirrel
  9. Proximity Modifiers
    Basic proximity modifiers are defined in the CQL context set. Proximity units 'word', 'sentence', 'paragraph', and 'element' are defined in the CQL context set, and may also be defined in other context sets. Within the CQL set they are explicitly undefined. When defined in another context set they may be assigned specific meaning.

    Thus compare "prox/unit=word" with "prox/xyz.unit=word". In the first, 'unit' is a prox modifier from the CQL set, and as such its values are undefined, so 'word' is subject to interpretation by the server. In the second, 'unit' is a prox modifier defined by the xyz context set, which may assign the unit 'word' a specific meaning.

    The context set xyz may define additional units, for example, 'street':

                              prox/xyz.unit="street"

    Note that this approach, 'prox/xyz.unit="street"', is preferable to 'prox/unit=xyz.street'. In the first case, 'unit' is a modifier defined in the xyz context set, and 'street' is a value defined for that modifier. In the second, 'unit' is a modifier from the cql context set, so its value would have to be one that is defined in the cql context set, yet 'street' is defined in a different set. Pairing a modifier from one set with a value from another is not good practice.
  10. Sorting (See note 2 regarding version differences.)
    Queries may include explicit information on how to sort the result set generated by the search. The sort specification is included at the end, and is separated by a 'sortBy' keyword. The specification consists of an ordered list of indexes, potentially with modifiers, to use as keys on which to sort the result set. If multiple keys are given, then the second and subsequent keys should be used to determine the order of items that would otherwise sort together. Each index used as a sort key has the same semantics as when it is used to search.

    Modifiers may be attached to the index in the same way as to booleans and relations in the main part of the query. These modifiers may be part of any context set, but the CQL context set and the Sort context set are especially important. Whether a modifier may be used in this way should be stated in the description of its semantics; a sort specification is the only place in which modifiers may be attached to indexes. As many types of search also require specification of term order (for example the <, > and within relations), these modifiers are often specified as relation modifiers.

    Examples:
    1. "cat" sortBy dc.title
    2. "dinosaur" sortBy dc.date/sort.descending dc.title/sort.ascending
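
    The rule that second and subsequent keys only order items that would otherwise sort together can be sketched with a stable sort. The record layout (dictionaries keyed by index name), the sample data, and the reading of 'sort.descending' as a simple descending flag are assumptions for illustration.

```python
records = [
    {"dc.title": "Triceratops", "dc.date": "1999"},
    {"dc.title": "Allosaurus",  "dc.date": "2004"},
    {"dc.title": "Stegosaurus", "dc.date": "1999"},
]

# (index, descending?) pairs in the order given after 'sortBy', as in
# "dinosaur" sortBy dc.date/sort.descending dc.title/sort.ascending
sort_keys = [("dc.date", True), ("dc.title", False)]

def apply_sort(records, sort_keys):
    """Sort by the first key, using later keys only to break ties."""
    result = list(records)
    # list.sort is stable, so sorting by the last key first and the first
    # key last leaves earlier keys with the higher priority.
    for index, descending in reversed(sort_keys):
        result.sort(key=lambda record: record[index], reverse=descending)
    return result

for record in apply_sort(records, sort_keys):
    print(record["dc.date"], record["dc.title"])
```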
  11. Prefix Assignment
    Warning: The use of Prefix Maps is very uncommon.
    A Prefix Map may be used to assign context set names to specific identifiers in order to be sure that the server maps them in a desired fashion. It may occur at any place in the query and applies to anything below the map in the query tree. A prefix assignment is specified by: '>' shortname '=' identifier [example 1]. The shortname and '=' sign may be omitted, in which case it sets a default context set for indexes [example 2].
    Examples:
    1. > dc = "http://deepcustard.org/" dc.custardDepth > 10
    2. > "http://deepcustard.org/" custardDepth > 10
  12. Case Insensitive
    All parts of CQL are case insensitive apart from user supplied search terms, values for modifiers and prefix map identifiers, which may or may not be case sensitive. If any case insensitive part of CQL is specified with both upper and lower case, it is for aesthetic purposes only.
    Examples:
    1. dC.tiTlE any fish
    2. dc.TitlE Any/rEl.algOriThm=cori fish soRtbY Dc.TitlE

Notes:

  1. In version 1.2 the default relation is '=', while in version 1.1 the default relation is 'scr'. In version 1.1 the '=' relation means "adjacency". In version 1.2 the '=' relation from version 1.1 is replaced by the new relation 'adj'.
  2. In version 1.1, a sort parameter is included in the searchRetrieve operation. That parameter is dropped in version 1.2 and instead the sort specification becomes part of the CQL query.  
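
Note 1 can be read as a per-version lookup when a clause consists of a term alone. The sketch below is illustrative only: 'expand_bare_term' is a hypothetical helper, and it assumes the term has already been quoted where the quoting rules require it.

```python
# Default relation for a clause that consists of a term alone (see note 1).
DEFAULT_RELATION = {"1.1": "scr", "1.2": "="}

def expand_bare_term(term: str, version: str = "1.2") -> str:
    """Spell out the implied index and relation for a bare search term."""
    return "cql.serverChoice %s %s" % (DEFAULT_RELATION[version], term)

print(expand_bare_term("fish"))         # cql.serverChoice = fish
print(expand_bare_term("fish", "1.1"))  # cql.serverChoice scr fish
```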

BNF

Following is the Backus Naur Form (BNF) definition for CQL. ["::=" represents "is defined as"]

sortedQuery ::= prefixAssignment sortedQuery
| scopedClause ['sortby' sortSpec]
sortSpec ::= sortSpec singleSpec | singleSpec
singleSpec ::= index [modifierList]
Note: The above three assignments are new in version 1.2 to accommodate the sortSpec.
cqlQuery ::= prefixAssignment cqlQuery
| scopedClause
prefixAssignment ::= '>' prefix '=' uri
| '>' uri
scopedClause ::= scopedClause booleanGroup searchClause
| searchClause
booleanGroup ::= boolean [modifierList]
boolean ::= 'and' | 'or' | 'not' | 'prox'
searchClause ::= '(' cqlQuery ')'
| index relation searchTerm
| searchTerm
relation ::= comparitor [modifierList]
comparitor ::= comparitorSymbol | namedComparitor
comparitorSymbol ::= '=' | '>' | '<' | '>=' | '<=' | '<>' | '=='
namedComparitor ::= identifier
modifierList ::= modifierList modifier | modifier
modifier ::= '/' modifierName [comparitorSymbol modifierValue]
prefix, uri, modifierName, modifierValue, searchTerm, index ::= term
term ::= identifier | 'and' | 'or' | 'not' | 'prox' | 'sortby'
identifier ::= charString1 | charString2
charString1 := Any sequence of characters that does not include any of the following:
whitespace
( (open parenthesis)
) (close parenthesis)
=
<
>
'"' (double quote)
/
If the final sequence is a reserved word, that token is returned instead. Note that '.' (period) may be included, and a sequence of digits is also permitted. Reserved words are 'and', 'or', 'not', and 'prox' (case insensitive). When a reserved word is used in a search term, case is preserved.
charString2 := Double quotes enclosing a sequence of any characters except double quote (unless preceded by backslash (\)). Backslash escapes the character following it. The resultant value includes all backslash characters except those releasing a double quote (this allows other systems to interpret the backslash character). The surrounding double quotes are not included.
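
The charString1 and charString2 definitions can be sketched as a small tokenizer. This is a minimal illustration, not a reference lexer: the names 'tokenize', '_unquote' and the token kinds are assumptions, and 'sortby' is treated as reserved because it appears in the 'term' rule above. Reserved words keep their original case so that a parser can still use them as search terms.

```python
import re

# Words listed in the BNF 'term' rule; compared case-insensitively.
RESERVED = {"and", "or", "not", "prox", "sortby"}

_TOKEN = re.compile(r'''
    \s*(?:
        (?P<quoted>"(?:\\.|[^"\\])*")       # charString2: double-quoted string
      | (?P<symbol><=|>=|<>|==|[=<>/()])    # comparison symbols, slash, parentheses
      | (?P<bare>[^\s()=<>"/]+)             # charString1: unquoted run
    )''', re.VERBOSE | re.DOTALL)

def _unquote(token):
    body = token[1:-1]  # the surrounding double quotes are not part of the value
    # Drop only the backslashes that release a double quote; keep all others.
    return re.sub(r'\\(.)',
                  lambda m: '"' if m.group(1) == '"' else m.group(0),
                  body, flags=re.DOTALL)

def tokenize(query):
    """Return a list of (kind, text) pairs: 'term', 'keyword' or 'symbol'."""
    tokens, pos = [], 0
    query = query.rstrip()
    while pos < len(query):
        match = _TOKEN.match(query, pos)
        if match is None:
            raise ValueError("cannot tokenize at offset %d" % pos)
        pos = match.end()
        kind, text = match.lastgroup, match.group(match.lastgroup)
        if kind == "quoted":
            tokens.append(("term", _unquote(text)))
        elif kind == "bare" and text.lower() in RESERVED:
            tokens.append(("keyword", text))  # original case preserved
        elif kind == "bare":
            tokens.append(("term", text))
        else:
            tokens.append(("symbol", text))
    return tokens

print(tokenize('dc.title any/relevant "squirrels fish" AND dc.date >= 1999'))
```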