HCMS Query Language
Detailed documentation of the query language in HCMS
Important note: the query language uses special characters that must be escaped or encoded, depending on the context. In the most common case, passing the query as a URL query parameter, the & and # characters must always be encoded (as %26 and %23, respectively).
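A minimal Python sketch of the encoding step. The endpoint URL and the parameter name `query` are hypothetical placeholders, not taken from the HCMS API; only the percent-encoding itself is standard.

```python
from urllib.parse import quote


def build_query_url(base_url: str, query: str, param: str = "query") -> str:
    """Append a query to base_url as a percent-encoded URL parameter.

    base_url and param are illustrative placeholders only.
    """
    # safe="" forces every reserved character, including & and #, to be encoded.
    return f"{base_url}?{param}={quote(query, safe='')}"


url = build_query_url(
    "https://hcms.example.com/entities",  # hypothetical endpoint
    'name="A&B" & tags#facet="x"',
)
print(url)
```

Encoding the whole query value, rather than replacing only & and #, also covers quotes and spaces, which are not safe in URLs either.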
Condition
A condition is the basic building block of queries: each valid query must contain at least one condition (although a reverse reference can be used instead, see below).
A condition always compares one property (or, more precisely, the data entity that is mapped to that property) to a static value. Note that comparing two properties is not supported.
Basic condition (with an operator)
path operator value
- path locates the property in the mapping; see the Path section for details.
- It might contain facet declaration (#facetname) at the end. This declaration does not affect evaluation in any way.
- operator is one of the following:
- = and != test equality; they work for all possible values, including null.
- <, >, <= and >= compare strings and numbers. Comparing other values (true, false, null) might be technically allowed, but the result is undefined.
- =^ matches strings that start with the specified string value (a non-string value is not allowed). This is especially useful for hierarchical properties like censhare domains.
- value is a JSON value to compare with.
- All JSON scalar types are supported as literal values:
- string, in double quotes, with all standard escape sequences supported
- number, integer or floating point
- boolean (true or false)
- null
- Instead of a literal value, a variable expression can be used in the form ${variableName} or ${variableName:defaultValue}
- Available variables depend on context; userId is available, but only if the request is authorized by a JWT with a valid user id in the sub field.
- See request logging for a complete list.
- If the variable is not available and no default value is defined, query evaluation fails with an error (HTTP code 400).
- The default value is any JSON literal. Note that a string must be put inside quotes; using an unquoted string value is a syntax error!
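Because the value is a JSON literal, a client can let a JSON serializer produce it instead of assembling quotes by hand. The following Python sketch assumes a hypothetical helper named `condition`; it is not part of HCMS, and it omits the spaces around the operator, as the examples in this document do.

```python
import json

# Operators listed in the section above.
OPERATORS = {"=", "!=", "<", ">", "<=", ">=", "=^"}


def condition(path: str, operator: str, value) -> str:
    """Render `path operator value` with the value as a JSON scalar literal."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    if not isinstance(value, (str, int, float, bool, type(None))):
        raise TypeError("only JSON scalar values are supported")
    if operator == "=^" and not isinstance(value, str):
        raise ValueError("=^ requires a string value")
    # json.dumps adds the double quotes and escape sequences for strings
    # and renders True/False/None as true/false/null.
    return f"{path}{operator}{json.dumps(value)}"


print(condition("address.city", "=", "New York"))
print(condition("active", "=", True))
print(condition("parent", "!=", None))
```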
Condition without operator
A path alone, without any operator and value, can also be used as a condition, with the following semantics:
- Simple path to property, with no facet declaration, evaluates as "present, non-null" condition.
- It's just an alternative way to write path != null
- Path with facet declaration and no operator just declares the facet without affecting the result at all.
- Path ending with "schema cast" evaluates as "valid and accessible entity of this schema".
- In the simplest form, just @schemaname is a completely valid expression, useful in the mixed query endpoint to list entities of a given type or of several types at once. For example: @user | @company selects all users and companies.
- If the schema name contains characters other than letters, numbers and underscore, it must be surrounded by quotes. Otherwise, the quotes are optional.
- Examples: @user | @company | @"article+v1.1"
Static variable condition
variable operator value
This is similar to the basic condition, but instead of a path, the left side is a variable in the same form as the value on the right side (${variableName} or ${variableName:defaultValue}). Literal values are supported only on the right side of the operator; the left side must always be a variable.
Unlike normal conditions, these expressions are evaluated before the database query itself, and their result is always true or false. The result is then used to prune boolean conditions, if possible.
Example: ${jwt/claim/roles:null}="media_full" | @image.owner=${userId:-1}
Note that this type of condition is available only from version 3.2; in all earlier releases, using variable syntax instead of path is reported as a syntax error.
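The documented behavior of variable expressions and static conditions can be modeled in a few lines of Python. This is only a sketch of the rules stated above, not the server implementation: the regular expression, the `MissingVariable` exception (standing in for the HTTP 400 error) and the function names are assumptions, and defaults containing a closing brace are not handled.

```python
import json
import re

# ${name} or ${name:default}; name may contain slashes, e.g. jwt/claim/roles.
VARIABLE = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


class MissingVariable(Exception):
    """Stands in for the HTTP 400 error described in the text."""


def resolve(expression: str, context: dict):
    """Return the variable value from context, or its JSON default."""
    match = VARIABLE.fullmatch(expression)
    if match is None:
        raise ValueError(f"not a variable expression: {expression!r}")
    name, default = match.group(1), match.group(2)
    if name in context:
        return context[name]
    if default is None:
        raise MissingVariable(name)
    # The default is a JSON literal; an unquoted string fails to parse here.
    return json.loads(default)


def static_equals(variable: str, literal: str, context: dict) -> bool:
    """Evaluate `variable = literal` before any database query runs."""
    return resolve(variable, context) == json.loads(literal)


context = {"jwt/claim/roles": "media_full"}
print(static_equals("${jwt/claim/roles:null}", '"media_full"', context))
print(resolve("${userId:-1}", {}))
```

With the result known up front, a query such as the example above reduces to either "always true" or just its right-hand operand, which is the pruning the text refers to.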
Path
A path consists of one or more segments separated by a dot (.); each segment defines one "step" through the schema:
- Property name: select this property.
- Property must exist at the appropriate place in the schema, otherwise the query parsing fails with error.
- If the property is mapped as a scalar value, this segment must be the last one. The only exception is a property that represents a relation or asset reference; in this case, the path can continue with a "schema cast".
- A property name can be specified either as a JSON string literal or directly without quotes.
- Quotes may be omitted if the name contains only (English) letters, digits, underscores or dashes. Otherwise, it must be a proper JSON string.
- Schema cast in the form @schema: sets the schema of the currently selected entity. This segment is usually followed by some property, but that is not required; a schema cast as the last segment represents the condition "entity is of this type".
- When there is no initial entity, all paths must start with schema cast. This happens in several cases:
- In the "mixed query" endpoint, when all entities are searched for.
- Non-inlined relations and references (mapped as scalar).
- Note that by using schema cast, scalar property is changed to entity root and path can continue to properties in the specified schema.
- When several entities share the same asset type, schema cast can be used to select correct one. This is especially useful for inlined relations/references.
- When querying mixin entity, schema cast can be used to specify special condition for one implementing schema.
- Note that schema cast always makes sure that the asset is really available as an entity of given schema. If the asset has incorrect asset type or if the request is not authorized to read/list it, the path just won't match anything.
- Special wildcard character *: matches any language/locale for localized mapping.
- Cannot be used anywhere else; wildcards for real properties are not allowed.
- Localized mapping of features must use wildcard; filtering by feature language is not supported by the search engine.
- Localized mapping of assets (relations) can either use this wildcard, or one hardcoded language - in the latter case, only asset with that language will be selected.
- Function invocation in form $function(param1, param2, param3)
- Currently, the only available function is fulltext search $text(<index>, <term>, <language>)
- Function invocation is always the last segment.
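The quoting rules for property names and schema names can be captured in small helpers. The Python sketch below uses hypothetical function names (`prop`, `cast`, `path`) and assumes that JSON string quoting is acceptable for schema names that need quotes; it only demonstrates a leading schema cast, as in @image.owner.

```python
import json
import re

# Property names: letters, digits, underscore or dash may stay unquoted.
BARE_PROPERTY = re.compile(r"[A-Za-z0-9_-]+")
# Schema names: letters, numbers and underscore may stay unquoted.
BARE_SCHEMA = re.compile(r"[A-Za-z0-9_]+")


def prop(name: str) -> str:
    """Render a property segment, quoting it as a JSON string if needed."""
    return name if BARE_PROPERTY.fullmatch(name) else json.dumps(name)


def cast(schema: str) -> str:
    """Render a schema cast segment such as @user or @"article+v1.1"."""
    quoted = schema if BARE_SCHEMA.fullmatch(schema) else json.dumps(schema)
    return "@" + quoted


def path(*segments: str) -> str:
    """Join already rendered segments with dots."""
    return ".".join(segments)


print(path(cast("image"), prop("owner")))
print(cast("article+v1.1"))
print(path(prop("address"), prop("first name")))
```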
Array properties
Array mappings are not reflected in the path specification at all; array properties are treated just like simple properties of the type of array items.
A condition on an array property evaluates as "true" if at least one array item matches the given condition. For conditions without an operator (or the equivalent != null), the condition is interpreted as "non-empty array".
Note that separate instances of a path (separate conditions) are evaluated completely independently, and a different array item can match in each of them. Complex conditions that need to match a single item must use [] grouping (see below).
Logical operators
Logical operators are able to modify or connect any other expression - including simple conditions, reverse lookups and other logical expressions.
- ! is unary operator used to negate expression
- | and & are binary operators implementing disjunction (OR) and conjunction (AND). These symbols are borrowed from C/Java/JavaScript language family:
- | = disjunction = OR = at least one operand matches
- & = conjunction = AND = both operands must match
- Priority is not defined; always use parentheses ( ) in complex expressions to define evaluation order.
- There is no short-circuiting; both operands are always evaluated, even if only one is sufficient to determine the result. Note that this is consistent with the Java behavior of these operators.
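Since operator priority is undefined, a client that assembles queries programmatically can sidestep the problem by parenthesizing every combination. A possible Python sketch follows; the helper names `all_of`, `any_of` and `negate` are hypothetical.

```python
def all_of(*expressions: str) -> str:
    """Join expressions with & and wrap the result in parentheses."""
    return "(" + " & ".join(expressions) + ")"


def any_of(*expressions: str) -> str:
    """Join expressions with | and wrap the result in parentheses."""
    return "(" + " | ".join(expressions) + ")"


def negate(expression: str) -> str:
    """Negate an expression, parenthesized so the scope of ! is explicit."""
    return f"!({expression})"


query = all_of(
    'address.city="New York"',
    any_of('address.street="Broadway"', 'address.street="Park Avenue"'),
)
print(query)
```

The redundant outer parentheses do not change the result, because a parenthesized expression evaluates the same as the expression inside.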
Expression grouping
There are two constructs used to group expressions together: parentheses and square brackets.
- ( expression )
- path [ expression ]
Any expression can be enclosed in parentheses ( ), and the result is the same as that of the expression inside. Parentheses are needed to define grouping in complex queries using logical operators (which do not have any defined priority of evaluation).
Example of a complex expression that needs parentheses:
address.city="New York" & (address.street="Broadway" | address.street="Park Avenue")
Square brackets [ ] are similar to parentheses, except that they are preceded by a path that defines a common path prefix. This prefix is used by all paths inside the square brackets.
The example above can be also written more succinctly:
address[city="New York" & (street="Broadway" | street="Park Avenue")]
This prefix, however, is not just syntactic sugar. If it contains an array property, square brackets make sure that all expressions inside always match the same array item. This is especially important if the condition inside contains a conjunction.
For example, the expression address.city="New York" & (address.street="Broadway" | address.street="Park Avenue") would find the following entity (which is almost certainly not the desired behavior):
{
"address": [
{"city": "New York", "street": "8th Ave"},
{"city": "London", "street": "Broadway"}
]
}
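The difference between the two forms can be modeled in plain Python on the entity above. This is only an illustration of the matching semantics described in this section, not of how the search engine evaluates queries; the function names are made up for the example.

```python
entity = {
    "address": [
        {"city": "New York", "street": "8th Ave"},
        {"city": "London", "street": "Broadway"},
    ]
}
STREETS = {"Broadway", "Park Avenue"}


def matches_independent(e: dict) -> bool:
    """Dotted paths: each condition may be satisfied by a different item."""
    items = e["address"]
    city_ok = any(item["city"] == "New York" for item in items)
    street_ok = any(item["street"] in STREETS for item in items)
    return city_ok and street_ok


def matches_grouped(e: dict) -> bool:
    """address[...]: all conditions must hold for the same item."""
    return any(
        item["city"] == "New York" and item["street"] in STREETS
        for item in e["address"]
    )


print(matches_independent(entity))  # True: city and street come from different items
print(matches_grouped(entity))      # False: no single item satisfies both
```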
Reverse reference resolution
Asset references (relations, asset reference features) are followed simply by using the corresponding property in the path (see the Path section). The reverse direction can also be used through a special syntax, without the need to include a reverse mapping in the target entity (which is not even possible in the case of asset reference features).
Reverse reference has two basic forms; the only difference is the expression, which is optional and can be omitted:
- { refSchema : refPath : expr }
- { refSchema : refPath }
- refSchema is the name of the schema that defines the reference.
- Note that the whole expression does not change schema (unlike schema cast). This means that the result entities are not in the refSchema!
- If the schema name contains characters other than letters, numbers and underscore, it must be surrounded by quotes. Otherwise, the quotes are optional.
- refPath is path to the relation/reference property.
- The property must be mapped either to relation, or to asset reference feature. Any other mapping is an error.
- expr is optional expression evaluated in the reference schema.
- Note that this expression is considered "top-level" in that schema; any path prefix from external square brackets is ignored.
Reverse reference resolution is a standalone expression and cannot be used as part of a path. The result is the set of entities that are referenced from those refSchema entities that satisfy expr.
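To make the direction of the lookup concrete, here is a small in-memory Python model. The article and user data, the property names and the `reverse_reference` function are all invented for illustration; the sketch mimics a query of the form { article : author : status="published" } and is not how HCMS resolves references internally.

```python
# Hypothetical entities of the referring schema (refSchema = article).
articles = [
    {"id": "a1", "status": "published", "author": "u1"},
    {"id": "a2", "status": "draft", "author": "u2"},
    {"id": "a3", "status": "published", "author": "u1"},
    {"id": "a4", "status": "published", "author": None},
]


def reverse_reference(ref_entities, ref_property, predicate=lambda e: True):
    """Return ids of entities referenced via ref_property by matching entities.

    predicate plays the role of the optional expr and is evaluated on the
    referring entity, not on the result.
    """
    return {
        entity[ref_property]
        for entity in ref_entities
        if predicate(entity) and entity.get(ref_property) is not None
    }


published_authors = reverse_reference(
    articles, "author", lambda a: a["status"] == "published"
)
all_authors = reverse_reference(articles, "author")
print(published_authors)
print(sorted(all_authors))
```

Note that the result contains the referenced users, not articles, which matches the remark that the expression does not change the schema to refSchema.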
Special considerations
Pair values
Some features have two values (i.e. a value pair) instead of a single value. These features must be mapped as an object, with each value of the pair available as a separate property. Each of those sub-properties can be used in conditions that are theoretically separate, but in reality the database searches for both values at once. This leads to some limitations:
- Two conditions can be present, one for each value in the pair. These two conditions must be connected by a logical conjunction (the & operator) inside square brackets.
- Example: date-range[from>"2018-01-01" & to>"2018-01-02"]
- Without the square brackets, these conditions would be treated as completely separate and the query would be considered invalid.
- It is possible to have only one condition, for the first value in the pair.
- In this case, the square brackets are not mandatory: date-range.from>"2018-01-01"
- A standalone condition for the second value is always an error, and such a query causes the request to fail.
- Note that equality check will not work as expected.
- All operators are allowed in any combination, but their semantics is very confusing.