Chapter 4. The select Statement

AQL's select statement provides a mechanism for constructing complex patterns out of simpler building blocks. This statement is structured similarly to a SQL SELECT statement:

select <select list>
from <from list>
[where <where clause>]
[consolidate on <column> [using '<policy>']]
[order by <expression>]

Note that the where, consolidate, and order by clauses are optional.

The select List

The select list in an AQL select statement consists of a comma-delimited list of output expressions. Each expression must be in the form:

<expr> as <colname>

where <expr> is a scalar expression and <colname> is the name of the output column that receives the expression's result.

The current version of AQL supports two types of expression in the select list: references to columns of input relations (for example, A.annot) and function calls, such as:

CombineSpans(SpanBetween(A.annot, B.annot), B.annot)

The from List

The second part of a select statement in AQL is the from list. An AQL from list consists of a comma-delimited list of input views or nested AQL statements, each paired with a corresponding local name that is scoped to the select statement. The following example shows a from list that references a view and a nested extract statement:

select ...
from 
    (extract dictionary 'first.dict' on D.text as name from Document D) as FN,
    LastName as "Last Name"

This example assigns the result of the statement

extract dictionary 'first.dict' on D.text as name from Document D

to the local name FN. The example also assigns the outputs of the LastName view to the local name Last Name.

This example also illustrates two important punctuation rules. First, all nested statements in AQL must be surrounded by parentheses. Second, local names that contain spaces, punctuation characters, or AQL keywords must be enclosed in double quotes.

The where Clause

The third part of a select statement in AQL is the where clause. The where clause defines a predicate over the cross product of the select statement's input relations. In the current version of AQL, this predicate must be the conjunction of a set of boolean-returning scalar functions:

function1() and function2() and ... and functionn()

The where clause is optional and can be omitted from a select statement if there are no predicates to apply.
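
The semantics described above can be modeled outside AQL. The following Python sketch is an illustration only, not how the AQL runtime evaluates a statement: it forms the cross product of the input relations and keeps the combinations for which every predicate returns true. Spans are represented as (begin, end) character offsets, and the names select_where, follows, and within_two_chars are hypothetical helpers invented for this sketch.

```python
from itertools import product

def select_where(relations, predicates):
    """Yield each combination of input tuples that passes every predicate."""
    for combo in product(*relations):
        # The where clause is a conjunction: all predicates must hold.
        if all(pred(*combo) for pred in predicates):
            yield combo

# Hypothetical inputs: each span is a (begin, end) pair of character offsets.
first_names = [(0, 4), (20, 25)]
last_names = [(5, 10), (40, 45)]

def follows(a, b):
    # True when span b starts at or after the end of span a.
    return a[1] <= b[0]

def within_two_chars(a, b):
    # True when at most two characters separate the end of a from the start of b.
    return b[0] - a[1] <= 2

matches = list(select_where([first_names, last_names],
                            [follows, within_two_chars]))
print(matches)  # [((0, 4), (5, 10))]
```

With an empty predicate list, all() returns true for every combination, so the sketch returns the full cross product. This mirrors a select statement whose where clause is omitted.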

The consolidate Clause

The consolidate clause is an optional clause that tells the system what to do if the other parts of the select statement produce spans that overlap. The general structure of this clause is:

consolidate on <column> [using '<policy>']

where <column> is a column in a relation in the from clause and <policy> is one of the supported consolidation policies. For example,

consolidate on Person.name using 'ContainedWithin'

tells the system to examine the field Person.name of all output tuples and to use the ContainedWithin consolidation policy to resolve overlap.

System Text currently supports the following consolidation policies:

  • ContainedWithin: If spans A and B overlap, and A completely contains B, then remove the tuple containing span B from the output. If A and B are exactly the same, then remove one of them. The choice of which tuple to remove is currently an arbitrary one.

  • NotContainedWithin: If spans A and B overlap, and A completely contains B, then remove the tuple containing span A from the output. If A and B are exactly the same, then remove one of them.

  • ContainsButNotEqual: Same as ContainedWithin, except that spans that are exactly equal are retained.

  • LeftToRight: Process the spans in order from left to right; when overlap occurs, retain the leftmost longest nonoverlapping span. This policy emulates the overlap-handling policy of most regular expression engines.

If the using portion of the consolidate clause is omitted, the system defaults to the ContainedWithin policy.
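
To make the differences between the policies concrete, the following Python sketch applies each one to a small list of spans. It is one reading of the descriptions above, not the system's implementation. It assumes that spans are (begin, end) offset pairs, that it operates on bare spans rather than full tuples, and that LeftToRight prefers the longest span among those that share the leftmost start. All function names are hypothetical.

```python
def contains(a, b):
    """True when span a completely contains span b (equal spans included)."""
    return a[0] <= b[0] and b[1] <= a[1]

def contained_within(spans):
    unique = list(dict.fromkeys(spans))  # keep one copy of exact duplicates
    return [s for s in unique
            if not any(o != s and contains(o, s) for o in unique)]

def not_contained_within(spans):
    unique = list(dict.fromkeys(spans))  # keep one copy of exact duplicates
    return [s for s in unique
            if not any(o != s and contains(s, o) for o in unique)]

def contains_but_not_equal(spans):
    # Like contained_within, but exact duplicates are retained.
    return [s for s in spans
            if not any(o != s and contains(o, s) for o in spans)]

def left_to_right(spans):
    kept = []
    # Sort by start offset, longest span first among equal starts.
    for s in sorted(spans, key=lambda s: (s[0], -s[1])):
        if not kept or s[0] >= kept[-1][1]:
            kept.append(s)
    return kept

POLICIES = {
    "ContainedWithin": contained_within,
    "NotContainedWithin": not_contained_within,
    "ContainsButNotEqual": contains_but_not_equal,
    "LeftToRight": left_to_right,
}

def consolidate(spans, policy="ContainedWithin"):
    """Apply a policy by name; the default matches an omitted using portion."""
    return POLICIES[policy](spans)

spans = [(0, 10), (0, 4), (5, 10), (8, 14), (0, 10)]
print(consolidate(spans))                         # [(0, 10), (8, 14)]
print(consolidate(spans, "NotContainedWithin"))   # [(0, 4), (5, 10), (8, 14)]
print(consolidate(spans, "ContainsButNotEqual"))  # [(0, 10), (8, 14), (0, 10)]
print(consolidate([(0, 4), (2, 8), (8, 12)], "LeftToRight"))  # [(0, 4), (8, 12)]
```

Note that (8, 14) survives the three containment-based policies even though it overlaps (0, 10): those policies only act when one span completely contains another, whereas LeftToRight removes any span that overlaps a span already retained.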

The order by Clause

The order by clause of a select statement tells AQL to sort the output tuples of the statement within each document. The current implementation supports only numeric arguments to this clause. For example:

order by GetBegin(P.person)

specifies that, within each document, the statement should return tuples ordered by the beginning of their person attribute.
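
The effect of sorting within each document can be pictured with a short Python sketch. It is illustrative only: it assumes that each output tuple carries a document identifier and that a span is a (begin, end) offset pair. The names order_by_within_document and get_begin are hypothetical stand-ins, the latter for the numeric expression GetBegin(P.person).

```python
from collections import defaultdict

def order_by_within_document(tuples, key):
    """Group (doc_id, row) pairs by document, then sort each group by key."""
    by_doc = defaultdict(list)
    for doc_id, row in tuples:
        by_doc[doc_id].append(row)
    return {doc_id: sorted(rows, key=key) for doc_id, rows in by_doc.items()}

def get_begin(span):
    # The begin offset of a (begin, end) span: a numeric sort key.
    return span[0]

rows = [
    ("doc1", {"person": (30, 38)}),
    ("doc2", {"person": (5, 9)}),
    ("doc1", {"person": (2, 11)}),
]
ordered = order_by_within_document(rows, key=lambda r: get_begin(r["person"]))
print(ordered["doc1"])  # [{'person': (2, 11)}, {'person': (30, 38)}]
```

The ordering never crosses document boundaries: the tuple from doc2 is sorted only against other tuples from doc2, regardless of its offsets.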