AQL's select statement provides a mechanism for constructing complex patterns out of simpler building blocks. This statement is structured similarly to a SQL SELECT statement:
select <select list> from <from list> [where <where clause>] [consolidate on <column> [using '<policy>']] [order by <expression>]
Note that the where, consolidate, and order by clauses are optional.
The select list in an AQL select statement consists of a comma-delimited list of output expressions. Each expression must be in the form:
<expr> as <colname>
where <expr> is a scalar expression and <colname> is the name of the output column that receives the expression's result.
The current version of AQL supports two types of expressions in the select list: references to columns of input relations (for example, A.annot) and function calls such as:
CombineSpans(SpanBetween(A.annot, B.annot), B.annot)
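To make the two kinds of select-list expression concrete, the following Python sketch models a column reference and a nested function call. It is not AQL and not the system's implementation: the Span type, the helpers span_between and combine_spans, and the offsets are all hypothetical, and the helper semantics (text between two spans; shortest span covering both) are assumptions made for illustration.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A half-open character range [begin, end) within one document."""
    begin: int
    end: int


def span_between(a: Span, b: Span) -> Span:
    # Assumed semantics: the span covering the text between a and b.
    # Rejecting overlapping inputs is a modeling choice of this sketch.
    if a.end > b.begin:
        raise ValueError("first span must end before the second begins")
    return Span(a.end, b.begin)


def combine_spans(a: Span, b: Span) -> Span:
    # Assumed semantics: the shortest span that covers both inputs.
    return Span(min(a.begin, b.begin), max(a.end, b.end))


# One input tuple from each relation (hypothetical offsets).
a_annot = Span(0, 4)
b_annot = Span(10, 15)

# Models a select list with two output columns:
#   A.annot as first,
#   CombineSpans(SpanBetween(A.annot, B.annot), B.annot) as rest
output_tuple = {
    "first": a_annot,
    "rest": combine_spans(span_between(a_annot, b_annot), b_annot),
}
```

Under these assumptions, the rest column runs from the end of A.annot to the end of B.annot.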
The second part of a select statement in AQL is the from list. An AQL from list consists of a comma-delimited list of input views or nested AQL statements, each paired with a local name that is scoped to the select statement. The following example shows a from list that references a view and a nested extract statement:
select ... from (extract dictionary 'first.dict' on D.text as name from Document D) as FN, LastName as "Last Name"
This example assigns the result of the statement
extract dictionary 'first.dict' on D.text as name from Document D
to the local name FN. The example also assigns the output of the LastName view to the local name "Last Name".
This example also shows two important pieces of punctuation. First, all nested statements in AQL must be surrounded by parentheses. Second, local names that contain spaces, punctuation characters, or AQL keywords must be enclosed in double quotes.
The third part of a select statement in AQL is the where clause. The where clause defines a predicate over the cross product of the select statement's input relations. In the current version of AQL, this predicate must be a conjunction of boolean-returning scalar functions:
function1() and function2() and ... and functionn()
The where clause is optional and can be omitted from a select statement if there are no predicates to apply.
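The where clause semantics described above (a conjunction of predicates applied to the cross product of the input relations) can be sketched in Python. This is an illustrative model only, not how the system evaluates a statement: the relations, the span offsets, and the predicates follows and not_equal_length are hypothetical.

```python
from itertools import product

# Hypothetical input relations; each tuple holds one span as (begin, end).
first_names = [{"name": (0, 4)}, {"name": (20, 25)}]
last_names = [{"name": (5, 10)}, {"name": (40, 47)}]


def follows(a, b, min_gap, max_gap):
    # Hypothetical predicate: b begins between min_gap and max_gap
    # characters after a ends.
    return min_gap <= b[0] - a[1] <= max_gap


def not_equal_length(a, b):
    # Hypothetical predicate, included only to show a second conjunct.
    return (a[1] - a[0]) != (b[1] - b[0])


# The where clause is a conjunction: every predicate must return true.
predicates = [
    lambda fn, ln: follows(fn["name"], ln["name"], 0, 1),
    lambda fn, ln: not_equal_length(fn["name"], ln["name"]),
]

matches = [
    (fn, ln)
    for fn, ln in product(first_names, last_names)  # cross product of inputs
    if all(p(fn, ln) for p in predicates)
]
```

Omitting the where clause corresponds to an empty predicate list, in which case every pair in the cross product is kept.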
The consolidate clause is an optional clause that tells the system what to do if the other parts of the select statement produce spans that overlap. The general structure of this clause is:
consolidate on <column> [using '<policy>']
where <column> is a column in a relation in the from clause and <policy> is one of the supported consolidation policies.
For example,
consolidate on Person.name using 'ContainedWithin'
tells the system to examine the field Person.name of all output tuples and to use the ContainedWithin consolidation policy to resolve overlap.
System Text currently supports the following consolidation policies:
ContainedWithin: If spans A and B overlap, and A completely contains B, then remove the tuple containing span B from the output. If A and B are exactly the same, then remove one of them. The choice of which tuple to remove is currently an arbitrary one.
NotContainedWithin: If spans A and B overlap, and A completely contains B, then remove the tuple containing span A from the output. If A and B are exactly the same, then remove one of them.
ContainsButNotEqual: Same as ContainedWithin, except that spans that are exactly equal are retained.
LeftToRight: Process the spans in order from left to right; when overlap occurs, retain the leftmost longest nonoverlapping span. This policy emulates the overlap-handling policy of most regular expression engines.
If the using portion of the consolidate clause is omitted, the system defaults to the ContainedWithin policy.
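The ContainedWithin and LeftToRight policies described above can be modeled in a few lines of Python. This is a simplified sketch, not the system's implementation: it works on bare (begin, end) spans instead of whole tuples, treats spans as half-open ranges (an assumption), and the function names contained_within and left_to_right are hypothetical.

```python
def contained_within(spans):
    """Drop each span that another span contains; keep one copy of duplicates."""
    unique = list(dict.fromkeys(spans))  # remove exact duplicates, keep order
    return [
        s for s in unique
        if not any(o != s and o[0] <= s[0] and s[1] <= o[1] for o in unique)
    ]


def left_to_right(spans):
    """Scan left to right, keeping the leftmost longest nonoverlapping spans."""
    # Sort by begin ascending, then by length descending, so that among
    # spans starting at the same offset the longest is considered first.
    ordered = sorted(set(spans), key=lambda s: (s[0], -(s[1] - s[0])))
    kept = []
    for s in ordered:
        # Keep s only if it starts at or after the end of the last kept span.
        if not kept or s[0] >= kept[-1][1]:
            kept.append(s)
    return kept


# Hypothetical spans from the consolidation column of the output tuples.
spans = [(0, 10), (2, 5), (0, 10), (8, 14), (20, 25)]

by_containment = contained_within(spans)
by_position = left_to_right(spans)
```

In this sample, (8, 14) overlaps (0, 10) without being contained in it, so the ContainedWithin model keeps it while the LeftToRight model drops it.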
The order by clause of a select statement tells AQL to sort the output tuples of the statement within each document. The current implementation supports only numeric arguments to this clause. For example:
order by GetBegin(P.person)
specifies that, within each document, the statement should return tuples ordered by the beginning of their person attribute.
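The effect of ordering by the begin offset of a span can be sketched in Python as a sort on a numeric key. The tuples below are hypothetical output for a single document, and get_begin is a stand-in written for this sketch, not the AQL function itself.

```python
# Hypothetical output tuples for one document; each person value is a
# (begin, end) span.
tuples = [
    {"person": (42, 50)},
    {"person": (3, 12)},
    {"person": (17, 29)},
]


def get_begin(span):
    # Stand-in for GetBegin: the begin offset of a span, a numeric sort key.
    return span[0]


# Models "order by GetBegin(P.person)" for the tuples of one document.
ordered = sorted(tuples, key=lambda t: get_begin(t["person"]))
```

Because the sort key is a number, this matches the restriction that the clause accepts only numeric arguments.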