Auto-Group and Auto-Sum:
logQL automatically groups all string and date type fields. For all other number fields, it automatically sums. For instance the following query (from Quick Start section):
SELECT account, out
is equivalent to the following in SQL:
SELECT account, sum(out) FROM acc GROUP BY account
To disable Auto-Group and Auto-Sum, use the GREP command. Simply replace SELECT with GREP in your query. For instance:
GREP account, out
Is the SQL equivalent of:
SELECT account, out FROM acc
Parts of a logQL Query:
There are four parts to logQL query. Not all parts are always required.
1. SELECT/GREP: Used to specify the required fields in the output.
2. FROM: The data files to query on.
3. USE: The Meta to use to query the data file.
4. WHERE: The filter conditions.
5. ORBER BY: Sort the output on one or more fields.
SELECT/GREP: This is used to select the fields from the data file. As mentioned in section titled 'Auto-Group and Auto-Sum' select results are automatically grouped and summed.
Syntax: SELECT <comma separated list of fields>
Example: SELECT account, in, out
Column Alias: An alias can be entered for the column with the following syntax:
Syntax: SELECT <field1> "<alias1>", <field2> "alias"
Example: SELECT account "A/c", out "Credit"
Tip: To find all the distinct values in a give field, execute the query SELECT <fieldName>
From: This is used to select the files to query on. You can query a single file or multiple files:
Syntax: FROM <filename>.
Example: FROM account.xls
Syntax: FROM <fileName1>, <fileName2>
Example: FROM account.xls, ./old/nov-account.xls, ./old/oct-account.xls
Syntax: FROM <partialName>*.xls
Example: FROM ./old/*-account-2007.xls
Example: FROM ./old/*-account-*.xls
Syntax: FROM <fileName>@<zipFileName>
Example: FROM jon-nov-2007.xls@emp-expenses.zip
Example: FROM jon-*-2007.xls@emp-expenses.zip
Example: FROM jon.xls@*-expenses.zip
Use: Specify the Meta to use when reading the data file.
Syntax: USE <metafile>
Example: USE ./meta/apacheConfig.xml
Syntax: USE <metaName>@<filename>
Example: USE apacheCommon@meta.xml
To run queries on the CSV file without defining the meta use the function:
Syntax: USE csv(<headerLine>)
Example: USE csv(3)
USE csv
Note that <headerLine> is an optional parameter. By default, it assumes the header to be at line 1.
IMPORTANT: By default, logQL assumes all the fields to be of type string. To know how to change type information, see section 'Changing types'.
To run queries on these files without defining the meta use the function:
Syntax: USE sep (<sep>,<headerLine>)
Example: USE sep( '\t', 4)
USE sep('\t')
Note that <headerLine> is an optional parameter. By default, it assumes the header to be at line 1.
IMPORTANT: By default, logQL assumes all the fields to be of type string. To know how to change type information, see section 'Changing types'.
Changing types: when using csv or sep functions, logQL assumes all the fields to be of type string. However, logQL does allow you to change this to type number or date:
Syntax: sum(<fieldName>)
Example: SELECT account, sum(out) FROM account.csv USE csv
Syntax: todate(<fieldname>, <format>)
Example: SELECT todate(date,"dd-MMM-yy") FROM account.csv USE csv
Where: The where clause is used to filter the input data based on a certain condition
Syntax: SELECT <fields> WHERE <condition>
Example: SELECT description, out WHERE account = "Food"
The following operators are supported:
= |
Equals |
# |
Not equals |
> |
Greater than |
< |
Lesser than |
LIKE |
Search for a pattern. Use '%' for wild card. |
NOTLIKE |
Ensure pattern does not exist |
IN |
List of possible column values |
NOTIN |
List of values to exclude |
IMPORTANT: a field can only be compared to a value, logQL currently does not allow comparison of two fields.
The following binary operators are supported to club multiple conditions:
and |
Both conditions must evaluate to true |
or |
Either of the conditions must evaluate to true |
Inner
queries:
Where
clause supports inner queries with the IN operator
Syntax: SELECT <fields> WHERE <field> IN (<query>)
Example: SELECT description, out WHERE account IN (select account WHERE out >50)
IMPORTANT: the inner query should return only one column.
String functions:
All
of the
following functions can be used in SELECT or WHERE clause
1. strtok: String Tokenizer breaks the string into tokens based on a separator character and picks the token specified.
Syntax: strtok(<field|function>,<token>,<position>, <strict true|false>)
For instance,
assume we have a path 'Apparel/Mens/Shirts/formal.html'.
strtok(path,
'/', 2)
will return
'Mens'
The option strict is used to indicate if the last token should be used; if the total number of tokens is less than required tokens. If the value is false, the last token is returned.
Data: 'Apparel/Mens/Shirts/formal.html'
Example: strtok(path, '/',10,false) will return 'formal.html'
strtok(path, '/',10,true) will return empty string ''
This can be
used in the SELECT clause as well as the WHERE
clause.
Example: SELECT strtok(path, '/',3), count(*)
SELECT path, count(*) WHERE strtok(path,'/',2) = 'Mens'
2. hostname: for a column containing IP address, this function returns the host name or the same IP if it is not able to resolve.
Syntax: hostname(<fieldName>)
Examples: SELECT hostname(ip), count(*)
SELECT hostname(ip), bytes WHERE hostname(ip) like '%.us.comp%'
3. urldecode: Decodes a string from the application/x-www-form-urlencoded MIME format.
Syntax: urldecode(<fieldName>)
Examples: SELECT urldecode(referer), count(*)
SELECT path, count(*) WHERE urldecode(referrer) like '%key word%'
4. tolowercase: Converts the string to lower case.
Syntax: tolowercase(<fieldName>)
Examples: SELECT tolowercase(account), out
SELECT description, out WHERE tolowercase(account) = food
5. touppercase: Converts the string to upper case.
Syntax: touppercase(<fieldName>)
Examples: SELECT touppercase(account), out
SELECT description, out WHERE touppercase(account) = FOOD
Date functions:
1. day: This is the default function for all date values. It returns the date in 'dd-MMM-yy' format.
Syntax: day(<dateField>, <outputFormat>)
Examples: day(date), day(date, 'MMM-dd')
SELECT day(date, 'MMM-dd'), count(*)
When using this function in the where clause, you can enter the value in either of the formats ("MM-dd-yyyy", "MMM-dd-yyyy","dd-MMM-yyyy", "yyyy-MM-dd")
Example: SELECT description, out WHERE date = 'Feb-02-2007'
2. hour: Returns the hour of day
Syntax: hour(<dateField>)
Examples: SELECT hour(date), count(*)
SELECT path, count(*) WHERE hour(date) = 15;
3. dayofweek: Returns the day of the week (Sunday, Monday, …)
Syntax: dayofweek(<dateField>)
Examples: SELECT dayofweek(<dateField>), count(*)
SELECT path, count(*) WHERE dayofweek(date) = Wednesday
4. weekofmonth: Returns the week of the month (W1, W2, …)
Syntax: weekofmonth(<dateField>)
Examples: SELECT weekofmonth(date), count(*)
SELECT path, count(*) WHERE weekofmonth(date) = W1
5. weekofyear: Returns the week of the year (W1, W2, …)
Syntax: weekofyear(<dateField>)
Examples: SELECT weekofyear(date), count(*)
SELECT path, count(*) WHERE weekofyear(date) = W18
6. month: Returns the month (January, February, …)
Syntax: month(<dateField>)
Examples: SELECT month(date), count(*)
SELECT path, count(*) WHERE month(date) = January
7. dayofmonth: Returns the day of month (1, 2, 3, …)
Syntax: dayofmonth(<dateField>)
Examples: SELECT dayofmonth(date), count(*)
SELECT path, count(*) WHERE dayofmonth(date) = 15
8. year: Returns the year
Syntax: year(<dateField>)
Examples: SELECT year(date), count(*)
SELECT path, count(*) WHERE year(date) = 2007
9. todate: This function has been described before for converting string type to date in case of CSV or Delimited files. It can also be used for to give specific values to the date function.
Syntax: todate(<dateValue>,<dateFormat>)
Example: SELECT path, count(*) WHERE date > todate('jan-05-07','MMM-dd-yy')
Count function: This function returns the number of rows in a query.
Syntax: count(expression)
Examples SELECT
path, count(*)
SELECT count(host)
SELECT host, count(path)