6. InfluxDB-InfluxQL basic syntax tutorial: the GROUP BY clause

This article is translated from the official documentation: https://docs.influxdata.com/influxdb/v1.7/query_language/data_exploration/

The GROUP BY clause groups query results by a user-specified set of tags or by a time interval.

First, GROUP BY tags

GROUP BY <tag> groups query results by a user-specified set of tags.
    Syntax:

SELECT_clause FROM_clause [WHERE_clause]
GROUP BY [* | <tag_key>[,<tag_key>]]

GROUP BY clause                 Meaning

GROUP BY *                      Groups results by all tags
GROUP BY <tag_key>              Groups results by the specified tag
GROUP BY <tag_key>,<tag_key>    Groups results by more than one tag; the order of the tag keys is irrelevant

Note: If a query contains both a WHERE clause and a GROUP BY clause, the GROUP BY clause must come after the WHERE clause.

Other supported features: Regular Expressions


GROUP BY tags examples

  1. Group query results by a single tag
                
                This example uses the MEAN function to compute an average for each value of the location tag in the h2o_feet measurement.
                NOTE: In InfluxDB, epoch 0 (1970-01-01T00:00:00Z) is often used as the equivalent of a null timestamp. If a query gives InfluxDB no timestamp to return, such as an aggregate function call without a specified time range as in this example, InfluxDB returns epoch 0 as the timestamp.

  2. Group query results by more than one tag

  3. Group query results by all tags
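
To make the tag grouping in this section concrete, here is a minimal Python sketch of what "group by a set of tags, then aggregate" means. It is not how InfluxDB is implemented; the function name `group_mean`, the `sensor` tag, and all sample values are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample points: (tags, water_level). The "sensor" tag and
# every value here are invented for the illustration.
points = [
    ({"location": "coyote_creek", "sensor": "a"}, 8.0),
    ({"location": "coyote_creek", "sensor": "b"}, 6.0),
    ({"location": "santa_monica", "sensor": "a"}, 2.0),
    ({"location": "santa_monica", "sensor": "a"}, 4.0),
]

def group_mean(points, tag_keys):
    """Group points by the values of tag_keys and average each group."""
    keys = sorted(tag_keys)  # sorting makes the order of tag_keys irrelevant
    groups = defaultdict(list)
    for tags, value in points:
        group_id = tuple((k, tags[k]) for k in keys)
        groups[group_id].append(value)
    return {gid: mean(vals) for gid, vals in groups.items()}

by_location = group_mean(points, ["location"])            # like GROUP BY "location"
by_both = group_mean(points, ["location", "sensor"])      # like GROUP BY "location","sensor"
by_both_swapped = group_mean(points, ["sensor", "location"])

for group_id, avg in by_location.items():
    print(dict(group_id), avg)
```

Swapping the order of the tag keys yields the same groups, which mirrors the statement that the order of tags in GROUP BY is irrelevant.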


Second, basic GROUP BY time intervals

GROUP BY time() groups query results by a user-specified time interval.
    Syntax:

SELECT <function>(<field_key>) FROM_clause
WHERE <time_range>
GROUP BY time(<time_interval>),[tag_key] [fill(<fill_option>)]

A basic GROUP BY time() query requires a function in the SELECT clause and a time range in the WHERE clause.

  • time(time_interval)
                The time_interval in the GROUP BY time() clause is a duration literal. It determines how InfluxDB groups query results over time. For example, a time_interval of 5m groups query results into five-minute groups across the time range specified in the WHERE clause.

  • fill()
                fill() is optional. It changes the value reported for time intervals that have no data. See [GROUP BY time intervals and fill()](https://docs.influxdata.com/influxdb/v1.7/query_language/data_exploration/#group-by-time-intervals-and-fill) for more information.

    Note: A basic GROUP BY time() query relies on the time_interval and on InfluxDB's preset time boundaries to determine the raw data included in each time interval and the timestamps returned by the query.


Basic usage examples

Start from a query that uses only a WHERE clause.

The following example adds GROUP BY time(time_interval) to that query:

SELECT COUNT("water_level") FROM "h2o_feet"
WHERE "location"='coyote_creek'
    AND time >= '2015-08-18T00:00:00Z'
    AND time <= '2015-08-18T00:30:00Z'
GROUP BY time(12m)

Query result:
    
    The query selects data from the h2o_feet measurement where the location tag is coyote_creek and the time is between 2015-08-18T00:00:00Z and 2015-08-18T00:30:00Z, groups it into 12-minute intervals, and counts the water_level values in each group.
    Note: Each time interval in the query results is closed on the left and open on the right. For example, the timestamp 2015-08-18T00:00:00Z in the first row represents the interval [2015-08-18T00:00:00Z, 2015-08-18T00:12:00Z).
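
The half-open bucketing can be sketched in Python. This is only an illustration of the arithmetic, not InfluxDB's code: the helper name `bucket_start` is hypothetical, and the timestamps assume one point every six minutes, as in the NOAA sample data that h2o_feet comes from.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def bucket_start(ts, interval):
    """Start of the epoch-aligned bucket [start, start + interval) that contains ts."""
    return EPOCH + ((ts - EPOCH) // interval) * interval

start = datetime(2015, 8, 18, 0, 0, tzinfo=timezone.utc)
# Assumed sample timestamps: 00:00, 00:06, ..., 00:30 (inclusive, as in the WHERE clause).
timestamps = [start + timedelta(minutes=6 * i) for i in range(6)]

# Equivalent in spirit to COUNT(...) GROUP BY time(12m).
counts = Counter(bucket_start(ts, timedelta(minutes=12)) for ts in timestamps)
for bucket, n in sorted(counts.items()):
    print(bucket.isoformat(), n)
```

Under that assumption each 12-minute bucket holds two points. The point at 00:12 lands in the bucket that starts at 00:12, not in the one that starts at 00:00, because the right edge of each interval is open.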


Common issues

Issue: unexpected timestamps and values in query results.
    A basic GROUP BY time() query relies on the time_interval and on InfluxDB's preset time boundaries to determine the raw data included in each time interval and the timestamps returned by the query. In some cases this leads to unexpected results.
    For example, take the query:

SELECT "water_level" FROM "h2o_feet"
WHERE "location"='coyote_creek'
    AND time >= '2015-08-18T00:00:00Z'
    AND time <= '2015-08-18T00:18:00Z'

This query returns the raw data for that time range.

The following query uses the WHERE clause to cover a 12-minute time range and the GROUP BY clause to group the results into 12-minute intervals.

SELECT COUNT("water_level") FROM "h2o_feet"
WHERE "location"='coyote_creek'
    AND time >= '2015-08-18T00:06:00Z'
    AND time < '2015-08-18T00:18:00Z'
GROUP BY time(12m)

Because the query covers 12 minutes of data and groups it into 12-minute intervals, you might expect the result to contain a single row. In fact, the query returns two rows.

Explanation
InfluxDB uses preset round-number time boundaries for GROUP BY intervals that are independent of any time conditions in the WHERE clause. When it calculates the results, all returned data must occur within the query's explicit time range but the GROUP BY intervals will be based on the preset time boundaries.

The advanced GROUP BY time() syntax lets users shift the start time of the preset time boundaries. Example 3 in the advanced syntax section demonstrates this usage.
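
The two-row outcome can be reproduced with a small Python sketch. It assumes one raw point every six minutes and epoch-aligned 12-minute boundaries; `bucket_start` and `t` are hypothetical helpers written for this illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INTERVAL = timedelta(minutes=12)

def bucket_start(ts):
    """Start of the preset (epoch-aligned) 12-minute bucket containing ts."""
    return EPOCH + ((ts - EPOCH) // INTERVAL) * INTERVAL

def t(minute):
    """Timestamp on 2015-08-18 at 00:<minute>:00Z."""
    return datetime(2015, 8, 18, 0, minute, tzinfo=timezone.utc)

# Assumed raw timestamps between 00:00 and 00:18, one every six minutes.
raw = [t(m) for m in (0, 6, 12, 18)]

# WHERE time >= 00:06 AND time < 00:18 keeps only 00:06 and 00:12.
selected = [ts for ts in raw if t(6) <= ts < t(18)]

# GROUP BY time(12m): the boundaries ignore the WHERE range.
rows = {}
for ts in selected:
    key = bucket_start(ts)
    rows[key] = rows.get(key, 0) + 1

for bucket, count in rows.items():
    print(bucket.isoformat(), count)
```

The point at 00:06 falls into the preset bucket that starts at 00:00 and the point at 00:12 into the bucket that starts at 00:12, so a 12-minute range produces two rows with a count of 1 each.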


Third, advanced GROUP BY time() syntax

The syntax is as follows:

SELECT <function>(<field_key>)
FROM_clause
WHERE <time_range>
GROUP BY time(<time_interval>,<offset_interval>),[tag_key] [fill(<fill_option>)]

An advanced GROUP BY time() query requires a function in the SELECT clause and a time range in the WHERE clause. Note that the GROUP BY clause must come after the WHERE clause.

  • time(time_interval,offset_interval)
                The time_interval in the GROUP BY time() clause is a duration literal. It determines how InfluxDB groups query results over time. For example, a time_interval of 5m groups query results into five-minute groups across the time range specified in the WHERE clause.
                The offset_interval is a duration literal. It shifts InfluxDB's preset time boundaries forward or backward. The offset_interval can be positive or negative.

  • fill()
                fill() is optional. It changes the value reported for time intervals that have no data. See the GROUP BY time intervals and fill() section for more information.

    Note: An advanced GROUP BY time() query relies on the time_interval, the offset_interval, and InfluxDB's preset time boundaries to determine the raw data included in each time interval and the timestamps returned by the query.


Advanced usage examples

Consider the following query:

SELECT "water_level" FROM "h2o_feet"
WHERE "location"='coyote_creek'
    AND time >= '2015-08-18T00:00:00Z'
    AND time <= '2015-08-18T00:54:00Z'

Query result:
    
    The following examples use a subset of this sample data. The next query groups the data into 18-minute intervals and shifts the preset time boundaries forward.

SELECT MEAN("water_level") FROM "h2o_feet"
WHERE "location"='coyote_creek'
    AND time >= '2015-08-18T00:06:00Z'
    AND time <= '2015-08-18T00:54:00Z'
GROUP BY time(18m,6m)

Query result:
    
    The query above groups the results into 18-minute intervals and shifts the preset time boundaries forward by six minutes.
    Note that for a GROUP BY time() query without an offset_interval, the time boundaries and the returned timestamps follow InfluxDB's preset time boundaries. For comparison, here is the same query without an offset_interval:

SELECT MEAN("water_level") FROM "h2o_feet"
WHERE "location"='coyote_creek'
    AND time >= '2015-08-18T00:06:00Z'
    AND time <= '2015-08-18T00:54:00Z'
GROUP BY time(18m)
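
The effect of the offset on bucket membership can be sketched in Python. The sketch counts points per bucket instead of computing MEAN, because the water_level values are not reproduced here; it assumes one point every six minutes, and `bucket_start`, `count_by_bucket`, and `t` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def bucket_start(ts, interval, offset=timedelta(0)):
    """Start of the bucket containing ts when the preset boundaries are shifted by offset."""
    return EPOCH + offset + ((ts - EPOCH - offset) // interval) * interval

def t(minute):
    """Timestamp on 2015-08-18 at 00:<minute>:00Z."""
    return datetime(2015, 8, 18, 0, minute, tzinfo=timezone.utc)

def count_by_bucket(points, interval, offset=timedelta(0)):
    """Number of points that fall into each bucket, keyed by bucket start."""
    counts = {}
    for ts in points:
        key = bucket_start(ts, interval, offset)
        counts[key] = counts.get(key, 0) + 1
    return counts

# Assumed timestamps matching the WHERE clause: 00:06, 00:12, ..., 00:54.
points = [t(m) for m in range(6, 55, 6)]

shifted = count_by_bucket(points, timedelta(minutes=18), timedelta(minutes=6))  # time(18m,6m)
plain = count_by_bucket(points, timedelta(minutes=18))                          # time(18m)

print([b.strftime("%H:%M") for b in shifted])
print([b.strftime("%H:%M") for b in plain])
```

With the six-minute offset the buckets start at 00:06, 00:24, and 00:42 and each holds three points. Without the offset they start at 00:00, 00:18, 00:36, and 00:54, so the first and last buckets are only partly covered by the WHERE range.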

Now consider the following query:

SELECT MEAN("water_level") FROM "h2o_feet"
WHERE "location"='coyote_creek'
    AND time >= '2015-08-18T00:06:00Z'
    AND time <= '2015-08-18T00:54:00Z'
GROUP BY time(18m,-12m);

Query result:
    
    Note: This query uses time(18m,-12m), where the offset_interval is negative, and its results are the same as with time(18m,6m). When deciding between a positive and a negative offset_interval, choose whichever is more intuitive.
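
One way to see why the two offsets agree: shifting the boundaries back by 12 minutes and forward by 6 minutes differ by exactly one 18-minute interval, so they describe the same set of boundaries. The Python sketch below checks this under the same assumptions as before (one point every six minutes; `bucket_start` is a hypothetical helper, not InfluxDB's implementation).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INTERVAL = timedelta(minutes=18)

def bucket_start(ts, offset):
    """Bucket start for 18-minute intervals whose boundaries are shifted by offset."""
    shift = offset % INTERVAL  # -12m and 6m both reduce to a 6-minute shift
    return EPOCH + shift + ((ts - EPOCH - shift) // INTERVAL) * INTERVAL

# Assumed timestamps: 00:06, 00:12, ..., 00:54 on 2015-08-18.
points = [datetime(2015, 8, 18, 0, m, tzinfo=timezone.utc) for m in range(6, 55, 6)]

forward = [bucket_start(ts, timedelta(minutes=6)) for ts in points]     # time(18m,6m)
backward = [bucket_start(ts, timedelta(minutes=-12)) for ts in points]  # time(18m,-12m)

print(forward == backward)
```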


GROUP BY time intervals and fill()

fill() changes the value reported for time intervals that have no data.
    Syntax:

SELECT <function>(<field_key>) FROM_clause
WHERE <time_range>
GROUP BY time(time_interval,[<offset_interval>])[,tag_key] [fill(<fill_option>)]

By default, a GROUP BY time() interval with no data reports null as its value. fill() changes the value reported for such intervals.
    Note that fill() must appear at the end of the GROUP BY clause.

Fill options

    Any numerical value
            Reports the given numerical value for time intervals with no data.

    linear
            Reports the result of linear interpolation for time intervals with no data, so that the filled values lie on a straight line between the neighboring intervals that do have data.

    none
            Reports no timestamp and no value for time intervals with no data.

    null
            Reports null for time intervals with no data, but still returns the timestamp. This is the default.

    previous
            Reports the value from the previous time interval for time intervals with no data.
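
The fill options can be sketched in Python as transformations of a list of (bucket, value) rows in which None marks an interval without data. This is an illustration only: the function name `apply_fill` and the sample rows are hypothetical, the linear case assumes evenly spaced buckets, and leaving edge gaps unfilled is a choice made for this sketch.

```python
def apply_fill(rows, option="null"):
    """rows: list of (bucket_label, value_or_None) in time order."""
    if option == "null":
        return list(rows)  # keep every bucket, None stays None
    if option == "none":
        return [(b, v) for b, v in rows if v is not None]  # drop empty buckets
    if option == "previous":
        out, last = [], None
        for b, v in rows:
            if v is not None:
                last = v
            out.append((b, last))  # stays None if nothing came before
        return out
    if option == "linear":
        values = [v for _, v in rows]
        out = []
        for i, (b, v) in enumerate(rows):
            if v is None:
                # Nearest buckets with data on each side, if any.
                left = next((j for j in range(i - 1, -1, -1) if values[j] is not None), None)
                right = next((j for j in range(i + 1, len(values)) if values[j] is not None), None)
                if left is not None and right is not None:
                    frac = (i - left) / (right - left)
                    v = values[left] + frac * (values[right] - values[left])
            out.append((b, v))
        return out
    # Any other option is treated as a numerical constant, as in fill(100).
    return [(b, option if v is None else v) for b, v in rows]

# Hypothetical result rows with one empty 12-minute interval.
rows = [("00:00", 2.0), ("00:12", None), ("00:24", 4.0)]

for option in (100, "linear", "none", "null", "previous"):
    print(option, apply_fill(rows, option))
```

For these rows the empty 00:12 interval becomes 100 with fill(100), 3.0 with fill(linear), 2.0 with fill(previous), disappears with fill(none), and stays null with fill(null).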

Example:

  1. fill(100)

  2. fill(linear)

  3. fill(none)

  4. fill(null)

  5. fill(previous)

