Why is Druid rollup not working as expected?

I have the following Druid ingestion spec, which reads data from Kafka and generates some aggregates. Right now I am only interested in the count. It has two dimensions, PURCHASE_STATUS and STORE_ID. I want the data to be rolled up and bucketed at minute-level granularity.

  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "purchase",
      "timestampSpec": {
        "column": "timestamp",
        "format": "millis",
        "missingValue": "1970-01-01T00:00:00.000Z"
      },
      "dimensionsSpec": {
        "dimensions": [
          {
            "type": "string",
            "name": "PURCHASE_STATUS",
            "multiValueHandling": "SORTED_ARRAY",
            "createBitmapIndex": true
          },
          {
            "type": "string",
            "name": "STORE_ID",
            "multiValueHandling": "SORTED_ARRAY",
            "createBitmapIndex": true
          }
        ],
        "dimensionExclusions": [
          "__time",
          "total_count",
          "timestamp"
        ],
        "includeAllDimensions": false
      },
      "metricsSpec": [
        {
          "type": "count",
          "name": "total_count"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "TEN_MINUTE",
        "queryGranularity": "MINUTE",
        "rollup": true,
        "intervals": []
      },
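
As a rough illustration of what the spec above asks for (this is not Druid's actual implementation), rollup with a `queryGranularity` of `MINUTE` amounts to truncating each event's millisecond timestamp to the start of its minute and counting events per combination of minute, PURCHASE_STATUS and STORE_ID. The `roll_up` helper, the `BASE` timestamp and the sample events below are all hypothetical.

```python
from collections import Counter

MILLIS_PER_MINUTE = 60_000

def roll_up(events):
    """Count events per (minute bucket, PURCHASE_STATUS, STORE_ID)."""
    counts = Counter()
    for event in events:
        ts = event["timestamp"]
        # Truncate the millisecond timestamp to the start of its minute.
        bucket = ts - ts % MILLIS_PER_MINUTE
        counts[(bucket, event["PURCHASE_STATUS"], event["STORE_ID"])] += 1
    return dict(counts)

# Hypothetical sample data: BASE is an epoch-millis value on a minute boundary.
BASE = 1670395020000
events = [
    {"timestamp": BASE + 1_000, "PURCHASE_STATUS": "Status1", "STORE_ID": "1"},
    {"timestamp": BASE + 15_000, "PURCHASE_STATUS": "Status1", "STORE_ID": "1"},
    {"timestamp": BASE + 59_000, "PURCHASE_STATUS": "Status2", "STORE_ID": "1"},
    # This event falls into the next minute, so it gets its own row.
    {"timestamp": BASE + 60_000, "PURCHASE_STATUS": "Status1", "STORE_ID": "1"},
]

rolled = roll_up(events)
for (bucket, status, store_id), total_count in sorted(rolled.items()):
    print(bucket, store_id, status, total_count)
```

Under this model, each combination of minute bucket and dimension values appears exactly once in the output.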

When I query Druid with the following query:

SELECT
__time, STORE_ID, PURCHASE_STATUS, total_count
FROM mz_purchase
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '30' MINUTE ORDER BY __time DESC

I get the following results:

2022-12-07T06:37:00.000Z    1   Status1 3
2022-12-07T06:37:00.000Z    1   Status2 2
2022-12-07T06:37:00.000Z    1   Status1 1
2022-12-07T06:37:00.000Z    1   Status3 23

I am confused about why I am getting multiple rows for the same timestamp bucket and combination of dimensions (for example, two separate rows for Status1 in store 1 at 06:37).
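
To make the expectation concrete, the sketch below sums `total_count` over rows that share the same timestamp bucket and dimension values, which is the grouping a `SUM(total_count)` with `GROUP BY` on those three columns would compute at query time. It only runs over the four rows shown above; `merge_rows` is a hypothetical helper, not part of Druid, and the snippet says nothing about why the rows were stored separately.

```python
from collections import defaultdict

def merge_rows(rows):
    """Sum total_count for rows with the same (time, STORE_ID, PURCHASE_STATUS)."""
    totals = defaultdict(int)
    for time, store_id, status, total_count in rows:
        totals[(time, store_id, status)] += total_count
    return dict(totals)

# The four rows returned by the query in the question.
rows = [
    ("2022-12-07T06:37:00.000Z", "1", "Status1", 3),
    ("2022-12-07T06:37:00.000Z", "1", "Status2", 2),
    ("2022-12-07T06:37:00.000Z", "1", "Status1", 1),
    ("2022-12-07T06:37:00.000Z", "1", "Status3", 23),
]

merged = merge_rows(rows)
for (time, store_id, status), total_count in merged.items():
    print(time, store_id, status, total_count)
```

With these inputs the two Status1 rows collapse into a single row with a count of 4, leaving one row per status.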
