使用聚合模型聚合小时数据,出现前后结果不一致该如何排查

Viewed 24

表结构如下

(
  `timestamp` datetimev2,
  `srcIp` varchar(40),
  `dstIp` varchar(40),
  `bytes` bigint
)
 ENGINE=OLAP
AGGREGATE KEY(`timestamp`, `srcIp`,`dstIp`)
COMMENT 'ip_bytes'
PARTITION BY RANGE(`timestamp`)
DISTRIBUTED BY RANDOM BUCKETS AUTO
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"is_being_synced" = "false",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "day",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-60",
"dynamic_partition.end" = "1",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_allocation" = "tag.location.default: 1",
"dynamic_partition.buckets" = "70",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.history_partition_num" = "3",
"dynamic_partition.hot_partition_num" = "0",
"dynamic_partition.reserved_history_periods" = "NULL",
"dynamic_partition.storage_policy" = "",
"storage_format" = "V2",
"estimate_partition_size" = "350G",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);

每小时使用插入语句向小时表聚合,插入执行时间约1分钟

INSERT INTO
  t_pt1h
SELECT
  date_trunc(`timestamp`,'hour') `timestamp`,
  `srcIp`,
  `dstIp`,
  `bytes`
FROM
  t_pt5m
WHERE
  TIMESTAMP >= '${st}'
  AND TIMESTAMP < '${et}';

通过查询语句对比插入后合并结果,发现业务高峰期前后结果不一致,总流量相差最大约5%:

SELECT
  t1.timestamp,
  t1.bytes bytes_1,
  t2.bytes bytes_2
FROM
  (
    SELECT
      date_trunc(`timestamp`, 'hour') `timestamp`,
      sum(`bytes`) `bytes`
    FROM
      t_pt5m
    WHERE
      TIMESTAMP >= '${st}'
      AND TIMESTAMP < '${et}'
    group by
      1
  ) t1
  join (
    SELECT
      `timestamp`,
      sum(`bytes`) `bytes`
    FROM
      t_pt1h
    WHERE
      TIMESTAMP >= '${st}'
      AND TIMESTAMP < '${et}'
    group by
      1
  ) on t1.timestamp = t2.timestamp;

未丢数时,5分钟表最高数据量约为1.5亿条,
丢数时约2亿条,统计sql如下:

SELECT
  date_trunc(`timestamp`, 'hour') `timestamp`,
  count(distinct `srcIp`, `dstIp`) cnt
FROM
  t_pt5m
WHERE
  TIMESTAMP >= '${st}'
  AND TIMESTAMP < '${et}'
group by
  1

doris be情况如下:

+-----------+-----------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------+------------------+--------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+---------------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------+
| BackendId | Host      | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | TabletNum | DataUsedCapacity | TrashUsedCapcacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | RemoteUsedCapacity | Tag                      | ErrMsg | Version                         | Status                                                                                                                        | HeartbeatFailureCounter | NodeRole    |
+-----------+-----------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------+------------------+--------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+---------------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------+
| 10007     | 10.1.0.26 | 9050          | 9060   | 8040     | 8060     | 2025-01-03 14:06:19 | 2025-09-25 11:51:23 | true  | false                | 48967     | 4.964 TB         | 194.305 GB         | 1.364 TB      | 6.872 TB      | 80.15 % | 87.83 %        | 0.000              | {"location" : "default"} |        | selectdb-doris-2.0.8-ebab48122d | {"lastSuccessReportTabletsTime":"2025-09-25 11:50:25","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix         |
| 10008     | 10.1.0.23 | 9050          | 9060   | 8040     | 8060     | 2025-01-03 14:03:04 | 2025-09-25 11:51:23 | true  | false                | 49027     | 4.758 TB         | 148.052 GB         | 1.615 TB      | 6.872 TB      | 76.50 % | 82.85 %        | 0.000              | {"location" : "default"} |        | selectdb-doris-2.0.8-ebab48122d | {"lastSuccessReportTabletsTime":"2025-09-25 11:50:47","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix         |
| 10009     | 10.1.0.22 | 9050          | 9060   | 8040     | 8060     | 2025-01-03 14:01:49 | 2025-09-25 11:51:23 | true  | false                | 0         | 0.000            | 0.000              | 6.523 TB      | 6.872 TB      | 5.08 %  | 5.08 %         | 0.000              | {"location" : "default"} |        | selectdb-doris-2.0.8-ebab48122d | {"lastSuccessReportTabletsTime":"2025-09-25 11:50:52","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | computation |
| 10010     | 10.1.0.24 | 9050          | 9060   | 8040     | 8060     | 2025-01-03 14:04:24 | 2025-09-25 11:51:23 | true  | false                | 49516     | 5.029 TB         | 82.514 GB          | 1.408 TB      | 6.872 TB      | 79.51 % | 86.22 %        | 0.000              | {"location" : "default"} |        | selectdb-doris-2.0.8-ebab48122d | {"lastSuccessReportTabletsTime":"2025-09-25 11:50:51","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix         |
| 10011     | 10.1.0.25 | 9050          | 9060   | 8040     | 8060     | 2025-01-03 14:05:29 | 2025-09-25 11:51:23 | true  | false                | 49066     | 4.950 TB         | 98.978 GB          | 1.474 TB      | 6.872 TB      | 78.55 % | 83.13 %        | 0.000              | {"location" : "default"} |        | selectdb-doris-2.0.8-ebab48122d | {"lastSuccessReportTabletsTime":"2025-09-25 11:50:24","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix         |
| 10012     | 10.1.0.21 | 9050          | 9060   | 8040     | 8060     | 2025-01-03 13:59:54 | 2025-09-25 11:51:23 | true  | false                | 48993     | 4.926 TB         | 179.233 GB         | 1.419 TB      | 6.872 TB      | 79.35 % | 86.72 %        | 0.000              | {"location" : "default"} |        | selectdb-doris-2.0.8-ebab48122d | {"lastSuccessReportTabletsTime":"2025-09-25 11:51:08","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix         |
+-----------+-----------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------+------------------+--------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+---------------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------+
1 Answers

建议老师升级到 2.1.11 版本进行测试,2.0 的问题不好跟进了呢。