请教各位大佬,如何使用doris 统计k线数据

Viewed 11

版本号
Doris version doris-3.0.4-rc02-39f9074cec 存算一体

需求
1:有毫秒秒的每个股票的实时交易记录和价格
2:使用同步物化视图统计秒级、分钟、小时、天和自定义时间级别,自定义起止时间的OHLC K线数据
3:需要实时聚合统计对实时性要求较高

遇到的问题
1:同步物化视图无法使用窗口函数LAST_VALUE

2:使用 MIN_BY(price, create_time),MAX_BY(price, create_time),替代可进行创建物化视图,但是在查询时会报错min_by_merge(mva_GENERIC__min_by_state(...)) is not supported

3:同步物化视图无法使用 WHERE stock_id = 'xxx'
AND create_time >= '2025-08-02' 中的 create_time字段限制查询起的时间缩减数据范围,会无法使用同步物化视图加速

4:去掉 AND create_time >= '2025-08-02' 时物化视图可用,但会有 min_by_merge(mva_GENERIC__min_by_state(...)) is not supported

5: bitmap_union_count(to_bitmap(CASE WHEN trade_type = 1 THEN stock_id_hash_code ELSE NULL END)) 统计的买卖交易次数不等于交易总次数

建表语句
CREATE TABLE trade_list (
stock_id varchar(256) NOT NULL COMMENT "股票代码",
price decimal(38,30) NOT NULL COMMENT "价格",
trade_type tinyint NOT NULL COMMENT "1买入 2卖出",
create_time datetime(3) NOT NULL COMMENT "交易时间",
stock_id_hash_code bigint NOT NULL COMMENT "物化视图hashcode后ID 可负数",

) ENGINE=OLAP
DUPLICATE KEY(stock_id)
COMMENT '交易明细'
PARTITION BY RANGE(create_time) ()
DISTRIBUTED BY HASH( stock_id) BUCKETS 64
PROPERTIES (
"replication_allocation" = "tag.location.default: 1", -- 单副本,后期可增加
"dynamic_partition.enable" = "true", --动态分区
"dynamic_partition.time_unit" = "DAY", -- 按天分区
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-2147483648", -- 永久保留所有历史数据
"dynamic_partition.end" = "3", -- 预创建7天
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "64",
"storage_medium" = "SSD", -- 使用SSD提升性能
"compression" = "ZSTD", -- 压缩算法
"bloom_filter_columns" = "stock_id"
);

同步物化视图
CREATE MATERIALIZED VIEW mv_trade_list AS
SELECT
stock_id,
date_trunc(create_time, 'second'),
MIN_BY(price, create_time),
MAX(price),
MIN(price),
MAX_BY(price, create_time),
bitmap_union_count(to_bitmap(stock_id_hash_code)),
bitmap_union_count(to_bitmap(CASE WHEN trade_type = 1 THEN stock_id_hash_code ELSE NULL END)),
bitmap_union_count(to_bitmap(CASE WHEN trade_type = 2 THEN stock_id_hash_code ELSE NULL END))
FROM trade_list
GROUP BY pair_address, date_trunc(create_time, 'second');

查询语句
EXPLAIN
SELECT
date_trunc(create_time, 'second'),
MIN_BY(price, create_time),
MAX(price),
MIN(price),
MAX_BY(price, create_time),
bitmap_union_count(to_bitmap(stock_id_hash_code)),
bitmap_union_count(to_bitmap(CASE WHEN swap_type = 0 THEN stock_id_hash_code ELSE NULL END)),
bitmap_union_count(to_bitmap(CASE WHEN swap_type = 1 THEN stock_id_hash_code ELSE NULL END))
FROM trade_list
WHERE stock_id = 'xxxx'
AND create_time >= '2025-08-02'
GROUP BY stock_id, date_trunc(create_time, 'second')
ORDER BY date_trunc(create_time, 'second') DESC
LIMIT 100;

0 Answers