Doris job intermittently fails with "sync filter size meet error"


org.apache.doris.job.exception.JobException: errCode = 2, detailMessage = (10.16.56.186)[INTERNAL_ERROR]sync filter size meet error, filter: RuntimeFilter: (id = 3, type = in_or_bloomfilter, is_broadcast: false, ignored: true, disabled: false, build_bf_cardinality: true, dependency: HASH_JOIN_SINK_OPERATOR_FINISH_DEPENDENCY: id=12, block_task=0, ready=true, _always_ready=false, count=0, synced_size: -1, has_local_target: false, has_remote_target: true, error_msg: []

Version: doris-3.0.4-rc02-39f9074cec, cluster of 3 FE + 3 BE.

I created the following table and job:

drop table if exists global_dw.player_wallet_detail_snap_his;
CREATE TABLE if not exists global_dw.player_wallet_detail_snap_his (
event_day_local date not null comment 'time',
currency varchar(48) NULL ,
player_id bigint NULL ,
flink_server_region varchar(765) NULL ,
username varchar(192) NULL ,
wallet decimal(38,20) NULL ,
temp_wallet decimal(38,20) NULL ,
safe_wallet decimal(38,20) NULL,
lock_wallet decimal(38,20) NULL ,
current_status int NULL ,
tid bigint NULL ,
channel bigint NULL ,
create_time datetime NULL ,
create_time_longs bigint NULL,
flink_server_time_zone varchar(765) NULL ,
flink_server_update_mills bigint NULL ,
player_info_id bigint NULL ,
snap_time datetime null
) ENGINE=OLAP
UNIQUE KEY(event_day_local, currency, player_id, flink_server_region)
COMMENT 'xxxxx'
DISTRIBUTED BY HASH(event_day_local, currency, player_id, flink_server_region) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 3"
);

create job job_wallet_detail_snap_his
on SCHEDULE EVERY 15 MINUTE STARTS '2026-01-04 00:00:01'
do
insert into global_dw.player_wallet_detail_snap_his
select
DATE_SUB(DATE_FORMAT(case when time_zone = '+05:45' then CONVERT_TZ(now(), @@session.time_zone, 'Asia/Kathmandu') else CONVERT_TZ(now(), @@session.time_zone, time_zone) end, '%Y-%m-%d'), INTERVAL 1 DAY) as event_day_local,
pwv.currency,
pwv.player_id,
pwv.flink_server_region,
pwv.username,
pwv.wallet,
pwv.temp_wallet,
pwv.safe_wallet,
pwv.lock_wallet,
pwv.current_status,
pwv.tid,
pwv.channel,
pwv.create_time,
pwv.create_time_longs,
pwv.flink_server_time_zone,
pwv.flink_server_update_mills,
pwv.player_id as player_info_id,
now()
from global_ods.player_wallet_detail_v pwv
where time_zone in (
select time_zone -- , CONVERT_TZ(now(), @@session.time_zone, CASE WHEN max(time_zone) = '+05:45' THEN 'Asia/Kathmandu' ELSE max(time_zone) END)
from global_ods_config.sys_country_v scv
group by time_zone
having hour(CONVERT_TZ(now(), @@session.time_zone, CASE WHEN max(time_zone) = '+05:45' THEN 'Asia/Kathmandu' ELSE max(time_zone) END)) = 0
and MINUTE(CONVERT_TZ(now(), @@session.time_zone, CASE WHEN max(time_zone) = '+05:45' THEN 'Asia/Kathmandu' ELSE max(time_zone) END)) < 15
)
;

On Saturday afternoon two tasks suddenly failed. The exception log:
org.apache.doris.job.exception.JobException: errCode = 2, detailMessage = (10.16.56.186)[INTERNAL_ERROR]sync filter size meet error, filter: RuntimeFilter: (id = 3, type = in_or_bloomfilter, is_broadcast: false, ignored: true, disabled: false, build_bf_cardinality: true, dependency: HASH_JOIN_SINK_OPERATOR_FINISH_DEPENDENCY: id=12, block_task=0, ready=true, _always_ready=false, count=0, synced_size: -1, has_local_target: false, has_remote_target: true, error_msg: []

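The job runs every 15 minutes. It ran without a single failure for the first two days and then failed suddenly. Nobody touched it over the weekend.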

After those two failures, the following tasks ran normally again with no further errors.
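For anyone trying to line up the two failed runs with the successful ones around them, the task history can be listed with the `tasks` table-valued function. This is only a sketch: the column names below (`TaskId`, `Status`, `ErrorMsg`, `CreateTime`, `FinishTime`, `JobName`) are what I would expect from `tasks("type"="insert")` and should be treated as assumptions to check against your version.

```sql
-- List recent runs of the job, newest first, to see exactly which
-- scheduled executions failed and what ran before and after them.
-- Column names are assumptions; run "select * from tasks(...)" first to confirm.
select TaskId, Status, ErrorMsg, CreateTime, FinishTime
from tasks("type"="insert")
where JobName = 'job_wallet_detail_snap_his'
order by CreateTime desc
limit 200;
```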

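This morning I went through the logs to look for the cause. I replaced now() with the time of the failure and ran the SELECT manually: it executed normally and returned no rows.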
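A minimal sketch of that replay, reduced to the time-zone subquery so it is easy to see whether any zone was inside its 00:00 to 00:14 local window at that moment. The timestamp literal is a placeholder, not the real failure time; substitute the start time of the failed task.

```sql
-- Replay of the job's subquery with now() pinned to a fixed instant.
-- '2026-01-10 15:00:01' is a hypothetical value; use the failed task's start time.
select scv.time_zone
from global_ods_config.sys_country_v scv
group by scv.time_zone
having hour(CONVERT_TZ(cast('2026-01-10 15:00:01' as datetime), @@session.time_zone,
            CASE WHEN max(time_zone) = '+05:45' THEN 'Asia/Kathmandu' ELSE max(time_zone) END)) = 0
   and minute(CONVERT_TZ(cast('2026-01-10 15:00:01' as datetime), @@session.time_zone,
            CASE WHEN max(time_zone) = '+05:45' THEN 'Asia/Kathmandu' ELSE max(time_zone) END)) < 15;
```

If this returns no rows, the IN subquery is empty for that run, which matches the outer SELECT returning nothing when executed by hand.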

It looks intermittent and is hard to reproduce.

https://github.com/apache/doris/pull/37103/files

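Searching turned up the PR above, whose exception looks similar to mine. However, according to the official site that fix was already released in 2.1.5 and 3.0.0, back in July/August 2024.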

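Yet the problem still shows up on 3.0.4, which is a later version. I have written ten to twenty INSERT jobs like this one and never hit this exception in over a year; this is the first time.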
