Query causes Doris to crash

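Question: Why did the Doris BE die, and how can this be avoided?
Doris version: 2.1.9
Cluster: 5 BEs and 1 FE, each with 16 cores and 64 GB of memory.
Symptom: One of the Doris BE nodes suddenly went down. The BE log (be.out) shows the following errors: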

SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
F20251216 11:15:04.502287 3530874 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576153 3530870 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576159 3530875 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576153 3530870 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576159 3530875 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576153 3530870 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576159 3530875 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.605412 3530871 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576153 3530870 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576159 3530875 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.605412 3530871 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:05.315984 3530873 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576153 3530870 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576159 3530875 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.605412 3530871 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:05.315984 3530873 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:05.358039 3530872 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576153 3530870 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.576159 3530875 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:04.605412 3530871 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:05.315984 3530873 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:05.358039 3530872 aggregate_function_sort.h:92] Check failed: st.ok() F20251216 11:15:05.392037 3530869 aggregate_function_sort.h:92] Check failed: st.ok() 
*** Check failure stack trace: ***
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
    @     0x56042b897466  google::LogMessage::SendToLog()
    @     0x56042b893eb0  google::LogMessage::Flush()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042b897ca9  google::LogMessageFatal::~LogMessageFatal()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
    @     0x56042612037e  doris::vectorized::AggregateFunctionSortData::deserialize()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
    @     0x56042611f6a9  doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042611f4d0  doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_vec()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
    @     0x56042b2e4059  doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<>()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x56042b2e65b0  doris::pipeline::AggSinkLocalState::Executor<>::execute()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x56042b2c16e9  doris::pipeline::AggSinkOperatorX::sink()
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x56042b8680c6  doris::pipeline::PipelineXTask::execute()
    @     0x56042b872f7c  doris::pipeline::TaskScheduler::_do_work()
    @     0x560421d7bf18  doris::ThreadPool::dispatch_thread()
    @     0x560421d712a1  doris::Thread::supervise_thread()
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @              (nil)  (unknown)
    @              (nil)  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @              (nil)  (unknown)
*** Query id: 3f0029d0e9b34f06-a6cede15d5c428bb ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1765854905 (unix time) try "date -d @1765854905" if you are using GNU date ***
*** Current BE git commitID: 3390475e02 ***
*** SIGABRT unknown detail explain (@0x3e90035d7d1) received by PID 3528657 (TID 3530874 OR 0x7fee06bf5640) from PID 3528657; stack trace: ***
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a51056a  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @     0x7ffa8a592ac0  (unknown)
    @              (nil)  (unknown)
    @              (nil)  (unknown)
    @              (nil)  (unknown)
    @              (nil)  (unknown)
    @              (nil)  (unknown)
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_release/doris/be/src/common/signal_handler.h:421
 1# 0x00007FFA8A4C5EF0 in /usr/lib64/libc.so.6
 2# 0x00007FFA8A51213F in /usr/lib64/libc.so.6
 3# gsignal in /usr/lib64/libc.so.6
 4# abort in /usr/lib64/libc.so.6
 5# 0x000056042B8A1C7D in /msun/app/doris/be/lib/doris_be
 6# 0x000056042B89437A in /msun/app/doris/be/lib/doris_be
 7# google::LogMessage::SendToLog() in /msun/app/doris/be/lib/doris_be
 8# google::LogMessage::Flush() in /msun/app/doris/be/lib/doris_be
 9# google::LogMessageFatal::~LogMessageFatal() in /msun/app/doris/be/lib/doris_be
10# doris::vectorized::AggregateFunctionSortData::deserialize(doris::vectorized::BufferReadable&) in /msun/app/doris/be/lib/doris_be
11# doris::vectorized::IAggregateFunctionDataHelper<doris::vectorized::AggregateFunctionSortData, doris::vectorized::AggregateFunctionSort<doris::vectorized::AggregateFunctionSortData> >::deserialize_and_merge(char*, char*, doris::vectorized::BufferReadable&, doris::vectorized::Arena*) const at /home/zcp/repo_center/doris_release/doris/be/src/vec/aggregate_functions/aggregate_function.h:522
12# doris::vectorized::IAggregateFunctionHelper<doris::vectorized::AggregateFunctionSort<doris::vectorized::AggregateFunctionSortData> >::deserialize_and_merge_vec(char* const*, unsigned long, char*, doris::vectorized::IColumn const*, doris::vectorized::Arena*, unsigned long) const at /home/zcp/repo_center/doris_release/doris/be/src/vec/aggregate_functions/aggregate_function.h:387
13# doris::Status doris::pipeline::AggSinkLocalState::_merge_with_serialized_key_helper<false, false>(doris::vectorized::Block*) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:350
14# doris::pipeline::AggSinkLocalState::Executor<false, true>::execute(doris::pipeline::AggSinkLocalState*, doris::vectorized::Block*) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/aggregation_sink_operator.h:79
15# doris::pipeline::AggSinkOperatorX::sink(doris::RuntimeState*, doris::vectorized::Block*, bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:744
16# doris::pipeline::PipelineXTask::execute(bool*) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_x/pipeline_x_task.cpp:380
17# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/task_scheduler.cpp:347
18# doris::ThreadPool::dispatch_thread() in /msun/app/doris/be/lib/doris_be
19# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_release/doris/be/src/util/thread.cpp:499
20# 0x00007FFA8A51056A in /usr/lib64/libc.so.6
21# 0x00007FFA8A592AC0 in /usr/lib64/libc.so.6
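
Because the traces from several threads are interleaved, it is hard to see at a glance how many threads hit the same check. Below is a minimal sketch for pulling the thread ids out of the `F...` lines, assuming the standard glog prefix layout (date, time, thread id, file:line). The names `FATAL_RE` and `failing_threads` are made up for illustration, and `sample` is just two lines copied from the log above.

```python
import re

# glog fatal prefix: F<yyyymmdd> <hh:mm:ss.micros> <thread id> <file>:<line>]
FATAL_RE = re.compile(r"F(\d{8}) (\d{2}:\d{2}:\d{2}\.\d{6}) (\d+) ([\w.]+):(\d+)\]")

def failing_threads(log_text: str) -> dict:
    """Map each thread id to the first timestamp at which it logged a fatal check."""
    first_seen = {}
    for _date, ts, tid, _file, _line in FATAL_RE.findall(log_text):
        # Later lines repeat earlier messages, so keep only the first sighting.
        first_seen.setdefault(tid, ts)
    return first_seen

# Two lines copied from the be.out excerpt above.
sample = (
    "F20251216 11:15:04.502287 3530874 aggregate_function_sort.h:92] Check failed: st.ok() \n"
    "F20251216 11:15:04.502298 3530867 aggregate_function_sort.h:92] Check failed: st.ok() "
    "F20251216 11:15:04.576153 3530870 aggregate_function_sort.h:92] Check failed: st.ok() "
    "F20251216 11:15:04.576159 3530875 aggregate_function_sort.h:92] Check failed: st.ok() \n"
)

for tid, ts in failing_threads(sample).items():
    print(tid, ts)
```

Reading the full excerpt by eye, eight distinct thread ids appear in the `Check failed` lines (3530867 and 3530869 through 3530875), all at `aggregate_function_sort.h:92`.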

StdoutLogger 2025-12-16 11:18:41,748 Start time: Tue Dec 16 11:18:41 AM CST 2025
INFO: java_cmd /msun/app/jdk/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/msun/app/doris/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/msun/app/doris/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/msun/app/doris/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
StdoutLogger 2025-12-16 11:20:08,862 Start time: Tue Dec 16 11:20:08 AM CST 2025
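
To line up the abort with the restart lines above, the epoch value from the `Aborted at 1765854905` line can be converted to local time. This is a small sketch that assumes the host is on China Standard Time (UTC+8), which the `CST` in the restart lines suggests. The helper name `epoch_to_cst` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the BE host runs on China Standard Time (UTC+8),
# as suggested by the "CST" in the restart log lines.
CST = timezone(timedelta(hours=8))

def epoch_to_cst(epoch_seconds: int) -> str:
    """Format a unix timestamp as CST wall-clock time."""
    return datetime.fromtimestamp(epoch_seconds, tz=CST).strftime("%Y-%m-%d %H:%M:%S")

# Value taken from the "Aborted at" line in be.out.
print(epoch_to_cst(1765854905))  # prints 2025-12-16 11:15:05
```

Under that assumption the abort falls at 11:15:05, which agrees with the `F20251216 11:15:0x` fatal lines and is a few minutes before the `Start time` entries at 11:18:41 and 11:20:08.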

SQL
Using the query_id, I pulled the query out of the audit log and ran it again to try to reproduce the error, but could not reproduce it. The SQL:

SELECT DISTINCT
a.visit_card_no  就诊号,
a.pat_name 姓名,
p.idnum 身份证号,
p.birth_date 出生日期,
'' AS 发病时间,
p.sex_name 性别,
p.nation_name 民族,
p.marriage_status_name 婚姻状况,
a.seedoc_time 就诊时间,
a.seedoc_dept_name  就诊科室,
a.seedoc_doc_name 医生姓名,
case when again_visit_times=0 then '初诊' else '复诊' end 初(复)诊,
SPLIT_PART(mm.emr_contt, '/', 1) 收缩压,
SPLIT_PART(mm.emr_contt, '/', 2) 舒张压,
d.out_dgmss_name  诊断名称,
d.out_dgmss_code 诊断编码,
s.total_amount 门诊总费用,
jszf.医保报销 as 医保报销,
jszf.医保卡金 as 医保卡金, 
jszf.微信卫生院 as 微信卫生院, 
jszf.现金 as 现金,
p.telephone 联系电话,
p.specific_addr_name 家庭地址,
a.age AS 年龄,
a.charge_class_name AS 费别,
a.visit_record_id
FROM    dw_dwb_sv_os_out_seedoc_info a
left join (
    select  out_visit_record_id,sum(total_amount) total_amount from dw_dwb_rs_mi_out_setlmt_d GROUP by out_visit_record_id
)s on a.visit_record_id=s.out_visit_record_id
LEFT JOIN dw_dwb_cz_ca_pat_info p ON a.pat_id = p.pat_id
left join  dw_dwb_sv_os_out_emr_m m on a.visit_record_id=m.out_visit_record_id
left join dw_dwb_sv_os_out_emr mm on m.out_emr_m_id=mm.out_emr_m_id and emr_contt_type_name='血压' and emr_contt !='/'
left join (
    SELECT visit_record_id as visit_record_id, GROUP_CONCAT(out_dgmss_code, ',' ORDER BY out_dgmss_code) AS out_dgmss_code
    , GROUP_CONCAT(out_dgmss_name, ',' ORDER BY out_dgmss_name) as out_dgmss_name
    FROM (
        SELECT DISTINCT visit_record_id, out_dgmss_code AS out_dgmss_code, out_dgmss_name AS out_dgmss_name
        FROM dw_dwb_sv_os_out_dgmss_info
        WHERE out_dgmss_code IS NOT NULL
        AND out_dgmss_name IS NOT NULL
    ) aa
    GROUP BY visit_record_id 
)as d on a.visit_record_id=d.visit_record_id 
LEFT JOIN dwd_dwd_dept t on a.seedoc_dept_id=t.dept_id
left JOIN (
    select B.register_id,SUM(case WHEN pay_way_id='2' then amount else 0 END) AS 医保报销,
    SUM(case WHEN pay_way_id='5' then amount else 0 END) AS 医保卡金,
    SUM(case WHEN pay_way_id='5044' then amount else 0 END) AS 微信卫生院,
    SUM(case WHEN pay_way_id NOT IN (2,5,5044) then amount else 0 END) AS 现金
    from dw_dwb_rs_mi_out_setlmt_pay_way A 
    LEFT JOIN dw_dwb_rs_mi_out_setlmt_m B ON A.out_setlmt_m_id=B.out_setlmt_m_id
    GROUP BY B.register_id
)jszf ON jszf.register_id=a.register_id
WHERE   a.seedoc_time > '2025-01-01 00:00:00'  and a.seedoc_time <'2025-03-31 23:59:59'  and t.dept_type_id !='11' 
        AND  1 = 1 
        AND  1 = 1 
        AND  1 = 1 
and a.hos_id= '10630009'
  and a.register_invalid_flag_id='0'
order by a.seedoc_dept_name