- Doris version: doris-2.1.5
The error is as follows:
JobId: 49579717
Label: label_20250812_0900_29896
State: CANCELLED
Progress: 24.14% (4475/18536)
Type: BROKER
EtlInfo: NULL
TaskInfo: cluster:hdfs_cluster; timeout(s):14400; max_filter_ratio:0.0; priority:NORMAL
ErrorMsg: type:LOAD_RUN_FAIL; msg:send fragments failed. io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 34.999926650s. Name resolution delay 0.000000000 seconds. [closed=[], committed=[buffered_nanos=2325753, remote_addr=10.****.209/10.****.209:8060]], host: 10.****.209
CreateTime: 2025-08-12 09:55:17
EtlStartTime: 2025-08-12 09:58:37
EtlFinishTime: 2025-08-12 09:58:37
LoadStartTime: 2025-08-12 09:58:37
LoadFinishTime: 2025-08-12 10:01:33
URL: NULL
JobDetails: {"Unfinished backends":{"2ca43fdb72e4c92-962130f98851ce51":[10237,10043,10238,10240,10041,10241,10239,10242,10133,10364,10301,10302,10362,10300,10263,10244,10243]},"ScannedRows":2854091497,"TaskNumber":1,"LoadBytes":937517390345,"All backends":{"2ca43fdb72e4c92-962130f98851ce51":[10363,27244286,10231,10233,10232,10227,10229,10226,10228,10235,10234,10037,10236,10237,10129,10366,10137,10153,10144,10145,10147,10146,10148,10356,10152,10121,10122,10313,10150,10151,10124,10317,10337,10123,10318,10149,10126,10125,10128,10127,10120,10316,10140,10141,10142,10139,10138,10358,10357,10143,10117,10315,10116,10119,10118,10114,10115,10076,10314,10113,10136,10057,10308,10056,10309,10310,10311,10312,10359,10051,10052,10049,10303,10050,10054,10304,10305,10053,10306,10055,10307,10368,10367,10134,10360,10047,10361,10046,10048,10043,10238,10040,10039,10045,10044,10038,10240,10041,10241,10239,10042,10242,10135,10365,10130,10133,10364,10301,10302,10362,10300,10263,10244,10243]},"FileNumber":8769,"FileSize":1792700575556}
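The JobDetails field above is hard to read by eye, so here is a minimal Python sketch that summarizes it. The function name `summarize_job_details` and the `SAMPLE` constant are hypothetical. The "Unfinished backends" list, `LoadBytes`, and `FileSize` are copied from the output above, while the "All backends" list is abbreviated for brevity. The ratio of `LoadBytes` to `FileSize` is only a rough indicator of how far the load got, not an official Doris metric.

```python
import json

# Abbreviated copy of the JobDetails value shown above: the "All backends"
# list is shortened here; the other fields are copied as-is.
SAMPLE = json.dumps({
    "Unfinished backends": {
        "2ca43fdb72e4c92-962130f98851ce51": [
            10237, 10043, 10238, 10240, 10041, 10241, 10239, 10242, 10133,
            10364, 10301, 10302, 10362, 10300, 10263, 10244, 10243,
        ]
    },
    "ScannedRows": 2854091497,
    "TaskNumber": 1,
    "LoadBytes": 937517390345,
    "All backends": {
        "2ca43fdb72e4c92-962130f98851ce51": [10363, 27244286, 10231, 10237, 10043]
    },
    "FileNumber": 8769,
    "FileSize": 1792700575556,
})


def summarize_job_details(job_details: str) -> dict:
    """Summarize a JobDetails JSON string (hypothetical helper)."""
    details = json.loads(job_details)
    per_load = {}
    for load_id, all_backends in details["All backends"].items():
        unfinished = details["Unfinished backends"].get(load_id, [])
        per_load[load_id] = {
            "total_backends": len(set(all_backends)),
            "unfinished_backends": sorted(unfinished),
        }
    return {
        # Rough progress indicator only: bytes loaded over total file size.
        "loaded_ratio": details["LoadBytes"] / details["FileSize"],
        "per_load": per_load,
    }


if __name__ == "__main__":
    print(json.dumps(summarize_job_details(SAMPLE), indent=2))
```

With the full JobDetails string pasted in place of `SAMPLE`, this lists which backend IDs had not finished when the job was cancelled, which helps decide which BE nodes' logs to check first.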
I checked the logs on the corresponding BE node and found the errors below, but I don't understand the cause.
I20250812 10:01:33.395941 465993 plan_fragment_executor.cpp:553] PlanFragmentExecutor::cancel 2ca43fdb72e4c92-962130f98851ce51|2ca43fdb72e4c92-962130f98851cf79 reason 3 error msg
W20250812 10:01:33.397295 465903 runtime_state.h:209] Task is cancelled, instance: 2ca43fdb72e4c92-962130f98851ce51|2ca43fdb72e4c92-962130f98851cf7a, st = [CANCELLED]
I20250812 10:01:33.394969 465879 plan_fragment_executor.cpp:553] PlanFragmentExecutor::cancel 2ca43fdb72e4c92-962130f98851ce51|2ca43fdb72e4c92-962130f98851cf77 reason 3 error msg
W20250812 10:01:33.398072 465879 runtime_state.h:209] Task is cancelled, instance: 2ca43fdb72e4c92-962130f98851ce51|2ca43fdb72e4c92-962130f98851cf77, st = [CANCELLED]
W20250812 10:01:33.398507 465879 status.h:412] meet error status: [ABORTED]
0# doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:0
1# doris::FragmentMgr::cancel_instance(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/runtime/fragment_mgr.cpp:0
2# std::_Function_handler<void (), doris::PInternalServiceImpl::cancel_plan_fragment(google::protobuf::RpcController*, doris::PCancelPlanFragmentRequest const*, doris::PCancelPlanFragmentResult*, google::protobuf::Closure*)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
3# doris::WorkThreadPool<false>::work_thread(int) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
4# execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
5# ?
6# ?
W20250812 10:01:33.395648 465950 runtime_state.h:209] Task is cancelled, instance: 2ca43fdb72e4c92-962130f98851ce51|2ca43fdb72e4c92-962130f98851cf7b, st = [CANCELLED]
W20250812 10:01:33.397485 465993 runtime_state.h:209] Task is cancelled, instance: 2ca43fdb72e4c92-962130f98851ce51|2ca43fdb72e4c92-962130f98851cf79, st = [CANCELLED]
W20250812 10:01:33.399207 465950 status.h:412] meet error status: [ABORTED]
0# doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:0
1# doris::FragmentMgr::cancel_instance(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/runtime/fragment_mgr.cpp:0
2# std::_Function_handler<void (), doris::PInternalServiceImpl::cancel_plan_fragment(google::protobuf::RpcController*, doris::PCancelPlanFragmentRequest const*, doris::PCancelPlanFragmentResult*, google::protobuf::Closure*)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
3# doris::WorkThreadPool<false>::work_thread(int) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
4# execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
5# ?
6# ?
W20250812 10:01:33.399487 465993 status.h:412] meet error status: [ABORTED]
0# doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:0
1# doris::FragmentMgr::cancel_instance(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/runtime/fragment_mgr.cpp:0
2# std::_Function_handler<void (), doris::PInternalServiceImpl::cancel_plan_fragment(google::protobuf::RpcController*, doris::PCancelPlanFragmentRequest const*, doris::PCancelPlanFragmentResult*, google::protobuf::Closure*)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
3# doris::WorkThreadPool<false>::work_thread(int) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
4# execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
5# ?
6# ?
W20250812 10:01:33.400547 465903 status.h:412] meet error status: [ABORTED]
0# doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:0
1# doris::FragmentMgr::cancel_instance(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/runtime/fragment_mgr.cpp:0
W20250812 10:01:33.433724 464577 vtablet_writer.cpp:587] cancel node channel VNodeChannel[27745951-10358], load_id=2ca43fdb72e4c92-962130f98851ce51, txn_id=34931676, node=10.******.40:8060, error message: [CANCELLED]Cancelled
W20250812 10:01:33.434144 464577 vtablet_writer.cpp:587] cancel node channel VNodeChannel[27745951-10231], load_id=2ca43fdb72e4c92-962130f98851ce51, txn_id=34931676, node=10.******.167:8060, error message: [CANCELLED]Cancelled
W20250812 10:01:33.438495 466080 status.h:431] meet error status: [INTERNAL_ERROR]PStatus: (10.******.220)[INTERNAL_ERROR]fail to add batch in load channel. unknown load_id=02ca43fdb72e4c92-962130f98851ce51
0# doris::Status doris::Status::create<true>(doris::PStatus const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1# doris::vectorized::VNodeChannel::_add_block_success_callback(doris::PTabletWriterAddBlockResult const&, doris::vectorized::WriteBlockCallbackContext const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:481
2# std::_Function_handler<void (doris::PTabletWriterAddBlockResult const&, doris::vectorized::WriteBlockCallbackContext const&), doris::vectorized::VNodeChannel::init(doris::RuntimeState*)::$_1>::_M_invoke(std::_Any_data const&, doris::PTabletWriterAddBlockResult const&, doris::vectorized::WriteBlockCallbackContext const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98
3# doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult>::call() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:0
4# doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::Run() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98
5# brpc::Controller::EndRPC(brpc::Controller::CompletionInfo const&)
6# brpc::policy::ProcessRpcResponse(brpc::InputMessageBase*)
7# brpc::ProcessInputMessage(void*)
8# bthread::TaskGroup::task_runner(long)
9# bthread_make_fcontext
W20250812 10:01:33.439035 466080 vtablet_writer.cpp:587] cancel node channel VNodeChannel[27745951-10317], load_id=2ca43fdb72e4c92-962130f98851ce51, txn_id=34931676, node=10.******.220:8060, error message: VNodeChannel[27745951-10317], load_id=2ca43fdb72e4c92-962130f98851ce51, txn_id=34931676, node=10.******.220:8060, add batch req success but status isn't ok, err: [INTERNAL_ERROR]PStatus: (10.******.220)[INTERNAL_ERROR]fail to add batch in load channel. unknown load_id=02ca43fdb72e4c92-962130f98851ce51
0# doris::Status doris::Status::create<true>(doris::PStatus const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1# doris::vectorized::VNodeChannel::_add_block_success_callback(doris::PTabletWriterAddBlockResult const&, doris::vectorized::WriteBlockCallbackContext const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:481
2# std::_Function_handler<void (doris::PTabletWriterAddBlockResult const&, doris::vectorized::WriteBlockCallbackContext const&), doris::vectorized::VNodeChannel::init(doris::RuntimeState*)::$_1>::_M_invoke(std::_Any_data const&, doris::PTabletWriterAddBlockResult const&, doris::vectorized::WriteBlockCallbackContext const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98
3# doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult>::call() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:0
4# doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::Run() at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98
5# brpc::Controller::EndRPC(brpc::Controller::CompletionInfo const&)
6# brpc::policy::ProcessRpcResponse(brpc::InputMessageBase*)
7# brpc::ProcessInputMessage(void*)
8# bthread::TaskGroup::task_runner(long)
9# bthread_make_fcontext
W20250812 10:01:33.439509 466080 status.h:431] meet error status: [INTERNAL_ERROR]PStatus: (10.******.220)[INTERNAL_ERROR]fail to add batch in load channel. unknown load_id=02ca43fdb72e4c92-962130f98851ce51
0# doris::Status doris::Status::create<true>(doris::PStatus const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1# void doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::_process_status<doris::PTabletWriterAddBlockResult>(doris::PTabletWriterAddBlockResult*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:481
2# doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::Run() at /home/zcp/repo_center/doris_release/doris/be/src/util/ref_count_closure.h:91
3# brpc::Controller::EndRPC(brpc::Controller::CompletionInfo const&)
4# brpc::policy::ProcessRpcResponse(brpc::InputMessageBase*)
5# brpc::ProcessInputMessage(void*)
6# bthread::TaskGroup::task_runner(long)
7# bthread_make_fcontext
W20250812 10:01:33.439718 466080 ref_count_closure.h:119] RPC meet error status: [INTERNAL_ERROR]PStatus: (10.******.220)[INTERNAL_ERROR]fail to add batch in load channel. unknown load_id=02ca43fdb72e4c92-962130f98851ce51
0# doris::Status doris::Status::create<true>(doris::PStatus const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1# void doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::_process_status<doris::PTabletWriterAddBlockResult>(doris::PTabletWriterAddBlockResult*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:481
2# doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::Run() at /home/zcp/repo_center/doris_release/doris/be/src/util/ref_count_closure.h:91
3# brpc::Controller::EndRPC(brpc::Controller::CompletionInfo const&)
4# brpc::policy::ProcessRpcResponse(brpc::InputMessageBase*)
5# brpc::ProcessInputMessage(void*)
6# bthread::TaskGroup::task_runner(long)
7# bthread_make_fcontext
W20250812 10:01:33.439718 466080 ref_count_closure.h:119] RPC meet error status: [INTERNAL_ERROR]PStatus: (10.******.220)[INTERNAL_ERROR]fail to add batch in load channel. unknown load_id=02ca43fdb72e4c92-962130f98851ce51
0# doris::Status doris::Status::create<true>(doris::PStatus const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1# void doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::_process_status<doris::PTabletWriterAddBlockResult>(doris::PTabletWriterAddBlockResult*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:481
2# doris::AutoReleaseClosure<doris::PTabletWriterAddBlockRequest, doris::vectorized::WriteBlockCallback<doris::PTabletWriterAddBlockResult> >::Run() at /home/zcp/repo_center/doris_release/doris/be/src/util/ref_count_closure.h:91
3# brpc::Controller::EndRPC(brpc::Controller::CompletionInfo const&)
4# brpc::policy::ProcessRpcResponse(brpc::InputMessageBase*)
5# brpc::ProcessInputMessage(void*)
6# bthread::TaskGroup::task_runner(long)
7# bthread_make_fcontext
W20250812 10:01:33.440117 464577 vtablet_writer.cpp:587] cancel node channel VNodeChannel[27745951-10228], load_id=2ca43fdb72e4c92-962130f98851ce51, txn_id=34931676, node=10.******.162:8060, error message: [CANCELLED]Cancelled