Doris query timeout


Version: 2.0.8

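The cluster has 3 FEs and a number of BEs. All queries scheduled to the 3 BEs of one resource group are simple queries. Starting at 20:29 these queries suddenly began to time out, and manually issued queries timed out as well, while queries scheduled to other nodes ran normally. Resource usage on both the BEs and the FEs was normal. The timeouts were caused by requests to one BE node, 22.66, timing out, yet telnet from an FE node to the corresponding port on that BE succeeded. The problem was finally resolved by restarting these 3 BE nodes at around 20:43 (at the time we were not sure which node was faulty, so we restarted all of them).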

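Looking at the logs of these BEs afterwards, it appears that port 8060 on BE 22.66 was timing out, even though telnet to it succeeded at the time. BE 22.66 also seemed to time out when accessing its own port 8060. However, these nodes had no large load jobs and no compaction running. These BEs have been running for nearly a year without ever showing this problem, and it has not recurred since the restart. The cause is still unknown.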
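A successful telnet only shows that the kernel completed the TCP handshake on the listening port; it does not show that the process behind the port is still reading and answering requests. The following is a minimal Python sketch of a probe that separates the two cases. The function name `probe`, the default HTTP-style payload, and the timeout value are assumptions made for illustration, and whether a given port answers this payload depends on the service.

```python
import socket


def probe(host, port, timeout=3.0, payload=b"GET / HTTP/1.0\r\n\r\n"):
    """Return (connected, responded) for a TCP endpoint.

    connected: the TCP handshake completed (this is all telnet verifies).
    responded: at least one byte came back within `timeout` seconds.
    The payload is an assumption; adjust it to whatever the service speaks.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or the connect itself timed out.
        return (False, False)
    with sock:
        sock.settimeout(timeout)
        try:
            sock.sendall(payload)
            data = sock.recv(1)
            # An empty read means the peer closed without answering.
            return (True, len(data) > 0)
        except OSError:
            # Connected, but no answer before the timeout (or a reset).
            return (True, False)
```

A result of `(True, False)` matches the symptom described above: the port accepts connections while the service never replies, which points at a stuck or saturated server process rather than at the network.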

The BE error log is as follows:


        0#  doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:0
        1#  doris::FragmentExecState::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1291
        2#  doris::FragmentExecState::execute() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:360
        3#  doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::RuntimeState*, doris::Status*)> const&) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        4#  std::_Function_handler<void (), doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
        5#  doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        6#  doris::Thread::supervise_thread(void*) at /var/local/ldb_toolchain/bin/../usr/include/pthread.h:562
        7#  start_thread
        8#  __clone

W0424 20:32:06.166144 98860 status.h:396] meet error status: [ANALYSIS_ERROR]TStatus: errCode = 2, detailMessage = transaction [366397523] not found

        0#  doris::Status doris::Status::create<true>(doris::TStatus const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
        1#  doris::StreamLoadExecutor::operate_txn_2pc(doris::StreamLoadContext*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        2#  doris::StreamLoad2PCAction::handle(doris::HttpRequest*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:357
        3#  ?
        4#  bufferevent_run_readcb_
        5#  ?
        6#  ?
        7#  ?
        8#  ?
        9#  std::_Function_handler<void (), doris::EvHttpServer::start()::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98
        10# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        11# doris::Thread::supervise_thread(void*) at /var/local/ldb_toolchain/bin/../usr/include/pthread.h:562
        12# start_thread
        13# __clone
W0424 20:32:06.166226 98860 stream_load_executor.cpp:343] 2PC commit transaction failed, errmsg=[ANALYSIS_ERROR]TStatus: errCode = 2, detailMessage = transaction [366397523] not found

        0#  doris::Status doris::Status::create<true>(doris::TStatus const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
        1#  doris::StreamLoadExecutor::operate_txn_2pc(doris::StreamLoadContext*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:446
        2#  doris::StreamLoad2PCAction::handle(doris::HttpRequest*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:357
        3#  ?
        4#  bufferevent_run_readcb_
        5#  ?
        6#  ?
        7#  ?
        8#  ?
        9#  std::_Function_handler<void (), doris::EvHttpServer::start()::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/atomicity.h:98
        10# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
        11# doris::Thread::supervise_thread(void*) at /var/local/ldb_toolchain/bin/../usr/include/pthread.h:562
        12# start_thread
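The log lines above use the glog prefix format (severity letter, month and day, time with microseconds, thread id, source file and line, then the message). To narrow a large BE log down to the incident window, a small filter along these lines might help. This is a sketch only: the helper names `parse_glog_line` and `in_window` are hypothetical, and it assumes every entry of interest starts with that prefix on a single line.

```python
import re
from datetime import time

# glog prefix: Lmmdd hh:mm:ss.uuuuuu threadid file:line] message
GLOG_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<clock>\d{2}:\d{2}:\d{2})\.(?P<micros>\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def parse_glog_line(line):
    """Parse one glog-style line into a dict, or return None if it does not match."""
    m = GLOG_RE.match(line.rstrip("\n"))
    if m is None:
        return None  # e.g. a stack-trace frame or a blank line
    hour, minute, second = (int(part) for part in m.group("clock").split(":"))
    return {
        "level": m.group("level"),
        "month": int(m.group("month")),
        "day": int(m.group("day")),
        "time": time(hour, minute, second, int(m.group("micros"))),
        "tid": int(m.group("tid")),
        "source": m.group("src"),
        "message": m.group("msg"),
    }


def in_window(lines, start, end, levels=("W", "E", "F")):
    """Yield parsed records whose level is in `levels` and whose time is within [start, end]."""
    for line in lines:
        rec = parse_glog_line(line)
        if rec is not None and rec["level"] in levels and start <= rec["time"] <= end:
            yield rec
```

For the incident described here, calling `in_window(lines, time(20, 29), time(20, 43))` over the lines of the BE log would keep only the warnings and errors logged between the first timeouts and the restart.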

(6 images attached)

1 Answer

Looking at your monitoring, did the problem occur at the same time as the rise in I/O and CPU utilization?