Caused by: java.sql.SQLException: errCode = 2, detailMessage = (10.1.0.26)[INTERNAL_ERROR]query_id: 7ee752fd37824e0a-8c5d6ca2af647e21, couldn't get a client for TNetworkAddress(hostname=10.1.0.23, port=9020), reason is [THRIFT_RPC_ERROR]Couldn't open transport for 10.1.0.23:9020 (open() timed out)
0# doris::ThriftClientImpl::open() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1# doris::ThriftClientImpl::open_with_retry(int, int) at /home/zcp/repo_center/doris_enterprise/doris/be/src/common/status.h:357
2# doris::ClientCacheHelper::_create_client(doris::TNetworkAddress const&, std::function<doris::ThriftClientImpl* (doris::TNetworkAddress const&, void**)>&, void**, int) at /home/zcp/repo_center/doris_enterprise/doris/be/src/common/status.h:446
3# doris::ClientCacheHelper::get_client(doris::TNetworkAddress const&, std::function<doris::ThriftClientImpl* (doris::TNetworkAddress const&, void**)>&, void**, int) at /home/zcp/repo_center/doris_enterprise/doris/be/src/common/status.h:446
4# doris::ClientConnection<doris::FrontendServiceClient>::ClientConnection(doris::ClientCache<doris::FrontendServiceClient>*, doris::TNetworkAddress const&, int, doris::Status*, int) at /home/zcp/repo_center/doris_enterprise/doris/be/src/common/status.h:357
5# doris::FragmentMgr::coordinator_callback(doris::ReportStatusRequest const&) at /home/zcp/repo_center/doris_enterprise/doris/be/src/common/status.h:446
6# doris::FragmentExecState::coordinator_callback(doris::Status const&, doris::RuntimeProfile*, doris::RuntimeProfile*, bool) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:244
7# doris::PlanFragmentExecutor::send_report(bool) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:360
8# doris::PlanFragmentExecutor::report_profile() at /home/zcp/repo_center/doris_enterprise/doris/be/src/runtime/plan_fragment_executor.cpp:421
9# std::_Function_handler<void (), doris::PlanFragmentExecutor::open()::$_0>::_M_invoke(std::_Any_data const&) at /home/zcp/repo_center/doris_enterprise/doris/be/src/runtime/plan_fragment_executor.cpp:256
10# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_enterprise/doris/be/src/util/threadpool.cpp:0
11# doris::Thread::supervise_thread(void*) at /var/local/ldb_toolchain/bin/../usr/include/pthread.h:562
12# ?
13# __clone
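So far I have checked CPU, memory, and I/O. Load is low while the SQL runs, a normal execution takes only 40s, and there is no performance bottleneck.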
# FE Thrift-related configuration
thrift_backlog_num=1024
thrift_client_timeout_ms=0
thrift_server_max_worker_threads=4096
thrift_server_type=THREAD_POOL
# BE Thrift-related configuration
thrift_client_open_num_tries=1
thrift_client_retry_interval_ms=1000
thrift_connect_timeout_seconds=3
thrift_rpc_timeout_ms=60000
thrift_server_type_of_fe=THREAD_POOL
Each machine has a 3 TB SSD, 1 TB of memory, and 128 cores.
What should I check next?
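
One possible next step, as a rough sketch only: the error says the BE (10.1.0.26) timed out opening a TCP connection to the FE at 10.1.0.23:9020, and thrift_connect_timeout_seconds=3 with thrift_client_open_num_tries=1 means a single slow connect fails the report. The script below repeats a plain TCP connect with the same timeout so intermittent connect delays can be spotted from the BE host. The helper names `probe` and `summarize` are made up for illustration and are not part of Doris; this only tests TCP reachability, not the Thrift handshake.

```python
import socket
import time


def probe(host, port, timeout_s=3.0):
    """Attempt one TCP connect; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


def summarize(results):
    """Return (number_of_failed_attempts, slowest_elapsed_seconds)."""
    failures = sum(1 for ok, _, _ in results if not ok)
    slowest = max((elapsed for _, elapsed, _ in results), default=0.0)
    return failures, slowest
```

For example, running `summarize([probe("10.1.0.23", 9020, 3.0) for _ in range(20)])` on the BE host while the query is executing would show whether connects to the FE port fail or approach the 3-second limit even though CPU, memory, and I/O look idle.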