Intermittent error: SQLTransientConnectionException: HikariPool-6 - Connection is not available


Basic information

  • Doris version: 3.0.6
  • Deployment: k8s-operator

Error description

When executing SQL, roughly 3% of executions fail with the following error:

ERROR 1105 (HY000) at line 2: errCode = 2, detailMessage = (doriscluster-be-0.doriscluster-be-internal.doris.svc.cluster.local)[RUNTIME_ERROR]JdbcExecutorException: Initialize datasource failed:  | CAUSED BY: SQLTransientConnectionException: HikariPool-6 - Connection is not available, request timed out after 5004ms (total=1, active=0, idle=1, waiting=0) | CAUSED BY: MySQLNonTransientConnectionException: No operations allowed after connection closed. | CAUSED BY: CommunicationsException: Communications link failure

The last packet succ
Bye
Error (exit code 1)
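
The counters in parentheses are worth reading before blaming pool size. The sketch below pulls them out of the message; `parse_pool_state` and the regular expression are illustrative helpers, not part of Doris or HikariCP. With total=1, active=0, idle=1 the pool was not exhausted, which is consistent with (but does not prove) the single idle connection having been closed on the remote side, as the nested "No operations allowed after connection closed" cause suggests.

```python
import re

# Hypothetical helper pattern: matches the counters HikariCP appends to its
# timeout message, as they appear in the error quoted above.
POOL_STATE = re.compile(
    r"request timed out after (?P<waited_ms>\d+)ms "
    r"\(total=(?P<total>\d+), active=(?P<active>\d+), "
    r"idle=(?P<idle>\d+), waiting=(?P<waiting>\d+)\)"
)


def parse_pool_state(message: str) -> dict[str, int]:
    """Return the pool counters found in a HikariCP timeout message."""
    match = POOL_STATE.search(message)
    if match is None:
        raise ValueError("no HikariCP pool state found in message")
    return {name: int(value) for name, value in match.groupdict().items()}


message = (
    "SQLTransientConnectionException: HikariPool-6 - Connection is not available, "
    "request timed out after 5004ms (total=1, active=0, idle=1, waiting=0)"
)
state = parse_pool_state(message)

# Exhaustion would mean every connection is in use and none is idle.
pool_exhausted = state["active"] >= state["total"] and state["idle"] == 0
print(state, "exhausted" if pool_exhausted else "not exhausted")
```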

be.log contains no additional useful logs, and the same SQL succeeds when retried.
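
Since a retry succeeds, a client-side retry can serve as a stopgap while the root cause is investigated. This is a minimal sketch, not a fix: `run_with_retry`, the marker strings, and the `flaky_query` stand-in are all hypothetical, and the real query call would replace the stand-in.

```python
import time

# Assumption: these substrings, taken from the error above, identify the
# transient failure. Anything else is re-raised immediately.
TRANSIENT_MARKERS = ("Connection is not available", "Communications link failure")


def run_with_retry(execute, attempts=3, base_delay_s=0.5, sleep=time.sleep):
    """Call execute() and retry with exponential backoff on transient errors."""
    for attempt in range(1, attempts + 1):
        try:
            return execute()
        except Exception as exc:
            transient = any(marker in str(exc) for marker in TRANSIENT_MARKERS)
            if not transient or attempt == attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))


# Stand-in for the real SQL call: fails once, then succeeds.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("HikariPool-6 - Connection is not available, request timed out")
    return "ok"


result = run_with_retry(flaky_query, sleep=lambda seconds: None)
print(result, "after", calls["count"], "calls")
```

Retrying only on the known markers keeps genuine SQL errors from being masked.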

be.conf configuration:

CUR_DATE=`date +%Y%m%d-%H%M%S`

PPROF_TMPDIR="$DORIS_HOME/log/"

JAVA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Xmx24576m -DlogPath=$DORIS_HOME/log/jni.log -Xloggc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$DORIS_HOME/log/java_heapdump.hprof"

# For jdk 9+, this JAVA_OPTS will be used as default JVM options
JAVA_OPTS_FOR_JDK_9="-Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Xmx24576m -DlogPath=$DORIS_HOME/log/jni.log -Xlog:gc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$DORIS_HOME/log/java_heapdump.hprof"
JAVA_OPTS_FOR_JDK_17="-Dfile.encoding=UTF-8 -Xmx24576m -DlogPath=$LOG_DIR/jni.log -Xlog:gc*:$LOG_DIR/be.gc.log.$CUR_DATE:time,uptime:filecount=10,filesize=50M -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.management/sun.management=ALL-UNNAMED"

# since 1.2, the JAVA_HOME need to be set to run BE process.
# JAVA_HOME=/path/to/jdk/

# https://github.com/apache/doris/blob/master/docs/zh-CN/community/developer-guide/debug-tool.md#jemalloc-heap-profile
# https://jemalloc.net/jemalloc.3.html
JEMALLOC_CONF="percpu_arena:percpu,background_thread:true,metadata_thp:auto,muzzy_decay_ms:15000,dirty_decay_ms:15000,oversize_threshold:0,prof:false,lg_prof_interval:32,lg_prof_sample:19,prof_gdump:false,prof_accum:false,prof_leak:false,prof_final:false"
#JEMALLOC_CONF="percpu_arena:percpu,background_thread:true,metadata_thp:auto,muzzy_decay_ms:15000,dirty_decay_ms:15000,oversize_threshold:0,lg_tcache_max:20,prof:false,lg_prof_interval:32,lg_prof_sample:19,prof_gdump:false,prof_accum:false,prof_leak:false,prof_final:false"
JEMALLOC_PROF_PRFIX=""

# INFO, WARNING, ERROR, FATAL
sys_log_level = INFO

# ports for admin, web, heartbeat service
# Increase the heartbeat timeout
heartbeat_service_interval=5
heartbeat_service_timeout=30
# Tune status reporting
report_tablet_interval=10
report_disk_state_interval=30

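# Change the default webserver_port from 8040 to 30840 so that k8s can expose it externally (the port must fall within the allowed range). Changing the port requires clearing the data and reinstalling.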
webserver_port = 30840
enable_fqdn_mode=true
arrow_flight_sql_port = -1
#kerberos_krb5_conf_path=/opt/apache-doris/be/conf/krb5.conf
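
One detail in the configuration above is easy to miss: the `-DJDBC_*` flags appear in `JAVA_OPTS` and `JAVA_OPTS_FOR_JDK_9` but not in `JAVA_OPTS_FOR_JDK_17`. The sketch below checks this on abbreviated copies of the three strings. Whether it matters depends on which JDK the BE runs on and on whether this Doris version reads these flags at all; neither is stated in the post, so treat both as assumptions. `jdbc_properties` is a hypothetical helper.

```python
import re

# Abbreviated copies of the option strings from be.conf; only flags relevant
# to this check are kept.
JDBC_FLAGS = (
    "-DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 "
    "-DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000"
)
java_opts = {
    "JAVA_OPTS": "-Xmx24576m " + JDBC_FLAGS,
    "JAVA_OPTS_FOR_JDK_9": "-Xmx24576m " + JDBC_FLAGS,
    "JAVA_OPTS_FOR_JDK_17": "-Dfile.encoding=UTF-8 -Xmx24576m -Dsun.java.command=DorisBE",
}


def jdbc_properties(opts: str) -> dict[str, int]:
    """Collect -DJDBC_<NAME>=<integer> system properties from a JVM option string."""
    return {name: int(value) for name, value in re.findall(r"-D(JDBC_\w+)=(\d+)", opts)}


found = {name: jdbc_properties(opts) for name, opts in java_opts.items()}
for name, props in found.items():
    print(name, props if props else "no JDBC_* flags")
```

The configured `JDBC_MAX_WAIT_TIME=5000` is close to the "request timed out after 5004ms" in the error, although the same value could also come from another setting.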

Current BE status

This also looks normal:

mysql> SHOW BACKENDS\G
*************************** 1. row ***************************
              BackendId: 10005
                   Host: doriscluster-be-0.doriscluster-be-internal.doris.svc.cluster.local
          HeartbeatPort: 9050
                 BePort: 9060
               HttpPort: 30840
               BrpcPort: 8060
     ArrowFlightSqlPort: -1
          LastStartTime: 2025-11-15 10:12:14
          LastHeartbeat: 2025-11-16 10:32:08
                  Alive: true
   SystemDecommissioned: false
              TabletNum: 85764
       DataUsedCapacity: 48.886 GB
      TrashUsedCapacity: 0.000 
          AvailCapacity: 914.603 GB
          TotalCapacity: 1.616 TB
                UsedPct: 44.73 %
         MaxDiskUsedPct: 44.73 %
     RemoteUsedCapacity: 0.000 
                    Tag: {"location" : "default"}
                 ErrMsg: 
                Version: selectdb-3.0.7-rc01-498b6d0288
                 Status: {"lastSuccessReportTabletsTime":"2025-11-16 10:32:06","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false,"isActive":false,"currentFragmentNum":2,"lastFragmentUpdateTime":1763260328438}
HeartbeatFailureCounter: 0
               NodeRole: mix
               CpuCores: 32
                 Memory: 32.00 GB
*************************** 2. row ***************************
              BackendId: 10006
                   Host: doriscluster-be-2.doriscluster-be-internal.doris.svc.cluster.local
          HeartbeatPort: 9050
                 BePort: 9060
               HttpPort: 30840
               BrpcPort: 8060
     ArrowFlightSqlPort: -1
          LastStartTime: 2025-11-15 10:10:18
          LastHeartbeat: 2025-11-16 10:32:08
                  Alive: true
   SystemDecommissioned: false
              TabletNum: 170408
       DataUsedCapacity: 62.299 GB
      TrashUsedCapacity: 0.000 
          AvailCapacity: 1.887 TB
          TotalCapacity: 2.762 TB
                UsedPct: 31.68 %
         MaxDiskUsedPct: 31.68 %
     RemoteUsedCapacity: 0.000 
                    Tag: {"location" : "default"}
                 ErrMsg: 
                Version: selectdb-3.0.7-rc01-498b6d0288
                 Status: {"lastSuccessReportTabletsTime":"2025-11-16 10:32:13","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false,"isActive":false,"currentFragmentNum":0,"lastFragmentUpdateTime":1763260321363}
HeartbeatFailureCounter: 0
               NodeRole: mix
               CpuCores: 32
                 Memory: 32.00 GB
*************************** 3. row ***************************
              BackendId: 10007
                   Host: doriscluster-be-1.doriscluster-be-internal.doris.svc.cluster.local
          HeartbeatPort: 9050
                 BePort: 9060
               HttpPort: 30840
               BrpcPort: 8060
     ArrowFlightSqlPort: -1
          LastStartTime: 2025-11-15 10:11:32
          LastHeartbeat: 2025-11-16 10:32:08
                  Alive: true
   SystemDecommissioned: false
              TabletNum: 173512
       DataUsedCapacity: 61.314 GB
      TrashUsedCapacity: 0.000 
          AvailCapacity: 1.971 TB
          TotalCapacity: 2.762 TB
                UsedPct: 28.65 %
         MaxDiskUsedPct: 28.65 %
     RemoteUsedCapacity: 0.000 
                    Tag: {"location" : "default"}
                 ErrMsg: 
                Version: selectdb-3.0.7-rc01-498b6d0288
                 Status: {"lastSuccessReportTabletsTime":"2025-11-16 10:31:42","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false,"isActive":false,"currentFragmentNum":0,"lastFragmentUpdateTime":1763260323230}
HeartbeatFailureCounter: 0
               NodeRole: mix
               CpuCores: 32
                 Memory: 32.00 GB
3 rows in set (0.00 sec)
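
The health check done by eye above can be scripted. This is a sketch only: `parse_vertical` is a hypothetical helper, and the sample is an abbreviated copy of the output above with three fields per row.

```python
import re

# Abbreviated SHOW BACKENDS\G output from above (three fields per row kept).
output = """\
*************************** 1. row ***************************
              BackendId: 10005
                  Alive: true
HeartbeatFailureCounter: 0
*************************** 2. row ***************************
              BackendId: 10006
                  Alive: true
HeartbeatFailureCounter: 0
*************************** 3. row ***************************
              BackendId: 10007
                  Alive: true
HeartbeatFailureCounter: 0
"""


def parse_vertical(text: str) -> list[dict[str, str]]:
    """Parse mysql client vertical (\\G) output into one dict per row."""
    rows = []
    for line in text.splitlines():
        if re.match(r"\*+ \d+\. row \*+", line):
            rows.append({})
        elif ":" in line and rows:
            key, _, value = line.partition(":")
            rows[-1][key.strip()] = value.strip()
    return rows


backends = parse_vertical(output)
unhealthy = [
    b["BackendId"]
    for b in backends
    if b["Alive"] != "true" or b["HeartbeatFailureCounter"] != "0"
]
print("unhealthy backends:", unhealthy)
```

All three backends pass, so the BE nodes themselves do not appear to be the problem from the FE's point of view.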
