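Basic information
- Doris version: 3.0.6
- Deployment: k8s-operator
Error description
Executing SQL fails intermittently (roughly 3% of the time) with the following error: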
ERROR 1105 (HY000) at line 2: errCode = 2, detailMessage = (doriscluster-be-0.doriscluster-be-internal.doris.svc.cluster.local)[RUNTIME_ERROR]JdbcExecutorException: Initialize datasource failed: | CAUSED BY: SQLTransientConnectionException: HikariPool-6 - Connection is not available, request timed out after 5004ms (total=1, active=0, idle=1, waiting=0) | CAUSED BY: MySQLNonTransientConnectionException: No operations allowed after connection closed. | CAUSED BY: CommunicationsException: Communications link failure
The last packet succ
Bye
Error (exit code 1)
be.log contains no additional useful log entries, and retrying the same SQL succeeds.
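The counters at the end of the HikariCP message (total=1, active=0, idle=1, waiting=0) are worth reading closely. The following is a minimal sketch that extracts them from the error text. The helper names parse_pool_stats and looks_like_stale_idle_connection are hypothetical, and the interpretation in the second helper is an assumption, not a confirmed diagnosis: when every pooled connection is idle and nothing is waiting, the pool was probably not exhausted, which is consistent with the chained "No operations allowed after connection closed" cause.

```python
import re
from typing import Dict, Optional

# Matches the pool counters HikariCP appends to its timeout message.
POOL_STATS = re.compile(r"\(total=(\d+), active=(\d+), idle=(\d+), waiting=(\d+)\)")


def parse_pool_stats(message: str) -> Optional[Dict[str, int]]:
    """Return the pool counters found in an error message, or None if absent."""
    match = POOL_STATS.search(message)
    if match is None:
        return None
    total, active, idle, waiting = (int(g) for g in match.groups())
    return {"total": total, "active": active, "idle": idle, "waiting": waiting}


def looks_like_stale_idle_connection(stats: Dict[str, int]) -> bool:
    """Heuristic (assumption): all connections idle and no waiters means the
    pool was not exhausted, so the idle connection itself is suspect."""
    return (
        stats["total"] > 0
        and stats["active"] == 0
        and stats["waiting"] == 0
        and stats["idle"] == stats["total"]
    )


# Fragment copied from the error reported above.
sample = (
    "SQLTransientConnectionException: HikariPool-6 - Connection is not available, "
    "request timed out after 5004ms (total=1, active=0, idle=1, waiting=0)"
)
stats = parse_pool_stats(sample)
print(stats)
print(looks_like_stale_idle_connection(stats))
```

If the heuristic holds, the next thing to compare would be the idle timeout on the remote MySQL side against the pool's idle and lifetime settings.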
be.conf configuration:
CUR_DATE=`date +%Y%m%d-%H%M%S`
PPROF_TMPDIR="$DORIS_HOME/log/"
JAVA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Xmx24576m -DlogPath=$DORIS_HOME/log/jni.log -Xloggc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$DORIS_HOME/log/java_heapdump.hprof"
# For jdk 9+, this JAVA_OPTS will be used as default JVM options
JAVA_OPTS_FOR_JDK_9="-Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Xmx24576m -DlogPath=$DORIS_HOME/log/jni.log -Xlog:gc:$DORIS_HOME/log/be.gc.log.$CUR_DATE -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 -DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$DORIS_HOME/log/java_heapdump.hprof"
JAVA_OPTS_FOR_JDK_17="-Dfile.encoding=UTF-8 -Xmx24576m -DlogPath=$LOG_DIR/jni.log -Xlog:gc*:$LOG_DIR/be.gc.log.$CUR_DATE:time,uptime:filecount=10,filesize=50M -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=false -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.management/sun.management=ALL-UNNAMED"
# since 1.2, the JAVA_HOME need to be set to run BE process.
# JAVA_HOME=/path/to/jdk/
# https://github.com/apache/doris/blob/master/docs/zh-CN/community/developer-guide/debug-tool.md#jemalloc-heap-profile
# https://jemalloc.net/jemalloc.3.html
JEMALLOC_CONF="percpu_arena:percpu,background_thread:true,metadata_thp:auto,muzzy_decay_ms:15000,dirty_decay_ms:15000,oversize_threshold:0,prof:false,lg_prof_interval:32,lg_prof_sample:19,prof_gdump:false,prof_accum:false,prof_leak:false,prof_final:false"
#JEMALLOC_CONF="percpu_arena:percpu,background_thread:true,metadata_thp:auto,muzzy_decay_ms:15000,dirty_decay_ms:15000,oversize_threshold:0,lg_tcache_max:20,prof:false,lg_prof_interval:32,lg_prof_sample:19,prof_gdump:false,prof_accum:false,prof_leak:false,prof_final:false"
JEMALLOC_PROF_PRFIX=""
# INFO, WARNING, ERROR, FATAL
sys_log_level = INFO
# ports for admin, web, heartbeat service
# Increase the heartbeat timeout
heartbeat_service_interval=5
heartbeat_service_timeout=30
# Tune status reporting
report_tablet_interval=10
report_disk_state_interval=30
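# Change the default webserver_port (8040) to 30840 so that k8s can expose it externally; the port must fall within the allowed range. Changing the port requires clearing the data and reinstalling.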
webserver_port = 30840
enable_fqdn_mode=true
arrow_flight_sql_port = -1
#kerberos_krb5_conf_path=/opt/apache-doris/be/conf/krb5.conf
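In the configuration above, the -DJDBC_* options appear in JAVA_OPTS and JAVA_OPTS_FOR_JDK_9 but not in JAVA_OPTS_FOR_JDK_17. The sketch below makes that comparison mechanical. The option strings are shortened copies of the ones in be.conf, jdbc_properties is a hypothetical helper, and whether the BE in this version actually reads these system properties is not confirmed here.

```python
import re
import shlex
from typing import Dict

# A -D system property whose name starts with JDBC_ and whose value is numeric.
JDBC_OPT = re.compile(r"-D(JDBC_\w+)=(\d+)")


def jdbc_properties(java_opts: str) -> Dict[str, int]:
    """Collect the -DJDBC_* properties found in a JAVA_OPTS-style string."""
    props = {}
    for token in shlex.split(java_opts):
        match = JDBC_OPT.fullmatch(token)
        if match:
            props[match.group(1)] = int(match.group(2))
    return props


# Shortened copies of the three variables from the be.conf above.
with_jdbc = (
    "-Xmx24576m -DJDBC_MIN_POOL=1 -DJDBC_MAX_POOL=100 "
    "-DJDBC_MAX_IDLE_TIME=300000 -DJDBC_MAX_WAIT_TIME=5000"
)
java_opts_by_variable = {
    "JAVA_OPTS": with_jdbc,
    "JAVA_OPTS_FOR_JDK_9": with_jdbc,
    "JAVA_OPTS_FOR_JDK_17": "-Dfile.encoding=UTF-8 -Xmx24576m -XX:+IgnoreUnrecognizedVMOptions",
}

for name, opts in java_opts_by_variable.items():
    print(name, jdbc_properties(opts))
```

If the BE pods run on JDK 17, the pool settings in the other two variables would not reach the JVM at all, so it is worth checking which JDK the container uses.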
Current BE status
The BEs also appear to be healthy:
mysql> SHOW BACKENDS\G
*************************** 1. row ***************************
BackendId: 10005
Host: doriscluster-be-0.doriscluster-be-internal.doris.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 30840
BrpcPort: 8060
ArrowFlightSqlPort: -1
LastStartTime: 2025-11-15 10:12:14
LastHeartbeat: 2025-11-16 10:32:08
Alive: true
SystemDecommissioned: false
TabletNum: 85764
DataUsedCapacity: 48.886 GB
TrashUsedCapacity: 0.000
AvailCapacity: 914.603 GB
TotalCapacity: 1.616 TB
UsedPct: 44.73 %
MaxDiskUsedPct: 44.73 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: selectdb-3.0.7-rc01-498b6d0288
Status: {"lastSuccessReportTabletsTime":"2025-11-16 10:32:06","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false,"isActive":false,"currentFragmentNum":2,"lastFragmentUpdateTime":1763260328438}
HeartbeatFailureCounter: 0
NodeRole: mix
CpuCores: 32
Memory: 32.00 GB
*************************** 2. row ***************************
BackendId: 10006
Host: doriscluster-be-2.doriscluster-be-internal.doris.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 30840
BrpcPort: 8060
ArrowFlightSqlPort: -1
LastStartTime: 2025-11-15 10:10:18
LastHeartbeat: 2025-11-16 10:32:08
Alive: true
SystemDecommissioned: false
TabletNum: 170408
DataUsedCapacity: 62.299 GB
TrashUsedCapacity: 0.000
AvailCapacity: 1.887 TB
TotalCapacity: 2.762 TB
UsedPct: 31.68 %
MaxDiskUsedPct: 31.68 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: selectdb-3.0.7-rc01-498b6d0288
Status: {"lastSuccessReportTabletsTime":"2025-11-16 10:32:13","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false,"isActive":false,"currentFragmentNum":0,"lastFragmentUpdateTime":1763260321363}
HeartbeatFailureCounter: 0
NodeRole: mix
CpuCores: 32
Memory: 32.00 GB
*************************** 3. row ***************************
BackendId: 10007
Host: doriscluster-be-1.doriscluster-be-internal.doris.svc.cluster.local
HeartbeatPort: 9050
BePort: 9060
HttpPort: 30840
BrpcPort: 8060
ArrowFlightSqlPort: -1
LastStartTime: 2025-11-15 10:11:32
LastHeartbeat: 2025-11-16 10:32:08
Alive: true
SystemDecommissioned: false
TabletNum: 173512
DataUsedCapacity: 61.314 GB
TrashUsedCapacity: 0.000
AvailCapacity: 1.971 TB
TotalCapacity: 2.762 TB
UsedPct: 28.65 %
MaxDiskUsedPct: 28.65 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: selectdb-3.0.7-rc01-498b6d0288
Status: {"lastSuccessReportTabletsTime":"2025-11-16 10:31:42","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false,"isActive":false,"currentFragmentNum":0,"lastFragmentUpdateTime":1763260323230}
HeartbeatFailureCounter: 0
NodeRole: mix
CpuCores: 32
Memory: 32.00 GB
3 rows in set (0.00 sec)
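To check the same output repeatedly while the error is being reproduced, the \G format can be parsed into one dictionary per backend. This is a minimal sketch: parse_vertical and unhealthy_backends are hypothetical helpers, the sample is a shortened copy of the output above, and "healthy" here only means Alive is true and HeartbeatFailureCounter is 0.

```python
from typing import Dict, List


def parse_vertical(text: str) -> List[Dict[str, str]]:
    """Parse MySQL client \\G output into a list of {column: value} dicts."""
    rows: List[Dict[str, str]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("***") and "row" in line:
            # A "*** N. row ***" separator starts a new record.
            current = {}
            rows.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only; values may contain colons.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return rows


def unhealthy_backends(rows: List[Dict[str, str]]) -> List[str]:
    """Return the BackendId of every row that is not alive or has failed heartbeats."""
    return [
        row["BackendId"]
        for row in rows
        if row.get("Alive") != "true" or int(row.get("HeartbeatFailureCounter", "0")) > 0
    ]


# Shortened copy of the SHOW BACKENDS\G output above.
sample = """\
*************************** 1. row ***************************
BackendId: 10005
LastHeartbeat: 2025-11-16 10:32:08
Alive: true
HeartbeatFailureCounter: 0
*************************** 2. row ***************************
BackendId: 10006
LastHeartbeat: 2025-11-16 10:32:08
Alive: true
HeartbeatFailureCounter: 0
2 rows in set (0.00 sec)
"""

rows = parse_vertical(sample)
print(len(rows), unhealthy_backends(rows))
```

A clean result here supports the observation above that the BEs themselves are fine, which points the investigation back at the JDBC connection to the external MySQL source.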