FE intermittently reports errors and the node crashes, version 2.1.7

2025-10-09 12:35:07,034 ERROR (thrift-server-pool-136|418460) [BDBJEJournal.write():294] catch an exception when writing to database. sleep and retry. journal id 111459967
com.sleepycat.je.rep.InsufficientAcksException: (JE 18.3.12) Transaction: -113547417 VLSN: 225,011,929, initiated at: 12:34:42. Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 10000ms. FeederState=fe_5dd5626e_bcc4_4b3e_8b04_616798e6b808(2)[MASTER]
Current feeds:
fe_d272434a_5826_4ff8_8db0_7c9bd08dbcd1: feederVLSN=225,011,930 replicaTxnEndVLSN=225,011,927
fe_823210c2_ea47_4385_8232_b69285aad1ea: feederVLSN=225,011,930 replicaTxnEndVLSN=225,011,927

    at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:188) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1444) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1403) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:228) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.txn.Txn.commit(Txn.java:778) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.txn.Txn.commit(Txn.java:631) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1773) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.Database.put(Database.java:1638) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at com.sleepycat.je.Database.put(Database.java:1688) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
    at org.apache.doris.journal.bdbje.BDBJEJournal.write(BDBJEJournal.java:265) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.persist.EditLog.logEdit(EditLog.java:1265) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.persist.EditLog.logInsertTransactionState(EditLog.java:1567) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.transaction.DatabaseTransactionMgr.unprotectUpsertTransactionState(DatabaseTransactionMgr.java:1558) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.transaction.DatabaseTransactionMgr.unprotectedCommitTransaction(DatabaseTransactionMgr.java:1492) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.transaction.DatabaseTransactionMgr.commitTransaction(DatabaseTransactionMgr.java:812) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.transaction.GlobalTransactionMgr.commitTransaction(GlobalTransactionMgr.java:255) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.transaction.GlobalTransactionMgr.commitAndPublishTransaction(GlobalTransactionMgr.java:285) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.transaction.GlobalTransactionMgr.commitAndPublishTransaction(GlobalTransactionMgr.java:271) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.trees.plans.commands.insert.OlapInsertExecutor.onComplete(OlapInsertExecutor.java:217) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.trees.plans.commands.insert.AbstractInsertExecutor.executeSingleInsert(AbstractInsertExecutor.java:196) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.trees.plans.commands.insert.InsertIntoTableCommand.runInternal(InsertIntoTableCommand.java:268) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.trees.plans.commands.insert.InsertIntoTableCommand.run(InsertIntoTableCommand.java:116) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.trees.plans.commands.DeleteFromUsingCommand.run(DeleteFromUsingCommand.java:56) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.trees.plans.commands.DeleteFromCommand.run(DeleteFromCommand.java:172) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:729) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:556) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.proxyExecute(ConnectProcessor.java:715) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.service.FrontendServiceImpl.forward(FrontendServiceImpl.java:1060) ~[doris-fe.jar:1.2-SNAPSHOT]
    at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_42]
    at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_42]
    at org.apache.doris.service.FeServer.lambda$start$0(FeServer.java:60) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.service.FeServer$$Lambda$156/790204983.invoke(Unknown Source) ~[?:?]
    at com.sun.proxy.$Proxy29.forward(Unknown Source) ~[?:?]
    at org.apache.doris.thrift.FrontendService$Processor$forward.getResult(FrontendService.java:3792) ~[fe-common-1.2-SNAPSHOT.jar:1.2-SNAPSHOT]
    at org.apache.doris.thrift.FrontendService$Processor$forward.getResult(FrontendService.java:3772) ~[fe-common-1.2-SNAPSHOT.jar:1.2-SNAPSHOT]
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) ~[libthrift-0.16.0.jar:0.16.0]
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) ~[libthrift-0.16.0.jar:0.16.0]
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250) ~[libthrift-0.16.0.jar:0.16.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_42]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_42]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_42]

2025-10-09 12:35:37,478 WARN (thrift-server-pool-136|418460) [EditLog.logInsertTransactionState():1576] edit log insert transaction take a lot time, write bdb 54908 ms, write binlog 0 ms
2025-10-09 12:36:04,159 WARN (mysql-nio-pool-7815|425718) [MysqlConnectProcessor.processOnce():464] Null packet received from network. remote: 10.24.16.88:47651
2025-10-09 12:36:04,159 WARN (mysql-nio-pool-7815|425718) [ReadListener.lambda$handleEvent$0():60] Exception happened in one session(org.apache.doris.qe.ConnectContext@2de69fe).
java.io.IOException: Error happened when receiving packet.
    at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:465) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.mysql.ReadListener$$Lambda$1857/528437017.run(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_42]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_42]
    at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_42]

Monitoring screenshot: 微信图片_20251009153109_10_18.png

1 Answer

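When a write request on the BDBJE master node cannot get acknowledgements from a majority of nodes, it throws an exception and the FE process exits. First check the local disks, or check whether BDBJE I/O on the FE is slow to flush to disk.
Then take a look at fe.out and fe/doris-meta/bdb/je.info.0.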
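
To make the majority requirement concrete, here is a minimal Python sketch that reads the "Current feeds" lines from the exception above. It rests on two assumptions that are not confirmed in this thread: that SIMPLE_MAJORITY means a majority of the electable FE nodes with the master counting as one of them, and that replicaTxnEndVLSN is the last transaction-end VLSN a replica has acknowledged. The helper names `required_replica_acks` and `replica_lag` are made up for illustration.

```python
import re

def required_replica_acks(electable_nodes: int) -> int:
    """Replica acks the master waits for under SIMPLE_MAJORITY.

    Assumption: a majority of electable nodes is needed and the
    master itself counts toward that majority.
    """
    majority = electable_nodes // 2 + 1
    return majority - 1

# Matches lines such as:
# fe_xxx: feederVLSN=225,011,930 replicaTxnEndVLSN=225,011,927
FEED_RE = re.compile(
    r"^(?P<node>\S+): feederVLSN=(?P<feeder>[\d,]+) "
    r"replicaTxnEndVLSN=(?P<replica>[\d,]+)"
)

def replica_lag(log_text: str) -> dict:
    """Map each replica name to feederVLSN - replicaTxnEndVLSN."""
    lags = {}
    for line in log_text.splitlines():
        m = FEED_RE.match(line.strip())
        if m:
            feeder = int(m.group("feeder").replace(",", ""))
            replica = int(m.group("replica").replace(",", ""))
            lags[m.group("node")] = feeder - replica
    return lags

# The two feed lines copied from the exception in the question.
sample = """Current feeds:
fe_d272434a_5826_4ff8_8db0_7c9bd08dbcd1: feederVLSN=225,011,930 replicaTxnEndVLSN=225,011,927
fe_823210c2_ea47_4385_8232_b69285aad1ea: feederVLSN=225,011,930 replicaTxnEndVLSN=225,011,927
"""

print(required_replica_acks(3))  # 3 FE nodes: master + 2 replicas
print(replica_lag(sample))
```

With three FE nodes the sketch gives one required replica ack, which agrees with "Need replica acks: 1" in the log. Both replicas report replicaTxnEndVLSN 225,011,927, below the failing transaction's VLSN 225,011,929, so under the assumption above neither had acknowledged it within the 10000 ms timeout. That pattern fits slow disk or network on the follower side as well as on the master, so it may be worth checking je.info.0 on the followers too.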