FE fails to start

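Three FEs: 4 CPU / 16 GB each. Four BEs: 16 CPU / 32 GB each.
Yesterday I optimized a table: I recreated it and started importing the old table's data into the new one. Because the data volume is too large, I chose to import it day by day. During today's import the FE crashed, and now it will not start again.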
Error log on 107:
2025-04-16 15:14:46,003 INFO (stateListener|84) [BDBHA.fencing():78] start fencing, epoch number is 11
2025-04-16 15:14:51,076 INFO (UNKNOWN fe_30454908_4898_41cc_b378_752ca5c347c5(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:701 reason:
2025-04-16 15:14:56,004 WARN (stateListener|84) [BDBHA.fencing():94] fencing failed. tried 1 times
com.sleepycat.je.rep.InsufficientReplicasException: (JE 18.3.12) Commit policy: SIMPLE_MAJORITY required 1 replica. But none were active with this master.
at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureReplicasForCommit(DurabilityQuorum.java:116) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.impl.RepImpl.txnBeginHook(RepImpl.java:1171) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn.txnBeginHook(MasterTxn.java:195) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.initTxn(Txn.java:384) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.<init>(Txn.java:288) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.<init>(Txn.java:267) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn.<init>(MasterTxn.java:146) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn$1.create(MasterTxn.java:117) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn.create(MasterTxn.java:435) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.impl.RepImpl.createRepUserTxn(RepImpl.java:1145) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.createAutoTxn(Txn.java:334) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.LockerFactory.getWritableLocker(LockerFactory.java:79) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.LockerFactory.getWritableLocker(LockerFactory.java:40) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.Database.put(Database.java:1625) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.Database.putNoOverwrite(Database.java:1737) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at org.apache.doris.ha.BDBHA.fencing(BDBHA.java:84) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.catalog.Env.transferToMaster(Env.java:1497) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.catalog.Env.access$1400(Env.java:341) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.catalog.Env$5.runOneCycle(Env.java:2763) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.common.util.Daemon.run(Daemon.java:119) ~[doris-fe.jar:1.2-SNAPSHOT]
2025-04-16 15:14:58,006 INFO (stateListener|84) [BDBHA.fencing():78] start fencing, epoch number is 11
2025-04-16 15:15:01,090 INFO (UNKNOWN fe_30454908_4898_41cc_b378_752ca5c347c5(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:801 reason:
2025-04-16 15:15:08,006 WARN (stateListener|84) [BDBHA.fencing():94] fencing failed. tried 2 times
com.sleepycat.je.rep.InsufficientReplicasException: (JE 18.3.12) Commit policy: SIMPLE_MAJORITY required 1 replica. But none were active with this master.
at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureReplicasForCommit(DurabilityQuorum.java:116) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.impl.RepImpl.txnBeginHook(RepImpl.java:1171) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn.txnBeginHook(MasterTxn.java:195) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.initTxn(Txn.java:384) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.<init>(Txn.java:288) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.<init>(Txn.java:267) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn.<init>(MasterTxn.java:146) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn$1.create(MasterTxn.java:117) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.txn.MasterTxn.create(MasterTxn.java:435) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.rep.impl.RepImpl.createRepUserTxn(RepImpl.java:1145) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.Txn.createAutoTxn(Txn.java:334) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.LockerFactory.getWritableLocker(LockerFactory.java:79) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.txn.LockerFactory.getWritableLocker(LockerFactory.java:40) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.Database.put(Database.java:1625) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at com.sleepycat.je.Database.putNoOverwrite(Database.java:1737) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
at org.apache.doris.ha.BDBHA.fencing(BDBHA.java:84) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.catalog.Env.transferToMaster(Env.java:1497) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.catalog.Env.access$1400(Env.java:341) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.catalog.Env$5.runOneCycle(Env.java:2763) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.common.util.Daemon.run(Daemon.java:119) ~[doris-fe.jar:1.2-SNAPSHOT]
2025-04-16 15:15:10,007 ERROR (stateListener|84) [Env.transferToMaster():1498] fencing failed. will exit.
The other two FEs then log the following.
105:
2025-04-16 15:38:03,165 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14401 reason:
2025-04-16 15:38:13,178 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14501 reason:
2025-04-16 15:38:23,187 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14601 reason:
2025-04-16 15:38:33,199 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14701 reason:
2025-04-16 15:38:43,209 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14801 reason:
2025-04-16 15:38:53,218 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14901 reason:
2025-04-16 15:39:03,228 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15001 reason:
2025-04-16 15:39:13,237 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15101 reason:
2025-04-16 15:39:23,247 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15201 reason:
2025-04-16 15:39:33,256 INFO (UNKNOWN fe_f1e70a2e_208b_4dc4_b7b7_b101de1b875c(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15301 reason:
106:
2025-04-16 15:38:37,016 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14801 reason:
2025-04-16 15:38:47,025 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:14901 reason:
2025-04-16 15:38:57,036 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15001 reason:
2025-04-16 15:39:07,045 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15101 reason:
2025-04-16 15:39:17,055 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15201 reason:
2025-04-16 15:39:27,064 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15301 reason:
2025-04-16 15:39:37,073 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15401 reason:
2025-04-16 15:39:47,083 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15501 reason:
2025-04-16 15:39:57,095 INFO (UNKNOWN fe_97529bca_e6b1_4c3d_86ce_46c90d8502d1(-1)|1) [Env.waitForReady():1099] wait catalog to be ready. feType:UNKNOWN isReady:false, counter:15601 reason:

1 Answer

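Are all three FEs unable to start? If the cluster still has an FE that is working normally, you can refer to this FE crash quick-recovery guide: Q6, Recovery Option 1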
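
For context on the `SIMPLE_MAJORITY required 1 replica` message in the 107 log: the arithmetic below is a minimal sketch, assuming the commit policy counts the master itself toward the majority of electable FE nodes. The function name `required_replica_acks` is hypothetical and is not part of Doris or BDB JE. With three FEs, the master needs one other FE connected before it can commit, which is consistent with 107 failing fencing while 105 and 106 are still stuck in `feType:UNKNOWN`.

```python
def required_replica_acks(electable_group_size: int) -> int:
    """Replica acknowledgements a master needs under a simple-majority policy.

    Assumption: the master counts as one member of the majority, so the
    number of *other* nodes that must be connected is majority - 1.
    """
    if electable_group_size < 1:
        raise ValueError("group size must be at least 1")
    majority = electable_group_size // 2 + 1  # smallest strict majority
    return majority - 1


if __name__ == "__main__":
    # Three electable FEs, as in this cluster.
    print(required_replica_acks(3))
```

Under this assumption, a three-FE group cannot elect a working master with only one FE participating, so the state of the other two FEs matters as much as the one that exits.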