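Version 2.0.3, cluster deployment. One BE node went down. Automatic restart is configured, but the BE fails to start. Could someone please take a look?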

A large number of core dump files were generated.

start time: Thu Jul 17 14:04:52 CST 2025
INFO: java_cmd /usr/local/java8/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/doris-be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/doris-be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/doris-be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/doris-be/lib/hadoop_hdfs/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
*** Query id: 0-0 ***
*** tablet id: 114067338 ***
*** Aborted at 1752732312 (unix time) try "date -d @1752732312" if you are using GNU date ***
*** Current BE git commitID: 37d31a5 ***
*** SIGFPE integer divide by zero (@0x55f7c1d62c14) received by PID 7020 (TID 8407 OR 0x7ff5e7026700) from PID 18446744072666622996; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/src/doris-2.0/be/src/common/signal_handler.h:417
 1# os::Linux::chained_handler(int, siginfo*, void*) in /usr/local/java8/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/local/java8/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /usr/local/java8/jre/lib/amd64/server/libjvm.so
 4# 0x00007FF88CBDE400 in /lib64/libc.so.6
 5# doris::segment_v2::NGramBloomFilter::add_bytes(char const*, unsigned int) at /root/src/doris-2.0/be/src/olap/rowset/segment_v2/ngram_bloom_filter.cpp:61
 6# doris::segment_v2::NGramBloomFilterIndexWriterImpl::add_values(void const*, unsigned long) at /root/src/doris-2.0/be/src/olap/rowset/segment_v2/bloom_filter_index_writer.cpp:251
 7# doris::segment_v2::ScalarColumnWriter::append_data_in_current_page(unsigned char const*, unsigned long*) at /root/src/doris-2.0/be/src/olap/rowset/segment_v2/column_writer.cpp:555
 8# doris::segment_v2::ScalarColumnWriter::append_data(unsigned char const**, unsigned long) at /root/src/doris-2.0/be/src/olap/rowset/segment_v2/column_writer.cpp:528
 9# doris::segment_v2::ColumnWriter::append_nullable(unsigned char const*, unsigned char const**, unsigned long) at /root/src/doris-2.0/be/src/olap/rowset/segment_v2/column_writer.cpp:403
10# doris::segment_v2::ColumnWriter::append(unsigned char const*, void const*, unsigned long) in /home/doris-be/lib/doris_be
11# doris::segment_v2::SegmentWriter::append_block(doris::vectorized::Block const*, unsigned long, unsigned long) in /home/doris-be/lib/doris_be
12# doris::BetaRowsetWriter::_do_add_block(doris::vectorized::Block const*, std::unique_ptr<doris::segment_v2::SegmentWriter, std::default_delete<doris::segment_v2::SegmentWriter> >*, unsigned long, unsigned long) at /root/src/doris-2.0/be/src/olap/rowset/beta_rowset_writer.cpp:394
13# doris::BetaRowsetWriter::_add_block(doris::vectorized::Block const*, std::unique_ptr<doris::segment_v2::SegmentWriter, std::default_delete<doris::segment_v2::SegmentWriter> >*, doris::FlushContext const*) at /root/src/doris-2.0/be/src/olap/rowset/beta_rowset_writer.cpp:425
14# doris::BetaRowsetWriter::add_block(doris::vectorized::Block const*) in /home/doris-be/lib/doris_be
15# doris::VSchemaChangeDirectly::_inner_process(std::shared_ptr<doris::RowsetReader>, doris::RowsetWriter*, std::shared_ptr<doris::Tablet>, std::shared_ptr<doris::TabletSchema>) at /root/src/doris-2.0/be/src/olap/schema_change.cpp:494
16# doris::SchemaChange::process(std::shared_ptr<doris::RowsetReader>, doris::RowsetWriter*, std::shared_ptr<doris::Tablet>, std::shared_ptr<doris::Tablet>, std::shared_ptr<doris::TabletSchema>) at /root/src/doris-2.0/be/src/olap/schema_change.h:121
17# doris::SchemaChangeHandler::_convert_historical_rowsets(doris::SchemaChangeHandler::SchemaChangeParams const&) at /root/src/doris-2.0/be/src/olap/schema_change.cpp:1121
18# doris::SchemaChangeHandler::_do_process_alter_tablet_v2(doris::TAlterTabletReqV2 const&) in /home/doris-be/lib/doris_be
19# doris::SchemaChangeHandler::process_alter_tablet_v2(doris::TAlterTabletReqV2 const&) at /root/src/doris-2.0/be/src/olap/schema_change.cpp:670
20# doris::EngineAlterTabletTask::execute() at /root/src/doris-2.0/be/src/olap/task/engine_alter_tablet_task.cpp:51
21# doris::StorageEngine::execute_task(doris::EngineTask*) at /root/src/doris-2.0/be/src/olap/storage_engine.cpp:1245
22# doris::AlterTableTaskPool::_alter_tablet(doris::TAgentTaskRequest const&, long, doris::TTaskType::type, doris::TFinishTaskRequest*) at /root/src/doris-2.0/be/src/agent/task_worker_pool.cpp:1767
23# doris::AlterTableTaskPool::_alter_tablet_worker_thread_callback() at /root/src/doris-2.0/be/src/agent/task_worker_pool.cpp:1733
24# doris::ThreadPool::dispatch_thread() in /home/doris-be/lib/doris_be
25# doris::Thread::supervise_thread(void*) at /root/src/doris-2.0/be/src/util/thread.cpp:499
26# start_thread in /lib64/libpthread.so.0
27# clone in /lib64/libc.so.6
2 Answers
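A community member suggested that a schema change job may be stuck. The stack trace is consistent with that: the SIGFPE is raised in NGramBloomFilter::add_bytes while an alter-tablet task (SchemaChangeHandler) is converting historical rowsets for tablet 114067338.
Check the schema change jobs:
show alter table column;
Cancel any job that is in the RUNNING state:
CANCEL ALTER TABLE COLUMN
FROM db_name.table_name;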
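
If several tables have running schema change jobs, the cancel step could be scripted. The following is only a minimal sketch: `build_cancel_statements` is a hypothetical helper, and it assumes the rows returned by `show alter table column` have already been fetched into dicts with `TableName` and `State` keys (check the column names on your version). It only builds the statement text and does not connect to Doris.

```python
# Hypothetical helper: build CANCEL statements from "show alter table column" rows.
# Assumption: each row is a dict with "TableName" and "State" keys.

def build_cancel_statements(db_name, rows):
    """Return one CANCEL statement per table that has a RUNNING job."""
    statements = []
    seen = set()
    for row in rows:
        table = row["TableName"]
        # Only RUNNING jobs are cancelled; emit each table at most once.
        if row["State"].upper() == "RUNNING" and table not in seen:
            seen.add(table)
            statements.append(
                f"CANCEL ALTER TABLE COLUMN FROM {db_name}.{table};"
            )
    return statements


if __name__ == "__main__":
    # Sample rows are made up for illustration.
    sample = [
        {"TableName": "table_name", "State": "RUNNING"},
        {"TableName": "other_table", "State": "FINISHED"},
    ]
    for stmt in build_cancel_statements("db_name", sample):
        print(stmt)
```

Review the generated statements before running them, since cancelling a schema change job discards its progress.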