The Doris BE is occasionally restarted automatically. The service is managed by Doris Manager, and the version is 2.1.10.
The be.out log is as follows:
StdoutLogger 2025-11-08 16:39:22,940 Start time: Sat Nov 8 16:39:22 CST 2025
INFO: java_cmd /opt/doris/java8/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
StdoutLogger 2025-11-13 20:54:57,950 Start time: Thu Nov 13 20:54:57 CST 2025
INFO: java_cmd /opt/doris/java8/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Java config name: /etc/krb5.conf
Loaded from Java config
Java config name: /etc/krb5.conf
Loaded from Java config
>>> KdcAccessibility: reset
>>> KdcAccessibility: reset
>>>KinitOptions cache name is /tmp/krb5cc_1177434676
*** Query id: 0-0 ***
*** is nereids: 0 ***
*** tablet id: 0 ***
*** Aborted at 1763700902 (unix time) try "date -d @1763700902" if you are using GNU date ***
*** Current BE git commitID: 33df5ba180 ***
*** SIGSEGV address not mapped to object (@0x20) received by PID 4109415 (TID 3194640 OR 0x7f4a0e7ed700) from PID 32; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_release/doris/be/src/common/signal_handler.h:421
1# os::Linux::chained_handler(int, siginfo_t*, void*) in /opt/doris/java8/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /opt/doris/java8/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo_t*, void*) in /opt/doris/java8/jre/lib/amd64/server/libjvm.so
4# 0x00007F5B482B15B0 in /lib64/libc.so.6
5# google::protobuf::internal::ArenaStringPtr::Set(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, google::protobuf::Arena*) in /opt/doris/be/lib/doris_be
6# doris::DeleteSubPredicatePB::MergeImpl(google::protobuf::Message&, google::protobuf::Message const&) at /home/zcp/repo_center/doris_release/doris/gensrc/build/gen_cpp/olap_file.pb.cc:4774
7# void google::protobuf::internal::RepeatedPtrFieldBase::MergeFromInnerLoop<google::protobuf::RepeatedPtrField<doris::DeleteSubPredicatePB>::TypeHandler>(void**, void**, int, int) at /home/zcp/repo_center/doris_release/doris/thirdparty/installed/include/google/protobuf/repeated_ptr_field.h:693
8# doris::DeletePredicatePB::MergeImpl(google::protobuf::Message&, google::protobuf::Message const&) at /home/zcp/repo_center/doris_release/doris/gensrc/build/gen_cpp/olap_file.pb.cc:4418
9# doris::RowsetMetaPB::MergeImpl(google::protobuf::Message&, google::protobuf::Message const&) at /home/zcp/repo_center/doris_release/doris/gensrc/build/gen_cpp/olap_file.pb.cc:3181
10# doris::RowsetMeta::to_rowset_pb(doris::RowsetMetaPB*) const at /home/zcp/repo_center/doris_release/doris/be/src/olap/rowset/rowset_meta.cpp:112
11# doris::TabletMeta::to_meta_pb(doris::TabletMetaPB*) at /home/zcp/repo_center/doris_release/doris/be/src/olap/tablet_meta.cpp:713
12# doris::TabletMeta::serialize(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /home/zcp/repo_center/doris_release/doris/be/src/olap/tablet_meta.cpp:510
13# doris::TabletMeta::_save_meta(doris::DataDir*) in /opt/doris/be/lib/doris_be
14# doris::TabletMeta::save_meta(doris::DataDir*) at /home/zcp/repo_center/doris_release/doris/be/src/olap/tablet_meta.cpp:479
15# doris::Tablet::save_meta() at /home/zcp/repo_center/doris_release/doris/be/src/olap/tablet.cpp:351
16# doris::Tablet::do_tablet_meta_checkpoint() in /opt/doris/be/lib/doris_be
17# doris::TabletManager::do_tablet_meta_checkpoint(doris::DataDir*) in /opt/doris/be/lib/doris_be
18# doris::ThreadPool::dispatch_thread() in /opt/doris/be/lib/doris_be
19# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_release/doris/be/src/util/thread.cpp:499
20# start_thread in /lib64/libpthread.so.0
21# __GI___clone in /lib64/libc.so.6
StdoutLogger 2025-11-21 12:55:16,512 Start time: Fri Nov 21 12:55:16 CST 2025
INFO: java_cmd /opt/doris/java8/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/doris/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Java config name: /etc/krb5.conf
Loaded from Java config
Java config name: /etc/krb5.conf
Loaded from Java config
>>> KdcAccessibility: reset
>>> KdcAccessibility: reset
>>>KinitOptions cache name is /tmp/krb5cc_1177434676
"be.out" 11661L, 1067173C
The be.INFO log is as follows:
I20251121 12:54:54.083719 4110916 tablets_channel.cpp:136] open tablets channel (load_id=a549ff9816402a26-c8f9b408f80011a1, index_id=3889527), tablets num: 29 timeout(s): 600
I20251121 12:54:54.083827 4110916 tablets_channel.cpp:165] txn 9403051: TabletsChannel of index 3889527 init senders 1 with incremental off
I20251121 12:54:54.084916 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-4064083], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=W1PLPI-SPS09:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084931 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-6542378], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=10.20.19.129:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084936 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-10038], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=W1VLPI-SPS04:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084941 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-10077], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=W1VLPI-SPS07:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084945 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-4064082], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=W1PLPI-SPS08:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084976 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-10058], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=W1VLPI-SPS06:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084981 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-6542380], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=10.20.19.130:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084986 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-6542379], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=10.20.19.131:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.084990 4109869 vtablet_writer.cpp:975] VNodeChannel[3889527-10039], load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node=W1VLPI-SPS05:8068 mark closed, left pending batch size: 1 hang_wait: 0
I20251121 12:54:54.086784 4110527 tablets_channel.cpp:271] close tablets channel: (load_id=a549ff9816402a26-c8f9b408f80011a1, index_id=3889527), sender id: 0, backend id: 6542380
I20251121 12:54:54.086855 4111588 vtablet_writer.cpp:1016] All node channels are stopped(maybe finished/offending/cancelled), sender thread exit. a549ff9816402a26-c8f9b408f80011a1
I20251121 12:54:54.088069 4110527 load_channel.cpp:219] txn 9403051 closed tablets_channel 3889527
I20251121 12:54:54.088136 4110527 load_channel.cpp:71] load channel removed load_id=a549ff9816402a26-c8f9b408f80011a1, is high priority=0, sender_ip=10.20.19.130, index id: 3889527, total_received_rows: 29, num_rows_filtered: 0
I20251121 12:54:54.092367 4109869 vtablet_writer.cpp:1589] total mem_exceeded_block_ns=0, total queue_push_lock_ns=0, total actual_consume_ns=625778, load id=a549ff9816402a26-c8f9b408f80011a1
I20251121 12:54:54.092383 4109869 vtablet_writer.cpp:1636] finished to close olap table sink. load_id=a549ff9816402a26-c8f9b408f80011a1, txn_id=9403051, node add batch time(ms)/wait execution time(ms)/close time(ms)/num: {10039:(4)(0)(8)(1)} {6542379:(0)(0)(8)(1)} {6542380:(1)(0)(8)(1)} {10058:(5)(0)(8)(1)} {4064082:(3)(0)(8)(1)} {10077:(4)(0)(8)(1)} {10038:(5)(0)(8)(1)} {6542378:(0)(0)(3)(1)} {4064083:(0)(0)(3)(1)}
I20251121 12:54:54.092684 4109831 query_context.cpp:191] Query a549ff9816402a26-c8f9b408f80011a1 deconstructed, use wg: , query type: 1, mem_tracker: , deregister query/load memory tracker, queryId=a549ff9816402a26-c8f9b408f80011a1, Limit=2.00 GB, CurrUsed=680.41 KB, PeakUsed=2.77 MB
I20251121 12:54:54.101576 3191518 task_worker_pool.cpp:337] successfully submit task|type=PUBLISH_VERSION|signature=9403051
I20251121 12:54:54.101903 4110440 engine_publish_version_task.cpp:456] publish version successfully on tablet, table_id=3889526, tablet=8844016, transaction_id=9403051, version=2322, num_rows=29, res=[OK], cost: 224(us)
I20251121 12:54:54.101933 4110487 engine_publish_version_task.cpp:349] finish to publish version on transaction.transaction_id=9403051, cost(us): 306, error_tablet_size=0, res=[OK]
I20251121 12:54:54.101958 4110487 task_worker_pool.cpp:1619] successfully publish version|signature=9403051|transaction_id=9403051|tablets_num=1|cost(s)=0
I20251121 12:54:54.116940 4111882 stream_load.cpp:137] finished to execute stream load. label=audit_log_20251121_125454_75_127_0_0_1_8038, txn_id=9403051, query_id=a549ff9816402a26-c8f9b408f80011a1, load_cost_ms=38, receive_data_cost_ms=3, read_data_cost_ms=0, write_data_cost_ms=11, commit_and_publish_txn_cost_ms=24, number_total_rows=62, number_loaded_rows=62, receive_bytes=18648, loaded_bytes=22368
I20251121 12:54:54.118218 3191518 task_worker_pool.cpp:337] successfully submit task|type=UPDATE_VISIBLE_VERSION|signature=-1
I20251121 12:54:54.655059 4110948 pipeline_x_fragment_context.cpp:212] PipelineXFragmentContext::prepare|query_id=940e96d1f0544888-abf4b2cbfac45ee0|fragment_id=0|pthread_id=140016525584128
I20251121 12:54:54.657476 4110816 fragment_mgr.cpp:796] Query 940e96d1f0544888-abf4b2cbfac45ee0 start execution
I20251121 12:54:54.659548 4111029 internal_service.cpp:660] Cancel query 940e96d1f0544888-abf4b2cbfac45ee0, reason: INTERNAL_ERROR
I20251121 12:54:54.659584 4111029 pipeline_x_fragment_context.cpp:147] PipelineXFragmentContext::cancel|query_id=940e96d1f0544888-abf4b2cbfac45ee0|fragment_id=0|reason=INTERNAL_ERROR|error message=
W20251121 12:54:54.659593 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef1
W20251121 12:54:54.659618 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef2
W20251121 12:54:54.659626 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef3
W20251121 12:54:54.659632 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef4
W20251121 12:54:54.659641 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef5
W20251121 12:54:54.659651 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef6
W20251121 12:54:54.659657 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef7
W20251121 12:54:54.659663 4111029 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 940e96d1f0544888-abf4b2cbfac45ef8
W20251121 12:54:54.659677 4109956 fragment_mgr.cpp:628] report error status: to coordinator: TNetworkAddress(hostname=W1VLPI-SPS02, port=9028), query id: 940e96d1f0544888-abf4b2cbfac45ee0, instance id: 0-0
I20251121 12:54:54.659850 4109956 fragment_mgr.cpp:660] Going to cancel instance 0-0 since report exec status got rpc failed: [RUNTIME_ERROR]TStatus:
I20251121 12:54:54.660158 4109956 query_context.cpp:191] Query 940e96d1f0544888-abf4b2cbfac45ee0 deconstructed, use wg: normal, query type: 0, mem_tracker: , deregister query/load memory tracker, queryId=940e96d1f0544888-abf4b2cbfac45ee0, Limit=2.00 GB, CurrUsed=8.06 KB, PeakUsed=2.04 MB
I20251121 12:54:57.967583 4110792 pipeline_x_fragment_context.cpp:212] PipelineXFragmentContext::prepare|query_id=29fe77085a9e45be-b083d059d9e814c5|fragment_id=1|pthread_id=140017834845952
I20251121 12:54:57.967797 4110792 pipeline_x_fragment_context.cpp:212] PipelineXFragmentContext::prepare|query_id=29fe77085a9e45be-b083d059d9e814c5|fragment_id=0|pthread_id=140017834845952
I20251121 12:54:57.970237 4110872 fragment_mgr.cpp:796] Query 29fe77085a9e45be-b083d059d9e814c5 start execution
I20251121 12:54:57.972002 4110871 internal_service.cpp:660] Cancel query 29fe77085a9e45be-b083d059d9e814c5, reason: INTERNAL_ERROR
I20251121 12:54:57.972030 4110871 pipeline_x_fragment_context.cpp:147] PipelineXFragmentContext::cancel|query_id=29fe77085a9e45be-b083d059d9e814c5|fragment_id=0|reason=INTERNAL_ERROR|error message=
W20251121 12:54:57.972039 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814de
W20251121 12:54:57.972077 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814df
W20251121 12:54:57.972085 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814e0
W20251121 12:54:57.972119 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814e1
W20251121 12:54:57.972127 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814e2
W20251121 12:54:57.972133 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814e3
W20251121 12:54:57.972141 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814e4
W20251121 12:54:57.972147 4110871 pipeline_x_fragment_context.cpp:168] PipelineXFragmentContext cancel instance: 29fe77085a9e45be-b083d059d9e814e5
W20251121 12:54:57.972167 4109957 fragment_mgr.cpp:628] report error status: to coordinator: TNetworkAddress(hostname=W1VLPI-SPS03, port=9028), query id: 29fe77085a9e45be-b083d059d9e814c5, instance id: 0-0
I20251121 12:54:57.972368 4109957 fragment_mgr.cpp:660] Going to cancel instance 0-0 since report exec status got rpc failed: [RUNTIME_ERROR]TStatus:
I20251121 12:54:57.972592 4110790 internal_service.cpp:660] Cancel query 29fe77085a9e45be-b083d059d9e814c5, reason: INTERNAL_ERROR
W20251121 12:54:57.972613 4110790 fragment_mgr.cpp:1297] Could not find the query id:29fe77085a9e45be-b083d059d9e814c5 fragment id:1 to cancel
I20251121 12:54:57.972654 4109957 query_context.cpp:191] Query 29fe77085a9e45be-b083d059d9e814c5 deconstructed, use wg: normal, query type: 0, mem_tracker: , deregister query/load memory tracker, queryId=29fe77085a9e45be-b083d059d9e814c5, Limit=2.00 GB, CurrUsed=8.06 KB, PeakUsed=2.05 MB
I20251121 12:54:59.834551 3191518 workload_group_listener.cpp:65] [topic_publish_wg]update workload group finish, wg info=TG[id = 1, name = normal, cpu_share = 1024, memory_limit = 33.75 GB, enable_memory_overcommit = true, version = 0, cpu_hard_limit = -1, scan_thread_num = 128, max_remote_scan_thread_num = 640, min_remote_scan_thread_num = 8, spill_low_watermark=50, spill_high_watermark=80, is_shutdown=false, query_num=0, read_bytes_per_second=-1, remote_read_bytes_per_second=-1], enable_cpu_hard_limit=false, cgroup cpu_shares=0, cgroup cpu_hard_limit=0, cgroup home path=, list size=1, thread info=[exec num:64, real_num:64, min_num:64, max_num:64],[l_scan num:128, real_num:128, min_num:128, max_num128],[r_scan num:8, real_num:8, min_num:8, max_num:640],[mem_tab_flush num:6, real_num:6, min_num:6, max_num:6]
I20251121 12:54:59.834619 3191518 workload_group_manager.cpp:134] [topic_publish_wg]finish clear unused workload group, time cost: 0 ms, deleted group size:0, before wg size=1, after wg size=1
I20251121 12:54:59.834633 3191518 topic_subscriber.cpp:47] [topic_publish]finish handle topic WORKLOAD_GROUP, size=1
I20251121 12:54:59.834646 3191518 workload_sched_policy_listener.cpp:79] [workload_schedule]finish update workload schedule policy, size=0
I20251121 12:54:59.834656 3191518 topic_subscriber.cpp:47] [topic_publish]finish handle topic WORKLOAD_SCHED_POLICY, size=0
I20251121 12:55:00.519582 4110260 wal_manager.cpp:486] Scheduled(every 10s) WAL info: [/app/doris/storage/wal: limit 163574524313 Bytes, used 0 Bytes, estimated wal bytes 0 Bytes, available 163574524313 Bytes.];
I20251121 12:55:00.719671 4110401 tablet.cpp:1980] cumulative compaction meet delete rowset, increase cumu point without other operation.|tablet id:=8844016|after cumulative compaction, cumu point:=2
I20251121 12:55:02.153086 4110259 load_channel_mgr.cpp:218] cleaning timed out load channels
I20251121 12:55:02.564435 4110413 olap_server.cpp:448] begin to produce tablet meta checkpoint tasks, data_dir=/app/doris/storage
I20251121 12:55:03.085464 4111916 daemon.cpp:221] os physical memory 125.01 GB. process memory used 15.24 GB(= 19.70 GB[vm/rss] - 4.46 GB[tc/jemalloc_cache] + 0[reserved] + 0B[waiting_refresh]), limit 112.51 GB, soft limit 101.26 GB. sys available memory 100.39 GB(= 100.39 GB[proc/available] - 0[reserved] - 0B[waiting_refresh]), low water mark 6.25 GB, warning water mark 12.50 GB.
I20251121 12:55:16.644951 3195424 doris_main.cpp:382] version doris-2.1.10-rc01(AVX2) RELEASE (build git://vm-107@33df5ba180f25af140b332f491958c0c7a77337a)
Built on Wed, 14 May 2025 22:22:09 CST by vm-107
I20251121 12:55:16.740797 3195424 jni-util.cpp:104] set final LIBHDFS_OPTS: -Xmx16g -Xms16g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+ParallelRefProcEnabled -XX:+UseCompressedOops -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -DlogPath=/opt/doris/be/log/jni.log -Xloggc:/opt/doris/be/log/be.gc.log.20251121-125516 -Djavax.security.auth.useSubjectCredsOnly=false -Dsun.security.krb5.debug=true -Dsun.java.command=DorisBE -XX:-CriticalJNINatives -Djava.security.krb5.conf=/etc/krb5.conf
I20251121 12:55:18.604444 3195424 doris_main.cpp:490] Doris backend JNI is initialized.
I20251121 12:55:18.605095 3195424 mem_info.cpp:373] Physical Memory: 134225395712, BE Available Physical Memory(consider cgroup): 134225395712, Mem Limit: 112.51 GB, origin config value: 90%, System Mem Available Min Reserve: 6.25 GB, Vm Min Free KBytes: 88.00 MB, Vm Overcommit Memory: 0
I20251121 12:55:18.605119 3195424 doris_main.cpp:508] Cpu Info:
Model: Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz
Cores: 64
Max Possible Cores: 64
L1 Cache: 32.00 KB (Line: 64.00 B)
L2 Cache: 1.00 MB (Line: 64.00 B)
L3 Cache: 22.00 MB (Line: 64.00 B)
Hardware Supports:
ssse3
sse4_1
sse4_2
popcnt
avx
avx2
Numa Nodes: 2
Numa Nodes of Cores: 0->0 | 1->1 | 2->0 | 3->1 | 4->0 | 5->1 | 6->0 | 7->1 | 8->0 | 9->1 | 10->0 | 11->1 | 12->0 | 13->1 | 14->0 | 15->1 | 16->0 | 17->1 | 18->0 | 19->1 | 20->0 | 21->1 | 22->0 | 23->1 | 24->0 | 25->1 | 26->0 | 27->1 | 28->0 | 29->1 | 30->0 | 31->1 | 32->0 | 33->1 | 34->0 | 35->1 | 36->0 | 37->1 | 38->0 | 39->1 | 40->0 | 41->1 | 42->0 | 43->1 | 44->0 | 45->1 | 46->0 | 47->1 | 48->0 | 49->1 | 50->0 | 51->1 | 52->0 | 53->1 | 54->0 | 55->1 | 56->0 | 57->1 | 58->0 | 59->1 | 60->0 | 61->1 | 62->0 | 63->1 |
I20251121 12:55:18.605144 3195424 doris_main.cpp:509] Disk Info:
Num disks 3: sda, sdb, dm-
I20251121 12:55:18.605149 3195424 doris_main.cpp:510] Physical Memory: 134225395712
Memory Limt: 120802856140
CGroup Info: Process CGroup Memory Info (cgroups path: /sys/fs/cgroup/memory/user.slice/user-1177418869.slice/session-38.scope, cgroup version: v1): memory limit: 9223372036854771712, memory usage: 2080067584
I20251121 12:55:18.605425 3195424 backend_options.cpp:108] priority cidrs: 10.20.19.0/24
I20251121 12:55:18.605710 3195424 backend_options.cpp:136] skip ip not belonged to priority networks: 127.0.0.1
The output of dmesg -T is as follows:
[Tue May 27 22:30:38 2025] bond1: (slave eno1np0): link status definitely down, disabling slave
[Tue May 27 22:30:38 2025] bond1: (slave ens1f1np1): making interface the new active one
[Tue May 27 23:18:20 2025] bnxt_en 0000:18:00.0 eno1np0: NIC Link is Up, 10000 Mbps (NRZ) full duplex, Flow control: ON - receive & transmit
[Tue May 27 23:18:20 2025] bnxt_en 0000:18:00.0 eno1np0: FEC autoneg off encoding: None
[Tue May 27 23:18:21 2025] bond1: (slave eno1np0): link status definitely up, 10000 Mbps full duplex
[Tue May 27 23:24:34 2025] bnxt_en 0000:3b:00.1 ens1f1np1: NIC Link is Down
[Tue May 27 23:24:34 2025] bond1: (slave ens1f1np1): link status definitely down, disabling slave
[Tue May 27 23:24:34 2025] bond1: (slave eno1np0): making interface the new active one
[Wed May 28 00:10:01 2025] bnxt_en 0000:3b:00.1 ens1f1np1: NIC Link is Up, 10000 Mbps (NRZ) full duplex, Flow control: ON - receive & transmit
[Wed May 28 00:10:01 2025] bnxt_en 0000:3b:00.1 ens1f1np1: FEC autoneg off encoding: None
[Wed May 28 00:10:01 2025] bond1: (slave ens1f1np1): link status definitely up, 10000 Mbps full duplex
[Wed May 28 00:21:41 2025] bnxt_en 0000:18:00.0 eno1np0: NIC Link is Down
[Wed May 28 00:21:41 2025] bond1: (slave eno1np0): link status definitely down, disabling slave
[Wed May 28 00:21:41 2025] bond1: (slave ens1f1np1): making interface the new active one
[Wed May 28 00:41:30 2025] bnxt_en 0000:18:00.0 eno1np0: NIC Link is Up, 10000 Mbps (NRZ) full duplex, Flow control: ON - receive & transmit
[Wed May 28 00:41:30 2025] bnxt_en 0000:18:00.0 eno1np0: FEC autoneg off encoding: None
[Wed May 28 00:41:30 2025] bond1: (slave eno1np0): link status definitely up, 10000 Mbps full duplex
[Thu Jun 5 18:04:31 2025] perf: interrupt took too long (14657 > 14458), lowering kernel.perf_event_max_sample_rate to 13000
[Mon Jul 28 17:48:02 2025] perf: interrupt took too long (18552 > 18321), lowering kernel.perf_event_max_sample_rate to 10000
[Mon Jul 28 18:10:08 2025] perf: interrupt took too long (23350 > 23190), lowering kernel.perf_event_max_sample_rate to 8000
[Mon Jul 28 22:31:56 2025] perf: interrupt took too long (29374 > 29187), lowering kernel.perf_event_max_sample_rate to 6000
[Wed Jul 30 19:09:37 2025] bnxt_en 0000:18:00.0 eno1np0: NIC Link is Down
[Wed Jul 30 19:09:37 2025] bond1: (slave eno1np0): link status definitely down, disabling slave
[Wed Jul 30 19:50:26 2025] bnxt_en 0000:18:00.0 eno1np0: NIC Link is Up, 10000 Mbps (NRZ) full duplex, Flow control: ON - receive & transmit
[Wed Jul 30 19:50:26 2025] bnxt_en 0000:18:00.0 eno1np0: FEC autoneg off encoding: None
[Wed Jul 30 19:50:26 2025] bond1: (slave eno1np0): link status definitely up, 10000 Mbps full duplex
[Wed Jul 30 20:05:09 2025] bnxt_en 0000:3b:00.1 ens1f1np1: NIC Link is Down
[Wed Jul 30 20:05:09 2025] bond1: (slave ens1f1np1): link status definitely down, disabling slave
[Wed Jul 30 20:05:09 2025] bond1: (slave eno1np0): making interface the new active one
[Wed Jul 30 20:52:13 2025] bnxt_en 0000:3b:00.1 ens1f1np1: NIC Link is Up, 10000 Mbps (NRZ) full duplex, Flow control: ON - receive & transmit
[Wed Jul 30 20:52:13 2025] bnxt_en 0000:3b:00.1 ens1f1np1: FEC autoneg off encoding: None
[Wed Jul 30 20:52:13 2025] bond1: (slave ens1f1np1): link status definitely up, 10000 Mbps full duplex
[Fri Aug 8 11:04:42 2025] perf: interrupt took too long (36806 > 36717), lowering kernel.perf_event_max_sample_rate to 5000