Doris BE memory usage is too high


With no writes or queries running, the BE's memory usage stays consistently high. How should I troubleshoot this?
be.INFO log

I20250904 16:24:29.780475 600385 daemon.cpp:265] [MemoryGC] start minor GC, process memory used 32.65 GB exceed soft limit 37.71 GB or sys available memory 3.45 GB less than warning water mark 4.66 GB..
I20250904 16:24:29.780511 600385 cache_manager.h:78] [MemoryGC] cache no prune stale, last prune less than interval 10, now 1756974269, last timestamp 1756974260
I20250904 16:24:29.780596 600385 memory_reclamation.cpp:234] [MemoryGC] start GC work load group that enable overcommit, number of group: 1, request_free_memory:4498620825, total_exceeded_memory:0, request more than exceeded, try free size = (group used - group limit).
I20250904 16:24:29.780614 600385 mem_tracker_limiter.cpp:568] [MemoryGC] GC free process top memory overcommit query, , start seek all query, running query and load num: 0
I20250904 16:24:29.780808 600385 mem_tracker_limiter.cpp:607] [MemoryGC] GC free process top memory overcommit query, seek finished, seek 0 tasks. among them, 0 tasks can be canceled; 0 small tasks that were skipped; 0 tasks is being canceled and has not been completed yet;
I20250904 16:24:29.780824 600385 mem_tracker_limiter.cpp:615] [MemoryGC] GC free process top memory overcommit query, finished, no task need be canceled.
I20250904 16:24:29.780848 600385 memory_reclamation.cpp:43] [MemoryGC] end minor GC, free memory 0. cost(us): 347, details: :
  FreeTopOvercommitMemoryQuery:
  WorkloadGroup:
I20250904 16:24:30.882858 600385 daemon.cpp:265] [MemoryGC] start minor GC, process memory used 32.65 GB exceed soft limit 37.71 GB or sys available memory 3.45 GB less than warning water mark 4.66 GB..
I20250904 16:24:30.882902 600385 cache_manager.h:78] [MemoryGC] cache no prune stale, last prune less than interval 10, now 1756974270, last timestamp 1756974260
I20250904 16:24:30.883005 600385 memory_reclamation.cpp:234] [MemoryGC] start GC work load group that enable overcommit, number of group: 1, request_free_memory:4498620825, total_exceeded_memory:0, request more than exceeded, try free size = (group used - group limit).
I20250904 16:24:30.883028 600385 mem_tracker_limiter.cpp:568] [MemoryGC] GC free process top memory overcommit query, , start seek all query, running query and load num: 0
I20250904 16:24:30.883231 600385 mem_tracker_limiter.cpp:607] [MemoryGC] GC free process top memory overcommit query, seek finished, seek 0 tasks. among them, 0 tasks can be canceled; 0 small tasks that were skipped; 0 tasks is being canceled and has not been completed yet;
I20250904 16:24:30.883286 600385 mem_tracker_limiter.cpp:615] [MemoryGC] GC free process top memory overcommit query, finished, no task need be canceled.
I20250904 16:24:30.883322 600385 memory_reclamation.cpp:43] [MemoryGC] end minor GC, free memory 0. cost(us): 434, details: :
  FreeTopOvercommitMemoryQuery:
  WorkloadGroup:
I20250904 16:24:31.114328 599751 data_dir.cpp:877] path: /data/doris-2.0.0/be/storage total capacity: 527295578112, available capacity: 484288319488, usage: 0.081562, in_use: 1
I20250904 16:24:31.503995 599751 storage_engine.cpp:363] get root path info cost: 389 ms. tablet counter: 391258
I20250904 16:24:31.984711 600385 daemon.cpp:265] [MemoryGC] start minor GC, process memory used 32.65 GB exceed soft limit 37.71 GB or sys available memory 3.45 GB less than warning water mark 4.66 GB..
I20250904 16:24:31.984776 600385 lru_cache_policy.h:148] [MemoryGC] MowDeleteBitmapAggCache not need prune stale, LRUCacheType::SIZE consumption 0 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.984797 600385 lru_cache_policy.h:154] [MemoryGC] CreateTabletRRIdxCache not need prune stale, LRUCacheType::NUMBER usage 15 less than CACHE_MIN_FREE_NUMBER 1024
I20250904 16:24:31.984809 600385 lru_cache_policy.h:148] [MemoryGC] DataPageCache not need prune stale, LRUCacheType::SIZE consumption 787366 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.984820 600385 lru_cache_policy.h:148] [MemoryGC] IndexPageCache not need prune stale, LRUCacheType::SIZE consumption 50280 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.984831 600385 lru_cache_policy.h:135] [MemoryGC] TabletSchemaCache prune stale start, consumption 473566, usage 7258
I20250904 16:24:31.984846 600385 lru_cache_policy.h:142] [MemoryGC] TabletSchemaCache prune stale 0 entries, 0 bytes, 168 times prune
I20250904 16:24:31.984858 600385 lru_cache_policy.h:148] [MemoryGC] PKIndexPageCache not need prune stale, LRUCacheType::SIZE consumption 0 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.984869 600385 lru_cache_policy.h:148] [MemoryGC] PointQueryRowCache not need prune stale, LRUCacheType::SIZE consumption 0 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.984880 600385 lru_cache_policy.h:148] [MemoryGC] SegmentCache not need prune stale, LRUCacheType::SIZE consumption 183176 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.984891 600385 lru_cache_policy.h:154] [MemoryGC] SchemaCache not need prune stale, LRUCacheType::NUMBER usage 18 less than CACHE_MIN_FREE_NUMBER 1024
I20250904 16:24:31.984903 600385 lru_cache_policy.h:135] [MemoryGC] CommonObjLRUCache prune stale start, consumption 0, usage 0
I20250904 16:24:31.984917 600385 lru_cache_policy.h:142] [MemoryGC] CommonObjLRUCache prune stale 0 entries, 0 bytes, 168 times prune
I20250904 16:24:31.984928 600385 lru_cache_policy.h:154] [MemoryGC] PointQueryLookupConnectionCache not need prune stale, LRUCacheType::NUMBER usage 0 less than CACHE_MIN_FREE_NUMBER 1024
I20250904 16:24:31.984939 600385 lru_cache_policy.h:148] [MemoryGC] InvertedIndexSearcherCache not need prune stale, LRUCacheType::SIZE consumption 0 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.984949 600385 lru_cache_policy.h:148] [MemoryGC] InvertedIndexQueryCache not need prune stale, LRUCacheType::SIZE consumption 0 less than CACHE_MIN_FREE_SIZE 67108864
I20250904 16:24:31.985062 600385 memory_reclamation.cpp:234] [MemoryGC] start GC work load group that enable overcommit, number of group: 1, request_free_memory:4498620825, total_exceeded_memory:0, request more than exceeded, try free size = (group used - group limit).
I20250904 16:24:31.985085 600385 mem_tracker_limiter.cpp:568] [MemoryGC] GC free process top memory overcommit query, , start seek all query, running query and load num: 0
I20250904 16:24:31.985281 600385 mem_tracker_limiter.cpp:607] [MemoryGC] GC free process top memory overcommit query, seek finished, seek 0 tasks. among them, 0 tasks can be canceled; 0 small tasks that were skipped; 0 tasks is being canceled and has not been completed yet;
I20250904 16:24:31.985293 600385 mem_tracker_limiter.cpp:615] [MemoryGC] GC free process top memory overcommit query, finished, no task need be canceled.
I20250904 16:24:31.985364 600385 memory_reclamation.cpp:43] [MemoryGC] end minor GC, free memory 0. cost(us): 609, details: :
  FreeTopOvercommitMemoryQuery:
  WorkloadGroup:
I20250904 16:24:32.732210 600367 stream_load.cpp:197] new income streaming load request.id=5243b776b5409734-5c7d2f4cdc2b628a, job_id=-1, txn_id=-1, label=audit_log_20250904_162420_676_127_0_0_1_8030, elapse(s)=0, db=__internal_schema, tbl=audit_log, group_commit=0
I20250904 16:24:32.737165 600367 stream_load_executor.cpp:72] begin to execute stream load. label=audit_log_20250904_162420_676_127_0_0_1_8030, txn_id=31453234, query_id=5243b776b5409734-5c7d2f4cdc2b628a
I20250904 16:24:32.737192 600367 fragment_mgr.cpp:653] query_id: 5243b776b5409734-5c7d2f4cdc2b628a, coord_addr: TNetworkAddress(hostname=10.10.3.122, port=9020), total fragment num on current host: 0, fe process uuid: 0, query type: LOAD, report audit fe:TNetworkAddress(hostname=10.10.3.122, port=9020)
I20250904 16:24:32.737283 600367 fragment_mgr.cpp:705] Register query/load memory tracker, query/load id: 5243b776b5409734-5c7d2f4cdc2b628a limit: 0
I20250904 16:24:32.737314 600367 pipeline_fragment_context.cpp:252] Preparing instance 5243b776b5409734-5c7d2f4cdc2b628a|5243b776b5409734-5c7d2f4cdc2b628b, backend_num 0
I20250904 16:24:32.738222 600367 stream_load.cpp:203] finished to handle HTTP header, id=5243b776b5409734-5c7d2f4cdc2b628a, job_id=-1, txn_id=31453234, label=audit_log_20250904_162420_676_127_0_0_1_8030, elapse(s)=0
I20250904 16:24:32.739524 599107 vtablet_writer.cpp:127] init new node for instance 0, incremantal:0
I20250904 16:24:32.739562 599107 vtablet_writer.cpp:127] init new node for instance 0, incremantal:0
I20250904 16:24:32.739578 599107 vtablet_writer.cpp:127] init new node for instance 0, incremantal:0
I20250904 16:24:32.740315 599927 tablets_channel.cpp:136] open tablets channel of index -1, tablets num: 68 timeout(s): 600
I20250904 16:24:32.740463 599927 tablets_channel.cpp:164] txn 31453234: TabletsChannel of index 16092070 init senders 1 with incremental off
I20250904 16:24:32.744447 599107 vtablet_writer.cpp:955] VNodeChannel[16092070-10092], load_id=5243b776b5409734-5c7d2f4cdc2b628a, txn_id=31453234, node=10.10.3.124:8060 mark closed, left pending batch size: 1
I20250904 16:24:32.744483 599107 vtablet_writer.cpp:955] VNodeChannel[16092070-10039], load_id=5243b776b5409734-5c7d2f4cdc2b628a, txn_id=31453234, node=10.10.3.123:8060 mark closed, left pending batch size: 1
I20250904 16:24:32.744495 599107 vtablet_writer.cpp:955] VNodeChannel[16092070-10020], load_id=5243b776b5409734-5c7d2f4cdc2b628a, txn_id=31453234, node=10.10.3.122:8060 mark closed, left pending batch size: 1
I20250904 16:24:32.746388 599849 tablets_channel.cpp:268] close tablets channel: (load_id=5243b776b5409734-5c7d2f4cdc2b628a, index_id=16092070), sender id: 0, backend id: 10092
I20250904 16:24:32.746560 600207 vtablet_writer.cpp:995] All node channels are stopped(maybe finished/offending/cancelled), sender thread exit. 5243b776b5409734-5c7d2f4cdc2b628a
I20250904 16:24:32.746738 599505 vertical_segment_writer.cpp:704] add a single block 8
I20250904 16:24:32.746866 599502 vertical_segment_writer.cpp:704] add a single block 4
I20250904 16:24:32.757683 599849 load_channel.cpp:217] txn 31453234 closed tablets_channel 16092070
I20250904 16:24:32.757856 599849 load_channel.cpp:69] load channel removed load_id=5243b776b5409734-5c7d2f4cdc2b628a, is high priority=0, sender_ip=10.10.3.124, index id: 16092070, total_received_rows: 12, num_rows_filtered: 0
I20250904 16:24:32.768528 599107 vtablet_writer.cpp:1560] total mem_exceeded_block_ns=0, total queue_push_lock_ns=0, total actual_consume_ns=294623, load id=5243b776b5409734-5c7d2f4cdc2b628a
I20250904 16:24:32.768556 599107 vtablet_writer.cpp:1607] finished to close olap table sink. load_id=5243b776b5409734-5c7d2f4cdc2b628a, txn_id=31453234, node add batch time(ms)/wait execution time(ms)/close time(ms)/num: {10020:(7)(0)(24)(1)} {10039:(21)(0)(24)(1)} {10092:(12)(0)(14)(1)} 
I20250904 16:24:32.768774 599079 fragment_mgr.cpp:608] Removing query 5243b776b5409734-5c7d2f4cdc2b628a instance 5243b776b5409734-5c7d2f4cdc2b628b, all done? true
I20250904 16:24:32.768807 599079 fragment_mgr.cpp:614] Query 5243b776b5409734-5c7d2f4cdc2b628a finished
I20250904 16:24:32.768837 599079 query_context.cpp:136] Query 5243b776b5409734-5c7d2f4cdc2b628a deconstructed, , deregister query/load memory tracker, queryId=5243b776b5409734-5c7d2f4cdc2b628a, Limit=2.00 GB, CurrUsed=620.84 KB, PeakUsed=4.85 MB
I20250904 16:24:32.768884 599079 query_context.cpp:168] Query 5243b776b5409734-5c7d2f4cdc2b628a deconstructed, , deregister query/load memory tracker, queryId=5243b776b5409734-5c7d2f4cdc2b628a, Limit=2.00 GB, CurrUsed=620.84 KB, PeakUsed=4.85 MB
I20250904 16:24:32.768955 599079 stream_load_executor.cpp:139] finished to execute stream load. label=audit_log_20250904_162420_676_127_0_0_1_8030, txn_id=31453234, query_id=5243b776b5409734-5c7d2f4cdc2b628a, receive_data_cost_ms=6, read_data_cost_ms=0, write_data_cost_ms=31
I20250904 16:24:32.776863 600401 task_worker_pool.cpp:332] successfully submit task|type=PUBLISH_VERSION|signature=31453234
I20250904 16:24:32.777272 599597 engine_publish_version_task.cpp:405] publish version successfully on tablet, table_id=16092069, tablet=43083546, transaction_id=31453234, version=1236, num_rows=4, res=[OK], cost: 271(us) 
I20250904 16:24:32.777295 599604 engine_publish_version_task.cpp:405] publish version successfully on tablet, table_id=16092069, tablet=43083550, transaction_id=31453234, version=1236, num_rows=8, res=[OK], cost: 278(us) 
I20250904 16:24:32.777446 599697 engine_publish_version_task.cpp:326] finish to publish version on transaction.transaction_id=31453234, cost(us): 504, error_tablet_size=0, res=[OK]
I20250904 16:24:32.777500 599697 task_worker_pool.cpp:1606] successfully publish version|signature=31453234|transaction_id=31453234|tablets_num=2|cost(s)=0
I20250904 16:24:32.793427 600367 stream_load.cpp:706] put stream_load_record rocksdb successfully. label: audit_log_20250904_162420_676_127_0_0_1_8030, key: 1756974272793_audit_log_20250904_162420_676_127_0_0_1_8030
I20250904 16:24:32.893266 600401 task_worker_pool.cpp:332] successfully submit task|type=UPDATE_VISIBLE_VERSION|signature=-1
I20250904 16:24:33.086982 600385 daemon.cpp:265] [MemoryGC] start minor GC, process memory used 32.64 GB exceed soft limit 37.71 GB or sys available memory 3.45 GB less than warning water mark 4.66 GB..
I20250904 16:24:33.087025 600385 cache_manager.h:78] [MemoryGC] cache no prune stale, last prune less than interval 10, now 1756974273, last timestamp 1756974271
I20250904 16:24:33.087131 600385 memory_reclamation.cpp:234] [MemoryGC] start GC work load group that enable overcommit, number of group: 1, request_free_memory:4498620825, total_exceeded_memory:0, request more than exceeded, try free size = (group used - group limit).
I20250904 16:24:33.087154 600385 mem_tracker_limiter.cpp:568] [MemoryGC] GC free process top memory overcommit query, , start seek all query, running query and load num: 0
I20250904 16:24:33.087463 600385 mem_tracker_limiter.cpp:607] [MemoryGC] GC free process top memory overcommit query, seek finished, seek 0 tasks. among them, 0 tasks can be canceled; 0 small tasks that were skipped; 0 tasks is being canceled and has not been completed yet;
I20250904 16:24:33.087502 600385 mem_tracker_limiter.cpp:615] [MemoryGC] GC free process top memory overcommit query, finished, no task need be canceled.
I20250904 16:24:33.087553 600385 memory_reclamation.cpp:43] [MemoryGC] end minor GC, free memory 0. cost(us): 541, details: :
  FreeTopOvercommitMemoryQuery:
  WorkloadGroup:
I20250904 16:24:33.252985 599332 load_channel_mgr.cpp:216] cleaning timed out load channels
I20250904 16:24:34.189050 600385 daemon.cpp:265] [MemoryGC] start minor GC, process memory used 32.65 GB exceed soft limit 37.71 GB or sys available memory 3.45 GB less than warning water mark 4.66 GB..
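
One way to read the repeated `start minor GC` line: the message joins two conditions with "or", and in the excerpt above only the second one appears to hold (32.65 GB used is below the 37.71 GB soft limit, while 3.45 GB available is below the 4.66 GB warning water mark). A minimal Python sketch that checks this for a single log line follows. The regex and the helper name `gc_trigger` are illustrative assumptions derived from the message format shown here, not part of Doris.

```python
import re

# Pattern derived from the "start minor GC" message format in the excerpt above.
GC_LINE = re.compile(
    r"process memory used (?P<used>[\d.]+) GB exceed soft limit (?P<soft>[\d.]+) GB "
    r"or sys available memory (?P<avail>[\d.]+) GB "
    r"less than warning water mark (?P<mark>[\d.]+) GB"
)


def gc_trigger(line):
    """Report which of the two logged conditions holds (hypothetical helper)."""
    m = GC_LINE.search(line)
    if m is None:
        return None  # not a "start minor GC" line
    v = {key: float(val) for key, val in m.groupdict().items()}
    return {
        "process_over_soft_limit": v["used"] > v["soft"],
        "sys_available_below_watermark": v["avail"] < v["mark"],
    }


sample = (
    "I20250904 16:24:29.780475 600385 daemon.cpp:265] [MemoryGC] start minor GC, "
    "process memory used 32.65 GB exceed soft limit 37.71 GB or sys available memory "
    "3.45 GB less than warning water mark 4.66 GB.."
)
print(gc_trigger(sample))
```

If this reading is right, the GC runs are being driven by low system-wide available memory rather than by the BE exceeding its own soft limit, and each run reports `free memory 0` because no query or load is running to cancel.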

be metrics

curl http://10.10.3.124:8040/metrics | grep -i "memory" | grep -v "#"  
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 33160  100 33160    0     0   415k      0 --:--:-- --:--:-- --:--:--  420k
doris_be_memory_jemalloc_pmuzzy_num 0
doris_be_memory_pool_bytes_total 0
doris_be_memory_jemalloc_pdirty_num 0
doris_be_memory_jemalloc_resident_bytes 35379679232
doris_be_memory_jemalloc_active_bytes 34901381120
doris_be_memory_jemalloc_muzzy_purged_num 221398
doris_be_memory_jemalloc_allocated_bytes 34843475528
doris_be_memory_pswpout 0
doris_be_memory_pgpgin 558177393
doris_be_memory_pgpgout 2677470383
doris_be_memory_jemalloc_metadata_bytes 520692256
doris_be_memory_allocated_bytes 35429982208
doris_be_query_cache_memory_total_byte 0
doris_be_memory_jemalloc_retained_bytes 7895199744
doris_be_memtable_memory_limiter_mem_consumption{type="load"} 0
doris_be_memory_jemalloc_mapped_bytes 35492777984
doris_be_memory_jemalloc_dirty_purged_num 2934944
doris_be_memory_pswpin 0
doris_be_memory_jemalloc_tcache_bytes 418773904
doris_be_memory_jemalloc_pactive_num 8520845
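
To make the byte counters above easier to compare, here is a small Python sketch that parses `name value` lines and prints the jemalloc figures in GiB. The values are copied from the output above. The helper name `parse_metrics` is hypothetical, and the sketch assumes these Doris metrics mirror the jemalloc statistics of the same name (`allocated` for bytes handed to the application, `resident` for physically resident pages, `retained` for virtual memory kept but not returned to the OS).

```python
GIB = 1024 ** 3


def parse_metrics(text):
    """Parse 'name value' lines; skip comments and lines without a numeric value."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            continue  # e.g. curl progress-meter lines
    return metrics


# Values copied from the /metrics output in this post.
sample = """
doris_be_memory_jemalloc_resident_bytes 35379679232
doris_be_memory_jemalloc_active_bytes 34901381120
doris_be_memory_jemalloc_allocated_bytes 34843475528
doris_be_memory_jemalloc_metadata_bytes 520692256
doris_be_memory_jemalloc_retained_bytes 7895199744
doris_be_memory_jemalloc_tcache_bytes 418773904
"""

metrics = parse_metrics(sample)
for name, value in sorted(metrics.items()):
    print(f"{name:45s} {value / GIB:8.2f} GiB")

# Gap between resident pages and bytes actually allocated by the application.
overhead = (
    metrics["doris_be_memory_jemalloc_resident_bytes"]
    - metrics["doris_be_memory_jemalloc_allocated_bytes"]
)
print(f"resident - allocated: {overhead / GIB:.2f} GiB")
```

Under those assumptions, allocated and resident bytes differ by only about half a GiB, which would suggest the memory is held by live allocations inside the BE process rather than by dirty or unreturned pages in the allocator.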

BE bucket distribution
image.png
