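We currently have 3 servers, each with 16 cores and 64 GB of memory, deployed in integrated storage-compute mode with 3 FEs and 3 BEs. Data is synchronized in real time through Debezium + Kafka + the Doris sink. Each batch is small, ranging from a few dozen to a few hundred rows, and at most around 5,000. However, the business data changes frequently, with updates arriving almost constantly. When configuring Debezium we already added a limit so that changes are captured only once every 10 seconds. In addition, we plan to use Doris for report queries and have already created around 60 materialized views, all of them asynchronous with incremental refresh. Judging from the synchronization, the data volume really is small, but after running for a while the BE nodes suddenly crash. I would like to know: is this a problem with how I set up the servers, or do I need more server resources to support my current workload? The BE log shows the following: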
W20260408 11:43:28.995352 75401 memory_profile.cpp:339] Process Memory Summary: os physical memory 62.89 GB. process memory used 16.07 GB(= 39.64 GB[vm/rss] - 23.57 GB[tc/jemalloc_cache] + 0[reserved] + 0B[waiting_refresh]), limit 56.60 GB, soft limit 50.94 GB. sys available memory 4.32 GB(= 4.32 GB[proc/available] - 0[reserved] - 0B[waiting_refresh]), low water mark 3.14 GB, warning water mark 6.29 GB.
W20260408 11:43:28.995515 75401 memory_profile.cpp:340]
MemoryOverviewSnapshot:
- PhysicalMemory(VmRSS) Current: 39.64 GB (Peak: 39.64 GB)
- VirtualMemory(VmSize) Current: 224.63 GB (Peak: 224.66 GB)
UntrackedMemory:- Memory Current: 2.58 GB (Peak: 23.10 GB)
TrackedMemory: - Memory Current: 37.07 GB (Peak: 37.17 GB)
TasksMemory:- Memory Current: 12.30 GB (Peak: 20.33 GB)
- ReservedMemory Current: 0 (Peak: 0)
Details: - Compaction Current: 45.25 MB (Peak: 590.33 MB)
- Load Current: 12.25 GB (Peak: 20.23 GB)
- AllMemTablesMemory Current: 3.27 GB (Peak: 7.82 GB)
- Other Current: 0 (Peak: 4.36 KB)
- Query Current: 5.94 KB (Peak: 9.21 GB)
- SchemaChange Current: 0 (Peak: 0)
GlobalMemory:
- ReservedMemory Current: 0 (Peak: 0)
- Memory Current: 67.27 MB (Peak: 392.31 MB)
MetadataMemory: - Memory Current: -1451712621.00 B (Peak: 1.14 GB)
CacheMemory: - Memory Current: 861.74 MB (Peak: 1.99 GB)
JemallocMemory: - Memory Current: 25.21 GB (Peak: 28.75 GB)
Details:- Cache Current: 23.57 GB (Peak: 27.11 GB)
- Metadata Current: 1.64 GB (Peak: 1.64 GB)
W20260408 11:43:29.294100 75401 memory_profile.cpp:341]
GlobalMemorySnapshot:
Orphan@global@id=414d623c482cb79f-f8015f4955b3f383:
- Memory Current: 12.30 GB (Peak: 20.33 GB)
- Memory Current: 0 (Peak: 0)
IOBufBlockMemory@global@id=cc4f4a366877baf2-1d8384a52b324783: - Memory Current: 53.31 MB (Peak: 380.86 MB)
PointQueryExecutor@global@id=eb49fb3d4aa8c1c0-7b46da0d4e7a84af: - Memory Current: 0 (Peak: 0)
BlockCompression@global@id=fd4db66996649362-9efcc15ee9786cae: - Memory Current: 13.96 MB (Peak: 13.96 MB)
RowIdStorageReader@global@id=424d5f90bdc852c4-49e75010a134bbaa: - Memory Current: 0 (Peak: 32.31 KB)
SubcolumnsTree@global@id=d845e049ee18d293-d589c6ef14b23aa8: - Memory Current: 0 (Peak: 0)
S3FileBuffer@global@id=ed401136392deb26-79413f949aecb0b9: - Memory Current: 0 (Peak: 0)
W20260408 11:43:29.329536 75401 memory_profile.cpp:342]
MetadataMemorySnapshot:
Tablets(not in SchemaCache, TabletSchemaCache)@metadata@id=824ec3d4423627ef-7163ba01f9e32baf: - Memory Current: -1993335140.00 B (Peak: 51.77 MB)
Segments(not in SegmentCache)@metadata@id=d7411e1c720b5405-c95bb24f5418c58e: - Memory Current: 7.53 MB (Peak: 10.34 MB)
Rowsets@metadata@id=ae47dce06c11a9b9-b449c044331d3b9e: - Memory Current: 76.83 MB (Peak: 115.81 MB)
ParquetMeta@metadata@id=4844bfce224b31eb-9c55518c43b124bb: - Memory Current: 0 (Peak: 0)
SegmentCache[size]@metadata@id=d943b11fe53ab9a4-dd6ff860060b3081: - Memory Current: 415.65 MB (Peak: 1.06 GB)
SchemaCache[number]@metadata@id=6845b0e5161dade5-15b1ed658c13368c: - Memory Current: 2.30 MB (Peak: 2.67 MB)
TabletSchemaCache[number]@metadata@id=524e5f1db09ed6ef-ab6eb8977245af9c: - Memory Current: 14.22 MB (Peak: 14.27 MB)
W20260408 11:43:29.329617 75401 memory_profile.cpp:343]
CacheMemorySnapshot:
QueryCache@cache@id=3c4f21cf2fb2625c-ee343bb67a75b585: - Memory Current: 0 (Peak: 0)
DataPageCache[size]@cache@id=b342e1db4a28e450-f52a3f1066388ab2: - Memory Current: 676.41 MB (Peak: 1.68 GB)
IndexPageCache[size]@cache@id=a74aae694f841b1b-3f02534c5c3514a8: - Memory Current: 58.03 MB (Peak: 256.85 MB)
PKIndexPageCache[size]@cache@id=7549da48f931d72d-94dc0a6149c377aa: - Memory Current: 54.41 MB (Peak: 224.25 MB)
PointQueryRowCache[size]@cache@id=0343eb56cc2d2676-efcb5d690148e581: - Memory Current: 0 (Peak: 0)
CommonObjLRUCache[number]@cache@id=dd4b3799a808dd7f-08b9583df9074aba: - Memory Current: 0 (Peak: 0)
PointQueryLookupConnectionCache[number]@cache@id=164f4a4b718a7ef7-f81001a9310ae2be: - Memory Current: 0 (Peak: 0)
InvertedIndexSearcherCache[size]@cache@id=80477320249d843c-6e6c7c6fcdfccea1: - Memory Current: 0 (Peak: 0)
InvertedIndexQueryCache[size]@cache@id=ef4d860ed77c4ad4-f1940fdb4401438c: - Memory Current: 0 (Peak: 0)
QueryCache[size]@cache@id=274c63e79d50de83-929ab67bf12a26a0: - Memory Current: 0 (Peak: 0)
MowDeleteBitmapAggCache[size]@cache@id=5f48944c99e2036f-b072595ac78a81b7: - Memory Current: 61.35 MB (Peak: 64.75 MB)
LoadStateChannelCache [number]@cache@id=08498156bd46a322-498a49abfee7b99f: - Memory Current: 0 (Peak: 0)
TabletColumnObjectPool[number]@cache@id=e14946521c67c0c8-fc8fd619c8f78b92: - Memory Current: 94.47 KB (Peak: 12.22 MB)
MowTabletVersionCache[number]@cache@id=b844446e844ccb80-03c2e108f9d0cb93: - Memory Current: 11.35 MB (Peak: 11.35 MB)
CreateTabletRRIdxCache[number]@cache@id=cd40c01725abc907-fd3ed81699c8b78c: - Memory Current: 105.54 KB (Peak: 115.67 KB)
W20260408 11:43:29.329803 75401 memory_profile.cpp:344]
TopMemoryTasksSnapshot:
Load#Id=12c5743b260b4f05-9a52716318372067@load@id=1f42dfa410b8b51a-1d37c208b3f31c98: - Limit: 2.00 GB
- Memory Current: 7.07 GB (Peak: 9.94 GB)
Load#Id=f554884f9d0a4a92-a5ce36cc3ee8827a@load@id=514feadf97af0524-daf74b70614649b1: - Limit: 2.00 GB
- Memory Current: 1.43 GB (Peak: 1.64 GB)
Load#Id=9135133805924275-ad2e8e9cdc2a5274@load@id=844b303ff7de28cc-20945ab168c163a2: - Limit: 2.00 GB
- Memory Current: 1.03 GB (Peak: 2.70 GB)
Load#Id=682af4224ed4501-be175670a065851f@load@id=0a44483393d33c04-096ede3d29cc7e8c: - Limit: 2.00 GB
- Memory Current: 742.39 MB (Peak: 1.83 GB)
Load#Id=5f6bacca68ec40dc-b5005aa42d04d917@load@id=b44d0e4afdd2e540-47811dfa6c99668c: - Limit: 2.00 GB
- Memory Current: 696.39 MB (Peak: 717.66 MB)
Load#Id=d6afe55293f745dc-b9ff073af847d1ba@load@id=a1411a0aeba2ea2b-a7ac8fb750cdb3a0: - Limit: 2.00 GB
- Memory Current: 686.58 MB (Peak: 692.36 MB)
Load#Id=d1d08de0d38c4370-8ca20a3d081d1117@load@id=114298441c6580b5-f79a81b57059e3ba: - Limit: 2.00 GB
- Memory Current: 394.91 MB (Peak: 400.59 MB)
Load#Id=6bc159c5dd004419-8ec7ed664bebc2c2@load@id=294dea3c17e3eb7d-1b5607aa40650bae: - Limit: 2.00 GB
- Memory Current: 220.34 MB (Peak: 283.40 MB)
Load#Id=8430bb802d9e4804-aada0ef9b7c8a935@load@id=4147e7be91a0193b-ee7f031990f1328b: - Limit: 2.00 GB
- Memory Current: 23.86 MB (Peak: 51.31 MB)
Load#Id=15df7786b0e34df4-a35f7a37a3348218@load@id=7346d6ed49d6b321-676cb61b7e2dab8a: - Limit: 2.00 GB
- Memory Current: 18.63 MB (Peak: 19.25 MB)
CumulativeCompaction:1775206217430@compaction@id=1940f0c4a060e8b8-61e1235b385d3288: - Memory Current: 15.65 MB (Peak: 15.65 MB)
CumulativeCompaction:1775206217434@compaction@id=ad4bbfddc5833417-af6c89a4e31e5bb1: - Memory Current: 11.95 MB (Peak: 54.96 MB)
CumulativeCompaction:1775206217442@compaction@id=b44a6e4c0c76e1f6-1920d3a41aaae488: - Memory Current: 8.65 MB (Peak: 8.65 MB)
CumulativeCompaction:1775206217454@compaction@id=8e46ea3352814b14-06ecd56ae593dcb4: - Memory Current: 4.50 MB (Peak: 53.84 MB)
CumulativeCompaction:1775206217418@compaction@id=204f18818ec5b9ad-21a6a2c16fbb2992: - Memory Current: 4.50 MB (Peak: 55.12 MB)
W20260408 11:43:44.106897 71202 sampler.cpp:194] bvar is busy at sampling for 2 seconds!
W20260408 11:43:57.981467 523128 vtablet_writer.cpp:837] cancel node channel VNodeChannel[1773914257837-1763454559578], load_id=c5402a92830c0d09-b59ab43c02eaa5a6, txn_id=6617279, node=172.22.113.105:8060, error message: timeout
W20260408 11:44:00.236279 523128 vtablet_writer.cpp:365] reach max wait time, max_wait_time_ms: 60000, cancel unfinished node channel and finish close, load id: c5402a92830c0d09-b59ab43c02eaa5a6, txn_id: 6617279, unfinished node channel: 172.22.113.105,
W20260408 11:44:00.275105 524406 vtablet_writer.cpp:837] cancel node channel VNodeChannel[1773914164244-1763454559578], load_id=6540a952b7e2a0dc-22a8c80f7a2cfb89, txn_id=6617283, node=172.22.113.105:8060, error message: timeout
W20260408 11:44:01.876127 524513 vtablet_writer.cpp:837] cancel node channel VNodeChannel[1773914235796-1763454559578], load_id=274218c72625b44f-1875410ebdca2290, txn_id=6617287, node=172.22.113.105:8060, error message: timeout
W20260408 11:44:07.685233 71202 sampler.cpp:194] bvar is busy at sampling for 2 seconds!
W20260408 11:44:06.827119 76764 global.cpp:246] GlobalUpdate is too busy!
W20260408 11:44:06.226248 524406 vtablet_writer.cpp:365] reach max wait time, max_wait_time_ms: 60000, cancel unfinished node channel and finish close, load id: 6540a952b7e2a0dc-22a8c80f7a2cfb89, txn_id: 6617283, unfinished node channel: 172.22.113.105,
W20260408 11:44:06.236351 524513 vtablet_writer.cpp:365] reach max wait time, max_wait_time_ms: 60000, cancel unfinished node channel and finish close, load id: 274218c72625b44f-1875410ebdca2290, txn_id: 6617287, unfinished node channel: 172.22.113.105,
W20260408 11:44:09.580600 526461 task_worker_pool.cpp:409] failed to register task|type=PUBLISH_VERSION|signature=6617281
W20260408 11:44:16.148491 525644 vtablet_writer.cpp:837] cancel node channel VNodeChannel[1773914237430-1763454559578], load_id=a44a438e8950cd16-dd15906bcf768496, txn_id=6617295, node=172.22.113.105:8060, error message: timeout
W20260408 11:44:25.542743 76764 global.cpp:246] GlobalUpdate is too busy!
W20260408 11:44:25.542805 526685 heartbeat_server.cpp:101] heartbeat consume too much time. time=3346380069, host:172.22.113.106, port:9020, cluster id:1613424285, frontend_info:TFrontendInfo(coordinator_address=TNetworkAddress(hostname=172.22.113.106, port=9020), process_uuid=1775205218760) TFrontendInfo(coordinator_address=TNetworkAddress(hostname=172.22.113.105, port=9020), process_uuid=1775205218987) TFrontendInfo(coordinator_address=TNetworkAddress(hostname=172.22.113.104, port=9020), process_uuid=1775209994999) , counter:0, BE start time: 1775205486625
W20260408 11:44:26.665649 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425897, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.834749 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425901, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.834859 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425905, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.834911 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425909, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.834936 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425913, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.835124 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425917, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.835140 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425921, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.835158 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425925, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.835176 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425929, tablet_state=TABLET_RUNNING, version=3316
W20260408 11:44:26.835207 75654 engine_publish_version_task.cpp:354] publish version failed on transaction, tablet version not exists. transaction_id=6617291, tablet_id=1773915425933, tablet_state=TABLET_RUNNING, version=3316
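To make the numbers in the log above easier to read, here is a minimal Python sketch that re-derives the "Process Memory Summary" arithmetic and compares the three largest load tasks against their 2.00 GB limit. All figures are copied from the log; the variable and helper names (`SUMMARY`, `gb`, `LOADS`, and so on) are my own and are not part of Doris. It only checks the arithmetic in the log and does not diagnose the crash.

```python
import re

# Copied verbatim from the "Process Memory Summary" line in the BE log above.
SUMMARY = (
    "os physical memory 62.89 GB. process memory used 16.07 GB"
    "(= 39.64 GB[vm/rss] - 23.57 GB[tc/jemalloc_cache] + 0[reserved] "
    "+ 0B[waiting_refresh]), limit 56.60 GB, soft limit 50.94 GB. "
    "sys available memory 4.32 GB(= 4.32 GB[proc/available] - 0[reserved] "
    "- 0B[waiting_refresh]), low water mark 3.14 GB, warning water mark 6.29 GB."
)


def gb(pattern: str, text: str = SUMMARY) -> float:
    """Return the first GB figure captured by `pattern` (hypothetical helper)."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return float(match.group(1))


rss = gb(r"([\d.]+) GB\[vm/rss\]")
jemalloc_cache = gb(r"([\d.]+) GB\[tc/jemalloc_cache\]")
reported_used = gb(r"process memory used ([\d.]+) GB")
available = gb(r"sys available memory ([\d.]+) GB")
low_mark = gb(r"low water mark ([\d.]+) GB")
warning_mark = gb(r"warning water mark ([\d.]+) GB")

# The log computes "used" as RSS minus the jemalloc cache (reserved terms are 0).
derived_used = rss - jemalloc_cache

# Three largest load tasks from TopMemoryTasksSnapshot:
# (load id prefix, current GB, peak GB). Each reports "Limit: 2.00 GB".
LOAD_LIMIT_GB = 2.00
LOADS = [
    ("12c5743b260b4f05", 7.07, 9.94),
    ("f554884f9d0a4a92", 1.43, 1.64),
    ("9135133805924275", 1.03, 2.70),
]

over_now = [load_id for load_id, current, _ in LOADS if current > LOAD_LIMIT_GB]
over_peak = [load_id for load_id, _, peak in LOADS if peak > LOAD_LIMIT_GB]

print(f"derived used: {derived_used:.2f} GB, reported: {reported_used:.2f} GB")
print(f"available {available} GB is between low {low_mark} and warning {warning_mark}: "
      f"{low_mark < available < warning_mark}")
print("loads above limit now:", over_now)
print("loads above limit at peak:", over_peak)
```

Read this way, the log shows system available memory already below the warning water mark, and at least one load task holding several times its stated limit at the moment the snapshot was taken.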