Doris 2.1.8: query fails with "Size of filter doesn't match size of column"

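Doris version: 2.1.8
Symptom:
One row cannot be retrieved in Doris. The query fails with "Size of filter doesn't match size of column", although the same row can be queried normally in the external Hive table.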

Query:

SELECT
    STRUCT_ELEMENT(device_info, 'device_os') AS dim_value,
    tdid AS kpi_value
FROM cdp_traffic_app.v_cdp_wechat_mp_new_user_yumid new
WHERE partitionday = '20250207'
  AND appkey = '642931FE9530465391215F77CD92957A'
  AND tdid = 'omxHq0BcsEs2raBbcrszmiqq8pfc'

Queries for other rows work fine (the only difference in the query is the tdid value).
Definition of the view v_cdp_wechat_mp_new_user_yumid:

CREATE OR REPLACE
VIEW `v_cdp_wechat_mp_new_user_yumid` AS
select
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`yumid`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`tdid`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`openid`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`geo_info`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`geo_info_extra`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`device_info`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`device_info_extra`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`app_info`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`app_info_extra`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`network_type`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`etl_insert_time`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`partitionday`,
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`.`appkey`
from
    `cdp_traffic_hive`.`dw_all`.`cdp_wechat_mp_new_user_yumid`;

The row also shows nothing abnormal when queried in Hive.
SQL error and BE log:

SQL Error [1105] [HY000]: errCode = 2, detailMessage = (172.16.24.170)[CANCELLED]cur path: hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0. Read parquet file hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0 failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=918, filter.size=4064

	0#  doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:173
	1#  doris::Exception::Exception<unsigned long&, unsigned long&>(int, std::basic_string_view<char, std::char_traits<char> > const&, unsigned long&, unsigned long&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
	2#  doris::vectorized::ColumnVector<unsigned char>::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /home/zcp/repo_center/doris_release/doris/be/src/vec/columns/columns_common.h:86
	3#  doris::vectorized::ColumnNullable::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /home/zcp/repo_center/doris_release/doris/be/src/vec/columns/column_nullable.cpp:373
	4#  doris::vectorized::Block::filter_block_internal(doris::vectorized::Block*, std::vector<unsigned int, std::allocator<unsigned int> > const&, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 16ul> const&) at /home/zcp/repo_center/doris_release/doris/be/src/vec/core/block.cpp:790
	5#  doris::vectorized::RowGroupReader::next_batch(doris::vectorized::Block*, unsigned long, unsigned long*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:0
	6#  doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block*, unsigned long*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:489
	7#  doris::vectorized::VFileScanner::_get_block_wrapped(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:494
	8#  doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:494
	9#  doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vscanner.cpp:0
	10# doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vscanner.cpp:102
	11# doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:380
	12# std::_Function_handler<void (), doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_1::operator()() const::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
	13# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_release/doris/be/src/util/threadpool.cpp:0
	14# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
	15# start_thread
	16# __clone


W20250415 18:50:34.700150  3805 fragment_mgr.cpp:644] report error status: cur path: hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0. Read parquet file hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0 failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=918, filter.size=4064
W20250415 18:50:34.700441 49548 fragment_mgr.cpp:644] report error status: cur path: hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0. Read parquet file hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0 failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=918, filter.size=4064
W20250415 18:51:31.542778 28789 status.h:415] meet error status: [INTERNAL_ERROR]Read parquet file hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0 failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=918, filter.size=4064
W20250415 18:51:31.543093 28789 scanner_scheduler.cpp:283] Scan thread read VScanner failed: [INTERNAL_ERROR]cur path: hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0. Read parquet file hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0 failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=918, filter.size=4064
W20250415 18:51:31.543856  7691 task_scheduler.cpp:361] Pipeline task failed. query_id: 76d8677cd4c84bf7-bd9c31850fb30884|0-0 reason: [INTERNAL_ERROR]cur path: hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0. Read parquet file hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0 failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=918, filter.size=4064
W20250415 18:51:31.545722 49511 fragment_mgr.cpp:644] report error status: cur path: hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0. Read parquet file hdfs://datalake-prd.bigdata3.prd.storage.local/apps/hive/warehouse/dw/all/cdp_wechat_mp_new_user_yumid/partitionday=20250207/appkey=642931FE9530465391215F77CD92957A/000000_0 failed, reason = [E-1721][E-1721] Size of filter doesn't match size of column: size=918, filter.size=4064

How should I troubleshoot this? Are there any known cases like it? This is fairly urgent.
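
One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: the stack trace shows the exception raised in ColumnNullable::filter, reached from RowGroupReader::next_batch through Block::filter_block_internal, so a filter of 4064 rows is being applied to a column that holds only 918 rows while the Parquet file is read. The variants below reuse only the names from the failing query and the view definition. Which column or code path is responsible is an assumption to be tested, not something established here.

```sql
-- Variant 1 (assumption: the STRUCT column is involved).
-- Same predicates, but without reading device_info at all.
SELECT tdid AS kpi_value
FROM cdp_traffic_app.v_cdp_wechat_mp_new_user_yumid
WHERE partitionday = '20250207'
  AND appkey = '642931FE9530465391215F77CD92957A'
  AND tdid = 'omxHq0BcsEs2raBbcrszmiqq8pfc';

-- Variant 2: read the whole struct instead of a single field,
-- to see whether STRUCT_ELEMENT itself makes a difference.
SELECT device_info, tdid
FROM cdp_traffic_app.v_cdp_wechat_mp_new_user_yumid
WHERE partitionday = '20250207'
  AND appkey = '642931FE9530465391215F77CD92957A'
  AND tdid = 'omxHq0BcsEs2raBbcrszmiqq8pfc';

-- Variant 3: the original query against the Hive catalog table directly,
-- bypassing the view, to rule the view in or out.
SELECT
    STRUCT_ELEMENT(device_info, 'device_os') AS dim_value,
    tdid AS kpi_value
FROM cdp_traffic_hive.dw_all.cdp_wechat_mp_new_user_yumid
WHERE partitionday = '20250207'
  AND appkey = '642931FE9530465391215F77CD92957A'
  AND tdid = 'omxHq0BcsEs2raBbcrszmiqq8pfc';
```

If variant 1 succeeds while variant 2 fails, the device_info column would be the likely trigger. If variant 3 fails in the same way as the original query, the view can probably be ruled out.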

I found a similar question on the forum, but it has no conclusion yet:
https://ask.selectdb.com/questions/D11Y1/cha-xun-bao-cuo-size-of-filter-doesn-t-match-size-of-column
