最近在做数据血缘的相关工作,对explain语句的解析有两个问题
- EXPLAIN VERBOSE语句在查询insert语句的时候,OLAP TABLE SINK输出的TUPLE ID对应的TupleDescriptor的tbl内容为null
如SQL语句
EXPLAIN VERBOSE
INSERT into data_platform.test1 select
t1.uid as ID,
max(dt) as create_time,
min(dt) as update_time
from data_platform.test t1
left join data_platform.ods_test_wh_1 t2 on t1.uid=t2.AGE
where uid > 10
group by uid
order by uid
获得的结果为
PLAN FRAGMENT 0
OUTPUT EXPRS:
id[#15]
create_time[#16]
update_time[#17]
PARTITION: HASH_PARTITIONED: uid[#5]
HAS_COLO_PLAN_NODE: true
OLAP TABLE SINK
TUPLE ID: 7
RANDOM
IS_PARTIAL_UPDATE: false
5:VAGGREGATE (update finalize)(543)
| output: max(dt[#11])[#13], min(dt[#11])[#14]
| group by: uid[#10]
| cardinality=950,123
| final projections: uid[#12], create_time[#13], update_time[#14]
| final project output tuple id: 6
| distribute expr lists: uid[#10]
| tuple ids: 5
|
4:VHASH JOIN(533)
| join op: LEFT OUTER JOIN(PARTITIONED)[]
| equal join conjunct: (uid[#5] = expr_cast(AGE as INT)[#4])
| cardinality=2,999,986
| vec output tuple id: 4
| output tuple id: 4
| vIntermediate tuple ids: 3
| hash output slot ids: 5 6
| isMarkJoin: false
| final projections: uid[#7], dt[#8]
| final project output tuple id: 4
| distribute expr lists: uid[#5]
| distribute expr lists: expr_cast(AGE as INT)[#4]
| tuple ids: 2 1N
|
|----1:VEXCHANGE
| offset: 0
| distribute expr lists:
| tuple ids: 1N
|
3:VEXCHANGE
offset: 0
distribute expr lists:
tuple ids: 2
PLAN FRAGMENT 1
PARTITION: RANDOM
HAS_COLO_PLAN_NODE: false
STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: uid[#5]
2:VOlapScanNode(502)
TABLE: data_platform.test(test), PREAGGREGATION: ON
PREDICATES: (uid[#5] > 10)
partitions=1/1 (test)
tablets=10/10, tabletList=99467678,99467694,99467710 ...
cardinality=3000017, avgRowSize=0.0, numNodes=1
pushAggOp=NONE
tuple ids: 2
PLAN FRAGMENT 2
PARTITION: HASH_PARTITIONED: USER[#0]
HAS_COLO_PLAN_NODE: false
STREAM DATA SINK
EXCHANGE ID: 01
HASH_PARTITIONED: expr_cast(AGE as INT)[#4]
0:VOlapScanNode(513)
TABLE: data_platform.ods_test_wh_1(ods_test_wh_1), PREAGGREGATION: ON
PREDICATES: ((CAST(AGE[#1] AS int) > 10) AND (__DORIS_DELETE_SIGN__[#2] = 0))
partitions=1/1 (ods_test_wh_1)
tablets=10/10, tabletList=167939628,167939632,167939636 ...
cardinality=64, avgRowSize=0.0, numNodes=1
pushAggOp=NONE
final projections: CAST(AGE[#1] AS int)
final project output tuple id: 1
tuple ids: 0
Tuples:
TupleDescriptor{id=0, tbl=ods_test_wh_1}
SlotDescriptor{id=1, col=AGE, colUniqueId=1, type=tinyint, nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=2, col=__DORIS_DELETE_SIGN__, colUniqueId=2, type=tinyint, nullable=false, isAutoIncrement=false, subColPath=null}
TupleDescriptor{id=1, tbl=ods_test_wh_1}
SlotDescriptor{id=4, col=null, colUniqueId=null, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
TupleDescriptor{id=2, tbl=test}
SlotDescriptor{id=5, col=uid, colUniqueId=0, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=6, col=dt, colUniqueId=1, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
TupleDescriptor{id=3, tbl=null}
SlotDescriptor{id=7, col=uid, colUniqueId=0, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=8, col=dt, colUniqueId=1, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=9, col=null, colUniqueId=null, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
TupleDescriptor{id=4, tbl=null}
SlotDescriptor{id=10, col=uid, colUniqueId=0, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=11, col=dt, colUniqueId=1, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
TupleDescriptor{id=5, tbl=null}
SlotDescriptor{id=12, col=uid, colUniqueId=0, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=13, col=null, colUniqueId=null, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=14, col=null, colUniqueId=null, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
TupleDescriptor{id=6, tbl=null}
SlotDescriptor{id=15, col=uid, colUniqueId=0, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=16, col=null, colUniqueId=null, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=17, col=null, colUniqueId=null, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
TupleDescriptor{id=7, tbl=null}
SlotDescriptor{id=18, col=id, colUniqueId=0, type=int, nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=19, col=create_time, colUniqueId=1, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
SlotDescriptor{id=20, col=update_time, colUniqueId=2, type=datetimev2(0), nullable=true, isAutoIncrement=false, subColPath=null}
此处查阅了fe-core的代码,其中org.apache.doris.nereids.glue.translator.PhysicalPlanTranslator#visitPhysicalOlapTableSink方法中olapTuple对象生成后未设置table成员,此处应可以直接设置参数中olapTableSink的olapTableSink.getTargetTable()方法获取table并赋值到olapTuple,所以此处是否为bug
2. 查阅fe-core的DorisParser.g4文件时,EXPLAIN应该支持plantype和level的解析
explain
: explainCommand planType?
level=(VERBOSE | TREE | GRAPH | PLAN)?
PROCESS?
;
planType
: PARSED
| ANALYZED
| REWRITTEN | LOGICAL // same type
| OPTIMIZED | PHYSICAL // same type
| SHAPE
| MEMO
| ALL // default type
;
但实际无法执行形如EXPLAIN MEMO PLAN这样的语句,会报语法错误,请问是什么原因