This article is reprinted from the WeChat public account 《数据仓库宝库》, by 叶华 et al. To reprint it, please contact that public account.

Case: a very simple SQL statement clearly chose an index scan, yet it still ran slowly.

The statement is simple and queries a single table. The example code is as follows:

    SQL> set autot trace
    SQL> SELECT REQUISITION_ID PARAM1, '1' PARAM2, /* electronic label */ '1' PARAM3
           FROM dbo.LIS_REQUISITION_INFO
          WHERE PRINT_TIME >=
                TO_DATE('2019-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
            AND PRINT_TIME < [...]

    [...] TO_DATE(' 2019-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       3 - filter("TAT1_STATE" IS NULL AND LENGTH("REQUISITION_ID")=12)
       4 - access("PRINT_TIME">=TO_DATE(' 2019-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
                  "PRINT_TIME"<[...]

    SQL> select /*+ NO_MERGE LEADING(a b) */
                b.owner, b.table_name, a.column_name, b.num_rows,
                a.num_distinct Cardinality,
                ROUND(a.num_distinct*100/b.num_rows, 1) selectivity
           from dba_tab_col_statistics a, dba_tables b
          where a.owner = b.owner
            and a.table_name = b.table_name
            and a.owner = 'DBO'
            and a.table_name = 'LIS_REQUISITION_INFO'
            and a.column_name = 'PRINT_TIME';

    OWNER  TABLE_NAME            COLUMN_NAME  NUM_ROWS  CARDINALITY  SELECTIVITY
    -----  --------------------  -----------  --------  -----------  -----------
    DBO    LIS_REQUISITION_INFO  PRINT_TIME    6933600      2226944         32.1

LIS_REQUISITION_INFO holds 6933600 rows and the PRINT_TIME column has 2226944 distinct values, a selectivity of 32.1%. The statement supplies a time range on PRINT_TIME. According to the execution plan, LIS_REQUISITION_INFO is accessed by first range-scanning the I_PRINT_TIME index and then going back to the table to filter the matching records, which causes a large number of single-block reads. PRINT_TIME is highly selective and so meets the usual requirement for an index scan, but the range given in the condition is too wide, so an index on this column alone is not a good choice.

Besides PRINT_TIME, the statement also has conditions on requisition_id, TAT1_STATE and ROWNUM. Let us look at their selectivity. The commands are as follows:

    SQL> select /*+ NO_MERGE LEADING(a b) */
                b.owner, b.table_name, a.column_name, b.num_rows,
                a.num_distinct Cardinality,
                ROUND(a.num_distinct*100/b.num_rows, 1) selectivity
           from dba_tab_col_statistics a, dba_tables b
          where a.owner = b.owner
            and a.table_name = b.table_name
            and a.owner = 'DBO'
            and a.table_name = 'LIS_REQUISITION_INFO'
            and a.column_name in ('PRINT_TIME', 'REQUISITION_ID', 'TAT1_STATE');

    OWNER  TABLE_NAME            COLUMN_NAME     NUM_ROWS  CARDINALITY  SELECTIVITY
    -----  --------------------  --------------  --------  -----------  -----------
    DBO    LIS_REQUISITION_INFO  REQUISITION_ID   6933600      6933600          100
    DBO    LIS_REQUISITION_INFO  PRINT_TIME       6933600      2226944         32.1
    DBO    LIS_REQUISITION_INFO  TAT1_STATE       6933600        [...]        [...]

    SQL> select count(*) from dbo.LIS_REQUISITION_INFO where length(requisition_id)=12;

      COUNT(*)
    ----------
       6968919

    SQL> select TAT1_STATE, count(*) from dbo.LIS_REQUISITION_INFO group by TAT1_STATE;

    TAT1_STAT   COUNT(*)
    --------- ----------
    1242217153553662371401

REQUISITION_ID is the primary key and is highly selective, but almost every row satisfies length(requisition_id)=12, so that condition filters out next to nothing. The data distribution of TAT1_STATE is skewed. The condition TAT1_STATE = '' OR TAT1_STATE IS NULL falls into the first group, which accounts for about 1/3 of the total data, and the statement always supplies this fixed value for the column.

What happens if PRINT_TIME and TAT1_STATE are combined into a composite index? The commands are as follows:

    SQL> create index dbo.idx_LIS_REQUISITION_INFO_com1
           on dbo.LIS_REQUISITION_INFO(PRINT_TIME, TAT1_STATE) online;

    SQL> SELECT /*+ index(LIS_REQUISITION_INFO dbo.idx_LIS_REQUISITION_INFO_com1) */
                REQUISITION_ID PARAM1, '1' PARAM2, /* electronic label */ '1' PARAM3
           FROM dbo.LIS_REQUISITION_INFO
          WHERE PRINT_TIME >=
                TO_DATE('2019-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
            AND PRINT_TIME < [...]

    [...] TO_DATE(' 2019-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       3 - filter(LENGTH("REQUISITION_ID")=12)
       4 - access("PRINT_TIME">=TO_DATE(' 2019-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
                  "TAT1_STATE" IS NULL AND "PRINT_TIME"