多列索引我们经常听到有人说“给WHERE条件下的所有列都加索引”。其实这个建议是非常错误的。在大多数情况下,在多个列上建立单独的索引不会提高MySQL查询性能。MySQL在5.0之后引入了一种叫做“索引合并”的策略。在一定程度上,可以使用表上的多个单列索引来定位指定的行。但是,当服务端对多个索引进行联合操作时,通常会占用大量的CPU和内存资源来缓存、排序和合并算法,尤其是当某些索引选择性不高,需要合并和扫描大量数据时.什么时候。这时候,我们就需要一个多列索引。案例创建一个测试数据库和数据表:CREATEDATABASEIFNOTEXISTSdb_testdefaultcharsetutf8COLLATEutf8_general_ci;usedb_test;CREATETABLEpayment(idINTUNSIGNEDNOTNULLAUTO_INCREMENT,staff_idINTUNSIGNEDNOTNULL,customer_idINTUNSIGNEDNOTNULL,PRIMARYKEY(id))ENGINE=InnoDBDEFAULTCHARSET=utf8;插入1000w行随机数据(利用存储过程):DROPPROCEDUREIFEXISTSadd_payment;DELIMITER//createPROCEDUREadd_payment(innumINT)BEGINDECLARErowidINTDEFAULT0;SET@exesql='INSERTIINTOpayment(staff_id,customer_id)values(?,?)';WHILErowidexplainselectcount(*)frompaymentwherestaff_id=2205ANDcustomer_id=93112;+----+------------+--------+------------+------------------------------+--------------------------+--------+------+--------+--------------------------------------------------------------------+|id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|+----+-------------+--------+------------+--------------------------+----------------------------+--------+------+--------+----------------------------------------------------------------------+|1|SIMPLE|payment|index_merge|idx_customer_id,idx_staff_id|idx_staff_id,idx_customer_id|4,4|NULL|11711|Usingintersect(idx_staff_id,idx_customer_id);Usingwhere;Usingindex|+----+------------+----------+------------+--------------------------------+----------------------------+--------+------+-------+----------------------------------------------------------------------+1rowinset(0.00sec)可以看到type是index_merge,Extra中提示Usingintersect(idx_staff_id,idx_customer_id);这是索引合并,使用两个索引,然后合并两个结果(取交集或并集或两者)查询结果:mysql>selectcount(*)frompaymentwherestaff_id=2205ANDcustomer_id=93112;+------------+|count(*)|+----------+|178770|+----------+1rowinset(0.12sec)然后删除上面的索引,增加多列索引:ALTERTABLEpaymentDROPINDEXidx_customer_id;ALTERTABLEpaymentDROPINDEXidx_staff_id;ALTERTABLE`payment`ADDINDEXidx_customer_id_staff_id(`customer_id`,`staff_id`);注意多列索引很在意索引列的顺序(因为customer_id比较有选择性,所以放在前面)查询:mysql>selectcount(*)frompaymentwherestaff_id=2205ANDcustomer_id=93112;+----------+|计数(*)|+---------+|178770|+------------+1rowinset(0.05sec)发现多列索引加快了查询速度(这里数据量还是小的,大了更明显)注多列索引的列顺序取决于重要的是,如何选择索引的列顺序有一个经验法则:将最具选择性的列放在索引的前面(但不是绝对的)经验法则考虑全局基数和选择性,而不是特定查询:mysql>selectcount(DISTINCTstaff_id)/count(*)ASstaff_id_selectivity,count(DISTINCTcustomer_id)/count(*)AScustomer_id_selectivity,count(*)frompayment\G;***************************1.行*************************staff_id_selectivity:0.0005customer_id_selectivity:0.0500count(*):100000001rowinset(6.29sec)customer_id的选择性较高,所以作为索引列的首位。多列索引只能匹配最左边的前缀,也就是说:select*frompaymentwherestaff_id=2205ANDcustomer_id=93112;selectcount(*)frompaymentwherecustomer_id=93112;可以使用索引,但是select*frompaymentwherestaff_id=2205;不能使用索引。