为什么uuid不能作为MySQL的主键？

时间：2023-03-16 02:17:46 科技观察

在MySQL中设计表时，MySQL官方建议不要使用uuid或不连续不重复的雪花id（长且唯一，单机自增），推荐使用连续自增的主键id。官方推荐的是auto_increment。图片来自Pexels，那为什么不推荐使用uuid呢？使用uuid有什么缺点？本题我们将从以下几个部分来分析这个问题，探究其内在原因：MySQL程序实例中使用uuid和自增id的索引结构对比总结MySQL程序实例为了说明这个问题，我们先创建三张表，即：user_auto_keyuser_uuiduser_random_key他们分别代表自动增长的主键，uuid为主键，randomkey为主键，其他的我们完全保持不变。根据控制变量的方法，我们只用不同的策略生成每张表的主键，而其他字段完全一样，然后测试表的插入速度和查询速度。注意：这里的randomkey其实是指雪花算法计算出来的不连续、不重复、不规则的id：一串18位长的值。id自动生成表：用户uuid表：随机主键表：光靠理论是不行的，直接上程序，用Spring的jdbcTemplate实现查询增加测试。技术框架：SpringBoot+jdbcTemplate+junit+hutool，程序原理是连接自己的测试数据库，然后在同一个环境下写入相同数量的数据，分析insert插入时间综合其效率。为了达到最真实的效果，所有的数据都是随机生成的，比如姓名，email，地址都是随机生成的：packagecom.wyq.mysqldemo;importcn.hutool.core.collection.CollectionUtil;importcom.wyq.mysqldemo.databaseobject.UserKeyAuto；导入com.wyq.mysqldemo.databaseobject.UserKeyRandom；导入com.wyq.mysqldemo.databaseobject.UserKeyUUID；导入com.wyq.mysqldemo.diffkeytest.AutoKeyTableService；导入com.wyq.mysqldemo.diffkeytest.RandomKeyTableService；导入com.wyq.mysqldemo.diffkeytest。UUIDKeyTableService；importcom.wyq.mysqldemo.util.JdbcTemplateService；importorg.junit.jupiter.api.Test；importorg.springframework.beans.factory.annotation.Autowired；importorg.springframework.boot.test.context.SpringBootTest；importorg.springframework。util.StopWatch;importjava.util.List;@SpringBootTestclassMysqlDemoApplicationTests{@AutowiredprivateJdbcTemplateServicejdbcTemplateService;@AutowiredprivateAutoKeyTableServiceautoKeyTableService;@AutowiredprivateUUIDKeyTableServiceuuidKeyTableService;@AutowiredprivateRandomKeyTableServicerandomKeyTableService;@TestvoidtestDBTime(){StopWatchstopwatch=newStopWatch("执行sql耗时");/***auto_incrementkey任务*/finalStringinsertSql="INSERTITOuser_key_auto(user_id,user_name,sex,address,city,email,state)VALUES(?,?,?,?,?,?,?)";ListinsertData=autoKeyTableService.getInsertData();stopwatch.start("自动生成键表任务开始");longstart1=System.currentTimeMillis();if(CollectionUtil.isNotEmpty(insertData)){booleaninsertResult=jdbcTemplateService.insert(insertSql,insertData,false);System.out.println(insertResult);}longend1=System.currentTimeMillis();System.out.println("autokey消耗的时间:"+(end1-start1));stopwatch.stop();/***uudID的键*/finalStringinsertSql2="INSERTINTOuser_uuid(id,user_id,user_name,sex,address,city,email,state)VALUES(?,?,?,?,?,?,?,?)";ListinsertData2=uuidKeyTableService.getInsertData();stopwatch.start("UUID键表任务开始");longbegin=System.currentTimeMillis();如果（收集tionUtil.isNotEmpty(insertData)){booleaninsertResult=jdbcTemplateService.insert(insertSql2,insertData2,true);System.out.println(insertResult);}longover=System.currentTimeMillis();System.out.println("UUIDkey消耗时间："+(over-begin));stopwatch.stop();/***随机长值键*/finalStringinsertSql3="INSERTINTOuser_random_key(id,user_id,user_name,sex,address,city,email,state)VALUES(?,?,?,?,?,?,?,?)";ListinsertData3=randomKeyTableService.getInsertData();stopwatch.start("随机长值键表任务开始");Longstart=System.currentTimeMillis();if(CollectionUtil.isNotEmpty(insertData)){booleaninsertResult=jdbcTemplateService.insert(insertSql3,insertData3,true);System.out.println(insertResult);}Longend=System.currentTimeMillis();System.out.println("随机key任务消耗时间:"+(end-start));stopwatch.stop();Stringresult=stopwatch.prettyPrint();System.out.println(result);}程序写入结果user_key_autowriteresult:user_random_keywriteresult:user_uuidtablewriteresult:效率测试结果已经当数据量为130W时：我们再测试插入10W的数据，看看结果如何：可以看出，当数据量为100W左右时，uuid的插入效率垫底，而130W的数据按顺序添加。uuid的时间直线下降，可以发挥的综合效率排名为：auto_key>random_key>uuid。Uuid效率最低，数据量大时效率直线下降。那么为什么会这样呢？带着疑惑，我们来探究一下这个问题：使用uuid和自增id的索引结构与自增id的内部结构对比。自增主键的值是有顺序的，所以InnoDB把每条记录一条一条地存储起来。当达到页面的最大填充因子时（InnoDB默认的最大填充因子是页面大小的15/16，将预留1/16的空间用于以后的修改）。①下一条记录将写入新的一页。一旦按此顺序加载数据，主键页面将被几乎连续的记录填充，从而提高了页面的最大填充率，并且不会浪费页面。②新插入的行必须是原来最大数据行的下一行。MySQL定位寻址速度快，不会为计算新行的位置做额外的消耗。③减少页面分裂和碎片化。使用uuid的索引内部结构是因为uuid相对于自增id的顺序是不规则的。新行的值不一定大于之前主键的值，所以innodb不能一直把新行Insert在索引的末尾，而是需要为新行找到一个新的合适的位置来分配新的空间。这个过程需要很多额外的操作，数据的无序会导致数据分布分散，会导致以下问题：yetbeen被加载到缓存中，InnoDB在插入之前必须从磁盘中找到目标页并将其读取到内存中，这会造成大量的随机IO。②因为写入乱序，innodb不得不频繁的做pagesplitting操作，为新的行分配空间。分页导致大量数据被移动，一次插入至少需要修改三个页面。③由于频繁的分页，页面会变得稀疏，填充不规律，最终导致数据碎片化。将随机值（uuid和snowflakeid）加载到聚集索引（InnoDB默认的索引类型）后，有时需要做一个OPTIMEIZETABLE来重建表并优化页面的填充，这将需要一个一定的时间消耗。结论：使用InnoDB时，尽可能按照主键递增的顺序插入，尽可能使用单调递增的聚簇键值插入新行。使用自增id的缺点那么使用自增id是完全无害的吗？不会的，自增id还会有以下问题：①别人一旦爬取你的数据库，就可以通过数据库的自增id获取你的业务增长信息，很容易分析出你的业务情况。②对于高并发负载，InnoDB在按照主键插入时会造成明显的锁竞争，主键的上界会成为竞争的热点，因为所有的插入都发生在这里，并发插入会导致间隙锁竞争.③Auto_Increment锁机制会造成自增锁的抢夺，有一定的性能损失。附：Auto_increment的锁竞争问题，如果想改善，需要调优innodb_autoinc_lock_mode的配置。总结一下这篇博客，首先从最开始的提问，建表到使用jdbcTemplate测试不同id生成策略在大数据量数据插入中的性能，再到分析MySQL索引结构中的不同id机制以及优缺点，深入解释了为什么uuid和随机不重复id在数据插入时会造成性能损失，并详细解释了这个问题。在实际开发中，还是按照MySQL官方的推荐使用自增id比较好。MySQL博大精深，内部还有很多值得我们去学习优化的地方。作者：Yrion编辑：陶佳龙来源：cnblogs.com/wyq178/p/12548864.html

上一篇：在特朗普微信禁令下在中国根深蒂固的苹果将遭受巨大冲击

下一篇：AOE工程实践银行卡OCR图像处理

为什么uuid不能作为MySQL的主键？相关文章