最近新项目写了很多insertBatch的代码。一直有人说批量插入比循环插入效率高很多。所以这篇文章是为了实验。是真的吗?测试环境:SpringBoot2.5Mysql8JDK8Docker一、插入多条数据,可选方案:foreach循环插入拼接sql,一次性执行使用批处理功能插入建立测试环境`sqlfile:dropdatabaseIFEXISTStest;CREATEDATABASEtest;usetest;DROPTABLEIFEXISTS`user`;CREATETABLE`user`(`id`int(11)NOTNULL,`name`varchar(255)DEFAULT"",`age`int(11)DEFAULT0,PRIMARYKEY(`id`))ENGINE=InnoDBDEFAULTCHARSET=utf8;应用配置文件:server:port:8081spring:#数据库连接配置datasource:driver-class-name:com.mysql.cj.jdbc.Driverurl:jdbc:mysql://127.0.0.1:3306/test?characterEncoding=utf-8&useSSL=false&allowPublicKeyRetrieval=true&&serverTimezone=UTC&setUnicode=true&characterEncoding=utf8&&nullCatalogMeansCurrent=true&&autoReconnect=true&&allowMultiQueries=trueusername:rootpassword:123456#mybatis的关联配置mybatis:#mapper配置type/location-xml-asepath:Fileclassmapper包裹:com.aphysia.spingbootdemo.model#开驼峰命名配置:map-underscore-to-camel-case:truelogging:level:root:错误启动文件,配置Mapper文件扫描路径:importorg.mybatis.spring.annotation.MapperScan;导入org.springframework.boot.SpringApplication;导入org.springframework.boot.autoconfigure.EnableAutoConfiguration;导入org.springframework.boot.autoconfigure.SpringBootApplication;@SpringBootApplication@MapperScan("com.aphysia.springdemo.mapper")公共类SpringdemoApplication{publicstaticvoidmain(String[]args){SpringApplication.run(SpringdemoApplication.class,args);}}Mapper文件准备了几种方法,包括插入单个对象,删除所有对象,拼接插入多个对象:importcom.aphysia.springdemo.model.User;importorg.apache.ibatis.annotations.Param;importjava.util.List;publicinterfaceUserMapper{intinsertUser(Useruser);intdeleteAllUsers();intinsertBatch(@Param("users")Listusers);}Mapper.xml文件如下:插入用户(id,age)values(#{id},#{age})从id>0的用户中删除;插入用户(id,age)VALUES(#{model.id},#{model.age})在测试的时候,我们把每一个操作的数据全部删除,保证测试的客观性,不受之前数据的影响不同测试1.foreach插入先获取列表,然后对每条数据执行一次数据库操作,插入数据:@SpringBootTest@MapperScan("com.aphysia.springdemo.mapper")classSpringdemoApplicationTests@ResourceUserMapper用户映射器;静态整数=100000;静态intid=1;@TestvoidinsertForEachTest(){Listusers=getRandomUsers();长启动=System.currentTimeMillis();for(inti=0;iusers=getRandomUsers();长启动=System.currentTimeMillis();userMapper.insertBatch(用户);长端=System.currentTimeMillis();System.out.println("时间:"+(end-start));}3。使用Batchbatchinsert设置MyBatissession的executor类型为Batch,使用sqlSessionFactory设置执行方式为batch,设置自动提交为false,全部插入后,再次提交:@TestpublicvoidinsertBatch(){SqlSessionsqlSession=sqlSessionFactory.openSession(ExecutorType.BATCH,false);UserMapper映射器=sqlSession.getMapper(UserMapper.class);Listusers=getRandomUsers();长启动=System.currentTimeMillis();for(inti=0;i用户=getRandomUsers();longstart=System.currentTimeMillis();for(inti=0;ishowVARIABLESlike'%max_allowed_packet%';+---------------------------+------------+|Variable_name|Value|+---------------------------+------------+|max_allowed_packet|67108864||mysqlx_max_allowed_packet|67108864||slave_max_allowed_packet|----------------+------------+3rowsinset(0.12sec)This67108864ismorethan600M,toobig,Nowondertherewillbeanerror,solet’smodifyitandretestaftermodification:First,enterthecontainerwhenstartingmysql,ordirectlyclicktheCliiconontheDockerdesktopversiontoenter:dockerexec-itmysqlbashEnterthe/etc/mysqldirectoryandmodifythemy.cnffile:cd/etc/mysqlfirstfollowvim,otherwisethefilecannotbeedited:apt-getupdateapt-getinstallvimmodifymy.cnfvimmy.cnfaddmax_allowed_packet=inthelastline20M(pressitoedit,pressescafterediting,enter:wqtoexit)[mysqld]pid-file=/var/run/mysqld/mysqld.pidsocket=/var/run/mysqld/mysqld.sockdatadir=/var/lib/mysqlsecure-file-priv=NULL#建议禁用符号链接以防止各种安全风险symbolic-links=0#自定义配置应该放在这里!includedir/etc/mysql/conf.d/max_allowed_pa??cket=2M退出容器#exit查看mysql容器iddockerps-arestartmysqldockerrestartc178e8998e68重启成功后查看max_allowed_pa??cket,发现修改成功:mysql>showVARIABLESlike'%max_allowed_pa??cket%';+------------------------+------------+|变量名|值|+---------------------------+------------+|最大允许数据包|2097152||mysqlx_max_allowed_pa??cket|67108864||slave_max_allowed_pa??cket|1073741824|+------------------------+------------+我们再次执行拼接sql,发现在100w的时候,sql达到了3.6M左右,超过了我们设定的2M,演示成功抛出错误:org.springframework.dao.TransientDataAccessResourceException:###Cause:com.mysql.cj.jdbc.exceptions。PacketTooBigException:查询数据包太大(36,788,583>2,097,152)。你可以改变通过设置“max_allowed_pa??cket”变量在服务器上设置此值。查询数据包太大(36,788,583>2,097,152)。您可以通过设置“max_allowed_pa??cket”变量在服务器上更改此值。嵌套异常是com.mysql.cj.jdbc.exceptions.PacketTooBigException:Packetforqueryistoolarge(36,788,583>2,097,152)。您可以通过设置“max_allowed_pa??cket”变量在服务器上更改此值。为什么批处理这么慢?但是,如果仔细观察,就会发现上述方法在批处理方面并没有体现出任何优势。跟for循环没什么区别?这是正确的吗?这绝对是错误的。从官方文档可以看出,它会分批更新,不会每次都创建preparedstatements。理论更快。然后我发现我最重要的一个问题:数据库连接URL地址没有rewriteBatchedStatements=true,如果我们不写,MySQLJDBC驱动默认会忽略executeBatch()语句。我们期望一组批量执行的sql语句被拆解,但是在执行过程中是一条一条的发送到MySQL数据库。其实就是单次插入,直接导致性能下降。我说性能如何类似于循环插入数据。只有当rewriteBatchedStatements参数设置为true时,数据库驱动才会帮我们批量执行SQL。Correctdatabaseconnection:jdbc:mysql://127.0.0.1:3306/test?characterEncoding=utf-8&useSSL=false&allowPublicKeyRetrieval=true&&serverTimezone=UTC&setUnicode=true&characterEncoding=utf8&&nullCatalogMeansCurrent=true&&autoReconnect=true&&allowMultiQueries=true&&retestafterbatchretesting,wefoundtheproblem测试,最终的结果如下:插入方式1010010001w10w100w1000wfor循环插入3871150790770026635984太久了...太久了...拼接sql插入308320392838315624948(很可能超过sql长度限制)OutOfMemoryError:堆内存溢出批处理(重点)33332336263616388978OutOfMemoryError:堆Memoryoverflowbatchprocessing+batchsubmission359313394630290718631OutOfMemoryError:HeapmemoryoverflowJudgingfromtheaboveresults,itistruethatbatchprocessingismuchfaster.Whentheorderofmagnitudeistoolarge,itwillactuallyexceedmemoryoverflow.BatchprocessingplusbatchsubmissionItdidn’tgetfaster,it’ssimilartobatchprocessing,butsloweddown,andsubmittedtoomanytimes.Theschemeofsplicingsqlisactuallynotmuchdifferentfrombatchprocessingwhenthenumberisrelativelysmall.Theworstsolutionistoinsertdatainaforloop.Reallytimeconsuming.Whenthereare100entries,ittakes1s,sothissolutioncannotbeselected.WhenIfoundoutthatbatchprocessingwasslowatfirst,Ireallydoubtedmyself.Later,Ifoundoutthattherewasaparameter,andtherewasafeelingofpushingthroughtheclouds.Themoreyouknow,themoreyoudon’tknow.【Author'sbriefintroduction】:QinHuai,authorof公众号【QinHuaiGroceryStore】,theroadtotechnologyisnottemporary,themountainsarehighandtheriversarelong,evenifitisslow,itwillnotstop.JianzhiOfferAllQuestionsandAnswersPDFWhatdidIwritein2020?OpenSourceProgrammingNotes