
Sqoop Installation and Usage


Official download page link: http://www.apache.org/dyn/clo...

The Sqoop version used here is 1.x. Download path: http://mirrors.hust.edu.cn/ap...

```shell
# download
wget http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
# extract
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /usr/local
# copy the configuration file templates
cp /usr/local/sqoop/conf/sqoop-env-template.sh /usr/local/sqoop/conf/sqoop-env.sh
cp /usr/local/sqoop/conf/sqoop-site-template.xml /usr/local/sqoop/conf/sqoop-site.xml
ln -s /usr/local/hive/conf/hive-site.xml /usr/local/sqoop/conf/hive-site.xml
```

Configuration file overview

sqoop-env.sh:

```shell
# set the Hive home directory
export HIVE_HOME=/usr/local/hive
```

Add an environment variable in /etc/profile:

```shell
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/hive/lib/*
```

Add MySQL driver

Place the MySQL Connector driver jar into the lib directory under the Sqoop installation directory. Note: do not use the latest driver here; this post uses the 5.1 driver. Download page link: https://dev.mysql.com/downloa...

Test

```shell
/usr/local/sqoop/bin/sqoop \
list-databases \
--connect jdbc:mysql://hadoop001:3306/ \
--username root \
--password 123456
```

MySQL import into Hive

```shell
# full-table import from MySQL into Hive
bin/sqoop import \
--driver com.mysql.jdbc.Driver \
--connect jdbc:mysql://hadoop001:3306/hadoop \
--username root \
--password 123456 \
--table test \
--fields-terminated-by '\001' \
--lines-terminated-by '\n' \
--delete-target-dir \
--num-mappers 1 \
--hive-import \
--hive-database default \
--hive-table test \
--direct
```

```shell
# incremental import from MySQL into Hive
bin/sqoop import \
--driver com.mysql.jdbc.Driver \
--connect jdbc:mysql://hadoop001:3306/hadoop \
--username root \
--password 123456 \
--table test \
--check-column time \
--incremental lastmodified \
--last-value '2018-08-09 15:30:29' \
--merge-key id \
--fields-terminated-by '\001' \
--lines-terminated-by '\n' \
--num-mappers 1 \
--target-dir /user/hive/warehouse/test \
--hive-drop-import-delims
# --hive-delims-replacement '-'
```

The data import may fail with:

==main ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")==

This is a Java security policy issue. Find the JRE directory /java/jre/lib/security, which contains a java.policy file; open it and add the following permission:

```
permission javax.management.MBeanTrustPermission "register";
```

Job

```shell
# create a default job
bin/sqoop job --create test -- \
import \
--driver com.mysql.jdbc.Driver \
--connect jdbc:mysql://hadoop001:3306/hadoop \
--username root \
--password 123456 \
--table test \
--check-column time \
--incremental lastmodified \
--last-value '2018-08-09 15:30:29' \
--merge-key id \
--fields-terminated-by '\001' \
--lines-terminated-by '\n' \
--num-mappers 1 \
--target-dir /user/hive/warehouse/test
```

Creating the job returns an error:

```
ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
    at org.json.JSONObject.<init>(JSONObject.java:144)
    at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)
    at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:785)
    at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.create(HsqldbJobStorage.java:379)
    at org.apache.sqoop.tool.JobTool.createJob(JobTool.java:181)
    at org.apache.sqoop.tool.JobTool.run(JobTool.java:294)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
```

The cause of the error is a missing json jar: ==java-json-schema.jar==. Add the jar to ${SQOOP_HOME}/lib.

Execute the job

```shell
bin/sqoop job --exec test
```

During job execution you will be prompted for the database password. You can add the following configuration to sqoop-site.xml; after adding it, recreate the job and it will run without asking for a password:

```xml
<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
  <description>If true, allow saved passwords in the metastore.</description>
</property>
```

Execute the job again and check whether the data has been updated.

Check the job

```shell
bin/sqoop job --show test
```

```
Job: test
Tool: import
Options:
----------------------------
verbose = false
hcatalog.drop.and.create.table = false
# sqoop automatically records and updates last-value for you,
# which makes incremental updates relatively simple
incremental.last.value = 2018-08-10 03:51:47.0
db.connect.string = jdbc:mysql://hadoop001:3306/hadoop
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
mainframe.input.dataset.type = p
split.limit = null
hbase.create.table = false
db.require.password = false
skip.dist.cache = false
hdfs.append.dir = false
db.table = test
codegen.input.delimiters.escape = 0
db.password = 123456
accumulo.create.table = false
import.fetch.size = null
codegen.input.delimiters.enclose.required = false
db.username = root
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
sqoop.throwOnError = false
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
incremental.col = time
codegen.input.delimiters.record = 0
enable.compression = false
hive.overwrite.table = false
hive.import = false
codegen.input.delimiters.enclose = 0
accumulo.batch.size = 10240000
hive.drop.delims = false
customtool.options.jsonmap = {}
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = false
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 1
accumulo.max.latency = 5000
import.direct.split.size = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.output.delimiters.field = 9
export.new.update = UpdateOnly
incremental.mode = DateLastModified
hdfs.file.format = TextFile
sqoop.oracle.escaping.disabled = true
codegen.compile.dir = /tmp/sqoop-hadoop/compile/028365970856b88aa0aa91435...
temporary.dirRoot = _sqoop
hdfs.target.dir = /user/hive/warehouse/test
hive.fail.table.exists = false
merge.key.col = id
jdbc.driver.class = com.mysql.jdbc.Driver
db.batch = false
```

The output is ==normal==. Next, we can combine sqoop job with task-scheduling tools such as crontab to implement the relevant business workflows, as sketched below.
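As a minimal sketch of that kind of scheduling (the schedule time and log path below are assumptions for illustration, not from the original setup), a crontab entry could re-run the saved job nightly:

```shell
# crontab entry (edit with: crontab -e)
# run the saved incremental-import job every day at 02:00
# hypothetical schedule and log path -- adjust to your environment
0 2 * * * /usr/local/sqoop/bin/sqoop job --exec test >> /var/log/sqoop-test.log 2>&1
```

Because the metastore keeps incremental.last.value up to date after every run (as seen in the job output above), each scheduled execution only picks up rows modified since the previous one.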
Hive export to MySQL

```shell
bin/sqoop export \
--driver com.mysql.jdbc.Driver \
--connect "jdbc:mysql://hadoop001:3306/hadoop?useUnicode=true&characterEncoding=utf-8" \
--username root \
--password 123456 \
--table test_out \
--num-mappers 1 \
--export-dir /user/hive/warehouse/test_out \
--fields-terminated-by '\001' \
--lines-terminated-by '\n'
```
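One caveat: sqoop export does not create the target table, so test_out must already exist in MySQL with columns matching the exported data. A minimal sketch of preparing it (the column names and types here are hypothetical, for illustration only):

```shell
# create the target table before running the export
# hypothetical columns -- match them to your Hive table's layout
mysql -h hadoop001 -uroot -p123456 -e '
CREATE TABLE IF NOT EXISTS hadoop.test_out (
  id   INT PRIMARY KEY,
  name VARCHAR(255),
  time TIMESTAMP
) DEFAULT CHARSET=utf8;'
```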