## Prepare three servers

hadoop001, hadoop002, hadoop003

## Set up the dependencies

### Java

Version 1.8.0 is used here. Installation guides are easy to find online, so the steps are not covered in detail.

### Passwordless SSH login

==In particular, the NameNodes must be able to log in to each other without a password (passwordless communication is needed when electing the active NameNode).==

## Prepare ZooKeeper

Any recent release will do; this article uses 3.4.13.
Official configuration docs: http://zookeeper.apache.org/d...
Download links: https://www.apache.org/dyn/cl...

```shell
# Download
wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz
# Extract
tar -zxvf zookeeper-3.4.13.tar.gz -C /usr/local
# Rename so the path matches the commands and configuration below
mv /usr/local/zookeeper-3.4.13 /usr/local/zookeeper
# Copy the template file zoo_sample.cfg to zoo.cfg
cp /usr/local/zookeeper/conf/zoo_sample.cfg /usr/local/zookeeper/conf/zoo.cfg
```

### Configuration file overview (zoo.cfg)

```properties
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/usr/local/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
```

==Don't forget to create the myid file. Its location follows the dataDir configured above, and its content is the x of the matching server.x entry.==
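As an illustration of that note, here is a minimal sketch for creating the myid files under the dataDir configured above (run each command on the host it names):

```shell
# On hadoop001 (matches server.1)
mkdir -p /usr/local/zookeeper/data && echo 1 > /usr/local/zookeeper/data/myid
# On hadoop002 (matches server.2)
mkdir -p /usr/local/zookeeper/data && echo 2 > /usr/local/zookeeper/data/myid
# On hadoop003 (matches server.3)
mkdir -p /usr/local/zookeeper/data && echo 3 > /usr/local/zookeeper/data/myid
```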
## Prepare Hadoop

This article uses version 2.9.1 as the example.
Official documentation home: http://hadoop.apache.org/docs...
Official HA fully distributed configuration docs: http://hadoop.apache.org/docs...

```shell
# Download (the binary distribution, not the -src source tarball)
wget http://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz
# Extract
tar -zxvf hadoop-2.9.1.tar.gz -C /usr/local
# Rename so the path matches the configuration below
mv /usr/local/hadoop-2.9.1 /usr/local/hadoop
```

### Configuration file overview

hadoop-env.sh

```shell
# Java installation location
export JAVA_HOME=/usr/local/java
```

slaves (the hosts that run DataNodes)

```
hadoop002
hadoop003
```

hdfs-site.xml

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>nameservices001</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.nameservices001</name>
    <value>namenode001,namenode002</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.nameservices001.namenode001</name>
    <value>hadoop001:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.nameservices001.namenode002</name>
    <value>hadoop002:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.nameservices001.namenode001</name>
    <value>hadoop001:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.nameservices001.namenode002</name>
    <value>hadoop002:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/nameservices001</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.nameservices001</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/local/hadoop/jn/data</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
```

core-site.xml (note: with HA, `fs.defaultFS` must point at the nameservice rather than a single NameNode host; the deprecated `fs.default.name` should not be set alongside it)

```xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://nameservices001</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
  </property>
</configuration>
```

yarn-site.xml

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop001</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>
</configuration>
```

mapred-site.xml

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

## Formatting before the first start

1. Install psmisc on every node: `sudo yum install psmisc`. It is needed for active NameNode failover (the `sshfence` fencing method relies on `fuser`, which psmisc provides).
2. Start ZooKeeper on every node: `zkServer.sh start`
3. Start every JournalNode: `hadoop-daemon.sh start journalnode`
4. Format HDFS on any one NameNode: `hdfs namenode -format`
5. Copy the freshly formatted metadata to the other NameNode: start the NameNode you just formatted, then run `hdfs namenode -bootstrapStandby` on the NameNode that was not formatted to sync the metadata over.
6. Start the second NameNode.
7. Initialize ZKFC on one of the NameNodes: `hdfs zkfc -formatZK`
8. Stop the daemons started above: `stop-dfs.sh`
9. Full start: `start-dfs.sh`

The whole sequence is consolidated in the sketch after this list.
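A minimal sketch of the first-start sequence, assuming hadoop001 is the NameNode chosen for formatting and hadoop002 is the standby (per this article's config), and that the ZooKeeper and Hadoop scripts are on PATH:

```shell
# On all three nodes: ZooKeeper first, then the JournalNodes
zkServer.sh start
hadoop-daemon.sh start journalnode

# On hadoop001: format HDFS, then bring this NameNode up
hdfs namenode -format
hadoop-daemon.sh start namenode

# On hadoop002: pull the formatted metadata, then start the second NameNode
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode

# On hadoop001: initialize the ZKFC znode in ZooKeeper
hdfs zkfc -formatZK

# On hadoop001: restart HDFS as a whole
stop-dfs.sh
start-dfs.sh
```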

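Once `start-dfs.sh` has run, the HA state can be checked with standard tools (the NameNode IDs below come from the hdfs-site.xml above; `jps` ships with the JDK and `hdfs haadmin` with Hadoop):

```shell
# List the Java daemons on each node; expect NameNode and DFSZKFailoverController
# on hadoop001/hadoop002, DataNode on hadoop002/hadoop003, JournalNode everywhere
jps
# Ask which NameNode is active and which is standby
hdfs haadmin -getServiceState namenode001
hdfs haadmin -getServiceState namenode002
```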