Redis Cluster High Availability in Production: A Hands-On Walkthrough

Date: 2023-04-06 03:45:14  Category: Linux

Deployment overview

1. Sentinel monitors the master and slave services in the Redis cluster, sends notifications, and performs automatic failover.
2. The Redis cluster serves client requests.

How Sentinel works

Sentinel is a distributed system: several Sentinel processes can run within one architecture. These processes use gossip protocols to exchange information about whether the Redis master is offline, and agreement protocols to decide whether to perform an automatic failover and which slave to promote to master.

Gossip protocol: each Sentinel uses the PING command to check whether the servers it monitors are healthy. Subjectively Down (SDOWN) is the offline judgment that a single Sentinel instance makes about a server. Objectively Down (ODOWN) means that multiple Sentinel instances (at least the configured quorum) have made the SDOWN judgment about the same server.

Agreement protocol: this is essentially an election. Following a fixed set of rules, the Sentinel cluster picks one server from the Redis group to become the new master, makes the remaining servers slaves of that new master, and rewrites its own configuration files.

Server deployment plan

The lab environment uses two servers to simulate a cluster. Both run CentOS 6.6 x86_64.

    Host                         Service          Address
    Master server 10.0.0.3/24    Redis-Master     10.0.0.3:6379
                                 Redis-Slave1     10.0.0.3:63791
                                 Redis-Slave2     10.0.0.3:63792
                                 Sentinel         10.0.0.3:26379
                                 Sentinel-s1      10.0.0.3:26378
    Slave server 10.0.0.4/24     Redis-Slave3     10.0.0.4:63793
                                 Redis-Slave4     10.0.0.4:63794
                                 Sentinel-s2      10.0.0.4:26379
                                 Sentinel-s3      10.0.0.4:26378

Installation and deployment

Install Redis on the master server:

    mkdir -p /usr/local/redis/data
    cd /usr/local/src
    wget http://download.redis.io/releases/redis-2.8.9.tar.gz
    tar zxf redis-2.8.9.tar.gz
    cd redis-2.8.9
    make && make install

Copy the configuration files:

    cp redis.conf /usr/local/bin/
    cd /usr/local/bin
    cp redis.conf redis-slave1
    cp redis.conf redis-slave2

Edit the configuration files:

    [root@master bin]# vi redis.conf
    # run in the background
    daemonize yes
    pidfile /var/run/redis.pid
    bind 10.0.0.3
    dbfilename dump.rdb
    dir /usr/local/redis/data
    port 6379

    [root@master bin]# vi redis-slave1
    daemonize yes
    pidfile /var/run/redis-slave1.pid
    port 63791
    bind 10.0.0.3
    dbfilename dump-slave1.rdb
    dir /usr/local/redis/data
    slaveof 10.0.0.3 6379
    slave-read-only yes

    [root@master bin]# vi redis-slave2
    daemonize yes
    pidfile /var/run/redis-slave2.pid
    port 63792
    bind 10.0.0.3
    dbfilename dump-slave2.rdb
    dir /usr/local/redis/data
    slaveof 10.0.0.3 6379

Configure the redis-sentinel service:

    mkdir -p /var/log/redis
    cp /usr/local/src/redis-2.8.9/src/redis-sentinel /usr/bin/
    cp /usr/local/src/redis-2.8.9/sentinel.conf /usr/local/bin/
    cd /usr/local/bin
    cp sentinel.conf sentinel-s1.conf

Edit the Sentinel configuration files:

    [root@master bin]# egrep -v "^#|^$" sentinel.conf
    port 26379
    daemonize yes
    logfile /var/log/redis/sentinel.log
    sentinel monitor mymaster 10.0.0.3 6379 2
    sentinel down-after-milliseconds mymaster 30000
    sentinel parallel-syncs mymaster 1
    sentinel failover-timeout mymaster 180000

    [root@master bin]# egrep -v "^#|^$" sentinel-s1.conf
    port 26378
    daemonize yes
    logfile /var/log/redis/sentinel-s1.log
    sentinel monitor mymaster 10.0.0.3 6379 2
    sentinel down-after-milliseconds mymaster 30000
    sentinel parallel-syncs mymaster 1
    sentinel failover-timeout mymaster 180000

Repeat the same steps on the slave server, creating redis-slave3, redis-slave4, sentinel-s2.conf and sentinel-s3.conf.

Starting the services

Start the Redis service:

    [root@master bin]# redis-server redis.conf
    [root@master bin]# redis-server redis-slave1
    [root@master bin]# redis-server redis-slave2
    [root@master bin]# ps -ef | grep redis
    root  2579     1  0 23:55 ?      00:00:00 redis-server 10.0.0.3:6379
    root  2585     1  0 23:55 ?      00:00:00 redis-server 10.0.0.3:63792
    root  2590     1  0 23:55 ?      00:00:00 redis-server 10.0.0.3:63791
    root  2597  2479  0 23:56 pts/0  00:00:00 grep --color=auto redis

    [root@slave bin]# redis-server redis-slave3
    [root@slave bin]# redis-server redis-slave4
    [root@slave bin]# ps -ef | grep redis
    root  2576     1  0 23:56 ?      00:00:00 redis-server 10.0.0.4:63793
    root  2580     1  0 23:56 ?      00:00:00 redis-server 10.0.0.4:63794
    root  2584  2502  0 23:56        00:00:00 grep --color=auto redis

Start the redis-sentinel service:

    [root@master bin]# redis-sentinel sentinel.conf
    [root@master bin]# redis-sentinel sentinel-s1.conf
    [root@master bin]# ps -ef | grep redis-sentinel
    root  2638     1  0 01:05 ?      00:00:04 redis-sentinel *:26379
    root  2646     1  0 01:13 ?      00:00:00 redis-sentinel *:26378
    root  2650  2479  0 01:13 pts/0  00:00:00 grep --color=auto redis

    [root@slave bin]# redis-sentinel sentinel-s2.conf
    [root@slave bin]# redis-sentinel sentinel-s3.conf
    [root@slave bin]# ps -ef | grep redis-sentinel
    root  2644     1  1 01:14 ?      00:00:00 redis-sentinel *:26378
    root  2649     1  0 01:14 ?      00:00:00 redis-sentinel *:26379
    root  2653  2502  0 01:15        00:00:00 grep --color=auto redis-sentinel

Check the log to follow the startup:

    [root@master bin]# tail -f /var/log/redis/sentinel.log
    ... master mymaster 10.0.0.3 6379 quorum 2
    [2664] 12 May 01:20:11.123 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.3:26378 or fb1fbe73b51a0a6e6e71a8ceae57d34ef773d086e3
    [2664] 12 May 01:20:11.123 * +sentinel sentinel 10.0.0.3:26378 10.0.0.3 26378 @ mymaster 10.0.0.3 6379
    ... #duplicate of 10.0.0.4:26379 or 3d43ddea4d4ba8de7dd30e2d332723508f6d4c19
    [2664] 12 May 01:20:21.410 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.4:26378 or 6d134d9a3e53c0cb70de842281de8aaf17a84c00

**The other Sentinels can be seen joining the cluster.**

Check whether the configuration file has changed:

    [root@master bin]# egrep -v "^#|^$" sentinel-s1.conf
    port 26378
    daemonize yes
    logfile "/var/log/redis/sentinel-s1.log"
    sentinel monitor mymaster 10.0.0.3 6379 2
    sentinel config-epoch mymaster 0
    sentinel leader-epoch mymaster 0
    sentinel known-slave mymaster 10.0.0.3 63792
    dir "/usr/local/bin"
    sentinel known-slave mymaster 10.0.0.4 63793
    sentinel known-slave mymaster 10.0.0.4 63794
    sentinel known-slave mymaster 10.0.0.3 63791
    sentinel known-sentinel mymaster 10.0.0.3 26379 c327be464ef36e670566a0d76c9dc85bac7f33b1
    sentinel known-sentinel mymaster 10.0.0.4 26379 3d43ddea4d4ba8de7dd30e2d332723508f6d4c19
    sentinel known-sentinel mymaster 10.0.0.4 26378 6d134d9a3e53c0cb70de842281de8aaf17a84c00
    sentinel current-epoch 0

Sentinel has rewritten the file itself, adding the slaves and the other Sentinels it discovered.

Observing failover through the log

Simulate a master failure:

    [root@master bin]# redis-cli -h 10.0.0.3 -p 6379 shutdown
    [root@master bin]# ps -ef | grep redis
    root  2585     1  0 May11 ?      00:00:07 redis-server 10.0.0.3:63792
    root  2590     1  0 May11 ?      00:00:07 redis-server 10.0.0.3:63791
    root  2660     1  0 01:20 ?      00:00:02 redis-sentinel *:26378
    root  2664     1  0 01:20 ?      00:00:02 redis-sentinel *:26379
    root  2676  2479  0 01:30 pts/0  00:00:00 grep --color=auto redis

The master process (10.0.0.3:6379) is no longer running, so the master service is down.

Clear the old log and watch the failover:

    [root@slave bin]# > /var/log/redis/sentinel-s3.log
    [root@slave bin]# tail -f /var/log/redis/sentinel-s3.log
    [2669] 12 May 01:30:55.203 # +sdown master mymaster 10.0.0.3 6379
    [2669] 12 May 01:30:55.276 # +new-epoch 1
    [2669] 12 May 01:30:55.280 # +vote-for-leader c327be464ef36e670566a0d76c9dc85bac7f33b1 1
    [2669] 12 May 01:30:56.329 # +odown master mymaster 10.0.0.3 6379 #quorum 4/2
    ... +switch-master mymaster 10.0.0.3 6379 10.0.0.3 63792
    [2669] 12 May 01:30:57.548 * +slave slave 10.0.0.4:63794 10.0.0.4 63794 @ mymaster 10.0.0.3 63792
    ... * +slave slave 10.0.0.4:63793 10.0.0.4 63793 @ mymaster 10.0.0.3 63792
    [2669] 12 May 01:30:57.556 * +slave slave 10.0.0.3:63791 10.0.0.3 63791 @ mymaster 10.0.0.3 63792
    [2669] 12 May 01:30:57.561 * +slave slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792
    [2669] 12 May 01:31:27.620 # +sdown slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792

**The log shows the master being marked subjectively down (+sdown) and then objectively down (+odown). Sentinel elects 10.0.0.3:63792 as the new master, the other slaves are automatically reconfigured with slaveof, and the failover succeeds.**

Restore the original master server:

    [root@master bin]# redis-server redis.conf
    [root@master bin]# ps -ef | grep redis
    root  2585     1  0 May11 ?      00:00:08 redis-server 10.0.0.3:63792
    root  2590     1  0 May11 ?      00:00:08 redis-server 10.0.0.3:63791
    root  2660     1  0 01:20 ?      00:00:05 redis-sentinel *:26378
    root  2664     1  0 01:20 ?      00:00:05 redis-sentinel *:26379
    root  2683     1  0 01:36 ?      00:00:00 redis-server 10.0.0.3:6379
    root  2689  2479  0 01:36 pts/0  00:00:00 grep --color=auto redis

    [root@slave bin]# tail -f /var/log/redis/sentinel-s3.log
    [2673] 12 May 01:36:21.925 # -sdown slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792

**Once the original master recovers, it rejoins the cluster as a slave and does not take the master role back.**

Testing read/write separation

    [root@master bin]# redis-cli -h 10.0.0.3 -p 63792
    10.0.0.3:63792> get key
    "test"
    10.0.0.3:63792> set key file
    OK
    10.0.0.3:63792> get key
    "file"

    [root@master bin]# redis-cli -h 10.0.0.3 -p 6379
    10.0.0.3:6379> get key
    "file"
    10.0.0.3:6379> set key file1
    (error) READONLY You can't write against a read only slave.

This confirms that the new master was promoted successfully. The original master, after recovering from its failure, is now a slave and is read-only, so the arrangement of writing to the master and reading from the slaves is preserved.

That completes the deployment: cluster monitoring, automatic failover, and read/write separation are all in place.
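
To confirm a failover from a script rather than by reading the log by eye, one option is to pull the most recent +switch-master event out of a Sentinel log. The sketch below assumes the event has the form `+switch-master <name> <old-ip> <old-port> <new-ip> <new-port>` at the end of the log line. The function name `latest_master` is a hypothetical helper introduced here for illustration and is not part of Redis.

```shell
#!/bin/sh
# latest_master LOGFILE
# Hypothetical helper: prints "ip:port" of the master named in the most
# recent +switch-master event in LOGFILE. Prints nothing and returns 1
# if the log contains no such event.
latest_master() {
    awk '
        # The last two fields of a +switch-master line are the new ip and port.
        /\+switch-master/ { addr = $(NF-1) ":" $NF }
        END { if (addr == "") exit 1; print addr }
    ' "$1"
}
```

For a log containing the event `+switch-master mymaster 10.0.0.3 6379 10.0.0.3 63792`, calling `latest_master /var/log/redis/sentinel-s3.log` prints `10.0.0.3:63792`. A running Sentinel can also be asked directly with `redis-cli -h 10.0.0.3 -p 26379 sentinel get-master-addr-by-name mymaster`, which avoids depending on log contents.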