当前位置: 首页 > Linux

linuxkeepalive检测对应用层socketapi的影响

时间:2023-04-06 23:05:40 Linux

问题大多数人都知道tcp的keepalive。假设读者知道如何触发保活。本文想讨论一下keepalive触发后对socket用户的影响。keepalive设置修改/etc/sysctl.confubuntu#vim/etc/sysctl.confubuntu#sysctl-pfs.file-max=131072net.ipv4.tcp_keepalive_time=10net.ipv4.tcp_keepalive_intvl=5net.ipv4.tcp_keepalive_probes=3验证ubuntu#sysctl-a|grepkeepalivenet.ipv4.tcp_keepalive_intvl=5net.ipv4.tcp_keepalive_probes=3net.ipv4.tcp_keepalive_time=10tcp_server.pyimportsocketimportsyssock=socket.socket(socket.AF_INET,socket.SOCK_STREAM)sock.setsockopt(socket.AD.SOL_STREAM)socket.AD.SOL_STREAM套接字)server_address=('localhost',22345)sock.bind(server_address)sock.listen(1)connection,client_address=sock.accept()whileTrue:data=connection.recv(1024)print("data",data)tcp_client.pyimportsocketimportsysimporttimesock=socket.socket(socket.AF_INET,socket.SOCK_STREAM)sock.setsockopt(socket.SOL_SOCKET,socket.SO_KEEPALIVE,1)server_address=('localhost',22345)sock.connect(server_address)time.sleep(999999999),我们可以看到,由于tcp_client开启了SO_KEEPALIVE,tcp_client主动对tcp_server发起KEEPALIVE检测。如果tcp_server启用了SO_KEEPALIVE,tcp_server会向tcp_client发送KEEPALIVE检测。如果tcp_server/tcp_client都启用了KEEPALIVE,它将双向检测。影响应用层socketapi的准备。为了模拟keepalive生效的情况,使用docker模拟网线断开的情况。准备用docker、python、vim、tcpdump镜像安装ubuntu,创建docker网络。运行它,修改心跳设置。ubuntu#sudodockerrun-it\--volume=//home/enjolras/code_repo/python/keepalive_test://home/enjolras/code_repo/python/keepalive_test\--detach=true\--name=tcp_server\--privileged=true\--network=multi-host-network\ubuntu_with_python08f89dcff3547bb15c7aed975dfa5a0821e4d0246d6d812e02fd1470f3cef6c3ubuntu#sumedockerlu/enj-ol/revohcodekeepalive_test://home/enjolras/code_repo/python/keepalive_test\--detach=true\--name=tcp_client\--privileged=true\--network=multi-host-network\ubuntu_with_python对阻塞发送/接收的影响tcp_serverimportsocketimportsyssock=socket.socket(socket.AF_INET,socket.SOCK_STREAM)sock.setsockopt(socket.SOL_SOCKET,socket.SO_REUSEADDR,1)server_address=('0.0.0.0',22345)sock.bind(server_address)sock.listen(1)connection,client_address=sock.accept()connection.setsockopt(socket.SOL_SOCKET,socket.SO_KEEPALIVE,1)data=connection.recv(1024)print("data",data)tcp_clientimportsocketimportsysimporttimesock=socket.socket(socket.AF_INET,socket.SOCK_STREAM)sock.setsockopt(socket.SOL_SOCKET,socket.SO_KEEPALIVE,1)server_address=('tcp_server',22345)sock.connect(server_address)time.sleep(999999999)发送/接收将结束exception/errorcode方法可以知道heartbeat检测到的链接断开了。可以看到tcp_server/tcp_client互相发送心跳。root@0b3f1ee81446:/#tcpdump-i任何端口22345tcpdump:抑制详细输出,使用-v或-vv对任何链接类型LINUX_SLL(Linux熟)进行完整协议解码侦听,捕获大小262144字节12:29:34.491239IPtcp_client.multi-host-network.57130>0b3f1ee81446.22345:标志[S],seq2347845399,win28200,选项[mss1410,sackOK,TSval951128354ecr0,nop,wscale7],长度012:29:34.491279IP0b3f1ee81446.22345>.tcp_client-network.57130:Flags[S.],seq1169988006,ack2347845400,win27960,options[mss1410,sackOK,TSval2298965862ecr951128354,nop,wscale7],length012:29:34.491299IP.multi-tcp_clienthost-network.57130>0b3f1ee81446.22345:Flags[.],ack1,win221,options[nop,nop,TSval951128354ecr2298965862],长度012:29:44.666952IP0b3f1ee81446.22345>tcp_client-network.57130:标志[.],ack1,win219,选项[nop,nop,TSval2298976038ecr951128354],长度012:29:44.666969IPtcp_client.multi-host-network.57130>0b3f1ee81446.22345:标志[.],ack1,win221,选项[nop,nop,TSval951138530ecr2298965862],长度012:29:44.666978IP0b3f1ee81446.22345>tcp_client.multi-host-network.57130:标志[.],ack1,win219,选项[nop,nop,TSval2298976038ecr951128354],length012:29:44.666987IPtcp_client.multi-host-network.57130>0b3f1ee81446.22345:Flags[.],ack1,win221,options[nop,nop,TSval951138530ecr2298976038],长度012:29:54.907019IP0b3f1ee81446.22345>tcp_client.multi-host-network.57130:Flags[.],ack1,win219,选项[nop,nop,TSval2298986278ecr951138530],长度012:29:54.907054IPtcp_client.multi-host-network.57130>0b3f1ee81446.22345:Flags[.],ack1,win221,options[nop,nop,TSval951148770ecr2298976038],长度012:29:54.907059IPtcp_client.multi-host-network.57130>0b3f1ee81446.22345:Flags[.],ack1,win221,选项[nop,nop,TSval951148770ecr2298976038],长度012:29:54.907062IP0b3f1ee81446.223pclient_2234.multi-host-network.57130:Flags[.],ack1,win219,options[nop,nop,TSval2298986278ecr951138530],length0将tcp_server/tcp_client断网.ubuntu#dockernetworkdisconnectmulti-host-网络tcp_clien可以看到tcp_server在连续三个检测包没有回复后,给tcp_client发送了一个RST。12:31:47.547010IPtcp_client.multi-host-network.57130>0b3f1ee81446.22345:Flags[.],ack1,win221,options[nop,nop,TSval951261408ecr2299088676],长度012:31:47.547019IP0b3f1ee81446.22345>tcp_client.multi-host-network.57130:Flags[.],acktion19,winops2nop,nop,TSval2299098916ecr951251168],length012:31:47.547061IPtcp_client.multi-host-network.57130>0b3f1ee81446.22345:Flags[.],ack1,win221,options,[nop,nopval951261408ecr2299098916],长度012:31:57.787226IP0b3f1ee81446.22345>tcp_network.multi-host.multi-host57130:标志[.],ack1,win219,选项[nop,nop,TSval2299109156ecr951261408],长度012:32:02.906612IP0b3f1ee81446.22345>tcp_client.multi-host-network.57130:标志[。],ack1,win219,options[nop,nop,TSval2299114276ecr951261408]1,length:3:08.026829IP0b3f1ee81446.22345>tcp_client.multi-host-network.57130:F滞后[.],ack1,win219,选项[nop,nop,TSval2299119396ecr951261408],长度012:32:13.146776IP0b3f1ee81446.22345>tcp_client.multi-host-network.57130:标志[R,seq1,ack1,win219,options[nop,nop,TSval2299124516ecr951261408],length0可以看到,心跳机制检测到socket状态异常后,会通过异常/错误码通知调用方等.3f1ee81446:/home/enjolras/code_repo/python/keepalive_test#pythontcp_servTraceback(最后一次调用):文件“tcp_server.py”,第11行,在data=connection.recv(1024)socket中。error:[Errno110]连接超时对select的影响tcp_serverimportsocketimportsysimportselectsock=socket.socket(socket.AF_INET,socket.SOCK_STREAM)sock.setsockopt(socket.SOL_SOCKET,socket.SO_REUSEADDR,1)server_address=('0,0.0.0'22345)sock.bind(server_address)sock.listen(1)connection,client_address=sock.accept()connection.setsockopt(socket.SOL_SOCKET,socket.SO_KEEPALIVE,1)可读,可写,可选=选择。选择([连接],[],[])print("readable",readable,writable,exceptional)data=connection.recv(1024)print("data",data)返回套接字选择的可读事件。3f1ee81446:/home/enjolras/code_repo/python/keepalive_test#pythontcp_serv('readable',[],[],[])Traceback(最近调用最后):文件“tcp_server.py",line14,indata=connection.recv(1024)socket.error:[Errno110]Connectiontimedout对epoll的影响没测试,应该和select一致。结论heartbeat检测到tcp链路断开后,会在可读事件中通知应用层。如果没有tcp心跳,就没有应用层心跳,应用层无法知道链路的真实状态。为什么使用应用层heartbeatkeepalive发生在内核中,keepalive检测不到应用本身的状态keepalive可能无法通过网络中继设备,比如负载均衡器。行为由系统配置,检测时间长,不如应用层心跳灵活。当对端直接断电时,不会触发keepalive,因为还有数据包要发送,需要很长时间才能被应用层检测到,比如发送缓冲区已满。