
NFS client process stuck in the D state, with a digression on Linux locks

Date: 2023-04-07 00:11:22 Linux

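Symptom

The client is OpenStack, which mounts our storage over NFS. One service process went into the D state (uninterruptible sleep) and did not recover for a long time:

    [root@ECM-043 ~]# ps aux | grep nova-compute
    nova 21409 1.4 0.0 2117692 110916 ? Dl 15:06 1:24 /opt/server/python27/bin/python /usr/bin/nova-compute --logfile /var/log/nova/compute.log

Troubleshooting steps

On the client: to find out where the process is hung, use lsof to list the files it has open.

    lsof -p 21409
    nova-comp 21409 nova 22w REG 0,29 0 2199054015793 /var/lib/nova/instances/locks/nova-storage-registry-lock (block.beijing.wocloud.cn:/var/share/ezfs/shareroot/block-bj)

The client process has the file nova-storage-registry-lock open on the NFS mount, so the next step is to check the state of that file on the server.

On the server: first find the inode number of nova-storage-registry-lock. With the inode number known, check on every NFS server whether another process holds a lock on the file.

    root@Storage5:~# cat /proc/locks | grep inode
    1: POSIX  ADVISORY  WRITE 28331 00:14:2199053494291 0 EOF
    2: POSIX  ADVISORY  WRITE 28165 00:14:1099511627783 0 0
    3: POSIX  ADVISORY  READ  360769 00:0f:21590 4 4
    4: POSIX  ADVISORY  WRITE 28331 00:14:2199054015793 0 EOF
    5: ... ADVISORY  WRITE 98842 00:0f:47181 0 EOF
    7: POSIX  ADVISORY  WRITE 688677 00:0f:1186403778 0 EOF
    8: POSIX  ADVISORY  READ  38754 00:0f:21590 4 4

Entry 4 is a POSIX write lock on inode 2199054015793, the same inode that lsof reported on the client, and it is held on the server by pid 28331.

Still on the server: determine which NFS clients are connected. Unlike CIFS, NFS (v3 and earlier) is a stateless protocol, so the TCP connections alone do not tell the whole story. The hosts that statd is monitoring are recorded under /var/lib/nfs/sm:

    root@Storage5:/var/lib/nfs/sm# ll
    total 48
    drwxr-xr-x 2 statd root 4096 Apr 13 09:03 ./
    drwxr-xr-x 5 statd root 4096 Apr 10 18:21 ../
    -rw------- 1 statd root   88 Apr 11 16:08 10.55.4.1
    -rw------- 1 statd root   89 Apr 10 14:47 10.55.4.15
    -rw------- 1 statd root   89 Apr 10 14:41 10.55.4.16
    -rw------- 1 statd root   90 Apr  7 02:10 10.55.4.199
    -rw-r----- 1 statd root 1350 Apr 11 13:42 10.55.4.205
    -rw------- 1 statd root   89 Apr 11 16:00 10.55.4.31
    -rw------- 1 statd root   89 Apr 10 14:44 10.55.4.32
    -rw------- 1 statd root   89 Apr 10 14:34 10.55.4.37
    -rw------- 1 statd root   89 Apr 11 16:04 10.55.4.54
    -rw-r----- 1 statd root 2288 Apr 13 09:03 10.55.4.9
    root@Storage5:/var/lib/nfs/sm# pwd
    /var/lib/nfs/sm

Digression: lock types on Linux

File locks on Linux fall mainly into two kinds, flock and fcntl, which differ in granularity.

flock locks the whole file:

    root@scal61:/usr/share/pyshared/ezs3# cat /proc/locks
    1: FLOCK  ADVISORY  WRITE 97862 00:12:229635 0 EOF
    2: FLOCK  ADVISORY  WRITE 4246 00:12:20704 0 EOF

The fields of entry 2 read as follows:

    FLOCK          lock type
    ADVISORY       advisory lock, not enforced
    WRITE          the holder has a write lock on the file
    4246           pid of the holder
    00:12:20704    MAJOR-DEVICE:MINOR-DEVICE:INODE-NUMBER of the locked file
    0 EOF          locked range; 0 to EOF means the whole file

fcntl locks a part of a file, and shows up as POSIX:

    root@scal61:/usr/share/pyshared/ezs3# cat /proc/locks
    3: POSIX  ADVISORY  WRITE 97830 08:03:524404 0 EOF
    4: POSIX  ADVISORY  WRITE 4056 00:12:41028 0 EOF
    5: POSIX  ADVISORY  READ  ...
    6: POSIX  ADVISORY  READ  2783 00:12:10052 4 4
    7: POSIX  ADVISORY  READ  2783 00:12:14432 4 4
    8: POSIX  ADVISORY  WRITE 2783 00:12:10050 ...

FLOCK signifies the older-style UNIX file locks from the flock system call, and POSIX represents the newer POSIX locks from the lockf system call (on Linux, lockf is implemented on top of fcntl). ADVISORY means the lock does not prevent others from accessing the data; it only prevents other attempts to lock it. MANDATORY means no other access to the data is permitted while the lock is held. The fourth column shows whether the lock gives the holder READ or WRITE access to the file.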
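Searching /proc/locks by eye for one inode is error-prone when the list is long, so the lookup can be scripted. The following is a minimal sketch, assuming the eight-column layout shown in the /proc/locks output above; the names LockEntry, parse_lock_line and find_locks are hypothetical helpers made up for this illustration, not part of any existing tool. It also assumes that blocked waiters appear with a "->" marker after the entry number, and it silently skips any line it cannot parse.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class LockEntry:
    """One parsed line of /proc/locks (hypothetical helper type)."""
    lock_class: str      # FLOCK or POSIX
    mode: str            # ADVISORY or MANDATORY
    access: str          # READ or WRITE
    pid: int             # pid of the lock holder
    device: str          # "major:minor", hexadecimal as printed
    inode: int           # inode number of the locked file
    start: int           # first locked byte
    end: Optional[int]   # last locked byte, None when the line says EOF
    blocked: bool        # True for a waiter line marked with "->"


def parse_lock_line(line: str) -> Optional[LockEntry]:
    """Parse one /proc/locks line; return None if it does not fit the layout."""
    fields = line.split()
    # Assumption: a waiter is printed as "N: -> POSIX ADVISORY ..."
    blocked = len(fields) > 1 and fields[1] == "->"
    if blocked:
        del fields[1]
    if len(fields) != 8:
        return None
    _, lock_class, mode, access, pid, dev_inode, start, end = fields
    try:
        major, minor, inode = dev_inode.split(":")
        return LockEntry(
            lock_class=lock_class,
            mode=mode,
            access=access,
            pid=int(pid),
            device=f"{major}:{minor}",
            inode=int(inode),
            start=int(start),
            end=None if end == "EOF" else int(end),
            blocked=blocked,
        )
    except ValueError:
        # Non-numeric pid/inode/range or a malformed device field
        return None


def find_locks(lines: Iterable[str], inode: int) -> List[LockEntry]:
    """Return every parsed lock entry that refers to the given inode."""
    entries = (parse_lock_line(line) for line in lines)
    return [e for e in entries if e is not None and e.inode == inode]
```

On the server, calling find_locks(open("/proc/locks"), 2199054015793) would return the entries for the lock file from this case, and the pid field of each result is the process to look at next. Matching on the inode alone ignores the device number, so on a host with several filesystems the device field should be compared as well.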
