Java Virtual Machine Troubleshooting Tools

Date: 2023-04-01 14:00:18 Java

Today I studied the troubleshooting tools covered in 《深入理解Java虚拟机》 (Understanding the Java Virtual Machine in Depth). They are all small commands and none of them is complicated, but some should not be used in a prod environment. jmap is one example: on a pod with several gigabytes of memory, generating a dump file can pause the application. If a large number of requests arrive during that pause, the pod may crash before the dump file has been written.

jps lists the running Java processes.

    buxuesongdeMacBook-Pro:~ buxuesong$ jps -l
    1121 /Users/buxuesong/Documents/git_code/account/account-api/target/account-api-1.3.0-SNAPSHOT.jar
    1131 sun.tools.jps.Jps
    buxuesongdeMacBook-Pro:~ buxuesong$ ps -ef | grep 1121
      501  1121  1120   0 11:01pm ttys000    0:24.41 /usr/bin/java -jar /Users/buxuesong/Documents/git_code/account/account-api/target/account-api-1.3.0-SNAPSHOT.jar
      501  1138  1125   0 11:02pm ttys001    0:00.00 grep 1121

jstat monitors the runtime status of the virtual machine.

    buxuesongdeMacBook-Pro:~ buxuesong$ jstat -gcutil 1121
      S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
      0.00  96.78  63.52  23.52  95.12  92.69     11    0.101     2    0.211    0.311

How to read this output for process 1121:

- There are two Survivor spaces (S0 and S1). S0 is empty and S1 is 96.78% full.
- Eden (E) in the young generation is 63.52% full.
- The old generation (O) is 23.52% full. M is 95.12%; on this JDK 8 VM, M is Metaspace rather than the permanent generation.
- CCS, the compressed class space, is 92.69% full.
- Minor GC (YGC, Young GC) has run 11 times since the program started, taking 0.101 seconds in total (YGCT).
- Full GC (FGC) has run 2 times, taking 0.211 seconds in total (FGCT, Full GC Time). The total time spent in all GCs (GCT, GC Time) is 0.311 seconds.

jinfo inspects and adjusts virtual machine parameters at runtime.

    buxuesongdeMacBook-Pro:~ buxuesong$ jinfo -flag CMSInitiatingOccupancyFraction 1835
    -XX:CMSInitiatingOccupancyFraction=-1

jmap (Memory Map for Java) generates a heap dump snapshot (usually called a heap dump or dump file).

    buxuesongdeMacBook-Pro:~ buxuesong$ jmap -dump:format=b,file=account.dump 1868
    Dumping heap to /Users/buxuesong/account.dump ...
    Heap dump file created

jhat (JVM Heap Analysis Tool) is used together with jmap to analyze the heap dump that jmap generates. Once it is running, open http://localhost:7000 to browse the dump contents.

    buxuesongdeMacBook-Pro:~ buxuesong$ jhat account.dump
    Reading from account.dump...
    Dump file created Wed Nov 25 23:37:23 CST 2020
    Snapshot read, resolving...
    Resolving 402521 objects...
    Chasing references, expect 80 dots................................................................................
    Eliminating duplicate references................................................................................
    Snapshot resolved.
    Started HTTP server on port 7000
    Server is ready.

jstack (Stack Trace for Java) generates a snapshot of the virtual machine's threads at the current moment (usually called a thread dump or javacore file).

    buxuesongdeMacBook-Pro:~ buxuesong$ jstack -l 1939
    2020-11-25 23:47:43
    Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.201-b09 mixed mode):

    "Attach Listener" #34 daemon prio=9 os_prio=31 tid=0x00007fbe7a496800 nid=0x9d03 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

       Locked ownable synchronizers:
        - None

    "Druid-ConnectionPool-Destroy-205735257" #33 daemon prio=5 os_prio=31 tid=0x00007fbe77f6f800 nid=0x9f03 waiting on condition [0x0000700012133000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at com.alibaba.druid.pool.DruidDataSource$DestroyConnectionThread.run(DruidDataSource.java:2032)

       Locked ownable synchronizers:
        - None

    "Druid-ConnectionPool-Create-205735257" #32 daemon prio=5 os_prio=31 tid=0x00007fbe77a01800 nid=0xa003 waiting on condition [0x0000700012030000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007aa5174d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at com.alibaba.druid.pool.DruidDataSource$CreateConnectionThread.run(DruidDataSource.java:1948)

       Locked ownable synchronizers:
        - None

    "mysql-cj-abandoned-connection-cleanup" #31 daemon prio=5 os_prio=31 tid=0x00007fbe7a3cf000 nid=0x6703 in Object.wait() [0x0000700011f2d000]
       java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000797d2a1f0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
        - locked <0x0000000797d2a1f0> (a java.lang.ref.ReferenceQueue$Lock)
        at com.mysql.cj.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:85)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

       Locked ownable synchronizers:
        - <0x0000000797d2cce0> (a java.util.concurrent.ThreadPoolExecutor$Worker)

    "DestroyJavaVM" #30 prio=5 os_prio=31 tid=0x00007fbe7c87f800 nid=0x1003 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

       Locked ownable synchronizers:
        - None
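
The thread states that jstack prints (RUNNABLE, WAITING, TIMED_WAITING and so on) can also be read from inside the JVM through the standard java.lang.management API, which may be useful when attaching an external tool to a prod process is not an option. The following is only a minimal sketch and does not come from the book or the session above; the class name ThreadStateSummary and the method countByState are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: summarizes the current JVM's live threads by state,
// roughly the information one scans for in a jstack thread dump.
class ThreadStateSummary {

    /** Counts the live threads of this JVM, grouped by Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // false, false: do not collect locked monitors or ownable synchronizers,
        // which keeps the snapshot cheaper than a full "jstack -l" style dump.
        ThreadInfo[] infos = threads.dumpAllThreads(false, false);

        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip any missing entry
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints one line per state that has at least one thread.
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

Run inside the application (for example from a scheduled task or an admin endpoint), this gives a rough view of how many threads are blocked or waiting. It does not replace a full thread dump, because it leaves out stack frames and lock details.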