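An implementation method for detecting blocked threads using bytecode enhancement

Gradually, every thread in the business thread pool becomes blocked, and the service can no longer handle requests. The symptoms are low CPU, load, and memory metrics, while calls to the interface time out or get no response at all.

Problem analysis

Response time is one of the golden metrics of interface monitoring. If an interface receives a request at time t1 and returns its response at time t2, the response time is t2 - t1. This metric is fed into the monitoring and alerting system, which raises an alert when the response time exceeds a threshold. When a thread is blocked, however, the interface has not returned yet, so no response time is ever recorded.

The blocked threads are usually business threads, for example:

- Tomcat threads serving HTTP, with names like http-nio-8080-exec-1
- RocketMQ message consumer threads, with names like ConsumeMessageThread_1
- HSF provider threads, with names like HSFBizProcessor-DEFAULT-12-thread-3
- ......

If we can intercept the path that every such business thread must pass through, we can record the time at which the thread starts executing and begin timing it. A checker then examines the execution time periodically; once it exceeds the configured threshold, the checker prints the thread stack and raises an alert. When the thread returns normally, its record is removed. Two problems therefore need to be solved:

- how to intercept the threads;
- how to time them and check whether the execution time exceeds the threshold.

Solution approach

Based on the analysis above, the two problems are addressed in turn below.

Detecting blocked threads

This module does three things:

- registers a business thread when it starts executing;
- removes the registration when the business thread finishes or throws an exception;
- periodically checks whether any registered thread is blocked and, if so, prints its stack.

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ScheduledThreadPoolExecutor;
    import java.util.concurrent.ThreadFactory;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;

    public class BlockedThreadChecker {
        protected final static Log logger = LogFactory.getLog(BlockedThreadChecker.class);
        private static volatile BlockedThreadChecker instance;
        // Initial delay and check period, both in milliseconds
        private final static int DELAY = 10;
        private final static int PERIOD = 1000;
        private ScheduledThreadPoolExecutor executor;
        // Registered business threads and their start time / threshold
        private final Map<Thread, Task> threads = new ConcurrentHashMap<>();

        private BlockedThreadChecker() {
            logger.info("init BlockedThreadChecker...... classloader: " + this.getClass().getClassLoader()
                    + ", parent classloader: " + this.getClass().getClassLoader().getParent());
            int coreSize = Runtime.getRuntime().availableProcessors();
            ThreadFactory threadFactory = new ThreadFactory() {
                final AtomicInteger counter = new AtomicInteger();

                @Override
                public Thread newThread(Runnable r) {
                    Thread thread = new Thread(r, "BlockThreadCheckerTimer-" + counter.incrementAndGet());
                    thread.setDaemon(true);
                    return thread;
                }
            };
            executor = new ScheduledThreadPoolExecutor(coreSize, threadFactory);
            executor.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    long now = System.currentTimeMillis();
                    for (Map.Entry<Thread, Task> entry : threads.entrySet()) {
                        long execStart = entry.getValue().startTime;
                        long dur = now - execStart;
                        if (dur >= entry.getValue().maxExecTime) {
                            // BlockedThreadException is a custom exception whose definition is not shown here;
                            // it carries the stack of the blocked thread so that the logger prints it.
                            BlockedThreadException e = new BlockedThreadException(
                                    entry.getKey().getName() + " has been blocked for " + dur + " ms");
                            e.setStackTrace(entry.getKey().getStackTrace());
                            logger.error(e.getMessage(), e);
                        }
                    }
                }
            }, DELAY, PERIOD, TimeUnit.MILLISECONDS);
        }

        // Lazily initialized singleton (double-checked locking on a volatile field)
        public static BlockedThreadChecker getInstance() {
            if (instance != null) {
                return instance;
            }
            synchronized (BlockedThreadChecker.class) {
                if (instance != null) {
                    return instance;
                }
                instance = new BlockedThreadChecker();
            }
            return instance;
        }

        public void registerThread(Thread thread) {
            registerThread(thread, new Task());
        }

        public void registerThread(Thread thread, Task task) {
            threads.put(thread, task);
            logger.info("registerThread " + thread.getName());
        }

        public void unregisterThread(Thread thread) {
            threads.remove(thread);
            logger.info("unregisterThread " + thread.getName());
        }

        static class Task {
            long startTime = System.currentTimeMillis();
            // Threshold in milliseconds after which the thread is reported as blocked
            long maxExecTime = 10000L;
        }
    }

Intercepting threads

Option 1

Intercept the common kinds of business threads inside the service itself:

- Tomcat threads serving HTTP: implement a custom Filter and do the thread registration and unregistration in the Filter.
- RocketMQ message consumer threads: provide unified implementations of MessageListenerConcurrently, MessageListenerOrderly, and so on according to business needs, and do the registration and unregistration in those unified implementation classes.
- HSF provider threads: implement a custom filter and do the registration and unregistration in that filter.

This option is simple to implement but highly intrusive to business code. Being that intrusive, it gives business teams little motivation to adopt it before they have run into the problem themselves.

Option 2

Implement a custom module on top of jvm-sandbox, along the following lines:

    import com.alibaba.jvm.sandbox.api.Information;
    import com.alibaba.jvm.sandbox.api.LoadCompleted;
    import com.alibaba.jvm.sandbox.api.Module;
    import com.alibaba.jvm.sandbox.api.listener.ext.Advice;
    import com.alibaba.jvm.sandbox.api.listener.ext.AdviceListener;
    import com.alibaba.jvm.sandbox.api.listener.ext.EventWatchBuilder;
    import com.alibaba.jvm.sandbox.api.resource.ModuleEventWatcher;
    import org.kohsuke.MetaInfServices;
    import sun.misc.Unsafe;

    import javax.annotation.Resource;
    import java.lang.reflect.Field;
    import java.util.Properties;

    @MetaInfServices(Module.class)
    @Information(id = "blocked-thread-module", version = "0.0.1", author = "yuji")
    public class BlockedThreadModule implements Module, LoadCompleted {

        @Resource
        private ModuleEventWatcher moduleEventWatcher;

        private AdviceListener adviceListener = new AdviceListener() {
            @Override
            protected void before(Advice advice) throws Throwable {
                // Only handle the outermost call so nested invocations are not registered twice
                if (!advice.isProcessTop()) {
                    return;
                }
                BlockedThreadChecker.getInstance().registerThread(Thread.currentThread());
            }

            @Override
            protected void afterReturning(Advice advice) {
                if (!advice.isProcessTop()) {
                    return;
                }
                BlockedThreadChecker.getInstance().unregisterThread(Thread.currentThread());
            }

            @Override
            protected void afterThrowing(Advice advice) {
                if (!advice.isProcessTop()) {
                    return;
                }
                BlockedThreadChecker.getInstance().unregisterThread(Thread.currentThread());
            }
        };

        @Override
        public void loadCompleted() {
            // Tomcat HTTP threads
            new EventWatchBuilder(moduleEventWatcher)
                    .onClass("javax.servlet.http.HttpServlet")
                    .onBehavior("service")
                    .onWatch(adviceListener);
            // RocketMQ consumer threads
            new EventWatchBuilder(moduleEventWatcher)
                    .onClass("com.alibaba.rocketmq.client.consumer.listener.MessageListenerConcurrently")
                    .includeSubClasses()
                    .onBehavior("consumeMessage")
                    .onWatch(adviceListener);
            new EventWatchBuilder(moduleEventWatcher)
                    .onClass("com.alibaba.rocketmq.client.consumer.listener.MessageListenerOrderly")
                    .includeSubClasses()
                    .onBehavior("consumeMessage")
                    .onWatch(adviceListener);
            // HSF provider threads
            new EventWatchBuilder(moduleEventWatcher)
                    .onClass("com.taobao.hsf.remoting.provider.ReflectInvocationHandler")
                    .includeSubClasses()
                    .onBehavior("invoke")
                    .onWatch(adviceListener);
        }
    }

To use it, add the jvm-sandbox agent as a javaagent in the application's startup arguments. Compared with option 1, the business application needs no code changes and no changes to its existing packaging setup. The drawback is that jvm-sandbox has to be deployed on every application machine in advance, which adds operations and maintenance work. Personally, I consider this the most stable option.

Option 3

To reduce that operations work, one idea is to deliver the capability to business teams as a jar, so that they only need to add the jar as a dependency. Two main problems have to be solved.

How to trigger the jar's initialization logic

One way is a Spring Boot starter, as arthas-spring-boot-starter does. Another is to pick an entry point in the Spring container's initialization process, for example by implementing the ApplicationListener interface and listening for the ApplicationEvent that signals Spring has finished initializing.

How to initialize jvm-sandbox

The core initialization logic is as follows:

    // Obtain the Instrumentation through ByteBuddy
    Instrumentation inst = ByteBuddyAgent.install();

    // Append the matching version of sandbox-spy.jar to the BootstrapClassLoader search path.
    // This step is needed because the package names in sandbox-spy start with "java",
    // so its classes can only be loaded by the BootstrapClassLoader.
    JarFile spyJarFile = new JarFile("/directory/sandbox-spy-version.jar");
    inst.appendToBootstrapClassLoaderSearch(spyJarFile);

    // Build the jvm-sandbox core feature string
    String sandboxCoreFeatureString = String.format(
            ";system_module=%s;mode=%s;sandbox_home=%s;provider=%s;namespace=%s;unsafe.enable=true;",
            systemModule, "agent", sandboxHome, provider, NAMESPACE);
    CoreConfigure coreConfigure = CoreConfigure.toConfigure(sandboxCoreFeatureString, null);
    CoreLoadedClassDataSource classDataSource = new DefaultCoreLoadedClassDataSource(inst, true);
    ProviderManager providerManager = new DefaultProviderManager(coreConfigure);

    // CoreModuleManager is the core class; user-defined modules are loaded and initialized in it
    CoreModuleManager coreModuleManager = new DefaultCoreModuleManager(
            coreConfigure, inst, classDataSource, providerManager);

    // Initialize the mapping between the namespace and its SpyHandler
    SpyUtils.init(NAMESPACE);

    // Load the modules
    coreModuleManager.reset();

The overall logic of this code is sound. The detail that needs care is compatibility when it is loaded under different class-loader hierarchies:

- under Tomcat: the Tomcat class-loader hierarchy
- as a Pandora runnable jar: the Pandora class-loader hierarchy
- in IDEA: the application class-loader hierarchy

Summary

Of the three options so far, I personally prefer the second.

References

- bytebuddy
- jvm-sandbox
- arthas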
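Supplementary sketch for option 1: the article describes doing registration and unregistration inside a Filter or a unified listener class, but gives no code for it. The following is a minimal, self-contained illustration of that register / try / finally-unregister pattern, not the author's implementation. MiniBlockedThreadChecker, monitored, and findBlocked are hypothetical names introduced only for this example, and the checker is a simplified stand-in for BlockedThreadChecker that takes the current time as a parameter instead of running a scheduled task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical, simplified stand-in for the article's BlockedThreadChecker. */
final class MiniBlockedThreadChecker {
    // Start time (epoch millis) of every thread that is currently registered
    private final Map<Thread, Long> startTimes = new ConcurrentHashMap<>();
    private final long maxExecMillis;

    MiniBlockedThreadChecker(long maxExecMillis) {
        this.maxExecMillis = maxExecMillis;
    }

    void register(Thread thread, long startMillis) {
        startTimes.put(thread, startMillis);
    }

    void unregister(Thread thread) {
        startTimes.remove(thread);
    }

    int registeredCount() {
        return startTimes.size();
    }

    /** Returns the registered threads whose execution time at nowMillis has reached the threshold. */
    List<Thread> findBlocked(long nowMillis) {
        List<Thread> blocked = new ArrayList<>();
        for (Map.Entry<Thread, Long> entry : startTimes.entrySet()) {
            if (nowMillis - entry.getValue() >= maxExecMillis) {
                blocked.add(entry.getKey());
            }
        }
        return blocked;
    }

    /**
     * Runs the task with the current thread registered. The finally block
     * guarantees unregistration on both normal return and exception.
     */
    <T> T monitored(Supplier<T> task) {
        Thread current = Thread.currentThread();
        register(current, System.currentTimeMillis());
        try {
            return task.get();
        } finally {
            unregister(current);
        }
    }
}
```

In a servlet Filter the same shape would apply: register before delegating to the rest of the chain and unregister in a finally block, so that a thread that throws is never left in the registry and reported as blocked by mistake.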