Alibaba Java Diagnostic Tool Arthas

Overview

Arthas


Arthas is a Java Diagnostic tool open sourced by Alibaba.

Arthas allows developers to troubleshoot production issues for Java applications without modifying code or restarting servers.

Chinese Documentation

Background

Oftentimes, the production network is inaccessible from the local development environment. If issues occur in production systems, it is impossible to debug the application remotely with an IDE. More importantly, debugging in a production environment is unacceptable, because it suspends all threads and therefore the business services.

Developers could always try to reproduce the same issue on the test/staging environment. However, this is tricky as some issues cannot be reproduced easily on a different environment, or even disappear once restarted.

And if you're thinking of adding some logs to your code to help troubleshoot the issue, you will have to go through the following lifecycle: test, staging, and then production. Time is money, and this approach is inefficient. Besides, the issue may not be reproducible once the JVM is restarted, as described above.

Arthas was built to solve these issues. A developer can troubleshoot production issues on the fly, with no JVM restart and no additional code changes. Arthas works as an observer and will never suspend your existing threads.

Key features

  • Check whether a class is loaded, and where it was loaded from. (Useful for troubleshooting jar file conflicts.)
  • Decompile a class to ensure the code is running as expected.
  • View classloader statistics, e.g. the number of classloaders, the number of classes loaded per classloader, the classloader hierarchy, possible classloader leaks, etc.
  • View method invocation details, e.g. method parameters, return objects, thrown exceptions, etc.
  • Check the stack trace of a specified method invocation. This is useful when a developer wants to know the caller of that method.
  • Trace the method invocation to find slow sub-invocations.
  • Monitor method invocation statistics, e.g. QPS, RT, success rate, etc.
  • Monitor system metrics, thread states and CPU usage, GC statistics, etc.
  • Supports command line interactive mode, with auto-completion enabled.
  • Supports telnet and websocket, which enables both local and remote diagnostics with command line and browsers.
  • Supports profiler/Flame Graph
  • Supports JDK 6+.
  • Supports Linux/Mac/Windows.

Online Tutorials (Recommended)

Quick start

Use arthas-boot (Recommended)

Download arthas-boot.jar, then start it with the java command:

curl -O https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar

Print usage:

java -jar arthas-boot.jar -h

Use as.sh

You can install Arthas with one single line command on Linux, Unix, and Mac. Copy the following command and paste it into the command line, then press Enter to run:

curl -L https://arthas.aliyun.com/install.sh | sh

The command above will download the bootstrap script as.sh to the current directory. You can move it to any other place you want, or add its location to $PATH.

You can enter its interactive interface by executing as.sh, or execute as.sh -h for more help information.

Documentation

Feature Showcase

Dashboard

[Screenshot: dashboard command output]

Thread

See what is eating your CPU (ranked by top CPU usage) and what is going on there at a glance:

$ thread -n 3
"as-command-execute-daemon" Id=29 cpuUsage=75% RUNNABLE
    at sun.management.ThreadImpl.dumpThreads0(Native Method)
    at sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:440)
    at com.taobao.arthas.core.command.monitor200.ThreadCommand$1.action(ThreadCommand.java:58)
    at com.taobao.arthas.core.command.handler.AbstractCommandHandler.execute(AbstractCommandHandler.java:238)
    at com.taobao.arthas.core.command.handler.DefaultCommandHandler.handleCommand(DefaultCommandHandler.java:67)
    at com.taobao.arthas.core.server.ArthasServer$4.run(ArthasServer.java:276)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

    Number of locked synchronizers = 1
    - java.util.concurrent.ThreadPoolExecutor$Worker@6cd0b6f8

"as-session-expire-daemon" Id=25 cpuUsage=24% TIMED_WAITING
    at java.lang.Thread.sleep(Native Method)
    at com.taobao.arthas.core.server.DefaultSessionManager$2.run(DefaultSessionManager.java:85)

"Reference Handler" Id=2 cpuUsage=0% WAITING on java.lang.ref.Reference$Lock@69ba0f27
    at java.lang.Object.wait(Native Method)
    -  waiting on java.lang.ref.Reference$Lock@69ba0f27
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
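
The stack trace above shows that the thread command reads its data through the JDK's thread management bean. As a rough illustration of that idea, and not Arthas's actual implementation, the sketch below ranks live threads with the standard java.lang.management API. The class name TopThreads and its Row type are made up for this example. Note that it sorts by accumulated CPU time, whereas the cpuUsage column above is a percentage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TopThreads {

    /** One row of the ranking. cpuNanos is -1 when the JVM cannot report CPU time. */
    public static final class Row {
        public final long id;
        public final String name;
        public final Thread.State state;
        public final long cpuNanos;

        Row(long id, String name, Thread.State state, long cpuNanos) {
            this.id = id;
            this.name = name;
            this.state = state;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns at most n live threads, highest accumulated CPU time first. */
    public static List<Row> topByCpu(int n) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuTimeAvailable = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        List<Row> rows = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread has already exited
            if (info == null) {
                continue;
            }
            long nanos = cpuTimeAvailable ? mx.getThreadCpuTime(id) : -1L;
            rows.add(new Row(id, info.getThreadName(), info.getThreadState(), nanos));
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.cpuNanos).reversed());
        return new ArrayList<>(rows.subList(0, Math.min(n, rows.size())));
    }

    public static void main(String[] args) {
        for (Row r : topByCpu(3)) {
            System.out.printf("\"%s\" Id=%d cpuTime=%dms %s%n",
                    r.name, r.id, r.cpuNanos / 1_000_000, r.state);
        }
    }
}
```

Running this inside the target process only shows that process's own threads; the value of thread -n is that it gives a similar view of a running JVM without changing its code.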

jad

Decompile your class in one shot:

$ jad javax.servlet.Servlet

ClassLoader:
+-java.net.URLClassLoader@6108b2d7
  +-sun.misc.Launcher$AppClassLoader@18b4aac2
    +-sun.misc.Launcher$ExtClassLoader@1ddf84b8

Location:
/Users/xxx/work/test/lib/servlet-api.jar

/*
 * Decompiled with CFR 0_122.
 */
package javax.servlet;

import java.io.IOException;
import javax.servlet.ServletConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public interface Servlet {
    public void init(ServletConfig var1) throws ServletException;

    public ServletConfig getServletConfig();

    public void service(ServletRequest var1, ServletResponse var2) throws ServletException, IOException;

    public String getServletInfo();

    public void destroy();
}

mc

Memory compiler, compiles .java files into .class files in memory.

mc /tmp/Test.java

redefine

Load the external *.class files to re-define the loaded classes in JVM.

redefine /tmp/Test.class
redefine -c 327a647b /tmp/Test.class /tmp/Test\$Inner.class

sc

Search any loaded class with detailed information.

$ sc -d org.springframework.web.context.support.XmlWebApplicationContext
 class-info        org.springframework.web.context.support.XmlWebApplicationContext
 code-source       /Users/xxx/work/test/WEB-INF/lib/spring-web-3.2.11.RELEASE.jar
 name              org.springframework.web.context.support.XmlWebApplicationContext
 isInterface       false
 isAnnotation      false
 isEnum            false
 isAnonymousClass  false
 isArray           false
 isLocalClass      false
 isMemberClass     false
 isPrimitive       false
 isSynthetic       false
 simple-name       XmlWebApplicationContext
 modifier          public
 annotation
 interfaces
 super-class       +-org.springframework.web.context.support.AbstractRefreshableWebApplicationContext
                     +-org.springframework.context.support.AbstractRefreshableConfigApplicationContext
                       +-org.springframework.context.support.AbstractRefreshableApplicationContext
                         +-org.springframework.context.support.AbstractApplicationContext
                           +-org.springframework.core.io.DefaultResourceLoader
                             +-java.lang.Object
 class-loader      +-org.apache.catalina.loader.ParallelWebappClassLoader
                     +-java.net.URLClassLoader@6108b2d7
                       +-sun.misc.Launcher$AppClassLoader@18b4aac2
                         +-sun.misc.Launcher$ExtClassLoader@1ddf84b8
 classLoaderHash   25131501
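
Several fields in the sc -d output (code-source, super-class, class-loader) correspond to information that plain Java reflection also exposes. The following is a minimal sketch of how such values can be read for a class you already hold a reference to; ClassInfo and its method names are hypothetical, and this is not how Arthas itself gathers the data.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

public class ClassInfo {

    /** Names of all superclasses, nearest first, ending with java.lang.Object. */
    public static List<String> superClasses(Class<?> type) {
        List<String> chain = new ArrayList<>();
        for (Class<?> c = type.getSuperclass(); c != null; c = c.getSuperclass()) {
            chain.add(c.getName());
        }
        return chain;
    }

    /** The defining loader followed by its parents. The bootstrap loader is null in the JDK and is left out. */
    public static List<String> classLoaders(Class<?> type) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = type.getClassLoader(); cl != null; cl = cl.getParent()) {
            chain.add(cl.toString());
        }
        return chain;
    }

    /** Location the class was loaded from, or "" when none is exposed (e.g. for bootstrap classes). */
    public static String codeSource(Class<?> type) {
        ProtectionDomain pd = type.getProtectionDomain();
        CodeSource cs = pd == null ? null : pd.getCodeSource();
        return cs == null || cs.getLocation() == null ? "" : cs.getLocation().toString();
    }

    public static void main(String[] args) {
        Class<?> type = ClassInfo.class;
        System.out.println("code-source   " + codeSource(type));
        System.out.println("super-class   " + superClasses(type));
        System.out.println("class-loader  " + classLoaders(type));
    }
}
```

The difference in practice is that sc searches all classes loaded in a running JVM by name pattern, which reflection alone cannot do from outside the process.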

stack

View the call stack of test.arthas.TestStack#doGet:

$ stack test.arthas.TestStack doGet
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 286 ms.
ts=2018-09-18 10:11:45;thread_name=http-bio-8080-exec-10;id=d9;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.ParallelWebappClassLoader@25131501
    @test.arthas.TestStack.doGet()
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:624)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110)
        ...
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:451)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1121)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:745)

Trace

See what is slowing down your method invocation with the trace command:

[Screenshot: trace command output]

Watch

Watch the first parameter and thrown exception of test.arthas.TestWatch#doGet only if it throws exception.

$ watch test.arthas.TestWatch doGet {params[0], throwExp} -e
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 65 ms.
ts=2018-09-18 10:26:28;result=@ArrayList[
    @RequestFacade[org.apache.catalina.connector.RequestFacade@79f922b2],
    @NullPointerException[java.lang.NullPointerException],
]

Monitor

Monitor the invocation statistics of a specific method, including the total number of invocations, average response time, and success rate, reported every 5 seconds:

$ monitor -c 5 org.apache.dubbo.demo.provider.DemoServiceImpl sayHello
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 109 ms.
 timestamp            class                                           method    total  success  fail  avg-rt(ms)  fail-rate
----------------------------------------------------------------------------------------------------------------------------
 2018-09-20 09:45:32  org.apache.dubbo.demo.provider.DemoServiceImpl  sayHello  5      5        0     0.67        0.00%

 timestamp            class                                           method    total  success  fail  avg-rt(ms)  fail-rate
----------------------------------------------------------------------------------------------------------------------------
 2018-09-20 09:45:37  org.apache.dubbo.demo.provider.DemoServiceImpl  sayHello  5      5        0     1.00        0.00%

 timestamp            class                                           method    total  success  fail  avg-rt(ms)  fail-rate
----------------------------------------------------------------------------------------------------------------------------
 2018-09-20 09:45:42  org.apache.dubbo.demo.provider.DemoServiceImpl  sayHello  5      5        0     0.43        0.00%
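
To make the columns in the table above concrete, here is a small sketch of how total, success, fail, avg-rt and fail-rate can be derived from individual invocations and flushed once per reporting cycle. It is an illustration under assumptions, not Arthas's code: InvocationStats, record, add and flush are invented names, and the exact formulas Arthas uses are not stated in this document.

```java
import java.util.Locale;
import java.util.concurrent.Callable;

public class InvocationStats {
    private long total;
    private long success;
    private long fail;
    private long costNanos;

    /** Runs the call, timing it and counting it as a success or a failure. */
    public <T> T record(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        boolean ok = false;
        try {
            T result = call.call();
            ok = true;
            return result;
        } finally {
            add(ok, System.nanoTime() - start);
        }
    }

    /** Adds one finished invocation to the current cycle. */
    public synchronized void add(boolean ok, long nanos) {
        total++;
        if (ok) {
            success++;
        } else {
            fail++;
        }
        costNanos += nanos;
    }

    /** Formats the current cycle as one row and resets the counters for the next cycle. */
    public synchronized String flush() {
        double avgMs = total == 0 ? 0.0 : costNanos / 1_000_000.0 / total;
        double failRate = total == 0 ? 0.0 : 100.0 * fail / total;
        String row = String.format(Locale.ROOT,
                "total=%d success=%d fail=%d avg-rt(ms)=%.2f fail-rate=%.2f%%",
                total, success, fail, avgMs, failRate);
        total = 0;
        success = 0;
        fail = 0;
        costNanos = 0;
        return row;
    }

    public static void main(String[] args) throws Exception {
        InvocationStats stats = new InvocationStats();
        for (int i = 0; i < 5; i++) {
            stats.record(() -> "hello");
        }
        System.out.println(stats.flush());
    }
}
```

Calling flush() from a timer every 5 seconds would give one row per cycle, similar in spirit to monitor -c 5.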

Time Tunnel(tt)

Record method invocation data so that you can check the invocation parameters, return values, and thrown exceptions later. It works as if you could go back through a time tunnel and replay past method invocations.

$ tt -t org.apache.dubbo.demo.provider.DemoServiceImpl sayHello
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 75 ms.
 INDEX   TIMESTAMP            COST(ms)  IS-RET  IS-EXP   OBJECT         CLASS                          METHOD
-------------------------------------------------------------------------------------------------------------------------------------
 1000    2018-09-20 09:54:10  1.971195  true    false    0x55965cca     DemoServiceImpl                sayHello
 1001    2018-09-20 09:54:11  0.215685  true    false    0x55965cca     DemoServiceImpl                sayHello
 1002    2018-09-20 09:54:12  0.236303  true    false    0x55965cca     DemoServiceImpl                sayHello
 1003    2018-09-20 09:54:13  0.159598  true    false    0x55965cca     DemoServiceImpl                sayHello
 1004    2018-09-20 09:54:14  0.201982  true    false    0x55965cca     DemoServiceImpl                sayHello
 1005    2018-09-20 09:54:15  0.214205  true    false    0x55965cca     DemoServiceImpl                sayHello
 1006    2018-09-20 09:54:16  0.241863  true    false    0x55965cca     DemoServiceImpl                sayHello
 1007    2018-09-20 09:54:17  0.305747  true    false    0x55965cca     DemoServiceImpl                sayHello
 1008    2018-09-20 09:54:18  0.18468   true    false    0x55965cca     DemoServiceImpl                sayHello

Classloader

$ classloader
 name                                                  numberOfInstances  loadedCountTotal
 BootstrapClassLoader                                  1                  3346
 com.taobao.arthas.agent.ArthasClassloader             1                  1262
 java.net.URLClassLoader                               2                  1033
 org.apache.catalina.loader.ParallelWebappClassLoader  1                  628
 sun.reflect.DelegatingClassLoader                     166                166
 sun.misc.Launcher$AppClassLoader                      1                  31
 com.alibaba.fastjson.util.ASMClassLoader              6                  15
 sun.misc.Launcher$ExtClassLoader                      1                  7
 org.jvnet.hk2.internal.DelegatingClassLoader          2                  2
 sun.reflect.misc.MethodUtil                           1                  1

Web Console

[Screenshot: web console]

Profiler/FlameGraph

$ profiler start
Started [cpu] profiling
$ profiler stop
profiler output file: /tmp/demo/arthas-output/20191125-135546.svg
OK

View profiler results under arthas-output via a browser.

Arthas Spring Boot Starter

Known Users

Arthas has more than 120 registered users (View All).

You are welcome to register your company name in this issue: https://github.com/alibaba/arthas/issues/111 (in order of registration)

Alibaba Alipay Aliyun Taobao ICBC 雪球财经 顺丰科技 贝壳找房 vipkid 百度凤巢 有赞

Derivative Projects

Credits

Contributors

This project exists, thanks to all the people who contributed.

Projects

  • bytekit Java Bytecode Kit.
  • greys-anatomy: The Arthas code base is derived from Greys; we thank the Greys authors for their excellent work.
  • termd: Arthas's terminal implementation is based on termd, an open source library for writing terminal applications in Java.
  • crash: Arthas's text-based user interface rendering is based on code extracted from crash.
  • cli: Arthas's command line interface implementation is based on cli, open sourced by vert.x.
  • compiler Arthas's memory compiler.
  • Apache Commons Net Arthas's telnet client.
  • async-profiler Arthas's profiler command.
Comments
  • No class or method is affected / trace, watch and similar commands cannot find the class or method

    There are two cases:

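    Case 1: searching with sc or sm returns results, confirming that the class and method have been loaded by the JVM

    1. System-level classes (i.e. java.*) cannot be enhanced by default. If you need to enhance them, see the unsafe option here, and be careful when enhancing system classes.

      options unsafe true
      
    2. Check whether $HOME/logs/arthas/arthas.log contains the log line ERROR c.t.arthas.core.advisor.Enhancer -the classloader can not load SpyAPI, ignore it. If it does, the ClassLoader that loads the target class is implemented incorrectly and fails to load classes under the java.* package. You can try enabling the arthas.enhanceLoaders=java.lang.ClassLoader configuration item; see https://github.com/alibaba/arthas/issues/1596

    3. The constructor is named <init>, for example: watch demo.MathGame <init> '{params,returnObj,throwExp}' -v

    4. By default, enum classes are filtered out; see https://github.com/alibaba/arthas/issues/1677

    5. Dynamic classes generated for lambdas cannot be enhanced; this is a limitation of the JVM itself.

    6. Array classes, interfaces, enum classes, and classes such as java.lang.Class/java.lang.Integer/java.lang.reflect.Method cannot be enhanced.

    7. Classes loaded by Arthas's own ClassLoader cannot be enhanced.

    8. Search $HOME/logs/arthas/arthas.log for a Method code too large exception.

    9. If that exception is present, try restoring the class with the reset class_name command, then run trace, watch, etc. again.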

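    Case 2: searching for the class and method with sc or sm returns 0 results

    1. To look up an inner class, use the $ symbol to spell out the correct class name, e.g. sc outer-class$inner-class
    2. You are attached to the wrong process. If you attached to another process earlier and did not exit with shutdown, the next attach will by default still connect to that earlier process.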
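
The $ in the inner-class rule above is the JVM's binary name for a nested class, which differs from the dotted name used in source code. A short sketch that prints both forms for a JDK class and for a nested class defined here (BinaryNames and Inner are names invented for the example):

```java
import java.util.Map;

public class BinaryNames {

    /** A nested class, only here to show how its binary name is formed. */
    static class Inner {
    }

    /** Binary name of the nested class, i.e. the form to pass to sc. */
    public static String nestedName() {
        return Inner.class.getName();
    }

    public static void main(String[] args) {
        // Binary name, as the JVM knows the class: java.util.Map$Entry
        System.out.println(Map.Entry.class.getName());
        // Canonical name, as written in source code: java.util.Map.Entry
        System.out.println(Map.Entry.class.getCanonicalName());
        // Binary name of the nested class declared above
        System.out.println(nestedName());
    }
}
```

So a search for java.util.Map.Entry would use the name java.util.Map$Entry.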
    question-answered 
    opened by ralf0131 35
  • Speeding up an API tenfold with the Arthas trace command (user case submission)

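    Background

    The Helios system has to handle a fairly large amount of data. In particular, querying one day of scores for all services returns the scores of every application for each of the day's 1440 minutes, several hundred thousand data points in total, and the API latency sometimes reaches several seconds. This post records how Arthas was used to bring the API from hundreds or thousands of ms down to tens of ms.

    Call trace:

    The call trace shows that fetching a full day of data in production takes a little over 300 ms, while the database query takes only 11 ms. Most of the time is therefore spent assembling the data in the program, which prompted the idea of optimizing the code.

    Optimization process

    Tip: you can skip the code; without context it is hard to tell what the functions mean. Focus on the Arthas trace results and the optimization ideas.

    Initial unoptimized version

    Code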

        private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
            HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    
            List<HeliosScore> heliosScores = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
            if (CollectionUtils.isEmpty(heliosScores)) {
                return response;
            }
    
            Set<String> dateSet = new HashSet<>();
    
            Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
            for (List<HeliosScore> value : groupByAppIdHeliosScores.values()) {
                value.sort(Comparator.comparing(HeliosScore::getTimeFrom));
                HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
                score.setNamespace(value.get(0).getNamespace());
                score.setAppId(value.get(0).getAppId());
                for (HeliosScore heliosScore : value) {
                    List<HeliosScore> splitHeliosScores = heliosScore.split();
                    for (HeliosScore splitHeliosScore : splitHeliosScores) {
                        if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0) {
                            continue;
                        }
                        if (splitHeliosScore.getTimeFrom().compareTo(request.getEndTime()) > 0) {
                            break;
                        }
                        dateSet.add(DateUtils.yyyyMMddHHmm.formatDate(splitHeliosScore.getTimeFrom()));
                        if (splitHeliosScore.getScores() == null) {
                            splitHeliosScore.setScores("100");
                            log.error("Missing data found during query: {}", heliosScore);
                        }
                        score.add(Math.max(0, Integer.parseInt(splitHeliosScore.getScores())), null);
                    }
                }
                response.getValues().add(score);
            }
    
            response.setDates(new ArrayList<>(dateSet).stream().sorted().collect(Collectors.toList()));
            return response;
        }
    

    Arthas Trace

    `---ts=2021-08-17 16:28:00;thread_name=http-nio-8080-exec-10;id=81;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@20864cd1
        `---[4046.398447ms] xxxService.controller.HeliosController:queryScores()
            +---[0.022259ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #147
            +---[0.007132ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #149
            +---[0.006985ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #149
            +---[0.008704ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #149
            +---[19.284658ms] xxxService.service.HeliosService:queryScoresTimeBetween() #149
            +---[0.017468ms] org.apache.commons.collections.CollectionUtils:isEmpty() #150
            +---[0.008054ms] java.util.HashSet:<init>() #154
            +---[0.027591ms] java.util.List:stream() #156
            +---[0.044229ms] java.util.stream.Collectors:groupingBy() #156
            +---[0.155582ms] java.util.stream.Stream:collect() #156
            +---[0.018318ms] java.util.Map:values() #157
            +---[0.019199ms] java.util.Collection:iterator() #157
            +---[min=3.51E-4ms,max=0.014266ms,total=0.125003ms,count=123] java.util.Iterator:hasNext() #157
            +---[min=5.11E-4ms,max=0.010188ms,total=0.145693ms,count=122] java.util.Iterator:next() #157
            +---[min=4.89E-4ms,max=0.045356ms,total=0.321978ms,count=122] java.util.Comparator:comparing() #158
            +---[min=0.003637ms,max=0.033049ms,total=0.928795ms,count=122] java.util.List:sort() #158
            +---[min=5.94E-4ms,max=0.010442ms,total=0.1485ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #159
            +---[min=4.5E-4ms,max=0.010857ms,total=0.12773ms,count=122] java.util.List:get() #160
            +---[min=5.01E-4ms,max=0.007849ms,total=0.123696ms,count=122] xxxService.helios.entity.HeliosScore:getNamespace() #160
            +---[min=6.5E-4ms,max=0.007324ms,total=0.135906ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #160
            +---[min=3.72E-4ms,max=0.010288ms,total=0.086703ms,count=122] java.util.List:get() #161
            +---[min=5.1E-4ms,max=0.00627ms,total=0.103871ms,count=122] xxxService.helios.entity.HeliosScore:getAppId() #161
            +---[min=5.97E-4ms,max=0.006531ms,total=0.126184ms,count=122] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #161
            +---[min=4.45E-4ms,max=0.020198ms,total=0.138299ms,count=122] java.util.List:iterator() #162
            +---[min=3.42E-4ms,max=0.014615ms,total=0.256056ms,count=366] java.util.Iterator:hasNext() #162
            +---[min=3.59E-4ms,max=0.014974ms,total=0.174396ms,count=244] java.util.Iterator:next() #162
            +---[min=0.071035ms,max=0.148132ms,total=19.444179ms,count=244] xxxService.helios.entity.HeliosScore:split() #163
            +---[min=4.06E-4ms,max=0.022364ms,total=0.210152ms,count=244] java.util.List:iterator() #164
            +---[min=3.07E-4ms,max=0.199649ms,total=143.267893ms,count=351604] java.util.Iterator:hasNext() #164
            +---[min=3.25E-4ms,max=24.863976ms,total=177.15363ms,count=351360] java.util.Iterator:next() #164
            +---[min=3.93E-4ms,max=0.096771ms,total=176.843018ms,count=351360] xxxService.helios.entity.HeliosScore:getTimeFrom() #165
            +---[min=4.07E-4ms,max=18.772715ms,total=205.632183ms,count=351360] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #165
            +---[min=3.33E-4ms,max=0.045589ms,total=149.24486ms,count=351360] java.util.Date:compareTo() #165
            +---[min=3.93E-4ms,max=0.032972ms,total=86.466793ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #168
            +---[min=4.12E-4ms,max=0.061003ms,total=94.294061ms,count=175680] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #168
            +---[min=3.37E-4ms,max=0.038792ms,total=74.505056ms,count=175680] java.util.Date:compareTo() #168
            +---[min=3.97E-4ms,max=0.036548ms,total=87.693935ms,count=175680] xxxService.helios.entity.HeliosScore:getTimeFrom() #171
         1  +---[min=0.001952ms,max=0.068413ms,total=391.739063ms,count=175680] xxxService.utils.DateUtils$yyyyMMddHHmm:formatDate() #171
            +---[min=4.07E-4ms,max=0.037904ms,total=108.107714ms,count=175680] java.util.Set:add() #171
            +---[min=3.95E-4ms,max=0.031555ms,total=88.173857ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #172
            +---[min=3.88E-4ms,max=0.033584ms,total=84.689466ms,count=175680] xxxService.helios.entity.HeliosScore:getScores() #176
            +---[min=3.11E-4ms,max=0.038121ms,total=69.708752ms,count=175680] java.lang.Math:max() #176
            +---[min=4.66E-4ms,max=0.03391ms,total=104.476576ms,count=175680] xxxService.model.helios.HeliosGetScoreResponse$Score:add() #176
            +---[min=6.17E-4ms,max=0.01503ms,total=0.159826ms,count=122] xxxService.model.helios.HeliosGetScoreResponse:getValues() #179
            +---[min=6.44E-4ms,max=0.03742ms,total=0.21068ms,count=122] java.util.List:add() #179
            +---[0.108961ms] java.util.ArrayList:<init>() #182
            +---[0.017455ms] java.util.ArrayList:stream() #182
            +---[0.011099ms] java.util.stream.Stream:sorted() #182
            +---[0.013699ms] java.util.stream.Collectors:toList() #182
            +---[0.38178ms] java.util.stream.Stream:collect() #182
            `---[0.004627ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #182
    

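    Analysis

    Arthas reports a total of 4 seconds, but the call trace shows roughly 350~450 ms. The extra time is the overhead Arthas adds for each call it measures, because the method contains many loop iterations. This also tells us not to use trace on methods with heavy loops, as it has a very serious impact on performance.

    The function has three nested loops. The first iterates over the appIds, about 140 of them. The second iterates over the records fetched; one day of data has already been merged, so this should be 1. The third iterates over the minutes in the time range, which is 1440 for one day.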
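
The loop sizes can be cross-checked against the count= fields of the trace. The sketch below uses only numbers copied from that trace (the class and constant names are made up for illustration). It suggests that this particular run saw 122 appIds with 2 records each, rather than about 140 with 1 each; the post does not say why.

```java
public class LoopCounts {
    static final long APPS = 122;             // count of Iterator:next() at #157 (outer loop)
    static final long RECORDS = 244;          // count of Iterator:next() at #162 (middle loop)
    static final long MINUTES_PER_DAY = 1440; // split points per record for one day

    /** Records visited per appId in this run. */
    public static long recordsPerApp() {
        return RECORDS / APPS;
    }

    /** Total iterations of the innermost loop. */
    public static long innerIterations() {
        return RECORDS * MINUTES_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println(recordsPerApp());          // 2
        System.out.println(innerIterations());        // 351360, the count at #164 and #165
        System.out.println(APPS * MINUTES_PER_DAY);   // 175680, the count from #168 onwards
    }
}
```

With about 350,000 inner iterations and a dozen traced calls per iteration, even a sub-microsecond measurement cost per call adds up to seconds, which is consistent with the gap between the 4-second trace and the 350~450 ms seen without tracing.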

    The trace shows that the most expensive call is a wrapper around SimpleDateFormat, formatDate().

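    First optimization

    Optimization directions

    1. Change how each time point is traversed. Instead of splitting the merged large object into many small objects and iterating over them directly, merge the records first and then iterate logically by time point. This avoids creating several hundred thousand objects.

    2. Change the time-point set from Set<String> dateSet to Set<Date>, which reduces the cost of calling formatDate() repeatedly.

    3. Optimize the string-to-number conversion: reduce calls to Integer.parseInt by building a Map<String, Integer> dictionary of the numeric strings 0~100 in advance. (A later JMH test showed that Integer.parseInt is still the fastest.)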
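
Direction 2 can be sketched in isolation: collect raw Date values while looping, and format only the distinct ones at the end. This is a simplified stand-in, not the author's code. FormatOnce is an invented name, the pattern "yyyyMMddHHmm" is only inferred from the name DateUtils.yyyyMMddHHmm, the UTC time zone is an assumption made to keep the output deterministic, and a TreeSet replaces the HashSet plus sorted() used in the post.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Date;
import java.util.List;
import java.util.Set;
import java.util.TimeZone;
import java.util.TreeSet;

public class FormatOnce {

    /** Formats each distinct time point exactly once, in ascending order. */
    public static List<String> formatDistinct(Collection<Date> points) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmm"); // assumed pattern
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));                // assumed zone
        Set<Date> distinct = new TreeSet<>(points);                  // de-duplicates and sorts
        List<String> out = new ArrayList<>(distinct.size());
        for (Date d : distinct) {
            out.add(fmt.format(d));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two apps reporting the same two minutes: four points, but only two format() calls.
        List<Date> points = Arrays.asList(
                new Date(0L), new Date(60_000L), new Date(0L), new Date(60_000L));
        System.out.println(formatDistinct(points));
    }
}
```

For a one-day query this reduces the number of format calls from one per data point (175,680 in the first trace) to one per distinct minute (at most 1440).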

    Code

        private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
            HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    
            List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
            if (CollectionUtils.isEmpty(heliosScoresRecord)) {
                return response;
            }
    
            Set<Date> dateSet = new HashSet<>();
    
            List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    
            Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
    
            for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
                HeliosScore heliosScore = scores.get(0);
                HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
                score.setNamespace(heliosScore.getNamespace());
                score.setAppId(heliosScore.getAppId());
                score.setScores(new ArrayList<>());
                response.getValues().add(score);
    
                List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
    
                // Use requestTime as the reference
                Calendar indexDate = DateUtils.roundDownMinute(request.getStartTime().getTime());
                int index = 0;
                // If timeFrom < requestTime, advance timeFrom up to requestTime
                while (indexDate.getTime().compareTo(heliosScore.getTimeFrom()) > 0) {
                    heliosScore.getTimeFrom().setTime(heliosScore.getTimeFrom().getTime() + 60_000);
                    index++;
                }
    
                while (indexDate.getTime().compareTo(request.getEndTime()) <= 0 && indexDate.getTime().compareTo(heliosScore.getTimeTo()) <= 0  && index < scoreIntList.size()) {
                    Integer scoreInt = scoreIntList.get(index++);
                    score.getScores().add(scoreInt);
                    dateSet.add(indexDate.getTime());
                    indexDate.add(Calendar.MINUTE, 1);
                }
            }
    
            response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
            return response;
        }
    

    Arthas Trace

    `---ts=2021-08-17 14:44:11;thread_name=http-nio-8080-exec-10;id=ab;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@16ea0f22
        `---[6997.005629ms] xxxService.controller.HeliosController:queryScores()
            +---[0.020032ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
            +---[0.007451ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
            +---[min=0.001054ms,max=7.458198ms,total=213.19538ms,count=170754] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
            +---[0.007267ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
            +---[15.255919ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
            +---[0.020045ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
            +---[0.015161ms] java.util.HashSet:<init>() #156
            +---[20.06713ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #158
            +---[0.043042ms] java.util.List:stream() #160
            +---[0.028232ms] java.util.stream.Collectors:groupingBy() #57
            +---[min=0.087087ms,max=1.931641ms,total=2.018728ms,count=2] java.util.stream.Stream:collect() #57
            +---[0.0151ms] java.util.Map:values() #162
            +---[0.019611ms] java.util.Collection:iterator() #57
            +---[min=7.55E-4ms,max=0.015165ms,total=0.201221ms,count=121] java.util.Iterator:hasNext() #57
            +---[min=0.001178ms,max=0.02477ms,total=0.220931ms,count=120] java.util.Iterator:next() #57
            +---[min=8.14E-4ms,max=0.01101ms,total=0.155044ms,count=120] java.util.List:get() #163
            +---[min=0.001049ms,max=0.009425ms,total=0.231297ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #164
            +---[min=0.001167ms,max=0.009721ms,total=0.194502ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #165
            +---[min=0.001222ms,max=0.020409ms,total=0.264791ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
            +---[min=0.001097ms,max=0.006475ms,total=0.169987ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #166
            +---[min=0.00121ms,max=0.007106ms,total=0.207877ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
            +---[min=8.63E-4ms,max=0.008981ms,total=0.176195ms,count=120] java.util.ArrayList:<init>() #167
            +---[min=0.001225ms,max=0.021948ms,total=0.340375ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
            +---[min=0.00112ms,max=0.008984ms,total=0.196212ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #168
            +---[min=7.64E-4ms,max=0.027237ms,total=154.660479ms,count=170753] java.util.List:add() #57
            +---[min=0.028779ms,max=0.237608ms,total=20.049731ms,count=120] xxxService.helios.HeliosHelper:splitScores() #170
            +---[min=0.001178ms,max=0.008102ms,total=0.199087ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #173
            +---[min=6.89E-4ms,max=0.048069ms,total=140.74298ms,count=170040] java.util.Date:getTime() #57
            +---[min=0.004686ms,max=0.03805ms,total=0.775394ms,count=120] xxxService.utils.DateUtils:roundDownMinute() #57
            +---[min=7.84E-4ms,max=7.562581ms,total=162.855553ms,count=170040] java.util.Calendar:getTime() #176
          2 +---[min=9.94E-4ms,max=0.029962ms,total=385.371864ms,count=339960] xxxService.helios.entity.HeliosScore:getTimeFrom() #57
          1 +---[min=7.76E-4ms,max=7.936578ms,total=483.361269ms,count=511428] java.util.Date:compareTo() #57
            +---[min=9.95E-4ms,max=0.077109ms,total=192.749805ms,count=169920] xxxService.helios.entity.HeliosScore:getTimeFrom() #177
            +---[min=6.94E-4ms,max=7.358942ms,total=151.184751ms,count=169920] java.util.Date:setTime() #57
            +---[min=7.67E-4ms,max=0.029244ms,total=152.500401ms,count=170753] java.util.Calendar:getTime() #181
            +---[min=7.65E-4ms,max=0.016336ms,total=151.879643ms,count=170635] java.util.Calendar:getTime() #182
            +---[min=0.001011ms,max=0.028133ms,total=196.192946ms,count=170635] xxxService.helios.entity.HeliosScore:getTimeTo() #57
            +---[min=6.93E-4ms,max=0.836104ms,total=141.443001ms,count=170635] java.util.List:size() #57
            +---[min=7.63E-4ms,max=7.940119ms,total=162.285955ms,count=170633] java.util.List:get() #183
          3 +---[min=0.001068ms,max=0.973964ms,total=209.721ms,count=170633] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #184
            +---[min=7.71E-4ms,max=0.028856ms,total=154.918574ms,count=170633] java.util.Calendar:getTime() #185
            +---[min=8.07E-4ms,max=8.030316ms,total=186.971072ms,count=170633] java.util.Set:add() #57
            +---[min=7.82E-4ms,max=0.034732ms,total=156.2645ms,count=170633] java.util.Calendar:add() #186
            +---[0.050615ms] java.util.ArrayList:<init>() #190
            +---[0.019114ms] java.util.ArrayList:stream() #57
            +---[0.029096ms] java.util.stream.Stream:sorted() #57
            +---[0.018823ms] java.util.stream.Stream:map() #57
            +---[0.009092ms] java.util.stream.Collectors:toList() #57
            `---[0.006768ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57
    

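    Analysis

    This step reduced the actual execution time by about 50 ms.

    The trace shows that the most expensive call is Date's compareTo, that is, the line if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0) in the code.

    More surprising is that even reading a property from an object through a getter has a measurable cost.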

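    Second optimization

    Optimization direction

    Based on the previous Arthas trace, the following changes were made:

    1. Compare long timestamps instead of Date objects.
    2. Replace the repeated getTime/setTime calls on Date objects with a long timestamp advanced by += 60_000, and call setTime only once after the result is known.
    3. Stop adding to Set<String> dateSet on every fill; use a flag so that the dates are added only once.
    4. The size of the score ArrayList is known after the first loop iteration, so later iterations create the ArrayList with that capacity, which reduces memory allocation.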
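
    The first two directions can be looked at in isolation with a small sketch. It is only an illustration, not the service code: the class MinuteCursor and its method names are made up here, and the only thing taken from the post is the 60,000 ms spacing between samples.

```java
import java.util.Date;

// Hypothetical helper for illustration; not part of the original service.
final class MinuteCursor {
    static final long STEP_MILLIS = 60_000L;

    // Counts the whole-minute steps needed until fromMillis is no longer
    // before startMillis, using long comparisons only (no Date.compareTo).
    static int stepsToReach(long fromMillis, long startMillis) {
        int steps = 0;
        while (startMillis > fromMillis) {
            fromMillis += STEP_MILLIS;
            steps++;
        }
        return steps;
    }

    // Aligns timeFrom to the first minute step that is not before start.
    // The Date is read once and written once, instead of once per step.
    static int alignTo(Date timeFrom, Date start) {
        int steps = stepsToReach(timeFrom.getTime(), start.getTime());
        timeFrom.setTime(timeFrom.getTime() + steps * STEP_MILLIS);
        return steps;
    }

    public static void main(String[] args) {
        Date from = new Date(0L);
        Date start = new Date(150_000L);
        int steps = alignTo(from, start);
        System.out.println(steps + " steps, aligned to " + from.getTime());
    }
}
```

    Whether this is actually faster in a given service should be confirmed with a trace or profiler run rather than assumed.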

    Code

        private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
            HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    
            List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
            if (CollectionUtils.isEmpty(heliosScoresRecord)) {
                return response;
            }
    
            Set<Date> dateSet = new HashSet<>();
            boolean isDateSetInitial = false;
            int scoreSize = 16;
    
            List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    
            Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
    
            for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
                HeliosScore heliosScore = scores.get(0);
                HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
                score.setNamespace(heliosScore.getNamespace());
                score.setAppId(heliosScore.getAppId());
                score.setScores(new ArrayList<>(scoreSize));
                response.getValues().add(score);
    
                List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
    
                // Use the request time as the baseline
                long indexDateMills = request.getStartTime().getTime();
                int index = 0;
                // If timeFrom < requestTime, advance timeFrom up to requestTime
                long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
                while (indexDateMills > heliosScoreTimeFromMills) {
                    heliosScoreTimeFromMills += 60_000;
                    index++;
                }
                heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);
    
                long requestEndTimeMills = request.getEndTime().getTime();
                long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
                // Loop while (current time <= request end time) && (current time <= data end time) && (index < number of data points)
                while (indexDateMills <= requestEndTimeMills && indexDateMills <= heliosScoreTimeToMills && index < scoreIntList.size()) {
                    score.getScores().add(scoreIntList.get(index++));
                    if (!isDateSetInitial) {
                        dateSet.add(new Date(indexDateMills));
                    }
                    indexDateMills += 60_000;
                }
                // Performance: avoid adding the same dates again
                isDateSetInitial = true;
                // Performance: presize the list to reduce resizing. The x1.1 leaves a small buffer in case the data counts differ.
                scoreSize = (int) (score.getScores().size() * 1.1);
            }
    
            response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
            return response;
        }
    

    Arthas Trace

    `---ts=2021-08-17 15:20:41;thread_name=http-nio-8080-exec-7;id=aa;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@14be750c
        `---[1411.395123ms] xxxService.controller.HeliosController:queryScores()
            +---[0.016102ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
            +---[0.019084ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
            +---[0.007879ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
            +---[0.006808ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
            +---[27.494178ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
            +---[0.02087ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
            +---[0.007694ms] java.util.HashSet:<init>() #156
            +---[19.990512ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
            +---[0.044161ms] java.util.List:stream() #162
            +---[0.025737ms] java.util.stream.Collectors:groupingBy() #57
            +---[min=0.079651ms,max=2.007048ms,total=2.086699ms,count=2] java.util.stream.Stream:collect() #57
            +---[0.018405ms] java.util.Map:values() #164
            +---[0.021408ms] java.util.Collection:iterator() #57
            +---[min=7.4E-4ms,max=0.015625ms,total=0.177657ms,count=121] java.util.Iterator:hasNext() #57
            +---[min=0.001193ms,max=0.026712ms,total=0.258491ms,count=120] java.util.Iterator:next() #57
            +---[min=7.69E-4ms,max=0.011855ms,total=0.158671ms,count=120] java.util.List:get() #165
            +---[min=0.001045ms,max=0.019788ms,total=0.232004ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
            +---[min=0.001072ms,max=0.007958ms,total=0.193652ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
            +---[min=0.001164ms,max=0.007796ms,total=0.201584ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
            +---[min=0.001048ms,max=0.007456ms,total=0.178323ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
            +---[min=0.001137ms,max=0.010225ms,total=0.201887ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
            +---[min=0.001627ms,max=0.010431ms,total=0.291395ms,count=120] java.util.ArrayList:<init>() #169
            +---[min=0.00116ms,max=0.0088ms,total=0.20171ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
            +---[min=0.001076ms,max=0.010293ms,total=0.199407ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
            +---[min=7.54E-4ms,max=0.086952ms,total=150.86682ms,count=170753] java.util.List:add() #57
            +---[min=0.020428ms,max=0.269554ms,total=19.477128ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
            +---[min=0.001092ms,max=0.005258ms,total=0.202045ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
            +---[min=7.09E-4ms,max=0.021027ms,total=0.630747ms,count=480] java.util.Date:getTime() #57
            +---[min=0.00106ms,max=0.015055ms,total=0.188439ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
            +---[min=0.001025ms,max=0.009712ms,total=0.171506ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
            +---[min=7.4E-4ms,max=0.092253ms,total=0.251068ms,count=120] java.util.Date:setTime() #57
            +---[min=0.001086ms,max=0.006234ms,total=0.184256ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
            +---[min=0.001036ms,max=0.012332ms,total=0.176491ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
          3 +---[min=6.73E-4ms,max=0.066785ms,total=135.009239ms,count=170635] java.util.List:size() #188
          1 +---[min=0.001085ms,max=0.089243ms,total=208.003309ms,count=170633] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #189
          2 +---[min=7.31E-4ms,max=0.070823ms,total=145.488732ms,count=170633] java.util.List:get() #57
            +---[min=0.001177ms,max=0.143546ms,total=2.319379ms,count=1440] java.util.Date:<init>() #191
            +---[min=0.001346ms,max=0.064411ms,total=2.839878ms,count=1440] java.util.Set:add() #57
            +---[min=0.001096ms,max=0.009059ms,total=0.190336ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #198
            +---[min=6.92E-4ms,max=0.016223ms,total=0.141751ms,count=120] java.util.List:size() #57
            +---[0.069753ms] java.util.ArrayList:<init>() #201
            +---[0.021066ms] java.util.ArrayList:stream() #57
            +---[0.029498ms] java.util.stream.Stream:sorted() #57
            +---[0.014089ms] java.util.stream.Stream:map() #57
            +---[0.013053ms] java.util.stream.Collectors:toList() #57
            `---[0.009818ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57
    

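    Analysis

    This step cut execution time by roughly another 80 ms, leaving about 160 ms.

    The trace shows three methods taking the most time:

    • getScores, which only returns a property and does nothing else, but the calls add up
    • list.size()
    • list.get(index)

    In other words, although these methods do almost nothing, the method calls and pointer dereferences themselves have a cost.

    Third optimization

    Optimization direction

    1. Reduce calls on the list property.
    2. Replace the repeated list.add calls with a single bulk insert via subList.

    In other words, the loop should do no expensive work and no pointer/reference chasing.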
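
    The second point can be illustrated on its own. The sketch below uses only java.util collections; the class BulkCopy and its methods are hypothetical names chosen for this example. It shows that one addAll over a subList view copies the same range as an element-by-element loop, provided the end index is treated as exclusive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original service.
final class BulkCopy {
    // Copies source[from, to) into target one element at a time:
    // one get() and one add() call per element.
    static void copyOneByOne(List<Integer> source, int from, int to, List<Integer> target) {
        for (int i = from; i < to; i++) {
            target.add(source.get(i));
        }
    }

    // Copies the same range with a single bulk call.
    // subList returns a view whose end index is exclusive; addAll copies its elements.
    static void copyInBulk(List<Integer> source, int from, int to, List<Integer> target) {
        target.addAll(source.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(90, 91, 92, 93, 94);
        List<Integer> a = new ArrayList<>();
        List<Integer> b = new ArrayList<>();
        copyOneByOne(scores, 1, 4, a);
        copyInBulk(scores, 1, 4, b);
        System.out.println(a.equals(b));
    }
}
```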

    Code

        private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
            HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    
            List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
            if (CollectionUtils.isEmpty(heliosScoresRecord)) {
                return response;
            }
    
            Set<Date> dateSet = new HashSet<>();
            boolean isDateSetInitial = false;
            int scoreSize = 16;
    
            List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    
            Map<String, List<HeliosScore>> groupByAppIdHeliosScores = heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
    
            for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
                HeliosScore heliosScore = scores.get(0);
                HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
                score.setNamespace(heliosScore.getNamespace());
                score.setAppId(heliosScore.getAppId());
                score.setScores(new ArrayList<>(scoreSize));
                response.getValues().add(score);
    
                List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
    
                // Use the request time as the baseline
                long indexDateMills = request.getStartTime().getTime();
                int index = 0;
                // If timeFrom < requestTime, advance timeFrom up to requestTime
                long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
                while (indexDateMills > heliosScoreTimeFromMills) {
                    heliosScoreTimeFromMills += 60_000;
                    index++;
                }
                heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);
    
                long requestEndTimeMills = request.getEndTime().getTime();
                long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
    
                // Loop while (current time <= request end time) && (current time <= data end time) && (index < number of data points)
                int scoreIntListSize = scoreIntList.size();
                int indexStart = index;
                while (indexDateMills <= requestEndTimeMills && indexDateMills <= heliosScoreTimeToMills && index < scoreIntListSize) {
                    // Increment inside the body so that index is always the exclusive end of the range,
                    // whichever condition ends the loop
                    index++;
                    if (!isDateSetInitial) {
                        dateSet.add(new Date(indexDateMills));
                    }
                    indexDateMills += 60_000;
                }
                score.getScores().addAll(scoreIntList.subList(indexStart, index));
                // Performance: avoid adding the same dates again
                isDateSetInitial = true;
                // Performance: presize the list to reduce resizing. The x1.1 leaves a small buffer in case the data counts differ.
                scoreSize = (int) (score.getScores().size() * 1.1);
            }
    
            response.setDates(new ArrayList<>(dateSet).stream().sorted().map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
            return response;
        }
    

    Arthas Trace

    `---ts=2021-08-17 15:33:40;thread_name=http-nio-8080-exec-11;id=f1;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@d1c5cf2
        `---[138.624811ms] xxxService.controller.HeliosController:queryScores()
            +---[0.021852ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
            +---[0.00746ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
            +---[0.005838ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
            +---[0.006341ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
        2   +---[15.227453ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
            +---[0.02168ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
            +---[0.008923ms] java.util.HashSet:<init>() #156
        1   +---[22.703926ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
            +---[0.047118ms] java.util.List:stream() #162
            +---[0.043183ms] java.util.stream.Collectors:groupingBy() #57
            +---[min=0.095654ms,max=2.183288ms,total=2.278942ms,count=2] java.util.stream.Stream:collect() #57
            +---[0.022906ms] java.util.Map:values() #164
            +---[0.025777ms] java.util.Collection:iterator() #57
            +---[min=9.28E-4ms,max=0.017187ms,total=0.261862ms,count=121] java.util.Iterator:hasNext() #57
            +---[min=9.88E-4ms,max=0.018901ms,total=0.280889ms,count=120] java.util.Iterator:next() #57
            +---[min=9.65E-4ms,max=0.014741ms,total=0.262695ms,count=120] java.util.List:get() #165
            +---[min=0.001215ms,max=0.013928ms,total=0.347762ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
            +---[min=0.001253ms,max=0.010855ms,total=0.328842ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
            +---[min=0.001316ms,max=0.014714ms,total=0.372553ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
            +---[min=0.001211ms,max=0.010511ms,total=0.322723ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
            +---[min=0.00132ms,max=0.010201ms,total=0.334627ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
            +---[min=0.00116ms,max=0.014504ms,total=0.386879ms,count=120] java.util.ArrayList:<init>() #169
            +---[min=0.00131ms,max=0.014072ms,total=0.344922ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
            +---[min=0.001261ms,max=0.017312ms,total=0.356444ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
            +---[min=9.73E-4ms,max=0.016531ms,total=0.275794ms,count=120] java.util.List:add() #57
         3  +---[min=0.023208ms,max=19.808819ms,total=47.196601ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
            +---[min=0.001289ms,max=0.009578ms,total=0.36878ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
            +---[min=8.85E-4ms,max=0.016405ms,total=0.994157ms,count=480] java.util.Date:getTime() #57
            +---[min=0.001238ms,max=0.016801ms,total=0.34399ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
            +---[min=0.001217ms,max=0.008931ms,total=0.316197ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
            +---[min=9.14E-4ms,max=0.015929ms,total=0.277078ms,count=120] java.util.Date:setTime() #57
            +---[min=0.001238ms,max=0.01061ms,total=0.3375ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
            +---[min=0.001225ms,max=0.018059ms,total=0.315198ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
            +---[min=8.79E-4ms,max=0.022669ms,total=0.272356ms,count=120] java.util.List:size() #189
            +---[min=0.002001ms,max=0.056977ms,total=4.32853ms,count=1440] java.util.Date:<init>() #193
            +---[min=0.002174ms,max=0.040594ms,total=4.594415ms,count=1440] java.util.Set:add() #57
            +---[min=0.001302ms,max=0.012925ms,total=0.353165ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #197
            +---[min=0.001004ms,max=0.033424ms,total=0.338294ms,count=120] java.util.List:subList() #57
            +---[min=0.004871ms,max=0.051046ms,total=2.945263ms,count=120] java.util.List:addAll() #57
            +---[min=0.001291ms,max=0.009831ms,total=0.314292ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #201
            +---[min=8.84E-4ms,max=0.018168ms,total=0.249321ms,count=120] java.util.List:size() #57
            +---[0.054305ms] java.util.ArrayList:<init>() #204
            +---[0.024481ms] java.util.ArrayList:stream() #57
            +---[0.028717ms] java.util.stream.Stream:sorted() #57
            +---[0.013725ms] java.util.stream.Stream:map() #57
            +---[0.0128ms] java.util.stream.Collectors:toList() #57
            `---[0.007166ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57
    

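    Analysis

    This step saved about another 100 ms, leaving about 60 ms.

    The trace now shows only three expensive operations:

    • Querying the database
    • Merging the data
    • Splitting the score string "100,100,100" into the int array [100,100,100]

    Fourth optimization

    Optimization direction

    1. In the database query, an imprecise SQL condition returned an extra record each time, which doubled the number of iterations in the later loop.
    2. The merge step can filter out single-record cases directly, which reduces overhead.

    Code

    1. Fixed and verified the SQL to reduce the amount of data returned.
    2. Skipped the merge logic when there is only a single record.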
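
    The post does not show the changed merge code, so the following is only a guess at the shape of the single-record fast path. Everything in it is hypothetical: ScoreRecord stands in for HeliosScore, MergeSketch.mergeData for HeliosDataMergeJob.mergeData, and the merge rule (concatenating score strings) is an assumption made purely so the example runs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a score record; the real HeliosScore is not shown in the post.
final class ScoreRecord {
    final String appId;
    final String scores; // comma-separated, e.g. "100,100,100"

    ScoreRecord(String appId, String scores) {
        this.appId = appId;
        this.scores = scores;
    }
}

final class MergeSketch {
    // Groups records by appId and merges each group into one record.
    // A group holding a single record is passed through untouched,
    // so no merge work is done for it.
    static List<ScoreRecord> mergeData(List<ScoreRecord> records) {
        Map<String, List<ScoreRecord>> byApp = new LinkedHashMap<>();
        for (ScoreRecord r : records) {
            byApp.computeIfAbsent(r.appId, k -> new ArrayList<>()).add(r);
        }
        List<ScoreRecord> merged = new ArrayList<>(byApp.size());
        for (List<ScoreRecord> group : byApp.values()) {
            if (group.size() == 1) {
                merged.add(group.get(0)); // fast path: nothing to merge
            } else {
                merged.add(mergeGroup(group));
            }
        }
        return merged;
    }

    // Assumed merge rule, for illustration only: concatenate the score strings in order.
    private static ScoreRecord mergeGroup(List<ScoreRecord> group) {
        StringBuilder sb = new StringBuilder();
        for (ScoreRecord r : group) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(r.scores);
        }
        return new ScoreRecord(group.get(0).appId, sb.toString());
    }

    public static void main(String[] args) {
        List<ScoreRecord> input = new ArrayList<>();
        input.add(new ScoreRecord("app-a", "100,99"));
        input.add(new ScoreRecord("app-b", "1"));
        input.add(new ScoreRecord("app-b", "2"));
        for (ScoreRecord r : mergeData(input)) {
            System.out.println(r.appId + " -> " + r.scores);
        }
    }
}
```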

    Arthas Trace

    `---ts=2021-08-17 16:03:24;thread_name=http-nio-8080-exec-13;id=f1;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@69e2fe3b
        `---[38.171379ms] xxxService.controller.HeliosController:queryScores()
            +---[0.009463ms] xxxService.model.helios.HeliosGetScoreResponse:<init>() #149
            +---[0.00348ms] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #151
            +---[0.003233ms] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #57
            +---[0.003395ms] xxxService.model.helios.HeliosGetScoreRequest:getFilterByAppId() #57
         1  +---[10.157226ms] xxxService.service.HeliosService:queryScoresTimeBetween() #57
            +---[0.009989ms] org.apache.commons.collections.CollectionUtils:isEmpty() #152
            +---[0.003394ms] java.util.HashSet:<init>() #156
            +---[0.083535ms] xxxService.helios.jobs.HeliosDataMergeJob:mergeData() #160
            +---[0.017819ms] java.util.List:stream() #162
            +---[0.011787ms] java.util.stream.Collectors:groupingBy() #57
            +---[min=0.047561ms,max=2.02786ms,total=2.075421ms,count=2] java.util.stream.Stream:collect() #57
            +---[0.015525ms] java.util.Map:values() #164
            +---[0.021965ms] java.util.Collection:iterator() #57
            +---[min=7.25E-4ms,max=0.009733ms,total=0.115783ms,count=121] java.util.Iterator:hasNext() #57
            +---[min=8.43E-4ms,max=0.011422ms,total=0.142771ms,count=120] java.util.Iterator:next() #57
            +---[min=7.81E-4ms,max=0.010883ms,total=0.128809ms,count=120] java.util.List:get() #165
            +---[min=0.001023ms,max=0.004301ms,total=0.150165ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:<init>() #166
            +---[min=0.001066ms,max=0.004648ms,total=0.154698ms,count=120] xxxService.helios.entity.HeliosScore:getNamespace() #167
            +---[min=0.001137ms,max=0.005607ms,total=0.170279ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setNamespace() #57
            +---[min=0.001023ms,max=0.004292ms,total=0.151767ms,count=120] xxxService.helios.entity.HeliosScore:getAppId() #168
            +---[min=0.001105ms,max=0.004701ms,total=0.164955ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setAppId() #57
            +---[min=0.001359ms,max=0.007931ms,total=0.233665ms,count=120] java.util.ArrayList:<init>() #169
            +---[min=0.001117ms,max=0.00785ms,total=0.168539ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:setScores() #57
            +---[min=0.001073ms,max=0.004488ms,total=0.156654ms,count=120] xxxService.model.helios.HeliosGetScoreResponse:getValues() #170
            +---[min=7.98E-4ms,max=0.00977ms,total=0.125818ms,count=120] java.util.List:add() #57
            +---[min=0.022304ms,max=0.12093ms,total=8.88628ms,count=120] xxxService.helios.HeliosHelper:splitScores() #172
            +---[min=0.001092ms,max=0.004967ms,total=0.161288ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getStartTime() #175
            +---[min=7.02E-4ms,max=0.012136ms,total=0.467786ms,count=480] java.util.Date:getTime() #57
            +---[min=0.001022ms,max=0.004944ms,total=0.151353ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #178
            +---[min=0.001018ms,max=0.004731ms,total=0.148025ms,count=120] xxxService.helios.entity.HeliosScore:getTimeFrom() #183
            +---[min=7.3E-4ms,max=0.009359ms,total=0.120588ms,count=120] java.util.Date:setTime() #57
            +---[min=0.00107ms,max=0.008948ms,total=0.162848ms,count=120] xxxService.model.helios.HeliosGetScoreRequest:getEndTime() #185
            +---[min=0.001034ms,max=0.014003ms,total=0.158614ms,count=120] xxxService.helios.entity.HeliosScore:getTimeTo() #186
            +---[min=6.99E-4ms,max=0.009995ms,total=0.11179ms,count=120] java.util.List:size() #189
            +---[min=6.95E-4ms,max=0.005468ms,total=1.116308ms,count=1440] java.util.Date:<init>() #193
            +---[min=7.79E-4ms,max=0.029909ms,total=1.407528ms,count=1440] java.util.Set:add() #57
            +---[min=0.001097ms,max=0.008616ms,total=0.160597ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #197
            +---[min=8.23E-4ms,max=0.0294ms,total=0.153353ms,count=120] java.util.List:subList() #57
            +---[min=0.005771ms,max=0.04465ms,total=1.992151ms,count=120] java.util.List:addAll() #57
            +---[min=0.001098ms,max=0.007013ms,total=0.169555ms,count=120] xxxService.model.helios.HeliosGetScoreResponse$Score:getScores() #201
            +---[min=7.04E-4ms,max=0.01315ms,total=0.120998ms,count=120] java.util.List:size() #57
            +---[0.197732ms] java.util.ArrayList:<init>() #204
            +---[0.018589ms] java.util.ArrayList:stream() #57
            +---[0.025192ms] java.util.stream.Stream:sorted() #57
            +---[0.012544ms] java.util.stream.Stream:map() #57
            +---[0.012188ms] java.util.stream.Collectors:toList() #57
            `---[0.0067ms] xxxService.model.helios.HeliosGetScoreResponse:setDates() #57
    

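    Analysis

    The most expensive part is now finally the database query. Querying a full day of data takes only about 25 to 40 ms.

    Result

    Judging from the call chain, the application code still needs a dozen or so milliseconds, mostly for converting the strings to int[]. This step could be optimized further.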
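
    One possible direction for that remaining step is to parse the score string by hand instead of going through String.split and boxed Integers. The sketch below is an assumption about what such a parser could look like, not the author's HeliosHelper.splitScores; the class name ScoreParser is hypothetical, and the code assumes well-formed input made of digits and commas only.

```java
// Hypothetical parser for illustration; not the HeliosHelper.splitScores from the post.
final class ScoreParser {
    // Parses a comma-separated list of non-negative integers such as "100,100,100"
    // in two linear passes, without String.split, substrings, or boxing.
    // Assumes well-formed input: digits and commas only, no spaces or signs.
    static int[] parse(String csv) {
        if (csv == null || csv.isEmpty()) {
            return new int[0];
        }
        // First pass: count the values so the array is allocated exactly once.
        int count = 1;
        for (int i = 0; i < csv.length(); i++) {
            if (csv.charAt(i) == ',') {
                count++;
            }
        }
        // Second pass: accumulate digits into the current value until a comma.
        int[] result = new int[count];
        int pos = 0;
        int value = 0;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == ',') {
                result[pos++] = value;
                value = 0;
            } else {
                value = value * 10 + (c - '0');
            }
        }
        result[pos] = value;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(parse("100,100,100")));
    }
}
```

    As with the earlier steps, any gain should be checked with another trace before the change is kept.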

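    Conclusion

    This round of optimization leads to a few conclusions:

    1. Create as few objects as possible.
    2. SimpleDateFormat is expensive.
    3. Date.compareTo is not cheap.
    4. Even the simplest operations, such as list.size() and list.add, add up to a noticeable cost when called many times.
    5. Performance analysis and optimization need the right tools to reach useful conclusions and make targeted changes. At first I assumed that reducing object creation would be enough, but the bulk of the cost was elsewhere. It took the Arthas trace command to optimize in a targeted way.

    I also strongly recommend the article 动态追踪技术漫谈 (a discussion of dynamic tracing techniques).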

    user-case 
    opened by liuzhiguo630 32
  • Alibaba Arthas 3.1.5 supports flame graphs for quickly locating application hotspots

    Alibaba Arthas 3.1.5 supports flame graphs for quickly locating application hotspots

    Arthas

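    Arthas is an open source Java diagnostic tool from Alibaba that is popular with developers.

    • Github: https://github.com/alibaba/arthas
    • Documentation: https://alibaba.github.io/arthas

    Arthas 3.1.5 brings the following new features:

    • An out-of-the-box profiler and flame graph feature
    • Richer options for the grep command
    • More accurate timing in commands such as monitor/tt/trace
    • Telnet and HTTP share port 3658

    Profiler/Flame Graph

    Most people have heard of flame graphs, but many are put off because they seem complicated to use.

    The new version of Arthas integrates async-profiler. The profiler command makes it easy to generate a flame graph and view it directly in the browser.

    • profiler command wiki: https://alibaba.github.io/arthas/profiler.html

    The basic form of the profiler command is profiler action [actionArg]. Its usage is described below.

    Start the profiler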

    $ profiler start
    Started [cpu] profiling
    

    By default, a CPU flame graph is generated, i.e. the event is cpu. Use the --event parameter to specify a different event.

    Get the number of samples collected

    $ profiler getSamples
    23
    

    Check the profiler status

    $ profiler status
    [cpu] profiling is running for 4 seconds
    

    This shows which event the profiler is currently sampling and how long it has been sampling.

    Generate the result in SVG format

    $ profiler stop
    profiler output file: /tmp/demo/arthas-output/20191125-135546.svg
    OK
    

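    By default, the result is saved in the arthas-output directory under the application's working directory.

    View the profiler results under arthas-output in the browser

    By default, Arthas uses port 3658, so you can open http://localhost:3658/arthas-output/ to see the profiler results in the arthas-output directory.

    Click an entry to view the detailed result.

    In Chrome, you may need to refresh several times.

    Richer options for the grep command

    The standard Linux grep command supports a rich set of options that make it easy to locate a result together with its context.

    The new grep command supports more of the standard options. Some examples: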

    sysprop | grep java
    sysprop | grep java -n
    sysenv | grep -v JAVA
    sysenv | grep -e "(?i)(JAVA|sun)" -m 3  -C 2
    sysenv | grep JAVA -A2 -B3
    thread | grep -m 10 -e  "TIMED_WAITING|WAITING"
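
    As a rough mental model of what -e, -v, -m and -n do in the examples above, here is a small Java sketch built on java.util.regex. It is not Arthas's implementation: the class GrepSketch, its method signature, and the "n:" line-number prefix are assumptions for illustration, and the context options (-A/-B/-C) are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of a few grep options; not taken from the Arthas source.
final class GrepSketch {
    // Returns the lines that match the regex (or, with invert, the lines that do not),
    // stopping after maxCount hits; maxCount <= 0 means no limit.
    // With lineNumbers, each hit is prefixed with its 1-based line number.
    static List<String> grep(List<String> lines, String regex, boolean invert,
                             int maxCount, boolean lineNumbers) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (maxCount > 0 && hits.size() >= maxCount) {
                break;
            }
            String line = lines.get(i);
            boolean found = pattern.matcher(line).find();
            if (found != invert) {
                hits.add(lineNumbers ? (i + 1) + ":" + line : line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> env = Arrays.asList(
                "JAVA_HOME=/opt/jdk", "PATH=/usr/bin", "java.version=1.8", "sun.arch=64");
        // Case-insensitive match via the inline (?i) flag, at most two hits, with line numbers.
        System.out.println(grep(env, "(?i)(JAVA|sun)", false, 2, true));
        // Inverted, case-sensitive match: lines that do not contain "JAVA".
        System.out.println(grep(env, "JAVA", true, 0, false));
    }
}
```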
    

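    Thanks to @qxo from the community for this contribution.

    Telnet and HTTP share port 3658

    By default, Arthas uses port 3658 for Telnet and port 8563 for HTTP, which often confuses users. In the new version, port 3658 supports both the Telnet and HTTP protocols.

    The Web Console can now also be reached in the browser at http://localhost:3658/.

    In later versions, we are considering listening only on port 3658 by default to reduce the number of configuration items.

    More accurate timing in commands such as monitor/tt/trace

    One of the more frequent complaints about Arthas was the large timing error in commands such as monitor/tt/trace. Previously only a single int was used to store the time, which made it imprecise.

    The new version stores the data in an efficient stack instead, which greatly improves timing accuracy. Feedback on the results is welcome.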
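
    To make the idea of a stack of timestamps concrete, here is a minimal sketch. It is not the Arthas implementation, which the post only describes in one sentence; the class InvokeTimer and its methods are hypothetical. Each method entry pushes a start time and each exit pops the matching one, so nested calls are timed independently. A real tracer would keep one such stack per thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical timer for illustration; not taken from the Arthas source.
final class InvokeTimer {
    // Start timestamps (in nanoseconds) of the invocations currently in progress,
    // with the innermost invocation on top.
    private final Deque<Long> starts = new ArrayDeque<>();

    // Call when a traced method is entered.
    void enter(long nowNanos) {
        starts.push(nowNanos);
    }

    // Call when the innermost traced method exits; returns its cost in milliseconds.
    double exit(long nowNanos) {
        long begin = starts.pop();
        return (nowNanos - begin) / 1_000_000.0;
    }

    public static void main(String[] args) {
        InvokeTimer timer = new InvokeTimer();
        timer.enter(System.nanoTime()); // outer method starts
        timer.enter(System.nanoTime()); // inner method starts
        double inner = timer.exit(System.nanoTime());
        double outer = timer.exit(System.nanoTime());
        System.out.println("inner=" + inner + "ms outer=" + outer + "ms");
    }
}
```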

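    Thanks to @huangjIT from the community for this contribution.

    Summary

    In short, Arthas 3.1.5 introduces an out-of-the-box profiler and flame graph feature. You are welcome to try it and give feedback.

    • A reference article on flame graphs: https://openresty.org/posts/dynamic-tracing/
    • Release Note: https://github.com/alibaba/arthas/releases/tag/arthas-all-3.1.5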

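    Finally, Arthas is taking part in the 2019 Most Popular Open Source Software in China poll and badly needs your vote!

    Two books will be raffled off in the comments: openshift

    Raffle requirements:

    • Vote for Arthas: https://www.oschina.net/p/arthas
    • Post a screenshot of your vote in the comments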
    opened by hengyunabc 31
  • Provide a separate tutorial for each command / Reorganize the Arthas online tutorials, one short tutorial per command

    Provide a separate tutorial for each command / Reorganize the Arthas online tutorials, one short tutorial per command

    The online tutorial currently has two parts, basic and advanced:

    https://alibaba.github.io/arthas/arthas-tutorials?language=en&id=arthas-basics
    https://alibaba.github.io/arthas/arthas-tutorials?language=en&id=arthas-advanced

    • Each tutorial is already long, and adding new content will only make it longer.
    • Users tend to lose patience.
    • It's hard to find what you need.
    • Most users want to go straight to the usage of a specific command.

    So consider:

    • Providing a separate tutorial for each command, such as one for the watch command, so that many tips can be included.
    • Providing one tutorial per case, to make it easier for users to learn to use Arthas.

    What we need to do:

    1. Register an account at https://www.katacoda.com/ and write a tutorial for a single Arthas command.
    2. Test the tutorial, then send a PR to https://github.com/alibaba/arthas/tree/master/tutorials/katacoda
    3. Update the web page: https://github.com/alibaba/arthas/blob/master/site/src/site/sphinx/_include_html/arthas-tutorials.html

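    The Chinese online tutorial likewise has two parts, basic and advanced:

    • https://alibaba.github.io/arthas/arthas-tutorials?language=cn&id=arthas-basics
    • https://alibaba.github.io/arthas/arthas-tutorials?language=cn&id=arthas-advanced

    Results of an earlier survey: https://github.com/alibaba/arthas/issues/742

    Problems with the current tutorials:

    • Each tutorial is fairly long, and adding new content makes it longer still
    • Users tend to lose patience
    • It is hard to find the content you need
    • Most users want to go straight to the usage of a specific command

    So we are considering:

    • One tutorial per command, for example a separate tutorial for the watch command, so that many small tips can be included
    • One tutorial per case, so that users can learn by matching their own problem

    To do:

    • Split up the tutorials
    • Submit a PR for each single command to: https://github.com/alibaba/arthas/tree/master/tutorials/katacoda
    • Update the existing web page: https://github.com/alibaba/arthas/blob/master/site/src/site/sphinx/_include_html/arthas-tutorials.html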
    help wanted SoC2020 
    opened by hengyunabc 31
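  • Alibaba Arthas in practice: get the Spring context, then do whatever you like

    Alibaba Arthas in practice: get the Spring context, then do whatever you like

    Background

    Arthas is an open source Java diagnostic tool from Alibaba that is popular with developers.

    • https://github.com/alibaba/arthas

    Arthas provides a rich set of commands for intercepting invocations, such as trace/watch/monitor/tt. When troubleshooting, however, we often need more clues than just a method's parameters and return value. In a Spring application, for example, we may want to reach other beans in the Spring context. Once any Spring bean can be obtained at will, we can "do whatever we like".

    The following shows how to use Arthas to get the Spring context.

    Demo: https://github.com/hengyunabc/spring-boot-inside/tree/master/demo-arthas-spring-boot

    Arthas quick start: https://alibaba.github.io/arthas/quick-start.html

    Use the tt command to get the Spring context

    The demo is a Spring MVC application. Each request passes through a series of Spring beans, so we can intercept requests inside Spring MVC's own classes.

    Start the demo: mvn spring-boot:run

    After attaching with Arthas, run the tt command to record calls to RequestMappingHandlerAdapter#invokeHandlerMethod: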

    tt -t org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter invokeHandlerMethod
    

    Then visit a web page: http://localhost:8080/

    Arthas intercepts the call, assigns it index 1000, and prints:

    $ tt -t org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter invokeHandlerMethod
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 101 ms.
     INDEX  TIMESTAMP         COST(ms  IS-RE  IS-EX  OBJECT       CLASS                     METHOD
                              )        T      P
    ------------------------------------------------------------------------------------------------------------------
     1000   2019-01-27 16:31  3.66744  true   false  0x4465cf70   RequestMappingHandlerAda  invokeHandlerMethod
            :54                                                   pter
    

    So how do we get the Spring context?

    Use the -i option of the tt command to specify the index, and the -w option to evaluate an OGNL expression that returns the Spring context:

    $ tt -i 1000 -w 'target.getApplicationContext()'
    @AnnotationConfigEmbeddedWebApplicationContext[
        reader=@AnnotatedBeanDefinitionReader[org.springframework.context.annotation.AnnotatedBeanDefinitionReader@35dc90ec],
        scanner=@ClassPathBeanDefinitionScanner[org.springframework.context.annotation.ClassPathBeanDefinitionScanner@72078a14],
        annotatedClasses=null,
        basePackages=null,
    ]
    Affect(row-cnt:1) cost in 7 ms.
    

    Get any bean from the Spring context

    With the Spring context in hand, any bean can be obtained. For example, get helloWorldService and call its getHelloMessage() method:

    $ tt -i 1000 -w 'target.getApplicationContext().getBean("helloWorldService").getHelloMessage()'
    @String[Hello World]
    Affect(row-cnt:1) cost in 5 ms.
    

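    More ideas

    Much code contains static methods or static holder classes, and following these leads gives access to many other objects. For example, in Dubbo the Spring context can be obtained through SpringExtensionFactory: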

    $ ognl '#context=@com.alibaba.dubbo.config.spring.extension.SpringExtensionFactory@contexts.iterator.next, 
    #context.getBean("userServiceImpl").findUser(1)'
    @User[
        id=@Integer[1],
        name=@String[Deanna Borer],
    ]
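
    The static holder idea can be shown without Spring or Dubbo. In the sketch below every name is hypothetical: DemoContext stands in for an application context, DemoContextHolder for a class like SpringExtensionFactory that keeps contexts in a static field, and HolderDemo.lookup is roughly the Java equivalent of an OGNL chain such as @DemoContextHolder@contexts.iterator.next followed by getBean.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an application context: a registry of named beans.
final class DemoContext {
    private final Map<String, Object> beans = new ConcurrentHashMap<>();

    void register(String name, Object bean) {
        beans.put(name, bean);
    }

    Object getBean(String name) {
        return beans.get(name);
    }
}

// Hypothetical static holder: anything reachable from a static field like this
// can be reached from outside without holding a reference to an instance.
final class DemoContextHolder {
    static final Set<DemoContext> contexts = ConcurrentHashMap.newKeySet();

    static void addContext(DemoContext context) {
        contexts.add(context);
    }
}

final class HolderDemo {
    // Takes the first registered context from the static field and looks up a bean by name.
    static Object lookup(String beanName) {
        DemoContext context = DemoContextHolder.contexts.iterator().next();
        return context.getBean(beanName);
    }

    public static void main(String[] args) {
        DemoContext context = new DemoContext();
        context.register("userServiceImpl", "a user service bean");
        DemoContextHolder.addContext(context);
        System.out.println(lookup("userServiceImpl"));
    }
}
```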
    

    Links

    • Arthas: https://github.com/alibaba/arthas
    • https://alibaba.github.io/arthas/tt.html
    • https://alibaba.github.io/arthas/ognl.html
    user-case 
    opened by hengyunabc 29
  • Unable to open socket file: target process not responding or HotSpot VM not loaded

    Unable to open socket file: target process not responding or HotSpot VM not loaded

    • [ ] I have searched the issues and found no duplicate.

    Environment

    • arthas-boot.jar or as.sh version: 3.0.5
    • Arthas version: 3.0.5
    • OS version: CentOS release 6.9
    • Java version: 1.8.0_151
    [root@localhost arthas]# java -jar arthas-boot.jar 
    * [1]: 2705 jar
      [2]: 40018 jar
      [3]: 105193 jar
      [4]: 3036 Bootstrap
      [5]: 39807 jar
    2
    [INFO] arthas home: /root/.arthas/lib/3.0.5/arthas
    [INFO] Try to attach process 40018
    com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
    	at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
    	at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
    	at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
    	at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:72)
    	at com.taobao.arthas.core.Arthas.<init>(Arthas.java:25)
    	at com.taobao.arthas.core.Arthas.main(Arthas.java:99)
    [ERROR] Start arthas failed, exception stack trace: 
    [ERROR] attach fail, targetPid: 40018
    
    
    question-answered 
    opened by 11238390 24
  • Call for document translation

    Call for document translation

    The current documentation is written only in Chinese. I am planning to translate it into English.

    Is any volunteer willing to help with this?

    | Table of contents | Assignee | Status |
    | ----------------- | -------- | ------ |
    | README | @ralf0131 | DONE |
    | Installation | @Hearen | DONE |
    | Quick start | @Hearen | DONE |
    | Advanced usage | @Hearen | DONE |
    | Commands | @Hearen | DONE |
    | Commands/dashboard | @Hearen | DONE |
    | Commands/thread | @Hearen | DONE |
    | Commands/jvm | @Hearen | DONE |
    | Commands/sysprop | @Hearen | DONE |
    | Commands/getstatic | @Hearen | DONE |
    | Commands/sc | @Hearen | DONE |
    | Commands/sm | @Hearen | DONE |
    | Commands/dump | @Hearen | DONE |
    | Commands/jad | @Hearen | DONE |
    | Commands/classloader | @Hearen | DONE |
    | Commands/redefine | @Hearen | DONE |
    | Commands/monitor | @Hearen | DONE |
    | Commands/watch | @Hearen | DONE |
    | Commands/trace | @Hearen | DONE |
    | Commands/stack | @Hearen | DONE |
    | Commands/tt | @Hearen | DONE |
    | Commands/options | @alfredzouang | DONE |
    | Commands/Basic commands | @Hearen | DONE |
    | User cases | | |
    | Release note | @Hearen | Done #138 |
    | Questions and Answers | | |
    | CONTRIBUTING | @Hearen | Done #139 |

    help wanted 
    opened by ralf0131 23
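  • [Arthas Troubleshooting Collection] Making good use of OGNL expressions

    [Arthas Troubleshooting Collection] Making good use of OGNL expressions

    Preface

    Arthas 3.0 replaced Groovy with OGNL for expression evaluation, which removes the potential memory leaks that Groovy could cause. Used flexibly, OGNL expressions can greatly speed up troubleshooting.

    Official OGNL documentation: https://commons.apache.org/proper/commons-ognl/language-guide.html

    A test application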

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;
    
    /**
     * @author zhuyong on 2017/9/13.
     */
    public class Test {
    
        public static final Map m = new HashMap<>();
        public static final Map n = new HashMap<>();
    
        static {
            m.put("a", "aaa");
            m.put("b", "bbb");
    
            n.put(Type.RUN, "aaa");
            n.put(Type.STOP, "bbb");
        }
    
        public static void main(String[] args) throws InterruptedException {
            List<Pojo> list = new ArrayList<>();
    
            for (int i = 0; i < 40; i ++) {
                Pojo pojo = new Pojo();
                pojo.setName("name " + i);
                pojo.setAge(i + 2);
    
                list.add(pojo);
            }
    
            while (true) {
                int random = new Random().nextInt(40);
    
                String name = list.get(random).getName();
                list.get(random).setName(null);
    
                test(list);
    
                list.get(random).setName(name);
    
            Thread.sleep(1000L);
            }
        }
    
        public static void test(List<Pojo> list) {
    
        }
    
        public static void invoke(String a) {
            System.out.println(a);
        }
    
        static class Pojo {
            String name;
            int age;
            String hobby;
    
            public String getName() {
                return name;
            }
    
            public void setName(String name) {
                this.name = name;
            }
    
            public int getAge() {
                return age;
            }
    
            public void setAge(int age) {
                this.age = age;
            }
    
            public String getHobby() {
                return hobby;
            }
    
            public void setHobby(String hobby) {
                this.hobby = hobby;
            }
        }
    }
    
    public enum Type {
        RUN, STOP;
    }
    

    View the first parameter

    params is the argument list. It is an array, so its elements can be accessed directly by index.

    $ watch Test test params[0] -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 26 ms.
    @ArrayList[
        @Pojo[Test$Pojo@6e2c634b],
        @Pojo[Test$Pojo@37a71e93],
        @Pojo[Test$Pojo@7e6cbb7a],
        ...
    ]
    

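    Here -n 1 means the result is printed only once.

    View an element of the list

    The first parameter is a List. To see the first Pojo object in it, use either index access or the List's get method.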

    $ watch Test test params[0][0] -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 14 ms.
    @Pojo[
        name=@String[name 0],
        age=@Integer[2],
        hobby=null,
    ]
    
    $ watch Test test params[0].get(0) -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 14 ms.
    @Pojo[
        name=@String[name 0],
        age=@Integer[2],
        hobby=null,
    ]
    
    

    View a property of the Pojo

    Once you have the Pojo, you can access its properties directly, for example age:

    $ watch Test test params[0].get(0).age -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 21 ms.
    @Integer[2]
    

    You can also use index syntax, params[0][0]["age"], which is equivalent to params[0][0].age

    $ watch Test test params[0][0]["age"] -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 53 ms.
    watch failed, condition is: null, express is: params[0][0][age], ognl.NoSuchPropertyException: com.taobao.arthas.core.advisor.Advice.age, visit /Users/wangtao/logs/arthas/arthas.log for more details.
    

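    But this fails: as the reported expression params[0][0][age] shows, the inner double quotes are lost when the command line is parsed. In that case, wrap the whole expression in another pair of quotes: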

    $ watch Test test 'params[0][0]["age"]' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 25 ms.
    @Integer[2]
    

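    Collection projection

    Sometimes only one property of each object in a collection is needed. Projection does this: to extract just the name property from the list of Pojo objects, use the expression params[0].{name}. OGNL iterates over the params[0] List, takes the name property of each object, and assembles the results into a new list. This is the counterpart of the map function in Java streams.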

    $ watch Test test params[0].{name} -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 56 ms.
    @ArrayList[
        @String[name 0],
        @String[name 1],
        @String[name 2],
        @String[name 3],
        null,
        @String[name 5],
        @String[name 6],
        @String[name 7],
        @String[name 8],
        @String[name 9],
    ]
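
For readers more familiar with Java than OGNL, the projection above can be sketched in plain Java. This only illustrates the map analogy and is not how Arthas or OGNL evaluate the expression; the class ProjectionSketch and its simplified Pojo are hypothetical stand-ins for the test application.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ProjectionSketch {

    // Hypothetical, simplified stand-in for Test.Pojo from the test application.
    static class Pojo {
        final String name;
        final int age;

        Pojo(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() {
            return name;
        }
    }

    // Plain-Java counterpart of the OGNL projection params[0].{name}:
    // take one property from every element and collect the results into a new list.
    static List<String> projectNames(List<Pojo> pojos) {
        return pojos.stream().map(Pojo::getName).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One name is null, as in the test application while test() is running.
        List<Pojo> pojos = Arrays.asList(
                new Pojo("name 0", 2), new Pojo(null, 3), new Pojo("name 2", 4));
        System.out.println(projectNames(pojos));
    }
}
```

As in the watch output, a null name stays in the projected list rather than being dropped.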
    

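    Collection filtering

    Sometimes a collection needs to be filtered by a condition. For example, to find the names of all Pojo objects whose age is greater than 5, write: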

    $ watch Test test "params[0].{? #this.age > 5}.{name}" -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 25 ms.
    @ArrayList[
        @String[name 4],
        @String[name 5],
        @String[name 6],
        null,
        @String[name 8],
        @String[name 9],
    ]
    

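    Here {? #this.age > 5} corresponds to filter in a stream, and the trailing {name} corresponds to map.

    What if you only want the name of the first Pojo whose age is greater than 5? Use ^ or $ in place of ? to match the first or the last element, like this: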

    $ watch Test test "params[0].{^ #this.age > 5}.{name}" -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 24 ms.
    @ArrayList[
        @String[name 4],
    ]
    Command hit execution time limit 1, therefore will be aborted.
    $ watch Test test "params[0].{$ #this.age > 5}.{name}" -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 43 ms.
    @ArrayList[
        @String[name 9],
    ]
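
The three selection operators can be compared with plain Java streams. The sketch below is an analogy only, under the assumption that each operator yields a list (possibly with a single element), as the watch output above suggests; SelectionSketch, its Pojo, and sample() are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class SelectionSketch {

    // Hypothetical, simplified stand-in for Test.Pojo.
    static class Pojo {
        final String name;
        final int age;

        Pojo(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Ten elements built like the test application: name "name i", age i + 2.
    static List<Pojo> sample() {
        List<Pojo> list = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            list.add(new Pojo("name " + i, i + 2));
        }
        return list;
    }

    // Analogy for params[0].{? #this.age > 5}.{name}: filter, then map.
    static List<String> allMatching(List<Pojo> pojos) {
        return pojos.stream()
                .filter(p -> p.age > 5)
                .map(p -> p.name)
                .collect(Collectors.toList());
    }

    // Analogy for params[0].{^ #this.age > 5}.{name}: only the first match.
    static List<String> firstMatching(List<Pojo> pojos) {
        List<String> all = allMatching(pojos);
        return all.isEmpty() ? Collections.<String>emptyList() : all.subList(0, 1);
    }

    // Analogy for params[0].{$ #this.age > 5}.{name}: only the last match.
    static List<String> lastMatching(List<Pojo> pojos) {
        List<String> all = allMatching(pojos);
        return all.isEmpty()
                ? Collections.<String>emptyList()
                : all.subList(all.size() - 1, all.size());
    }

    public static void main(String[] args) {
        System.out.println(allMatching(sample()));
        System.out.println(firstMatching(sample()));
        System.out.println(lastMatching(sample()));
    }
}
```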
    

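    Multi-line expressions

    Some logic cannot be written as a single expression and needs several. For example, suppose we want to collect the names of all Pojo objects, append a new element, and then return the new list. Several expressions can be chained inside parentheses, separated by commas; the value of the last expression is the result of the whole expression. Temporary variables are written with a # prefix.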

    $ watch Test test '(#test=params[0].{name}, #test.add("abc"), #test)' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 28 ms.
    @ArrayList[
        @String[name 0],
        @String[name 1],
        @String[name 2],
        @String[name 3],
        @String[name 4],
        @String[name 5],
        @String[name 6],
        @String[name 7],
        @String[name 8],
        null,
        @String[abc],
    ]
    

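    Calling a constructor

    To call a constructor, you must give the fully qualified name of the class to create. The example below creates a new list, adds an element to it, and returns the resulting list.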

    $ watch Test test '(#test=new java.util.ArrayList(), #test.add("abc"), #test)' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 37 ms.
    @ArrayList[
        @String[abc],
    ]
    

    Accessing a static field

    Use the @class@field syntax. Note that the fully qualified class name is required.

    $ watch Test test '@Test@m' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 35 ms.
    @HashMap[
        @String[a]:@String[aaa],
        @String[b]:@String[bbb],
    ]
    

    Calling a static method

    Use the @class@method(args) syntax. Note that the fully qualified class name is required.

    $ watch Test test '@java.lang.System@getProperty("java.version")' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 42 ms.
    @String[1.8.0_51]
    

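    Static and instance methods can be combined. For example, to get the thread context class loader (TCCL) of the current method invocation, write: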

    $ watch Test test '@java.lang.Thread@currentThread().getContextClassLoader()' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 84 ms.
    @AppClassLoader[
        ucp=@URLClassPath[sun.misc.URLClassPath@4cdbe50f],
        $assertionsDisabled=@Boolean[true],
    ]
    

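    Accessing elements of a Map

    Test.n is a HashMap. Suppose you want all keys of this Map: OGNL provides the two virtual properties keys and values for the Map interface, and they can be accessed like ordinary properties.

    $ watch Test test '@Test@n.keys' -n 1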
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 57 ms.
    @KeySet[
        @Type[RUN],
        @Type[STOP],
    ]
    

    Because the keys of this Map are Enum values, how do you fetch the value stored under the key RUN? Use the Enum's valueOf method to obtain the Enum constant and pass it to get, as below:

    $ watch Test test '@Test@n.get(@Type@valueOf("RUN"))' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 168 ms.
    @String[aaa]
    

    Alternatively, combine an iterator with a filter, as below:

    $ watch Test test '@Test@n.entrySet().iterator.{? #this.key.name() == "RUN"}' -n 1
    Press Ctrl+C to abort.
    Affect(class-cnt:1 , method-cnt:1) cost in 72 ms.
    @ArrayList[
        @Node[RUN=aaa],
    ]
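
The two lookups above correspond roughly to the following plain Java. This is a hedged sketch for comparison, not Arthas code: EnumKeyLookupSketch, getByName, and entriesNamed are hypothetical names, and the map simply copies the contents of Test.n from the test application.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class EnumKeyLookupSketch {

    enum Type { RUN, STOP }

    // Same contents as Test.n in the test application.
    static final Map<Type, String> n = new HashMap<>();

    static {
        n.put(Type.RUN, "aaa");
        n.put(Type.STOP, "bbb");
    }

    // Counterpart of n.get(@Type@valueOf("RUN")): resolve the constant, then look it up.
    static String getByName(String keyName) {
        return n.get(Type.valueOf(keyName));
    }

    // Counterpart of the iterator + filter form: keep entries whose key has the given name.
    static List<Map.Entry<Type, String>> entriesNamed(String keyName) {
        return n.entrySet().stream()
                .filter(e -> e.getKey().name().equals(keyName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(getByName("RUN"));
        System.out.println(entriesNamed("RUN"));
    }
}
```

The first form returns the value itself, while the second returns a list of matching entries, which mirrors the difference between the two watch outputs.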
    

    Appendix: OGNL's built-in virtual properties

    • Collection:
      • size
      • isEmpty
    • List:
      • iterator
    • Map:
      • keys
      • values
    • Set:
      • iterator
    • Iterator:
      • next
      • hasNext
    • Enumeration:
      • next
      • hasNext
      • nextElement
      • hasMoreElements

    Finally

    Feel free to share your own clever uses in the comments so that we can learn from each other.

    user-case 
    opened by hengyunabc 23
  • attach to target jvm (*) failed

    attach to target jvm (*) failed

    error info: [image]

    java -version:
    java version "1.8.0_172"
    Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

    elasticsearch version : 6.0.0

    question-answered 
    opened by Sidneyxt 21
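  • With multiple Docker JVM containers on one physical machine, Arthas can listen in at most one of them.

    With multiple Docker JVM containers on one physical machine, Arthas can listen in at most one of them.

    • [ ] I have searched the issues and found no duplicate.

    Environment

    • arthas-boot.jar or as.sh version: dockerhub hengyunabc/arthas 3.1.3
    • Arthas version: dockerhub hengyunabc/arthas 3.1.3
    • OS version: xxx
    • Target process JVM version: openjdk 11.0.3
    • Version used to run arthas-boot: arthas 3.1.3

    Steps to reproduce

    1. Start several JVM containers with docker-compose; each container includes Arthas.
    2. Inside each JVM container, the Java process has PID 1.
    3. After Arthas attaches in any one container, running Arthas in the other containers fails: it cannot start listening on port 3658.

    Expected result

    Arthas in each container can attach to the Java process with PID 1 in that container.

    Actual result

    Arthas in the first container listens successfully; in all the other containers it fails.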

    Attach porcess 1 success.
    arthas-client connect  to 127.0.0.1:3658
    Connect to telnet server error: 127.0.0.1 3658
    java.net.ConnectException:  Connection Refused(..)
    
    
    question-answered 
    opened by zhmeng 19
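  • Integrating Arthas with Spring Boot Admin 2.0 in practice

    Integrating Arthas with Spring Boot Admin 2.0 in practice

    Preface

    • [Original reference: Integrating Arthas with Spring Boot Admin in practice #1601] (https://github.com/alibaba/arthas/issues/1601#issue-755947978)

    The project initially adopted Arthas for two purposes:

    1. To provide a tool for analysing performance problems in the test, performance-test, and production environments;
    2. To hot-update code on selected production nodes by combining the jad, mc, and redefine commands.

    Technology selection

    The company has not yet built a unified way to manage the configuration and state of production microservices, and each system is developed and operated fairly independently. The project uses Spring Cloud and Eureka, which fits well with the basic capabilities of Spring Boot Admin (SBA). SBA also already provides service discovery, log-level management, and many actuator-based management plugins for the JVM and the Spring container, which covers the basic needs.

    During the evaluation, Arthas was at version 3.4.5 and offered a web-console-based Tunnel Server mode; the article linked above had already shown that it can be integrated with SBA. Because the project has no legacy baggage, the actual integration used SBA 2.0 to get more management features and a richer UI. Other advantages:

    • The web console is embedded behind SBA's password login and page-level access control, so the Arthas web console can be used only after logging in to SBA.
    • The jolokia-core dependency of the SBA client exposes JMX management of the target service process; implementing a JMX interface reuses SBA's existing operation pages and reduces the front-end development needed.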

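    Overall structure

    [image: overall structure]

    Some key points. There are two options: embed the Arthas Spring Boot plugin in the JVM, or follow the ICBC (Industrial and Commercial Bank of China) model of building a complete client download mechanism and modifying scripts for remote control. The embedded option needs little development work: integrating the relevant open-source components is enough to enable remote use while keeping it secure. The ICBC option is large and comprehensive, and suits an organisation that has planned its overall architecture and has a dedicated development team to support it.

    The embedded option also includes start and stop operations over JMX (the Spring Boot plugin in 3.4.5 does not expose the necessary handle, so stop cannot be implemented for now), and the agent is not started by default. After the agent is started remotely over JMX, the JVM gains 8 threads and about 30 MB of memory, the same as in the SBA 1.0 solution this article references, so check whether the JVM has enough memory headroom before enabling it online.

    Result

    The biggest convenience of SBA 2.0 is configurable linking to external pages. If the page is served from the same JVM process, Spring Security can enforce local access control, so in production the integrated Arthas features are usable only after logging in to SBA.

    1. Login page [image: SBA login page]

    2. Location of the embedded external link [image: home page banner]

    3. Using JMX [images: JMX menu 1; JMX start - Arthas Agent]

    4. Jumping to the Arthas web console [image: arthas_web_console]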

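    Implementation

    The steps follow those in the original reference, Integrating Arthas with Spring Boot Admin in practice #1601.

    1. Overall project structure [image: overall project structure]

    The project is adapted from the example project in the SBA open-source repository. The custom-ui project used is: [spring-boot-admin-sample-custom-ui] (https://github.com/codecentric/spring-boot-admin/tree/master/spring-boot-admin-samples/spring-boot-admin-sample-custom-ui) The part in the red box contains all static files of the Arthas web console. A Maven resource configuration copies them into a specific directory so that SBA loads them as a custom extension at startup. The Maven resource configuration is as follows: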

                <resource>
                    <directory>static</directory>
                    <targetPath>${project.build.directory}/classes/META-INF/spring-boot-admin-server-ui/extensions/arthas
                    </targetPath>
                    <filtering>false</filtering>
                </resource>
    

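    As long as the built jar contains these files under META-INF, the Tomcat embedded in SBA serves them as static resources after startup; the resulting URL only has to match the external URL configured for the custom Arthas console.

    2. External link configuration. Since 2.0, SBA uses the Vue stack, which makes extension and integration fairly easy. The official documentation describes how to configure embedded external links: [Linking / Embedding External Pages] (https://codecentric.github.io/spring-boot-admin/2.3.1/#customizing-external-views)

    Refer to the application.yml of the SBA example project: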

        # tag::customization-external-views[]
        spring:
          boot:
            admin:
              ui:
                external-views:
                  - label: "Arthas Console"
                    url: http://21.129.49.153:8080/
                    order: 1900
        # end::customization-external-views[]
    
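    3. Spring MVC controller. Based on the SBA integration part of the referenced implementation, modified mainly to add the following:

      • Refresh the list of instances registered with the tunnel server and show it in the front-end AgentID box so that a link can be selected and clicked;
      • Refresh custom IP addresses (to handle production environments where the production IP and the operations-network IP differ).
    4. Arthas Spring Boot plugin changes and configuration. Based on the plugin changes and the client application.yml of the referenced SBA integration. The original Spring Boot plugin loads automatically through Spring's @ConditionalOnMissingBean. The change makes the agent not start by default, controlled by configuration, and starts the agent threads remotely when needed.

    5. JMX over Spring Actuator. The SBA client pulls in jolokia-core.jar by default through Maven; if your project does not depend on the SBA client, you can add the jar yourself. It exposes HTTP-based JMX operations through actuator and works seamlessly with the SBA console. Open the management settings in application.yml. Depending on your environment, you can also enable Spring Security authentication on the client side; SBA supports accessing password-protected actuator endpoints through service discovery.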

          # Open up the management endpoints
          management:
            endpoints:
              web:
                exposure:
                  # "*" exposes all endpoints here only to observe the effect; in practice expose endpoints as needed
                  include: "*"
                  exclude: env
            endpoint:
              health:
                # Show details to all users.
                show-details: ALWAYS
            health:
              status:
                http-mapping:
                  # Map custom health status codes to HTTP status codes
                  FATAL:  503
      

      The JMX implementation follows the approach of EnvironmentChangeListener in the original reference and is built on Spring's JMX annotations.

      
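         import java.util.HashMap;
         import java.util.Map;

         import org.springframework.beans.factory.annotation.Autowired;
         import org.springframework.beans.factory.support.DefaultListableBeanFactory;
         import org.springframework.context.ApplicationContext;
         import org.springframework.jmx.export.annotation.ManagedOperation;
         import org.springframework.jmx.export.annotation.ManagedOperationParameter;
         import org.springframework.jmx.export.annotation.ManagedResource;
         import org.springframework.stereotype.Component;

         // Assumed locations: ArthasAgent from arthas-agent-attach; ArthasProperties and
         // StringUtils from the (modified) arthas-spring-boot-starter.
         import com.taobao.arthas.agent.attach.ArthasAgent;
         import com.alibaba.arthas.spring.ArthasProperties;
         import com.alibaba.arthas.spring.StringUtils;

         @Component
         @ManagedResource(objectName = "com.ArthasAgentManageMbean:name=ArthasMbean", description = "Arthas remote management MBean")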
         public class ArthasMbeanImpl {
      
             @Autowired
             private Map<String, String> arthasConfigMap;
      
             @Autowired
             private ArthasProperties arthasProperties;
      
             @Autowired
             private ApplicationContext applicationContext;
      
             /**
              * Initialize the Arthas agent.
              *
              * @return
              */
             private ArthasAgent arthasAgentInit() {
                 arthasConfigMap = StringUtils.removeDashKey(arthasConfigMap);
                 // Add the "arthas." prefix to every configuration key
                 Map<String, String> mapWithPrefix = new HashMap<String, String>(arthasConfigMap.size());
                 for (Map.Entry<String, String> entry : arthasConfigMap.entrySet()) {
                     mapWithPrefix.put("arthas." + entry.getKey(), entry.getValue());
                 }
                 final ArthasAgent arthasAgent = new ArthasAgent(mapWithPrefix, arthasProperties.getHome(),
                         arthasProperties.isSlientInit(), null);
                 arthasAgent.init();
                 return arthasAgent;
             }
      
             @ManagedOperation(description = "Get the configured Arthas Tunnel Server address")
             public String getArthasTunnelServerUrl() {
                 return arthasProperties.getTunnelServer();
             }
      
             @ManagedOperation(description = "Set the Arthas Tunnel Server address; takes effect after re-attach")
             @ManagedOperationParameter(name = "tunnelServer", description = "example:ws://127.0.0.1:7777/ws")
             public Boolean setArthasTunnelServerUrl(String tunnelServer) {
                 if (tunnelServer == null || tunnelServer.trim().equals("") || tunnelServer.indexOf("ws://") < 0) {
                     return false;
                 }
                 arthasProperties.setTunnelServer(tunnelServer);
                 return true;
             }
      
             @ManagedOperation(description = "Get the AgentID")
             public String getAgentId() {
                 return arthasProperties.getAgentId();
             }
      
             @ManagedOperation(description = "Get the application name")
             public String getAppName() {
                 return arthasProperties.getAppName();
             }
      
             @ManagedOperation(description = "Get the ArthasConfigMap")
             public HashMap<String, String> getArthasConfigMap() {
                 return (HashMap) arthasConfigMap;
             }
      
             @ManagedOperation(description = "Return whether the Arthas agent has been loaded")
             public Boolean isArthasAttched() {
                 DefaultListableBeanFactory defaultListableBeanFactory = (DefaultListableBeanFactory) applicationContext.getAutowireCapableBeanFactory();
                 String bean = "arthasAgent";
                 if (defaultListableBeanFactory.containsBean(bean)) {
                     return true;
                 }
                 return false;
             }
      
             @ManagedOperation(description = "Start the Arthas agent")
             public Boolean startArthasAgent() {
                 DefaultListableBeanFactory defaultListableBeanFactory = (DefaultListableBeanFactory) applicationContext.getAutowireCapableBeanFactory();
                 String bean = "arthasAgent";
                 if (defaultListableBeanFactory.containsBean(bean)) {
                     ((ArthasAgent) defaultListableBeanFactory.getBean(bean)).init();
                     return true;
                 }
                 defaultListableBeanFactory.registerSingleton(bean, arthasAgentInit());
                 return true;
             }
      
             @ManagedOperation(description = "Stop the Arthas agent (not implemented yet)")
             public Boolean stopArthasAgent() {
                 // TODO The classLoader loaded from the custom tmp directory cannot be obtained, so com.taobao.arthas.core.server.ArthasBootstrap cannot be reached to call its destroy method
                 DefaultListableBeanFactory defaultListableBeanFactory = (DefaultListableBeanFactory) applicationContext.getAutowireCapableBeanFactory();
                 String bean = "arthasAgent";
                 if (defaultListableBeanFactory.containsBean(bean)) {
                     defaultListableBeanFactory.destroySingleton(bean);
                     return true;
                 } else {
                     return false;
                 }
             }
         }
      
      

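    In practice

    After the management project went live, it was used many times in production for troubleshooting and hot fixes. For performance work it was mainly used for online verification and debugging of configuration parameters of the flow-control component and of gray releases. For hot code reloading, we started with jad + mc, but found that for some code jad produced decompiled source inconsistent with the original because of environment configuration and JVM issues. We solved this by packaging and deploying a source archive of the application with Maven; modifying source built from the same version as the application jar is more reliable. Overall, the solution provides effective performance analysis and hot-fix capability in a strictly managed production environment.

    Open issues:

    • In the official com.taobao.arthas.agent.attach.ArthasAgent, the arthasClassLoader and bootstrapClass used to start the Arthas agent are local variables inside a method, so external code cannot obtain a handle to shut down the agent through bootstrapClass. The workaround is to start the agent over JMX, connect from the web console, and then use the stop command to shut down the Arthas agent in the target process.
    • The existing bytecode loading tools handle online hot replacement of inner classes and private classes well, and testing shows compatibility with the SkyWalking 8.x javaagent plugin. In the test environment, however, the JaCoCo coverage agent turned out to be incompatible with Arthas bytecode changes, so in some environments that agent must be disabled before the Arthas features work normally.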
    user-case 
    opened by password36 18
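  • Running the dashboard command through the HTTP API takes 30 s to return, while it returns normally through the tunnel server

    Running the dashboard command through the HTTP API takes 30 s to return, while it returns normally through the tunnel server

    • [ ] I have searched the issues and found no duplicate.

    Environment

    • arthas-boot.jar or as.sh version: 3.6.7
    • Arthas version: 3.6.7
    • Version used to run arthas-boot: 3.6.7

    Steps to reproduce

    1. curl --location --request POST 'http://localhost:8563/api' \
      --header 'Content-Type: text/plain' \
      --data-raw '{ "action":"exec", "command":"dashboard" }'

    Expected result

    The HTTP API request returns normally.

    Actual result

    Data is returned only after about 30 s.

    Paste the exception information here: pSe0Esx.png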

    opened by lommayqiu 0
  • AsyncProfiler error: No AllocTracer symbols found. Are JDK debug symbols installed?

    AsyncProfiler error: No AllocTracer symbols found. Are JDK debug symbols installed?

    Using the Docker image openjdk:8-jdk-alpine https://github.com/alibaba/arthas/blob/master/Dockerfile

    Enter the container and run profiler start --event alloc:

    [arthas@7]$ profiler start --event alloc
    AsyncProfiler error: No AllocTracer symbols found. Are JDK debug symbols installed?

    opened by piglingcn 0
  • Fix the exception when BinderUtils.inject sets Configure's telnetPort and httpPort via reflection

    Fix the exception when BinderUtils.inject sets Configure's telnetPort and httpPort via reflection

    arthas-spring-boot-starter throws an exception when httpPort or telnetPort is set:

    Caused by: java.lang.IllegalArgumentException: Cannot convert value [8564] from source type [Integer] to target type [int]
        at com.taobao.arthas.core.env.PropertySourcesPropertyResolver.getProperty(PropertySourcesPropertyResolver.java:97)
        at com.taobao.arthas.core.env.PropertySourcesPropertyResolver.getProperty(PropertySourcesPropertyResolver.java:62)
        at com.taobao.arthas.core.env.ArthasEnvironment.getProperty(ArthasEnvironment.java:101)
        at com.taobao.arthas.core.config.BinderUtils.inject(BinderUtils.java:53)
        ... 34 common frames omitted

    When BinderUtils.inject sets the value, it calls PropertySourcesPropertyResolver.getProperty, where the check between int and Integer fails and the exception is thrown. The parameter types of Configure's setter methods need to be changed to Integer.
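
The mismatch described above can be reproduced in isolation. The sketch below is not the Arthas code; it only illustrates, under that assumption, why a strict type check between an Integer value and an int target fails, and shows one alternative to changing the setter types: mapping primitive targets to their wrapper classes before checking. PrimitiveWrapperSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

public class PrimitiveWrapperSketch {

    // Primitive type -> wrapper type, used to relax the check for boxed values.
    private static final Map<Class<?>, Class<?>> WRAPPERS = new HashMap<>();

    static {
        WRAPPERS.put(boolean.class, Boolean.class);
        WRAPPERS.put(byte.class, Byte.class);
        WRAPPERS.put(char.class, Character.class);
        WRAPPERS.put(short.class, Short.class);
        WRAPPERS.put(int.class, Integer.class);
        WRAPPERS.put(long.class, Long.class);
        WRAPPERS.put(float.class, Float.class);
        WRAPPERS.put(double.class, Double.class);
    }

    // Strict check: a boxed Integer is never an "instance" of the primitive class int.class.
    static boolean strictlyAssignable(Class<?> target, Object value) {
        return target.isInstance(value);
    }

    // Relaxed check: compare against the wrapper when the target is primitive.
    static boolean assignableWithBoxing(Class<?> target, Object value) {
        Class<?> effective = target.isPrimitive() ? WRAPPERS.get(target) : target;
        return effective != null && effective.isInstance(value);
    }

    public static void main(String[] args) {
        Object port = Integer.valueOf(8564);
        System.out.println(strictlyAssignable(int.class, port));
        System.out.println(assignableWithBoxing(int.class, port));
        System.out.println(strictlyAssignable(Integer.class, port));
    }
}
```

Changing the setter parameters to Integer, as proposed in the issue, makes the strict check pass because the target type is then the wrapper itself.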

    opened by zhaojinyu 4
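  • arthas dump does not reflect runtime changes to the class bytecode of the target program

    arthas dump does not reflect runtime changes to the class bytecode of the target program

    • [✅ ] I have searched the issues and found no duplicate.

    Environment

    • arthas-boot.jar or as.sh version: 3.6.7
    • Arthas version: 3.6.7
    • OS version: mac m1 13.0.1
    • Target process JVM version: 1.8.0_211
    • Version used to run arthas-boot: 3.6.7

    Steps to reproduce

    1. Start the target Spring Boot program.
    2. Dump the bytecode of class ApplicationFilterChain before modification, using both arthas and dumpclass.jar.
    3. Use an agent's retransformClasses to modify the doFilter method of ApplicationFilterChain, changing the global filter so that the target program's Linux server can be operated from the web. [image]
    4. Dump the bytecode of ApplicationFilterChain again, using both arthas and dumpclass.jar. [image]
    5. Final result: the bytecode from arthas is unchanged, while the bytecode from dumpclass has changed. [image]

    Expected result

    The runtime class bytecode obtained by arthas dump should differ before and after the modification.

    Actual result

    The result of arthas dump is the same before and after the modification.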

    opened by yaokuku123 0
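  • Is there a way to provide a command that dumps thread stacks?

    Is there a way to provide a command that dumps thread stacks?

    Dumping the heap to a file is currently supported through heapdump. Is there a way to dump thread stacks to a file, similar to what jstack does? These are the approaches I have thought of so far; feedback would be appreciated:

    1. ThreadMXBean#dumpAllThreads. The thread information obtained this way differs somewhat from the thread dump file exported by jstack, which makes uniform processing awkward.

    2. Have HotSpotVirtualMachine attach to the current process and then call hotSpotVm#remoteDataDump. This works in practice, but it feels unreliable. What are the risks of this approach?

        String selfName = ManagementFactory.getRuntimeMXBean().getName();
        String selfPid = selfName.substring(0, selfName.indexOf('@'));
        // Attach to the VM.
        VirtualMachine vm = VirtualMachine.attach(selfPid);
        HotSpotVirtualMachine hotSpotVm = (HotSpotVirtualMachine) vm;

    3. What about Runtime.getRuntime().exec("jstack xxx")?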
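
Approach 1 can be sketched with the standard java.lang.management API alone. This is a minimal illustration, not an Arthas command: the text layout is only jstack-like and chosen here for readability, and the class and method names (ThreadDumpSketch, dumpAllThreads) are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    // Build a jstack-like text dump of all live threads in the current JVM.
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Request lock details only when this JVM supports them.
        ThreadInfo[] infos = bean.dumpAllThreads(
                bean.isObjectMonitorUsageSupported(),
                bean.isSynchronizerUsageSupported());

        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // the thread may have terminated in the meantime
            }
            sb.append('"').append(info.getThreadName()).append('"')
              .append(" id=").append(info.getThreadId())
              .append(' ').append(info.getThreadState())
              .append('\n');
            // ThreadInfo.toString() may truncate long stacks, so print each frame explicitly.
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Because the format is produced by this code rather than by the JVM's own dump routine, tools that parse jstack output would still need adapting, which is the drawback the issue mentions.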

    opened by spilledyear 0
Releases(arthas-all-3.6.7)
Owner
Alibaba
Alibaba Open Source
Alibaba
A tool to export, analyse and visualize your transactions, rewards and commissions of your liquidity mining pools or DEX transactions

A tool to export, analyse and visualize your transactions, rewards and commissions of your liquidity mining pools or DEX transactions.

Adam·Michael 15 Mar 11, 2022
A Parser tool which actually tries to convert XML data into JSON data

SpringBoot A Parser tool which actually tries to convert XML data into JSON data Tools Required Postman (Testing API's) IDE - Eclipse / NetBeans/ Inte

null 1 Jan 27, 2022
Java lib for monitoring directories or individual files via java.nio.file.WatchService

ch.vorburger.fswatch Java lib for monitoring directories or individual files based on the java.nio.file.WatchService. Usage Get it from Maven Central

Michael Vorburger ⛑️ 21 Jan 7, 2022
Tencent Kona JDK11 is a no-cost, production-ready distribution of the Open Java Development Kit (OpenJDK), Long-Term Support(LTS) with quarterly updates. Tencent Kona JDK11 is certified as compatible with the Java SE standard.

Tencent Kona JDK11 Tencent Kona JDK11 is a no-cost, production-ready distribution of the Open Java Development Kit (OpenJDK), Long-Term Support(LTS) w

Tencent 268 Dec 16, 2022
This repository contains Java programs to become zero to hero in Java.

This repository contains Java programs to become zero to hero in Java. Data structure programs, organised by topic, are also present to learn data structure problem solving in Java. Programs related to each and every concept are present, from easy to intermediate level.

Sahil Batra 15 Oct 9, 2022
An open-source Java library for Constraint Programming

Documentation, Support and Issues Contributing Download and installation Choco-solver is an open-source Java library for Constraint Programming. Curre

null 607 Jan 3, 2023
Java Constraint Programming solver

https://maven-badges.herokuapp.com/maven-central/org.jacop/jacop/badge.svg [] (https://maven-badges.herokuapp.com/maven-central/org.jacop/jacop/) JaCo

null 202 Dec 30, 2022
Java Constraint Solver to solve vehicle routing, employee rostering, task assignment, conference scheduling and other planning problems.

OptaPlanner www.optaplanner.org Looking for Quickstarts? OptaPlanner’s quickstarts have moved to optaplanner-quickstarts repository. Quick development

KIE (Drools, OptaPlanner and jBPM) 2.8k Jan 2, 2023
Java rate limiting library based on token/leaky-bucket algorithm.

Java rate-limiting library based on token-bucket algorithm. Advantages of Bucket4j Implemented on top of ideas of well known algorithm, which are by d

Vladimir Bukhtoyarov 1.7k Jan 8, 2023
Object-Oriented Java primitives, as an alternative to Google Guava and Apache Commons

Project architect: @victornoel ATTENTION: We're still in a very early alpha version, the API may and will change frequently. Please, use it at your ow

Yegor Bugayenko 691 Dec 27, 2022
Google core libraries for Java

Guava: Google Core Libraries for Java Guava is a set of core Java libraries from Google that includes new collection types (such as multimap and multi

Google 46.5k Jan 1, 2023
Java regular expressions made easy.

JavaVerbalExpressions VerbalExpressions is a Java library that helps to construct difficult regular expressions. Getting Started Maven Dependency: <de

null 2.6k Dec 30, 2022
MinIO Client SDK for Java

MinIO Java SDK for Amazon S3 Compatible Cloud Storage MinIO Java SDK is Simple Storage Service (aka S3) client to perform bucket and object operations

High Performance, Kubernetes Native Object Storage 787 Jan 3, 2023
java port of Underscore.js

underscore-java Requirements Java 1.8 and later or Java 11. Installation Include the following in your pom.xml for Maven: <dependencies> <dependency

Valentyn Kolesnikov 411 Dec 6, 2022
(cross-platform) Java Version Manager

jabba Java Version Manager inspired by nvm (Node.js). Written in Go. The goal is to provide unified pain-free experience of installing (and switching

Stanley Shyiko 2.5k Jan 9, 2023
Manage your Java environment

Master your Java Environment with jenv Website : http://www.jenv.be Maintainers : Gildas Cuisinier Future maintainer in discussion: Benjamin Berman As

jEnv 4.6k Dec 30, 2022
The shell for the Java Platform


CRaSH Repositories 916 Dec 30, 2022
Hashids algorithm v1.0.0 implementation in Java

Hashids.java A small Java class to generate YouTube-like hashes from one or many numbers. Ported from javascript hashids.js by Ivan Akimov What is it?

CELLA 944 Jan 5, 2023
a pug implementation written in Java (formerly known as jade)

Attention: jade4j is now pug4j In alignment with the javascript template engine we renamed jade4j to pug4j. You will find it under https://github.com/

neuland - Büro für Informatik 700 Oct 16, 2022