
Enhancing Task Scheduling Reliability: Integrating Arthas for API Monitoring in DolphinScheduler

Mary-Kate Olsen
2024-11-05 12:56:02

This article details the integration of Arthas into Apache DolphinScheduler to enable real-time monitoring of API calls. Arthas, a powerful Java diagnostic tool, assists developers in inspecting the runtime status, identifying performance bottlenecks, and tracking method calls. Embedding Arthas in DolphinScheduler allows for the capture of key call information during task scheduling, enabling timely issue detection and resolution for improved system stability. Here, we outline the steps to start Arthas within the DolphinScheduler environment, monitor specific API calls, and analyze the collected performance data to enhance scheduling reliability and maintainability.

Manual Installation

Download the Arthas package (this article uses arthas-packaging-3.7.2-bin.zip) from:

https://arthas.aliyun.com/download/latest_version?mirror=aliyun

mkdir -p /opt/arthas
cp arthas-packaging-3.7.2-bin.zip /opt/arthas
cd /opt/arthas
unzip arthas-packaging-3.7.2-bin.zip

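Start Arthas:

java -jar arthas-boot.jar

Arthas lists the running Java processes. Enter the number shown next to the DolphinScheduler API server process to attach to it.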
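When the attach step needs to be scripted, the interactive selection can be skipped by passing the target PID to arthas-boot directly. The following is a minimal sketch, not a definitive setup: the helper names find_pid and attach_arthas are hypothetical, pgrep is assumed to be installed, and the API server main class in the usage comment is an assumption about the deployment.

```shell
#!/usr/bin/env bash
# Sketch: attach Arthas to a JVM by PID instead of choosing from the prompt.
# Assumptions (not from the article): pgrep is available on the host, and
# Arthas is unpacked in /opt/arthas unless ARTHAS_HOME points elsewhere.

ARTHAS_HOME="${ARTHAS_HOME:-/opt/arthas}"

# Print the PID of the first process whose full command line matches the
# pattern; return non-zero when nothing matches.
find_pid() {
  pgrep -f -- "$1" | head -n 1 | grep .
}

# Attach to the first matching process. arthas-boot accepts the target PID
# as a positional argument, which skips the interactive selection.
attach_arthas() {
  local pid
  pid="$(find_pid "$1")" || { echo "no process matches: $1" >&2; return 1; }
  java -jar "$ARTHAS_HOME/arthas-boot.jar" "$pid"
}

# Example (the main class name is an assumption about the API server):
#   attach_arthas org.apache.dolphinscheduler.api.ApiApplicationServer
```

If several processes match the pattern, only the first PID is used, so a more specific pattern may be needed on hosts that run more than one DolphinScheduler service.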

Error Troubleshooting

Error 1

[ERROR] Start arthas failed, exception stack trace: 
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
        at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:102)
        at com.taobao.arthas.core.Arthas.<init>(Arthas.java:27)
        at com.taobao.arthas.core.Arthas.main(Arthas.java:161)

Solution:
In ${DOLPHINSCHEDULER_HOME}/api-server/bin, add the following line to jvm_args_env.sh, then restart the API server so that the option takes effect:

-XX:+StartAttachListener
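One way to apply this change repeatably is a small helper that appends the option only when it is missing. This is a sketch that assumes jvm_args_env.sh holds one JVM option per line, as the instruction above suggests; ensure_jvm_flag is a hypothetical helper name, not part of DolphinScheduler.

```shell
#!/usr/bin/env bash
# Sketch: add a JVM option to a one-option-per-line file unless it is
# already there. Assumption: the file format is one option per line.

ensure_jvm_flag() {
  local file="$1" flag="$2"
  # Nothing to do when an identical line already exists.
  if grep -qxF -- "$flag" "$file"; then
    return 0
  fi
  # Make sure the last existing line is terminated before appending.
  if [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$flag" >> "$file"
}

# Example (path taken from the article):
#   ensure_jvm_flag "${DOLPHINSCHEDULER_HOME}/api-server/bin/jvm_args_env.sh" \
#     "-XX:+StartAttachListener"
```

Because the helper checks for an exact line match first, running it more than once leaves a single copy of the option in the file.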

Error 2

Picked up JAVA_TOOL_OPTIONS: 
java.io.IOException: well-known file /tmp/.java_pid731688 is not secure: file should be owned by the current user (which is 0) but is owned by 989
        at sun.tools.attach.LinuxVirtualMachine.checkPermissions(Native Method)
        at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:117)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
        at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:102)
        at com.taobao.arthas.core.Arthas.<init>(Arthas.java:27)
        at com.taobao.arthas.core.Arthas.main(Arthas.java:161)
[ERROR] Start arthas failed, exception stack trace: 
[ERROR] attach fail, targetPid: 731688

Solution:
Run Arthas as the same user that runs DolphinScheduler. In the output above, Arthas was started as root (UID 0) while the DolphinScheduler process is owned by UID 989, so the attach is rejected.
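The ownership check can be done before attaching. The sketch below looks up the numeric UID that owns the target process and starts arthas-boot under that UID. It assumes Linux (it reads /proc), that sudo is permitted to switch to that UID, and that the Arthas files and java are accessible to that user; pid_owner_uid and attach_as_owner are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: start Arthas as the owner of the target JVM.
# Assumptions: Linux with /proc mounted, sudo allowed for the target UID,
# and Arthas unpacked in /opt/arthas unless ARTHAS_HOME is set.

# Print the numeric UID that owns the given PID.
pid_owner_uid() {
  stat -c %u "/proc/$1"
}

# Attach directly when the UIDs already match, otherwise switch user first.
# sudo accepts a numeric UID when it is prefixed with '#'.
attach_as_owner() {
  local pid="$1" uid boot
  boot="${ARTHAS_HOME:-/opt/arthas}/arthas-boot.jar"
  uid="$(pid_owner_uid "$pid")" || return 1
  if [ "$uid" = "$(id -u)" ]; then
    java -jar "$boot" "$pid"
  else
    sudo -u "#$uid" java -jar "$boot" "$pid"
  fi
}

# Example (PID taken from the error message above):
#   attach_as_owner 731688
```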

Watch

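The watch command observes individual invocations of a method, such as its parameters, return value, and thrown exceptions. The example below prints the return value (returnObj) of UsersController.queryUserList each time the method exits.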

watch org.apache.dolphinscheduler.api.controller.UsersController queryUserList returnObj
[arthas@731688]$ watch org.apache.dolphinscheduler.api.controller.UsersController queryUserList returnObj
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 126 ms, listenerId: 2
method=org.apache.dolphinscheduler.api.controller.UsersController.queryUserList location=AtExit
ts=2024-08-27 02:04:01; [cost=4.918943ms] result=@Result[
...
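Watching only the return value can be noisy on a busy endpoint. watch also accepts an OGNL condition and a hit limit, so the output can be restricted to slow calls. The helper below is a sketch that only composes such a command for pasting into the Arthas console; watch_cmd is a hypothetical name, and the threshold and count in the usage comment are illustrative values.

```shell
#!/usr/bin/env bash
# Sketch: build a watch command that prints parameters, return value and
# exception for calls slower than a threshold, and stops after N hits.
#   -x 2  expands printed objects two levels deep
#   -n N  stops watching after N matching invocations

watch_cmd() {
  local class="$1" method="$2" min_ms="$3" count="$4"
  printf "watch %s %s '{params, returnObj, throwExp}' '#cost > %s' -x 2 -n %s\n" \
    "$class" "$method" "$min_ms" "$count"
}

# Example: calls to queryUserList that take longer than 100 ms, at most 5 hits.
#   watch_cmd org.apache.dolphinscheduler.api.controller.UsersController \
#     queryUserList 100 5
```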

Trace

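The trace command prints the calls made inside a method together with the time spent in each one, which helps locate the slow step. In the example below, UsersService:queryUserList() accounts for 96.47% of the 13.96 ms total.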

[arthas@973263]$ trace org.apache.dolphinscheduler.api.controller.UsersController queryUserList 
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 319 ms, listenerId: 1
`---ts=2024-08-27 10:33:08;thread_name=qtp1836984213-26;id=26;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@439f5b3d
    `---[13.962731ms] org.apache.dolphinscheduler.api.controller.UsersController:queryUserList()
        +---[0.18% 0.025123ms ] org.apache.dolphinscheduler.api.controller.UsersController:checkPageParams() #130
        +---[0.09% 0.012549ms ] org.apache.dolphinscheduler.plugin.task.api.utils.ParameterUtils:handleEscapes() #131
        `---[96.47% 13.469876ms ] org.apache.dolphinscheduler.api.service.UsersService:queryUserList() #132
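When trace output is saved to a file, the most expensive child call can be picked out mechanically. The following sketch assumes the line format shown above (a percentage, a duration in ms, then the method name) and ignores lines in any other format; slowest_call is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read Arthas trace output on stdin and print the child call with
# the largest share of time as "<percent> <ms> <method>".
# Assumption: child lines look like "[0.18% 0.025123ms ] Class:method() #130".

slowest_call() {
  sed -nE 's/^.*\[([0-9.]+)% ([0-9.]+)ms \] ([^ ]+).*$/\1 \2 \3/p' \
    | LC_ALL=C sort -rn \
    | head -n 1
}

# Example (trace.log is a hypothetical file holding saved trace output):
#   slowest_call < trace.log
```

The root line of the tree carries no percentage, so it is skipped and only the child calls are compared.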

Dump

To generate a heap dump file, use:

[arthas@973263]$ heapdump arthas-output/dump.hprof
Dumping heap to arthas-output/dump.hprof ...
Heap dump file created

Analyze the dump file with a tool such as Eclipse MAT (Memory Analyzer Tool) to diagnose memory leaks.

Viewing JVM Memory Changes

Use memory to inspect JVM memory usage:

[arthas@973263]$ memory 
Memory                                                         used                 total                max                  usage                
heap                                                           485M                 900M                 900M                 53.91%               
ps_eden_space                                                  277M                 327M                 358M                 77.61%               
...
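To spot memory regions that are close to full, the memory table can be filtered by its usage column. The sketch below assumes plain-text output in the column layout shown above, with the usage percentage in the last column and no terminal color codes; regions_over is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: read Arthas `memory` output on stdin and print the regions whose
# usage is above the given percentage, as "<region> <usage>".
# Assumption: the last column of each data row is a value such as "77.61%".

regions_over() {
  awk -v limit="$1" \
    'NF >= 5 && $NF ~ /^[0-9.]+%$/ && ($NF + 0) > limit { print $1, $NF }'
}

# Example (memory.log is a hypothetical file holding saved `memory` output):
#   regions_over 70 < memory.log
```

The header row and truncated rows do not end in a percentage, so they are ignored.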

Viewing CPU Usage

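Use dashboard to view CPU usage per thread. To inspect a specific thread, run thread followed by its ID (for example, thread 26) to print its stack. To list the N busiest threads with their stacks, run thread -n N.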
