
Diagnosing Java application lag with Alibaba Cloud ARMS in practice

Author: php是最好的语言 (original)
Published: 2018-08-09 16:55:07


Don’t panic, this is just a picture

Apart from the 404 page, this picture may be the one that distresses internet users the most.

Readers are welcome to message the "Alibaba Middleware" public account with the most heartbreaking things they have seen on the Internet, whether text, pictures, or voice messages. We may feature them in the "Internet Heartache Collection".

According to relevant research, when page load time increases from 1 second to 3 seconds, the probability that a visitor bounces increases by about 30%. From 1 second to 5 seconds, the increase reaches 90%. If your website takes 10 seconds to load, the increase exceeds 120%. (The 120% here does not mean that 10 visitors arrive and 12 leave; it is the relative growth in user churn.) In an era where "user experience is king", application performance monitoring has therefore become a top priority for operations and maintenance.

1. Find the cause of “slowness”.

Website lag and slow page loading are common problems in Internet applications. Troubleshooting and fixing them is not easy and costs operations staff a great deal of time and energy. There are usually three reasons:

» The call chain is too long, and it is hard to know where to start.

From the front-end page to the back-end gateway, and from the web application server to the back-end database, a problem at any hop can make the whole request stall. Is front-end resource loading too slow? Is something wrong with the database? Does the newly released server code have a performance issue? Any of these can be the cause.

For applications built on a "microservice" architecture, the call chain is even more complex. Different components may be maintained by different teams and people, which makes troubleshooting harder.

» The logs are incomplete or of poor quality, so the scene of the problem is lost.

Application logs are undoubtedly a powerful tool for troubleshooting online problems, but where a problem will occur is often unpredictable. When one does occur, the logs usually turn out to be incomplete, because we cannot add a log statement at every place where something might go wrong.

The definition of "slow" is subjective, and "slowness" is sometimes intermittent. To really capture the "slow" line of code, we would need to record every call without missing a single line, but that approach is too costly.
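One common way to reduce that cost is to time every call but keep a record only when a call crosses a threshold. The following is a minimal hand-written sketch of that idea, not a description of how ARMS works internally. The class name `SlowCallLogger`, the 50 ms threshold, and the call names are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class SlowCallLogger {
    // Hypothetical threshold: only calls at least this slow are recorded.
    private final long thresholdMillis;
    private final List<String> slowCalls = new ArrayList<>();

    public SlowCallLogger(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Runs the call, measures its wall-clock time, and records it only if it was slow.
    public <T> T timed(String name, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                slowCalls.add(name + " took " + elapsedMillis + " ms");
            }
        }
    }

    public List<String> slowCalls() {
        return slowCalls;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        SlowCallLogger logger = new SlowCallLogger(50);
        logger.timed("fastCall", () -> 1 + 1);          // well under the threshold, not recorded
        logger.timed("slowCall", () -> {                 // simulated slow call, recorded
            sleep(100);
            return "done";
        });
        logger.slowCalls().forEach(System.out::println);
    }
}
```

Wrapping calls by hand like this still requires touching the code at every call site, which is the gap that an automatically instrumenting probe is meant to close.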

» Monitoring is insufficient, so by the time a problem is noticed it is already too late.

Rapid business growth and faster iteration lead to frequent interface changes, more dependencies, and declining code quality. Without a complete monitoring system that automatically tracks the performance of every interface in the application and automatically records problematic calls, problems are only addressed after users complain, which is too late.

2. How to locate a "slow" problem in 1 minute

ARMS (Application Real-Time Monitoring Service) is Alibaba Cloud's full-link Application Performance Management (APM) monitoring product. ARMS covers scenarios such as Java application monitoring and diagnosis, Internet of Vehicles real-time monitoring, retail real-time monitoring, and user experience monitoring. It includes front-end monitoring, application monitoring, and custom monitoring, so that real-time business monitoring can be built quickly.

Step 1: Install the Java probe (if your application is hosted on EDAS, you can skip this step)

• Open ARMS and create an application.

• Download the Java probe package and unzip it.

• Add -javaagent:/{user.workspace}/ArmsAgent/arms-bootstrap-1.7.0-SNAPSHOT.jar -Darms.licenseKey=xxx -Darms.appId=xxx to the JVM startup parameters (fill in the appId and licenseKey with the values assigned on the page).

• Open the ARMS page. Once data starts being reported, the Java probe has been installed successfully.
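If no data shows up, one quick local check is whether the JVM was actually started with the flags from the previous bullet. The sketch below uses the standard `RuntimeMXBean.getInputArguments()` call to inspect the startup arguments. The class name `AgentCheck` and the helper `hasJavaAgent` are hypothetical, and the check only confirms that the flags were passed, not that the probe is reporting to ARMS.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class AgentCheck {
    // Returns true if any JVM argument is a -javaagent flag whose path contains the given fragment.
    public static boolean hasJavaAgent(List<String> jvmArgs, String jarNameFragment) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-javaagent:") && arg.contains(jarNameFragment)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments the JVM was started with, such as -javaagent:... and -D... options.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        // "arms-bootstrap" is taken from the jar name in the article's startup parameters.
        System.out.println("ARMS agent flag present: " + hasJavaAgent(jvmArgs, "arms-bootstrap"));

        // -Dname=value options are visible as system properties.
        System.out.println("arms.appId set: " + (System.getProperty("arms.appId") != null));
        System.out.println("arms.licenseKey set: " + (System.getProperty("arms.licenseKey") != null));
    }
}
```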


Step 2: Find "slow" suspicious clues in the application overview

Open the ARMS application topology diagram. In the application overview, we can clearly see that the system has recorded 5 "slow SQL" occurrences today.


Step 3: Browse and find the "slow interface"

Click the interface list to see at a glance all the interfaces this application provides, along with the number of calls and the time taken by each one. These interfaces are discovered automatically by the ARMS probe, without any configuration.

Among these interfaces, the "slow" ones are clearly marked, and one suspiciously slow interface stands out.

Select the "slow" interface with the most calls on the left. The panel on the right shows that this call is "slow" in its database call.


Step 4: Which line of code is slow? Locate the cause with one click

• Knowing how long the interface takes is not enough. We need to pinpoint the line of code where the "slowness" occurs.

• Click "Interface Snapshot" to see all the snapshots for this interface. A snapshot is a complete record of the full-link call chain of a single call. The ARMS probe records the code executed and the time taken by each call with very small performance overhead, which helps you pinpoint "slow" problems.


• Click the TraceId of a call snapshot and expand it to see exactly which line is "slow" for this call. In the expanded snapshot, we can clearly see that in this call, which took 705 milliseconds, most of the time was spent in the "SELECT * FROM l_employee" SQL call, which is obviously a full table scan.

• At this point we have found the root cause of one slow call in the system, and we have enough evidence to guide the next step of code optimization. We can also return to the interface list, open the other "slow" calls one by one, and resolve each of them. With the help of ARMS, your website can stay clear of lag and give users a smoother experience.
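The drill-down in this step amounts to walking a call tree and finding the node where the time is actually spent. The sketch below models that with a hypothetical `Span` class and picks the span with the largest self time (its own duration minus that of its children, assuming children run sequentially). Only the 705 ms total and the SQL text come from the article; the method names and the other durations are invented for illustration, and this is not the ARMS data model.

```java
import java.util.ArrayList;
import java.util.List;

public class TraceDrillDown {
    // Hypothetical model of one node in a call snapshot.
    public static final class Span {
        public final String name;
        public final long durationMillis;
        public final List<Span> children = new ArrayList<>();

        public Span(String name, long durationMillis) {
            this.name = name;
            this.durationMillis = durationMillis;
        }

        public Span child(Span c) {
            children.add(c);
            return this;
        }

        // Time spent in this span itself, excluding its (sequential) children.
        public long selfMillis() {
            long childTotal = 0;
            for (Span c : children) {
                childTotal += c.durationMillis;
            }
            return durationMillis - childTotal;
        }
    }

    // Depth-first search for the span with the largest self time.
    public static Span slowestBySelfTime(Span root) {
        Span best = root;
        for (Span c : root.children) {
            Span candidate = slowestBySelfTime(c);
            if (candidate.selfMillis() > best.selfMillis()) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // 705 ms total is from the article; 690 and 650 are made-up values.
        Span sql = new Span("SELECT * FROM l_employee", 650);
        Span service = new Span("EmployeeService.listAll", 690).child(sql);
        Span root = new Span("GET /employees", 705).child(service);

        Span slowest = slowestBySelfTime(root);
        System.out.println(slowest.name + " (" + slowest.selfMillis() + " ms self time)");
    }
}
```

With these example numbers the SQL span is selected, which mirrors the conclusion reached by reading the snapshot manually.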

Step 5: Prevent problems before they happen by setting alarms

In the ARMS alarm settings, you can set alarms for a single interface or for all interfaces, so that your operations team is notified as soon as an interface starts to lag.
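To make the idea of an alarm rule concrete, here is a minimal sketch of one possible rule: fire when the average response time over the last N calls exceeds a threshold. The class `ResponseTimeAlarm`, the window of 3 calls, the 500 ms threshold, and the sample values are all assumptions. ARMS alarm rules are configured in its console, and this code does not reflect their actual semantics.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ResponseTimeAlarm {
    private final int windowSize;           // number of most recent calls to average over
    private final double thresholdMillis;   // alarm fires when the average exceeds this
    private final Deque<Long> window = new ArrayDeque<>();
    private long sum;

    public ResponseTimeAlarm(int windowSize, double thresholdMillis) {
        this.windowSize = windowSize;
        this.thresholdMillis = thresholdMillis;
    }

    // Records one response time and returns true if the alarm condition holds.
    public boolean record(long responseMillis) {
        window.addLast(responseMillis);
        sum += responseMillis;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();    // drop the oldest sample
        }
        // Only evaluate once the window is full, to avoid firing on a single outlier.
        return window.size() == windowSize && average() > thresholdMillis;
    }

    public double average() {
        return window.isEmpty() ? 0.0 : (double) sum / window.size();
    }

    public static void main(String[] args) {
        ResponseTimeAlarm alarm = new ResponseTimeAlarm(3, 500.0);
        long[] samples = {120, 130, 705, 900, 880};   // made-up response times in ms
        for (long sample : samples) {
            boolean firing = alarm.record(sample);
            System.out.println(sample + " ms -> alarm " + (firing ? "FIRING" : "ok"));
        }
    }
}
```

Averaging over a window rather than alerting on each call is one way to avoid paging the team for a single slow request; the right window and threshold depend on the interface.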


3. What other website experience issues are there?

Besides lag and slow page loading, websites also suffer from other problems such as back-end errors, page load failures, and memory leaks. To learn how to use ARMS to quickly resolve more of these problems, follow our ARMS article series, "One-minute location of common website problems".
