


What should you do when a Node service's CPU usage is too high, and how do you investigate it? This article lays out a troubleshooting approach for high CPU usage in Node services.
Helping a colleague look into a high-CPU problem
- Once the CPU usage went up, it never came back down. The colleague eventually traced it to a dependency upgrade: after a major version bump, the default shared redis configuration had been taken offline (the project is old and had not been touched for a long time), and the business side was now expected to configure or disable the redis service in its own code. Because of an information gap, the business side did not know it had to disable redis, so after going live the service kept retrying its redis connection (each additional request triggered one more retry).
Afterwards we summarized our troubleshooting approach as follows. Additions are welcome.
Troubleshooting ideas
0. Restart the instance
Some problems can be solved simply by restarting the instance.
Restart the instance first. This step is necessary to make the service available again. If the CPU still climbs quickly afterwards, consider rolling back the code first. If it climbs slowly, you do not need to roll back, but you should troubleshoot the problem as soon as possible.
1. Use the Linux shell to determine whether the node process is the cause
Command 1: top
- The output shows that the node processes are the main CPU consumers.
[root@*** ~]# top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  680 root      20   0 2290976 168176  34976 S  30.3  2.0 103:42.59 node
  687 root      20   0 2290544 166920  34984 R  26.3  2.0  96:26.42 node
   52 root      20   0 1057412  23972  15188 S   1.7  0.3  11:25.97 ****
  185 root      20   0  130216  41432  25436 S   0.3  0.5   1:03.44 ****
...
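The same question (is this process the one burning CPU?) can also be answered from inside the service. The following is a minimal sketch, assuming you are able to add a small amount of code to the service; it relies on Node's built-in process.cpuUsage(), and the helper names cpuPercent and sampleCpu are made up for illustration. Like the %CPU column in top, the result is relative to a single core, so it can exceed 100 when several threads are busy.

```javascript
// Sketch: estimate this process's CPU usage as a percentage of one core.
// process.cpuUsage() reports user and system CPU time in microseconds.

/**
 * Convert a CPU-time delta into a percentage of a single core.
 * `delta` is { user, system } in microseconds, `elapsedMs` is wall-clock time.
 */
function cpuPercent(delta, elapsedMs) {
  if (elapsedMs <= 0) return 0;
  const usedMs = (delta.user + delta.system) / 1000;
  return (usedMs / elapsedMs) * 100;
}

/**
 * Sample the current process for `intervalMs` milliseconds and resolve
 * with the CPU percentage observed over that window.
 */
function sampleCpu(intervalMs = 1000) {
  const startUsage = process.cpuUsage();
  const startTime = process.hrtime.bigint();
  return new Promise((resolve) => {
    setTimeout(() => {
      // Passing the earlier reading returns only the usage since then.
      const delta = process.cpuUsage(startUsage);
      const elapsedMs = Number(process.hrtime.bigint() - startTime) / 1e6;
      resolve(cpuPercent(delta, elapsedMs));
    }, intervalMs);
  });
}

// Possible usage: log the value periodically and compare it with top.
// sampleCpu(2000).then((pct) => console.log(`cpu: ${pct.toFixed(1)}%`));
```

Logging this value over time is one way to see whether the usage keeps climbing after a restart, which matches the symptom described at the start of the article.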
Command 2: vmstat
- Start with the vmstat 2 command, which collects a sample every two seconds.
[root@*** ~]# vmstat 2
procs -----------memory---------------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd      free   buff    cache   si   so    bi    bo    in    cs us sy  id wa st
 0  0      0 233481328 758304 20795516    0    0     0     1     0     0  0  0 100  0  0
 0  0      0 233480800 758304 20795520    0    0     0     0   951  1519  0  0 100  0  0
 0  0      0 233481056 758304 20795520    0    0     0     0   867  1460  0  0 100  0  0
 0  0      0 233481408 758304 20795520    0    0     0    20   910  1520  0  0 100  0  0
 0  0      0 233481680 758304 20795520    0    0     0     0   911  1491  0  0 100  0  0
 0  0      0 233481920 758304 20795520    0    0     0     0   889  1530  0  0 100  0  0
- procs
r: The run queue, that is, the number of processes running or waiting for CPU time. When this value stays above the number of CPUs, there is a CPU bottleneck. It is related to the load shown by top, which is roughly the average run queue length. As a rule of thumb, a load above 3 is relatively high, above 5 is high, and above 10 is abnormal and means the server is in a dangerous state. A large run queue means the CPU is very busy, which generally shows up as high CPU usage.
b: The number of blocked processes, that is, processes waiting for a resource such as I/O.
- memory
swpd: The amount of virtual memory (swap) in use. A value greater than 0 usually means that physical memory has been insufficient. If a memory leak in the program is not the cause, you should add memory or move memory-hungry tasks to other machines.
free: The amount of free physical memory.
buff: Memory used as buffers, where Linux/Unix keeps things such as directory contents and permissions.
cache: Memory used to cache the files we open. Part of the free physical memory is used to cache files and directories in order to improve program performance. When programs need memory, buff/cache is reclaimed quickly.
- swap
si: The amount of virtual memory read in from disk per second. A value greater than 0 means that physical memory is insufficient or that memory is leaking. Find the memory-hungry process and deal with it. My machine has plenty of memory, so everything is fine here.
so: The amount of virtual memory written out to disk per second. A value greater than 0 means the same as above.
- io
bi: The number of blocks received from block devices per second (reads). Block devices here means all disks and other block devices on the system. The default block size is 1024 bytes.
bo: The number of blocks sent to block devices per second (writes). For example, reading a file makes bi greater than 0, and writing a file makes bo greater than 0. bi and bo are normally close to 0. Otherwise I/O is too frequent and needs tuning.
- system
in: The number of interrupts per second, including clock interrupts.
cs: The number of context switches per second, covering both thread switches and process switches, for example when a thread blocks in a system call and another one is scheduled. The smaller this value, the better. If it is too large, consider reducing the number of threads or processes.
- cpu
us: User CPU time. On a server that did frequent encryption and decryption, I saw us close to 100 and the r run queue reach 80 (the machine was under stress testing and performed poorly).
sy: System CPU time. A high value means a lot of time is spent in system calls, for example because of frequent I/O operations.
id: Idle CPU time. Roughly speaking, id + us + sy = 100: id is the idle percentage, us is the user percentage, and sy is the system percentage.
wa: CPU time spent waiting for I/O.
- In practice
procs r: A large value means many processes are running and the system is very busy.
bi/bo: The amount of data read from and written to disk. For large files, values within 10M are basically nothing to worry about. For small files, values within 2M are basically normal.
cpu us: Staying above 50% is acceptable during service peaks. If it stays above 50% for a long time, consider optimizing.
cpu sy: The percentage of CPU time spent in the kernel. The reference value for us + sy is 80%. If us + sy is greater than 80%, the CPU may be insufficient.
cpu wa: The percentage of CPU time spent waiting for I/O. The reference value is 30%. If wa exceeds 30%, I/O wait is serious. This may be caused by a large number of random disk accesses, or by a bandwidth bottleneck in the disk or the disk controller (mainly block operations).
Reference link: https://www.cnblogs.com/zsql/p/11643750.html
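The reference values above can be turned into a small check. This is only a sketch under assumptions: it expects a vmstat data line with the 17 columns shown in the sample output, the thresholds are the rules of thumb from this section rather than hard limits, and the names VMSTAT_COLUMNS, parseVmstatLine and checkVmstat are hypothetical.

```javascript
// Column order of a vmstat data line, as in the sample output above.
const VMSTAT_COLUMNS = [
  'r', 'b', 'swpd', 'free', 'buff', 'cache', 'si', 'so',
  'bi', 'bo', 'in', 'cs', 'us', 'sy', 'id', 'wa', 'st',
];

/** Parse one whitespace-separated vmstat data line into a named object. */
function parseVmstatLine(line) {
  const values = line.trim().split(/\s+/).map(Number);
  if (values.length < VMSTAT_COLUMNS.length || values.some(Number.isNaN)) {
    throw new Error(`not a vmstat data line: ${line}`);
  }
  const sample = {};
  VMSTAT_COLUMNS.forEach((name, i) => { sample[name] = values[i]; });
  return sample;
}

/** Apply this section's reference values and return a list of warnings. */
function checkVmstat(sample, cpuCount) {
  const warnings = [];
  if (sample.r > cpuCount) warnings.push('run queue exceeds CPU count');
  if (sample.us + sample.sy > 80) warnings.push('us + sy above 80%');
  if (sample.wa > 30) warnings.push('wa above 30%');
  if (sample.si > 0 || sample.so > 0) warnings.push('swapping (si/so > 0)');
  return warnings;
}

// Possible usage with the first data line from the output above:
// checkVmstat(parseVmstatLine(' 0 0 0 233481328 758304 20795516 0 0 0 1 0 0 0 0 100 0 0'), 4);
```

A single sample says little; the values are only meaningful when they persist across several consecutive samples.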
2. Look at the code diff
If restarting the instance does not solve the problem, and you have confirmed that the node process is the cause,
check the commits that went online and review the code diff to see whether the problem can be spotted there.
3. Open the runtime CPU profiler
This works much like the approach in my other article, How to quickly locate SSR server memory leaks.
Start the service with node --inspect
- Simulate the online environment locally using the built code. The build output may not be usable as-is: environment variables must be set carefully, and uglify compression must be turned off.
  - For example, point some environment variables (CDN domain name, etc.) at local addresses, because the package is local and has not been uploaded to the CDN.
Generate a CPU profile
What if the online environment cannot be simulated locally?
For example, if the downstream RPC is isolated from your local machine, the only option is to add code that creates a profile: nodejs.org/docs/latest…
After getting the profile file, open it with Chrome DevTools.
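For the case where the profile has to be created from code, the following is a minimal sketch based on Node's built-in inspector module. The function names captureCpuProfile and saveCpuProfile, the duration, and the output path are assumptions for illustration. How and when to trigger it (an internal route, a signal, a timer) depends on your service and is not shown.

```javascript
const inspector = require('inspector');
const fs = require('fs');

/**
 * Profile the current process for `durationMs` milliseconds and resolve
 * with the CPU profile object returned by the V8 profiler.
 */
function captureCpuProfile(durationMs) {
  const session = new inspector.Session();
  session.connect();
  // Wrap the callback-style session.post() in a promise.
  const post = (method) => new Promise((resolve, reject) => {
    session.post(method, (err, result) => (err ? reject(err) : resolve(result)));
  });
  return post('Profiler.enable')
    .then(() => post('Profiler.start'))
    .then(() => new Promise((resolve) => setTimeout(resolve, durationMs)))
    .then(() => post('Profiler.stop'))
    .then(({ profile }) => profile)
    .finally(() => session.disconnect());
}

/** Write the profile as JSON so that it can be loaded into a viewer. */
function saveCpuProfile(profile, filePath) {
  fs.writeFileSync(filePath, JSON.stringify(profile));
}

// Possible usage (the path is hypothetical):
// captureCpuProfile(10000).then((p) => saveCpuProfile(p, './service.cpuprofile'));
```

Saving the file with a .cpuprofile extension is the usual choice so that Chrome DevTools recognizes it. Profiling adds some overhead of its own, so a short capture window on a single instance is the safer choice.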
4. Analyze the CPU profiler
Combine the profile with the code diff to find the cause.
You can also upload the profile file to www.speedscope.app/ (file upload) to get a flame graph of the CPU profile (more detailed introduction: www.npmjs.com/package/spe…).
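If you prefer a quick text summary before opening a flame graph, the profile can also be aggregated directly. The sketch below assumes the JSON layout produced by the V8 profiler (nodes, samples, timeDeltas) and attributes each time delta to the function that was on top of the stack for that sample, which is an approximation of self time. The name selfTimeByFunction is made up for illustration.

```javascript
/**
 * Sum the sampled self time per function in a parsed .cpuprofile object
 * and return the entries sorted from most to least expensive.
 */
function selfTimeByFunction(profile) {
  const nodesById = new Map(profile.nodes.map((node) => [node.id, node]));
  const totals = new Map();
  profile.samples.forEach((nodeId, i) => {
    const node = nodesById.get(nodeId);
    if (!node) return;
    const { functionName, url, lineNumber } = node.callFrame;
    const key = `${functionName || '(anonymous)'} ${url}:${lineNumber}`;
    const deltaUs = profile.timeDeltas[i] || 0; // microseconds
    totals.set(key, (totals.get(key) || 0) + deltaUs);
  });
  return [...totals.entries()]
    .map(([fn, selfUs]) => ({ fn, selfMs: selfUs / 1000 }))
    .sort((a, b) => b.selfMs - a.selfMs);
}

// Possible usage (the path is hypothetical):
// const profile = JSON.parse(require('fs').readFileSync('./service.cpuprofile', 'utf8'));
// console.table(selfTimeByFunction(profile).slice(0, 10));
```

Entries near the top that also appear in the code diff are good candidates for the cause.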
5. Stress test verification
You can use ab (ApacheBench) or another stress-testing tool.
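When no dedicated tool is at hand, a rough load can also be generated from a short Node script. This is only a sketch, not a replacement for ab: the function names requestOnce and runLoad are hypothetical, it handles plain http URLs only, and it reports nothing more than the number of completed and failed requests.

```javascript
const http = require('http');

/** Send one GET request and resolve with its outcome and duration. */
function requestOnce(url) {
  return new Promise((resolve) => {
    const started = Date.now();
    http.get(url, (res) => {
      res.resume(); // discard the body so the socket is released
      res.on('end', () => resolve({ ok: res.statusCode < 400, ms: Date.now() - started }));
    }).on('error', () => resolve({ ok: false, ms: Date.now() - started }));
  });
}

/**
 * Issue `totalRequests` requests against `url`, keeping at most
 * `concurrency` of them in flight at any time.
 */
async function runLoad(url, totalRequests, concurrency) {
  let issued = 0;
  const results = [];
  async function worker() {
    while (issued < totalRequests) {
      issued += 1; // claim a request slot before awaiting
      results.push(await requestOnce(url));
    }
  }
  await Promise.all(Array.from({ length: concurrency }, () => worker()));
  const failed = results.filter((r) => !r.ok).length;
  return { total: results.length, failed };
}

// Possible usage (the URL is hypothetical):
// runLoad('http://127.0.0.1:3000/', 1000, 20).then(console.log);
```

Watching top or vmstat while the load runs, before and after the fix, shows whether the CPU still climbs.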
Summary
- Restart the instance
- Make sure the node process is the cause
- Look at the code diff
- Generate a runtime CPU profile
- Combine the profile and the code diff to find the cause
- Verify with a stress test