Database performance monitoring is something every DBA worth their salt should be doing on a regular basis.
It should be a proactive task that helps you identify issues early, before they become serious, and it should also be part of your post-deployment monitoring process.
Linux-based operating systems come bundled with a heap of great tools that you can use as a DBA to monitor the performance of your database server. If you are not happy with what you get “out of the box”, you can also find some great database monitoring tools online that are free to download.
For this post, I’m going to talk about both MySQL and Linux operating system performance monitoring tools. In many scenarios, you’ll need both types in order to get a complete understanding of where the delays are in your system.
1/ MySQL slow query log
The MySQL slow query log is absolutely brilliant for capturing slow queries hitting your MySQL databases.
You can log queries that take longer than the threshold you specify in my.cnf. So you can analyze queries which take more than 3 seconds, for example.
Activate it in my.cnf, with customizable settings for the log location, the long query time and whether to log queries that do not use any indexes.
#slow query logging
slow-query-log = 1
slow-query-log-file = /var/log/mysql/slow-log
long-query-time = 3
log-queries-not-using-indexes = 0
Once you have been logging for a while, you can aggregate the results with the mysqldumpslow utility, optimize the worst offenders and then monitor for improvements!
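To give a feel for the kind of aggregation mysqldumpslow does, here is a minimal Python sketch. It is not how mysqldumpslow itself is implemented. It assumes the usual slow log layout, where a "# Query_time:" comment line precedes each statement, and it only looks at the first line of each statement. The function name summarize_slow_log and the sample log entries are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the timing comment that precedes each slow-log entry.
QUERY_TIME = re.compile(r"^# Query_time:\s+([\d.]+)")

def summarize_slow_log(lines):
    """Return {normalized_query: (count, total_seconds)}.

    Hypothetical helper: groups similar statements by replacing literals
    with '?'. Only the first line of each statement is considered.
    """
    totals = defaultdict(lambda: [0, 0.0])
    current_time = None
    for line in lines:
        line = line.strip()
        match = QUERY_TIME.match(line)
        if match:
            current_time = float(match.group(1))
        elif (current_time is not None and line
              and not line.startswith(("#", "SET timestamp", "use "))):
            # Replace quoted strings and numbers so similar queries group together.
            normalized = re.sub(r"'[^']*'|\b\d+\b", "?", line)
            totals[normalized][0] += 1
            totals[normalized][1] += current_time
            current_time = None
    return {query: (count, total) for query, (count, total) in totals.items()}

# Invented sample entries, for illustration only.
SAMPLE_LOG = """\
# Time: 140627 11:20:01
# User@Host: app[app] @ localhost []
# Query_time: 4.200000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 500000
SET timestamp=1403868001;
SELECT * FROM orders WHERE customer_id = 42;
# Time: 140627 11:25:13
# User@Host: app[app] @ localhost []
# Query_time: 3.800000  Lock_time: 0.000090 Rows_sent: 1  Rows_examined: 480000
SET timestamp=1403868313;
SELECT * FROM orders WHERE customer_id = 77;
""".splitlines()

for query, (count, total) in summarize_slow_log(SAMPLE_LOG).items():
    print(f"{count}x  {total:.1f}s total  {query}")
```

For real analysis, stick with mysqldumpslow; the point here is just that grouping similar statements makes the repeat offenders stand out.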
2/ MySQL Performance Schema
Introduced in version 5.5, the performance_schema database provides a way of querying internal execution of the server at run-time.
To enable it, add “performance_schema” to my.cnf.
There are many objects to query, too many to talk about in this post. Check out the MySQL documentation for the details.
3/ The MySQL process list
To get an idea of how many processes are connected to your MySQL instance, what they are running and for how long, you can run SHOW FULL PROCESSLIST or alternatively read from the information_schema.processlist table.
mysql> SELECT user, host, time, info FROM information_schema.processlist;
+-------------+------------+-------+-------------------------------------------------------------------+
| user        | host       | time  | info                                                              |
+-------------+------------+-------+-------------------------------------------------------------------+
| root        | localhost  | 0     | SELECT user, host, time, info FROM information_schema.processlist |
| replication | srv1:46892 | 11843 | NULL                                                              |
+-------------+------------+-------+-------------------------------------------------------------------+
2 rows in set (0.00 sec)
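Once you have the rows, a small filter makes the long runners easy to spot. This is only a sketch in Python: it assumes the rows have already been fetched as (user, host, time, info) tuples by whatever client library you use, and the function name long_running and the third sample row are hypothetical.

```python
def long_running(rows, min_seconds):
    """Return rows that are executing a statement for at least min_seconds.

    Each row is a (user, host, time, info) tuple. Rows whose info is None
    (NULL in MySQL) are skipped, since they are not running a statement.
    """
    return [row for row in rows if row[3] is not None and row[2] >= min_seconds]

ROWS = [
    ("root", "localhost", 0,
     "SELECT user, host, time, info FROM information_schema.processlist"),
    ("replication", "srv1:46892", 11843, None),
    # Hypothetical extra row, added for illustration.
    ("app", "web1:51234", 12, "SELECT * FROM orders WHERE customer_id = 42"),
]

for user, host, seconds, info in long_running(ROWS, 3):
    print(f"{user}@{host} has been running for {seconds}s: {info}")
```

Skipping the NULL rows keeps long-lived connections, such as the replication thread above, from drowning out the queries you actually care about.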
4/ mtop
I love this utility. It provides a real-time view of the MySQL process list and refreshes at the interval, in seconds, that you specify when you run it.
What I really like about it is that you can have it running on one screen and, as problems occur, the threads change colour, with red indicating that something has been running for some time.
There is a great article here about how to install it on different flavours of Linux as well as some detail on how to run it.
5/ SHOW STATUS
Like SHOW PROCESSLIST, SHOW STATUS is run from the command line and gives you a moment-in-time report on the server's status variables.
For example, if you want to get information about the query cache, you can run:
mysql> SHOW STATUS LIKE 'Qcache%';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| Qcache_free_blocks      | 9353       |
| Qcache_free_memory      | 93069936   |
| Qcache_hits             | 9719103977 |
| Qcache_inserts          | 1451857238 |
| Qcache_lowmem_prunes    | 897050960  |
| Qcache_not_cached       | 222234089  |
| Qcache_queries_in_cache | 20856      |
| Qcache_total_blocks     | 52497      |
+-------------------------+------------+
8 rows in set (0.00 sec)
This type of reporting can help you monitor specific areas of your MySQL instance. For example, if you wanted to know the query cache hit rate, you could get the numbers from above and calculate based on this formula:
((Qcache_hits/(Qcache_hits+Qcache_inserts+Qcache_not_cached))*100)
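As a quick worked example, here is the formula above as a small Python function, fed with the Qcache numbers from this post. The function name query_cache_hit_rate is my own; in practice you would read the values from SHOW STATUS rather than hard-coding them.

```python
def query_cache_hit_rate(hits, inserts, not_cached):
    """Hit rate as a percentage, using the formula from this post."""
    total = hits + inserts + not_cached
    if total == 0:
        # Nothing has gone through the cache yet, so avoid dividing by zero.
        return 0.0
    return hits / total * 100

# Values copied from the SHOW STATUS LIKE 'Qcache%' output in this post.
status = {
    "Qcache_hits": 9719103977,
    "Qcache_inserts": 1451857238,
    "Qcache_not_cached": 222234089,
}

rate = query_cache_hit_rate(
    status["Qcache_hits"],
    status["Qcache_inserts"],
    status["Qcache_not_cached"],
)
print(f"Query cache hit rate: {rate:.2f}%")
```

With these numbers the hit rate comes out at roughly 85%.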
For more information, see this link .
6/ top
This lists running processes and the resources they are consuming. It updates in real time, so you can quickly gauge, at a very high level, whether any processes are consuming a large share of CPU or memory.
top - 17:33:48 up 7 min,  1 user,  load average: 0.03, 0.04, 0.04
Tasks:  64 total,   1 running,  63 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    604332k total,   379280k used,   225052k free,    11724k buffers
Swap:        0k total,        0k used,        0k free,   135064k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  809 tomcat7   20   0 1407m 149m  13m S  0.3 25.4   0:10.99 java
 1153 ubuntu    20   0 81960 1592  756 S  0.3  0.3   0:00.01 sshd
 1318 root      20   0 17320 1256  972 R  0.3  0.2   0:00.07 top
    1 root      20   0 24340 2284 1344 S  0.0  0.4   0:00.39 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S  0.0  0.0   0:00.03 ksoftirqd/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/0:0
    5 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kworker/u:0
7/ free
This utility helps you work out whether you have a memory issue, and it is another great tool for getting a high-level view. I like to use “free -m” as it returns the numbers in megabytes instead of the default kilobytes. The information returned shows memory in use, free memory and swap usage. It also shows how much memory the kernel is using for buffers and cache.
root@vm1:~# free -m
             total       used       free     shared    buffers     cached
Mem:           590        373        216          0         11        131
-/+ buffers/cache:        229        360
Swap:            0          0          0
8/ vmstat
This utility is very useful for monitoring many areas of the system: CPU, block I/O and swap. I find it particularly good for monitoring swap usage.
Whilst “free” might tell you if there are any pages in the swap file, vmstat will tell you if your system is actively swapping. Computers and servers do need to use their swap file, but the less this happens, the better it is for your application's performance.
Swap becomes a problem when it is being used constantly, which can be a sign that you don’t have enough memory installed in your system.
By default, running vmstat will not give you a real-time view of your system; it prints a single report. To get a fresh readout at a regular interval, add the number of seconds to the command. In this example, I am specifying every 2 seconds.
root@vm1:~# vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 221324  12556 135252    0    0    93    19   40   75  1  0 98  0
 0  0      0 221324  12556 135276    0    0     0     0   34   65  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0    24   38   64  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0     0   32   56  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0     0   33   56  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0     0   30   55  0  1 100  0
 0  0      0 221324  12564 135280    0    0     0     0   35   59  0  0 100  0
The columns you are interested in are si and so under the swap heading, which stand for “swap in” and “swap out”. These figures tell you how much is being read back in from the swap file on disk (si) and how much is being written out to the swap file (so). Swapping is a very slow, I/O-intensive process, so if it is a problem you will want to do some optimization somewhere or add more memory.
Run “man vmstat” for a full list of features and documentation.
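If you want to check for active swapping automatically, something like the following Python sketch could work. It parses captured vmstat text, finds the si and so columns by name and reports whether any sample shows swap activity. The function name swap_activity is hypothetical, and the sample text is taken from the first two readings of the vmstat run in this post.

```python
def swap_activity(vmstat_output):
    """Return a list of (si, so) pairs, one per vmstat sample line."""
    lines = [line.split() for line in vmstat_output.strip().splitlines()]
    # The column-name row is the one that contains both "si" and "so".
    header_index = next(
        i for i, cols in enumerate(lines) if "si" in cols and "so" in cols
    )
    header = lines[header_index]
    si_col, so_col = header.index("si"), header.index("so")
    samples = []
    for cols in lines[header_index + 1:]:
        # Skip repeated header rows; keep only numeric sample rows.
        if len(cols) == len(header) and cols[0].isdigit():
            samples.append((int(cols[si_col]), int(cols[so_col])))
    return samples

SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 221324  12556 135252    0    0    93    19   40   75  1  0 98  0
 0  0      0 221324  12556 135276    0    0     0     0   34   65  0  0 100  0
"""

samples = swap_activity(SAMPLE)
actively_swapping = any(si > 0 or so > 0 for si, so in samples)
print("actively swapping" if actively_swapping else "no swap activity")
```

Looking the columns up by name rather than by position is a deliberate choice, since the exact set of columns can differ between vmstat versions.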
9/ sar
I love sar! It captures a whole bunch of metrics covering CPU time, CPU queues, RAM, I/O and network activity, and presents them as a historical report in which each line gives a point-in-time view of resource usage.
The default time between report lines is 10 minutes but you can change that. It’s great for seeing whether you have any particularly heavy areas of resource pressure at any time in the day. You can also use it as a performance monitoring tool to measure the effects of optimizations to your system.
Here are some examples. Run “man sar” for a full list of features and documentation on what each column header means.
sar -q (check CPU queue length)
11:20:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
11:30:01 AM         1       201      0.00      0.00      0.00
11:40:01 AM         1       200      0.00      0.00      0.00
11:50:01 AM         1       201      0.00      0.00      0.00
12:00:01 PM         2       201      0.00      0.00      0.00
sar -r (check RAM usage)
11:20:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
11:30:01 AM    151308   3765480     96.14     91416   1054136   2961684     49.25
11:40:01 AM    151076   3765712     96.14     91664   1054136   2961012     49.24
11:50:01 AM    150680   3766108     96.15     91888   1054148   2961152     49.24
12:00:01 PM    150704   3766084     96.15     92104   1054152   2961340     49.24
10/ iostat
This tool will give you CPU statistics and I/O statistics for devices, partitions and network file systems. It is great for finding out where the busiest drives are, for example.
root@vm1 ~# iostat
Linux 2.6.32-431.11.2.el6.x86_64 (vm1)  06/27/2014  _x86_64_  (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.23    0.00    0.07    0.10    0.00   99.60

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              11.78       785.38       450.12 1437054564  823620760
dm-0              1.00         1.35         6.67    2472280   12211040
dm-1             64.52       783.30       441.42 1433252442  807699512
dm-2              0.00         0.00         0.02       7658      29336
dm-3              0.27         0.53         2.01     978626    3680440
So there you have it – 10 really useful tools which you can utilize in your database performance monitoring efforts. There are many more but I’ve run out of time now.