
MySQL Slave triggers oom-killer: a solution

WBOY, 2016-08-20 08:48:10

Recently I have been getting frequent alerts about memory running low on our MySQL instances. Logging in to the servers, I found MySQL eating up 99% of the memory. Good grief!

Sometimes, if it is not handled in time, the kernel "restarts" MySQL for us (the oom-killer kills mysqld), and dmesg then shows records like these:

Mar 9 11:29:16 xxxxxx kernel: mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Mar 9 11:29:16 xxxxxx kernel: mysqld cpuset=/ mems_allowed=0
Mar 9 11:29:16 xxxxxx kernel: Pid: 99275, comm: mysqld Not tainted 2.6.32-431.el6.x86_64 #1
Mar 9 11:29:16 xxxxxx kernel: Call Trace:

Here is the specific scenario:

Environment: operating system and MySQL versions:

OS: CentOS release 6.5 (Final) Kernel: 2.6.32-431.el6.x86_64 (physical machine)
MySQL: Percona 5.6.23-72.1-log (single instance)

Trigger scenario: the Slave experiences periodic memory surges, regardless of whether any other connections come in, and eventually triggers the kernel oom-killer.

This problem is said to have been happening for over a year. Since I had just joined, the boss asked me to take another look and see whether I could find any clues, so I started digging:

1. I suspected the memory allocated to MySQL was unreasonable, so I compared innodb_buffer_pool_size against the physical memory and found that the buffer pool takes roughly 60% of physical RAM. So that is not the cause; rule it out. If it were, they would have found it long ago~
2. Check the operating system parameters [vm.swappiness = 1; /proc/sys/vm/overcommit_memory; oom_adj]. Before the root cause is found, you can temporarily set the mysqld process's oom_adj to -15, or even -17, so the kernel will never pick MySQL to kill. This does not solve anything fundamentally, though, and it carries some risk: if MySQL needs memory and cannot get it, will it simply hang? Something to keep in mind. (A quick sketch of the checks in items 1 and 2 follows this list.)
3. OK, so neither the MySQL startup parameters nor the OS parameters look misconfigured. Then let's look at MySQL itself!
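
A minimal sketch of the checks in items 1 and 2, assuming a single mysqld instance and a passwordless local mysql client (adjust credentials and paths for your environment); the -17 step is the temporary protection only, not a fix:

# 1. Buffer pool size vs. physical memory
mysql -e "SELECT @@innodb_buffer_pool_size/1024/1024/1024 AS bp_gb;"
free -g                                    # total physical memory, in GB

# 2. The OS-level knobs mentioned above
cat /proc/sys/vm/swappiness                # 1 in this environment
cat /proc/sys/vm/overcommit_memory
cat /proc/$(pidof mysqld)/oom_adj          # current OOM adjustment for mysqld

# Temporary protection only: tell the oom-killer to never pick mysqld.
# Risk: mysqld may simply hang if it needs memory it cannot get.
echo -17 > /proc/$(pidof mysqld)/oom_adj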

Since MySQL's memory keeps climbing, could it be caused by how memory is allocated? There is a bug reported online where MySQL's memory allocation is to blame, so I repeated the same steps in my environment (as sketched below):

1. Record the memory currently used by the mysqld process;
2. Record show engine innodb status;
3. Execute flush tables;
4. Record show engine innodb status again;
5. Record the memory used by the mysqld process again;
6. Compare the two sets of results, mainly to see whether the memory allocated by MySQL changes noticeably between before and after the Flush Tables.

Well, it seems that bug is not my problem either.
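
A rough sketch of that before/after comparison (assuming a single mysqld and a passwordless local client; file names are arbitrary):

# Memory and InnoDB status before the flush
ps -o rss= -p $(pidof mysqld)            > rss_before.txt
mysql -e "SHOW ENGINE INNODB STATUS\G"   > innodb_before.txt

mysql -e "FLUSH TABLES;"

# Memory and InnoDB status after the flush
ps -o rss= -p $(pidof mysqld)            > rss_after.txt
mysql -e "SHOW ENGINE INNODB STATUS\G"   > innodb_after.txt

# A clear drop in RSS right after FLUSH TABLES would point at that bug;
# here there was no obvious change.
diff rss_before.txt rss_after.txt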

Looking further at this version, there is an innodb_buffer_pool_instances parameter, and there is a bug on the official site about improper innodb_buffer_pool_instances and innodb_buffer_pool_size settings causing MySQL OOM. Roughly: you can set innodb_buffer_pool_size larger than your actual physical memory. For example, with 64GB of physical memory, if you set innodb_buffer_pool_size = 300GB and innodb_buffer_pool_instances > 5, MySQL can still be brought up, but it is then very prone to OOM. Details: http://bugs.mysql.com/bug.php?id=79850, take a look if interested.
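
Purely as an illustration of the misconfiguration that bug describes (do not do this on a real server; the 64GB and 300GB figures are the ones from the bug discussion above):

# Physical RAM on the box: 64GB. Append an oversized buffer pool config:
cat >> /etc/my.cnf <<'EOF'
[mysqld]
innodb_buffer_pool_size      = 300G    # far larger than physical RAM
innodb_buffer_pool_instances = 8       # > 5, per bug #79850
EOF
# mysqld will still start with this, but becomes an easy oom-killer target.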

There is one more situation, also reported as a bug: when the slave has replication filters set, OOM can also be triggered. But none of my instances set these, so I ignored this one.
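
To rule this out on your own slave, the filters show up in SHOW SLAVE STATUS and in the config file; a quick check (assuming the config lives in /etc/my.cnf):

# Non-empty Replicate_* fields mean replication filtering is in use
mysql -e "SHOW SLAVE STATUS\G" | grep -E 'Replicate_(Do|Ignore|Wild)'
grep -Ei '^replicate[-_]' /etc/my.cnf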

So it is not MySQL over-allocating memory, and it is not the open table handles either. What other reasons could there be?

Let's think about it again. This happens on the Slave, and Master and Slave have identical configurations. The only difference is that the Master runs the production workload, while on the Slave some instances serve query traffic and some run nothing at all, yet the OOM still happens. So the cause is most likely the Slave (replication) itself.

So I picked an instance and tried it. You don't know until you try, and the result was startling. I ran stop slave; start slave; on it, the command hung for about 3 minutes, and when I checked memory usage afterwards, 20GB+ had been released at once. At this point the problem is basically located. But as we all know, a Slave has two threads: is it the SQL thread or the IO thread that causes this? We will have to wait for the next occurrence to investigate further.
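
Roughly what that check looked like (a sketch, assuming a single mysqld process on the box):

# Resident memory of mysqld before restarting replication (in KB)
ps -o rss= -p $(pidof mysqld)

# This hung for about 3 minutes on the affected instance
mysql -e "STOP SLAVE; START SLAVE;"

# Resident memory afterwards: here it had dropped by 20GB+
ps -o rss= -p $(pidof mysqld)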

Here is some of the memory monitoring data:

12:00:01 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
02:40:01 PM 566744 131479292 99.57 88744 618612 132384348 89.19
02:50:01 PM 553252 131492784 99.58 83216 615068 132406792 89.20
03:00:01 PM 39302700 92743336 70.24 95908 925860 132413308 89.21
03:10:01 PM 38906360 93139676 70.54 109264 1292908 132407836 89.21
03:20:01 PM 38639536 93406500 70.74 120676 1528272 132413136 89.21

I recorded slightly more detail here: https://bugs.launchpad.net/percona-server/+bug/1560304 If you cannot access it, there is a copy at http://www.bitsCN.com/article/88729.htm

Finally, a little summary:

Phenomenon: Slave OOM
Temporary workaround: restart replication on the Slave (stop slave; start slave;)
Long-term fix: upgrade MySQL Server to a newer minor version

For more systematic information, please read what Mr. Guo wrote:
http://www.bitsCN.com/article/88726.htm
http://www.bitsCN.com/article/88727.htm
