1 Fault description
Login to the Cacti page in the test environment failed, and the SA asked me to check whether the database was at fault.
2 Check on the DB server
[root@xxxx mysqldata]# ps -eaf | grep mysql
root 1422 12582 0 03:48 pts/13 00:00:00 grep mysql
root 1961 1 0 Sep02 ? 00:00:03 /bin/sh /usr/bin/mysqld_safe --datadir=/opt/mysqldata --socket=/var/lib/mysql/mysql.sock --pid-file=/opt/mysqldata/mysqld.pid --basedir=/usr --user=mysql
mysql 15117 1961 3 03:44 ? 00:00:08 /usr/libexec/mysqld --basedir=/usr --datadir=/opt/mysqldata --user=mysql --log-error=/opt/mysqldata/mysqld.log --pid-file=/opt/mysqldata/mysqld.pid --socket=/var/lib/mysql/mysql.sock
root 31480 6972 0 Sep17 pts/8 00:00:00 mysql -uroot -px xxxxxxx
[root@xxxx mysqldata]#
OK, the MySQL processes look fine: they are running in the background and have not been killed. Next, try to log in:
[root@xxxx mysqldata]# mysql -uroot -p
Enter password:
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)
[root@eanltrsutl001 mysqldata]#
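The mysqld command line in the ps output from step 2 carries the path of the error log, so it can be pulled out instead of being read by eye. A minimal sketch, assuming option values contain no spaces; get_option is a hypothetical helper name, not part of MySQL:

```shell
#!/bin/sh
# get_option (hypothetical helper): read a command line on stdin and print
# the value of the first --NAME=VALUE option whose name matches $1.
# Assumption: option values contain no spaces.
get_option() {
    # Put every word on its own line, then strip the "--NAME=" prefix.
    tr ' ' '\n' | sed -n "s/^--$1=//p" | head -n 1
}

# Example against the live process list (not run here):
#   ps -eo args | grep '[m]ysqld ' | get_option log-error
```

Fed the mysqld line from step 2, this prints /opt/mysqldata/mysqld.log.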
Step 2 already showed the --log-error=/opt/mysqldata/mysqld.log parameter, so open the error log file /opt/mysqldata/mysqld.log:
[root@xxxx mysqldata]# tail -f /opt/mysqldata/mysqld.log
130930 3:39:50 InnoDB: Started; log sequence number 3 3403486393
130930 3:39:50 [ERROR] /usr/libexec/mysqld: Error writing file '/opt/mysqldata/mysqld.pid' (Errcode: 28)
130930 3:39:50 [ERROR] Can't start server: can't create PID file: No space left on device
130930 03:39:50 mysqld_safe Number of processes running now: 0
130930 03:39:50 mysqld_safe mysqld restarted
130930 3:39:50 InnoDB: Initializing buffer pool, size = 15.0G
130930 3:39:51 InnoDB: Completed initialization of buffer pool
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
130930 3:39:51 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
130930 3:39:51 InnoDB: Started; log sequence number 3 3403486393
130930 3:39:51 [ERROR] /usr/libexec/mysqld: Error writing file '/opt/mysqldata/mysqld.pid' (Errcode: 28)
130930 3:39:51 [ERROR] Can't start server: can't create PID file: No space left on device
130930 03:39:51 mysqld_safe Number of processes running now: 0
130930 03:39:51 mysqld_safe mysqld restarted
130930 3:39:51 InnoDB: Initializing buffer pool, size = 15.0G
There it is: "No space left on device". The disk must be full, so mysqld can no longer write its PID file.
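Because mysqld_safe restarts mysqld in a loop, the error log fills up with InnoDB recovery messages and the disk-full lines are easy to miss. A small sketch that counts them; count_disk_full_errors is a hypothetical helper name, and the two patterns are simply the ones seen in the log above:

```shell
#!/bin/sh
# count_disk_full_errors (hypothetical helper): print how many lines of the
# given error log mention OS error 28 or its message text.
count_disk_full_errors() {
    # $1: path to the MySQL error log
    # grep -c counts matching lines; a line matching both patterns counts once.
    grep -c -e 'Errcode: 28' -e 'No space left on device' "$1"
}

# Example (not run here):
#   count_disk_full_errors /opt/mysqldata/mysqld.log
```

Any non-zero count is a good reason to run df before looking further.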
[root@xxxx mysqldata]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/Sys-root 1008M 513M 445M 54% /
tmpfs 15G 0 15G 0% /dev/shm
/dev/mapper/Sys-applog
50G 14G 34G 29% /applog
/dev/vda1 194M 33M 152M 18% /boot
/dev/mapper/Sys-home 2.0G 68M 1.9G 4% /home
/dev/mapper/Sys-opt 20G 19G 0 100% /opt
/dev/mapper/Sys-tmp 7.9G 3.4G 4.2G 45% /tmp
/dev/mapper/Sys-usr 2.0G 1.9G 41M 98% /usr
/dev/mapper/Sys-var 7.9G 4.4G 3.2G 58% /var
/dev/mapper/Sys-crash
2.0G 68M 1.9G 4% /var/crash
/dev/mapper/Sys-log 7.9G 1.3G 6.3G 17% /var/log
/dev/mapper/Sys-vtmp 1008M 34M 924M 4% /var/tmp
//10.15.41.252/share 466G 23G 444G 5% /applog/winshare
Sure enough, the disk was full: /opt was at 100%. I immediately asked the SA to free up disk space, and the SA extended the volume to 40G.
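Spotting the full filesystem in the df listing can also be automated. A minimal sketch, assuming POSIX-format input from df -P (one line per filesystem, so long device names do not wrap) and mount points without spaces; full_filesystems is a hypothetical helper name and the default threshold of 90 is an arbitrary choice:

```shell
#!/bin/sh
# full_filesystems (hypothetical helper): read `df -P` output on stdin and
# print "mountpoint use%" for every filesystem at or above the threshold.
# Assumption: mount points contain no spaces.
full_filesystems() {
    # $1: threshold in percent (defaults to 90)
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5                 # "Capacity" column, e.g. "100%"
        sub(/%/, "", use)        # drop the percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example (not run here):
#   df -P | full_filesystems 95
```

With a threshold of 95, the listing above would flag not only /opt at 100% but also /usr at 98%.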
[root@xxxx mysqldata]# mysql -uroot -p
Enter password:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
[root@xxxx mysqldata]#
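PS: OK, that fixed it. The error is now 1045 (access denied, because no password was entered) instead of 2002, which means the server is accepting connections again. Once a full disk is given more space, MySQL recovers on its own and client connections work again without a manual restart: as the error log shows, mysqld_safe kept restarting mysqld, so a restart attempt could succeed as soon as the PID file could be written.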
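This machine is the DB server for the Cacti monitoring application in the test environment. After deploying the DB I reminded the SA to add disk monitoring, but the dev team said the data volume was small and it would be fine, and the SA, busy with other work, never got around to it.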
Lesson learned: keep reminding the SA to add disk monitoring promptly. The dev team's experience is not 100% reliable!
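Until proper monitoring is in place, even a tiny scheduled check would have caught this. A minimal sketch, not a replacement for a real monitoring system; check_disk is a hypothetical helper name, and the default threshold of 90 percent is an assumption to be tuned:

```shell
#!/bin/sh
# check_disk (hypothetical helper): return non-zero and print a warning when
# the filesystem holding the given path is at or above the threshold.
check_disk() {
    path=$1
    limit=${2:-90}
    # df -P prints a header line plus one data line; column 5 is "Capacity".
    use=$(df -P "$path" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    if [ "$use" -ge "$limit" ]; then
        echo "WARNING: $path is at ${use}% (limit ${limit}%)"
        return 1
    fi
    return 0
}

# Example (not run here): warn when the MySQL data volume passes 90%.
#   check_disk /opt 90
```

Saved as a script and called from cron every few minutes, the warning line can be mailed or logged; how it is delivered depends on the local setup.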