Tips on how to troubleshoot and repair Linux system failures
I have found that Linux systems sometimes hit faults during startup that prevent a normal boot. Here I have written up several repair cases that use single-user mode, GRUB commands, and Linux rescue mode to help everyone understand how to solve such problems.
(1) Single-user mode
Linux provides a single-user mode (similar to Windows safe mode) in which system maintenance can be performed in a minimal environment. In single-user mode (runlevel 1), Linux boots into a root shell, networking is disabled, and only a few processes are running. Single-user mode can be used to repair file system damage, restore configuration files, move user data, and so on.
The following lists several typical cases of repairing system failures in single-user mode:
Case 1: Forgot the root password
In single-user mode, Linux does not ask for the root password (Red Hat systems do not, but SUSE does; distributions differ slightly, and this article uses Fedora Core 6 as its example), which makes it very easy to change the root password. It is also important to know how to enter single-user mode when the system fails to boot into multi-user mode.
1. During system startup, the boot splash screen appears. Press any key to enter the GRUB menu.
If you want to skip this prompt in the future and go straight to the GRUB menu, delete the "hiddenmenu" line from the configuration file grub.conf.
2. Press the "e" key to edit the GRUB boot menu entry. On the screen that appears, use the arrow keys to move down to the kernel line and press the "e" key again.
3. Append a space and the word single to the end of the line, press Enter to return to the previous screen, and press the "b" key to boot. The system automatically enters single-user mode. To change the root password, execute the command:
sh-3.1# passwd root
After the password has been changed successfully, execute the command exit to leave the shell and restart.
You can correct many problems that prevent the system from starting normally in single-user mode, such as:
1. Disable services that may prevent the system from running. For example, to disable the Samba service, execute:
sh-3.1# chkconfig smb off
The Samba service will then not be started the next time the system boots.
2. Change the system default run level. If X Window cannot start or fails, you can edit the /etc/inittab file and change the initdefault run level to 3 so that the system boots to a text-mode login:
id:3:initdefault:
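Before rebooting, it can help to confirm which default run level the edited file will actually select. The following is a minimal sketch, assuming the usual colon-separated inittab layout (id:runlevels:action:process); the function name default_runlevel is hypothetical and not part of any distribution.

```shell
#!/bin/sh
# default_runlevel [FILE]
# Print the run level of the first non-comment "initdefault" entry in FILE
# (defaults to /etc/inittab). Hypothetical helper for illustration only.
default_runlevel() {
    awk -F: '!/^[[:space:]]*#/ && $3 == "initdefault" { print $2; exit }' "${1:-/etc/inittab}"
}
```

Calling default_runlevel /etc/inittab after the edit above should print 3; an empty result suggests that the initdefault line is missing or commented out.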
Case 2: Damaged hard disk sectors
The most common problem during startup is that the hard disk has bad sectors or corrupted sectors (data damage), usually caused by an unexpected power failure or an improper shutdown. When this happens, the system displays a prompt at startup asking you to enter the root password or press Ctrl+D.
Enter the root password and the system will automatically enter single-user mode. Then enter "fsck -y /dev/hda6" (fsck is the file system check and repair command, "-y" answers yes to every repair prompt so that detected errors are fixed automatically, and /dev/hda6 is the partition where the error occurred; change this parameter to suit your situation). After the repair is complete, use the command "reboot" to restart.
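fsck is normally run on a file system that is not mounted, since repairing a file system that is in use can cause further damage. The sketch below shows one way to guard the repair command, assuming a mount table in /proc/mounts format; is_mounted is a hypothetical helper name, and /dev/hda6 is simply the example partition from the text.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Succeed if DEVICE appears in the first column of a /proc/mounts-style file.
# Hypothetical helper for illustration only.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example (not executed here): repair the partition only when it is not mounted.
#   is_mounted /dev/hda6 || fsck -y /dev/hda6
```

The check is deliberately conservative: it treats any mount, including a read-only one, as a reason not to run the repair.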
Case 3: GRUB option setting error
GRUB's "Error 15" message means that the system cannot find the kernel specified in grub.conf. In this example, a typing error turned "vmlinuz" in the kernel file name into "vmlinux", so the system could not find the kernel executable. Press any key to return to the GRUB editing interface, correct the error, press Enter to save, and press the "b" key to boot normally. Of course, do not forget to correct the error in the grub.conf file itself after entering the system, because a change made in the editing interface applies only to the current boot. This is a mistake that many novice Linux users easily make when modifying GRUB settings. When this black-screen prompt appears, read the error message carefully and fix the problem accordingly.
(2) GRUB boot troubleshooting
Sometimes Linux goes straight to the GRUB command-line interface after startup (only a "grub>" prompt is shown). At this point many users choose to reinstall GRUB or even reinstall the whole system. In fact, this failure usually has one of two causes: a wrong option in the GRUB configuration file, or a missing GRUB configuration file. (There are a few other causes, such as a damaged or missing kernel or image file, or an accidentally deleted /boot directory.) In the first case, you can boot the system with GRUB commands and then repair it. In the second case, you need to use Linux rescue mode (described later in this article).
First, we need to understand how GRUB boots the system. The main configuration options in the grub.conf file are as follows (note that the GRUB configuration file is /boot/grub/grub.conf; /etc/grub.conf is just a symbolic link to it):
title Fedora Core (2.6.18-1.2798.fc6)
root (hd0,0)
kernel /boot/vmlinuz-2.6.18-1.2798.fc6 ro root=LABEL=/ rhgb quiet
initrd /boot/initrd-2.6.18-1.2798.fc6.img
The "title" line names the system that GRUB boots. The "root" line specifies the partition that holds /boot. The "kernel" line specifies the location of the kernel file; when the kernel is loaded, the root file system is initially mounted read-only ("ro"), and root=LABEL=/ tells the kernel where the root partition is. The "initrd" line specifies the location of the initial RAM disk image. GRUB therefore works in this order at boot: it first sets the /boot partition, then loads the kernel and the image file in turn.
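The (hd0,0) notation on the "root" line counts both disks and partitions from zero, while Linux letters disks from "a" and numbers partitions from 1. The sketch below illustrates the translation. It assumes that the BIOS disk order seen by GRUB matches the order in which Linux names the disks (the real mapping is recorded in /boot/grub/device.map and can differ); grub_to_linux is a hypothetical helper name.

```shell
#!/bin/sh
# grub_to_linux "(hdN,M)" [PREFIX]
# Translate a GRUB device name of the form (hdN,M) into a Linux device path.
# PREFIX defaults to "hd" (IDE disks); pass "sd" for SCSI disks.
# Hypothetical helper; assumes GRUB and Linux enumerate disks in the same order.
grub_to_linux() {
    spec=$(printf '%s' "$1" | tr -d '() ')   # "(hd0,0)" -> "hd0,0"
    disk=${spec%%,*}                         # "hd0"
    part=${spec#*,}                          # "0"
    disk=${disk#hd}                          # "0"
    prefix=${2:-hd}
    # Convert disk index 0, 1, 2, ... to a, b, c, ... via the octal ASCII code.
    letter=$(printf "\\$(printf '%03o' $((97 + $disk)))")
    echo "/dev/${prefix}${letter}$(($part + 1))"
}
```

Under these assumptions, grub_to_linux "(hd0,0)" prints /dev/hda1, the first partition of the first IDE disk.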
Case: The "title Fedora Core (2.6.18-1.2798.fc6)" segment was accidentally deleted
In this situation the system automatically drops to the "grub>" command line. To troubleshoot, do the following in sequence:
1. Find the partition that holds the /boot/grub/grub.conf file:
grub> find /boot/grub/grub.conf
(hd0,0)
2. Check the grub.conf file for errors:
grub> cat (hd0,0)/boot/grub/grub.conf
It is recommended to back up grub.conf once the system has been installed and set up. If a backup such as grub.conf.bak exists, you can display it now, compare it with the current file, and find the error:
grub> cat (hd0,0)/boot/grub/grub.conf.bak
3. Once the error is confirmed, first complete the boot from the GRUB command line, then repair grub.conf after entering the system:
1) Specify the /boot partition:
grub> root (hd0,0)
2) Specify the kernel to load:
grub> kernel /boot/vmlinuz-2.6.18-1.2798.fc6 ro root=LABEL=/ rhgb quiet
3) Specify the location of the image file:
grub> initrd /boot/initrd-2.6.18-1.2798.fc6.img
Tip: GRUB supports command completion with the Tab key.
4) Boot from the /boot partition (hd0,0):
grub> boot
Command-line mode can also be entered by pressing the "c" key in the GRUB menu, and it can be used to test a newly compiled kernel (point kernel and initrd at the new kernel and image files). A better understanding of how GRUB and the Linux system boot goes a long way in troubleshooting this type of problem.
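Once the system is running again, the kernel and initrd paths in grub.conf can be checked against the files that actually exist, which catches typing errors before the next reboot. This is only a sketch: check_grub_paths is a hypothetical helper, and it assumes the simple grub.conf layout shown earlier, in which the file path is the second word on each "kernel" and "initrd" line.

```shell
#!/bin/sh
# check_grub_paths CONF [ROOT]
# For every "kernel" and "initrd" line in CONF, check that the named file
# exists under ROOT, the directory where GRUB's root partition is mounted.
# Prints "missing: PATH" for each absent file; returns non-zero if any is missing.
# Hypothetical helper; variables are global because POSIX sh has no "local".
check_grub_paths() {
    conf=$1
    root=${2:-}
    status=0
    for path in $(awk '$1 == "kernel" || $1 == "initrd" { print $2 }' "$conf"); do
        if [ ! -e "$root$path" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    return $status
}
```

With the layout used in this article, where /boot is a directory on the root partition, the call would be check_grub_paths /boot/grub/grub.conf. If /boot were a separate partition, the paths in grub.conf would be relative to that partition, and /boot would be passed as the second argument.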
(3) Linux rescue mode application
When the system cannot even enter single-user mode, or the GRUB command line cannot solve the boot problem, we need to use Linux rescue mode to troubleshoot. The steps are as follows:
1. Put the Linux installation disc (if the distribution comes on several CDs, use the first, bootable CD) into the optical drive and set the CMOS/BIOS firmware to boot from the CD. When the Linux installation screen appears, type "linux rescue" at the "boot:" prompt and press Enter to enter rescue mode. (To learn more about rescue mode, you can also press the F5 key.)
2. The system detects the hardware, boots the Linux environment from the CD, and prompts you to choose the language for rescue mode (the default, English, is recommended; in the author's tests, some Linux systems display garbled characters when Chinese is selected). Keep the default "us" for the keyboard. Networking can be configured if needed, but most repairs do not require a network connection, so you can select "No".
3. Next, the system tries to find the root partition and displays a mount prompt. By default, rescue mode mounts the root partition of the hard disk on the /mnt/sysimage directory of the CD's Linux environment. The default option "Continue" mounts it read-write, "Read-only" mounts it read-only, and "Skip" skips mounting if detection fails. Because the system needs to be repaired, read-write access is required, so the default option "Continue" is normally chosen.
In the next step, the system suggests running the "chroot /mnt/sysimage" command to switch the root directory to the root directory of the hard disk system.
Case 1: Dual system startup repair
The MBR (Master Boot Record) that holds GRUB is overwritten by the Windows bootloader NTLDR in two common situations: Linux is installed first and Windows afterwards in a dual-boot setup, or Windows in an existing dual-boot setup is damaged and then reinstalled. Either way, the Linux system can no longer boot.
1. To restore dual booting, first enter rescue mode as described above and execute the chroot command to switch the root directory to the root directory of the hard disk system:
sh-3.1# chroot /mnt/sysimage
2. Then execute the grub-install command to reinstall GRUB:
sh-3.1# grub-install /dev/hda
"/dev/hda" is the device name of the hard disk. If you use a SCSI hard disk, or Linux is installed on the second IDE hard disk, adjust this parameter accordingly.
3. Then execute the exit command twice, first to leave the chroot environment and then to leave rescue mode:
sh-3.1# exit
sh-3.1# exit
After the system restarts, GRUB-controlled dual booting is restored.
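To confirm whether the MBR currently holds GRUB, before or after running grub-install, the first sector of the disk can be inspected. The sketch below rests on an assumption: that the first-stage loader of the GRUB version used in this article embeds the literal text "GRUB" in the boot sector, whereas a Windows boot sector does not. mbr_has_grub is a hypothetical helper name.

```shell
#!/bin/sh
# mbr_has_grub DEVICE_OR_IMAGE
# Succeed if the first 512-byte sector contains the text "GRUB".
# Assumption: GRUB's first-stage loader stores that literal string in the MBR.
# Hypothetical helper for illustration only; it only reads from the device.
mbr_has_grub() {
    dd if="$1" bs=512 count=1 2>/dev/null | LC_ALL=C grep -q 'GRUB'
}

# Example (not executed here; reading a disk device requires root):
#   mbr_has_grub /dev/hda && echo "GRUB found in MBR"
```

Only the first sector is examined, so a copy of GRUB installed in a partition boot sector rather than the MBR would not be detected by this check.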
Case 2: Repair of lost system configuration file
During system boot, a very important step is that the init process reads its configuration file /etc/inittab and starts the system's basic services and the services of the default run level to complete the boot. If /etc/inittab is accidentally deleted or edited incorrectly, Linux cannot start normally, as shown in Figure 7. Such problems can only be solved through rescue mode.
Figure 7: Example of the boot error when the /etc/inittab file is missing
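1. Recovery when a backup exists
Enter rescue mode and execute the chroot command. If a backup of the file exists (backing up important system directories such as /etc and /boot is strongly recommended), simply copy the backup back, exit, and reboot. If a configuration file was edited incorrectly, typically /boot/grub/grub.conf or /etc/passwd, it can also be corrected directly. Assuming a backup file /etc/inittab.bak exists, execute the following in rescue mode:
sh-3.1# chroot /mnt/sysimage
sh-3.1# cp /etc/inittab.bak /etc/inittab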
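2. Recovery when no backup exists
If configuration files are lost or software was deleted by mistake and there is no backup, you can recover by reinstalling the package. First find out which RPM package /etc/inittab belongs to (the query still works even though the file is missing, because the RPM database still records it):
sh-3.1# chroot /mnt/sysimage
sh-3.1# rpm -qf /etc/inittab
initscripts-8.45.3-1
Exit chroot mode:
sh-3.1# exit
Mount the installation CD that holds the RPM packages (in rescue mode the CD is usually mounted on the /mnt/source directory):
sh-3.1# mount /dev/hdc /mnt/source
On Fedora, the RPM packages are stored in the Fedora/RPMS directory of the CD; other Linux distributions use similar locations, which are not listed one by one here. Because the root directory of the hard disk system being repaired is under /mnt/sysimage, the --root option is needed to specify its location. Reinstall the RPM package that contains /etc/inittab over the existing installation:
sh-3.1# rpm -ivh --replacepkgs --root /mnt/sysimage /mnt/source/Fedora/RPMS/initscripts-8.45.3-1.i386.rpm
The rpm option "--replacepkgs" reinstalls the package even though it is already installed. Once the command completes, the file has been restored.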
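If you only want to extract the /etc/inittab file from the RPM package, execute the following after entering rescue mode:
sh-3.1# rpm2cpio /mnt/source/Fedora/RPMS/initscripts-8.45.3-1.i386.rpm | cpio -idv ./etc/inittab
sh-3.1# cp etc/inittab /mnt/sysimage/etc
Note that this command cannot restore the file directly into /etc; it can only extract it under the current directory, and the file to extract must be given with its full path as stored in the package (./etc/inittab). After the file has been extracted, copy it to the corresponding location under /mnt/sysimage, where the root partition is mounted.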
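Rescue mode is a powerful tool for maintaining Linux. This article has used the two examples above to explain how to apply it, in the hope of giving readers some ideas. To resolve Linux boot failures, you must fully understand the Linux boot process; only then can you diagnose and handle faults effectively.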