search
HomeOperation and MaintenanceLinux Operation and MaintenanceFrequently asked Linux interview questions: Find large files and securely clear them

本篇文章给大家分享一个Linux 线上面试高频问题:如何查找大文件并安全地清除?,给大家分析分析,大家也可以对照着自己分析一下,希望对大家有所帮助!

Frequently asked Linux interview questions: Find large files and securely clear them

1 案例描述?

  • 服务线上环境,会出现一些磁盘使用率过高而告警的情况,可能是某个日志文件过大,没有及时清理回收,如何找到大目录和大文件?

  • 如何安全的清理大文件?

  • 如何使占用的磁盘空间快速释放掉?

2 命令一(目录统计排序最佳命令)

(这里以当前目录 ./ 为例,统计 top5)

【du -k --max-depth=1 ./ |sort -nr|head -n5】

[root@test-001 /]# du -k --max-depth=1 ./ |sort -nr|head -n5
137450839518./
6785876./data
2182577./usr
1830341./home
446856./var
//du -k # 显示目录或文件大小时,以 kB 为单位;
//du --max-depth=1 [目录] # 只显示指定目录下第一层目录(不含单个文件)的大小;
//sort -nr # 以行为单位,根据数字大小从大到小排序;
//head -n5 # 显示内容的开头 5 行,这里显示就是 top5 内容;

3 命令二(最实用,目录和文件一起统计排序)

(这里以当前目录 ./ 为例,统计 top5)

(1)命令详情和说明

【du -sk * ./ | sort -nr | head -n5 | awk -F'\t' '{if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) {printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2} else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) {printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2} else if (1024 * 1024 > $1 && $1 >= 1024) {printf "%.2fM\t\t %s\n", $1/1024, $2} else {printf "%sk\t\t %s\n", $1, $2}}' 】

[root@test-001 /]# du -sk * ./ | sort -nr | head -n5 | awk -F'\t' '{if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) {printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2} else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) {printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2} else if (1024 * 1024 > $1 && $1 >= 1024) {printf "%.2fM\t\t %s\n", $1/1024, $2} else {printf "%sk\t\t %s\n", $1, $2}}'
7.13G data
2.17G usr
1.75G home
447.04M var
408.50M run
//du -sk * # 显示当前目录下每个文件夹和文件的大小以KB为单位(最常用),s表示汇总,k是以KB为统计单位;
//./ #当前目录下
//sort -nr # 以行为单位,根据数字大小从大到小排序;
//awk -F'\t'# 以水平制表符进行分割,后面的程序就是进行换算单位,格式化输出成易懂的形式;

(2)du、head、sort、awk 详细说明参考已有文章附录

(3)Linux 中 printf 命令使用参考

// Linux 中 printf 命令使用参考
// https://www.linuxprobe.com/linux-printf-example.html
'{
    if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) 
    {
        printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2
    } 
    else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) 
    {
        printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2
    } 
    else if (1024 * 1024 > $1 && $1 >= 1024) 
    {
        printf "%.2fM\t\t %s\n", $1/1024, $2
    } 
    else 
    {
        printf "%sk\t\t %s\n", $1, $2
    }
}'

4 如何安全使用 rm 命令删除文件?

(1)rm 命令有哪些坑?

  • rm -rf / # 这个命令绝逼不能操作,删除根目录下的文件,就是系统中的所有文件都要被删除。如果是线上服务机器操作了,那就悲剧了!误操作了怎么办?赶快ctrl+c、ctrl+z 能保住多少是多少吧。

  • rm -rf / home/apps/logs/ # 这也是个天坑命令!目的是删除日志文。结果书写时“多了一个空格”的 bug,看懂了么?这就变成了 [rm -rf /] !

  • 埋藏隐患的日志清理 shell 脚本!脚本关键内容如下。

cd ${log_path}
rm -rf *

目的是:进入到日志目录,然后把日志都删除。隐患:当目录不存在时,悲剧就发生了!

(2)如何安全使用 rm 命令?

  • 在生产环境把 [rm -rf] 命令替换为 [mv],再写个脚本程序定期清理,模拟回收站的功能。

  • 把日志清理 shell 脚本,改用逻辑与 && 进行连接。

cd ${log_path}
rm -rf *

改用逻辑与 && 进行连接,合并成一句,前半句逻辑失败,后半句命令不执行:

```shell

cd ${log_path} && rm -rf *

完整的日志清理 shell 脚本如下:

 ```shell
#!/bin/bash
base_home="/home/apps"
log_path=${base_home}/logs
cd ${log_path} && rm -rf *

5 磁盘使用率报警,却查不到具体的大文件?

(1)问题情景

  • 1 磁盘使用率监控报警,进入机器可以 (df -h) 命令看到磁盘使用率确实超过了报警阀值。

  • 2 使用命令查看大目录,并进入到目录下 【du -sk * ./ | sort -nr | head -n5 | awk -F'\t' '{if(1024 * 1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024 * 1024) {printf "%.2fT\t\t %s\n", $1/(1024 * 1024 * 1024), $2} else if(1024 * 1024 * 1024 > $1 && $1 >= 1024 * 1024) {printf "%.2fG\t\t %s\n", $1/(1024 * 1024), $2} else if (1024 * 1024 > $1 && $1 >= 1024) {printf "%.2fM\t\t %s\n", $1/1024, $2} else {printf "%sk\t\t %s\n", $1, $2}}'

  • 3 依然没找到大文件,该怎么办呢?

(2)排查思路

  • 1 思考:是不是有文件已经被删除了,但进程还在占用该文件,进程未结束,空间未释放?

  • 2 使用「lsof |grep -i deleted」命令查看,能查看到已删除,空间没有释放的文件,包含文件大小,进程和服务名等信息。

Frequently asked Linux interview questions: Find large files and securely clear themlsof(List Open Files) 用于查看进程打开的文件,打开文件的进程,进程打开的端口(TCP、UDP),找回/恢复删除的文件。是十分方便的系统监视工具,因为 lsof 命令需要访问核心内存和各种文件,所以需要root 用户权限执行。

(3) Release the occupied disk space

Restart the service pointed to by the process, and the occupied disk space will be released. Be careful during online production operations. Do not kill the process directly. Evaluate whether there is a restart command for the process service itself, and evaluate whether the service can be restarted.

(4) Remarks Appendix

  • 1 When a file is being used by a process and the user deletes the file, the file will only be removed from the directory removed from the structure but not removed from disk.

  • 2 When the process using this file ends, the file will be truly deleted from the disk and the occupied space will be released. When Linux opens a file, the kernel will create a file for each process in the /proc//proc/{nnnn}/fd/ folder ({nnnn} (for pid)" Create a folder named after its pid to save the relevant information of the process, and its subfolder fd saves the fd (fd: file descriptor) of all the files opened by the process.

  • 3 Ctrl C and Ctrl Z are both interrupt commands. Ctrl C is to forcefully interrupt the execution of the program, and the process has been terminated; Ctrl Z is to suspend the task (meaning to pause), it is still in the process, it just maintains the suspended state.

6 Commonly used safe cleaning commands for large files in production environments

  • Safe cleaning of production environments What are the requirements for large files? It is necessary not only to not affect the normal operation of the service, but also to quickly release the occupied disk space (it is not our purpose to make the files disappear, our purpose is to quickly release the occupied disk space).

  • Do not use "rm -rf xxx.log"; commonly use "echo "" > xxx.log".

  • It is assumed that xxx.log is a large file. For example, this xxx.log has dozens of GB. "echo "" > xxx.log" is used A "" content overwrites the original file content, causing disk space to be released instantly!

7 Summary

  • summarizes the commonly used combination commands for finding large directories and large files (involving du, head , sort, awk and other commands);

  • And how to use the rm command safely;

  • There is also a disk usage alarm, but it cannot be checked How to troubleshoot specific large files;

  • Finally, we also mentioned the commonly used echo command to overwrite the original file to instantly release the disk space occupied.

Related recommendations: "Linux Video Tutorial"

The above is the detailed content of Frequently asked Linux interview questions: Find large files and securely clear them. For more information, please follow other related articles on the PHP Chinese website!

Statement
This article is reproduced at:今日头条. If there is any infringement, please contact admin@php.cn delete
Linux Operations: Utilizing the Maintenance ModeLinux Operations: Utilizing the Maintenance ModeApr 19, 2025 am 12:08 AM

Linux maintenance mode can be entered through the GRUB menu. The specific steps are: 1) Select the kernel in the GRUB menu and press 'e' to edit, 2) Add 'single' or '1' at the end of the 'linux' line, 3) Press Ctrl X to start. Maintenance mode provides a secure environment for tasks such as system repair, password reset and system upgrade.

Linux: How to Enter Recovery Mode (and Maintenance)Linux: How to Enter Recovery Mode (and Maintenance)Apr 18, 2025 am 12:05 AM

The steps to enter Linux recovery mode are: 1. Restart the system and press the specific key to enter the GRUB menu; 2. Select the option with (recoverymode); 3. Select the operation in the recovery mode menu, such as fsck or root. Recovery mode allows you to start the system in single-user mode, perform file system checks and repairs, edit configuration files, and other operations to help solve system problems.

Linux's Essential Components: Explained for BeginnersLinux's Essential Components: Explained for BeginnersApr 17, 2025 am 12:08 AM

The core components of Linux include the kernel, file system, shell and common tools. 1. The kernel manages hardware resources and provides basic services. 2. The file system organizes and stores data. 3. Shell is the interface for users to interact with the system. 4. Common tools help complete daily tasks.

Linux: A Look at Its Fundamental StructureLinux: A Look at Its Fundamental StructureApr 16, 2025 am 12:01 AM

The basic structure of Linux includes the kernel, file system, and shell. 1) Kernel management hardware resources and use uname-r to view the version. 2) The EXT4 file system supports large files and logs and is created using mkfs.ext4. 3) Shell provides command line interaction such as Bash, and lists files using ls-l.

Linux Operations: System Administration and MaintenanceLinux Operations: System Administration and MaintenanceApr 15, 2025 am 12:10 AM

The key steps in Linux system management and maintenance include: 1) Master the basic knowledge, such as file system structure and user management; 2) Carry out system monitoring and resource management, use top, htop and other tools; 3) Use system logs to troubleshoot, use journalctl and other tools; 4) Write automated scripts and task scheduling, use cron tools; 5) implement security management and protection, configure firewalls through iptables; 6) Carry out performance optimization and best practices, adjust kernel parameters and develop good habits.

Understanding Linux's Maintenance Mode: The EssentialsUnderstanding Linux's Maintenance Mode: The EssentialsApr 14, 2025 am 12:04 AM

Linux maintenance mode is entered by adding init=/bin/bash or single parameters at startup. 1. Enter maintenance mode: Edit the GRUB menu and add startup parameters. 2. Remount the file system to read and write mode: mount-oremount,rw/. 3. Repair the file system: Use the fsck command, such as fsck/dev/sda1. 4. Back up the data and operate with caution to avoid data loss.

How Debian improves Hadoop data processing speedHow Debian improves Hadoop data processing speedApr 13, 2025 am 11:54 AM

This article discusses how to improve Hadoop data processing efficiency on Debian systems. Optimization strategies cover hardware upgrades, operating system parameter adjustments, Hadoop configuration modifications, and the use of efficient algorithms and tools. 1. Hardware resource strengthening ensures that all nodes have consistent hardware configurations, especially paying attention to CPU, memory and network equipment performance. Choosing high-performance hardware components is essential to improve overall processing speed. 2. Operating system tunes file descriptors and network connections: Modify the /etc/security/limits.conf file to increase the upper limit of file descriptors and network connections allowed to be opened at the same time by the system. JVM parameter adjustment: Adjust in hadoop-env.sh file

How to learn Debian syslogHow to learn Debian syslogApr 13, 2025 am 11:51 AM

This guide will guide you to learn how to use Syslog in Debian systems. Syslog is a key service in Linux systems for logging system and application log messages. It helps administrators monitor and analyze system activity to quickly identify and resolve problems. 1. Basic knowledge of Syslog The core functions of Syslog include: centrally collecting and managing log messages; supporting multiple log output formats and target locations (such as files or networks); providing real-time log viewing and filtering functions. 2. Install and configure Syslog (using Rsyslog) The Debian system uses Rsyslog by default. You can install it with the following command: sudoaptupdatesud

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Atom editor mac version download

Atom editor mac version download

The most popular open source editor

SublimeText3 Linux new version

SublimeText3 Linux new version

SublimeText3 Linux latest version

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!

SAP NetWeaver Server Adapter for Eclipse

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.