Advanced usage of Linux tar command - backup data
Linux ships with a powerful archiving command, tar. It was originally designed for tape backups (the name comes from "tape archive"): it can write files and directories to tape and extract or restore them from tape. Today tar can back up data to any storage medium. It works at the file level, so it does not depend on the type of the underlying file system, and it supports incremental backups.
1. Some common options
●-z, --gzip: Compress or decompress through gzip; the archive suffix is usually .gz
●-c, --create: Create (pack) an archive; the suffix is usually .tar
●-f, --file=: Followed immediately by the name of the archive file to create or read
●-x, --extract: Unpack an archive; the counterpart of -c
●-p: Preserve the original permissions and attributes of the backed-up data
●-g: Followed by the snapshot file used for incremental backups
●-C: Specify the directory to extract into
●--exclude: Exclude directories or files from the archive; supports wildcard (glob) patterns
Other options
●-X, --exclude-from: Read the directories or files to exclude from a list file (useful when there are many --exclude= entries)
●-t, --list: List the files in a backup archive; cannot be combined with -c or -x
●-j, --bzip2: Compress or decompress through bzip2; the suffix is usually .bz2
●-P: Keep absolute paths; on extraction the files are then also written to their absolute paths
●-v: Print each file as it is processed; commonly used, but not recommended for very large backups
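As a quick illustration of how these options combine, the sketch below packs a scratch directory, lists the archive, and unpacks it somewhere else. The directory and file names (site, cache, demo.tar.gz) are made up for the example, and GNU tar is assumed.

```shell
#!/bin/sh
# Scratch area so nothing outside the demo is touched (all names are illustrative).
work=$(mktemp -d)
mkdir -p "$work/site/cache" "$work/restore"
echo "hello" > "$work/site/index.html"
echo "temp"  > "$work/site/cache/page.tmp"

# -c create, -z gzip, -p keep permissions, -f archive name;
# --exclude skips the cache directory and is given before the path to archive.
( cd "$work/site" && tar -zcpf "$work/demo.tar.gz" --exclude=./cache . )

# -t lists the archive contents without extracting anything.
listing=$(tar -ztf "$work/demo.tar.gz")
echo "$listing"

# -x extract, -C chooses the directory to unpack into.
tar -zxpf "$work/demo.tar.gz" -C "$work/restore"
ls "$work/restore"
```

The listing should show index.html but nothing under cache, and the restore directory should contain only what the archive holds.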
2. Incremental backup (website) data
Many systems (applications or websites) generate static files every day. If the more important ones need regular backups, tar can pack and compress them to a designated location. When the total size keeps growing, the -g option can be used to make incremental backups instead.
It is best to refer to the backup directory by a relative path, that is, change into the root of the directory to be backed up first.
A concrete example follows.
“
Back up all files in the current directory
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/data01.tar.gz .
Extract and restore in the directory to be restored
# tar -zxpf /tmp/data01.tar.gz -C .
”
The -g option can be understood as taking a snapshot of the directory's files at backup time, recording information such as permissions and attributes. If /tmp/snapshot_data.snap does not exist at the first backup, tar creates it and makes a full backup. After files in the directory have been modified, run the first backup command again (remember to change the archive file name); the modified files, including their permissions and attributes, are then backed up incrementally based on the snapshot file given by -g. Files that have not been touched are not backed up again.
Also note that the recovery above is a "preserving" restore: files with the same name are overwritten, while files that already exist in the target directory but are not in the backup archive are kept. To restore the files exactly as they were backed up, clear the target directory first. If there are incremental archives, extract each of them in the same way, in the order in which they were created.
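The full-then-incremental cycle described above can be tried safely in a scratch directory. This is a minimal sketch assuming GNU tar; the file names (a.txt, b.txt, full.tar.gz, incr1.tar.gz) are invented for the demonstration.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/data" "$work/restore"
echo "v1" > "$work/data/a.txt"
echo "v1" > "$work/data/b.txt"
cd "$work/data" || exit 1

# First run: the snapshot file does not exist yet, so this is a full backup.
tar -g "$work/data.snap" -zcpf "$work/full.tar.gz" .

# Change one file, then run the same command with a new archive name.
sleep 1   # make sure the new timestamp is later than the first run
echo "v2" > a.txt
tar -g "$work/data.snap" -zcpf "$work/incr1.tar.gz" .

# The incremental archive lists a.txt but not the untouched b.txt.
incr_list=$(tar -ztf "$work/incr1.tar.gz")
echo "$incr_list"

# Restore: full archive first, then the incrementals in the order they were made.
tar -zxpf "$work/full.tar.gz"  -C "$work/restore"
tar -zxpf "$work/incr1.tar.gz" -C "$work/restore"
cat "$work/restore/a.txt"
cat "$work/restore/b.txt"
```

GNU tar also accepts -g /dev/null when extracting, in which case it additionally removes files that were deleted between backups. That goes beyond the "preserving" restore used in this article, so try it on scratch data first.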
The following, more complete example has these requirements:
●Back up the /tmp/data directory, but exclude the cache directory and temporary files
●Because the directory is fairly large (>4G), split the archive into parts during the full backup (for example, at most 1G per part)
●Preserve all file permissions and attributes, such as owner, group, and read/write permissions
“
# cd /tmp/data
Make a full backup
# rm -f /tmp/snapshot_data.snap
# tar -g /tmp/snapshot_data.snap -zcpf - --exclude=./cache ./ | split -b 1024M - /tmp/bak_data$(date -I).tar.gz_
split appends aa, ab, ac, ... to the file name, so the final backup archives are saved as
bak_data2014-12-07.tar.gz_aa
bak_data2014-12-07.tar.gz_ab
bak_data2014-12-07.tar.gz_ac
…
Incremental backup
The same command as the full backup can be used, but note that backing up several times a day may produce duplicate file names and break the backup, because split always starts naming from aa, ab.
If the volume of files created or modified in a day is not particularly large, it is better not to split the incremental part. (If it must be split, add a finer-grained time to the file name, such as $(date +%Y-%m-%d_%H).)
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-07.tar.gz --exclude=./cache ./
Incremental backup on the second day
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-08.tar.gz --exclude=./cache ./
”
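To see the split-and-reassemble step in isolation, the sketch below runs the same pipeline at a much smaller scale: roughly 300 KB of random data cut into 100 KB parts. The sizes and names are arbitrary stand-ins for the 1024M parts used above.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/data" "$work/restore"
# About 300 KB of incompressible data, so the archive spans several 100 KB parts.
head -c 300000 /dev/urandom > "$work/data/blob.bin"

# tar writes the archive to stdout (-f -); split cuts the stream into parts.
( cd "$work/data" && tar -zcpf - . ) | split -b 100k - "$work/bak_data.tar.gz_"
ls "$work"/bak_data.tar.gz_*

# The glob expands in sorted order (aa, ab, ac, ...), so cat rebuilds the stream.
cat "$work"/bak_data.tar.gz_* | tar -zxpf - -C "$work/restore"
cmp "$work/data/blob.bin" "$work/restore/blob.bin" && echo "restored copy is identical"
```

Because the parts are only meaningful when concatenated in order, keep all of them together; a missing or reordered part makes the whole archive unreadable.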
Recovery process
“
Restore full backup archive files
You can choose whether to clear the /tmp/data/ directory first
# cat /tmp/bak_data2014-12-07.tar.gz_* | tar -zxpf - -C /tmp/data/
Restore incremental backup archive files
$ tar -zxpf /tmp/bak_data2014-12-07.tar.gz -C /tmp/data/
$ tar -zxpf /tmp/bak_data2014-12-08.tar.gz -C /tmp/data/
…
Be sure to restore in chronological order. With dated file names like the ones above, a shell wildcard can also be used to pick up the archives.
”
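One way to apply the archives in chronological order is a loop over a wildcard, since file names with ISO dates sort in date order. The sketch below builds two stand-in archives in a scratch directory and restores them in sequence; the paths are hypothetical and plain (non-incremental) archives are used to keep it short.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/src" "$work/restore"

# Two stand-in archives with ISO dates in their names.
echo "day 1" > "$work/src/note.txt"
tar -zcpf "$work/bak_data2014-12-07.tar.gz" -C "$work/src" .
echo "day 2" > "$work/src/note.txt"
tar -zcpf "$work/bak_data2014-12-08.tar.gz" -C "$work/src" .

# The shell expands the glob in sorted order, which is also date order here.
for archive in "$work"/bak_data*.tar.gz; do
    echo "restoring $archive"
    tar -zxpf "$archive" -C "$work/restore" || exit 1
done

# The later archive wins, so this prints the second day's content.
cat "$work/restore/note.txt"
```

tar reads one archive per invocation, which is why the wildcard goes into a loop rather than directly after -f.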
If regular backups are required, such as a full backup once a week and an incremental backup once a day, this can be automated with crontab.
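A weekly-full, daily-incremental schedule could be sketched as a small script called from cron. Everything named here is an assumption for illustration: the SRC_DIR and BAK_DIR variables, the script path in the crontab line, and the choice of Sunday for the full backup. By default it runs against empty scratch directories so it is safe to try.

```shell
#!/bin/sh
# Hypothetical locations; override them in the environment for real use.
SRC_DIR=${SRC_DIR:-$(mktemp -d)}
BAK_DIR=${BAK_DIR:-$(mktemp -d)}
SNAP="$BAK_DIR/snapshot_data.snap"

# Print "full" on Sunday (date +%u gives 7) and "incr" on any other day.
backup_mode() {
    if [ "$1" -eq 7 ]; then echo full; else echo incr; fi
}

run_backup() {
    mode=$(backup_mode "$1")
    # Removing the snapshot file makes tar start over with a full backup.
    if [ "$mode" = full ]; then rm -f "$SNAP"; fi
    ( cd "$SRC_DIR" && tar -g "$SNAP" -zcpf \
        "$BAK_DIR/bak_data_$(date +%Y-%m-%d)_$mode.tar.gz" --exclude=./cache . )
}

run_backup "$(date +%u)"

# Example crontab entry, daily at 02:00 (the script path is an assumption):
# 0 2 * * * /usr/local/bin/tar_backup.sh
```

If a date format is ever written directly in a crontab line, each % has to be escaped as \%, because cron treats an unescaped % as a line break; keeping the date call inside the script avoids that.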
3. Back up file system
There are many ways to back up a file system, such as cpio, rsync, dump, and tar. Here is an example of backing up an entire Linux system with tar. The backup and recovery process is similar to the one above.
First, some directories on Linux (CentOS here) do not need to be backed up, such as /proc, /lost+found, /sys, /mnt, /media, /dev, and /tmp. If you back up to tape (/dev/st0) you do not need to worry about much else; but because I am backing up to the local /backup directory, that directory must be excluded as well, along with any directories mounted from NFS or other network storage.
“
Create exclusion list file
# vi /backup/backup_tar_exclude.list
/backup
/proc
/lost+found
/sys
/mnt
/media
/dev
/tmp
Run the full backup, reading the exclusion list created above
$ tar -zcpf /backup/backup_full.tar.gz -g /backup/tar_snapshot.snap --exclude-from=/backup/backup_tar_exclude.list /
”
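Before running this against a real root file system, the exclusion list can be tested on a miniature directory tree. The sketch below writes the list with a here-document instead of vi and checks the result with -t; the tree layout and file names are invented, and relative patterns (./proc) are used because the demo archives "." rather than "/".

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/root/etc" "$work/root/proc" "$work/root/tmp"
echo "conf" > "$work/root/etc/app.conf"
echo "x"    > "$work/root/proc/stat"
echo "y"    > "$work/root/tmp/junk"

# One pattern per line, written non-interactively.
cat > "$work/exclude.list" <<'EOF'
./proc
./tmp
EOF

( cd "$work/root" && tar -zcpf "$work/full.tar.gz" -g "$work/tar.snap" \
    --exclude-from="$work/exclude.list" . )

# Listing the archive shows whether the exclusions took effect.
contents=$(tar -ztf "$work/full.tar.gz")
echo "$contents"
```

Checking the listing first is cheap compared with discovering after a multi-gigabyte backup that /proc or the backup directory itself was swept in.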
4. Points to note
Whether you are using tar to back up data or file systems, you need to consider whether to restore on the original system or another new system.
●Incremental tar backups depend heavily on file timestamps: GNU tar decides what has changed from each file's modification and status-change times (mtime and ctime) rather than from atime
●Ownership is restored from the user and group recorded in the archive. For cross-machine recovery, make sure the same users exist on the target system with the same user IDs
●Try not to run other processes during backup and recovery, because they may modify files and cause data inconsistency
●Symbolic links and hard links are restored normally
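The points about links and ownership can be checked on scratch data. This sketch round-trips one hard link and one symbolic link through an archive, then lists the archive with --numeric-owner to show the stored numeric IDs; the file names are illustrative and GNU tar is assumed.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/data" "$work/restore"
echo "payload" > "$work/data/file.txt"
ln    "$work/data/file.txt" "$work/data/hard.txt"   # hard link
ln -s file.txt "$work/data/soft.txt"                # symbolic link

tar -zcpf "$work/links.tar.gz" -C "$work/data" .
tar -zxpf "$work/links.tar.gz" -C "$work/restore"

# Equal inode numbers mean the hard link survived the round trip.
inode_a=$(ls -i "$work/restore/file.txt" | awk '{print $1}')
inode_b=$(ls -i "$work/restore/hard.txt" | awk '{print $1}')
echo "$inode_a $inode_b"

# The symbolic link still points at its original target.
link_target=$(readlink "$work/restore/soft.txt")
echo "$link_target"

# Show the numeric user and group IDs stored in the archive instead of names.
tar --numeric-owner -ztvf "$work/links.tar.gz"
```

Comparing the numeric IDs in that listing with the accounts on the target machine is a quick way to spot ownership mismatches before a cross-machine restore.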