Advanced usage of the Linux tar command: backing up data
Linux ships with the powerful tar command. tar was originally designed for tape backups (tape archives): it writes files and directories to tape and extracts or restores them from tape. Today tar can back up data to any storage medium. It works at the file level, so it does not depend on the type of the underlying file system, and it supports incremental backups.
1. Some common options
●-z, --gzip: Compress or decompress through gzip; the archive suffix is usually .gz
●-c, --create: Create an archive; the suffix is usually .tar
●-f, --file=: Followed immediately by the name of the archive file to create or read
●-x, --extract: Extract files from an archive; the counterpart of -c
●-p, --preserve-permissions: Preserve the original permissions and attributes of the backed-up data
●-g, --listed-incremental=: Followed by the snapshot file used for incremental backups
●-C, --directory=: Change to the given directory, for example to choose where an archive is extracted
●--exclude=: Exclude directories or files that match a pattern; shell-style wildcards are supported
Other options:
●-X, --exclude-from=: Read the patterns to exclude from a file (useful when there are many --exclude= options)
●-t, --list: List the contents of an archive; cannot be combined with -c or -x
●-j, --bzip2: Compress or decompress through bzip2; the suffix is usually .bz2
●-P, --absolute-names: Keep absolute paths; extraction then also writes to those absolute paths
●-v, --verbose: Show each file as it is processed; commonly used, but not recommended for large backups
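The options above can be tried safely in a throwaway directory. The following is a minimal sketch, not part of the original article: the directory layout and file names under the mktemp directory are made up for illustration.

```shell
#!/bin/sh
# Sketch only: everything lives under a temporary directory (hypothetical layout).
set -eu
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/src/cache" "$work/out"
echo "hello" > "$work/src/docs/readme.txt"
echo "scratch" > "$work/src/cache/tmp.dat"

# -c create, -z gzip, -p keep permissions, -f archive name.
# --exclude skips the cache directory; -C switches into the source directory first.
tar -zcpf "$work/src.tar.gz" --exclude=./cache -C "$work/src" .

# -t lists the archive contents without extracting anything.
listing=$(tar -ztf "$work/src.tar.gz")
echo "$listing"

# -x extracts; here -C chooses the target directory.
tar -zxpf "$work/src.tar.gz" -C "$work/out"
```

Listing with -t before extracting is a cheap way to confirm that an --exclude pattern matched what you intended.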
2. Incremental backup (website) data
Many systems (applications or websites) generate static files every day. If the more important static files need regular backups, tar can package and compress them to a designated location. When the total volume of files is large and keeps growing, the -g option lets you make incremental backups instead.
It is best to use relative paths for the backup, that is, change into the root of the directory you want to back up before running tar.
A concrete example follows.
“
Back up all files in the current directory
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/data01.tar.gz .
Extract in the directory where the data should be restored
# tar -zxpf /tmp/data01.tar.gz -C .
”
The -g option can be understood as taking a snapshot of the directory during the backup, recording information such as permissions and attributes. If /tmp/snapshot_data.snap does not exist at the first backup, tar creates it and makes a full backup. After files in the directory have been modified, run the first backup command again (remember to change the archive file name); tar then uses the snapshot file given with -g to back up only the modified files, including their permissions and attributes. Files that have not changed are not backed up again.
Also note that the restore above is a "preserving restore": files with the same name are overwritten, while files that already exist in the target directory but not in the archive are kept. To restore the files exactly as they were backed up, clear the target directory first. If there are incremental archives, extract each of them the same way, in the order they were created.
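The full-then-incremental cycle described above can be sketched on a small scale as follows. This assumes GNU tar (for the -g option); the file names and the temporary directory are hypothetical, and the sleep calls only keep file timestamps clearly apart from the backup times.

```shell
#!/bin/sh
# Sketch only, assuming GNU tar; all paths are under a temporary directory.
set -eu
work=$(mktemp -d)
mkdir -p "$work/data" "$work/restore"
echo "one" > "$work/data/a.txt"
echo "two" > "$work/data/b.txt"
sleep 1

# First run: the snapshot file does not exist yet, so this is a full backup.
tar -g "$work/data.snap" -zcpf "$work/data01.tar.gz" -C "$work/data" .

sleep 1
echo "changed" > "$work/data/a.txt"   # modify an existing file
echo "three" > "$work/data/c.txt"     # add a new file

# Second run: same snapshot file, new archive name; only the changes are stored.
tar -g "$work/data.snap" -zcpf "$work/data02.tar.gz" -C "$work/data" .
incremental=$(tar -ztf "$work/data02.tar.gz")
echo "$incremental"

# Restore in order: the full backup first, then each incremental archive.
tar -zxpf "$work/data01.tar.gz" -C "$work/restore"
tar -zxpf "$work/data02.tar.gz" -C "$work/restore"
```

The listing of the second archive should contain a.txt and c.txt but not the untouched b.txt. GNU tar also accepts -g /dev/null when extracting, which additionally removes files that were deleted between backups; the plain extraction used here keeps them.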
The following demonstrates a more comprehensive example, requiring:
●Back up the /tmp/data directory, but exclude the cache directory and temporary files
●Because the directory is relatively large (>4G), the backup files are divided into parts during full backup (for example, each backup file can be up to 1G)
●Preserve all file permissions and attributes, such as user groups and read and write permissions
“
# cd /tmp/data
Make a full backup
# rm -f /tmp/snapshot_data.snap
# tar -g /tmp/snapshot_data.snap -zcpf - --exclude=./cache ./ | split -b 1024M - /tmp/bak_data$(date -I).tar.gz_
split appends aa, ab, ac, ... to the file name, so the final backup archives are saved as
bak_data2014-12-07.tar.gz_aa
bak_data2014-12-07.tar.gz_ab
bak_data2014-12-07.tar.gz_ac
…
Incremental backup
The command can be the same as for the full backup. Note, however, that backing up several times a day may produce duplicate file names, because split always starts naming from aa, ab, and that would invalidate the backup. If the volume of files created or modified in a day is not particularly large, it is recommended not to split the incremental part. (If it must be split, put a more detailed time in the file name, such as $(date +%Y-%m-%d_%H).)
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-07.tar.gz --exclude=./cache ./
Incremental backup on the second day
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-08.tar.gz --exclude=./cache ./
”
Recovery process
“
Restore the full backup archives
You can choose whether to clear the /tmp/data/ directory first
# cat /tmp/bak_data2014-12-07.tar.gz_* | tar -zxpf - -C /tmp/data/
Restore the incremental backup archives
$ tar -zxpf /tmp/bak_data2014-12-07.tar.gz -C /tmp/data/
$ tar -zxpf /tmp/bak_data2014-12-08.tar.gz -C /tmp/data/
…
Be sure to restore in chronological order. With a file naming scheme like the one above, the names sort chronologically, so a wildcard can also be used to process the archives in order.
”
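To see the split-and-reassemble round trip without needing gigabytes of data, the same pipeline can be run with a tiny chunk size. This is a sketch under assumptions: the 64k chunk size, the random test file, and the temporary paths are chosen only for the demonstration.

```shell
#!/bin/sh
# Sketch only: a scaled-down version of the tar | split pipeline.
set -eu
work=$(mktemp -d)
mkdir -p "$work/data/cache" "$work/restore"
head -c 200000 /dev/urandom > "$work/data/big.bin"   # random data barely compresses
echo "temp" > "$work/data/cache/junk"

# Write the archive to stdout (-f -) and cut it into 64 KiB pieces.
tar -zcpf - --exclude=./cache -C "$work/data" . | split -b 64k - "$work/bak_data.tar.gz_"
ls "$work"/bak_data.tar.gz_*
parts=$(ls "$work"/bak_data.tar.gz_* | wc -l)

# The shell expands the wildcard in sorted order (aa, ab, ...), so cat rebuilds
# the original stream, which tar reads from stdin.
cat "$work"/bak_data.tar.gz_* | tar -zxpf - -C "$work/restore"
cmp "$work/data/big.bin" "$work/restore/big.bin"
```

Because the pieces are only meaningful when concatenated in order, keeping them together under one prefix, as the file naming above does, avoids mixing parts from different runs.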
If regular backup is required, such as full backup once a week and incremental backup once a day, it can be implemented in combination with crontab.
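One possible shape for such a schedule is a small wrapper script called once a day from cron. The sketch below is hypothetical throughout: the script location in the comment, the choice of Sunday for the full backup, and the archive naming are assumptions, and the script only prints the commands it would run rather than executing them.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/tar_backup.sh and run daily
# by a crontab entry such as:
#   30 2 * * * /usr/local/bin/tar_backup.sh
set -eu

SNAPSHOT=/tmp/snapshot_data.snap

# Print "full" for ISO weekday 7 (Sunday) and "incremental" for any other day.
backup_kind() {
    if [ "$1" -eq 7 ]; then echo full; else echo incremental; fi
}

# Dry run: print the commands for one backup instead of executing them.
# $1 is the ISO weekday (1..7), $2 is the date used in the archive name.
plan_backup() {
    kind=$(backup_kind "$1")
    archive="/tmp/bak_data${2}_${kind}.tar.gz"
    if [ "$kind" = full ]; then
        # Removing the snapshot file forces tar to start a new full backup.
        echo "rm -f $SNAPSHOT"
    fi
    echo "cd /tmp/data && tar -g $SNAPSHOT -zcpf $archive --exclude=./cache ./"
}

plan_backup "$(date +%u)" "$(date +%Y-%m-%d)"
```

Replacing the echo calls with the commands themselves would turn the dry run into a real backup; keeping the previous week's snapshot and archives until the new full backup succeeds is a sensible precaution.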
3. Back up file system
There are many ways to back up a file system, such as cpio, rsync, dump, and tar. Here is an example of backing up an entire Linux system with tar. The backup and recovery process is similar to the one above.
First, some directories on Linux (CentOS here) do not need to be backed up, such as /proc, /lost+found, /sys, /mnt, /media, /dev, and /tmp. When backing up to a tape device such as /dev/st0, you do not need to worry about much more than that. Because I am backing up to the local /backup directory, that directory must be excluded as well, along with any directories mounted from NFS or other network storage.
“
Create the exclusion list file
# vi /backup/backup_tar_exclude.list
/backup
/proc
/lost+found
/sys
/mnt
/media
/dev
/tmp
$ tar -zcpf /backup/backup_full.tar.gz -g /backup/tar_snapshot.snap --exclude-from=/backup/backup_tar_exclude.list /
”
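Before pointing this at a real root file system, the exclusion list can be rehearsed on a miniature directory tree. The sketch below assumes GNU tar; the fake root layout and file names are invented, and the patterns are written relative to the archived directory because the fake root is archived as ".".

```shell
#!/bin/sh
# Sketch only, assuming GNU tar: a fake "root" tree stands in for the real /.
set -eu
work=$(mktemp -d)
mkdir -p "$work/root/etc" "$work/root/proc" "$work/root/tmp" "$work/root/backup"
echo "config" > "$work/root/etc/app.conf"
echo "noise" > "$work/root/proc/stat"
echo "scratch" > "$work/root/tmp/x"

# One pattern per line; the backup target directory itself is on the list.
cat > "$work/exclude.list" <<'EOF'
./backup
./proc
./tmp
EOF

tar -zcpf "$work/root/backup/backup_full.tar.gz" -g "$work/tar_snapshot.snap" \
    --exclude-from="$work/exclude.list" -C "$work/root" .

# Check what actually went into the archive.
listing=$(tar -ztf "$work/root/backup/backup_full.tar.gz")
echo "$listing"
```

Listing the archive afterwards shows whether every excluded directory is really absent, which is easier to verify here than after a full system backup.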
4. Notes
Whether you use tar to back up data or a whole file system, consider whether you will restore onto the original system or onto a new one.
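●tar backups depend heavily on file timestamps: GNU tar's incremental mode decides what has changed from each file's modification and status-change times (mtime and ctime) rather than atime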
●File ownership is tied to the user ID. For cross-machine recovery, make sure the same user has the same user ID on both machines
●Avoid running other processes during backup and recovery; they may cause data inconsistency
●Soft and hard links are restored normally
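For the user ID point, it can help to look at the numeric owner recorded in an archive before restoring it on another machine. The sketch below uses the --numeric-owner option of GNU tar, which is not mentioned in the original article; the file name and temporary directory are hypothetical.

```shell
#!/bin/sh
# Sketch only: show the numeric uid/gid stored for each archive member.
set -eu
work=$(mktemp -d)
echo "data" > "$work/file.txt"
tar -cpf "$work/owner.tar" -C "$work" file.txt

# With --numeric-owner the verbose listing prints uid/gid numbers
# instead of user and group names.
numeric=$(tar --numeric-owner -tvf "$work/owner.tar")
echo "$numeric"
```

Comparing these numbers with the output of id on the target machine shows whether the same user will own the restored files there.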