Notes
Linux and macOS usually come with rsync pre-installed.
On Windows, see "How to use rsync on Windows" (如何在windows上使用rsync).
Original Text
This article was converted by SimpRead (简悦); original address: www.ruanyifeng.com
1. Introduction
rsync is a commonly used Linux application for file synchronization.
It can synchronize files between a local computer and a remote computer, or between two local directories (but does not support synchronization between two remote computers). It can also be used as a file copying tool, replacing the cp and mv commands.
The r in its name stands for remote; rsync means “remote sync”. Unlike other file transfer tools such as FTP or scp, rsync compares the files that already exist on the sender and the receiver and transfers only the parts that changed (by default, a file counts as changed when its size or modification time differs).
2. Installation
If rsync is not installed on the local or remote computer, you can install it with the following commands.
# Debian / Ubuntu
$ sudo apt-get install rsync
# Red Hat / CentOS
$ sudo yum install rsync
# Arch Linux
$ sudo pacman -S rsync
Note that rsync must be installed on both sides of the transfer.
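Before installing, it can help to check whether rsync is already present. The following is a minimal sketch using only standard shell features; the variable name rsync_status and the message text are illustrative, not part of rsync.

```shell
# Check whether rsync is on the PATH and, if so, print its version line
if command -v rsync >/dev/null 2>&1; then
  rsync_status="installed"
  rsync --version | head -n 1
else
  rsync_status="missing"
  echo "rsync is not installed; install it with your package manager" >&2
fi
```

For a remote transfer, the same check would need to be run on the remote machine as well.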
3. Basic Usage
3.1 -r Parameter
Used locally, rsync can serve as a replacement for the cp and mv commands by synchronizing a source directory to a destination directory.
$ rsync -r source destination
In the above command, -r stands for recursive, which makes rsync descend into subdirectories. Note that -r is required when the source is a directory: without it (or -a), rsync skips the directory and copies nothing. source is the source directory and destination is the target directory.
If multiple files or directories need to be synchronized, you can write it like this:
$ rsync -r source1 source2 destination
In the above command, both source1 and source2 will be synchronized to the destination directory.
3.2 -a Parameter
The -a parameter can replace -r. In addition to recursive synchronization, it also synchronizes metadata (such as modification time and permissions). Because rsync decides by default whether a file needs updating from its size and modification time, -a is more useful than -r. The following usage is the common form.
$ rsync -a source destination
If the target directory destination does not exist, rsync will create it automatically. After executing the above command, the source directory source is fully copied under the target directory destination, forming the directory structure destination/source.
If you only want to synchronize the contents inside the source directory source to the destination directory destination, you need to add a trailing slash to the source directory.
$ rsync -a source/ destination
After executing the above command, the contents of the source directory are copied into destination, and no source subdirectory is created under destination.
3.3 -n Parameter
If you are unsure what rsync will do after execution, you can first simulate the result with the -n or --dry-run parameter.
$ rsync -anv source/ destination
In the above command, the -n parameter simulates the run without changing anything on disk. The -v parameter prints the result to the terminal so you can see what would be synchronized.
3.4 --delete Parameter
By default, rsync only ensures that all content from the source directory (excluding explicitly excluded files) is copied to the target directory. It does not ensure the two directories are identical and does not delete files. To make the target directory a mirror copy of the source directory, you must use the --delete parameter, which deletes files that exist only in the target directory but not in the source directory.
$ rsync -av --delete source/ destination
In the above command, the --delete parameter makes destination a mirror of source.
4. Excluding Files
4.1 --exclude Parameter
Sometimes, we want to exclude some files or directories when synchronizing. We can specify exclusion patterns using the --exclude parameter.
$ rsync -av --exclude='*.txt' source/ destination
$ rsync -av --exclude '*.txt' source/ destination
The above commands exclude all TXT files.
Note that rsync synchronizes hidden files (those whose names start with a dot). To exclude them, write --exclude=".*".
If you want to exclude all files inside a directory but keep the directory itself, write it like this:
$ rsync -av --exclude 'dir1/*' source/ destination
Multiple exclusion patterns can be specified using multiple --exclude parameters.
$ rsync -av --exclude 'file1.txt' --exclude 'dir1/*' source/ destination
Multiple exclusion patterns can also be written using Bash brace expansion with a single --exclude parameter.
$ rsync -av --exclude={'file1.txt','dir1/*'} source/ destination
If there are many exclusion patterns, you can write them into a file, one pattern per line, and specify it with the --exclude-from parameter.
$ rsync -av --exclude-from='exclude-file.txt' source/ destination
4.2 --include Parameter
The --include parameter is used to specify patterns of files that must be synchronized, often used together with --exclude.
$ rsync -av --include="*.txt" --exclude='*' source/ destination
The above command excludes everything except TXT files. rsync checks the patterns in order and applies the first one that matches, so --include must come before --exclude='*'.
5. Remote Synchronization
5.1 SSH Protocol
rsync supports not only synchronization between two local directories but also remote synchronization. It can synchronize local content to a remote server.
$ rsync -av source/ username@remote_host:destination
You can also synchronize remote content to the local machine.
$ rsync -av username@remote_host:source/ destination
rsync uses SSH by default for remote login and data transfer.
Early versions of rsync did not use SSH by default and needed the -e parameter to select it; the default later changed. Therefore, the -e ssh in the following command can be omitted.
$ rsync -av -e ssh source/ user@remote_host:/destination
However, if the ssh command has additional parameters, you must use the -e parameter to specify the SSH command to be executed.
$ rsync -av -e 'ssh -p 2234' source/ user@remote_host:/destination
In the above command, the -e parameter specifies SSH to use port 2234.
5.2 rsync Protocol
Besides using SSH, if the other server has the rsync daemon installed and running, you can use the rsync:// protocol (default port 873) for transfer. The specific syntax uses double colons :: between the server and the target directory.
$ rsync -av source/ 192.168.122.32::module/destination
Note that the module in the address above is not an actual path name but a resource name specified by the rsync daemon, assigned by the administrator.
If you want to see all modules assigned by the rsync daemon, you can run:
$ rsync rsync://192.168.122.32
Besides using double colons, the rsync protocol can also specify addresses directly with the rsync:// protocol.
$ rsync -av source/ rsync://192.168.122.32/module/destination
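To make the idea of a module more concrete, here is a sketch of what the administrator's daemon configuration might look like. The file location, the path /srv/rsync/module and the comment text are hypothetical; path, comment and read only are standard rsyncd.conf parameters.

```shell
# Write a hypothetical rsyncd.conf to a temporary file for inspection
conf=$(mktemp)
cat > "$conf" <<'EOF'
# The section name is the module name clients use after :: or rsync://host/
[module]
    path = /srv/rsync/module
    comment = Example module
    read only = false
EOF

cat "$conf"

# An administrator would start the daemon with something like:
#   rsync --daemon --config="$conf"
```

With such a configuration, a client path of module/destination would map to /srv/rsync/module/destination on the server.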
6. Incremental Backup
The biggest feature of rsync is that it can perform incremental backups, meaning it only copies changed files by default.
In addition to direct comparison between the source and target directories, rsync supports using a reference directory. It synchronizes changes between the source and the reference directory to the target directory.
The method works as follows. The first synchronization is a full backup that copies all files into the reference directory. Every later synchronization is incremental: only the differences between the source and the reference directory are copied into a new target directory. The new target directory still appears to contain every file, but only changed files take up new space; unchanged files are hard links to the files in the reference directory.
The --link-dest parameter specifies the reference directory during synchronization.
$ rsync -a --delete --link-dest /compare/path /source/path /target/path
In the above command, the --link-dest parameter specifies the reference directory /compare/path. The source directory /source/path is compared with the reference directory to find changed files, which are copied to the target directory /target/path. Unchanged files are hard-linked instead of copied. The first backup is a full backup; subsequent backups are incremental.
Below is a script example backing up the user’s home directory.
#!/bin/bash

# A script to perform incremental backups using rsync

set -o errexit
set -o nounset
set -o pipefail

readonly SOURCE_DIR="${HOME}"
readonly BACKUP_DIR="/mnt/data/backups"
readonly DATETIME="$(date '+%Y-%m-%d_%H:%M:%S')"
readonly BACKUP_PATH="${BACKUP_DIR}/${DATETIME}"
readonly LATEST_LINK="${BACKUP_DIR}/latest"

mkdir -p "${BACKUP_DIR}"

# Copy changed files; hard-link unchanged ones against the previous backup
rsync -av --delete \
  "${SOURCE_DIR}/" \
  --link-dest "${LATEST_LINK}" \
  --exclude=".cache" \
  "${BACKUP_PATH}"

# Point "latest" at the backup that was just created
rm -rf "${LATEST_LINK}"
ln -s "${BACKUP_PATH}" "${LATEST_LINK}"
In the above script, each synchronization generates a new directory ${BACKUP_DIR}/${DATETIME} and creates a symbolic link ${BACKUP_DIR}/latest pointing to this directory. Next time, ${BACKUP_DIR}/latest is used as the reference directory to generate the new backup directory. Finally, the symbolic link ${BACKUP_DIR}/latest is updated to point to the new backup directory.
7. Configuration Options
-a, --archive mode saves all metadata, such as modification time, permissions, ownership, etc., and symbolic links are also synchronized.
--append continues file transfer from where it left off in the last interrupted transfer.
--append-verify is similar to --append but verifies the transferred file. If verification fails, the entire file is retransmitted.
-b, --backup backs up files in the target directory before they are deleted or overwritten, by renaming them. Without this parameter, such files are simply deleted or overwritten. The renamed file gets the suffix specified by --suffix (default is ~).
--backup-dir specifies the directory to store backups, e.g., --backup-dir=/path/to/backups.
--bwlimit limits bandwidth, default unit is KB/s, e.g., --bwlimit=100.
-c, --checksum changes rsync’s verification method. By default, rsync checks only file size and modification date; if different, it retransmits. With this parameter, checksum of file content is used to decide retransmission.
--delete deletes files only present in the target directory but absent in the source, ensuring the target is a mirror of the source.
-e specifies the remote shell command used for data transfer, e.g., -e 'ssh -p 2234' to use SSH on a non-default port.
--exclude specifies files not to synchronize, e.g., --exclude="*.iso".
--exclude-from specifies a local file containing exclude patterns, one per line.
--existing, --ignore-non-existing skip files and directories that do not already exist in the target directory; only items already present there are updated.
-h, --human-readable outputs numbers in a human-readable format.
--help displays help information (-h does the same only when it is the sole option).
-i, --itemize-changes outputs a per-item summary of the changes being made.
--ignore-existing skips files that already exist in the target directory and does not synchronize them.
--include specifies files to include in synchronization, usually combined with --exclude.
--link-dest specifies the reference directory for incremental backups.
-m skips syncing empty directories.
--max-size sets the maximum file size to transfer, e.g., no more than 200KB (--max-size='200k').
--min-size sets the minimum file size to transfer, e.g., at least 10KB (--min-size=10k).
-n, --dry-run simulates operations without really executing. Combined with -v, you can see what will be synchronized.
-P combines the --progress and --partial parameters.
--partial allows resuming interrupted transfers. Without this parameter, rsync deletes partially transferred files on interruption; with it, partial files are kept and synced to the target directory to resume later, usually used with --append or --append-verify.
--partial-dir specifies a temporary directory to save partially transferred files, e.g., --partial-dir=.rsync-partial. Usually used with --append or --append-verify.
--progress shows the transfer progress.
-r means recursive, including subdirectories.
--remove-source-files deletes files from the sender after successful transfer.
--size-only only synchronizes files with different sizes, ignoring modification time differences.
--suffix specifies the suffix added to backup file names, default is ~.
-u, --update skips files whose copy in the target directory has a newer modification time than the source file.
-v outputs details. -vv outputs more details, -vvv outputs the most detailed information.
--version displays the rsync version.
-z compresses data during synchronization.
