Rsync Directory Synchronization — A Practical Guide for Backups and Mirroring

rsync is the standard Linux tool for moving directories between machines without re-sending what's already on the other end. Its delta-transfer algorithm ships only the bytes that changed since the last run, which is why a daily backup of a 200 GB tree with few changes can finish in minutes instead of hours. Hand-typing the command works for one server, but the moment you juggle non-standard SSH ports, exclude patterns, and the difference between a copy and a mirror, a generator earns its keep. This guide covers the rsync command structure, a working SSH-based workflow, the difference between an incremental copy and a --delete mirror, and the common mistakes that quietly destroy backups before anyone notices.

The Rsync generator — what it does and who it's for

The Rsync directory generator on vps.pyrek.com.pl builds the full rsync command from a short web form. You fill in the server address, source and destination paths, SSH port, username, and pick whether the local end is the source or destination — the generator emits a ready-to-paste command and a small shell script (rsync-source.sh or rsync-destination.sh) you can drop into a backup directory.

It's aimed at three concrete profiles. First, the developer pulling a production website down to a laptop for local debugging — the screenshot above shows exactly that, syncing /var/www/own.pyrek.com.pl/ from a remote VPS into ~/AntigravityProjects/own.pyrek.com.pl/ on a Mac. Second, the sysadmin who wants a daily backup of /etc, /var/www, or a database dump directory pushed off-site over SSH. Third, anyone who has typed rsync -avz -e "ssh -p ..." enough times to know they always forget the trailing slash on the source path.

What the generator saves you is muscle memory and the man page. The -e "ssh -p PORT" syntax is finicky — the quotes matter, the order matters, and a single misplaced colon turns a remote sync into a local file copy with a colon in the filename.

Hands-on — rsync over SSH, step by step

The goal in this section is a working command that synchronizes a directory between a local machine and a remote server, runs over SSH on a non-standard port, and shows progress while it works. We'll start with a one-shot pull, then walk through each flag, then turn it into a script you can run from cron.

Step 1 — Verify rsync is installed on both ends

rsync ships with most Linux distributions, but the remote server needs it too — not just the local one. The remote rsync binary is what answers the local rsync's wire protocol; without it, you get rsync: command not found over SSH and the transfer never starts.

# Check version on the local machine
rsync --version

# Check it on the remote server
ssh -p 5840 root@198.51.100.42 'rsync --version'

On Debian or Ubuntu, install with apt install rsync; on RHEL or Rocky, dnf install rsync. macOS has long shipped the very old rsync 2.6.9 by default (recent releases ship openrsync instead, which reports itself as 2.6.9-compatible). For serious use, install a current one via Homebrew (brew install rsync), which gives you 3.x with incremental recursion, lower memory use on large trees, and proper --info=progress2 support.
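A version check along these lines can pick the right progress flag automatically. It is only a sketch: the helper names parse_rsync_version, version_ge, and progress_flag are made up for illustration, and it assumes the version banner contains a line of the form "rsync  version X.Y.Z ...".

```shell
#!/bin/sh
# Extract "X.Y.Z" from the banner printed by `rsync --version` (read on stdin).
# Assumes a line shaped like: rsync  version 3.2.7  protocol version 31
parse_rsync_version() {
    awk '$1 == "rsync" && $2 == "version" { print $3; exit }'
}

# version_ge A B: succeed when dotted version A is >= B (numeric comparison).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Print the aggregate progress flag when supported, else the per-file one.
progress_flag() {
    if version_ge "$1" 3.1.0; then
        echo "--info=progress2"
    else
        echo "--progress"
    fi
}
```

Typical use would be progress_flag "$(rsync --version | parse_rsync_version)", with the result passed to rsync in place of a hard-coded --progress.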

Step 2 — Set up SSH key authentication

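You can use rsync with password authentication, but you'll be retyping the password every run, and cron jobs have no terminal to answer the prompt, so they fail or hang. Generate a key, copy the public half to the remote authorized_keys, and you're done. For unattended runs, leave the passphrase empty when ssh-keygen asks, or load the key into an agent the job can reach: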

# Generate an Ed25519 key (faster, smaller, modern default)
ssh-keygen -t ed25519 -C "rsync backup key" -f ~/.ssh/rsync_key

# Copy the public key to the remote server
ssh-copy-id -i ~/.ssh/rsync_key.pub -p 5840 root@198.51.100.42

If ssh-copy-id isn't available (some macOS setups don't ship it), append ~/.ssh/rsync_key.pub to ~/.ssh/authorized_keys on the remote host manually and make sure permissions are 700 on ~/.ssh and 600 on authorized_keys. With the default StrictModes setting, sshd ignores an authorized_keys file when it or its parent directories are writable by other users.
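The manual route can be sketched as a small function to run on the remote host once the .pub file has been copied there. The name install_pubkey and its arguments are hypothetical; the 700 and 600 modes are the ones described above.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE [SSH_DIR]
# Appends the key to SSH_DIR/authorized_keys (default: ~/.ssh) unless the
# exact line is already there, then applies the 700/600 permissions.
install_pubkey() (
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -x whole line, -F fixed string, -f patterns from file: add the key once
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
)
```

Appending with cat keeps the key on a single line, which avoids the line-break problem that pasting through a terminal can introduce.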

Step 3 — The basic command structure

Here's the canonical rsync-over-SSH command, in the same shape the generator produces for the screenshot above (with example.com standing in for the real host and paths):

rsync -av --progress -e "ssh -p 5840" \
    root@198.51.100.42:/var/www/example.com/ \
    /Users/admin/Projects/example.com/

Read it left to right. The flags -a and -v turn on archive mode and verbose output. --progress prints per-file transfer progress. The -e "ssh -p 5840" block tells rsync which transport to use and overrides the default SSH port; if the key from Step 2 is not one SSH tries by default, add it inside the same quotes (-e "ssh -p 5840 -i ~/.ssh/rsync_key"). The first path is the source: the colon between hostname and path is what makes it a remote location. The second path is the local destination.

The trailing slash on the source matters more than people realize. /var/www/example.com/ (with the slash) means "copy the contents of this directory into the destination". /var/www/example.com (without) means "copy this directory itself into the destination", which produces /Users/admin/Projects/example.com/example.com/. Once you've nested a directory inside itself by accident, you'll never forget the rule again.
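The rule is mechanical enough to write down as a helper that predicts where files will land before you run anything. This is an illustrative sketch (landing_dir is a hypothetical name, not an rsync feature); it only inspects the strings and never touches the network.

```shell
#!/bin/sh
# landing_dir SRC DST: print the directory in which the files inside SRC
# end up. A trailing slash on SRC means "contents only"; without it, rsync
# recreates the last path component of SRC under DST.
landing_dir() (
    src=$1
    dst=${2%/}    # a trailing slash on the destination changes nothing
    case $src in
        */) printf '%s/\n' "$dst" ;;
        *)  printf '%s/%s/\n' "$dst" "$(basename "$src")" ;;
    esac
)
```

Calling it with root@198.51.100.42:/var/www/example.com (no slash) and /Users/admin/Projects/example.com/ prints the nested /Users/admin/Projects/example.com/example.com/ path, which is the accident described above.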

Step 4 — What the flags actually do

The flag stack -av and friends look like a single option but each letter does something specific. Worth knowing what:

  • -a (archive) — combines -rlptgoD. Recursive, preserves symlinks, permissions, modification times, group and owner, and special device files. This is the flag you want for backups and mirrors. It does not preserve hard links (use -H for that) or extended attributes (-X / -A).
  • -v (verbose) — prints filenames as they're processed. For very large transfers, drop it or pair with --info=progress2 for a single-line progress display instead of one line per file.
  • -z (compress) — compresses data over the wire. Useful on slow links or text-heavy content. Skip it for already-compressed payloads (videos, JPEGs, .tar.gz files); the CPU cost outweighs the saved bandwidth.
  • --progress — per-file transfer progress. Newer rsync (3.1+) also supports --info=progress2 for an aggregate progress bar across the entire transfer, which is what you want for backups.
  • -P — shorthand for --progress --partial. The --partial part keeps half-transferred files on disk if the connection drops, so the next run resumes where it left off instead of starting from zero.
  • --delete — removes files from the destination that no longer exist at the source. This turns an incremental copy into a true mirror. Dangerous flag: a typo in the source path can wipe the destination. Always pair the first run with --dry-run.
  • -n / --dry-run — simulate the transfer without writing anything. Mandatory before any --delete run on data you care about.
  • -e — specifies the remote shell. Used to pass SSH options like a non-standard port (-e "ssh -p 5840") or a specific identity file (-e "ssh -i ~/.ssh/rsync_key").

Step 5 — Incremental copy vs full mirror

The "Kopia przyrostowa" / "Incremental copy" checkbox in the generator controls one critical behavior: whether rsync removes files at the destination that no longer exist at the source.

# Incremental copy — adds and updates, never deletes
rsync -av --progress -e "ssh -p 5840" \
    root@198.51.100.42:/var/www/example.com/ \
    /Users/admin/Projects/example.com/

# Full mirror — destination matches source exactly, including deletions
rsync -av --delete --progress -e "ssh -p 5840" \
    root@198.51.100.42:/var/www/example.com/ \
    /Users/admin/Projects/example.com/

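For a working development copy of a website, incremental is the right default: you don't want files that exist only on your machine removed just because the remote doesn't have them. Note that files present on both sides are still overwritten with the remote version unless you add --update (-u), which skips files that are newer on the receiving side. For an off-site backup that needs to mirror the production state exactly, you need --delete. Be deliberate about which one you're running.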
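The checkbox logic can be sketched as a tiny command builder. This is not the generator's actual code, only an assumption about how such a toggle could work; build_rsync_cmd and its "copy" and "mirror" mode names are hypothetical. It prints the command rather than running it, so the result can be reviewed first.

```shell
#!/bin/sh
# build_rsync_cmd MODE PORT SRC DST
# MODE is "copy" (incremental, never deletes) or "mirror" (adds --delete).
# Paths are printed unquoted; quote them yourself if they contain spaces.
build_rsync_cmd() (
    mode=$1 port=$2 src=$3 dst=$4
    case $mode in
        copy)   delete_flag= ;;
        mirror) delete_flag=" --delete" ;;
        *)      echo "unknown mode: $mode" >&2; exit 2 ;;
    esac
    printf 'rsync -av%s --progress -e "ssh -p %s" %s %s\n' \
        "$delete_flag" "$port" "$src" "$dst"
)
```

Rejecting unknown modes instead of defaulting to one of them keeps a typo from silently selecting the destructive variant.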

Step 6 — Excluding files you don't want

Almost every real-world sync needs an exclude list. Cache directories, version-control internals, log files, and node_modules shouldn't waste bandwidth.

rsync -av --progress \
    --exclude='.git/' \
    --exclude='node_modules/' \
    --exclude='*.log' \
    --exclude='cache/' \
    -e "ssh -p 5840" \
    root@198.51.100.42:/var/www/example.com/ \
    /Users/admin/Projects/example.com/

For long lists, point rsync at a file with --exclude-from=excludes.txt instead — one pattern per line, comments with #. This is what production backup scripts do, and it's easier to keep under version control than a wall of --exclude flags.
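As a starting point, the four patterns from the command above could be written to a file like this. The function name write_excludes is hypothetical, and the grouping comments are only a suggestion.

```shell
#!/bin/sh
# write_excludes FILE: create an excludes file with one pattern per line.
# Lines starting with # are comments; rsync skips them and blank lines.
write_excludes() {
    cat > "$1" <<'EOF'
# Version-control internals
.git/
# Dependencies that can be reinstalled
node_modules/
# Logs and caches
*.log
cache/
EOF
}
```

After write_excludes excludes.txt, the four --exclude flags in the earlier command collapse into a single --exclude-from=excludes.txt.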

Step 7 — Wrapping it in a script for cron

Here's a minimal backup script that covers the essentials. Read the upstream rsync documentation for the full option set:

#!/bin/bash
# /usr/local/bin/backup-www.sh
set -euo pipefail

SRC="root@198.51.100.42:/var/www/"
DST="/srv/backups/www/"
LOG="/var/log/rsync-www.log"

# Test rsync's exit status directly in the "if". With "set -e", a failing
# rsync on its own line would end the script before a later "$?" check
# could send the alert.
if ! rsync -az --delete --partial \
    -e "ssh -p 5840 -i /root/.ssh/rsync_key" \
    --exclude-from=/etc/rsync/excludes.txt \
    "$SRC" "$DST" >> "$LOG" 2>&1; then
    echo "rsync failed at $(date)" | mail -s "Backup FAILED" admin@example.com
    exit 1
fi

Make it executable (chmod +x /usr/local/bin/backup-www.sh), then add it to cron with crontab -e:

# Run nightly at 02:00
0 2 * * * /usr/local/bin/backup-www.sh

This pattern — a single script, key-based auth, an excludes file, logging, and a failure notification — is enough infrastructure to run reliably for years. Pair it with the Let's Encrypt certificate generator if you also need to back up /etc/letsencrypt/, which you do.

Common mistakes and pitfalls

Forgetting the trailing slash on the source

rsync -av /var/www example.com:/backup/ copies /var/www itself into /backup/, producing /backup/www/. rsync -av /var/www/ example.com:/backup/ copies the contents of /var/www into /backup/. The two are not the same, and the difference shows up as nested directories or, worse, files written to the wrong path. Decide what you mean and check the trailing slash.

Running --delete without --dry-run first

--delete removes anything at the destination that isn't at the source. If you accidentally swap source and destination, or point at the wrong path, you'll mirror an empty directory over a directory full of data. Always run with -n first and read the file list. If anything in the "deleting" list surprises you, don't run the real command.
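One way to make the preview hard to skip is a wrapper that always does the dry run first and asks before deleting anything. This is a sketch under assumptions: mirror_with_preview is a hypothetical name, the RSYNC variable exists only so the binary can be swapped for a stub in testing, and it relies on rsync printing "deleting <path>" lines in verbose dry-run output.

```shell
#!/bin/sh
# RSYNC can be overridden (for example with a stub) when testing the wrapper.
RSYNC=${RSYNC:-rsync}

# mirror_with_preview SRC DST [extra rsync options...]
# 1. Dry run with --delete; collect the "deleting ..." lines.
# 2. If there are any, show them and ask for confirmation.
# 3. Only then run the real transfer.
mirror_with_preview() (
    src=$1 dst=$2
    shift 2
    preview=$("$RSYNC" -a --delete -n -v "$@" "$src" "$dst") || exit 1
    would_delete=$(printf '%s\n' "$preview" | grep '^deleting ' || true)
    if [ -n "$would_delete" ]; then
        printf '%s\n' "$would_delete"
        printf 'Proceed with these deletions? [y/N] '
        read -r answer
        [ "$answer" = "y" ] || { echo "aborted"; exit 1; }
    fi
    "$RSYNC" -a --delete "$@" "$src" "$dst"
)
```

Because it reads the answer from standard input, this wrapper suits interactive runs; a cron job should keep using a command that has already been reviewed.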

rsync: command not found on the remote

The remote server needs rsync installed too — the local rsync calls a remote rsync over SSH and they speak a wire protocol to each other. The error appears as rsync: connection unexpectedly closed or a literal rsync: command not found in the SSH stream. Fix: apt install rsync or dnf install rsync on the remote.

Permission denied (publickey) after configuring keys

The most common cause is permission bits on the remote ~/.ssh directory or the authorized_keys file. With the default StrictModes setting, sshd ignores authorized_keys when the file or ~/.ssh is writable by group or others. On the remote server: chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys. The second most common cause: the public key was copied with line breaks or extra whitespace. Open authorized_keys and confirm the key is on a single line.

Files re-transferring every run despite no changes

Usually a timestamp problem. FAT filesystems (USB drives) store modification times with 2-second resolution, and some NFS mounts and certain Docker volume drivers don't preserve them accurately either. rsync sees the timestamp as different and re-syncs. Add --checksum temporarily to confirm the files are byte-identical. If the cause is FAT's coarse timestamps, --modify-window=1 tells rsync to treat times within one second of each other as equal; otherwise either fix the destination filesystem or accept that you need --checksum on every run (slower, but accurate).

-z slowing down transfers on fast networks

Compression (-z) is a win on slow links and text-heavy data. On a 1 Gbps LAN transferring already-compressed files (videos, JPEGs, archives), the CPU cost of compressing exceeds the bandwidth saved. Drop -z and watch throughput climb.

SSH connection drops mid-transfer with no resume

Without --partial, rsync deletes incomplete files when a connection drops, and the next run starts each file from byte zero. Use -P (which is --partial --progress) for any transfer where the connection might be flaky or the files are large.
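For flaky links, --partial pairs naturally with a retry loop, so each attempt continues from the partial files left by the previous one. The sketch below is illustrative: rsync_retry is a hypothetical helper, RSYNC is overridable only to allow testing with a stub, and it retries on any non-zero exit status without distinguishing error types.

```shell
#!/bin/sh
RSYNC=${RSYNC:-rsync}

# rsync_retry MAX_TRIES DELAY_SECONDS rsync-args...
# Re-runs rsync with --partial until it succeeds or MAX_TRIES is reached.
rsync_retry() (
    max_tries=$1 delay=$2
    shift 2
    try=1
    while ! "$RSYNC" --partial "$@"; do
        if [ "$try" -ge "$max_tries" ]; then
            echo "giving up after $try attempts" >&2
            exit 1
        fi
        echo "attempt $try failed, retrying in ${delay}s" >&2
        sleep "$delay"
        try=$((try + 1))
    done
)
```

A call such as rsync_retry 5 30 -av -e "ssh -p 5840" root@198.51.100.42:/var/www/example.com/ /Users/admin/Projects/example.com/ would make up to five attempts, 30 seconds apart.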

FAQ

How do I run rsync over a non-standard SSH port?

Use the -e flag to pass the port to SSH: rsync -av -e "ssh -p 5840" source/ user@host:dest/. The quotes around "ssh -p 5840" matter — without them, the shell splits the argument and rsync misinterprets the command. The Rsync generator on vps.pyrek.com.pl produces this exact syntax automatically when you fill in a custom port.

What's the difference between rsync and scp?

scp copies files in one shot, every time, with no awareness of what's already at the destination. rsync compares source and destination before transferring and only sends the bytes that differ. For a one-off transfer of a single file, both work; for repeated syncs of large directories, rsync is dramatically faster. The legacy scp protocol is also considered outdated: since OpenSSH 9.0 the scp command uses SFTP under the hood by default, which is yet another reason to default to rsync.

Can I use rsync to back up a remote server to my local machine?

Yes — this is the "pull" pattern, and it's often safer than pushing. You put the source on the remote (user@host:/path/) and the destination locally. The local machine initiates the connection, holds the SSH key, and writes to its own disk — the remote server never gets credentials to write back. This matches the screenshot at the top of the article: the developer's laptop pulls /var/www/own.pyrek.com.pl/ from the VPS into ~/AntigravityProjects/.

Does rsync resume interrupted transfers?

Only if you tell it to. Plain rsync deletes incomplete files when a transfer is interrupted. Add --partial (or use -P, which combines --partial and --progress) to keep partial files on disk so the next run picks up where the previous one left off. For large files over unstable connections, this is essential.

How do I exclude .git, node_modules, and similar directories?

Use --exclude='.git/' and --exclude='node_modules/' on the command line, or, for longer lists, --exclude-from=path/to/excludes.txt with one pattern per line. A pattern without a leading / matches at any depth in the tree; a leading / anchors it to the top of the transfer (the source directory), and a trailing / restricts the match to directories. Test exclusion rules with --dry-run -v before relying on them.

Will rsync preserve file permissions, ownership, and timestamps?

The -a (archive) flag preserves permissions (-p), modification times (-t), group (-g), owner (-o, requires root or matching UID), and recurses into directories (-r). It does not preserve hard links (add -H), ACLs (-A), or extended attributes (-X). For a full filesystem-level mirror, the flag stack is -aHAX.

Can I run rsync without root on the remote server?

Yes, and you should where possible. Create a dedicated unprivileged user on the remote (backup-user or similar), give it read access to the directories you need to back up, and lock its SSH key down with command="rsync --server ..." in authorized_keys. The OpenSSH documentation on authorized_keys covers the restriction syntax.
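As an alternative to writing the forced command by hand, the rrsync helper script that ships with rsync can confine a key to one directory. The sketch below generates such an authorized_keys entry; restricted_key_line is a hypothetical name, and the rrsync location is an assumption that varies by distribution, so check where your package installs it.

```shell
#!/bin/sh
# restricted_key_line RRSYNC_PATH DIR PUBKEY_FILE
# Prints an authorized_keys entry that forces the key to run rrsync in
# read-only mode (-ro), confined to DIR. The "restrict" option disables
# port forwarding, PTY allocation and similar features (OpenSSH 7.2+).
restricted_key_line() {
    printf 'command="%s -ro %s",restrict %s\n' "$1" "$2" "$(cat "$3")"
}
```

For example, restricted_key_line /usr/bin/rrsync /var/www ~/.ssh/rsync_key.pub prints a line to append to the backup user's authorized_keys; with rrsync in place, the client requests paths relative to that directory.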

How do I make rsync show a single-line progress bar instead of one per file?

Use --info=progress2 (rsync 3.1.0 and newer). This replaces per-file output with a running aggregate showing total bytes transferred, percentage complete, and ETA. Pair with --no-inc-recursive if you want the percentage to be accurate from the start instead of climbing as rsync discovers more files.

Next steps

Generate your full rsync command — with your server address, paths, port, and the right --delete choice — in the Rsync directory generator. It's faster than typing the -e "ssh -p ..." block from memory, and harder to get the trailing slashes wrong.
