
ZFS for self-hosted infrastructure: pools, datasets, and data integrity.

February 25, 2026 · 13 min read · 47Network Engineering

ZFS is the filesystem we use for all persistent storage in 47Network Studio hardware engagements. It's not the simplest option (ext4 or XFS would work fine for many use cases), but ZFS offers a combination of properties that matter for infrastructure you're expected to operate reliably for years: end-to-end checksums that catch silent data corruption, copy-on-write snapshots with no performance penalty, built-in compression that often makes storage faster, not just smaller, and a replication primitive (send/receive) that makes off-site backup straightforward. This post covers the concepts and the concrete commands for a production-grade setup.

Pool design: choosing your RAID level

A ZFS pool is the top-level storage construct โ€” it contains one or more vdevs (virtual devices), each of which can be a single disk, a mirror, or a RAIDZ array. The pool's available capacity and fault tolerance are determined by the vdev configuration.

  • Mirror (RAID-1): two or more disks, all containing identical data. Excellent read performance (reads from any disk), survives any single disk failure, high write performance. Costs 50% of raw capacity for a 2-way mirror. Best for NVMe SSDs where you want maximum IOPS.
  • RAIDZ1 (RAID-5 equivalent): N+1 disks. One disk of parity, survives one failure. Acceptable for archival storage. Not recommended for modern large disks: an 18TB disk rebuild takes days, and rebuild reads can trigger another failure on a stressed array.
  • RAIDZ2 (RAID-6 equivalent): N+2 disks. Two disks of parity, survives two simultaneous failures. The recommended choice for spinning disks holding important data. Minimum 4 disks.
  • RAIDZ3: N+3 disks. Three parity disks. Justified only for very large arrays or extremely high-value data where extended rebuild time is a concern.
# Create a RAIDZ2 pool with 6 disks
zpool create -o ashift=12 datapool raidz2 \
  /dev/disk/by-id/ata-WDC_WD180EDAZ_A1B2C3 \
  /dev/disk/by-id/ata-WDC_WD180EDAZ_D4E5F6 \
  /dev/disk/by-id/ata-WDC_WD180EDAZ_G7H8I9 \
  /dev/disk/by-id/ata-WDC_WD180EDAZ_J1K2L3 \
  /dev/disk/by-id/ata-WDC_WD180EDAZ_M4N5O6 \
  /dev/disk/by-id/ata-WDC_WD180EDAZ_P7Q8R9

# Always use /dev/disk/by-id paths, not /dev/sdX; device letters can change after reboots
# ashift=12 means 4K sector alignment (a safe default for modern drives)
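Since ashift cannot be changed after pool creation, it is worth sanity-checking the math before running `zpool create`. A small sketch (the `sector_size` helper is hypothetical; on a live system, `zdb -C datapool | grep ashift` shows what the pool actually recorded):

```shell
# ashift is the base-2 logarithm of the sector size the pool will use,
# so ashift=12 means 2^12 = 4096-byte (4K) sectors.
sector_size() { echo $((1 << $1)); }

sector_size 12   # 4096 (4K sectors, ashift=12)
sector_size 9    # 512  (legacy 512-byte sectors, ashift=9)
```

Getting this wrong on 4K-native drives forces read-modify-write cycles on every sub-sector write, which is why it is the one option worth double-checking at creation time.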

Dataset hierarchy: structure before you need it

ZFS datasets are lightweight, filesystem-like containers within a pool. Each dataset can have its own compression, quota, snapshot policy, and mountpoint. Design your dataset hierarchy to match your data access patterns and retention requirements:

# Create datasets for different workloads
zfs create datapool/vms           # Proxmox VM disk images
zfs create datapool/containers    # LXC containers
zfs create datapool/backups       # restic backup repository
zfs create datapool/media         # Large media files (different compression)
zfs create datapool/postgres      # PostgreSQL data directory

# Per-dataset settings
zfs set compression=lz4      datapool/vms          # Fast, good ratio for VM images
zfs set compression=zstd     datapool/backups      # Better ratio for backup data
zfs set compression=off      datapool/media        # Already compressed, skip
zfs set compression=lz4      datapool/postgres     # DB pages compress well

# Quotas and reservations
zfs set quota=2T             datapool/backups      # Hard cap at 2TB
zfs set reservation=100G     datapool/postgres     # Always reserve 100GB

# Disable access time updates for better performance
zfs set atime=off            datapool/vms
zfs set atime=off            datapool/postgres
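Once real data lands on these datasets, ZFS reports the achieved compression per dataset via the `compressratio` property, and converting that ratio to percent saved is a one-liner. A sketch (the `saved_pct` helper is hypothetical):

```shell
# On a ZFS host, inspect per-dataset compression effectiveness with:
#   zfs get -H -o name,value compressratio datapool/backups
# compressratio is logical size / physical size; convert it to percent saved:
saved_pct() {
  awk -v r="$1" 'BEGIN { printf "%.0f\n", (1 - 1/r) * 100 }'
}

saved_pct 2.00   # 50 (half the space)
saved_pct 1.85   # 46
```

Checking this a few weeks in is how you validate choices like `compression=off` on the media dataset: a ratio stuck at 1.00x confirms the data was already incompressible.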

Snapshots: free-as-in-beer point-in-time backups

ZFS snapshots are copy-on-write โ€” creating one is instant and costs no disk space until data in the dataset diverges from the snapshot. A snapshot of a 1TB dataset takes less than a second and initially uses zero additional space.

# Take a snapshot (naming convention: dataset@YYYY-MM-DD)
zfs snapshot datapool/postgres@2026-02-25

# List all snapshots
zfs list -t snapshot

# Roll back to a snapshot (DESTRUCTIVE โ€” discards all changes since snapshot)
zfs rollback datapool/postgres@2026-02-25

# Clone a snapshot to a new dataset (useful for testing migrations)
zfs clone datapool/postgres@2026-02-25 datapool/postgres-test

# Automated snapshots with a cron job or systemd timer,
# using zfs-auto-snapshot or a manual crontab entry.
# Note: % is special in crontab and must be escaped as \%.
# /etc/cron.d/zfs-snapshots
0  * * * *  root  zfs snapshot datapool/postgres@$(date +\%Y-\%m-\%d-\%H\%M)  # Hourly
0  2 * * *  root  zfs snapshot datapool/vms@$(date +\%Y-\%m-\%d)              # Daily
# Prune old snapshots: keep last 24 hourly, 30 daily
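The prune step can be sketched as a small script. This is a hypothetical sketch, not a vetted tool: it assumes GNU coreutils (`head -n -N`) and the naming convention above, and since `zfs destroy` is irreversible, test the selection logic on sample input before wiring it to cron:

```shell
#!/bin/sh
# prune_keep N: reads snapshot names oldest-first on stdin and prints
# every name EXCEPT the newest N (i.e., the ones eligible for destruction).
# Requires GNU head for the negative line count.
prune_keep() {
  head -n -"$1"
}

# Dry run against sample input: with keep=2, only the oldest is printed.
printf 'pg@h1\npg@h2\npg@h3\n' | prune_keep 2   # prints pg@h1

# On a live system (zfs list -s creation sorts oldest-first):
#   zfs list -H -t snapshot -o name -s creation datapool/postgres \
#     | prune_keep 24 \
#     | xargs -r -n1 zfs destroy
```

Keeping the selection logic as a pure text filter means the dangerous `zfs destroy` stage can be reviewed with a dry run (drop the `xargs` line) before anything is deleted.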

Send/receive: replication and off-site backup

ZFS send/receive is the most powerful ZFS feature for backup and replication. It serializes a snapshot (including its history) into a byte stream that can be received into another pool, either locally or over SSH. Incremental sends transfer only the blocks that changed since the last common snapshot:

# Initial full send to backup host
zfs send datapool/postgres@2026-02-25 | \
  ssh backup-host zfs receive backuppool/postgres

# Incremental send (only changes since @2026-02-24)
zfs send -i datapool/postgres@2026-02-24 \
           datapool/postgres@2026-02-25 | \
  ssh backup-host zfs receive backuppool/postgres

# With verbose progress (pv) and stream compression (zstd)
zfs send -v datapool/postgres@2026-02-25 \
  | pv \
  | zstd \
  | ssh backup-host "zstd -d | zfs receive backuppool/postgres"

# Resume an interrupted transfer (large pools over slow links).
# The initial receive must use -s to make the stream resumable; after an
# interruption, the receiving dataset stores a resume token:
token=$(ssh backup-host zfs get -H -o value receive_resume_token backuppool/postgres)
zfs send -t "$token" | ssh backup-host zfs receive -s backuppool/postgres

Maintenance: scrubs and monitoring

# Schedule weekly scrubs (checks every block for corruption)
echo "0 2 * * 0 root /sbin/zpool scrub datapool" >> /etc/crontab

# Check pool health
zpool status datapool

# Monitor ZFS with Prometheus (zfs_exporter)
# Key metrics to alert on:
# - zpool_state != 0 (pool is degraded or faulted)
# - zfs_scrub_errors_total > 0 (checksum errors found)
# - zpool_free_bytes < 10% of total (pool getting full)
# ZFS performance degrades sharply above 80% full; alert at 75%
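If you don't run a Prometheus exporter, the capacity alert can also be a plain cron check. A minimal sketch (the `pool_full` helper is hypothetical; the 75% threshold follows the guidance above):

```shell
#!/bin/sh
# pool_full PCT THRESHOLD: true if usage (e.g. "82%") is at or above THRESHOLD.
pool_full() {
  [ "${1%\%}" -ge "$2" ]
}

pool_full "82%" 75 && echo "over threshold"   # prints: over threshold
pool_full "60%" 75 || echo "ok"               # prints: ok

# On a live system, wire it to the pool's reported capacity:
#   cap=$(zpool list -H -o capacity datapool)
#   pool_full "$cap" 75 && logger -p user.warning "datapool at $cap, nearing the 80% cliff"
```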

In 47Network Studio hardware engagements, ZFS runs on every NAS and storage server. The law firm engagement uses a 6-disk RAIDZ2 pool for all file server storage, with daily snapshots retained for 30 days and weekly send/receive replication to an off-site backup server. The media Proxmox cluster uses ZFS-backed shared storage for all VM disk images, with per-VM snapshots before any maintenance window.

ZFS and ECC RAM: ZFS's data integrity guarantees are only as strong as your RAM. If RAM flips a bit while data passes through it, ZFS will checksum that corrupted data as correct. ECC (error-correcting) RAM catches and corrects these in-flight bit flips. For any storage server where data integrity is critical (and if you're using ZFS, it is), ECC RAM is not optional.


โ† Back to Blog Proxmox in Production โ†’