ZFS: Scrubbing with an Asynchronous Cronjob


If you’re following along, we’ve covered setting up a ZFS mirror and upgrading ZFS. Today we’ll discuss scrubbing our zpool.

Scrubbing

ZFS scrubbing is a process that detects and corrects silent data errors in the pool. During a scrub, ZFS reads the data in the pool, calculates checksums, and compares them to the checksums stored when the data was written. If the values differ, ZFS tries to repair the data using good redundant copies in the pool (in our case, the other disk of the mirror).

We want to avoid silent data corruption, so we should scrub our zpool regularly. This can be done while the system is running and could easily be handled with a weekly cronjob. However, I suspend my desktop every night, so if I used a simple cronjob there would be a chance my computer would be asleep when the scrub is scheduled to run.
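For comparison, a plain weekly entry in root's crontab would look something like the sketch below (the schedule is just an example, and lotus is my pool's name); the trouble is that if the machine happens to be asleep at that moment, the scrub simply doesn't happen:

# run a scrub every Sunday at 03:00; silently skipped if the machine is off or asleep
0 3 * * 0 zpool scrub lotus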

Anacron

Enter anacron, which stands for “anac(h)ronistic cron”. Anacron is like cron in that it schedules regularly occurring tasks; however, it does not assume the computer is on at the scheduled time. When the system is turned back on, anacron checks the last time each task ran and compares it to when the task is next due (which may already be in the past). If it has been, say, more than a week since our last ZFS scrub, it runs the scrub. On Arch Linux, anacron comes with cronie.
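If you're curious how anacron keeps track of this, it stores a timestamp file per job, named after the job identifier, under /var/spool/anacron (at least with cronie's anacron). After a job has run you should be able to peek at these files, roughly like so:

ls /var/spool/anacron
cat /var/spool/anacron/cron.weekly   # contains the date of the last run, e.g. 20150516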

Installation and Setup

First we need to install cronie.

sudo pacman -S cronie
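If you want to confirm that anacron really does ship with cronie on Arch, listing the package's files should show the anacron binary and the default anacrontab (a quick sanity check, not strictly required):

pacman -Ql cronie | grep anacron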

Now, to set up our ZFS scrub we need to add the following line to /etc/anacrontab:

# zfs scrub
7   10  scrublotus.weekly   zpool scrub lotus

You’ll see comments in your anacrontab describing the columns, but for completeness:

- Period in days: how often the job should run; we want to scrub once a week, so I’ve entered 7.
- Delay in minutes: how long anacron waits before starting the job; we’ve set this to 10.
- Job identifier: a unique name for the job, used when storing its last run time; it can be anything.
- Command: what to run, in this case zpool scrub lotus.

You’ll want to enter the name of your zpool where I’ve entered lotus.
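If you don't want to wait a week to find out whether the entry works, anacron can check the anacrontab for validity and run a specific job immediately; something like the following should do it (flags per the anacron man page):

# test /etc/anacrontab for syntax errors
sudo anacron -T
# run just our job now, ignoring the stored timestamp and the delay
sudo anacron -n scrublotus.weekly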

On top of the base delay in the second column, anacron adds a random delay to the start of each job it runs. This is meant to prevent all the jobs from running at once and putting a sudden large load on the system.
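That random delay is controlled near the top of /etc/anacrontab; the cronie defaults typically look something along these lines (values here are just for illustration):

# maximum random delay added to each job's base delay, in minutes
RANDOM_DELAY=45
# jobs may only be started within this window of hours
START_HOURS_RANGE=3-22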

Finally we need to make sure the cronie.service is running:

sudo systemctl enable cronie.service
sudo systemctl start cronie.service
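On reasonably recent systemd versions the two commands can be combined, if you prefer:

sudo systemctl enable --now cronie.service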

If cronie is running you’ll see:

$ sudo systemctl status cronie.service
● cronie.service - Periodic Command Scheduler
   Loaded: loaded (/usr/lib/systemd/system/cronie.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2015-05-16 18:37:23 EDT; 19h ago
 Main PID: 673 (crond)
   CGroup: /system.slice/cronie.service
           └─673 /usr/bin/crond -n

Scrub Status

To view the status of an ongoing scrub, or to see when the last scrub finished, you can run sudo zpool status. The output will contain a scan line similar to this one:

scan: scrub repaired 0 in 3h2m with 0 errors on Sat May 16 14:15:45 2015

During a scrub there are a lot of I/O operations occurring, so normal reads and writes will slow down a bit. When I first set this up I was curious whether I could suspend or shut down the computer during a scrub, as there is a chance I will do so, but I could not find much information on the topic. I have not had an issue with it yet, though I believe I have suspended during a scrub at least once since setting this up.
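If you'd rather not leave it to chance, a scrub can be cancelled before suspending or shutting down and started again later (as far as I can tell it starts over from the beginning rather than resuming):

sudo zpool scrub -s lotus   # stop the in-progress scrub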

Now that we have automated scrubbing set up, one of the last automated maintenance tasks to configure is automated snapshots of the zpool. I hope to write about this next week, though it may be delayed by up to a month, as I have a rather involved, month-long exam for my degree beginning tomorrow.
