Friday, October 9, 2009

FreeNAS 0.7, ZFS snapshots and scrubbing

I've described here how I take snapshots. Cron and a little script work perfectly, but I also scrub the zpools from time to time, and the scrub never finishes :-(
The reason is described here: scrubbing or resilvering a pool starts over whenever a snapshot is taken.
So the scripts that create the snapshots have to be changed so that they skip the snapshot while a scrub is running.

snapshot_hourly.sh

#!/bin/sh

# If a scrub is running, don't do a snapshot (otherwise the scrub will restart at 0%)
pools=$(zpool list -H -o name)
for pool in $pools;
do
    if zpool status -v "$pool" | grep -q "scrub in progress"; then
        exit
    fi
done

# Destroy the old snapshot and create the new
zfs destroy $1@hourly.`date "+%H"` > /dev/null 2>&1
zfs snapshot $1@hourly.`date "+%H"`
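The script takes the dataset name as its first argument ($1). If it is called without one, zfs ends up with a name like '@hourly.11' and refuses it with "empty component in name". A minimal sketch of a guard that could run before the zfs commands; check_dataset is a hypothetical helper name and tank/data is only a placeholder dataset, neither is part of the original script:

```shell
#!/bin/sh
# Sketch only: check_dataset is a hypothetical helper, not part of the original script.

# Succeed only if a non-empty dataset name was passed as the first argument.
check_dataset() {
    if [ -z "$1" ]; then
        echo "usage: snapshot_hourly.sh <pool/dataset>" >&2
        return 1
    fi
}

# An empty argument is rejected before any zfs command would run.
check_dataset "" 2>/dev/null || echo "no dataset given, nothing to do"

# "tank/data" is a placeholder dataset name used for illustration.
check_dataset "tank/data" && echo "would snapshot tank/data@hourly.$(date "+%H")"
```

In the real script this would probably be a single line such as check_dataset "$1" || exit 1 placed before the destroy and snapshot commands.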

snapshot_daily.sh

#!/bin/sh

# If a scrub is running, don't do a snapshot (otherwise the scrub will restart at 0%)
pools=$(zpool list -H -o name)
for pool in $pools;
do
    if zpool status -v "$pool" | grep -q "scrub in progress"; then
        exit
    fi
done

# Destroy the old snapshot and create the new
zfs destroy $1@daily.`date "+%a"` > /dev/null 2>&1
zfs snapshot $1@daily.`date "+%a"`

and snapshot_weekly.sh (please be aware that I use this script to keep snapshots of the last twelve weeks. If necessary, you need to change this...)

#!/bin/sh

# If a scrub is running, don't do a snapshot (otherwise the scrub will restart at 0%)
pools=$(zpool list -H -o name)
for pool in $pools;
do
    if zpool status -v "$pool" | grep -q "scrub in progress"; then
        exit
    fi
done

# Destroy the oldest snapshot, rotate the other and create the new
zfs destroy $1@weekly.12 > /dev/null 2>&1
zfs rename $1@weekly.11 @weekly.12 > /dev/null 2>&1
zfs rename $1@weekly.10 @weekly.11 > /dev/null 2>&1
zfs rename $1@weekly.9 @weekly.10 > /dev/null 2>&1
zfs rename $1@weekly.8 @weekly.9 > /dev/null 2>&1
zfs rename $1@weekly.7 @weekly.8 > /dev/null 2>&1
zfs rename $1@weekly.6 @weekly.7 > /dev/null 2>&1
zfs rename $1@weekly.5 @weekly.6 > /dev/null 2>&1
zfs rename $1@weekly.4 @weekly.5 > /dev/null 2>&1
zfs rename $1@weekly.3 @weekly.4 > /dev/null 2>&1
zfs rename $1@weekly.2 @weekly.3 > /dev/null 2>&1
zfs rename $1@weekly.1 @weekly.2 > /dev/null 2>&1
zfs snapshot $1@weekly.1
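If you want to keep a different number of weeks, the rotation can also be written as a loop. This is only a sketch under a few assumptions: rotate_weekly and KEEP are hypothetical names that do not appear in the script above, and the full snapshot name is spelled out on both sides of zfs rename so the loop does not rely on the abbreviated @name target form.

```shell
#!/bin/sh
# Sketch only: rotate_weekly and KEEP are hypothetical names for illustration.

KEEP=${KEEP:-12}    # number of weekly snapshots to keep

# Rotate <dataset>@weekly.1 .. @weekly.$KEEP and take a fresh @weekly.1.
rotate_weekly() {
    dataset=$1

    # Drop the oldest snapshot; errors are ignored because it may not exist yet.
    zfs destroy "$dataset@weekly.$KEEP" > /dev/null 2>&1

    # Shift weekly.(i-1) to weekly.i, starting with the oldest one.
    i=$KEEP
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        zfs rename "$dataset@weekly.$prev" "$dataset@weekly.$i" > /dev/null 2>&1
        i=$prev
    done

    # Create the new snapshot for the current week.
    zfs snapshot "$dataset@weekly.1"
}

# In the script itself this would be called as: rotate_weekly "$1"
```

The order matters: the oldest snapshot has to be renamed first, otherwise a rename would hit a name that is still in use.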

I've found a script from http://hype-o-thetic.com to run a scrub regularly. See his blogpost here...

Just for backup reasons, I post this script below...

#!/bin/bash

#VERSION: 0.2
#AUTHOR: gimpe
#EMAIL: gimpe [at] hype-o-thetic.com
#WEBSITE: http://hype-o-thetic.com
#DESCRIPTION: Created on FreeNAS 0.7RC1 (Sardaukar)
# This script will start a scrub on each ZFS pool (one at a time) and
# will send an e-mail or display the result when everything is completed.

#CHANGELOG
# 0.2: 2009-08-27 Code clean up
# 0.1: 2009-08-25 Make it work

#SOURCES:
# http://aspiringsysadmin.com/blog/2007/06/07/scrub-your-zfs-file-systems-regularly/
# http://www.sun.com/bigadmin/scripts/sunScripts/zfs_completion.bash.txt
# http://www.packetwatch.net/documents/guides/2009073001.php

# e-mail variables
FROM=from@devnull.com
TO=to@devnull.com
SUBJECT="$0 results"
BODY=""

# arguments
VERBOSE=0
SENDEMAIL=1
args=("$@")
# loop over every argument ("$args" alone would expand to the first one only)
for arg in "${args[@]}"; do
    case $arg in
        "-v" | "--verbose")
            VERBOSE=1
            ;;
        "-n" | "--noemail")
            SENDEMAIL=0
            ;;
        "-a" | "--author")
            echo "by gimpe at hype-o-thetic.com"
            exit
            ;;
        "-h" | "--help" | *)
            echo "
usage: $0 [-v --verbose|-n --noemail]
-v --verbose output display
-n --noemail don't send an e-mail with result
-a --author display author info (by gimpe at hype-o-thetic.com)
-h --help display this help
"
            exit
            ;;
    esac
done

# work variables
ERROR=0
SEP=" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - "
RUNNING=1

# commands & configuration
ZPOOL=/sbin/zpool
PRINTF=/usr/bin/printf
MSMTP=/usr/local/bin/msmtp
MSMTPCONF=/var/etc/msmtp.conf

# print a message
function _log {
DATE="`date +"%Y-%m-%d %H:%M:%S"`"
# add message to e-mail body
BODY="${BODY}$DATE: $1\n"

# output to console if verbose mode
if [ $VERBOSE = 1 ]; then
echo "$DATE: $1"
fi
}

# find all pools
pools=$($ZPOOL list -H -o name)

# for each pool
for pool in $pools; do
    # start scrub for $pool
    _log "starting scrub on $pool"
    $ZPOOL scrub $pool
    RUNNING=1
    # wait until scrub for $pool has finished running
    while [ $RUNNING = 1 ]; do
        # still running?
        if $ZPOOL status -v $pool | grep -q "scrub in progress"; then
            sleep 60
        # not running
        else
            # finished with this pool, leave the wait loop
            _log "scrub ended on $pool"
            _log "`$ZPOOL status -v $pool`"
            _log "$SEP"
            RUNNING=0
            # check for errors
            if ! $ZPOOL status -v $pool | grep -q "No known data errors"; then
                _log "data errors detected on $pool"
                ERROR=1
            fi
        fi
    done
done

# change e-mail subject if there was error
if [ $ERROR = 1 ]; then
SUBJECT="${SUBJECT}: ERROR(S) DETECTED"
fi

# send e-mail
if [ $SENDEMAIL = 1 ]; then
$PRINTF "From:$FROM\nTo:$TO\nSubject:$SUBJECT\n\n$BODY" | $MSMTP --file=$MSMTPCONF -t
fi
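One thing to be aware of in the e-mail step: the whole message, including the collected zpool status output, is passed to printf as the format string, so a "%" in the body would be read as a conversion. A hedged sketch of an alternative, assuming the printf in use supports the POSIX %b conversion; build_message is a hypothetical helper name and the sample body line is made up for illustration:

```shell
#!/bin/sh
# Sketch only: build_message is a hypothetical helper, not part of gimpe's script.

# Arguments: from, to, subject, body.
build_message() {
    # %s prints the header values literally; %b expands the "\n" sequences
    # stored in the body and leaves any "%" characters untouched.
    printf 'From: %s\nTo: %s\nSubject: %s\n\n%b' "$1" "$2" "$3" "$4"
}

# Made-up body line containing a percent sign.
build_message "from@devnull.com" "to@devnull.com" "scrub results" \
    "2009-10-09 12:00:00: scrub: 12.50% done\n"
```

The output of build_message could then be piped into msmtp in the same way the original printf line is.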

9 comments:

KC said...

Awesome post. Just a quick note that you are missing a closing back tick (`) for both the hourly and the daily script. I was wondering why cron doesn't want to execute them!

harryd said...

Many thanks for this. I've corrected it in the post. THX!

Anonymous said...

Hi I just want to say thanks a lot, I have set this up on my two Freenas VMs (0.7.1 Shere (revision 5127)), and it is working like a charm. There is a typo in the first line (comment) for snapshot_daily.sh and snapshot_weekly.sh where it should say "(otherwise scrub will restarts at 0%)" instead of "(otherwise scrub will restarts at $" - but it does not affect the script in any way of course :)

Thanks again!

v1ncen7 said...

Hi I just want to say thanks a lot, I have set this up on my two Freenas VMs (0.7.1 Shere (revision 5127)), and it is working like a charm. There is a typo in the first line (comment) for snapshot_daily.sh and snapshot_weekly.sh where it should say "(otherwise scrub will restarts at 0%)" instead of "(otherwise scrub will restarts at $" - but it does not affect the script in any way of course :)

Thanks again!

Mathew (with one T) said...

Hi Harry,

The script looks awesome, but I cannot seem to get it to work. I am admittedly a novice at scripting, but I followed all your directions and get a "cron cannot execute" error without fail. Is the script literally copy and paste or are there things I need to adjust? Can I put it in any folder (not that this matters as I copied you exactly and put it in a bin folder in my mount)?

Any help would be awesome. Thanks again.

harryd said...

You have to give the execute permission to the script. (chmod u+x , pls have a look here -> http://www.freebsd.org/cgi/man.cgi?chmod)

Mathew (with one T) said...

Hey Harry,

Thanks for the prompt response. I had changed the permissions before (and double checked them after your post), but it still won't work. I did a little bit of digging and if I execute it from the shell instead of cron, I don't get an error, but "zfs list" also doesn't show any of the snapshots. Does the file/bin folder need to be in a certain location?

harryd said...

You've also read this one? -> http://harryd71.blogspot.com/2008/08/freenas-07-and-zfs-snapshots.html
Pls have a look at the screenshot in that post. Hope that helps...

Mathew (with one T) said...

Yes. I actually have two sets of scripts and tried both. Same result with both, but if I run the non-scrub version (from the shell), I get "cannot snapshot '@hourly.11' : empty component in name".

Anyways, don't mean to take up your time. I'll look around and see what I can find. Thanks again.