Big Data Junkie - Chip Schweiss

Slow zpool export and import

For those of you who have worked with ZFS in an HA environment, you know the pain of moving a pool from one server to another. A zpool export can take a very long time for a few reasons. None of them have been addressed in any ZFS platform that I am aware of. Nexenta may be an exception to this, but I have not confirmed it.

When 'zpool export' is called, a series of user space events is triggered before the pool is actually exported. First, for each NFS-exported zfs folder, 'exportfs -u' is called, one folder at a time. Second, for each zfs folder, 'zfs unmount' is called, again one folder at a time. When there are a lot of zfs folders exported by NFS, this process stacks up fast. One of my pools would take 45 minutes to export every time.

After the user space work is done, L2ARC is de-serialized. This too can take a considerable amount of time. I had learned from the ZFS mailing lists that a large L2ARC can be a culprit in slow zpool exports, so this is where I first focused in hoping to resolve the export problem. My first attempt was to simply remove cache devices one at a time from the active pool before starting its shutdown. The problem this created was that each time a device was removed, all activity to the pool would be blocked until the removal finished.

Fast Zpool Exports

Fast export can be achieved in 3 steps.

Remove Cache Devices

The default Illumos configuration will make this painful. The problem is that ZFS tries to disconnect a single cache device in one transaction group. This can block all other I/O for many minutes. The solution is to limit the maximum number of blocks freed per transaction group. This can be done in one of two ways.

    1. via /etc/system:
      set zfs:zfs_free_max_blocks = 10000
    2. on demand:
      echo zfs_free_max_blocks/W0t10000 | mdb -wk

The cache devices can then be removed just before exporting the pool.
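As a rough sketch of that removal step, the helper below pulls the cache device names out of 'zpool status' output so they can be handed to 'zpool remove' one at a time. The function name cache_devices and the pool name tank are made up for illustration, and the parsing assumes the usual 'zpool status' layout where a bare "cache" line starts the section. It is not taken from my scripts.

```sh
# Print the cache (L2ARC) device names found in `zpool status <pool>`
# output read from stdin. Assumes the standard layout: a line containing
# only "cache", followed by one line per device, ended by a blank line
# or by the next section heading (e.g. "spares").
cache_devices() {
    awk '
        NF == 1 && $1 == "cache" { in_cache = 1; next }  # section starts
        in_cache && NF <= 1      { in_cache = 0 }        # section ends
        in_cache                 { print $1 }            # device name
    '
}

# Usage, with a hypothetical pool named "tank":
#   for dev in $(zpool status tank | cache_devices); do
#       zpool remove tank "$dev"
#   done
```

The devices are removed serially here on purpose; with the free-block limit in place each removal should no longer stall the pool.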

Remove NFS exports

One of the first tasks 'zpool export' performs is to unshare all the NFS exports. It does this by calling 'exportfs -u' on each zfs folder with an export. This can instead be done in user space, in parallel, before calling 'zpool export'.
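Here is a minimal sketch of the parallel unshare, under a few assumptions: the helper names shared_filesystems and in_parallel are hypothetical, the pool is called tank, and the usage line swaps in 'zfs unshare' for the 'exportfs -u' call described above. Jobs are started in batches with plain shell background jobs, so no particular xargs implementation is required.

```sh
# stdin: output of `zfs list -H -o name,sharenfs` (tab-separated).
# stdout: names of filesystems that have an NFS share configured.
shared_filesystems() {
    awk -F '\t' '$2 != "off" && $2 != "-" { print $1 }'
}

# Run "<command> <name>" for every name on stdin, in batches of up to
# $JOBS (default 8) background jobs. Exit statuses of the individual
# jobs are not collected in this sketch.
in_parallel() {
    max=${JOBS:-8}
    n=0
    while IFS= read -r name; do
        "$@" "$name" &
        n=$((n + 1))
        if [ "$n" -ge "$max" ]; then
            wait        # let the current batch finish
            n=0
        fi
    done
    wait                # wait for the final, possibly partial, batch
}

# Usage, with a hypothetical pool named "tank":
#   zfs list -H -o name,sharenfs -t filesystem -r tank \
#       | shared_filesystems | in_parallel zfs unshare
```

Batching keeps the number of simultaneous unshare calls bounded, which matters when a pool has thousands of exported folders.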

Unmount ZFS folders

Again, 'zpool export' does this in serial fashion, which takes a lot of time. It too can be done in parallel in user space before calling 'zpool export'. The trick to doing this in parallel is that all child folders must be unmounted before their parent.

After completing the three preparation steps, calling 'zpool export' executes very quickly.
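One way to honor the children-before-parent rule is to group mountpoints by path depth and work from the deepest level up, running everything within a level in parallel. The sketch below does that. The function names deepest_first and unmount_levels and the pool name tank are illustrative only, and it assumes mountpoints contain no newlines.

```sh
# stdin: one mountpoint per line.
# stdout: "depth mountpoint", deepest paths first.
deepest_first() {
    awk -F/ '{ print NF - 1, $0 }' | sort -k1,1nr
}

# stdin: "depth mountpoint" lines, deepest first.
# Runs "<command> <mountpoint>" for a whole depth level in parallel and
# waits for that level to finish before starting the next, shallower one.
# This sketch does not cap the number of jobs within a level.
unmount_levels() {
    current=
    while read -r depth path; do
        if [ "$depth" != "$current" ]; then
            wait                # all deeper mounts are gone
            current=$depth
        fi
        "$@" "$path" &
    done
    wait
}

# Usage, with a hypothetical pool named "tank":
#   zfs list -H -o mounted,mountpoint -t filesystem -r tank \
#       | awk -F '\t' '$1 == "yes" && $2 ~ /^\// { print $2 }' \
#       | deepest_first | unmount_levels zfs unmount
```

Because a child's mountpoint is always deeper than its parent's, finishing each level before moving up guarantees no parent is unmounted while a child is still mounted.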

Fast Zpool Imports

Fast import can be achieved by reversing the steps for an export. The trick to doing this all in user space in parallel is using

zpool import -N

This will import the pool very quickly, leaving everything unmounted. Then the export process can be reversed, doing as many things in parallel as possible.
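For the mount side, the order flips: parents have to be mounted before their children. The following is a sketch under assumptions, not the code from my scripts. The names mount_plan and run_levels and the pool name tank are hypothetical, and only filesystems with canmount=on and an absolute mountpoint are considered.

```sh
# stdin: output of `zfs list -H -o name,mountpoint,canmount` (tab-separated).
# stdout: "depth name" for mountable filesystems, shallowest mountpoint first.
mount_plan() {
    awk -F '\t' '
        $3 == "on" && $2 ~ /^\// {
            depth = 0
            n = split($2, parts, "/")
            for (i = 1; i <= n; i++) if (parts[i] != "") depth++
            print depth, $1
        }' | sort -k1,1n
}

# stdin: "depth name" lines, shallowest first.
# Runs "<command> <name>" for one depth level in parallel, then waits
# before starting the next, deeper level.
run_levels() {
    current=
    while read -r depth name; do
        if [ "$depth" != "$current" ]; then
            wait                # parents are mounted before children start
            current=$depth
        fi
        "$@" "$name" &
    done
    wait
}

# Usage, with a hypothetical pool named "tank":
#   zpool import -N tank
#   zfs list -H -o name,mountpoint,canmount -t filesystem -r tank \
#       | mount_plan | run_levels zfs mount
#   # NFS shares can be restored afterwards, for example with `zfs share`.
```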

Useful Scripts

I've scripted these operations.   They are in my BitBucket project https://bitbucket.org/ozmt/ozmt.   The scripts of interest are in the utils folder:

fast-zfs-mount.sh
fast-zfs-unmount.sh
fast-zpool-export.sh
fast-zpool-import.sh

 
