
ZFS – creating a pool from disks on CentOS 7

Is it storage system time?

So today's post will be a short one about creating a ZFS pool on CentOS 7. It is a logical follow-up to the previous post, where I covered the build-out of the new server. What I decided on there is software RAID-1 for the OS, using LVM.

Now for the data I have 3x 4TB disks, and after looking around I decided to use ZFS. Why ZFS? It's reliable (I have worked with systems based on it before) and it's really fast once you take a deep dive and configure it to your needs. As I would like to avoid duplicating posts, you can find the install guidelines here on the ZFS wiki.

 

For some people (like me 🙂) it's handy to keep an eye on the documentation so you know what you are dealing with. It can be a good entry point before we continue, and I will most probably refer you to RT*M 🙂 a couple of times along the way. The documentation for administering ZFS is here.

 

Which drives do we use?

So let's start by checking our available disks.
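A couple of standard commands do the trick (the exact device names will of course differ per system):

# list block devices with size, type and model
lsblk -o NAME,SIZE,TYPE,MODEL

# list disks by their persistent identifiers
ls -l /dev/disk/by-id/ | grep -v part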

 

Although at this point it might be worth looking into assigning human-readable aliases to your drives. In a single-host scenario it might not be that useful, but when you get into working with enterprise systems in production, where for obvious reasons 🙂 you have more than one server, it becomes really handy.
But before actually doing this on the operating system, I did the prep work on the server itself:

[Screenshot: drive prep work on the server]

 

So off we go and create the vdev_id.conf file at /etc/zfs/vdev_id.conf.
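A minimal sketch of what that file can look like is below; the alias names and WWN identifiers are made up for illustration, so substitute the IDs of your own drives:

# /etc/zfs/vdev_id.conf
# syntax: alias <name> <persistent device path>
alias d1 /dev/disk/by-id/wwn-0x50014ee000000001
alias d2 /dev/disk/by-id/wwn-0x50014ee000000002
alias d3 /dev/disk/by-id/wwn-0x50014ee000000003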

 

Once this is done we need to trigger a udev update using the udevadm command.
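That boils down to something like the following (udevadm settle is optional, it just waits for udev to finish processing the events):

# re-run the udev rules and regenerate the /dev/disk/by-vdev links
udevadm trigger
udevadm settle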

Now, after doing the above, we will be able to list the disks using our aliases.
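The aliases show up under /dev/disk/by-vdev, so a plain listing is enough:

ls -l /dev/disk/by-vdev/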

[Screenshot: ls -l /dev/disk/by-vdev showing the aliased disks]

Now all that is left to do is to create the ZFS pool. However, just to be on the safe side, we can execute a dry run first.
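The dry-run command looked roughly like this (d1, d2 and d3 being the example aliases defined earlier):

zpool create -n data raidz d1 d2 d3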

In the command above the following happens:

  • we request a pool to be created by using zpool create
  • we indicate we would like a dry run by using the -n switch
  • data is our pool name
  • raidz is the ZFS RAID type, which I have chosen since I have 3 disks (it would be cool to have 4 and use raidz2)

The result shows what would be done with our drives:

[Screenshot: output of the dry-run pool creation]

 

For me this looks promising – let's go ahead and get our pool created for real.
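The real command looked roughly like this; the /data mount point is just my pick and the d1, d2 and d3 aliases are again the example ones from the vdev_id.conf sketch:

zpool create -f -o ashift=12 -O atime=off -m /data data raidz d1 d2 d3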

In this command:

  • -f : forces creation, as ZFS suspects we have partitions on those drives – but trust me, we don't
  • ashift=12 : follows the recommendation for drives with 4K block sizes (Advanced Format drives – which I recommend getting familiar with)
  • atime=off : disables access time updates, which in return gives us a performance boost. This is something you need to decide whether you want to use
  • -m : is our mount point for the pool. The directory needs to exist already
  • raidz : is of course the type of RAIDZ we will be using

 

The reason I'm mentioning 4K Advanced Format drives here is performance. Below is a snippet from a forum thread that explains what we are looking at:

 


Furthermore, some ZFS pool configurations are much better suited towards 4K advanced format drives.

The following ZFS pool configurations are optimal for modern 4K sector harddrives:
RAID-Z: 3, 5, 9, 17, 33 drives
RAID-Z2: 4, 6, 10, 18, 34 drives
RAID-Z3: 5, 7, 11, 19, 35 drives

The trick is simple: subtract the number of parity drives and you get:
2, 4, 8, 16, 32 …

This has to do with the recordsize of 128KiB that gets divided over the number of disks. Example for a 3-disk RAID-Z writing 128KiB to the pool:
disk1: 64KiB data (part1)
disk2: 64KiB data (part2)
disk3: 64KiB parity

Each disk now gets 64KiB which is an exact multiple of 4KiB. This means it is efficient and fast. Now compare this with a non-optimal configuration of 4 disks in RAID-Z:
disk1: 42.66KiB data (part1)
disk2: 42.66KiB data (part2)
disk3: 42.66KiB data (part3)
disk4: 42.66KiB parity

Now this is ugly! It will either be downpadded to 42.5KiB or padded toward 43.00KiB, which can vary per disk. Both of these are non optimal for 4KiB sector harddrives. This is because both 42.5K and 43K are not whole multiples of 4K. It needs to be a multiple of 4K to be optimal.


 

So after running the command above we have our pool up and running.

[Screenshot: zpool status output showing the newly created pool]

 

And that's more or less it for now 🙂 we have our pool running and mounted as it should be.
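If you want to double-check things yourself, a few standard commands are enough (using the pool name data from above):

# pool health and vdev layout
zpool status data

# datasets, space usage and mount points
zfs list

# confirm the properties we set
zfs get atime,mountpoint data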

 

Extra resources? Something for the future?

In later posts we will look into performance considerations for different configurations, which will enable us to make faster, fact-based configuration decisions.

I have also come across a really useful post about ZFS, which you can find below:

Install ZFS on Debian GNU/Linux
