ZFS – creating a pool from disks on CentOS 7

Is it storage system time?

So today’s post will be a short one about creating a ZFS pool on CentOS 7. This is a logical follow-up to the previous post, where I covered the build-out of the new server. What I have decided on for the OS is software RAID-1 using LVM.

Now for the data disks I have 3x 4TB drives. After looking around I decided to use ZFS. Why ZFS? It’s reliable ( I have worked with systems based on it before ) and it’s really fast if you do a deep dive and configure it to your needs. As I would like to avoid duplicating posts, you can find the install guidelines on the ZFS wiki.


For some ppl ( like me 🙂 ) it’s handy to drop an eye on the documentation so you know what you are dealing with. It can be a good entry point before we continue, and I will most probably refer you to RT*M 🙂 a couple of times along the way. Documentation for administering ZFS is here


Which drives do we use?

So let’s start by checking our available disks

# fdisk -l /dev/sd?


Disk /dev/sdc: 4000.8 GB, 4000787030016 bytes, 7814037168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/sdd: 4000.8 GB, 4000787030016 bytes, 7814037168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/sde: 4000.8 GB, 4000787030016 bytes, 7814037168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/sdf: 240.1 GB, 240057409536 bytes, 468862128 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xdb1d2969

   Device Boot      Start         End      Blocks   Id  System
#


Although it might be worth looking into assigning human-readable aliases to your drives. In a single-host scenario it might not be so useful, but when you get into working with enterprise systems in production, where for obvious reasons 🙂 you have more than one server, it becomes really handy.
Before actually doing this on the operating system I did the prep work on the server itself



So off we go to create the vdev_id.conf file at /etc/zfs/vdev_id.conf

# Custom by-path mapping for large JBOD configurations
#<ID> <by-path name>
alias BAY1_DISK1 pci-0000:00:17.0-ata-1.0
alias BAY1_DISK2 pci-0000:00:17.0-ata-2.0
alias BAY0_DISK2 pci-0000:00:17.0-ata-3.0
alias BAY0_DISK1 pci-0000:00:17.0-ata-4.0
alias BAY0_DISK0 pci-0000:00:17.0-ata-5.0
# alias  xxx      pci-0000:00:17.0-ata-6.0
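If you are following along, the by-path names on your machine will differ from mine. A quick way to find the ones for your own drives is to look at what udev has already created ( a minimal sketch – the grep just hides partition entries, and the fallback echo makes it safe to paste anywhere ):

```shell
# List the persistent by-path names udev created for the attached drives.
# Paths like pci-0000:00:17.0-ata-1.0 are machine-specific – use your own.
ls -l /dev/disk/by-path/ 2>/dev/null | grep -v part \
  || echo "no by-path entries found on this machine"
```

Each of those symlinks points at a /dev/sdX device, which is how you match a physical bay to an alias in vdev_id.conf.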


Once this is done we need to trigger a udev update using the udevadm command

udevadm trigger

Now after doing the above we will be able to list the disks using our aliases
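With the udev rules reloaded, the aliases show up under /dev/disk/by-vdev/ – something like this ( a sketch; the fallback echo is just so the command is harmless on a box without the aliases ):

```shell
# After `udevadm trigger`, the aliases from vdev_id.conf appear here
# as symlinks (BAY0_DISK0 -> ../../sdX and so on):
ls -l /dev/disk/by-vdev/ 2>/dev/null \
  || echo "no /dev/disk/by-vdev – check vdev_id.conf and re-run udevadm trigger"
```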


Now all that’s left to do is to create the ZFS pool. However, just to be on the safe side, we can execute a dry run first.

zpool create -f -n data raidz BAY0_DISK0 BAY0_DISK1 BAY0_DISK2

In the command above the following happens:

  • we request pool to be created by using zpool create
  • we indicate we would like to have a dry run by using the -n switch
  • data is our pool name
  • raidz is the ZFS RAID type, which I have chosen since I have 3 disks ( it would be cool to have 4 and use raidz2 )

Result shows what would be done for our drives



For me this looks promising – let’s go ahead and get our pool created for real.

zpool create -f -o ashift=12 -O atime=off -m /pools/data data raidz BAY0_DISK0 BAY0_DISK1 BAY0_DISK2

Here is what each option does:

  • -f : forces creation, as ZFS suspects we have partitions on those drives – but trust me – we don’t
  • ashift=12 : follows the recommendation for drives with 4K block sizes ( Advanced Format drives – which I recommend getting familiar with )
  • atime=off : disables access-time updates, which in return gives us a performance boost. Decide for yourself whether you actually need access times
  • -m : is the mount point for the pool. The directory needs to exist already
  • raidz : is of course the RAID type we would be using
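To double-check that those options actually took effect after creation, you can query the pool ( a sketch – guarded so the snippet is harmless on a box without ZFS installed ):

```shell
# Verify ashift and the dataset properties on the freshly created pool "data".
if command -v zpool >/dev/null 2>&1; then
  zpool get ashift data          # should report 12
  zfs get atime,mountpoint data  # atime off, mountpoint /pools/data
else
  echo "zfs userland tools not installed on this machine"
fi
```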


The reason I’m mentioning 4K Advanced Format drives here is performance. Below is a snippet from a forum thread that explains what we are looking at:


Furthermore, some ZFS pool configurations are much better suited towards 4K advanced format drives.

The following ZFS pool configurations are optimal for modern 4K sector harddrives:
RAID-Z: 3, 5, 9, 17, 33 drives
RAID-Z2: 4, 6, 10, 18, 34 drives
RAID-Z3: 5, 7, 11, 19, 35 drives

The trick is simple: subtract the number of parity drives and you get:
2, 4, 8, 16, 32 …

This has to do with the recordsize of 128KiB that gets divided over the number of disks. Example for a 3-disk RAID-Z writing 128KiB to the pool:
disk1: 64KiB data (part1)
disk2: 64KiB data (part2)
disk3: 64KiB parity

Each disk now gets 64KiB which is an exact multiple of 4KiB. This means it is efficient and fast. Now compare this with a non-optimal configuration of 4 disks in RAID-Z:
disk1: 42.66KiB data (part1)
disk2: 42.66KiB data (part2)
disk3: 42.66KiB data (part3)
disk4: 42.66KiB parity

Now this is ugly! It will either be padded down to 42.5KiB or padded up toward 43.00KiB, which can vary per disk. Both of these are non-optimal for 4KiB-sector hard drives, because neither 42.5K nor 43K is a whole multiple of 4K. It needs to be a multiple of 4K to be optimal.
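You can sanity-check that arithmetic yourself. A tiny sketch that divides the default 128KiB recordsize across the data disks ( parity excluded ) and flags whether each disk’s share lands on a 4KiB boundary:

```shell
# 2 data disks = our 3-disk raidz; 3 data disks = the 4-disk raidz from the quote.
recordsize=128
out=$(for data_disks in 2 3; do
  awk -v r="$recordsize" -v d="$data_disks" 'BEGIN {
    # The per-disk share r/d is a multiple of 4KiB exactly when r % (4*d) == 0.
    printf "%d data disks: %.2f KiB per disk -> %s\n", d, r / d,
           (r % (4 * d) == 0) ? "multiple of 4 KiB" : "NOT a multiple of 4 KiB"
  }'
done)
printf '%s\n' "$out"
```

With our 3-disk raidz there are 2 data disks, so each gets a clean 64KiB; the 4-disk raidz from the quote has 3 data disks and ends up with the ugly ~42.67KiB share.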


So after running the command above we have our pool running
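On the server this is the kind of check I run to confirm it ( a sketch – guarded so it is safe to paste on any box ):

```shell
# Confirm the pool is healthy and mounted where we asked (-m /pools/data).
if command -v zpool >/dev/null 2>&1; then
  zpool status data
  df -h /pools/data
else
  echo "zfs userland tools not installed on this machine"
fi
```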



And that’s more or less it for now 🙂 we got our pool running and mounted as it should be.


Extra resources? Something for the future?

In later posts we will look into performance considerations for different configurations, which will let us make configuration choices based on measured facts rather than guesswork.

Also, I have come across a really useful post about ZFS, which you can find below:

Install ZFS on Debian GNU/Linux

