March 8, 2012

Why ZFS is the best file system...

The universe of file systems is populated with an incredible number of good and less-good ones, but the best of all is certainly ZFS from Oracle. Maybe it isn't the fastest or the least CPU-expensive, but it is the most secure. Why? Everyone knows it combines a file system, a logical volume manager and two software RAID levels (comparable to RAID 5 and 6, named raid-z and raid-z2), but it is the only one (as far as I know) to implement continuous data integrity verification against data corruption.

Let's explain with an example!

The WD20EARS, a very reliable drive, is rated at 1 non-recoverable read error per 10^14 bits read. One terabyte is 8 x 10^12 bits, so if the drive performed exactly at that rated limit, reading a full terabyte would carry roughly an 8% chance of hitting a read error. That doesn't seem very high, but let's apply it to real life!

In my NAS I currently have 3x2 TByte disks (internally) and 4x3 TByte disks (in the e-SATA enclosure), 18 TByte in total, so reading all of it once gives about a 76% chance of at least one error (treating the rated figure as an average rate and the errors as independent). It doesn't mean that 76% of my data is bad; it means I have a 76% chance of hitting a read error (bad data!) somewhere.
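
If you want to redo this estimate for your own disks, here is a small Python sketch. It assumes the rated figure is an average error rate per bit, that errors are independent of each other, and that a terabyte is 10^12 bytes; the function name is just one I made up for the example. Real drives usually do better than their rated limit, so take the result as a pessimistic bound, not a measurement.

```python
import math

# Assumption: the rate from the drive spec, taken as an average per bit read.
URE_RATE = 1e-14
# Assumption: decimal terabytes (10^12 bytes, 8 bits each).
BITS_PER_TB = 8 * 10**12


def chance_of_read_error(terabytes, rate=URE_RATE):
    """Chance of at least one unrecoverable read error when reading
    `terabytes` of data once, assuming independent errors."""
    bits = terabytes * BITS_PER_TB
    # 1 - (1 - rate)^bits, written with log1p/expm1 to keep precision
    # when the rate is tiny.
    return -math.expm1(bits * math.log1p(-rate))


per_tb = chance_of_read_error(1)
whole_nas = chance_of_read_error(3 * 2 + 4 * 3)  # 3x2 TB + 4x3 TB

print(f"1 TB read once:  {per_tb:.1%}")
print(f"18 TB read once: {whole_nas:.1%}")
```

The point is that the chance grows quickly with capacity: it is not simply the per-terabyte figure multiplied by the number of terabytes, but it gets large all the same.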

To avoid the problem, ZFS stores a checksum (Fletcher-based or SHA-256) for every block throughout the file system tree, so a read that returns bad data is detected, and when a redundant copy exists (a mirror or raid-z) ZFS can recover the block from it and my data is safe to read!
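
To make the idea concrete, here is a toy Python model of checksum-verified reads with self-healing. It is only a sketch of the principle, not how ZFS is actually implemented: the `MirroredBlock` class and its fields are hypothetical names of mine, the "disks" are just byte arrays in memory, and I use SHA-256 from the standard library for the checksum.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """SHA-256 digest of a block."""
    return hashlib.sha256(data).digest()


class MirroredBlock:
    """Toy model: one block stored on two 'disks', with the expected
    checksum kept apart from the data itself."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        for good in self.copies:
            if checksum(bytes(good)) != self.expected:
                continue  # this copy is corrupted, try the next one
            # Found a good copy: overwrite any copy that fails verification.
            for i, other in enumerate(self.copies):
                if checksum(bytes(other)) != self.expected:
                    self.copies[i] = bytearray(good)
            return bytes(good)
        raise IOError("all copies failed checksum verification")


block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0x01   # simulate a silent bit flip on the first disk
data = block.read()          # served from the second disk, first one repaired

broken = MirroredBlock(b"no luck")
broken.copies[0][0] ^= 0x01
broken.copies[1][0] ^= 0x01  # both copies damaged: detected, but not recoverable
try:
    broken.read()
    both_bad_detected = False
except IOError:
    both_bad_detected = True
```

Note the two different outcomes: with one good copy left the error is repaired silently, while with no good copy the read fails loudly instead of handing back bad data.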

The main issue with ZFS is the license: it appears to be incompatible with the Linux one, so it has not been included in the mainline kernel, but there is a port named ZFSonLinux currently in development.
That port works only on a 64-bit kernel because of ZFS's heavy use of the virtual address space.

In the next few days I'll compile and install it to test ZFS on my NAS, so stay tuned!

Cheers!