Hacker News

One thing to note here is that BTRFS use (or lack of use) is heavily based on the way RAID1 is implemented. A BTRFS RAID1 doesn't do dual-disk striping; it writes the file twice on different drives. You can do a RAID1 with 3, 4, or 5 drives, giving the same redundancy as RAID5 under a different name. If you want to double the parity, I believe you would put dual RAID1s into another RAID1.

I think the functionality of BTRFS is there, though the ideas about how we build redundant data sets will need to shift a bit.



Btrfs calculator http://carfax.org.uk/btrfs-usage/

Btrfs raid1 currently means "2 copies", even if you have 3+ devices. Btrfs is also capable of using different-sized drives efficiently. Try creating a single-volume raid5 with 3x 1TB and 3x 2TB. This is ~7TB usable (this accounts for parity, but not for the space taken by fs metadata). Rebalance is not required when adding those 3x 2TB drives; they'll be put to use fairly immediately (the next time data chunks are allocated). There are also no initial syncs when creating or adding drives.
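The ~7TB figure can be reproduced with a rough allocation model. The sketch below is an assumption about how space gets used, not btrfs's actual allocator: it assumes every raid5 stripe spans all devices that still have free space, and that raid1 keeps exactly two copies on two different devices. The function names `raid1_usable` and `raid5_usable` are made up for illustration.

```python
def raid1_usable(sizes):
    """Usable space with two copies of every chunk on two different devices."""
    total = sum(sizes)
    # Every chunk needs a partner on another device, so the largest device
    # can only be filled as far as the rest of the pool can mirror it.
    return min(total / 2, total - max(sizes))


def raid5_usable(sizes):
    """Usable space when each stripe spans every device that still has room."""
    free = sorted(sizes)
    usable = 0
    while True:
        active = [f for f in free if f > 0]
        if len(active) < 2:  # a parity stripe needs at least two devices
            break
        step = min(active)
        # One device's worth of each stripe holds parity.
        usable += step * (len(active) - 1)
        free = [f - step if f > 0 else 0 for f in free]
    return usable


drives_tb = [1, 1, 1, 2, 2, 2]
print(raid5_usable(drives_tb))  # 7
print(raid1_usable(drives_tb))  # 4.5
```

In this model the first 1TB of all six drives forms 6-wide stripes (5TB usable), and the remaining 1TB on each 2TB drive forms 3-wide stripes (2TB usable). Sizes are whole TB here; metadata overhead is ignored.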


I'm sorry, but I don't think I follow you. Mirroring can never achieve the space / redundancy efficiency of parity RAID, so I don't see how the above works.


It looks like he's trying to say: in an array of N >= 2 drives, "BTRFS RAID 1" will still only choose 2 disks to replicate a file to. It's just 2 different ones for every file.

Well, not every file, but you see the point.

If that's the case, this has the advantage of having a subset of files recoverable even after >1 disk failure. Unlike RAID 5.
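To put a rough number on "a subset of files recoverable", here is a small sketch. It assumes, purely for illustration, that two-copy chunks are spread evenly over every possible pair of drives; the real allocator does not promise that, and `readable_fraction` is a hypothetical helper name.

```python
from itertools import combinations


def readable_fraction(num_drives, failed):
    """Fraction of two-copy chunks with at least one copy on a surviving drive."""
    failed = set(failed)
    # One chunk placement per possible pair of drives (the even-spread assumption).
    placements = list(combinations(range(num_drives), 2))
    # A chunk is lost only when both of its drives are in the failed set.
    readable = [pair for pair in placements if not set(pair) <= failed]
    return len(readable) / len(placements)


print(readable_fraction(4, failed={0}))     # one failure: nothing is lost
print(readable_fraction(4, failed={0, 1}))  # two failures: one pair in six is lost
```

A file larger than one chunk can span several placements, so the fraction of fully intact files may be lower than the fraction of intact chunks.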

On the flip side: read speed is limited to at most 2 concurrent disks, unlike RAID 5, where it's N-1. But that's theoretical. What's the reality on read speeds in RAID 5?

Anyway, if that's true, that would be a really weird name. Why didn't they call that "BTRFS RAID 5"? A look at the docs doesn't make this clear to me at all. I'm not sure what to believe...


I was not even aware of this possibility, thanks for explaining it to me.

The concept fills me with absolute horror, though. Losing a part of my files is as bad as losing everything.

The performance benefit with larger files of RAID1 vs RAID5/6 can be very 'substantial'. Even my old 18 TB NAS could achieve 1+ GB/s on sequential reads.


RAID5/6+ is only as fast as your slowest disk, and seek times are no better. In most respects, mirroring gets you better performance, and for most drive configurations, is more reliable than RAID5. Massive throughput on single files one at a time is not interesting to most people with big storage requirements.

I currently run RAID6 (raidz2) across 10 drives, but I'll be moving to RAID10 before long. Whether I use zfs or not depends on btrfs stability.


With ZFS, yes, RAID5/6 is only as fast as the slowest disk if you are talking about random IOPS. I haven't measured BTRFS performance yet.

RAID5/6 random IOPS can be quite good, or about what you would expect, if you use MDADM.

The best solution seems to be tons of RAM for caching and ZFS with an SSD SLOG if you need reliable and fast sync writes. But with ZFS, performance comes second to reliability.


RAID isn't backup. Losing some of your files instead of all your files means you have access to some of the files during the restore window. And your restore might be quicker?


But how do you even fire up an Oracle database with some of the dbf files missing? And I can't think of any other application (other than maybe a document storage system) that would be useful with some of the files missing.


You don't. You rsync the missing files over and skip the data that is already present, saving time.
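The "copy only what is missing" idea can be sketched in a few lines. This is a simplified stand-in for what rsync would do, not a replacement for it: it only checks whether a file exists at the destination and never compares contents. `restore_missing` and both path arguments are hypothetical names.

```python
import shutil
from pathlib import Path


def restore_missing(backup_root, target_root):
    """Copy files from backup_root that do not exist under target_root."""
    backup_root, target_root = Path(backup_root), Path(target_root)
    restored = []
    for src in sorted(backup_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(backup_root)
        dst = target_root / rel
        if dst.exists():
            continue  # this file survived the failure, so skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        restored.append(rel.as_posix())
    return restored
```

The returned list names the files that had to be fetched, which gives a quick view of how much of the volume was actually lost.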


As you said yourself, volume sizes aren't exactly the problem anymore, i.e. there is tons of space to store bits these days.

Raid rebuilds take way too long for modern volume sizes.

Raid rebuilds take so long that the likelihood of losing a second drive before completion is very high.

The safety you generally want is redundant copies of checksummed data and metadata blocks.

Btrfs allows you to choose the number of copies of data or metadata separately.

Talking about RAID in the context of btrfs and zfs generally seems to confuse people as it brings in the old expectations and understandings they worked so hard at figuring out for the RAID era.


Rebuilds on ZFS and BTRFS are quite reasonable because most of the time only the data itself is rebuilt, not the entire drive as with old-fashioned solutions.

Rebuilds depend on drive size, not array size.
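As a back-of-the-envelope illustration of why rewriting only live data matters, the sketch below compares a whole-device rebuild with a data-only one. The drive size, amount of live data, and the 150 MB/s sustained rate are all assumed numbers chosen for the example, and `rebuild_hours` is a hypothetical helper.

```python
def rebuild_hours(bytes_to_rewrite, throughput_bytes_per_s):
    """Hours needed to rewrite the given amount of data at a steady rate."""
    return bytes_to_rewrite / throughput_bytes_per_s / 3600


TB = 10**12
RATE = 150 * 10**6  # assumed sustained 150 MB/s, purely illustrative

# Whole-device rebuild of a 4 TB drive versus copying only 1 TB of live data.
print(round(rebuild_hours(4 * TB, RATE), 1))
print(round(rebuild_hours(1 * TB, RATE), 1))
```

Real rebuilds rarely hold a steady rate, especially while the array is serving other I/O, so treat the result as a lower bound.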

Triple-parity as part of ZFS allows you to create even larger vdevs while keeping the risks manageable. That is interesting for low-performance archiving solutions.


ZFS has ditto blocks, too. Surely having parity striping and ditto blocks both is better than just ditto blocks pretending to be parity striping.


ZFS allows you to use this form of data duplication in addition to RAIDZ-x, under the filesystem property named copies.



