ZFS vs hardware RAID controller

From UVOO Tech Wiki
Revision as of 20:16, 28 March 2026 by Busk (talk | contribs)

That’s actually a very solid default rule, and it lines up almost perfectly with how ZFS behaves in the real world 👍

Let me sharpen it a bit specifically for ZFS + your kind of workloads:


🧠 Your Rule (Refined for ZFS)

“Use mirrors unless workload is mostly sequential + read-heavy + static.”

That’s basically the right mental model.


🔥 Why RAID10 (mirrors) is usually the right choice

For your environment (LXD, Postgres, APIs, etc.):

You are mostly doing:

  • random reads/writes
  • small block IO
  • sync writes (fsync, WAL, etc.)
  • metadata-heavy operations

👉 ZFS mirrors excel at:

  • parallel IOPS (each vdev is an independent IO queue)
  • low latency under load
  • fast resilvers (huge deal)
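As a sketch, this is what a striped-mirror (RAID10-style) pool looks like. Device names here are placeholders; in practice use stable `/dev/disk/by-id` paths:

```shell
# Placeholder device names -- substitute your actual disks.
# ZFS stripes across the two mirror vdevs, giving two independent IO queues.
zpool create tank \
  mirror /dev/sdb /dev/sdc \
  mirror /dev/sdd /dev/sde

zpool status tank   # shows the mirror-0 and mirror-1 vdevs
```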

⚠️ Where RAIDZ2 actually wins

Your “static / read-heavy” instinct is correct, but let’s define it more precisely:

RAIDZ2 is best when:

  • Large sequential reads (media, backups)
  • Large sequential writes (archives, logs)
  • Data is mostly append-only
  • Not latency-sensitive
  • Capacity matters

🧪 Practical Examples

🟢 Mirrors (RAID10)

Use for:

  • LXD VM disks
  • PostgreSQL / databases
  • Kubernetes / etcd
  • anything with fsync
  • CI/CD workloads
  • general-purpose root pool

👉 This is probably 90% of real-world infra workloads
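For the database cases above, per-dataset tuning is where most of the win comes from. A sketch — the pool/dataset names and record sizes are illustrative starting points, not rules; match `recordsize` to your application's IO size:

```shell
# Illustrative names. PostgreSQL writes 8 KiB pages, so match recordsize to it.
zfs create -o recordsize=8K tank/pgdata
# Small-block VM/container IO; 16K is a common starting point for LXD disks.
zfs create -o recordsize=16K tank/lxd
# lz4 compression is cheap and usually a net win on mixed data.
zfs set compression=lz4 tank
```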


🔵 RAIDZ2

Use for:

  • backups
  • object storage
  • logs (cold)
  • media / large files
  • snapshot archives

💥 The Big Gotcha Most People Miss

RAIDZ2 looks good in benchmarks…

…but falls apart under:

  • high queue depths
  • random IO
  • mixed workloads

👉 Especially with spinning disks


⚡ Why mirrors feel “faster” even when MB/s looks similar

Because:

  • ZFS schedules IO per vdev
  • Mirrors = more vdevs = more parallelism

👉 So system responsiveness is WAY better
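A back-of-envelope model makes the vdev math concrete. Assume ~150 random IOPS per spinning disk (a typical 7200 RPM figure) and that each random write is serviced by one vdev; the per-disk number and layouts are assumptions for illustration:

```shell
# Rough random-write IOPS for 20 spinning disks (~150 IOPS each).
# 10 two-way mirrors expose 10 vdevs; two 10-wide RAIDZ2 vdevs expose only 2,
# because every random write to RAIDZ2 touches the whole stripe.
PER_DISK=150
echo "mirrors (10 vdevs): ~$(( 10 * PER_DISK )) IOPS"   # -> ~1500 IOPS
echo "raidz2  (2 vdevs):  ~$(( 2 * PER_DISK )) IOPS"    # -> ~300 IOPS
```

Same 20 disks, roughly a 5x difference in random-write throughput — that is why mirrors "feel" faster even when sequential MB/s looks similar.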


🔁 Resilvering (VERY important)

This is a huge differentiator:

Mirrors:

  • Resilver copies only allocated blocks, read from the surviving side of the mirror
  • Much faster rebuild
  • Less stress on the rest of the pool

RAIDZ2:

  • Must read every remaining disk in the vdev to reconstruct the failed one
  • Slower, especially on full pools
  • Longer window of elevated risk during the rebuild
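Either way, the replace-and-watch workflow looks the same (device names are hypothetical):

```shell
# Swap a failed disk and monitor the resilver (placeholder device names).
zpool replace tank /dev/sdd /dev/sdx
zpool status tank    # reports resilver progress and estimated time remaining
```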

🧠 Your 20-disk scenario (applied)

Your instinct:

“Always RAID10 unless read-heavy static”

👉 I would translate that to:

Default:

10 mirrors (RAID10)

Only use RAIDZ2 if:

  • You explicitly need capacity
  • AND workload is not latency sensitive
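The default layout for those 20 disks could be sketched like this — `d01`..`d20` are placeholders for your real `/dev/disk/by-id` paths:

```shell
# 20 disks as 10 two-way mirrors (placeholder names d01..d20).
zpool create tank \
  mirror d01 d02  mirror d03 d04  mirror d05 d06  mirror d07 d08 \
  mirror d09 d10  mirror d11 d12  mirror d13 d14  mirror d15 d16 \
  mirror d17 d18  mirror d19 d20
```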

🔥 What I would actually do (based on your stack)

Since you:

  • run infra services
  • care about performance
  • likely run databases / APIs

👉 I would do:

Pool A (primary)

  • Mirrors (RAID10)
  • All active workloads

Pool B (optional)

  • RAIDZ2
  • backups / cold data
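One possible way to carve the 20 disks into those two pools — the 14/6 split, pool names, and device names are all assumptions, to be sized against your actual capacity needs:

```shell
# Pool A: 14 disks as 7 mirrors for active workloads (placeholder names).
zpool create fast \
  mirror d01 d02  mirror d03 d04  mirror d05 d06  mirror d07 d08 \
  mirror d09 d10  mirror d11 d12  mirror d13 d14

# Pool B: 6 disks as a single RAIDZ2 vdev for backups / cold data.
zpool create cold raidz2 d15 d16 d17 d18 d19 d20
```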

💡 One nuance worth knowing

There is one middle-ground pattern:

Mirrors plus per-dataset tuning

Still plain mirrors at the vdev level, but with each dataset tuned for its workload:

  • recordsize tuning
  • compression
  • SLOG if needed

👉 This gives you:

  • most of RAID10 benefits
  • decent capacity efficiency
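If sync writes turn out to be the bottleneck, the SLOG piece could look like this. The NVMe device names are placeholders, and note a SLOG only accelerates sync writes (fsync/WAL), not async throughput:

```shell
# Optional mirrored SLOG on NVMe to absorb fsync/WAL bursts (placeholder names).
zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1
zpool iostat -v tank 5   # watch the log vdev during sync-heavy load
```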

🏆 Final Take

Your rule is already good. I’d just upgrade it to:

Use mirrors by default. Use RAIDZ only when capacity matters more than latency.


👍 Blunt version

  • Mirrors = performance + responsiveness + safety
  • RAIDZ2 = capacity + efficiency
  • RAIDZ is almost never the right choice for VM workloads

If you want, I can:

  • Map your exact 20-disk system into optimal mirror grouping + spare strategy
  • Or show how to split it into two pools for best LXD scheduling