Replacing a silently failing disk in a ZFS pool

Maybe I can’t read, but I have the feeling that official documentation explains every single corner case for a given tool, except the one you will actually need. Today’s struggle: replacing a disk in a FreeBSD ZFS pool.

What? There’s a shitton of docs on this topic! Are you stupid?

I don’t know, maybe. Yet none of them covered the process in a simple, straightforward and complete manner. Here’s the story:

My personal FreeBSD NAS had felt sluggish since yesterday, and this morning I saw these horrible messages popping up in my syslog console:

Jul  2 12:49:53 <kern.crit> newcoruscant kernel: ahcich1: Timeout on slot 8 port 0
Jul  2 12:49:53 <kern.crit> newcoruscant kernel: ahcich1: is 00000000 cs 00000000 ss 00000300 rs 00000300 tfd 40 serr 00000000 cmd 0000c917
Jul  2 12:49:53 <kern.crit> newcoruscant kernel: (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 50 25 e9 40 3b 00 00 00 00 00
Jul  2 12:49:53 <kern.crit> newcoruscant kernel: (ada1:ahcich1:0:0:0): CAM status: Command timeout
Jul  2 12:49:53 <kern.crit> newcoruscant kernel: (ada1:ahcich1:0:0:0): Retrying command
Jul  2 12:51:02 <kern.crit> newcoruscant kernel: cant/memory/memory-inactive: ds[0] = 52350976.000000
Jul  2 12:51:02 <kern.crit> newcoruscant kernel: ahcich1: AHCI reset: device not ready after 31000ms (tfd = 00000080)

Yeah… that bad.

The first thing that struck me was that ZFS seemed perfectly fine with that:

root@newcoruscant:~ # zpool status
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 2h26m with 0 errors on Tue Jun 25 12:08:56 2019
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada0p4  ONLINE       0     0     0
	    ada1p4  ONLINE       0     0     0
	    ada2p4  ONLINE       0     0     0

errors: No known data errors

But the input/output error thrown by smartctl -a /dev/ada1 made things clear: I needed to replace this disk quickly!
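
Since ZFS reported nothing, the kernel log was my only early warning. Here is a minimal sketch of a check that counts CAM command timeouts per device in log text read from standard input. The name count_cam_timeouts is my own invention, and the pattern only matches lines shaped like the ones above, so treat it as a starting point rather than a monitoring tool.

```shell
# Count "CAM status: Command timeout" lines per device in a kernel log
# read from stdin. count_cam_timeouts is a hypothetical helper name.
count_cam_timeouts() {
    # Matching lines look like:
    #   ... kernel: (ada1:ahcich1:0:0:0): CAM status: Command timeout
    # The sed expression keeps only the device name (e.g. "ada1").
    sed -n 's/.*(\([a-z]*[0-9]*\):[^)]*): CAM status: Command timeout.*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Assuming the kernel messages end up in /var/log/messages, as they do with the default syslog configuration, `count_cam_timeouts < /var/log/messages` prints one line per device with its timeout count.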

Thanks to past-me, there was already a disk ready for this task at ada3, so, after trustingly reading the zpool administration guide, and in particular Replacing a Functioning Device, I entered:

# zpool replace zroot ada1p4 ada3p4

Except it didn’t run as expected:

cannot open 'ada3p4': no such GEOM provider
must be a full path or shorthand device name

What a fantastic and explicit error message just to say that ada3 has no partition table yet, and therefore no ada3p4 partition.
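
In hindsight, a quick check that the target partition actually exists would have saved me the head-scratching. A minimal sketch, assuming a provider shows up as a node under /dev; provider_exists and the DEV_DIR override are names I made up for illustration.

```shell
# Return success if a GEOM provider such as "ada3p4" has a device node.
# provider_exists is a hypothetical helper; DEV_DIR defaults to /dev and
# exists only so the check can be pointed somewhere else.
provider_exists() {
    [ -e "${DEV_DIR:-/dev}/$1" ]
}

# Example guard to run before attempting the replace.
if provider_exists ada3p4; then
    echo "ada3p4 exists, zpool replace can find it"
else
    echo "ada3p4 is missing: partition ada3 first" >&2
fi
```

Running `gpart show ada3` is the more informative way to see what is (or is not) on the disk; the guard above only answers yes or no.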

I am no FreeBSD guru, only a very occasional user, so no, I am not used to GEOM, gpart, GELI, etc. Finally, this very well written Stack Exchange post showed me how to replicate the correct partition table to the new disk:

# gpart backup ada0 | gpart restore -F ada3

Now zpool replace zroot ada1p4 ada3p4 would work! I also did not forget to copy the boot code to the new disk, as instructed by both the documentation and zpool:

# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada3 
partcode written to ada3p1
bootcode written to ada3

And at last the resilvering was taking place:

root@newcoruscant:~ # zpool status
  pool: zroot
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jul  2 11:21:24 2019
	3.91M scanned out of 1.84T at 38.5K/s, (scan is slow, no estimated time)
        1.30M resilvered, 0.00% done
config:

	NAME             STATE     READ WRITE CKSUM
	zroot            ONLINE       0     0     0
	  raidz1-0       ONLINE       0     0     0
	    ada0p4       ONLINE       0     0     0
	    replacing-1  ONLINE       0     0     0
	      ada1p4     ONLINE       0     0     0
	      ada3p4     ONLINE       0     0     0
	    ada2p4       ONLINE       0     0     0

errors: No known data errors

But… at less than 40K/s! Turns out that, very logically, the failing disk and its timeouts were slowing down the resilvering. So I learned that, to avoid this kind of situation, you should offline the failing disk from the zpool:

# zpool offline zroot ada1p4

And then:

$ sudo zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jul  2 16:01:22 2019
	514G scanned out of 1.84T at 167M/s, 2h20m to go
        170G resilvered, 27.22% done
config:

	NAME                        STATE     READ WRITE CKSUM
	zroot                       DEGRADED     0     0     0
	  raidz1-0                  DEGRADED     0     0     0
	    ada0p4                  ONLINE       0     0     0
	    replacing-1             DEGRADED     0     0     8
	      15084350875675872541  OFFLINE      0     0     0  was /dev/ada1p4
	      ada3p4                ONLINE       0     0     0
	    ada2p4                  ONLINE       0     0     0

errors: No known data errors
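
If you would rather poll the progress than stare at the whole report, the percentage can be pulled out of the status text. A minimal sketch that reads zpool status output on standard input; resilver_percent is a hypothetical name, and the pattern assumes the "resilvered, N% done" wording shown above, which may differ on other ZFS versions.

```shell
# Print the resilver completion percentage found in `zpool status`
# output read from stdin. resilver_percent is a hypothetical helper name.
resilver_percent() {
    # Matches a line such as: "        170G resilvered, 27.22% done"
    sed -n 's/.*resilvered, \([0-9.]*\)% done.*/\1/p'
}
```

Typical use would be `zpool status zroot | resilver_percent`, which prints nothing when no resilver is running.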

Much better. At the end of the resilvering, everything was working correctly again:

$ sudo zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 628G in 2h52m with 0 errors on Tue Jul  2 18:53:48 2019
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada0p4  ONLINE       0     0     0
	    ada3p4  ONLINE       0     0     0
	    ada2p4  ONLINE       0     0     0

errors: No known data errors

I read that you should zpool remove the failing disk at the end of this operation, but when I tried to do so:

root@newcoruscant:~ # zpool remove zroot ada1p4
cannot remove ada1p4: no such device in pool
root@newcoruscant:~ # zpool remove zroot 15084350875675872541
cannot remove 15084350875675872541: no such device in pool

So I guess zpool did it itself: once the resilver completes, zpool replace detaches the old device from the pool on its own.
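
For next time, here are the same commands in the order I would now try them, with the offline step first so the dying disk cannot slow the resilver down. This is a sketch that only prints the commands instead of running them; I have not run them in this exact order. print_replace_plan is a hypothetical helper, and the p4 data partition and boot partition index 1 are specific to my pool's layout.

```shell
# Print (not run) the replacement steps, offlining the failing disk first.
# print_replace_plan is a hypothetical helper; partition 4 (ZFS data) and
# boot partition index 1 match this pool's layout and may differ elsewhere.
print_replace_plan() {
    pool=$1 failing=$2 healthy=$3 new=$4
    cat <<EOF
zpool offline $pool ${failing}p4
gpart backup $healthy | gpart restore -F $new
zpool replace $pool ${failing}p4 ${new}p4
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $new
EOF
}

print_replace_plan zroot ada1 ada0 ada3
```

Printing rather than executing is deliberate: each step deserves a look at zpool status before moving on to the next one.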

Now it’s time to buy and add a new spare for the next disk that fails…
