Ubuntu: What to do about a corrupted ZFS Pool



Question:

I have been running a test instance of a NAS using ZFS, as described in "Restoring an Ubuntu Server using ZFS RAIDZ for data".

This week one of my disks died. Shouldn't be a problem, should it (the benefits of RAID being resilience as well as performance)?

Except that my ZFS pool got corrupted, as in:

    andy@ubuntu:~$ sudo zpool status -v
      pool: tank
     state: UNAVAIL
    status: One or more devices could not be used because the label is missing
            or invalid.  There are insufficient replicas for the pool to continue
            functioning.
    action: Destroy and re-create the pool from
            a backup source.
       see: http://www.sun.com/msg/ZFS-8000-5E
      scan: none requested
    config:

        NAME        STATE     READ WRITE CKSUM
        tank        UNAVAIL      0     0     0  insufficient replicas
          raidz1-0  UNAVAIL      0     0     0  insufficient replicas
            sdb     FAULTED      0     0     0  corrupted data
            sdc     FAULTED      0     0     0  corrupted data
            sdd     UNAVAIL      0     0     0

Fortunately this is a test instance and so I can easily start again. But what if this pool contained important data? What would the right next step(s) be to recover the data and restore my NAS to working order? Or does ZFS automatically try all possible restoration approaches, such that the data is now toast?


Solution:1

It looks like your pool may not actually be corrupted, although the output suggests that more than one device is in trouble, hence the FAULTED state on sdb and sdc. Figure out what is wrong with those disks and your pool may prove you wrong. This does not look like a fatal state for the pool.


Solution:2

Armed with the insight of @slashdot, I have mostly fixed my problem, but I don't really know what I did. Please examine the following trail and enlighten me.

In particular, which of the following hypotheses are true, and what am I missing?

  1. Neither zdb -u tank nor zdb -bcsv tank did anything useful.
  2. The second zpool import -f tank worked when the first one didn't because enough time had elapsed since the zpool export tank for ZFS to have a chance to fix itself.
  3. This whole episode had something to do with labels changing themselves after one of the drives failed (I think it was sdb, which caused sdc to become sdb and sdd to become sdc).

LOG

    andy@ubuntu:~$ zpool status
    andy@ubuntu:~$ sudo zpool status
      pool: tank
     state: UNAVAIL
    status: One or more devices could not be used because the label is missing
            or invalid.  There are insufficient replicas for the pool to continue
            functioning.
    action: Destroy and re-create the pool from
            a backup source.
       see: http://www.sun.com/msg/ZFS-8000-5E
      scan: none requested
    config:

            NAME        STATE     READ WRITE CKSUM
            tank        UNAVAIL      0     0     0  insufficient replicas
              raidz1-0  UNAVAIL      0     0     0  insufficient replicas
                sdb     FAULTED      0     0     0  corrupted data
                sdc     FAULTED      0     0     0  corrupted data
                sdd     UNAVAIL      0     0     0

    andy@ubuntu:~$ sudo zdb -u tank
    zdb: can't open 'tank': No such device or address
    andy@ubuntu:~$ sudo zpool scrub tank
    cannot scrub 'tank': pool is currently unavailable
    andy@ubuntu:~$ sudo zdb -bcsv tank
    zdb: can't open 'tank': No such device or address
    andy@ubuntu:~$ sudo zpool export tank
    andy@ubuntu:~$ sudo zpool import tank
    cannot import 'tank': pool may be in use from other system
    use '-f' to import anyway
    andy@ubuntu:~$ sudo zpool import -f tank
    cannot import 'tank': one or more devices is currently unavailable
    andy@ubuntu:~$ sudo zpool status
    no pools available
    andy@ubuntu:~$ sudo zpool status -x
    no pools available
    andy@ubuntu:~$ sudo zpool import
      pool: tank
        id: 9117894036185671023
     state: UNAVAIL
    status: One or more devices contains corrupted data.
    action: The pool cannot be imported due to damaged devices or data.
       see: http://www.sun.com/msg/ZFS-8000-5E
    config:

            tank        UNAVAIL  insufficient replicas
              raidz1-0  UNAVAIL  insufficient replicas
                sdb     FAULTED  corrupted data
                sdb     UNAVAIL
                sdc     ONLINE

    andy@ubuntu:~$ sudo zpool import tank
    cannot import 'tank': pool may be in use from other system
    use '-f' to import anyway
    andy@ubuntu:~$ sudo zpool import -f tank
    andy@ubuntu:~$ sudo zpool status
      pool: tank
     state: DEGRADED
    status: One or more devices could not be used because the label is missing or
            invalid.  Sufficient replicas exist for the pool to continue
            functioning in a degraded state.
    action: Replace the device using 'zpool replace'.
       see: http://www.sun.com/msg/ZFS-8000-4J
      scan: scrub repaired 0 in 0h13m with 0 errors on Mon Nov 21 09:22:11 2011
    config:

            NAME                      STATE     READ WRITE CKSUM
            tank                      DEGRADED     0     0     0
              raidz1-0                DEGRADED     0     0     0
                10820373921989571629  UNAVAIL      0     0     0  was /dev/sdb1
                sdb                   ONLINE       0     0     0
                sdc                   ONLINE       0     0     0

    errors: No known data errors
    andy@ubuntu:~$
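
The final status output in the log names 'zpool replace' as the next step. A minimal sketch of what that could look like follows. It only prints the commands so they can be reviewed before anything is run. The helper name print_replace_cmds and the new-disk path are placeholders for illustration, not values from the log; the pool name and the numeric ID of the missing device are copied from the status output.

```shell
#!/bin/sh
# print_replace_cmds is a hypothetical helper: it prints, but does not run,
# the commands for swapping a missing raidz member for a new disk.
print_replace_cmds() {
    pool="$1"      # pool name, e.g. tank
    old_dev="$2"   # missing device as shown by zpool status (name or numeric ID)
    new_dev="$3"   # path of the replacement disk (placeholder below)
    printf 'sudo zpool replace %s %s %s\n' "$pool" "$old_dev" "$new_dev"
    # Resilvering starts after the replace; zpool status reports its progress.
    printf 'sudo zpool status %s\n' "$pool"
}

# NEW-DISK-ID is a placeholder; substitute the real by-id name of the new disk.
print_replace_cmds tank 10820373921989571629 /dev/disk/by-id/NEW-DISK-ID
```

Printing rather than executing is a deliberate choice here, since replacing the wrong device in an already degraded raidz1 vdev would be hard to undo.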


Solution:3

Could you have just mixed up the disks?

I once mixed up my disks and zpool reported "disks contain corrupted data". After I reconnected the disks in their previous order, it started working again.

Maybe after you imported tank, zpool recognized the right order.
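
One way to check whether the kernel device names have shifted is to compare them with the persistent names under /dev/disk/by-id. The following is a minimal sketch, assuming a Linux system where that directory is populated with symlinks; the function name list_disk_links is made up for illustration.

```shell
#!/bin/sh
# list_disk_links is a hypothetical helper, not part of ZFS or Ubuntu.
# It prints "persistent-name -> target" for every symlink in a directory,
# defaulting to /dev/disk/by-id (assumed to be populated by udev).
list_disk_links() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink "$link")"
    done
}

list_disk_links
```

Running it before and after a disk is removed or recabled should show which sdX name each physical disk currently has.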


Solution:4

I think the previous posters have highlighted the problem. The probable cause is the way you specified the disks.

My experience is with ZFS on Ubuntu. Though I do use ZFS on FreeNAS as well, I've never had to delve into the BSD implementation.

Certainly for Ubuntu, you are strongly advised to specify your devices by ID rather than by device name, i.e. /dev/disk/by-id/scsi-SATA- followed by a long string that uniquely identifies the physical disk, rather than /dev/sda.

Using by-id device names removes any dependence on the specific SATA port the disk is connected to.

Gareth
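
For a pool that was originally created with /dev/sdX names, one commonly used approach is to export it and import it again while pointing zpool import at /dev/disk/by-id with the -d option. The sketch below only prints the commands for review. The helper name print_reimport_cmds is made up for illustration, and the pool name tank is taken from the question.

```shell
#!/bin/sh
# print_reimport_cmds is a hypothetical helper: it prints, but does not run,
# the commands for re-importing a pool using persistent by-id device names.
print_reimport_cmds() {
    pool="${1:-tank}"
    printf 'sudo zpool export %s\n' "$pool"
    # -d tells zpool import which directory to search for member devices.
    printf 'sudo zpool import -d /dev/disk/by-id %s\n' "$pool"
    # Afterwards, status should list the members by their by-id names.
    printf 'sudo zpool status %s\n' "$pool"
}

print_reimport_cmds tank
```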

