ASRock motherboard destroys Linux software RAID

 5 years ago
source link: https://www.tuicool.com/articles/hit/qaINFv7

Hello,

I have reason to suspect that ASRock motherboards accidentally wipe out Linux RAID metadata. Details below.

I'm a programmer and just upgraded from a Gigabyte H97-HD3 to the ASRock Z97 Extreme6 motherboard.

I use software RAID1 on Linux with mdadm, on whole-disk devices (no partitions).

After I installed the new motherboard and rebooted, I noticed that my software RAID was broken, because the superblocks (RAID meta information at the beginning of the disk) of all my RAID disks had been overwritten with zero bytes.

In particular, the disk area between hexadecimal offsets 0x1000 (inclusive) and 0x4000 (exclusive) is overwritten with zero bytes.
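For anyone who wants to check their own disks, here is a minimal Python sketch that reads exactly that byte range and reports whether it is all zeros. It also looks for the md superblock magic number (0xa92b4efc, stored little-endian) at offset 0x1000, which is where mdadm metadata version 1.2 places its superblock. That my arrays use version 1.2 metadata is an assumption of this sketch, and the function name `inspect_region` is only illustrative. Reading a raw device such as /dev/sdc normally requires root.

```python
import sys

ZEROED_START = 0x1000  # inclusive, as observed in this report
ZEROED_END = 0x4000    # exclusive, as observed in this report
# md superblock magic 0xa92b4efc in little-endian byte order.
# Assumption: metadata version 1.2, whose superblock starts 4 KiB into the device.
MD_MAGIC = bytes.fromhex("fc4e2ba9")


def inspect_region(path):
    """Return (all_zero, has_md_magic) for the suspect region of a disk or image file."""
    with open(path, "rb") as f:
        f.seek(ZEROED_START)
        data = f.read(ZEROED_END - ZEROED_START)
    # An empty read (file too short) is not counted as "all zero".
    all_zero = len(data) > 0 and data.count(0) == len(data)
    has_md_magic = data[:4] == MD_MAGIC
    return all_zero, has_md_magic


if __name__ == "__main__" and len(sys.argv) == 2:
    zeroed, magic = inspect_region(sys.argv[1])
    print(f"0x1000..0x4000 all zero: {zeroed}")
    print(f"md superblock magic at 0x1000: {magic}")
```

On a healthy member disk I would expect the magic to be found and the region not to be all zeros; after a boot through the affected firmware, the opposite.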

This happens on every boot of the machine. I can reproduce it reliably.

I am very sure that it is the motherboard UEFI that performs this zeroing during bootup, before control is passed to the bootloader:

With the previous mainboard, the zeroing does not occur. When a disk is not attached during boot, but is attached once Linux is already running, the zeroing does not occur. When I boot into the UEFI Setup utility with a disk attached, then immediately remove it and attach it to another PC for inspection, the zeroing has occurred.
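The comparison behind these observations can be sketched as follows: save the first 32 KiB of the disk before the boot and again afterwards, then list the byte ranges that differ. This is only an illustration of the method, not the exact tooling I used; the helper names `read_head` and `changed_ranges` and the 32 KiB window are assumptions chosen for the example.

```python
import sys


def read_head(path, length=0x8000):
    """Read the first `length` bytes of a disk or image file."""
    with open(path, "rb") as f:
        return f.read(length)


def changed_ranges(before, after):
    """Return [(start, end), ...] byte ranges (end exclusive) where the buffers differ."""
    ranges = []
    start = None
    n = min(len(before), len(after))
    for i in range(n):
        if before[i] != after[i]:
            if start is None:
                start = i  # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, n))
    return ranges


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py before.img after.img
    for lo, hi in changed_ranges(read_head(sys.argv[1]), read_head(sys.argv[2])):
        print(f"changed: 0x{lo:x}..0x{hi:x}")
```

Note that only bytes that actually differ are reported, so any bytes inside the zeroed area that were already zero before the boot will split the output into several smaller ranges.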

It is important to know that a disk configured to be part of an mdadm RAID array can look like a broken EFI disk. In particular, running `gdisk -l` on a functioning mdadm RAID array disk produces output like this:

    GPT fdisk (gdisk) version 1.0.1

    Caution! After loading partitions, the CRC doesn't check out!
    Warning! Main partition table CRC mismatch! Loaded backup partition table
    instead of main partition table!

    Warning! One or more CRCs don't match. You should repair the disk!

    Partition table scan:
      MBR: protective
      BSD: not present
      APM: not present
      GPT: damaged

    ****************************************************************************
    Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
    verification and recovery are STRONGLY recommended.
    ****************************************************************************
    Disk /dev/sdc: 7814037168 sectors, 3.6 TiB
    Logical sector size: 512 bytes
    Disk identifier (GUID): 5D940099-EC12-42B0-9DF9-CDAE167EE6EE
    Partition table holds up to 128 entries
    First usable sector is 34, last usable sector is 7814037134
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 7814037101 sectors (3.6 TiB)

    Number  Start (sector)    End (sector)  Size       Code  Name

Note that this is NOT an error, since we don't expect there to be a GPT partition table on the disk (it is used as a whole-disk device in an mdadm RAID, which doesn't have anything to do with GPT or partitioning).
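To see which GPT-like traces are present on such a disk without relying on gdisk, a rough Python sketch is shown below. It checks for a protective MBR (boot signature 0x55AA plus a partition entry of type 0xEE in sector 0) and for the "EFI PART" signature at LBA 1 and at the last sector, where the primary and backup GPT headers live. It assumes 512-byte logical sectors, as gdisk reports for my disks, and it does not verify any CRCs, so it cannot tell a valid GPT from a damaged one. The name `gpt_traces` is illustrative only.

```python
import os
import sys

SECTOR = 512  # assumption: 512-byte logical sectors, as reported by gdisk above
GPT_SIGNATURE = b"EFI PART"


def gpt_traces(path):
    """Report which GPT-related structures appear to be present on a disk or image."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)
        f.seek(0)
        mbr = f.read(SECTOR)       # LBA 0: MBR
        primary = f.read(8)        # LBA 1: start of the primary GPT header
        backup = b""
        if size >= 2 * SECTOR:
            f.seek(size - SECTOR)  # last LBA: start of the backup GPT header
            backup = f.read(8)
    # The four MBR partition entries start at byte 446, 16 bytes each;
    # the partition type is the fifth byte of each entry.
    types = [mbr[446 + 16 * i + 4] for i in range(4)] if len(mbr) == SECTOR else []
    return {
        "protective_mbr": mbr[510:512] == b"\x55\xaa" and 0xEE in types,
        "primary_header": primary == GPT_SIGNATURE,
        "backup_header": backup == GPT_SIGNATURE,
    }


if __name__ == "__main__" and len(sys.argv) == 2:
    for name, present in gpt_traces(sys.argv[1]).items():
        print(f"{name}: {present}")
```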

However, I suspect that the device looking like it has a damaged GPT triggers some undocumented "recovery" features in ASRock mainboards.

I suspect this in particular because after booting through the ASRock UEFI, the disk suddenly has a "correct" GPT; sgdisk reports:

    GPT fdisk (gdisk) version 1.0.1

    Partition table scan:
      MBR: protective
      BSD: not present
      APM: not present
      GPT: present

    Found valid GPT with protective MBR; using GPT.
    Disk /dev/sdc: 7814037168 sectors, 3.6 TiB
    Logical sector size: 512 bytes
    Disk identifier (GUID): 5D940099-EC12-42B0-9DF9-CDAE167EE6EE
    Partition table holds up to 128 entries
    First usable sector is 34, last usable sector is 7814037134
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 7814037101 sectors (3.6 TiB)

    Number  Start (sector)    End (sector)  Size       Code  Name

I suspect that the following happens:

The motherboard's UEFI finds something on the disk that looks like a damaged GPT and "fixes" it, not knowing that it is in fact destroying valuable data. It already does this before booting into the UEFI Setup utility (perhaps so that the UEFI GUI can then provide features like displaying disk contents).

Can you confirm or deny whether the ASRock Z97 Extreme6 motherboard firmware has such a feature to modify disk contents to "repair" broken-looking GPT disks?

If yes, can you confirm which other ASRock motherboards have this feature, and whether it is possible to disable this behaviour?

Thank you.

