[mythtv-users] OT: Raid Reporting Degraded, can't get it rebuilt!

Douglas Wagner douglasw0 at gmail.com
Fri Jun 9 23:02:21 UTC 2006


Hey all.

So I'm looking through some logs and e-mails and come across an e-mail
telling me my raid is degraded.  *sigh*, ok, time to fix that problem.

So, looking at /proc/mdstat I get the following:

/proc/mdstat
----------------------------------------------------
Personalities : [raid1]
md0 : active raid1 hdi1[1]
      117218176 blocks [2/1] [_U]

unused devices: <none>
-----------------------------------------------------

Well, sure enough, [_U]: my first drive no longer seems to be in the
raid.  Oddly enough, though, no device is reported for it on the md0
line (typically I'd expect to see md0 : active raid1 hdg1[0] hdi1[1] or
something like that).
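
For anyone who wants to check this mechanically, here is a minimal sketch of reading the member list and the [_U] status out of mdstat-style text. It assumes the two-line layout shown above; the function name degraded_arrays is made up for illustration, and the regular expressions only cover the fields that appear in this post.

```python
import re

def degraded_arrays(mdstat_text):
    """Return {array: (members_present, status)} for arrays with a missing slot."""
    found = {}
    current = None
    for line in mdstat_text.splitlines():
        # Header line, e.g. "md0 : active raid1 hdi1[1]"
        header = re.match(r"^(md\d+)\s*:\s*active\s+\S+\s+(.*)$", line)
        if header:
            current = header.group(1)
            members = re.findall(r"(\w+)\[\d+\]", header.group(2))
            found[current] = [members, None]
            continue
        # Status line, e.g. "117218176 blocks [2/1] [_U]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            found[current][1] = status.group(3)
            current = None
    # An underscore marks a slot with no working member.
    return {name: (m, s) for name, (m, s) in found.items() if s and "_" in s}

SAMPLE = """Personalities : [raid1]
md0 : active raid1 hdi1[1]
      117218176 blocks [2/1] [_U]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))  # {'md0': (['hdi1'], '_U')}
```

On a live box the same function could be fed the contents of /proc/mdstat instead of SAMPLE.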

FYI, before I go further, here are excerpts from the various important logs:

/proc/devices
------------------------------
Block devices:
  1 ramdisk
  2 fd
  3 ide0
  9 md
 33 ide2
 34 ide3
 56 ide4
253 device-mapper
254 mdp
-------------------------------

/proc/partitions
--------------------------------
major minor  #blocks  name

  33     0   40021632 hde
  33     1     104391 hde1
  33     2    1020127 hde2
  33     3   38893365 hde3
  34     0  117220824 hdg
  56     0  117220824 hdi
  56     1  117218241 hdi1
 253     0  117220823 dm-0
 253     1  117218241 dm-1
 253     2    1015808 dm-2
 253     3    5111808 dm-3
 253     4    2031616 dm-4
 253     5     491520 dm-5
 253     6    7143424 dm-6
 253     7   10223616 dm-7
 253     8     491520 dm-8
   9     0  117218176 md0
---------------------------------

/var/log/messages
---------------------------------
Jun  9 15:45:53 mail kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Jun  9 15:45:53 mail kernel: HPT374: IDE controller at PCI slot 0000:00:0f.0
Jun  9 15:45:53 mail kernel: PCI: Found IRQ 5 for device 0000:00:0f.0
Jun  9 15:45:53 mail kernel: PCI: Sharing IRQ 5 with 0000:00:0f.1
Jun  9 15:45:53 mail kernel: HPT374: chipset revision 7
Jun  9 15:45:53 mail kernel: HPT374: 100% native mode on irq 5
Jun  9 15:45:53 mail kernel:     ide2: BM-DMA at 0x1400-0x1407, BIOS settings: hde:pio, hdf:pio
Jun  9 15:45:53 mail kernel:     ide3: BM-DMA at 0x1408-0x140f, BIOS settings: hdg:DMA, hdh:pio
Jun  9 15:45:53 mail kernel: PCI: Found IRQ 5 for device 0000:00:0f.1
Jun  9 15:45:53 mail kernel: PCI: Sharing IRQ 5 with 0000:00:0f.0
Jun  9 15:45:53 mail kernel:     ide4: BM-DMA at 0x1800-0x1807, BIOS settings: hdi:DMA, hdj:pio
Jun  9 15:45:53 mail kernel:     ide5: BM-DMA at 0x1808-0x180f, BIOS settings: hdk:pio, hdl:pio
Jun  9 15:45:53 mail kernel: hde: Maxtor 4D040H2, ATA DISK drive
Jun  9 15:45:53 mail kernel: ide2 at 0x1c88-0x1c8f,0x1c76 on irq 5
Jun  9 15:45:53 mail kernel: hdg: ST3120026A, ATA DISK drive
Jun  9 15:45:53 mail kernel: ide3 at 0x1c78-0x1c7f,0x1c72 on irq 5
Jun  9 15:45:53 mail kernel: hdi: ST3120026A, ATA DISK drive
Jun  9 15:45:53 mail kernel: ide4 at 0x1ca0-0x1ca7,0x1c96 on irq 5
Jun  9 15:45:53 mail kernel: hde: max request size: 128KiB
Jun  9 15:45:53 mail kernel: hde: 80043264 sectors (40982 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
Jun  9 15:45:53 mail kernel: hde: cache flushes not supported
Jun  9 15:45:53 mail kernel:  hde: hde1 hde2 hde3
Jun  9 15:45:53 mail kernel: hdg: max request size: 512KiB
Jun  9 15:45:53 mail kernel: hdg: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=16383/255/63, UDMA(100)
Jun  9 15:45:53 mail kernel: hdg: cache flushes supported
Jun  9 15:45:53 mail kernel:  hdg: hdg1
Jun  9 15:45:53 mail kernel: hdi: max request size: 512KiB
Jun  9 15:45:53 mail kernel: hdi: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=16383/255/63, UDMA(100)
Jun  9 15:45:53 mail kernel: hdi: cache flushes supported
Jun  9 15:45:53 mail kernel:  hdi: hdi1
<snip>
Jun  9 15:45:53 mail kernel: md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
Jun  9 15:45:54 mail kernel: md: bitmap version 4.39
<snip>
Jun  9 15:45:54 mail kernel: md: Autodetecting RAID arrays.
Jun  9 15:45:54 mail kernel: md: could not open unknown-block(34,1).
Jun  9 15:45:54 mail kernel: md: autorun ...
Jun  9 15:45:54 mail kernel: md: considering hdi1 ...
Jun  9 15:45:54 mail kernel: md:  adding hdi1 ...
Jun  9 15:45:54 mail kernel: md: created md0
Jun  9 15:45:54 mail kernel: md: bind<hdi1>
Jun  9 15:45:54 mail kernel: md: running: <hdi1>
Jun  9 15:45:54 mail kernel: md: raid1 personality registered for level 1
Jun  9 15:45:54 mail kernel: raid1: raid set md0 active with 1 out of 2 mirrors
Jun  9 15:45:54 mail kernel: md: ... autorun DONE.
-------------------------------

Ok, so what we have here is that hdg is a valid device: it was detected by
the kernel and an hdg1 was recognized (/dev/hdg1 does exist).
/proc/devices shows that ide3 (mapped to hdg in messages) is a valid block
device recognized by the system.
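
To make the ide3/hdg mapping concrete, here is a rough sketch that turns a (major, minor) pair such as the unknown-block(34,1) from the log into the old IDE device name. The major-number table is my understanding of the kernel's device number list for ide0 through ide5 (two drives per interface, 64 minors per drive), and ide_name is a made-up helper, so treat it as an illustration rather than a reference.

```python
# Block majors for ide0..ide5 (assumed from the kernel's device list).
IDE_MAJORS = [3, 22, 33, 34, 56, 57]

def ide_name(major, minor):
    """Translate an IDE (major, minor) pair into a name like 'hdg1'."""
    interface = IDE_MAJORS.index(major)      # raises ValueError if not IDE
    drive = interface * 2 + (minor // 64)    # master = 0..63, slave = 64..127
    name = "hd" + chr(ord("a") + drive)
    partition = minor % 64                   # 0 means the whole disk
    return name + (str(partition) if partition else "")

print(ide_name(34, 1))  # hdg1
print(ide_name(56, 1))  # hdi1
print(ide_name(33, 0))  # hde
```

So the device md fails to open at boot is exactly the partition that should be the other half of the mirror.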

And here's where it gets weird:

/proc/partitions seems to have NO KNOWLEDGE of /dev/hdg1 even though the
kernel found it at boot.  Running fdisk /dev/hdg nonetheless pulls up the
following:

Command (m for help): p

Disk /dev/hdg: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdg1               1       14593   117218241   fd  Linux raid autodetect

Command (m for help):

So we seem to have an ok partition created.

Note also that the partition type is RAID autodetect, meaning, by everything
I've read, that the device SHOULD be listed in mdstat.

Jun  9 15:45:54 mail kernel: md: Autodetecting RAID arrays.
Jun  9 15:45:54 mail kernel: md: could not open unknown-block(34,1).

is the obvious culprit, and an ls -las on /dev/hdg1 gives:

[root@mail log]# ls -las /dev/hdg1
0 brwx------ 1 root root 34, 1 Jun  9 10:45 /dev/hdg1

Sure enough, major 34, minor 1, the same device md could not open.

Not surprisingly when I try to run a device add via mdadm I get:

[root@mail log]# mdadm /dev/md0 -a /dev/hdg1
mdadm: Cannot open /dev/hdg1: No such device or address
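
The mismatch here, a /dev node carrying major 34, minor 1 while the kernel has no such partition registered, can be checked against /proc/partitions directly. A minimal sketch, assuming the four-column layout shown earlier; registered_devices is a made-up name and SAMPLE is trimmed from the listing above.

```python
def registered_devices(partitions_text):
    """Map (major, minor) -> name from /proc/partitions-style text."""
    devices = {}
    for line in partitions_text.splitlines():
        fields = line.split()
        # Skip the header and blank lines; data rows start with two integers.
        if len(fields) == 4 and fields[0].isdigit() and fields[1].isdigit():
            devices[(int(fields[0]), int(fields[1]))] = fields[3]
    return devices

SAMPLE = """major minor  #blocks  name

  33     0   40021632 hde
  34     0  117220824 hdg
  56     0  117220824 hdi
  56     1  117218241 hdi1
"""

devs = registered_devices(SAMPLE)
print((34, 0) in devs, (34, 1) in devs)  # True False
```

On a live system the node's own numbers could be read with os.stat("/dev/hdg1").st_rdev and split with os.major() and os.minor() before looking them up in the dictionary.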

And now we come to the REALLY strange part of this whole thing:

[root@mail log]# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Fri Apr  2 03:08:31 2004
     Raid Level : raid1
     Array Size : 117218176 (111.79 GiB 120.03 GB)
    Device Size : 117218176 (111.79 GiB 120.03 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Jun  9 17:28:23 2006
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : b39d534f:977aecb1:d2120e72:f24e4eb3
         Events : 0.6390132

    Number   Major   Minor   RaidDevice State
   12395432       0        0        0      removed
       1      56        1        1      active sync   /dev/hdi1


Removed?  It does note that there should be 2 raid devices, only one of
which seems to be active.  The raid is marked as degraded, though there are
no "failed" disks listed in mdstat.

None of this makes sense to me.  If the drive had just disappeared I
shouldn't be able to access it via fdisk.  If fdisk can read and write to it,
why the heck can't md?  Why can't md find a major/minor device
that is very obviously there?

I've been playing with this thing all afternoon, and have
gone so far as to stop the array, delete the partition on hdg (the missing
drive) and re-create it...to no avail.

Help?  I'd hate to have the one disk crash on me and not be in a mirrored
state...that would be BAD Egon.  No matter what I do I can't seem to get
mdadm to re-create or re-sync the array.

As a final note, my /etc/mdadm.conf file looks like the following:

DEVICE  /dev/hdi1 /dev/hdg1
ARRAY   /dev/md0 level=raid1 num-devices=2 uuid=b39d534f:977aecb1:d2120e72:f24e4eb3

A small note on hardware: I have a Highpoint RocketRaid card in there.  Is
there any way to simply NOT use software mirroring and just use hardware
mirroring?  Everything I try to set up with this card still reports
both disks to the operating system, which seems wrong to me.  I'm used to
seeing only one drive on the OS side once the array is created through the
controller, not both...and in fact I'm guessing this is what I'd see on the
Windows side of things...am I missing a driver?  Do I need to re-compile the
kernel for correct support?

--Douglas Wagner