[mythtv-users] OT: Raid Reporting Degraded, can't get it rebuilt!
Douglas Wagner
douglasw0 at gmail.com
Fri Jun 9 23:02:21 UTC 2006
Hey all.
So I'm looking through some logs and e-mails and come across an e-mail
telling me my RAID is degraded. *sigh*, OK, time to fix that problem.
So, looking at /proc/mdstat, I get the following:
/proc/mdstat
----------------------------------------------------
Personalities : [raid1]
md0 : active raid1 hdi1[1]
117218176 blocks [2/1] [_U]
unused devices: <none>
-----------------------------------------------------
Well, sure enough, [_U]: my first drive no longer seems to be in the
RAID. Oddly enough, though, no device is reported for it on the md0
line (typically I'd expect to see md0 : active raid1 hdg1[0] hdi1[1] or
something like that).
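For anyone who wants to script this check rather than eyeball it, here is a
minimal Python sketch. It assumes the mdstat layout shown above (an "mdN :"
line followed by a "[expected/active] [U_]" status line); find_degraded and
MDSTAT_SAMPLE are made-up names for illustration, not part of any tool.

```python
import re

def find_degraded(mdstat_text):
    """Return {array: (expected, active, flags)} for arrays missing members."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line looks like: "117218176 blocks [2/1] [_U]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = (expected, active, status.group(3))
            current = None
    return degraded

# Sample copied from the /proc/mdstat output in this post.
MDSTAT_SAMPLE = """Personalities : [raid1]
md0 : active raid1 hdi1[1]
117218176 blocks [2/1] [_U]
unused devices: <none>
"""

if __name__ == "__main__":
    print(find_degraded(MDSTAT_SAMPLE))
```

On a live system the text would presumably come from reading /proc/mdstat
instead of the pasted sample.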
FYI, before I go further, here are the cuts from the various important logs:
/proc/devices
------------------------------
Block devices:
1 ramdisk
2 fd
3 ide0
9 md
33 ide2
34 ide3
56 ide4
253 device-mapper
254 mdp
-------------------------------
/proc/partitions
--------------------------------
major minor #blocks name
33 0 40021632 hde
33 1 104391 hde1
33 2 1020127 hde2
33 3 38893365 hde3
34 0 117220824 hdg
56 0 117220824 hdi
56 1 117218241 hdi1
253 0 117220823 dm-0
253 1 117218241 dm-1
253 2 1015808 dm-2
253 3 5111808 dm-3
253 4 2031616 dm-4
253 5 491520 dm-5
253 6 7143424 dm-6
253 7 10223616 dm-7
253 8 491520 dm-8
9 0 117218176 md0
---------------------------------
/var/log/messages
---------------------------------
Jun 9 15:45:53 mail kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Jun 9 15:45:53 mail kernel: HPT374: IDE controller at PCI slot 0000:00:0f.0
Jun 9 15:45:53 mail kernel: PCI: Found IRQ 5 for device 0000:00:0f.0
Jun 9 15:45:53 mail kernel: PCI: Sharing IRQ 5 with 0000:00:0f.1
Jun 9 15:45:53 mail kernel: HPT374: chipset revision 7
Jun 9 15:45:53 mail kernel: HPT374: 100% native mode on irq 5
Jun 9 15:45:53 mail kernel: ide2: BM-DMA at 0x1400-0x1407, BIOS settings: hde:pio, hdf:pio
Jun 9 15:45:53 mail kernel: ide3: BM-DMA at 0x1408-0x140f, BIOS settings: hdg:DMA, hdh:pio
Jun 9 15:45:53 mail kernel: PCI: Found IRQ 5 for device 0000:00:0f.1
Jun 9 15:45:53 mail kernel: PCI: Sharing IRQ 5 with 0000:00:0f.0
Jun 9 15:45:53 mail kernel: ide4: BM-DMA at 0x1800-0x1807, BIOS settings: hdi:DMA, hdj:pio
Jun 9 15:45:53 mail kernel: ide5: BM-DMA at 0x1808-0x180f, BIOS settings: hdk:pio, hdl:pio
Jun 9 15:45:53 mail kernel: hde: Maxtor 4D040H2, ATA DISK drive
Jun 9 15:45:53 mail kernel: ide2 at 0x1c88-0x1c8f,0x1c76 on irq 5
Jun 9 15:45:53 mail kernel: hdg: ST3120026A, ATA DISK drive
Jun 9 15:45:53 mail kernel: ide3 at 0x1c78-0x1c7f,0x1c72 on irq 5
Jun 9 15:45:53 mail kernel: hdi: ST3120026A, ATA DISK drive
Jun 9 15:45:53 mail kernel: ide4 at 0x1ca0-0x1ca7,0x1c96 on irq 5
Jun 9 15:45:53 mail kernel: hde: max request size: 128KiB
Jun 9 15:45:53 mail kernel: hde: 80043264 sectors (40982 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
Jun 9 15:45:53 mail kernel: hde: cache flushes not supported
Jun 9 15:45:53 mail kernel: hde: hde1 hde2 hde3
Jun 9 15:45:53 mail kernel: hdg: max request size: 512KiB
Jun 9 15:45:53 mail kernel: hdg: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=16383/255/63, UDMA(100)
Jun 9 15:45:53 mail kernel: hdg: cache flushes supported
Jun 9 15:45:53 mail kernel: hdg: hdg1
Jun 9 15:45:53 mail kernel: hdi: max request size: 512KiB
Jun 9 15:45:53 mail kernel: hdi: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=16383/255/63, UDMA(100)
Jun 9 15:45:53 mail kernel: hdi: cache flushes supported
Jun 9 15:45:53 mail kernel: hdi: hdi1
<snip>
Jun 9 15:45:53 mail kernel: md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
Jun 9 15:45:54 mail kernel: md: bitmap version 4.39
<snip>
Jun 9 15:45:54 mail kernel: md: Autodetecting RAID arrays.
Jun 9 15:45:54 mail kernel: md: could not open unknown-block(34,1).
Jun 9 15:45:54 mail kernel: md: autorun ...
Jun 9 15:45:54 mail kernel: md: considering hdi1 ...
Jun 9 15:45:54 mail kernel: md: adding hdi1 ...
Jun 9 15:45:54 mail kernel: md: created md0
Jun 9 15:45:54 mail kernel: md: bind<hdi1>
Jun 9 15:45:54 mail kernel: md: running: <hdi1>
Jun 9 15:45:54 mail kernel: md: raid1 personality registered for level 1
Jun 9 15:45:54 mail kernel: raid1: raid set md0 active with 1 out of 2 mirrors
Jun 9 15:45:54 mail kernel: md: ... autorun DONE.
-------------------------------
OK, so what we have here is that hdg is a valid device: it was detected by
the kernel and an hdg1 was recognized (/dev/hdg1 does exist).
/proc/devices shows that ide3 (mapped to hdg in messages) is a valid block
device recognized by the system.
And here's where it gets weird:
/proc/partitions seems to have NO KNOWLEDGE of hdg1 (only hdg is listed) even
though the kernel found it at boot. Running fdisk /dev/hdg in fact pulls up
the following:
Command (m for help): p
Disk /dev/hdg: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units
Device Boot Start End Blocks Id System
/dev/hdg1 1 14593 117218241 fd Linux raid autodetect
Command (m for help):
So we seem to have an OK partition created.
Note also that the partition type is RAID autodetect, meaning, by everything
I've read, that the device SHOULD be listed in /proc/mdstat.
Jun 9 15:45:54 mail kernel: md: Autodetecting RAID arrays.
Jun 9 15:45:54 mail kernel: md: could not open unknown-block(34,1).
That is the obvious culprit, and doing an ls -las on /dev/hdg1 gives:
[root at mail log]# ls -las /dev/hdg1
0 brwx------ 1 root root 34, 1 Jun 9 10:45 /dev/hdg1
Sure enough, major 34, minor 1: the same device numbers md could not open.
Not surprisingly, when I try to add the device via mdadm I get:
[root at mail log]# mdadm /dev/md0 -a /dev/hdg1
mdadm: Cannot open /dev/hdg1: No such device or address
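The mismatch can be checked mechanically. "No such device or address" is the
ENXIO error, which is what open() returns when a device node exists but the
kernel has no device registered behind its major/minor numbers. A minimal
Python sketch of that cross-check, assuming the /proc/partitions layout shown
earlier (registered_devices and PARTITIONS_SAMPLE are hypothetical names, and
the sample is trimmed from the listing above):

```python
def registered_devices(partitions_text):
    """Map (major, minor) -> name from /proc/partitions-style text."""
    devices = {}
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows have four fields: major minor #blocks name
        if len(fields) == 4 and fields[0].isdigit() and fields[1].isdigit():
            devices[(int(fields[0]), int(fields[1]))] = fields[3]
    return devices

# Trimmed copy of the /proc/partitions listing from this post.
PARTITIONS_SAMPLE = """major minor #blocks name
34 0 117220824 hdg
56 0 117220824 hdi
56 1 117218241 hdi1
9 0 117218176 md0
"""

if __name__ == "__main__":
    devs = registered_devices(PARTITIONS_SAMPLE)
    # Device numbers taken from the ls output and mdadm --detail above.
    for node, devno in (("/dev/hdg1", (34, 1)), ("/dev/hdi1", (56, 1))):
        print(node, devno, devs.get(devno, "not registered"))
```

On the live box, os.stat("/dev/hdg1").st_rdev passed through os.major() and
os.minor() would give the pair to look up instead of hard-coding it.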
And now we come to the REALLY strange part of this whole thing:
[root at mail log]# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Fri Apr 2 03:08:31 2004
Raid Level : raid1
Array Size : 117218176 (111.79 GiB 120.03 GB)
Device Size : 117218176 (111.79 GiB 120.03 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri Jun 9 17:28:23 2006
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : b39d534f:977aecb1:d2120e72:f24e4eb3
Events : 0.6390132
Number Major Minor RaidDevice State
12395432 0 0 0 removed
1 56 1 1 active sync /dev/hdi1
Removed? It does note that there should be 2 RAID devices, only one of
which seems to be active. The array is marked as degraded, though there are
no "failed" disks listed in mdstat.
None of this makes sense to me. If the drive had just disappeared I
shouldn't be able to access it via fdisk. If fdisk can read and write to it,
why the heck can't md? Why can't md find a major/minor device that is very
obviously there?
I've been playing with this thing all afternoon, and have gone so far as to
stop the array, delete the partition on hdg (the missing drive) and
re-create it...to no avail.
Help? I'd hate to have the one disk crash on me and not be in a mirrored
state...that would be BAD Egon. No matter what I do I can't seem to get
mdadm to re-create or re-sync the array.
As a final note, my /etc/mdadm.conf file looks like the following:
DEVICE /dev/hdi1 /dev/hdg1
ARRAY /dev/md0 level=raid1 num-devices=2 uuid=b39d534f:977aecb1:d2120e72:f24e4eb3
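One sanity check worth scripting is whether the UUID in mdadm.conf matches
the one mdadm --detail reports for the array. A minimal Python sketch,
assuming the usual mdadm.conf conventions (keyword first, key=value words
after the device name, lines starting with white space continuing the
previous line); array_uuids and CONF_SAMPLE are names invented here:

```python
def array_uuids(conf_text):
    """Map array device -> uuid from ARRAY lines in mdadm.conf-style text."""
    logical = []
    for line in conf_text.splitlines():
        if line[:1].isspace() and logical:
            # Leading white space: continuation of the previous line.
            logical[-1] += " " + line.strip()
        elif line.strip() and not line.lstrip().startswith("#"):
            logical.append(line.strip())
    uuids = {}
    for line in logical:
        words = line.split()
        if words[0].upper() == "ARRAY" and len(words) > 1:
            for word in words[2:]:
                key, sep, value = word.partition("=")
                if sep and key.lower() == "uuid":
                    uuids[words[1]] = value.lower()
    return uuids

# The config from this post, with the ARRAY line wrapped onto two lines.
CONF_SAMPLE = """DEVICE /dev/hdi1 /dev/hdg1
ARRAY /dev/md0 level=raid1 num-devices=2
   uuid=b39d534f:977aecb1:d2120e72:f24e4eb3
"""

# UUID as reported by mdadm --detail /dev/md0 above.
DETAIL_UUID = "b39d534f:977aecb1:d2120e72:f24e4eb3"

if __name__ == "__main__":
    print(array_uuids(CONF_SAMPLE).get("/dev/md0") == DETAIL_UUID)
```

In this case the two agree, so the config file does not look like the
source of the problem.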
A small note on hardware: I have a Highpoint RocketRaid card in there. Is
there any way to simply NOT use software mirroring and just use hardware
mirroring? Everything I try to set up with this card still reports both
disks to the operating system, which seems wrong to me. I'm used to seeing
only one drive on the OS side once the array is created through the
controller, not both...and in fact I'm guessing this is what I'd see on the
Windows side of things...am I missing a driver? Do I need to re-compile the
kernel for correct support?
--Douglas Wagner