[mythtv-users] OT: FedoraCore3 + NFS + XFS + Software Raid = EIP?

Blammo blammo.doh at gmail.com
Thu Jun 30 21:09:07 EDT 2005


I know this isn't a kernel mailing list, but since a large number of
people on this list combine these features, I thought I'd give it a
shot.

I have a backend running (at the moment) 2.6.12.1 w/8k stacks. I've
been fighting this issue since about 2.6.7, though. Going from 4k
stacks to 8k stacks helped, but didn't remove the problem.

Occasionally, under heavy IO, the machine ends up in one of two
states. It will either hard lock (no keyboard activity) or go into
limbo (still responsive, but with hung processes). The times it
doesn't hard lock, I've been able to track it back to NFS.

I've replaced the motherboard, CPU, RAM, video, ethernet, HDs, and
power supplies, tried different kernel versions, chased IRQs, etc., so
I'm fairly sure it's a bug.


Here's the little present that was left in /var/log/messages for me
today. This was a limbo-lock, meaning I was able to kill almost all
processes, but couldn't shut down because nfsd and mountd were hung:


Jun 30 12:31:52 backend1 kernel: Unable to handle kernel paging
request at virtual address fff72710
Jun 30 12:31:52 backend1 kernel:  printing eip:
Jun 30 12:31:52 backend1 kernel: fff72710
Jun 30 12:31:52 backend1 kernel: *pde = 00002067
Jun 30 12:31:52 backend1 kernel: Oops: 0000 [#1]
Jun 30 12:31:52 backend1 kernel: Modules linked in: nfsd lockd md5
ipv6 parport_pc lp parport autofs4 sunrpc ext3 jbd dm_mod video button
battery ac i2c_viapro b2c2_flexcop_pci b2c2_flexcop dvb_core mt352
bcm3510 stv0299 nxt2002 stv0297 mt312 i2c_core r8169 xfs exportfs
raid5 xor raid0 sata_via libata sd_mod scsi_mod
Jun 30 12:31:52 backend1 kernel: CPU:    0
Jun 30 12:31:52 backend1 kernel: EIP:    0060:[<fff72710>]    Not tainted VLI
Jun 30 12:31:52 backend1 kernel: EFLAGS: 00010286   (2.6.12.1) 
Jun 30 12:31:52 backend1 kernel: EIP is at 0xfff72710
Jun 30 12:31:52 backend1 kernel: eax: e0f98500   ebx: c75864a0   ecx:
dc39fde8   edx: d78eda60
Jun 30 12:31:52 backend1 kernel: esi: d78eda60   edi: da680000   ebp:
e0f98500   esp: dc39fdc4
Jun 30 12:31:52 backend1 kernel: ds: 007b   es: 007b   ss: 0068
Jun 30 12:31:52 backend1 kernel: Process nfsd (pid: 3128,
threadinfo=dc39e000 task=c1720aa0)
Jun 30 12:31:52 backend1 kernel: Stack: e0c46d62 dc39fdec 00000000
00000000 dd60a950 dbdab800 d6a97e60 dc39fe10
Jun 30 12:31:52 backend1 kernel:        00001000 da680000 00001000
d78eda60 00000385 00000000 e0f98500 e0c4477d
Jun 30 12:31:52 backend1 kernel:        00000000 05000900 00000400
dd60a950 d78eda60 dd60a950 dbdab004 11270000
Jun 30 12:31:52 backend1 kernel: Call Trace:
Jun 30 12:31:52 backend1 kernel:  [<e0c46d62>]
cache_make_upcall+0x92/0x270 [sunrpc]
Jun 30 12:31:52 backend1 kernel:  [<e0c4477d>] cache_check+0xed/0x180 [sunrpc]
Jun 30 12:31:52 backend1 kernel:  [<e0f6c788>] fh_verify+0x428/0x590 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0f6e58e>] nfsd_open+0x2e/0x1f0 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0f6eaad>] nfsd_read+0x13d/0xd60 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<c0386637>] schedule_timeout+0x97/0xe0
Jun 30 12:31:52 backend1 kernel:  [<e0c3dccc>] svc_reserve+0xec/0x1b0 [sunrpc]
Jun 30 12:31:52 backend1 kernel:  [<e0f76d01>] nfsd3_proc_read+0xb1/0x170 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0f789e0>]
nfs3svc_decode_readargs+0x0/0x1a0 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0f6a7ca>] nfsd_dispatch+0x8a/0x210 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0c4040f>] svc_send+0x9f/0x130 [sunrpc]
Jun 30 12:31:52 backend1 kernel:  [<e0f6d111>] fh_put+0x141/0x1a0 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0c3d781>] svc_process+0x571/0x600 [sunrpc]
Jun 30 12:31:52 backend1 kernel:  [<c01340de>] sigprocmask+0xee/0x280
Jun 30 12:31:52 backend1 kernel:  [<e0f6a40c>] nfsd+0x21c/0x550 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0f6a1f0>] nfsd+0x0/0x550 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<c010130d>] kernel_thread_helper+0x5/0x18
Jun 30 12:31:52 backend1 kernel: Code:  Bad EIP value.
Jun 30 12:31:52 backend1 kernel:  <1>Unable to handle kernel paging
request at virtual address 490130d5
Jun 30 12:31:52 backend1 kernel:  printing eip:
Jun 30 12:31:52 backend1 kernel: e0f72cc3
Jun 30 12:31:52 backend1 kernel: *pde = 00000000
Jun 30 12:31:52 backend1 kernel: Oops: 0000 [#2]
Jun 30 12:31:52 backend1 kernel: Modules linked in: nfsd lockd md5
ipv6 parport_pc lp parport autofs4 sunrpc ext3 jbd dm_mod video button
battery ac i2c_viapro b2c2_flexcop_pci b2c2_flexcop dvb_core mt352
bcm3510 stv0299 nxt2002 stv0297 mt312 i2c_core r8169 xfs exportfs
raid5 xor raid0 sata_via libata sd_mod scsi_mod
Jun 30 12:31:52 backend1 kernel: CPU:    0
Jun 30 12:31:52 backend1 kernel: EIP:    0060:[<e0f72cc3>]    Not tainted VLI
Jun 30 12:31:52 backend1 kernel: EFLAGS: 00010202   (2.6.12.1) 
Jun 30 12:31:52 backend1 kernel: EIP is at svc_export_lookup+0x53/0x200 [nfsd]
Jun 30 12:31:52 backend1 kernel: eax: dbf998dc   ebx: 490130c1   ecx:
00000000   edx: d78fa580
Jun 30 12:31:52 backend1 kernel: esi: dc223ef4   edi: 00000000   ebp:
dbf998dc   esp: dc223d80
Jun 30 12:31:52 backend1 kernel: ds: 007b   es: 007b   ss: 0068
Jun 30 12:31:52 backend1 kernel: Process rpc.mountd (pid: 3139,
threadinfo=dc222000 task=c174daa0)
Jun 30 12:31:52 backend1 kernel: Stack: 00000001 00000000 d5866000
d78fa580 00000020 e0f72ad4 00000000 dc223db5
Jun 30 12:31:52 backend1 kernel:        dc223df6 dc223e36 dc223e76
dc223ebe e0c5ae3d 00000030 00000001 c0213558
Jun 30 12:31:52 backend1 kernel:        c8b4f000 c0157e99 00000000
00000000 00000000 00000000 00000000 00000000
Jun 30 12:31:52 backend1 kernel: Call Trace:
Jun 30 12:31:52 backend1 kernel:  [<e0f72ad4>]
svc_export_parse+0x2a4/0x3a0 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<c0213558>] fast_clear_page+0x8/0x50
Jun 30 12:31:52 backend1 kernel:  [<c0157e99>] buffered_rmqueue+0x119/0x2d0
Jun 30 12:31:52 backend1 kernel:  [<c01581de>] __alloc_pages+0xbe/0x3b0
Jun 30 12:31:52 backend1 kernel:  [<c0167247>] do_anonymous_page+0x197/0x330
Jun 30 12:31:52 backend1 kernel:  [<c018f652>] do_lookup+0x42/0x90
Jun 30 12:31:52 backend1 kernel:  [<c016744f>] do_no_page+0x6f/0x4f0
Jun 30 12:31:52 backend1 kernel:  [<c0164308>] pte_alloc_map+0x118/0x1c0
Jun 30 12:31:52 backend1 kernel:  [<c01a36d9>] notify_change+0x2b9/0x309
Jun 30 12:31:52 backend1 kernel:  [<c011a092>] do_page_fault+0x1c2/0x558
Jun 30 12:31:52 backend1 kernel:  [<e0f72830>] svc_export_parse+0x0/0x3a0 [nfsd]
Jun 30 12:31:52 backend1 kernel:  [<e0c461b1>] cache_write+0xa1/0xc0 [sunrpc]
Jun 30 12:31:52 backend1 kernel:  [<c017c47e>] vfs_write+0x12e/0x130
Jun 30 12:31:52 backend1 kernel:  [<c017c531>] sys_write+0x41/0x70
Jun 30 12:31:52 backend1 kernel:  [<c0103a6f>] sysenter_past_esp+0x54/0x75
Jun 30 12:31:52 backend1 kernel: Code: 15 04 85 f9 e0 c1 e8 18 8d 2c
82 8b 55 00 89 e8 85 d2 0f 84 98 00 00 00 8b 56 14 eb 0c 8b 0b 89 d8
85 c9 0f 84 87 00 00 00 8b 18 <3b> 53 14 75 ed 8b 43 20 39 46 20 75 e5
8b 43 1c 39 46 1c 75 dd

At this point you can kill most processes, but mountd and nfsd are
hung and prevent the machine from shutting down correctly.
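
For anyone reading the second oops, here is a minimal sketch (Python,
purely illustrative) that checks which register the faulting address
was derived from. The register values and fault address are copied
from the rpc.mountd oops above; the helper names and the 0x100 offset
window are assumptions made for this example. It is a way of reading
the dump, not a diagnosis.

```python
import re

# Fault address and register dump copied from the second oops (rpc.mountd).
OOPS = """\
Unable to handle kernel paging request at virtual address 490130d5
eax: dbf998dc   ebx: 490130c1   ecx: 00000000   edx: d78fa580
esi: dc223ef4   edi: 00000000   ebp: dbf998dc   esp: dc223d80
"""

REGISTER_RE = re.compile(r"\b(e(?:ax|bx|cx|dx|si|di|bp|sp)):\s+([0-9a-f]{8})")
FAULT_RE = re.compile(r"virtual address ([0-9a-f]{8})")


def parse_registers(text):
    """Return {register name: integer value} for each 'reg: hex' pair."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}


def fault_address(text):
    """Return the faulting virtual address as an integer."""
    return int(FAULT_RE.search(text).group(1), 16)


def registers_near_fault(text, max_offset=0x100):
    """List (register, offset) pairs where fault == register + offset.

    max_offset is an arbitrary window chosen for this example; it just
    filters out registers that are not plausibly the base pointer.
    """
    fault = fault_address(text)
    return [
        (name, fault - value)
        for name, value in parse_registers(text).items()
        if 0 <= fault - value <= max_offset
    ]


if __name__ == "__main__":
    for name, offset in registers_near_fault(OOPS):
        print(f"fault address = {name} + {offset:#x}")
```

Only ebx matches, at offset 0x14, which lines up with the marked bytes
in the Code: line (<3b> 53 14, a compare against memory at ebx+0x14).
Since 490130c1 is not in the range of the other kernel pointers in the
dump, that would be consistent with svc_export_lookup following a
corrupted pointer, though that is only a guess from the dump.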

Machine:
KT660 motherboard, AMD 1800 cpu, 512M PC3200, 120g OS, 4x160G raid5
XFS (2PATA 2SATA), 2xAir2PC
Fedora 3, 2.6.12.1 8k stacks, DVB-CVS
Linux version 2.6.12.1 (root at backend1) (gcc version 3.4.2 20041017
(Red Hat 3.4.2-6.fc3)) #1 Mon Jun 27 17:26:59 MST 2005

[root at backend1 ~]# lspci -v
00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP]
Host Bridge (rev 80)
        Subsystem: VIA Technologies, Inc.: Unknown device 0000
        Flags: bus master, 66Mhz, medium devsel, latency 8
        Memory at e0000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [80] AGP version 3.5
        Capabilities: [c0] Power Management version 2

00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge (prog-if
00 [Normal decode])
        Flags: bus master, 66Mhz, medium devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Memory behind bridge: dde00000-dfefffff
        Prefetchable memory behind bridge: d5c00000-ddcfffff
        Capabilities: [80] Power Management version 2

00:05.0 Network controller: Techsan Electronics Co Ltd B2C2 FlexCopII
DVB chip / Technisat SkyStar2 DVB card (rev 02)
        Subsystem: Techsan Electronics Co Ltd B2C2 FlexCopII DVB chip
/ Technisat SkyStar2 DVB card
        Flags: bus master, slow devsel, latency 192, IRQ 11
        Memory at dffe0000 (32-bit, non-prefetchable) [size=64K]
        I/O ports at d400 [size=32]

00:07.0 Network controller: Techsan Electronics Co Ltd B2C2 FlexCopII
DVB chip / Technisat SkyStar2 DVB card (rev 02)
        Subsystem: Techsan Electronics Co Ltd B2C2 FlexCopII DVB chip
/ Technisat SkyStar2 DVB card
        Flags: bus master, slow devsel, latency 192, IRQ 5
        Memory at dffd0000 (32-bit, non-prefetchable) [size=64K]
        I/O ports at d000 [size=32]

00:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169
Gigabit Ethernet (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet
        Flags: bus master, 66Mhz, medium devsel, latency 192, IRQ 7
        I/O ports at cc00 [size=256]
        Memory at dfffbf00 (32-bit, non-prefetchable) [size=256]
        Expansion ROM at dffa0000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2

00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA
RAID Controller (rev 80)
        Subsystem: Micro-Star International Co., Ltd.: Unknown device 0210
        Flags: bus master, medium devsel, latency 192, IRQ 10
        I/O ports at ec00 [size=8]
        I/O ports at e800 [size=4]
        I/O ports at e400 [size=8]
        I/O ports at e000 [size=4]
        I/O ports at dc00 [size=16]
        I/O ports at d800 [size=256]
        Capabilities: [c0] Power Management version 2

00:0f.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
(prog-if 8a [Master SecP PriP])
        Subsystem: Micro-Star International Co., Ltd.: Unknown device 0210
        Flags: bus master, medium devsel, latency 32, IRQ 11
        I/O ports at fc00 [size=16]
        Capabilities: [c0] Power Management version 2

00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge
[KT600/K8T800 South]
        Subsystem: VIA Technologies, Inc.: Unknown device 0000
        Flags: bus master, stepping, medium devsel, latency 0
        Capabilities: [c0] Power Management version 2

01:00.0 VGA compatible controller: nVidia Corporation NV20 [GeForce3]
(rev a3) (prog-if 00 [VGA])
        Subsystem: VISIONTEK: Unknown device 001b
        Flags: bus master, 66Mhz, medium devsel, latency 192, IRQ 11
        Memory at de000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d8000000 (32-bit, prefetchable) [size=64M]
        Memory at ddc80000 (32-bit, prefetchable) [size=512K]
        Expansion ROM at dfef0000 [disabled] [size=64K]
        Capabilities: [60] Power Management version 2
        Capabilities: [44] AGP version 2.0


Anyone want to field a guess?

