[mythtv-users] Oops on simultaneous recordings...

Debabrata Banerjee davatar at comcast.net
Wed May 26 22:11:03 EDT 2004


Yes, this is very annoying. A Google search for the kswapd oops will turn up
_many_ results, but no answers or clues. This is not a mythbackend
problem; it is a bttv problem. It looks like when the card is short on PCI
bandwidth (or something similar) it starts corrupting memory. Beware, because
this will probably corrupt your file system if you let it keep happening (do
you need to ask me how I know?). I get this problem with a pvr250 and a bttv
card on my system, and this time it cannot be blamed on the chipset, as
usually happens when this is brought up. The board I am running on has 3 PCI
buses and a ServerWorks chipset. The cards basically have the entire
32bit/33MHz bus to themselves. Look below.
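
For a rough sense of scale, the bus numbers can be worked out on the back of an envelope. The sketch below compares the theoretical peak of a 32-bit/33MHz PCI bus with the raw data rate of one uncompressed capture stream. The capture geometry (640x480, 2 bytes per pixel, 30 frames per second) is an assumption chosen for illustration, not a measurement from this machine, and the function names are made up.

```python
# Theoretical peak of a 32-bit / 33 MHz PCI bus versus the raw data rate of
# one uncompressed capture stream. The capture parameters are assumptions.

PCI_WIDTH_BYTES = 4          # 32-bit bus moves 4 bytes per clock
PCI_CLOCK_HZ = 33_000_000    # nominal 33 MHz, taken as exactly 33e6 here


def pci_peak_bytes_per_s(width_bytes=PCI_WIDTH_BYTES, clock_hz=PCI_CLOCK_HZ):
    """Upper bound: one full-width transfer on every clock, no overhead."""
    return width_bytes * clock_hz


def raw_capture_bytes_per_s(width=640, height=480, bytes_per_pixel=2, fps=30):
    """Data rate of an uncompressed video stream DMAed into host memory."""
    return width * height * bytes_per_pixel * fps


peak = pci_peak_bytes_per_s()
stream = raw_capture_bytes_per_s()
print(f"PCI peak:       {peak / 1e6:.1f} MB/s")
print(f"capture stream: {stream / 1e6:.1f} MB/s ({100 * stream / peak:.0f}% of peak)")
```

Under these assumptions one raw stream needs roughly 18 MB/s against a 132 MB/s theoretical peak. Real throughput is lower than the peak because of arbitration and burst overhead, so this only bounds the problem; it does not show where the corruption comes from.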

Either the bt878 is fundamentally flawed, or some assumption the drivers are
making is wrong, or bttv is triggering a bug in kswapd (which I suppose
could corrupt all memory). I've experienced this on 2.4, 2.6, and many
patched kernels in between. I've wondered if a real-time patch would help,
but being on the edge of system corruption is not a nice thought.

http://www.uwsg.iu.edu/hypermail/linux/kernel/0311.3/1277.html
http://www.uwsg.iu.edu/hypermail/linux/kernel/0110.2/1141.html
http://www.cs.helsinki.fi/linux/linux-kernel/2003-27/0894.html
http://www.mail-archive.com/video4linux-list@redhat.com/msg04319.html
http://seclists.org/lists/linux-kernel/2003/May/7647.html

My current information (Dual Xeon 2.4 on Intel SHG2),
Ksymoops output:

Unable to handle kernel paging request at virtual address 1f262424
c014162a
*pde = 00000000
Oops: 0002
CPU:    1
EIP:    0060:[<c014162a>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00210046
eax: dfa73000   ebx: c1c1d174   ecx: c6edf000   edx: 1f262420
esi: 00000089   edi: f7ef1cd4   ebp: 00000004   esp: f7ee7f2c
ds: 0068   es: 0068   ss: 0068
Process kswapd (pid: 7, stackpage=f7ee7000)
Stack: 00017414 00000000 00000007 c1c1d184 c1c1d17c c014270d f7ee6000 f7ef1c00
       00000000 00000000 00000000 00000000 00000020 000001d0 c0306858 c0306858
       c01429a0 f7ee7f90 000001d0 0000003c 00000020 c0142a35 f7ee7f90 f7e92d80
Call Trace:   [<c014270d>]  (0xf7ee7f40)
[<c01429a0>]  (0xf7ee7f6c)
[<c0142a35>]  (0xf7ee7f80)
[<c0142bb5>]  (0xf7ee7fa4)
[<c0142c0e>]  (0xf7ee7fbc)
[<c0142d2b>]  (0xf7ee7fcc)
[<c0142c90>]  (0xf7ee7fe4)
[<c0107229>]  (0xf7ee7ff0)
Code: 89 42 04 8b 43 08 89 48 04 89 01 8b 54 24 10 89 4b 08 89 51


>>EIP; c014162a <kmem_cache_reap+29d/325>   <=====

>>eax; dfa73000 <_end+1f633528/383d1588>
>>ebx; c1c1d174 <_end+17dd69c/383d1588>
>>ecx; c6edf000 <_end+6a9f528/383d1588>
>>edi; f7ef1cd4 <_end+37ab21fc/383d1588>
>>esp; f7ee7f2c <_end+37aa8454/383d1588>

Trace; c014270d <shrink_cache+300/417>
Trace; c01429a0 <shrink_caches+1c/5e>
Trace; c0142a35 <try_to_free_pages_zone+53/c2>
Trace; c0142bb5 <kswapd_balance_pgdat+62/a2>
Trace; c0142c0e <kswapd_balance+19/2f>
Trace; c0142d2b <kswapd+9b/b5>
Trace; c0142c90 <kswapd+0/b5>
Trace; c0107229 <kernel_thread_helper+5/b>

Code;  c014162a <kmem_cache_reap+29d/325>
00000000 <_EIP>:
Code;  c014162a <kmem_cache_reap+29d/325>   <=====
   0:   89 42 04                  mov    %eax,0x4(%edx)   <=====
Code;  c014162d <kmem_cache_reap+2a0/325>
   3:   8b 43 08                  mov    0x8(%ebx),%eax
Code;  c0141630 <kmem_cache_reap+2a3/325>
   6:   89 48 04                  mov    %ecx,0x4(%eax)
Code;  c0141633 <kmem_cache_reap+2a6/325>
   9:   89 01                     mov    %eax,(%ecx)
Code;  c0141635 <kmem_cache_reap+2a8/325>
   b:   8b 54 24 10               mov    0x10(%esp,1),%edx
Code;  c0141639 <kmem_cache_reap+2ac/325>
   f:   89 4b 08                  mov    %ecx,0x8(%ebx)
Code;  c014163c <kmem_cache_reap+2af/325>
  12:   89 51 00                  mov    %edx,0x0(%ecx)

--> soft hang (sysrq keys still work)
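
The header lines of the oops can be read mechanically. Below is a minimal sketch, assuming the standard i386 page-fault error-code layout (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The helper names are hypothetical; the register values are copied from the ksymoops output above.

```python
# Decode the x86 page-fault error code and check the faulting address
# against the registers printed in the oops.


def decode_page_fault(error_code):
    """Interpret the low three bits of an i386 page-fault error code."""
    return {
        "cause": "protection violation" if error_code & 0x1 else "page not present",
        "access": "write" if error_code & 0x2 else "read",
        "mode": "user" if error_code & 0x4 else "kernel",
    }


def effective_address(base, displacement):
    """Address touched by an operand of the form disp(%base), 32-bit wrap."""
    return (base + displacement) & 0xFFFFFFFF


fault = decode_page_fault(0x0002)            # "Oops: 0002"
target = effective_address(0x1F262420, 0x4)  # edx = 1f262420, mov %eax,0x4(%edx)

print(fault)
print(f"store target: {target:08x}")
```

The decoded fault is a kernel-mode write to a page that is not present, and the store target equals the reported virtual address 1f262424. In other words the faulting instruction wrote through edx, and edx held a value that does not look like any of the other kernel pointers in the register dump. The same check works for the quoted oops further down: there eax is 35333c43, the store goes to 0x4(%eax), and the reported address is 35333c47.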

------------
/proc/pci
------------
PCI devices found:
  Bus  0, device   0, function  0:
    Host bridge: ServerWorks CMIC-LE (rev 19).
  Bus  0, device   0, function  1:
    Host bridge: ServerWorks CMIC-LE (#2) (rev 0).
  Bus  0, device   0, function  2:
    Host bridge: PCI device 1166:0000 (ServerWorks) (rev 0).
  Bus  0, device   2, function  0:
    VGA compatible controller: ATI Technologies Inc Rage XL (rev 39).
      IRQ 20.
      Master Capable.  Latency=64.  Min Gnt=8.
      Non-prefetchable 32 bit memory at 0xf5000000 [0xf5ffffff].
      I/O at 0x2000 [0x20ff].
      Non-prefetchable 32 bit memory at 0xf4000000 [0xf4000fff].
  Bus  0, device   4, function  0:
    Multimedia video controller: Internext Compression Inc iTVC15 MPEG-2 Encoder (rev 1).
      IRQ 25.
      Master Capable.  Latency=64.  Min Gnt=128.Max Lat=8.
      Prefetchable 32 bit memory at 0xf8000000 [0xfbffffff].
  Bus  0, device   8, function  0:
    Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 17).
      IRQ 22.
      Master Capable.  Latency=64.  Min Gnt=16.Max Lat=40.
      Prefetchable 32 bit memory at 0xf6000000 [0xf6000fff].
  Bus  0, device   8, function  1:
    Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 17).
      IRQ 22.
      Master Capable.  Latency=64.  Min Gnt=4.Max Lat=255.
      Prefetchable 32 bit memory at 0xf6001000 [0xf6001fff].
  Bus  0, device   9, function  0:
    Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 7).
      IRQ 23.
      Master Capable.  Latency=64.  Min Gnt=2.Max Lat=20.
      I/O at 0x2400 [0x241f].
  Bus  0, device   9, function  1:
    Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 7).
      Master Capable.  Latency=64.
      I/O at 0x2430 [0x2437].
  Bus  0, device  15, function  0:
    ISA bridge: ServerWorks CSB5 South Bridge (rev 147).
      Master Capable.  Latency=248.
  Bus  0, device  15, function  1:
    IDE interface: ServerWorks CSB5 IDE Controller (rev 147).
      Master Capable.  Latency=64.
      I/O at 0x2420 [0x242f].
  Bus  0, device  15, function  2:
    USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 5).
      IRQ 10.
      Master Capable.  Latency=64.  Max Lat=80.
      Non-prefetchable 32 bit memory at 0xf4001000 [0xf4001fff].
  Bus  0, device  15, function  3:
    Host bridge: ServerWorks GCLE Host Bridge (rev 0).
  Bus  0, device  17, function  0:
    Host bridge: PCI device 1166:0101 (ServerWorks) (rev 3).
  Bus  0, device  17, function  2:
    Host bridge: PCI device 1166:0101 (ServerWorks) (rev 3).
      Master Capable.  Latency=64.
  Bus  1, device   4, function  0:
    Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet Controller (LOM) (rev 2).
      IRQ 19.
      Master Capable.  Latency=64.  Min Gnt=255.
      Non-prefetchable 32 bit memory at 0xfc040000 [0xfc05ffff].
      Non-prefetchable 32 bit memory at 0xfc020000 [0xfc03ffff].
      I/O at 0x2440 [0x245f].
  Bus  1, device   8, function  0:
    Unknown mass storage controller: Promise Technology, Inc. 20269 (rev 2).
      IRQ 26.
      Master Capable.  Latency=64.  Min Gnt=4.Max Lat=18.
      I/O at 0x2478 [0x247f].
      I/O at 0x243c [0x243f].
      I/O at 0x2470 [0x2477].
      I/O at 0x2438 [0x243b].
      I/O at 0x2460 [0x246f].
      Non-prefetchable 32 bit memory at 0xfc000000 [0xfc003fff].
  Bus  1, device   9, function  0:
    PCI bridge: Digital Equipment Corporation DECchip 21150 (rev 6).
      Master Capable.  Latency=64.  Min Gnt=7.
  Bus  2, device   1, function  0:
    RAID bus controller: Promise Technology, Inc. 20268R (rev 2).
      IRQ 29.
      Master Capable.  Latency=64.  Min Gnt=4.Max Lat=18.
      I/O at 0x3030 [0x3037].
      I/O at 0x3024 [0x3027].
      I/O at 0x3028 [0x302f].
      I/O at 0x3020 [0x3023].
      I/O at 0x3000 [0x300f].
      Non-prefetchable 32 bit memory at 0xfc100000 [0xfc10ffff].
  Bus  2, device   2, function  0:
    RAID bus controller: Promise Technology, Inc. 20268R (#2) (rev 2).
      IRQ 30.
      Master Capable.  Latency=64.  Min Gnt=4.Max Lat=18.
      I/O at 0x3048 [0x304f].
      I/O at 0x303c [0x303f].
      I/O at 0x3040 [0x3047].
      I/O at 0x3038 [0x303b].
      I/O at 0x3010 [0x301f].
      Non-prefetchable 32 bit memory at 0xfc110000 [0xfc11ffff].
  Bus  3, device   8, function  0:
    Ethernet controller: PCI device 8086:1026 (Intel Corp.) (rev 4).
      IRQ 24.
      Master Capable.  Latency=64.  Min Gnt=255.
      Non-prefetchable 64 bit memory at 0xfc420000 [0xfc43ffff].
      Non-prefetchable 64 bit memory at 0xfc440000 [0xfc47ffff].
      I/O at 0x4800 [0x483f].
  Bus  3, device   9, function  0:
    SCSI storage controller: Adaptec AIC-7899P U160/m (rev 1).
      IRQ 16.
      Master Capable.  Latency=64.  Min Gnt=40.Max Lat=25.
      I/O at 0x4000 [0x40ff].
      Non-prefetchable 64 bit memory at 0xfc400000 [0xfc400fff].
  Bus  3, device   9, function  1:
    SCSI storage controller: Adaptec AIC-7899P U160/m (#2) (rev 1).
      IRQ 17.
      Master Capable.  Latency=64.  Min Gnt=40.Max Lat=25.
      I/O at 0x4400 [0x44ff].
      Non-prefetchable 64 bit memory at 0xfc401000 [0xfc401fff].
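
To see at a glance which devices share a bus with the capture cards, the listing can be grouped by bus number. This is a small sketch that assumes the /proc/pci text layout shown above; the function name and the returned structure are made up for illustration, and the sample is a three-device excerpt of the listing with wrapped device names joined onto one line.

```python
import re
from collections import defaultdict

# Short excerpt in the /proc/pci layout, copied from the listing above.
SAMPLE = """\
  Bus  0, device   4, function  0:
    Multimedia video controller: Internext Compression Inc iTVC15 MPEG-2 Encoder (rev 1).
      IRQ 25.
      Master Capable.  Latency=64.  Min Gnt=128.Max Lat=8.
  Bus  0, device   8, function  0:
    Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 17).
      IRQ 22.
      Master Capable.  Latency=64.  Min Gnt=16.Max Lat=40.
  Bus  1, device   8, function  0:
    Unknown mass storage controller: Promise Technology, Inc. 20269 (rev 2).
      IRQ 26.
      Master Capable.  Latency=64.  Min Gnt=4.Max Lat=18.
"""

HEADER = re.compile(r"^\s*Bus\s+(\d+), device\s+(\d+), function\s+(\d+):")
LATENCY = re.compile(r"Latency=(\d+)")


def devices_by_bus(text):
    """Group /proc/pci-style entries by bus: {bus: [{"name", "latency"}, ...]}."""
    buses = defaultdict(list)
    current = None
    expect_name = False
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # A new device entry starts; the next line carries its name.
            current = {"name": None, "latency": None}
            buses[int(header.group(1))].append(current)
            expect_name = True
        elif current is not None and expect_name:
            current["name"] = line.strip()
            expect_name = False
        elif current is not None:
            latency = LATENCY.search(line)
            if latency:
                current["latency"] = int(latency.group(1))
    return dict(buses)


for bus, devices in sorted(devices_by_bus(SAMPLE).items()):
    for dev in devices:
        print(f"bus {bus}  latency {dev['latency']}  {dev['name']}")
```

Run against the full listing, this puts the iTVC15 (the pvr250) and both Bt878 functions on bus 0, together with the Rage XL and the SB Live, while the disk and network controllers sit on buses 1 to 3.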


----- Original Message ----- 
From: "Poul Petersen" <petersp at mail.alleft.com>
To: <mythtv-users at mythtv.org>
Sent: Wednesday, May 26, 2004 9:08 PM
Subject: [mythtv-users] Oops on simultaneous recordings...


> I've been trouble-shooting an intermittent problem for quite some time now,
> and I can't draw a definitive list of conditions. So, I'm sending this email
> just to see if anyone else has ever seen anything like this.
>
> I've got a separate master backend machine with two capture cards.
> Every now and then (maybe 1 out of three times?) when the box needs to
> record two shows at the same time, the machine will Oops about 10 minutes
> into the recordings. If I then hard boot the machine, the recordings will
> start back up and complete fine.
>
> If I reboot the machine every day, the problem will still
> occur every now and then (so it's not a memory leak). I've been running this
> box for almost 16 months and in the beginning I ran lots of tests with three
> capture cards, and I don't remember it ever crashing. I think the problems
> may have appeared when I upgraded to 0.14, but I can't be sure. At the
> time I also upgraded the disk and added a LCD - these variables have all
> been removed (I built a separate frontend and installed the LCD there,
> and I removed the second hard drive). I also added extra case fans
> in case it was thermally related. I also ran memtestx86 for a full
> weekend with no errors.
>
> I've worked around the problem by changing the encoder scheduling
> so that the frontend gets the second recording, thus ensuring that the
> master backend never has to perform two simultaneous recordings (unless I
> try to record three shows at the same time, but I haven't needed to do that
> recently). Since I made this change, the machine has been stable (it has not
> oopsed in over a month whereas it used to oops about once a week).
>
> Below I've attached the Oops I usually see and the machine
> specs. Notice that I usually get two Oops's, one in mythbackend and
> the other in kswapd, sometimes kjournald, etc.
>
> Thanks for reading...
>
> -poul
>
> Specs:
> RedHat 9 w/ kernel 2.4.25
> ASUS A7V8X-X, AMD XP2200+, 256MB
> Myth-0.14 built from source
> mysql-4.0.13
> (2) WinTV Radio using bttv/btaudio
> alsa-0.9.4 snd-via82xx (on-board)
> ext3 filesystem
> 1GB of swap
>
> kernel:  <1>Unable to handle kernel paging request at virtual address 35333c47
> kernel: c0135b7d
> kernel: *pde = 00000000
> kernel: Oops: 0002
> kernel: CPU:    0
> kernel: EIP:    0010:[<c0135b7d>]    Tainted: P
> kernel: EFLAGS: 00010046
> kernel: eax: 35333c43   ebx: c1305dd4   ecx: c1cfe000   edx: cdaba000
> kernel: esi: 00000018   edi: c1cfecc0   ebp: c1205010   esp: ceaf7dc0
> kernel: ds: 0018   es: 0018   ss: 0018
> kernel: Process mythbackend (pid: 3911, stackpage=ceaf7000)
> kernel: Stack: 00000282 c1cfecc0 c1cfecc0 c013547e c1305dd4 c1cfecc0 c1cfecc0 c0140c27
> kernel:        c1305dd4 c1cfecc0 c0142fdd c1cfecc0 00000000 c1205010 c0270358 00002366
> kernel:        c01362a7 c1205010 000001d2 ceaf6000 00000c80 000001d2 0000000c 00000020
> kernel: Call Trace:    [<c013547e>] [<c0140c27>] [<c0142fdd>] [<c01362a7>] [<c01364dd>]
> kernel:   [<c0136553>] [<c0137422>] [<c014182e>] [<c0137730>] [<c012efe5>] [<c012f0bb>]
> kernel:   [<d0830500>] [<c012f783>] [<c012fb7a>] [<c012ffa0>] [<c013012d>] [<c012ffa0>]
> kernel:   [<c01acc10>] [<c013e863>] [<c0108dbf>]
> kernel: Code: 89 50 04 c7 01 00 00 00 00 8b 43 08 8d 53 08 89 48 04 89 01
>
>
> >>EIP; c0135b7d <kmem_find_general_cachep+69d/1ee0>   <=====
>
> >>ebx; c1305dd4 <_end+fec1dc/104f3488>
> >>ecx; c1cfe000 <_end+19e4408/104f3488>
> >>edx; cdaba000 <_end+d7a0408/104f3488>
> >>edi; c1cfecc0 <_end+19e50c8/104f3488>
> >>ebp; c1205010 <_end+eeb418/104f3488>
> >>esp; ceaf7dc0 <_end+e7de1c8/104f3488>
>
> Trace; c013547e <kmem_cache_free+1e/30>
> Trace; c0140c27 <bread+e7/100>
> Trace; c0142fdd <try_to_free_buffers+5d/1f0>
> Trace; c01362a7 <kmem_find_general_cachep+dc7/1ee0>
> Trace; c01364dd <kmem_find_general_cachep+ffd/1ee0>
> Trace; c0136553 <kmem_find_general_cachep+1073/1ee0>
> Trace; c0137422 <_alloc_pages+62/200>
> Trace; c014182e <block_read_full_page+1ee/2c0>
> Trace; c0137730 <__alloc_pages+170/270>
> Trace; c012efe5 <filemap_fdatawait+1c5/350>
> Trace; c012f0bb <filemap_fdatawait+29b/350>
> Trace; d0830500 <[ext3].text.start+4480/cb0d>
> Trace; c012f783 <grab_cache_page_nowait+183/220>
> Trace; c012fb7a <do_generic_file_read+32a/820>
> Trace; c012ffa0 <do_generic_file_read+750/820>
> Trace; c013012d <generic_file_read+bd/870>
> Trace; c012ffa0 <do_generic_file_read+750/820>
> Trace; c01acc10 <ide_dma_intr+0/430>
> Trace; c013e863 <default_llseek+2a3/ce0>
> Trace; c0108dbf <__up_wakeup+128f/1660>
>
> Code;  c0135b7d <kmem_find_general_cachep+69d/1ee0>
> 00000000 <_EIP>:
> Code;  c0135b7d <kmem_find_general_cachep+69d/1ee0>   <=====
>    0:   89 50 04                  mov    %edx,0x4(%eax)   <=====
> Code;  c0135b80 <kmem_find_general_cachep+6a0/1ee0>
>    3:   c7 01 00 00 00 00         movl   $0x0,(%ecx)
> Code;  c0135b86 <kmem_find_general_cachep+6a6/1ee0>
>    9:   8b 43 08                  mov    0x8(%ebx),%eax
> Code;  c0135b89 <kmem_find_general_cachep+6a9/1ee0>
>    c:   8d 53 08                  lea    0x8(%ebx),%edx
> Code;  c0135b8c <kmem_find_general_cachep+6ac/1ee0>
>    f:   89 48 04                  mov    %ecx,0x4(%eax)
> Code;  c0135b8f <kmem_find_general_cachep+6af/1ee0>
>   12:   89 01                     mov    %eax,(%ecx)
>

