[mythtv-users] random livetv stalls
Keith Pyle
kpyle at austin.rr.com
Sun Feb 9 19:07:06 UTC 2014
On 02/09/14 06:00, Jean-Yves Avenard wrote:
> Hi
>
> On Saturday, February 8, 2014, Kevin Johnson <iitywygms at gmail.com> wrote:
>
>> Well
>> After searching around a bit I have tried making some tweaks to
>> sysctl.conf:
>> kernel.shmall = 167772160
>> kernel.shmmax = 167772160
>> net.core.wmem_max = 12582912
>> net.core.rmem_max = 12582912
>> net.ipv4.tcp_rmem = 10240 87380 12582912
>> net.ipv4.tcp_wmem = 10240 87380 12582912
>>
>> I will see if this makes a difference.
>
> Interesting. Is this something you've done on the backend or the frontend?
>
> How did you come up with those values?
> Which kernel version are you running?
I note that in a subsequent post you indicated the problem was not
present after making these changes. I seriously doubt that the net.*
value changes had anything to do with it - unless you have an underlying
network problem.
I've had occasion to do quite a bit of testing with these values for a
work project over a large range of network latencies. The short summary
is that changing the values has little to no effect on a typical,
high-performance LAN.
For background, the net.core.* values determine the size of the receive
(rmem_max) and send (wmem_max) socket buffers when the calling program
explicitly uses setsockopt(2) to request a specific socket buffer size.
Most programs don't do this. The net.ipv4.* values are used to set the
TCP socket buffer sizes when TCP auto-tuning is configured in the
kernel. The first value is the lower limit (generally not used unless
the system is under memory pressure), the second is the starting/default
value, and the third is the upper limit. All are in bytes. The size of
the buffer needed is proportional to the network latency between the
sender and receiver. TCP will send multiple packets in sequence
("windowing"), without waiting for the receiver to acknowledge, in an
attempt to increase throughput. Larger buffers let more unacknowledged
packets be in transit, e.g., over a WAN link. Over most LANs, the
packet transit time is so low that packet acknowledgments are received
very quickly and the TCP buffers can be small. The default values have
been pretty good in recent kernels.
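To make the setsockopt(2) point concrete, here is a minimal Python sketch
(assuming a Linux host; the 4 MB request size is just an illustration).
The kernel clamps an explicit request to net.core.rmem_max, which is why
raising that limit does nothing for the majority of programs that never
make the call:

import socket

# Ask the kernel for a 4 MB receive buffer on a TCP socket.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)

# Read back what was actually granted.  Linux clamps the request to
# net.core.rmem_max and reports double the value to account for
# kernel bookkeeping overhead (see socket(7)).
granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("granted receive buffer: %d bytes" % granted)
s.close()

With the 12582912 rmem_max above, the request is granted in full; with a
typical stock rmem_max (a few hundred KB) it is silently cut down.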
So, unless you have a very lossy network (i.e., lots of errors forcing
retransmissions) or you are running your backend and frontend in
different countries, it is unlikely that these value changes had
anything to do with the improvement you see.
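To put rough numbers on that: the buffer a connection needs to keep the
pipe full is roughly the bandwidth-delay product, i.e., the link rate
multiplied by the round-trip time. A quick sketch (the rates and RTTs
are illustrative assumptions, not measurements):

# Bandwidth-delay product: bytes of in-flight data needed to keep a
# TCP stream busy at a given rate and round-trip time.
def bdp_bytes(bits_per_second, rtt_seconds):
    return bits_per_second * rtt_seconds / 8

print(bdp_bytes(1e9, 0.0002))  # gigabit LAN, ~0.2 ms RTT: ~25 KB
print(bdp_bytes(1e9, 0.150))   # intercontinental link, ~150 ms RTT: ~18.75 MB

The 87380-byte default you kept as the middle tcp_rmem value already
covers the LAN case several times over, which is why tuning it upward is
invisible on a home network.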
The best way to determine if your problem is network-related is to:
(1) Check for high network usage at the time of the problem. A
graphical tool like gkrellm can be useful here.
(2) Check the interface statistics on each NIC in the path to see if the
counters show non-zero values for errors, drops, overruns, etc. (a small
polling sketch follows this list). If you have non-zero values, check
them periodically to see if they increment around the time of your
observed performance problem. If you see this, then you'll need to
determine whether you have a cable problem, a switch port failure, or a
NIC failure.
(3) If you don't have high usage or port errors, run a network capture
(e.g., Wireshark, tcpdump) while a problem is likely to occur. Analysis
takes a bit more experience here, but you want to look for (a) long
times between sending a packet and its corresponding ACK - which may
point to a load problem on the remote end, and (b) long times between
sending successive packets - which may point to a load problem on the
local end.
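For (2), the per-interface counters live under
/sys/class/net/<interface>/statistics on Linux, so you can watch them
without any extra tools. A small polling sketch (the interface name
"eth0" is an assumption; substitute whatever your system uses):

import time

# Error/drop counters the kernel exposes for each interface.
COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped",
            "rx_fifo_errors", "tx_fifo_errors")

def read_stats(iface):
    stats = {}
    for name in COUNTERS:
        with open("/sys/class/net/%s/statistics/%s" % (iface, name)) as f:
            stats[name] = int(f.read())
    return stats

iface = "eth0"
prev = read_stats(iface)
while True:
    time.sleep(10)
    cur = read_stats(iface)
    for name in COUNTERS:
        if cur[name] != prev[name]:
            print("%s %s: %d -> %d" % (iface, name, prev[name], cur[name]))
    prev = cur

Leave it running while you watch live TV; a counter that ticks up at the
moment of a stall points at a cable, switch port, or NIC problem.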
I haven't played with the shared memory values, so I can't comment on
their impact.
Keith