[mythtv-users] Duplicate detection

Sat Sep 24 15:14:28 UTC 2016

On 24/09/16 12:49, Jan Ceuleers wrote:
> On 24/09/16 11:56, Mike Perkins wrote:
>> A note that may help - if you're linking tables then it might be worth
>> setting up temporary indexes on the fields used for joins (assuming they
>> don't already have them). This should make a BIG difference to the
>> amount of processing required.
>
> The kinds of explosive resource usage I noticed previously was memory,
> not CPU. This was probably because of the subquery I'd included in the
> WHERE clause in my previous attempts at solving this. 't Was an increase
> by several gigabytes in the space of a few seconds, which brought the
> machine to its knees and caused the OOM killer to kick in.
>
> The queries I posted today execute pretty much instantaneously on my
> system, but of course I don't know how mythtv incorporates them into the
> scheduler query, which is why I asked for more details.
>
> But still, for my understanding: I would expect an index to trade
> reduced CPU usage for increased memory and disk usage, is that right? If
> so that might not necessarily be a good idea on my system which has a
> total of 6GB of virtual memory (which is RAM+swap) of which just over
> 2GB is in use while the system is idle (so just under 4GB headroom half
> of which is swap).
>
I'm sorry, perhaps my use of 'processing' was taken as CPU whereas I just meant "overall effort to 
handle the query".

If the field you choose to select on hasn't got an index then the query has to read the entire table 
to find out which records it might need; with an index, it just reads the index - which is already 
sorted into an order the query can make use of.

This obviously can save a whole lot of time and 'processing' since the index will be a lot smaller 
to read than the table and the query will be complete once it reads past the point of interest.

-- 

Mike Perkins