[mythtv-users] No upcoming recordings just spurious Never Record?

Mike Perkins mikep at randomtraveller.org.uk
Tue Apr 7 15:01:17 UTC 2015


On 07/04/15 15:43, Steve Goodey wrote:
> On Tuesday 07 Apr 2015 09:55:26 Andre Newman wrote:
>> On 6 Apr 2015, at 16:47, Michael T. Dean <mtdean at thirdcontact.com> wrote:
>>> On 04/05/2015 03:02 PM, Andre Newman wrote:
>>>> The mythweb access control had been messed up at some point and with
>>>> browser saved passwords I’d not noticed! It’s fixed manually now rather
>>>> than through the mythbuntu-control-center and proved working.
>>>>
>>>> I found some bot in my apache logs posting searches and pressing
>>>> buttons...
>>>
>>> Interesting--and disappointing--that it didn't trigger the lockdown
>>> functionality that's supposed to occur when a bot is noticed in your
>>> MythWeb.
>> I didn’t know it did that! Well supposed to anyway, interesting feature.
>>
>> I can post some logs of the bot’s activity if it’s any value?
>>
>> 46.4.32.75 - - [05/Apr/2015:15:23:31 +0100] "GET /mythweb/tv/detail/22020/1428127800 HTTP/1.0" 200 43863 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)"
>>
>> It’s something called MJ12bot, new one to me.
>>
>> Andre
>
> From http://www.majestic12.co.uk/projects/dsearch/mj12bot.php:
>
> "*How can I block MJ12bot?*
> MJ12bot adheres to the robots.txt standard. If you want the bot to
> prevent website from being crawled then add the following text to your
> robots.txt:
>
> User-agent: MJ12bot
> Disallow: /
>
> Please do not waste your time trying to block bot via IP in htaccess - we do
> not use any consecutive IP blocks so your efforts will be in vain. Also
> please make sure the bot can actually retrieve robots.txt itself - if it can't
> then it will assume (this is the industry practice) that its okay to crawl your
> site.
> If you have reason to believe that MJ12bot did NOT obey your robots.txt
> commands, then please let us know via email: bot at majestic12.co.uk.
> Please provide URL to your website and log entries showing bot trying to
> retrieve pages that it was not supposed to."
>
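If you want to check that the quoted robots.txt rule does what it says before
relying on it, here is a rough sketch using Python's standard
urllib.robotparser. The host name mythbox.example and the helper is_allowed are
made up for illustration; only the two rule lines and the /mythweb/ path come
from this thread. Note that Python's parser matches on the short bot name, so
the sketch passes "MJ12bot" rather than the full User-Agent header from the log.

```python
from urllib.robotparser import RobotFileParser

# The two rule lines quoted from the MJ12bot FAQ.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of fetching them over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # mythbox.example is a placeholder host; the path is from the log entry.
    url = "http://mythbox.example/mythweb/tv/detail/22020/1428127800"
    print("MJ12bot allowed:", is_allowed("MJ12bot", url))
    print("Other bot allowed:", is_allowed("SomeOtherBot", url))
```

This only tells you what a well-behaved crawler should conclude from the file.
It does nothing against a bot that ignores robots.txt, and the file still has
to be retrievable at the top of the site for any crawler to see it.
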
The bigger question might be: why is the OP allowing port 80 (or 443?) access
from the internet to his site?

If you need to use mythweb from outside your firewall, a non-standard port number
is always advised, because bots will crawl your site if you don't use one.
Leaving port 80 open is asking for trouble, and not just from bots.

-- 

Mike Perkins


