[mythtv-users] After 7y5m running, time to start from scratch after update to .27?

Tue Dec 3 01:56:27 UTC 2013

On 11/28/2013 08:47 AM, Stephen P. Villano wrote:
>
> On 11/28/13, 7:47 AM, Paul Clark wrote:
>>
>> Just an idea (as someone who had this happen to them) but shouldn't a 
>> dev add a directive to mythweb to tell the bots not to crawl it?  Not 
>> sure how these things work but there's a page here 
>> <http://en.wikipedia.org/wiki/Robots_exclusion_standard> that seems 
>> to explain it.
>>
> The robots.txt file would stop a "law abiding" service like Google, it 
> wouldn't stop non-compliant spiders, malware or nefarious individuals.

And, of course, a robots.txt file in MythWeb's directory structure would 
only help if MythWeb is the root application (i.e. the only application) 
on the server, since crawlers only request /robots.txt from the top of 
the site.  I'm pretty sure this--and the fact that no one in the 
entire world should even consider running MythWeb on an Internet-facing 
server without authentication enabled (as mentioned by Stephen, 
above)--is why it was never included.
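
To illustrate the root-only point, here is a rough Python sketch using 
the standard library's urllib.robotparser.  The host example.com, the 
/mythweb/ path, and the bot name are made up for illustration; none of 
this is MythWeb code.  It only shows where a compliant crawler looks for 
robots.txt and how a rule placed there by the server admin would apply.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url):
    # Compliant crawlers fetch robots.txt from the top of the host,
    # no matter how deep the page they want to visit is.
    return urljoin(page_url, "/robots.txt")


# Hypothetical MythWeb page living in a subdirectory of the server.
page = "http://example.com/mythweb/tv/recorded"
print(robots_url_for(page))

# Rules the server admin (not MythWeb) would have to put in the
# top-level robots.txt to keep compliant bots out of /mythweb/.
rules = """\
User-agent: *
Disallow: /mythweb/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A compliant bot would skip the MythWeb page but still crawl the rest.
print(parser.can_fetch("ExampleBot", page))
print(parser.can_fetch("ExampleBot", "http://example.com/other/"))
```

A robots.txt shipped at /mythweb/robots.txt would never be requested in 
this setup, and, as Stephen said, even the top-level one does nothing 
against crawlers that ignore the convention.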

That said, there is a lockdown feature in MythWeb that will forcibly 
block any detected crawler /and/ disable MythWeb completely until the 
owner resets it (in theory, after fixing the broken configuration that 
didn't include authentication).  TTBOMK, this has never, ever, been 
triggered.  I have never once seen anyone ask how to reset MythWeb, and 
I doubt anyone (panicking about why his MythWeb suddenly stopped 
working) would have found the discussion of the lockdown feature in the 
README on his own.  I'm guessing that no one has been crawled since it 
was added in June 2008.

Mike