Why is Google guessing at files that I may have on my web site?
crawl10.google.com has tried to index:
- atom.xml
- rss.xml
- index.rdf
Even better, it doesn't believe the 404 "file not found" it gets back, and tries twice. So what's going on here?
Those files don't exist on my site (the RSS feed is actually at rss.ashx).
I've never created a dead link to those files, and I doubt anyone else has, so I can only assume that Google have updated
their crawlers to look for common RSS feed file names, rather than do the "proper" thing and look for the RSS
autodiscovery <link> tag on the index page, or just stick to following links.
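
For what it's worth, the "proper" approach is not much work. Here's a rough Python sketch of feed autodiscovery: it scans a page for <link rel="alternate"> tags whose type is an RSS or Atom MIME type, and resolves each href against the page URL. The names FeedLinkFinder and find_feeds, the sample page, and the www.example.com address are all made up for illustration; I have no idea how Google's crawler actually works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects the URLs of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute names arrive lowercased; values may be None if missing.
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        is_feed = a.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and a.get("href"):
            # Resolve relative hrefs against the page's own URL.
            self.feeds.append(urljoin(self.base_url, a["href"]))


def find_feeds(html, base_url):
    """Return the feed URLs advertised in the given HTML, in document order."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


# Hypothetical index page, for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Example blog</title>
<link rel="alternate" type="application/rss+xml" title="RSS" href="/rss.ashx">
</head><body></body></html>"""

print(find_feeds(SAMPLE_PAGE, "http://www.example.com/"))
```

On that sample page it finds the feed at rss.ashx, without ever guessing at atom.xml, rss.xml or index.rdf.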
Tut tut. What's next, trying to index common password file names? /porn/ directories? Searching for those pictures you
took of your girlfriend and hid on your web site under a directory you never link to?