Sitemap Protocol: Google just released this today. It’s like robots.txt, except that it shows search engines (well, just Google right now, but others will follow…) how to get to URLs on your site that are not linked from other pages. Actually, you can put all your URLs in this file if you want: a single file can hold up to 50,000 URLs and be up to 10MB uncompressed, and you can gzip it to save bandwidth.
The Sitemap Protocol allows you to inform search engine crawlers about URLs on your Web sites that are available for crawling. A Sitemap consists of a list of URLs and may also contain additional information about those URLs, such as when they were last modified, how frequently they change, etc.
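To make that description concrete, here is a rough sketch of generating such a file in Python. It rests on some assumptions: the element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the sitemaps.org 0.9 schema that the protocol was later standardized under, not from anything quoted in this post. The `build_sitemap` function, the `MAX_URLS` constant, and the example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the later sitemaps.org 0.9 schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Per-file URL limit mentioned in the post.
MAX_URLS = 50000


def build_sitemap(entries):
    """Return sitemap XML (as bytes) for a list of dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and
    "priority" are optional hints about the URL.
    """
    if len(entries) > MAX_URLS:
        raise ValueError("a single sitemap file holds at most 50,000 URLs")

    # Serialize with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")

    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Emit child elements in schema order, skipping absent ones.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}")
                child.text = str(entry[tag])

    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical pages, including one that nothing links to.
    pages = [
        {"loc": "http://example.com/", "lastmod": "2005-06-03",
         "changefreq": "daily", "priority": 1.0},
        {"loc": "http://example.com/unlinked-page.html"},
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

A CMS template would do the same thing: loop over every entry and print one `url` element for each.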
Now the rush comes for content management systems to include the automatic generation of this file as a feature. I predict Movable Type will be first, since it’s just another index template. Someone could probably write the template in a couple of minutes.
See the comments of this post for a discussion about the pros and cons of the “use-it-if-you-find-it” theory behind robots.txt-like files (such as this one).
It's time we extend the robots.txt concept to information about businesses. First, let's take a quick detour into robots.txt for a second — In order to tell a search engine how to spider a Web site (or not), webmasters can stick a text file called "robots.txt" in their root directory with information…
Not a bad guess: Here's the story on Google Sitemap templates in Movable Type.
There's a discussion of when you might want to use a Google Sitemap, along with links to automatic generators for a range of situations, at this search optimization site.
Yes, the Google Sitemap Protocol is an excellent idea. I believe that it is a step toward pushing content to search engines instead of having the search engines pull it.
This, I think, will move toward something more like television.
Take care
Waitman
I have made a very simple explanation and form to create a sitemap. Would love your comments. No PHP or scripting knowledge is required.