Are wildcards allowed in a sitemap.xml file?
I have a website with a directory that contains 100+ HTML files. I want crawlers to crawl all the HTML files in that directory. I have already added the following line to my robots.txt:
Allow: /DirName/*.html$
Is there any way to include the files in the directory in the sitemap.xml file so that all HTML files in the directory get crawled? Something like this:
<url>
<loc>MyWebsiteName/DirName/*.html</loc>
</url>
The sitemap protocol does not define wildcards: each <loc> element must contain the full URL of a single page. To be honest, this is the first time I've heard of the idea. I'm also fairly sure that search engines can't make use of wildcards in sitemaps.
Please take a look at Google's recommended sitemap generators. There are tons of tools you can use to create a sitemap in the blink of an eye.
Wildcards are not allowed. If you run PHP on your server, you could list all the files in the directory and generate sitemap.xml automatically using DirectoryIterator.
// This assumes you already have a Sitemap class (and a Sitemap_URL class).
$sitemap = new Sitemap;
// Iterate over the directory.
foreach (new DirectoryIterator('/MyWebsiteName/DirName') as $directoryItem)
{
    // Skip anything that is not a regular .html file.
    if (!$directoryItem->isFile() || $directoryItem->getExtension() !== 'html') continue;
    // Create a new sitemap entry.
    $url = new Sitemap_URL;
    // Set its properties. Sitemaps need absolute URLs, so prepend the scheme and host if your class does not.
    $url->set_loc(sprintf('/DirName/%1$s', $directoryItem->getBasename()))
        ->set_last_mod($directoryItem->getMTime()) // last-modified time taken from the file itself
        ->set_change_frequency('daily')
        ->set_priority(1);
    // Add it to the sitemap.
    $sitemap->add($url);
}
// Render the output.
$response = $sitemap->render();
// Cache the output for 24 hours ($cache is assumed to be an existing cache object).
$cache->set('sitemap', $response, 86400);
// Output the sitemap.
echo $response;
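If you don't have a sitemap class at hand, the same idea can be sketched with nothing but built-in PHP functions. This is only a minimal sketch under a few assumptions: build_sitemap() is a hypothetical helper name, $baseUrl is assumed to be the public URL that maps to the directory, and https://example.com is a placeholder for your own domain.

```php
<?php
// Hypothetical helper: returns sitemap XML listing every .html file in $dir.
// $baseUrl is assumed to be the public URL prefix that corresponds to $dir.
function build_sitemap(string $dir, string $baseUrl): string
{
    $files = [];
    foreach (new DirectoryIterator($dir) as $item) {
        // Skip directories, dot entries and anything that is not an .html file.
        if (!$item->isFile() || strtolower($item->getExtension()) !== 'html') {
            continue;
        }
        $files[$item->getFilename()] = $item->getMTime();
    }
    ksort($files); // Sort by file name so the output does not depend on filesystem order.

    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($files as $name => $mtime) {
        // One explicit, absolute URL per file; the file name is URL-encoded.
        $loc = rtrim($baseUrl, '/') . '/' . rawurlencode($name);
        $xml .= "  <url>\n";
        $xml .= '    <loc>' . htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES, 'UTF-8') . "</loc>\n";
        $xml .= '    <lastmod>' . gmdate('Y-m-d', $mtime) . "</lastmod>\n";
        $xml .= "  </url>\n";
    }
    $xml .= "</urlset>\n";
    return $xml;
}
```

You could then serve it with something like header('Content-Type: application/xml'); echo build_sitemap(__DIR__ . '/DirName', 'https://example.com/DirName'); and cache the result as in the snippet above. The point is the same either way: the wildcard is expanded on your server, and the sitemap the crawler sees lists each file explicitly.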