Microsoft Bing will soon rely much more heavily on the date you specify in the lastmod field of your XML sitemap when deciding what to crawl. Fabrice Canel, Principal Product Manager at Microsoft Bing, said that starting in June 2023, Bing will revamp its “crawl scheduling stack to better utilize the information provided by the ‘lastmod’ tag in sitemaps.”
What is lastmod. The lastmod field gives the date a page was last modified, as specified in your XML sitemap file. It is not necessarily the date the URL was created, but the date the page’s content last changed.
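To make the field concrete, here is a minimal Python sketch that writes a sitemap with a per-URL lastmod value. The `build_sitemap` helper and the example URLs are hypothetical, and the sketch assumes you already know each page's real modification time (for instance from your CMS) as a timezone-aware datetime.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for (url, last_modified) pairs.

    Hypothetical helper: last_modified must be a timezone-aware datetime
    holding when the page content changed, not when the sitemap was built.
    """
    ET.register_namespace("", SITEMAP_NS)  # emit a default xmlns, no prefix
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, modified in pages:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        # W3C Datetime style, e.g. 2023-05-02T09:30:00+00:00
        stamp = modified.replace(microsecond=0).isoformat()
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = stamp
    return ET.tostring(urlset, encoding="unicode")


# Example pages with different modification times (made-up values).
pages = [
    ("https://example.com/", datetime(2023, 5, 2, 9, 30, tzinfo=timezone.utc)),
    ("https://example.com/about", datetime(2022, 11, 14, 16, 0, tzinfo=timezone.utc)),
]
sitemap_xml = build_sitemap(pages)
```

Each `<url>` entry carries its own `<lastmod>`, so a page that has not changed since November keeps its November date even if the sitemap file is regenerated today.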
Bing says lastmod is critical. Microsoft Bing said the lastmod date is “one of the most critical tags that you can include in your sitemap.” It helps Bing crawl more efficiently today and will help Bing crawl a lot more in the future, the company said. “This will enhance our crawl efficiency by reducing unnecessary crawling of unchanged content and prioritizing recently updated content,” Fabrice Canel wrote.
Again, the lastmod crawl changes should be fully live by June 2023.
How Bing uses lastmod. Bing wrote, “the ‘lastmod’ tag is used to indicate the last time the web pages linked by the sitemaps were modified. This information is used by search engines to determine how frequently to crawl your site, and to decide which pages to index and which to leave out. The inclusion of the ‘lastmod’ tag in your sitemap is crucial as it allows search engines to easily determine when a page was last updated. Without it, search engines may delay crawling updated content or may over-crawl your website as they cannot accurately determine if the content has been modified.”
Lastmod usage statistics. Microsoft Bing also conducted a study showing how the lastmod field is used across the web in XML sitemap files. Here are the highlights of that data; note that it is based on Bing’s crawl data.
- 58% of hosts have at least one XML sitemap.
- 84% of these sitemaps have a lastmod attribute set.
- Of the sitemaps with lastmod set, 79% have the values set correctly.
- 18% have lastmod values that are not set correctly.
- 3% have lastmod values for only some of the URLs.
- 16% of these sitemaps don’t have a lastmod attribute set.
- 42% of hosts don’t have an XML sitemap.
The biggest issue Bing sees is that the lastmod dates in an XML sitemap are identical for all the URLs listed in the file. In other words, Bing is noticing that the lastmod field is being set to the date the sitemap was generated, rather than the date the content was modified. Identical dates are of course possible, but highly unlikely for every URL listed in the XML sitemap file.
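One way to check your own sitemap for this pattern is sketched below. The `lastmod_report` function is a hypothetical helper, not anything Bing provides; it only counts how many URLs carry a lastmod value and whether those values are all the same, which is a hint (not proof) that the sitemap generation date was used.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_report(sitemap_xml):
    """Summarize lastmod usage in one urlset sitemap (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    urls = root.findall("sm:url", NS)
    values = [u.findtext("sm:lastmod", default=None, namespaces=NS) for u in urls]
    # Keep only non-empty lastmod values.
    present = [v.strip() for v in values if v and v.strip()]
    distinct = set(present)
    return {
        "urls": len(urls),
        "with_lastmod": len(present),
        "distinct_lastmod": len(distinct),
        # Several URLs sharing a single value suggests a generation timestamp.
        "all_identical": len(present) > 1 and len(distinct) == 1,
    }


# Made-up sample: two URLs share one date, a third has no lastmod at all.
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/a</loc><lastmod>2023-05-01</lastmod></url>"
    "<url><loc>https://example.com/b</loc><lastmod>2023-05-01</lastmod></url>"
    "<url><loc>https://example.com/c</loc></url>"
    "</urlset>"
)
report = lastmod_report(sample)
```

For the sample above, the report flags `all_identical` and also shows that one URL has no lastmod, which corresponds to the "only some of the URLs" case in Bing's statistics.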
changefreq field. Bing also mostly ignores the changefreq field in XML sitemaps, as Google does. Google has said in the past that it ignores the lastmod date in XML sitemap files, but later said it does read the value, although it may not trust it fully. The current documentation says Google “uses the value if it’s consistently and verifiably (for example by comparing to the last modification of the page) accurate.”
IndexNow. This advice does not replace Microsoft’s IndexNow initiative. Fabrice Canel wrote that it is “highly recommended to adopt IndexNow to instantly inform search engines about latest content changes on your websites.”
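As a rough illustration of what an IndexNow notification looks like, the sketch below builds (but does not send) a JSON POST request using the field names from the public IndexNow protocol. The `build_indexnow_request` helper, the host, and the key are placeholders, and the sketch assumes your key file is hosted at the root of your site as `<key>.txt`.

```python
import json
import urllib.request

# Shared IndexNow endpoint; participating search engines also expose their own.
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build a POST request announcing changed URLs (hypothetical helper)."""
    payload = {
        "host": host,
        "key": key,
        # Assumption: the key file lives at the site root as <key>.txt.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder host and key; replace with your own before sending.
request = build_indexnow_request(
    "example.com",
    "0123456789abcdef0123456789abcdef",
    ["https://example.com/updated-page"],
)
# To actually submit, you would call urllib.request.urlopen(request).
```

Building the request separately from sending it keeps the example safe to run and makes the payload easy to inspect before anything is submitted.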
Why we care. If you have not taken the lastmod date in your XML sitemap file seriously, you probably should now. Doing so should improve the crawl efficiency of search engines, especially Bing. That should improve crawl speed, potentially help indexing and ranking, and reduce the load on your server resources.