As sophisticated as they are, search engines aren’t human; they’re technology. You have to speak their language if you want your website to rank prominently on search engine results pages (SERPs). One way to do that is to create an XML sitemap, which search engines like Google use to understand how your website is structured and which pages they should pay attention to.

What is an XML sitemap and how to create one

XML stands for extensible markup language. It’s a structured, machine-readable format that search engines like Google and Bing can parse, which is why XML sitemaps provide a way to communicate directly with search engines and tell them how your site is structured and the role various pages play.

XML uses tags to label pieces of data so that software can read them reliably. An XML sitemap functions similarly to a table of contents: it lists the pages on your site, from detailed blog posts to a “contact us” page, so a search engine knows they exist and can decide which ones best match a query.

A sitemap signals to search engines which URLs on your site you’d like them to crawl and, ideally, index and show in the SERPs. Like a map or blueprint for your website, Bing and Google XML sitemaps are files that detail elements of your website like:

  • Pages and URLs, including how you prioritize them, when they were created, when they were last updated, and any alternate language or country versions of each page
  • Videos, including category, age-appropriateness rating, and running time
  • Images, including type, subject matter, and license
  • Other files
  • News content and recent updates, including publication dates and article titles
  • Relationships among website elements

When you create a sitemap, you tell search engines like Google which pages and elements on your website are most important. In short, a sitemap is a holistic, single view of all your website pages in a language search engines understand.

Why Google needs XML sitemaps

Since Google reads structured data more reliably than free-form pages, an XML sitemap presents the content of your website in a form Google can process quickly and easily, which can influence your search results. It serves as a guide to your website in a format Google understands and can respond to quickly.

Since Google uses complex algorithms to schedule web crawling, a sitemap doesn’t guarantee that all your pages will be crawled and indexed. Sitemaps don’t replace normal web crawling. But they can help get more of your pages indexed. 

Usually, if you have a relatively small website and your pages are properly linked, Googlebot can discover your content, and you typically don’t need to worry about a sitemap. Your site is likely crawlable without a sitemap if it meets the following criteria.

  • Your website has 500 pages or fewer that you want to be shown in search results.
  • Your pages are properly linked, which means Google can find all the important pages on your website by following links starting from the homepage. Proper linking means all the pages on your website that you think are important can be reached through some form of navigation, such as links you placed on pages, or through your website’s menu.
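
The two criteria above can be checked mechanically if you have a list of your internal links. The following is a minimal sketch, assuming a hypothetical in-memory link graph (`site_links`) and a hypothetical list of pages you consider important; a real check would build that graph from a crawl of your site.

```python
from collections import deque

PAGE_LIMIT = 500  # the page-count threshold quoted above


def find_unreachable_pages(links, homepage, important_pages):
    """Return the important pages that cannot be reached by following links from the homepage."""
    seen = {homepage}
    queue = deque([homepage])
    while queue:  # breadth-first walk over the link graph
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(important_pages) - seen)


# Hypothetical internal-link graph: each page maps to the pages it links to.
site_links = {
    "/": ["/blog/", "/products/"],
    "/blog/": ["/blog/xml-sitemaps/"],
    "/products/": [],
    "/landing/spring-sale/": ["/products/"],  # no page links *to* this one
}
important = ["/", "/blog/xml-sitemaps/", "/products/", "/landing/spring-sale/"]

orphans = find_unreachable_pages(site_links, "/", important)
likely_fine_without_sitemap = len(important) <= PAGE_LIMIT and not orphans

print(orphans)                      # ['/landing/spring-sale/']
print(likely_fine_without_sitemap)  # False
```

Any page that shows up in `orphans` is one Google may not find by following links alone, which is exactly the case where a sitemap helps.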

Another instance where you may not need a sitemap is if you don’t have news pages or rich media files like videos and images.

However, Google states that in most cases your site will benefit from having a sitemap. There’s no disadvantage to having one. Since creating one is relatively simple and quick, it makes sense to do so if you want to rank as high as possible in Google’s SERPs.

In many cases, creating an XML sitemap boosts your website’s SEO performance. That’s because you’re following Google’s recommendations for increasing the odds that crawlers will successfully identify and index your web pages in its search engine.

Why are XML sitemaps important?

We recommend always creating a sitemap to benefit your SEO and search rankings. Google specifically indicates that if you match any of the following criteria, you should have a sitemap.

  • If your website has a deep architecture, or you have more than 500 pages you want to show up in search engines, a sitemap is important. When your site is really large or complex, a sitemap tells Google what to prioritize when crawling your website, like new or recently updated pages.
  • If your pages are isolated or not well linked to each other, a sitemap can help Google find them so they aren’t overlooked. Sitemaps can help if you have orphan pages or haven’t done a lot of internal linking on your site.
  • If your site has a lot of content that frequently changes on existing pages, or you’re frequently adding new pages, like on a news website, a sitemap will help Google discover your content.
  • If your site is new and doesn’t have a lot of external links pointing to it, Google might not discover your pages quickly. A sitemap can help new websites get noticed and indexed in the search engines.
  • If your website has specialized files, like rich media content (images and videos), or if your website is shown in Google News, a sitemap can give Google additional information about that content.

Whatever state your website is in, adding pages to a sitemap can generally help web crawlers discover and index them faster. Sitemaps can provide a needed boost to your website if you want to get more pages indexed and ranked.

How to find an XML sitemap

With the free Conductor SEO Chrome Extension, it’s easy to find your sitemap using the Technical SEO tab and sitemap finder. You can see your sitemap and click on the check or warning icon to get recommendations for your sitemap.

You can also work with your webmaster to find your sitemap. Generally, a sitemap should be located in the root directory. For example, sitemap variations might look like:

http://www.example.com/sitemap_index.xml

http://www.example.com/sitemap.xml  

http://www.example.com/sitemap/

http://www.example.com/sitemap.php

http://www.example.com/sitemap.txt

You can check the robots.txt file for your sitemap, too. This file holds directives for search engine robots, so it’s a natural place to list a sitemap’s location, usually on a line that starts with Sitemap:. Add /robots.txt to your website’s root URL to view the robots.txt file of your website.
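
If you want to automate that lookup, here is a rough sketch in Python. It assumes you have already downloaded the robots.txt text yourself; the function names and `COMMON_SITEMAP_PATHS` are illustrative, not part of any tool mentioned in this article. It reads any Sitemap: lines first and only falls back to the common locations listed above when none are declared.

```python
from urllib.parse import urljoin

# The common sitemap locations listed above.
COMMON_SITEMAP_PATHS = [
    "/sitemap_index.xml",
    "/sitemap.xml",
    "/sitemap/",
    "/sitemap.php",
    "/sitemap.txt",
]


def sitemaps_from_robots(robots_txt):
    """Extract the URLs declared on 'Sitemap:' lines (field name is case-insensitive)."""
    found = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        field, sep, value = line.partition(":")  # split at the first colon only
        if sep and field.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def candidate_sitemap_urls(site_root, robots_txt=""):
    """Prefer sitemaps declared in robots.txt; otherwise guess the common locations."""
    declared = sitemaps_from_robots(robots_txt)
    if declared:
        return declared
    return [urljoin(site_root, path) for path in COMMON_SITEMAP_PATHS]


robots = "User-agent: *\nDisallow: /admin/\nSitemap: http://www.example.com/sitemap_index.xml\n"
print(candidate_sitemap_urls("http://www.example.com", robots))
print(candidate_sitemap_urls("http://www.example.com"))
```

The guessed URLs are only candidates; you would still need to request each one to see whether a sitemap actually lives there.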

If you have access to Google Search Console for your website, you can click on Sitemaps to see if your XML sitemap has been submitted to Google. Similarly, you can find an XML sitemap that’s been submitted to Bing in the Bing Webmaster Tools. Sitemaps may also be located in a website’s subdirectory or on a different domain. Some webmasters may omit the word “sitemap” from the sitemap URL, as well. There’s a free sitemap tool from SEO Site Checkup you can use to locate a sitemap.

How to create an XML sitemap

Google recommends finding a way to automatically generate sitemaps rather than creating them manually. Usually, this will involve running code on your server, so talk with your development team about how to do this.

Ideally, the system running your website will include an automatic sitemap generator for XML sitemap files. If you’re using a content management system (CMS) like Blogger, Wix, Squarespace, or WordPress, it’s likely that your CMS already makes a sitemap available to search engines. If it doesn’t, you can usually add one: for example, you can find a Drupal extension or a WordPress sitemap plug-in if you use one of those platforms. Check the documentation from your provider, as every platform is slightly different.

You can also use DeepCrawl to generate XML Sitemaps for your website pages that have been discovered and crawled.

Top 5 XML sitemap best practices

An XML sitemap allows you to directly communicate with the search engines and highlight the quality of your website. Use the following best practices to ensure your XML sitemap conveys what you want.

Best practice #1: Adhere to XML sitemap requirements

To optimize your XML sitemap, you’ll want to follow the recommended protocol for tags you should include for various search engines. Thankfully, the three biggest global search engines (Google, Microsoft [Bing], and Yahoo, Inc.) sponsor the website Sitemaps.org. This site has a protocol for how to format sitemaps, with sitemap XML example cases that you can reference. 

According to the protocol, XML sitemaps must:

  • Begin with an opening <urlset> tag and end with a closing </urlset> tag. These tags encapsulate the file and reference the current protocol standard.
  • Specify the namespace (protocol standard) within the <urlset> tag.
  • Include a <url> entry for each URL, as a parent XML tag. The remaining tags are children of this tag.
  • Include a <loc> child entry (location tag) for each <url> parent tag.
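
To make those requirements concrete, here is a minimal sketch that builds a sitemap containing only the required tags, using Python’s standard xml.etree.ElementTree module. The function name `build_minimal_sitemap` and the sample URLs are assumptions made for illustration; your CMS or plug-in will normally do this for you.

```python
import xml.etree.ElementTree as ET

# Namespace (protocol standard) that the <urlset> tag must specify.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_minimal_sitemap(urls):
    """Return sitemap XML with one <url> parent and one <loc> child per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = address  # special characters are escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_minimal_sitemap([
    "http://www.example.com/",
    "http://www.example.com/catalog?item=12&desc=vacation",
])
print(sitemap_xml)
```

Note that the ampersand in the second URL is written as &amp; in the output, because URLs in a sitemap have to be entity-escaped like any other XML text.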

A sitemap example may include other tags, as well. These are optional and may not be supported by all search engines. These tags may include:

  • <lastmod>: This indicates the date the page was last modified. It can help search engines understand that you’re the original publisher and communicates freshness to the search engine, which may help give your link new life in the search engine results pages. Beware of updating the date when you haven’t updated the content; if the dates prove unreliable, search engines may stop trusting them.
  • <changefreq>: This tag indicates how frequently the page is likely to change, with valid values including always, hourly, daily, weekly, monthly, yearly, and never. Search engines may use this tag to adjust their crawl frequency.
  • <priority>: This tag indicates the priority of the URL compared to other URLs on your site. The least important is 0.0, while 0.5 is the default priority, and 1.0 is the most important page on the website. You can use any number within the 0.0-1.0 range for each URL.
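
Because these optional tags only accept certain values, it can be worth checking them before they go into a sitemap. The sketch below is one hedged way to do that; `check_optional_tags` is a hypothetical helper, and the date check is an approximation that relies on Python’s ISO 8601 parser rather than a full implementation of the date format the protocol specifies.

```python
from datetime import datetime

# The valid <changefreq> values listed above.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_optional_tags(lastmod=None, changefreq=None, priority=None):
    """Return a list of problems found in the optional values for one <url> entry."""
    problems = []
    if lastmod is not None:
        try:
            datetime.fromisoformat(lastmod)  # accepts e.g. 2005-01-01
        except ValueError:
            problems.append(f"lastmod {lastmod!r} is not an ISO-style date")
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append(f"changefreq {changefreq!r} is not a valid value")
    if priority is not None and not 0.0 <= float(priority) <= 1.0:
        problems.append(f"priority {priority!r} is outside the 0.0-1.0 range")
    return problems


print(check_optional_tags("2005-01-01", "monthly", 0.8))     # []
print(check_optional_tags("01/01/2005", "fortnightly", 1.5))  # three problems
```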

Remember, if possible, it’s best to dynamically generate an XML sitemap so you don’t have to keep it manually updated. There are plenty of free tools and plugins you can use to automatically generate your sitemap. This will help search engines find your new pages more quickly.

If you do maintain your XML sitemap by hand, how often you update it depends on how often your website changes. Two to four times a year is generally enough, but if you’re constantly creating new pages for your site, update the XML sitemap at least once a month.

Best practice #2: Omit URLs you don’t want indexed

Ensure your XML sitemaps contain the absolute, canonical URL of each page. Since you want to optimize web page crawling and prioritize the highest-quality pages on your website in an XML sitemap, there are certain pages you’ll want to avoid including in XML sitemaps. These include URLs that are:

[Image: XML sitemap URL omissions]

In general, quality is the most important factor that influences Google search engine rankings. That applies to what you should focus on when creating an XML sitemap, as well, so omit low-quality web pages from your sitemap.

Even pages that are essential for your website, like a customer login page, may not be necessary for your sitemap. You might exclude utility pages like a contact us page and a privacy policy page. You should focus on the pages you want a search engine to focus on when creating a sitemap, so limit the sitemap to only SEO-relevant pages.
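
As a rough illustration of that filtering step, the sketch below drops utility pages and non-canonical URLs before a sitemap is built. Everything specific here is an assumption: the excluded paths are hypothetical examples based on the pages named above, and the page inventory (a list of dicts with "url" and "canonical" keys) stands in for whatever your CMS or crawler provides.

```python
from urllib.parse import urlsplit

# Hypothetical paths for the utility pages mentioned above; adjust to your own site.
EXCLUDED_PATH_PREFIXES = ("/login", "/contact", "/privacy-policy")


def sitemap_worthy(pages):
    """Keep only canonical URLs that are not utility pages."""
    keep = []
    for page in pages:
        path = urlsplit(page["url"]).path
        if path.startswith(EXCLUDED_PATH_PREFIXES):
            continue  # utility page: essential for the site, not for the sitemap
        if page["canonical"] != page["url"]:
            continue  # only the canonical version belongs in the sitemap
        keep.append(page["url"])
    return keep


pages = [
    {"url": "https://www.example.com/blog/xml-sitemaps/",
     "canonical": "https://www.example.com/blog/xml-sitemaps/"},
    {"url": "https://www.example.com/blog/xml-sitemaps/?utm_source=mail",
     "canonical": "https://www.example.com/blog/xml-sitemaps/"},
    {"url": "https://www.example.com/login",
     "canonical": "https://www.example.com/login"},
    {"url": "https://www.example.com/privacy-policy/",
     "canonical": "https://www.example.com/privacy-policy/"},
]
print(sitemap_worthy(pages))  # ['https://www.example.com/blog/xml-sitemaps/']
```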

Best practice #3: Use sitemap index files when necessary

There are limits to the number of URLs you can have in an XML sitemap and limits to the maximum size a sitemap file can be. XML sitemaps accommodate a maximum of 50,000 URLs and limit the uncompressed file size to 50MB. Some plugins will limit a sitemap to even fewer URLs so your sitemap continually loads as quickly as possible. 

If a sitemap exceeds the limits, you need to split it into several smaller sitemap files and list them all in a sitemap index file. If you have an exceptionally large website, like an eCommerce site, you may choose to split up your large sitemaps and group them under one or more sitemap index files to make your site easier for Google to crawl and index. If this applies to you, you’ll likely want to create:

  • Main website XML sitemap
  • Sitemaps for each subdomain or section, such as products or articles
  • Blog sitemap
  • Image sitemap
  • Video sitemap

Depending on the tools you use, and how big your site is, you may need several sitemap index files or only one simple XML sitemap.
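
Here is a small sketch of how that splitting might work, assuming Python and its standard xml.etree.ElementTree module. The file names (sitemap-1.xml and so on) are hypothetical, and the sketch only enforces the 50,000-URL limit; it does not check the 50MB file-size limit, which you would need to verify separately.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # URL limit per sitemap file


def split_into_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Split a long URL list into chunks that each fit in one sitemap file."""
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index that lists the location of every sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for address in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = address
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# 120,000 made-up product URLs need three sitemap files.
all_urls = [f"https://www.example.com/products/{n}" for n in range(120_000)]
chunks = split_into_sitemaps(all_urls)
index_xml = build_sitemap_index(
    [f"https://www.example.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)]
)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would then be written out as its own sitemap file, and only the index file needs to be submitted to the search engines.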

Best practice #4: Add recently updated URLs to feeds

We recommend that all RSS/Atom feeds contain the latest additions of URLs or any recently updated URLs. That’s because, in addition to an XML sitemap, you can submit those feeds’ URLs as a sitemap to Google, which can help you increase the likelihood your links will get indexed and rank.

Google accepts RSS 2.0 and Atom 1.0 feeds and recommends using both RSS/Atom feeds and XML sitemaps to help the search engine understand the pages of your website to index. To provide Google information on your site’s video content, you can use a media RSS (mRSS) feed.
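
For a sense of what such a feed contains, here is a minimal sketch that builds an Atom 1.0 feed of recently updated pages with Python’s standard library. The function name, the feed title, the author name, and the sample entries are all made up for illustration; in practice your CMS or blogging platform usually produces the feed for you.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_url, title, author_name, entries):
    """entries: (page_url, page_title, updated) tuples.

    Assumes every 'updated' value uses the same UTC format (e.g. 2024-05-01T09:00:00Z),
    so the newest timestamp can be found with a plain string comparison.
    """
    feed = ET.Element("feed", xmlns=ATOM_NS)
    ET.SubElement(feed, "title").text = title
    ET.SubElement(feed, "id").text = feed_url
    ET.SubElement(feed, "updated").text = max(updated for _, _, updated in entries)
    ET.SubElement(ET.SubElement(feed, "author"), "name").text = author_name
    for page_url, page_title, updated in entries:
        entry = ET.SubElement(feed, "entry")
        ET.SubElement(entry, "title").text = page_title
        ET.SubElement(entry, "id").text = page_url
        ET.SubElement(entry, "link", href=page_url)
        ET.SubElement(entry, "updated").text = updated
    body = ET.tostring(feed, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


feed_xml = build_atom_feed(
    "https://www.example.com/feed.atom",
    "Example.com recent updates",
    "Example Team",
    [
        ("https://www.example.com/blog/xml-sitemaps/", "XML sitemaps", "2024-05-01T09:00:00Z"),
        ("https://www.example.com/blog/robots-txt/", "Robots.txt basics", "2024-05-03T14:30:00Z"),
    ],
)
print(feed_xml)
```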

Best practice #5: Use sitemap data to improve your website

Finally, you’ll want to submit your sitemap to search engine tools like Google Search Console, where you can test or update your XML sitemap, and Bing Webmaster Tools. This will help your URLs get discovered faster and appear in SERPs sooner. Plus, it will help you identify and fix issues that are inhibiting your URLs from being indexed by search engines.

For example, you may discover that you have 5,000 pages on your website, but only 3,000 are being indexed due to problems like duplicate content. Ideally, every page you include in a sitemap will be indexed on Google. When you have URLs that aren’t ranking, you can use the sitemap data to identify those pages and eliminate their errors to help them have a better chance of ranking.
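
One simple way to find those gaps is to compare the URLs in your sitemap with a list of URLs you know are indexed. The sketch below assumes you already have that list from somewhere, such as an export from your search engine tools; `indexed_urls` and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_in_sitemap(xml_text):
    """Return every <loc> value listed in a sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


def not_indexed(xml_text, indexed_urls):
    """Return sitemap URLs that are missing from the list of indexed URLs."""
    indexed = set(indexed_urls)
    return [url for url in urls_in_sitemap(xml_text) if url not in indexed]


sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/blog/xml-sitemaps/</loc></url>
  <url><loc>https://www.example.com/blog/duplicate-post/</loc></url>
</urlset>"""

indexed_urls = [
    "https://www.example.com/",
    "https://www.example.com/blog/xml-sitemaps/",
]
print(not_indexed(sample_sitemap, indexed_urls))
# ['https://www.example.com/blog/duplicate-post/']
```

The URLs this returns are the ones to investigate for problems like duplicate content.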

Example of an XML sitemap

Sitemaps.org has a sitemap example that’s a single URL using all optional tags. It looks like this:

[Image: example of an XML sitemap]
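
The Sitemaps.org example is a single <url> entry that carries a <loc> tag plus the three optional tags. As a sketch, the Python below builds a sitemap with that same structure using the standard xml.etree.ElementTree module. The values are meant to mirror that example, but treat them as illustrative and check Sitemaps.org for the authoritative version. The ET.indent call requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
url = ET.SubElement(urlset, "url")
ET.SubElement(url, "loc").text = "http://www.example.com/"   # required
ET.SubElement(url, "lastmod").text = "2005-01-01"            # optional
ET.SubElement(url, "changefreq").text = "monthly"            # optional
ET.SubElement(url, "priority").text = "0.8"                  # optional

ET.indent(urlset, space="   ")  # pretty-print (Python 3.9+)
example_xml = '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(urlset, encoding="unicode")
print(example_xml)

# Prints:
# <?xml version="1.0" encoding="UTF-8"?>
# <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
#    <url>
#       <loc>http://www.example.com/</loc>
#       <lastmod>2005-01-01</lastmod>
#       <changefreq>monthly</changefreq>
#       <priority>0.8</priority>
#    </url>
# </urlset>
```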

How to submit a sitemap XML file to Google

Google provides guidelines on how to build and submit an XML sitemap. Once you’ve created your sitemap, you can add it to your robots.txt file to signal Google that it’s ready to be crawled. You can directly submit it to Google Search Console by opening the sitemaps report and submitting the URL, as well, but you will need owner permission for a property in order to submit the URL.
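
Adding a sitemap to robots.txt comes down to adding one Sitemap: line with the sitemap’s full URL. Here is a small sketch of that step, assuming you manage the robots.txt text yourself; `add_sitemap_line` is a hypothetical helper, and the sitemap URL is an example.

```python
def add_sitemap_line(robots_txt, sitemap_url):
    """Append a 'Sitemap:' line to robots.txt content unless it is already there."""
    directive = f"Sitemap: {sitemap_url}"
    if any(line.strip() == directive for line in robots_txt.splitlines()):
        return robots_txt  # already declared, nothing to do
    body = robots_txt.rstrip("\n")
    separator = "\n\n" if body else ""
    return f"{body}{separator}{directive}\n"


robots = "User-agent: *\nDisallow: /admin/\n"
updated = add_sitemap_line(robots, "https://www.example.com/sitemap.xml")
print(updated)
```

Running the function a second time on its own output leaves the file unchanged, so it is safe to call from a deployment script.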

In Google Search Console, you’ll be able to view the following information for each sitemap you’ve submitted:

  • The sitemap URL
  • The type or format of the sitemap
  • The last submission date
  • The date it was last read by Google
  • The crawl status (success, couldn’t fetch, errors, etc.)
  • The number of URLs discovered in the sitemap

If you have errors in the Status column for your sitemap in Google Search Console, you can view details about the errors and get recommendations for how to fix them.

If a sitemap you’ve submitted is no longer relevant, you can delete it from Google Search Console. To make Google “forget” a sitemap you no longer want the search engine to crawl, remove it from your site and make sure the former sitemap URL returns a 404.

How Conductor helps with XML sitemaps

Conductor can help you optimize your website and your sitemap so that more pages get indexed and rank higher on search engines like Google. You can set up sitemap audits using the DeepCrawl website integration, which enables you to test new URLs, sitemaps, and parameters on a staging site so you’re successful when your website goes live. Once your website is live, it also:

  • Checks redirects
  • Identifies thin content, canonicals, links, and more on your website
  • Ensures your website is properly tagged for each geographic location so search engines show results to users in the correct language
  • Shows you why you’ve been penalized and helps you prioritize website fixes by impact and urgency

Using Conductor, you can check the health and quality of your website so that you maintain a high-quality website that search engines like Google will want to feature.

Download Conductor’s Chrome Extension to see if a page on your site is included in your sitemap. You can also find the sitemap URL of any page using the Technical SEO tab.

Get a free consultation from Conductor to see how our SEO platform and technology can help you.
