If you care about where your web project appears in the search engine results pages (SERPs), you will know how many different factors influence the fight for the top places. Google's ranking, for example, is said to depend on over 200 criteria, some of which have been officially confirmed by the company, while others have only been inferred by experts. It's no secret that search engine optimisation has become standard practice for every webmaster who wants their website to be visible and accessible. While factors such as relevant keywords, high-quality content, and mobile-friendliness are well known, the value of a good XML sitemap is often underestimated.

What is an XML sitemap?

An XML sitemap (sitemap.xml) is a text file in XML (Extensible Markup Language) format that lists all of a website's subpages as links. It can be submitted to Google Search Console or Bing Webmaster Tools to notify search engine crawlers of all available and relevant pages, which speeds up and optimises the indexing process. XML sitemaps must meet the requirements of the sitemap protocol, which Google, Yahoo, and Microsoft agreed on as a standard in 2006 with the aim of improving the quality of search results in the long term. Among other things, the protocol requires UTF-8 encoding, XML as the markup language, and the use of entity codes for certain characters (such as “&gt;” instead of “>”).
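
The entity requirement can be illustrated with a minimal Python sketch. It assumes the standard library helper xml.sax.saxutils.escape; the function name escape_sitemap_url and the query-string URL are made up for illustration and do not come from any sitemap tool.

```python
from xml.sax.saxutils import escape

def escape_sitemap_url(url: str) -> str:
    """Replace &, <, >, ' and " with their XML entity codes."""
    # escape() handles &, < and > by default; the two quote
    # characters are passed in as additional replacements.
    return escape(url, {"'": "&apos;", '"': "&quot;"})

# Hypothetical URL with a query string containing an ampersand.
print(escape_sitemap_url("http://one-test.website/search?q=a&page=2"))
```

Libraries that write XML for you (such as xml.etree.ElementTree) normally apply this escaping automatically, so a helper like this is mainly relevant when the file is assembled from plain strings.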

Note

XML sitemaps are different from the sitemaps that many CMS automatically display in the frontend. The latter are tables of the website's contents that are intended to make navigation easier for visitors. XML sitemaps, by contrast, are not visible to users by default, even though it is technically possible to make them accessible via a URL.

The advantages of an XML sitemap

Even though there is no guarantee that using an XML sitemap will improve indexing by Google and other search engines, these structured link directories increase the chances. The crawler-friendly table of contents pays off especially for sites with dynamic content that changes constantly. The same applies to larger web projects that have many subpages but not (yet) a strong backlink structure. Sites like these tend to be crawled too irregularly for changes to be noticed, or are not picked up by the search engines at all. With a sitemap.xml, you can help them get noticed by indexing bots more quickly.

An additional advantage: as well as listing the URLs of subpages, XML sitemaps can also list media files such as videos or images. For these, there are even extra tags that tell the crawler what content type is being used (e.g. <image:image>, <video:video>). In addition, further tags can describe the content in more detail or specify the duration, so that search engines can identify it as accurately as possible. There is also a special version of the XML sitemap for news portals, which is intended to help articles get indexed well thanks to specific details such as genre, publication date, or title.

Tip

Creating an XML sitemap manually just to give your website a structural directory takes considerable effort, which can be seen as a disadvantage. Thanks to XML sitemap generators such as the online generator from XML-Sitemaps.com, there is no need to write the XML files yourself. In addition, plugins that create XML sitemaps automatically are available for most content management systems.

Structure of an XML sitemap: the most important com­pon­ents

The format­ting of an XML sitemap works with XML tags, just like every document in the ex­tens­ible markup language. According to the current standard “Sitemaps 0.9,” three tags are required for it to be con­sidered an XML sitemap.

sitemap.xml: com­puls­ory tags
<urlset>, </urlset> Each sitemap XML file must begin with an opening <urlset> tag and end with a closing </urlset> tag. The tag encloses all entries in the file and references the current protocol standard.
<url>, </url> The opening and closing <url> tags are the parent element of each individual URL entry and mark the beginning and end of a listed subpage.
<loc>, </loc> The <loc> tag contains the URL of an individual page of the web project. The URL must always begin with the protocol (e.g. “http”) and end with a trailing slash (if required by the web server). A maximum length of 2,048 characters is also defined.

Apart from these mandatory tags, the sitemap protocol provides three optional tags for describing the individual URL entries in more detail: <priority>, <lastmod>, and <changefreq>. However, the extent to which these optional tags are supported depends on the respective search engine. For example, the Google crawler primarily uses the <lastmod> value, while it largely ignores the other two tags or gives them only minimal weight in the crawling process.

sitemap.xml: optional tags
<lastmod>, </lastmod> The <lastmod> tag specifies the date (in W3C datetime format, e.g. YYYY-MM-DD) of the page's last modification. The tag is independent of the If-Modified-Since request header, to which a web server can respond with HTTP status 304 (Not Modified).
<changefreq>, </changefreq> The <changefreq> tag provides the crawler with general information on how often a page is expected to be updated (hourly, daily, monthly, and so on). Documents that are modified every time they are accessed are marked with the value “always,” and archived URLs are marked with “never.”
<priority>, </priority> This tag enables a URL’s priority within an entire web project to be expressed on a scale of 0.0 to 1.0 (default priority: 0.5). This way, crawlers can be made aware of pages whose indexing is par­tic­u­larly important.
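
The mandatory and optional tags described above can be put together programmatically. The following is a minimal sketch, assuming Python 3.8 or later and the standard library module xml.etree.ElementTree; the function name build_sitemap and the dictionary layout of the entries are assumptions made for illustration, not part of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML (bytes) for dicts with 'loc' and optional tags."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is mandatory; the other three tags are only written
        # when the entry actually provides a value for them.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

sitemap = build_sitemap([
    {"loc": "http://one-test.website/", "lastmod": "2018-01-01",
     "changefreq": "monthly", "priority": 1.0},
    {"loc": "http://one-test.website/page1/"},
])
print(sitemap.decode("utf-8"))
```

ElementTree escapes special characters in the URL text on its own, so no manual entity handling is needed in this variant.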

Since an XML sitemap file may contain a maximum of 50,000 URLs and may not be larger than 50 MB, larger web projects can distribute their URL collection across several files. In this case, each sitemap file should be listed in an additional index file whose structure is similar to that of the sitemap files, except that the tags <sitemapindex> and <sitemap> are used instead of <urlset> and <url>.
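
As a rough sketch of how such a split could look in Python: the URL list is cut into chunks of at most 50,000 entries, and an index file then points to one sitemap file per chunk. The helper names chunk_urls and build_sitemap_index as well as the file names sitemap-1.xml, sitemap-2.xml, and so on are hypothetical choices for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # URL limit per sitemap file

def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_sitemap_index(sitemap_urls):
    """Return the XML (bytes) of an index file listing the sitemap URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)

# Hypothetical project with 120,001 pages, which needs three sitemap files.
pages = [f"http://one-test.website/p{i}/" for i in range(120_001)]
chunks = chunk_urls(pages)
index_xml = build_sitemap_index(
    [f"http://one-test.website/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
print(len(chunks))
```

Only the index file then needs to be submitted to the search engine; the individual sitemap files are discovered through it.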

Note

It is possible to compress sitemap files (e.g. with gzip), but only to reduce bandwidth re­quire­ments. The maximum size of an XML sitemap cannot be increased this way, as the limit always applies to the unpacked version of the file.
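
A minimal sketch of this in Python, assuming the standard library gzip module: the size check is applied to the unpacked data before compression. The function name compress_sitemap is an assumption for illustration.

```python
import gzip

# 50 MB limit, measured on the uncompressed file.
MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024

def compress_sitemap(xml_bytes: bytes) -> bytes:
    """Gzip sitemap bytes after checking the limit on the unpacked data."""
    if len(xml_bytes) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("sitemap exceeds 50 MB uncompressed; split it first")
    return gzip.compress(xml_bytes)

# Tiny placeholder document, used only to show the round trip.
packed = compress_sitemap(b"<urlset></urlset>")
print(gzip.decompress(packed))
```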

XML sitemap example

The easiest way to un­der­stand the structure of an XML sitemap is to use a concrete example:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
	<url>
		<loc>http://one-test.website/</loc>
		<lastmod>2018-01-01</lastmod>
		<changefreq>monthly</changefreq>
		<priority>1.0</priority>
	</url>
	<url>
		<loc>http://one-test.website/page1/</loc>
		<lastmod>2018-03-05</lastmod>
		<changefreq>weekly</changefreq>
		<priority>0.5</priority>
	</url>
	<url>
		<loc>http://one-test.website/page2/</loc>
		<lastmod>2018-03-08</lastmod>
		<changefreq>weekly</changefreq>
		<priority>0.3</priority>
	</url>
</urlset>

In this case, the example XML sitemap lists the main URL one-test.website and the URLs of two subpages (page1 and page2). Search engine crawlers can see from the document that the webmaster has given the main page the highest priority and that it is modified approximately once a month. The last adjustment was made on 1st January, 2018. Page1 has the default priority value (0.5), but unlike the main page, it is expected to be adjusted weekly, with the last modification having taken place on 5th March, 2018. If the crawler takes the <priority> tag into account, it knows that it must pay the least attention to page2 during indexing (<priority> value: 0.3). This subpage is also modified weekly (last modified on 8th March, 2018).
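
How a program might read these values can be sketched in Python with xml.etree.ElementTree. This is only an illustration of parsing the format, not a description of how any search engine crawler actually works; the function name read_entries is hypothetical, and the embedded XML is a shortened, well-formed copy of the example containing only the main page and page2.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the Sitemaps 0.9 schema.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Shortened copy of the example sitemap (main page and page2 only).
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://one-test.website/</loc>
    <lastmod>2018-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://one-test.website/page2/</loc>
    <lastmod>2018-03-08</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.3</priority>
  </url>
</urlset>"""

def read_entries(xml_text):
    """Return one dict per <url> entry; a missing priority defaults to 0.5."""
    entries = []
    for url in ET.fromstring(xml_text).findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": float(url.findtext("sm:priority", default="0.5", namespaces=NS)),
        })
    return entries

# List the entries from highest to lowest priority.
for entry in sorted(read_entries(EXAMPLE), key=lambda e: e["priority"], reverse=True):
    print(entry["loc"], entry["priority"], entry["lastmod"])
```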

Creating and sub­mit­ting an XML sitemap – how it works

Given the considerable amount of work involved in creating XML sitemaps manually, using plugins or online tools is a good idea, provided that you use them correctly. Reasonable XML sitemaps can be generated without any specific configuration, but the directory will only take the desired form once the appropriate individual settings have been made. As examples, we present the options offered by the online generator from XML-Sitemaps.com and the WordPress plugin Google XML Sitemaps for creating and integrating XML sitemaps.

How to generate XML sitemaps using the online generator from XML-Sitemaps.com

The online generator from XML-Sitemaps.com offers users a convenient solution for creating their own XML sitemaps. The web service is free for web projects with up to 500 subpages; sitemaps for larger projects can also be created, but this requires a paid Pro subscription. The procedure is very simple: after opening the web application, enter the URL of your website in the address field provided.

Use the “More options” button to determine whether, and to what extent, sitemap entries should be annotated with the <lastmod>, <priority>, or <changefreq> tags. The first two can be switched on or off, and for the last you can set the desired update frequency (hourly, daily, weekly, etc.) if you want to make use of this labelling option. Otherwise, simply keep the default setting: “Do not specify.”

By clicking on “START” you will begin the gen­er­a­tion process, the duration of which depends on the size of your web project. Once the process is complete, you can display the result under “VIEW SITE MAP” -> “VIEW FULL XML SITEMAP.”

Download the generated XML sitemap file and upload it to your website's root directory. To inform the Google crawler about the file, for example, simply submit it in Google Search Console. Alternatively, you can specify the sitemap's location on a line of its own anywhere in the robots.txt file:

Sitemap: http://one-test.website/sitemap.xml
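
To check that a robots.txt file really announces the sitemap, a small Python sketch can help. It assumes Python 3.8 or later, where urllib.robotparser.RobotFileParser provides the site_maps() method; the function name sitemaps_in_robots and the sample robots.txt content are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

def sitemaps_in_robots(robots_txt: str):
    """Return all sitemap URLs announced in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []

# Hypothetical robots.txt content for the example domain.
ROBOTS_TXT = """User-agent: *
Disallow:

Sitemap: http://one-test.website/sitemap.xml
"""

print(sitemaps_in_robots(ROBOTS_TXT))
```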

Google XML Sitemaps: how to create XML sitemaps with the WordPress plugin

For over a decade, the WordPress plugin Google XML Sitemaps, developed by Arne Brachhold, has made creating XML sitemaps child's play. To use the popular plugin (over 2 million active installations worldwide) for your WordPress website, you first have to install it via the content management system's plugin center. Select the menu item “Plugins” and then “Install” and enter “Google XML Sitemaps” into the search field. The extension should appear at the top of the results; click “Install now” to start the installation.

You can also download Google XML Sitemaps manually and place it in the plugin directory of your WordPress installation. Once you have activated the extension, you can access it directly in WordPress via “XML Sitemap” in the “Settings” menu. Compared to XML-Sitemaps.com, a significantly larger number of configuration options is available, divided into the following seven areas:

  • Basic options: here you define the basic settings and determine, for example, whether Google and Bing should be informed auto­mat­ic­ally about changes or whether the sitemap should be auto­mat­ic­ally com­pressed
  • Ad­di­tion­al pages: here you can add files or URLs that do not belong to the WordPress project but run on the same domain
  • Post priority: adjustments in this menu are particularly interesting for blogs and news portals – if you work with the <priority> tag for your sitemap, you can define here whether and how the plugin should calculate the priority of a post
  • Sitemap content: use this menu to select the cat­egor­ies of pages to be included in the XML sitemap (e.g. homepage, static pages, archive pages, etc.)
  • Excluded items: if you want to exclude cat­egor­ies or in­di­vidu­al posts from being indexed, you can do so here
  • Change frequencies: Google XML Sitemaps offers the possibility of presetting the <changefreq> tag, and the update frequency can even be set separately for the different page types
  • Priorities: here you can preset the <priority> tag in the same way, separately for the different page types

Once you have designed the XML sitemap setup according to your wishes, save the changes using the cor­res­pond­ing button. By clicking on the link “Your sitemap” after saving, you transmit your XML sitemap to the selected search engine crawlers.
