A sitemap is a map of a website: it portrays the structure of a web presence, including all directories and subpages. Sitemaps are not necessarily intended for visitors, though, who usually have a much clearer navigation tool at their disposal. This doesn’t mean that website operators can neglect their sitemap. But what do you need this overview for, and what variations of sitemaps are there?

What is a sitemap?

A sitemap contains all documents (in other words, webpages) of a website and presents them hierarchically. This means that the structure of the entire web presence is mirrored in this overview. To understand it, you should briefly familiarise yourself with the set-up of a website: a basic website consists of individual HTML documents, which are stored in various folders and interconnected via hyperlinks. Together, they are stored on the webspace. In the sitemap, pages are recorded along with their corresponding URLs.

In the early days of the World Wide Web, the sitemap was principally created to make website navigation easier for users. Often inserted as a frame alongside the main content, sitemap documents gave visitors the opportunity to move from one page to another at any time, without having to click through individual hyperlinks one after the other. Nowadays, navigation is usually solved much more elegantly, but the sitemap still has its place. For one thing, having this additional navigation tool can increase user-friendliness, and for another, search engines make use of these files.

XML vs. HTML: sitemap comparison

It is common to differentiate between two versions of the sitemap: XML sitemaps and HTML sitemaps. If you want to make the sitemap available to website visitors, you should create an HTML sitemap. This is essentially an additional document that is part of the website and can be incorporated into the structure of the online presence just like any other HTML page. A sitemap created in the XML format, however, is primarily oriented toward search engines. XML is a markup language just like HTML, but instead of describing how a page is displayed, it describes structured data that programs can read.

This results in different advantages and disadvantages for XML and HTML sitemaps. Visitors can use a navigation file in the HTML format without complications: the links help them find their way around the site when they are looking for something. In this way, the sitemap complements the search function and the navigation bar, and becomes an additional website component that increases user-friendliness. These days, the sitemap usually isn’t integrated as a frame. Instead, it is common to provide a link to the overview document, in the header or footer of the website, for example.

If you create a sitemap in the XML format, you have the option to submit it to the Google Search Console. This will allow the search engine to gain a better un­der­stand­ing of your entire website. XML also allows you to create a so-called video sitemap. It is difficult for Google and other search engines to read the content of video files, making the search engines dependent on ad­di­tion­al data, called metadata. If you have in­cor­por­ated videos into your site and would like Google to integrate them into its video search, you should provide a video sitemap.

This requires creating an XML file that supplies data about the in­di­vidu­al clips on the site. The data includes the title and de­scrip­tion of the video file, the URL of the subpage on which the clip is shown, a link to a thumbnail picture, and the storage location of the video player you used. The same strategy also applies to images, so that they show up in image searches.
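As a rough illustration of the data listed above, the following Python sketch builds a video sitemap entry with the standard library. It assumes Google's video sitemap extension namespace and its `thumbnail_loc`, `title`, `description`, and `player_loc` tags; the function name `build_video_sitemap`, the dictionary keys, and the thumbnail and player URLs are hypothetical and only serve the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

# Serialise with a default namespace and a "video" prefix instead of ns0/ns1.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)


def build_video_sitemap(videos):
    """Return a video sitemap as a string; each video is a dict of strings."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for video in videos:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # URL of the subpage on which the clip is shown
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = video["page_url"]
        entry = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        # Metadata that describes the clip itself
        for tag in ("thumbnail_loc", "title", "description", "player_loc"):
            ET.SubElement(entry, f"{{{VIDEO_NS}}}{tag}").text = video[tag]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical example data: thumbnail and player URLs are made up.
example = build_video_sitemap([{
    "page_url": "http://one-test.website/our-projects",
    "thumbnail_loc": "http://one-test.website/thumbs/walkthrough.jpg",
    "title": "Project walkthrough",
    "description": "A short tour of our current projects.",
    "player_loc": "http://one-test.website/player?video=walkthrough",
}])
print(example)
```

An image sitemap could be produced along the same lines with the corresponding image namespace and tags.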

As a webmaster, you fortunately don’t need to choose between an XML and an HTML sitemap. Using both is possible; in fact, it provides the best results, for visitors to the website as well as for the web crawlers sent by Google and other search engines. Although the XML option is directly oriented toward the search engine, web crawlers also use HTML sitemaps when examining a website, as an easy way to take all pages into account.

Note

You can find more in­form­a­tion about how to make a strong XML sitemap in our com­pre­hens­ive article on the topic.

Sitemaps and SEO

Sitemaps play a large role in search engine optimisation (SEO). Why is this? Search engines send out programs, known as web crawlers or search bots, to sift through the Internet in order to understand and index it as completely as possible. When such a program arrives on a website, it follows the hyperlinks to discover the pages and their content. It is not guaranteed that the web crawler will be able to record all subpages, though, which is especially relevant for very extensive websites. A sitemap, in XML as well as in HTML format, simplifies the search engine bot’s examination process by providing it with an index of all the webpages.

A sitemap is especially helpful for pages that are not well connected to other pages. Web crawlers always follow hyperlinks to move through the World Wide Web, which is why every single page should be linked in a sitemap. Google unfortunately can’t guarantee that the bot will truly take each page into consideration, but the chances of this are at least higher. A sitemap is also useful if the website is still fairly new and few or no other websites link to its pages.
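The effect of a poorly linked page can be illustrated with a toy crawler. The Python sketch below is a strong simplification and not how any real search bot is implemented: the "website" is just a dictionary of paths and HTML snippets, and all names (`LinkCollector`, `crawl`, `SITE`, the `/landing-page` and `/sitemap` paths) are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start):
    """Follow links breadth-first from `start`; return every page reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        collector = LinkCollector()
        collector.feed(site.get(path, ""))
        for link in collector.links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# A tiny made-up site: nothing links to /landing-page.
SITE = {
    "/": '<a href="/about-us">About us</a>',
    "/about-us": '<a href="/">Home</a>',
    "/landing-page": "<p>No other page links here.</p>",
}

# The same site with an HTML sitemap that is linked from the home page.
SITE_WITH_SITEMAP = {
    **SITE,
    "/": '<a href="/about-us">About us</a><a href="/sitemap">Sitemap</a>',
    "/sitemap": '<a href="/">Home</a><a href="/about-us">About us</a>'
                '<a href="/landing-page">Landing page</a>',
}

print(sorted(crawl(SITE, "/")))
print(sorted(crawl(SITE_WITH_SITEMAP, "/")))
```

In the first run the crawler never reaches the orphaned page; once the sitemap lists it, every page becomes reachable from the home page.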

A strong sitemap in the XML format provides the search engine with additional data about each page: When was it last modified? How often is it updated? How important is it in relation to the other pages of the website?

Even though one can generally say that an HTML sitemap is oriented more toward users and its coun­ter­part in XML more toward web crawlers, both are important in an SEO context. Sitemaps in an HTML format have just as much influence on ranking, because these documents are also con­sidered during the ex­am­in­a­tion of a site. When de­term­in­ing the ranking order of the search results, Google pays attention to websites’ user-friend­li­ness, too. A clearly organised sitemap boosts usability and can lead to an improved ranking.

Creating a sitemap – explained with examples

Creating a sitemap is not a hard process, and using a sitemap generator makes it even easier. The best course of action depends on the format you are going for. The HTML sitemap is generally easier to create. This only requires knowing a few HTML ground rules, especially how to mark up links correctly. Using <a> elements with href attributes, you can compile a list of links. In practice, webmasters put more work into their sitemaps than this, for example by adapting the design of the navigation document to the rest of the website.

<li class="lpage"><a href="http://one-test.website/" title="Theme Preview – Previewing Another WordPress Blog">Theme Preview – Previewing Another WordPress Blog</a></li>
<li class="lpage"><a href="http://one-test.website/about-us" title="About us – Theme Preview">About us – Theme Preview</a></li>
<li class="lpage"><a href="http://one-test.website/our-projects" title="Our Projects – Theme Preview">Our Projects – Theme Preview</a></li>
<li class="lpage"><a href="http://one-test.website/sample-page" title="Sample Page – Theme Preview">Sample Page – Theme Preview</a></li>
<li class="lpage"><a href="http://one-test.website/shop" title="Products – Theme Preview">Products – Theme Preview</a></li>
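If the list of pages is already available in a program, a list like the one above can also be generated instead of typed by hand. The following Python sketch shows one possible way, assuming the pages are given as (URL, title) pairs; the function name `build_html_sitemap` and the page titles are hypothetical.

```python
from html import escape


def build_html_sitemap(pages):
    """Render (url, title) pairs as an unordered list of links."""
    items = [
        # escape() protects characters such as & and " in URLs and titles
        f'  <li><a href="{escape(url)}">{escape(title)}</a></li>'
        for url, title in pages
    ]
    return "\n".join(["<ul>", *items, "</ul>"])


# Titles are invented for illustration.
pages = [
    ("http://one-test.website/", "Home"),
    ("http://one-test.website/about-us", "About us"),
    ("http://one-test.website/shop", "Products & Shop"),
]
html_sitemap = build_html_sitemap(pages)
print(html_sitemap)
```

The resulting list can then be placed in a regular HTML page and styled like the rest of the website.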

Creating the file in the XML format is significantly more involved. The sitemap begins with a <urlset> tag, and the individual URLs are entered inside this element. Each URL is in turn wrapped in a <url> tag, while the actual link to the subpage goes in a <loc> tag. These elements always need to be included; the additional details about the frequency of page edits (<changefreq>), the date of the last edit (<lastmod>), and the importance of the page (<priority>) are optional.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
	<url>
		<loc>http://one-test.website/</loc>
		<lastmod>2018-03-23T14:32:21+00:00</lastmod>
		<priority>1.00</priority>
	</url>
	<url>
		<loc>http://one-test.website/about-us</loc>
		<lastmod>2018-03-23T14:32:21+00:00</lastmod>
		<priority>0.80</priority>
	</url>
</urlset>
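The same structure can be produced programmatically. The sketch below uses Python's standard library to write the required <urlset>, <url>, and <loc> elements and adds the optional ones only when they are supplied. It is a minimal example under assumptions, not a complete generator: the function name `build_sitemap`, the dictionary layout, and the `monthly` value are hypothetical, and the `xml_declaration` argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)  # emit xmlns="..." without a prefix


def build_sitemap(pages):
    """Return sitemap XML as UTF-8 bytes; each page is a dict with 'loc'."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        # Optional details are written only when the caller provides them.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = page[tag]
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap = build_sitemap([
    {
        "loc": "http://one-test.website/",
        "lastmod": "2018-03-23T14:32:21+00:00",
        "priority": "1.00",
    },
    {
        "loc": "http://one-test.website/about-us",
        "changefreq": "monthly",  # hypothetical value for illustration
        "priority": "0.80",
    },
])
print(sitemap.decode("utf-8"))
```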

Those who want to make it easier for them­selves and avoid writing the entire sitemap manually can fall back on a sitemap generator. All you have to do when using these web services is simply enter the main URL of your own web presence – the sitemap generator will then search the entire website and create an index of all its pages in the process. These helpful online tools are available for XML as well as for HTML sitemaps. Some gen­er­at­ors even create several vari­ations at once for the user. For some content man­age­ment systems, such as WordPress, plugins for the creation of sitemaps are available.

Google suggestions for sitemaps

Although you have a lot of freedom in deciding what the navigation document will look like, Google sets a few requirements for sitemaps that you should meet if you want to improve your search engine ranking. Specifically, the sitemap should be encoded in UTF-8, should not include more than 50,000 URLs, and should not be larger than 50 MB. The size limit applies to the uncompressed file. You can submit compressed versions of sitemaps to Google as well, but this doesn’t increase the maximum file size.

Google recommends creating several sitemaps for web presences that are especially extensive. You then need to create an index file that references all the individual sitemaps, and submit this index to the search engine.
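How the splitting and the index file fit together can be sketched in Python. The example assumes the <sitemapindex>, <sitemap>, and <loc> elements of the sitemap protocol; the function names `split_urls` and `build_sitemap_index`, the 120,000 generated page URLs, and the sitemap file names are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # URL limit per sitemap file
ET.register_namespace("", SITEMAP_NS)


def split_urls(urls, limit=MAX_URLS_PER_SITEMAP):
    """Cut one long URL list into consecutive chunks of at most `limit`."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (as a string) referencing each sitemap file."""
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


# Hypothetical site with 120,000 pages; file names are made up.
all_urls = [f"http://one-test.website/page-{n}" for n in range(120_000)]
chunks = split_urls(all_urls)
sitemap_files = [
    f"http://one-test.website/sitemap-{i}.xml"
    for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(sitemap_files)
print(len(chunks), "sitemap files")
print(index_xml)
```

Each chunk would be written to its own sitemap file, and only the index needs to be submitted.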

Tip

You don’t necessarily need to include all pages of a website in the sitemap. For pages that can be accessed via various URLs, you only need to choose the preferred address. The same applies to pages that are very similar (for example, pages that have the same content but were created for use on different devices). You only need to enter the so-called canonical page that Google is supposed to work with.

There are two ways to finally put the sitemap on Google’s radar. One option is to upload the file directly in the Search Console; the other is to add a reference to the file in robots.txt. This text file is specially designed for search engines and is the first to be retrieved by web crawlers. A link to the sitemap’s location on the server tells the search bot where it should look next.
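The reference in robots.txt is a single line that starts with `Sitemap:` followed by the full URL of the file. As a small sketch, the Python function below appends such a line to existing robots.txt content unless it is already there. The function name `add_sitemap_reference`, the sample robots.txt content, and the `sitemap.xml` file name are assumptions made for the example.

```python
def add_sitemap_reference(robots_txt, sitemap_url):
    """Return robots.txt content with a Sitemap line for `sitemap_url`."""
    directive = f"Sitemap: {sitemap_url}"
    # Leave the file untouched if the reference already exists.
    if directive in (line.strip() for line in robots_txt.splitlines()):
        return robots_txt
    # Make sure the new line starts on a line of its own.
    if robots_txt and not robots_txt.endswith("\n"):
        robots_txt += "\n"
    return robots_txt + directive + "\n"


# Hypothetical robots.txt that allows all crawlers everywhere.
robots = "User-agent: *\nDisallow:\n"
updated = add_sitemap_reference(robots, "http://one-test.website/sitemap.xml")
print(updated)
```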
