XML is a text-based data format used, among other things, for storing data and for platform-independent data exchange online. XML forms the basis for other languages such as XHTML and SVG and can be read by humans and machines alike.

What is XML?

XML stands for 'Extensible Markup Language'. By definition, XML is a markup language used to represent data in a structured text file that both humans and computers can evaluate. The metalanguage was published by the World Wide Web Consortium (W3C) in 1998 and is now an indispensable part of web development. The current version, the fifth edition of XML 1.0, was published in 2008. The default character encoding of XML documents is UTF-8.

The usage and function of XML

XML can be used in many different ways. It all begins with creating an XML file: a text document with the extension .xml, which can be written in any text editor. For the document to be well-formed, it must adhere to the syntax of the language. XML documents are used, for example, where HTML reaches its limits. This is often the case with website sitemaps, which are stored in a sitemap.xml file.
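As a minimal sketch of how such a sitemap.xml could be produced programmatically, the following Python example uses the standard library module xml.etree.ElementTree. The namespace is the one defined by the sitemaps.org protocol; the function name build_sitemap and the example.com URLs are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Serialize the sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(urls):
    """Return a sitemap for the given URLs as UTF-8 encoded bytes."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical URLs, for illustration only.
sitemap = build_sitemap(["https://www.example.com/", "https://www.example.com/about"])

# Parsing the result back succeeds only if the document is well-formed.
parsed = ET.fromstring(sitemap)
locations = [loc.text for loc in parsed.iter(f"{{{SITEMAP_NS}}}loc")]
print(locations)
```

Writing the returned bytes to a file named sitemap.xml would give a file of the kind described above.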

To read XML documents, a parser is needed. It provides a programming interface through which applications can access the XML document. Access happens via:

  • DOM: The XML document is represented as a tree structure that can be read, modified and written to.
  • Pull API: Data from the XML document is processed sequentially, with the application requesting the next event whenever it is ready for it.
  • SAX: The XML document is treated as a sequential data stream, and the parser reports events to the application as it reads.

Standard web browsers usually include a parser, which simplifies reading XML documents.
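The three access styles listed above can be compared side by side. The following sketch uses the parsers from the Python standard library (xml.dom.minidom for DOM, xml.sax for SAX and xml.dom.pulldom as a pull API) to read the same room numbers from a small document. The sample document and the function and class names are assumptions chosen for this example.

```python
import xml.sax
from xml.dom import minidom, pulldom

# Sample document, invented for this example.
DOC = '<house><room number="1">Kitchen</room><room number="2">Bedroom</room></house>'


def rooms_via_dom(text):
    """DOM: load the whole tree into memory, then navigate it."""
    tree = minidom.parseString(text)
    return [node.getAttribute("number") for node in tree.getElementsByTagName("room")]


class RoomHandler(xml.sax.ContentHandler):
    """SAX: the parser pushes events into callbacks while it reads."""

    def __init__(self):
        super().__init__()
        self.numbers = []

    def startElement(self, name, attrs):
        if name == "room":
            self.numbers.append(attrs.getValue("number"))


def rooms_via_sax(text):
    handler = RoomHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.numbers


def rooms_via_pull(text):
    """Pull API: the application asks for the next event itself."""
    numbers = []
    for event, node in pulldom.parseString(text):
        if event == pulldom.START_ELEMENT and node.tagName == "room":
            numbers.append(node.getAttribute("number"))
    return numbers


print(rooms_via_dom(DOC), rooms_via_sax(DOC), rooms_via_pull(DOC))
```

All three functions return the same list of room numbers. The difference lies in how much of the document is held in memory and in who drives the reading: the application or the parser.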

Use of XML

The extensible markup language is used across many different areas. For example, it can serve as the basis for other languages. XHTML, the XML-based variant of the Hypertext Markup Language, is one of them. SVG, the most popular file format for vector graphics, is also based on XML.

However, XML isn’t just relevant as a basis for other languages. In web development, pure XML is often used to set up website sitemaps. In addition, the extensible markup language can be used to create XML databases. These document-oriented databases are more flexible than relational databases in how they structure data.

How is XML structured?

At first glance, XML documents resemble HTML documents. That’s because both languages descend from SGML and use the same notation of tags and attributes.

With XML you can distinguish between different document types. Document-centric XML documents are only slightly structured and are largely comprehensible to human readers without additional information. However, this reduces their machine readability considerably. Data-centric XML documents can be regarded as the opposite: their high degree of structuring goes hand in hand with good machine readability, but for human readers they are less intuitive and hard to comprehend. Finally, as a compromise, there are semi-structured documents.
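The contrast between the two document types becomes clearer with a small example. In this sketch, which assumes two invented sample documents, the data-centric one maps directly onto records, while the document-centric one is mostly running text with occasional inline markup.

```python
import xml.etree.ElementTree as ET

# Data-centric: regular, fine-grained fields that map directly onto records.
DATA_CENTRIC = """
<rooms>
    <room><number>1</number><area unit="m2">12</area></room>
    <room><number>2</number><area unit="m2">18</area></room>
</rooms>
"""

# Document-centric: running text with occasional inline markup (mixed content).
DOCUMENT_CENTRIC = "<note>The <em>kitchen</em> is in room 1, next to the hall.</note>"

# A program can turn the data-centric document into typed records directly.
records = [
    {"number": int(room.findtext("number")), "area": float(room.findtext("area"))}
    for room in ET.fromstring(DATA_CENTRIC).iter("room")
]

# From the document-centric one, a program mainly recovers the readable text.
plain_text = "".join(ET.fromstring(DOCUMENT_CENTRIC).itertext())

print(records)
print(plain_text)
```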

Is XML a programming language?

The extensible markup language is a markup language, not a programming language in its own right. There’s no XML compiler, and you cannot create executable files using XML. The languages based on XML are generally not considered programming languages either.

The different XML elements

The most important components of the extensible markup language are elements. Their names can be freely chosen. Elements start and end with tags. To declare the element 'house' in your XML document, for example, the code would look like this:

<house>
You can list the contents of the house here.
</house>

Elements can be nested as desired. You can also add attributes to elements to provide additional information about them. For example, if you want to add a roof and two numbered rooms to your house, this changes the XML document as follows:

<house>
	You can list the contents of the house here.
	<roof>The roof of the house.</roof>
	<room number="1">Room 1 in your house.</room>
	<room number="2">Room 2 in your house.</room>
</house>

The room numbers are the previously mentioned attributes.
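To illustrate how a program sees nested elements and their attributes, the following sketch parses a well-formed version of the house document with Python's xml.etree.ElementTree. The variable names are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

HOUSE = """<house>
    <roof>The roof of the house.</roof>
    <room number="1">Room 1 in your house.</room>
    <room number="2">Room 2 in your house.</room>
</house>"""

house = ET.fromstring(HOUSE)

# Nested elements are the children of the element that encloses them.
child_tags = [child.tag for child in house]

# Attributes are read by name; the element content is available as text.
rooms = {room.get("number"): room.text for room in house.findall("room")}

print(child_tags)  # ['roof', 'room', 'room']
print(rooms)
```

Attribute values are always strings, so the room numbers arrive as "1" and "2" and would need to be converted if they are to be used as numbers.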

Other components of XML documents include comments of the form <!-- comment -->.

XML entities

You can access predefined entities with the extensible markup language. Creating your own entities is just as easy. Entities are specific pieces of content that are defined once for later use. Other XML documents can also be included using entities.

Important predefined entities in XML are, for example, &lt; for the < character and &gt; for the > character. You can create your own entities in the document type definition (DTD) of the XML document as follows:

<!ENTITY eg "Example entity">
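As a small sketch of how entities behave when a document is read, the following Python example declares the entity eg in the internal DTD subset of an invented note document and lets the standard library parser expand it. The escape helper from xml.sax.saxutils shows the opposite direction: turning special characters into predefined entities before text is written into XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The entity "eg" is declared inside the DOCTYPE and referenced as &eg;.
DOC = """<!DOCTYPE note [
    <!ENTITY eg "Example entity">
]>
<note>&eg; and 1 &lt; 2</note>"""

note = ET.fromstring(DOC)

# The parser replaces both the custom and the predefined entity.
expanded = note.text
print(expanded)  # Example entity and 1 < 2

# Escaping converts &, < and > into their predefined entities.
escaped = escape("1 < 2 & 3 > 2")
print(escaped)  # 1 &lt; 2 &amp; 3 &gt; 2
```

Entity expansion should only be relied on for trusted input, since maliciously nested entity definitions can be used to exhaust a parser's resources.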

Differences between XML and HTML

HTML, now at version 5 with HTML5, looks very similar to the extensible markup language, although only its XHTML variant is actually based on XML. Unlike HTML, XML has no predefined set of permissible attributes and tags. With XML these can be defined by the programmers. XML is also case-sensitive, whereas HTML is not. Another difference is that XML requires closed tags. A line break written as <br> in HTML would be <br /> in XML.
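These stricter rules can be checked with any XML parser. The following sketch uses Python's xml.etree.ElementTree; the helper name parses_as_xml and the sample snippets are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = [
    "<p>line one<br>line two</p>",      # HTML-style line break, never closed
    "<p>line one<br />line two</p>",    # self-closing line break
    "<House>contents</house>",          # start and end tag differ in case
    "<house>contents</house>",          # matching case
]
results = [parses_as_xml(sample) for sample in samples]
print(results)  # [False, True, False, True]
```

An HTML parser would typically accept all four snippets, which is why markup that displays fine in a browser can still fail when it is processed as XML.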
