
Extensible Markup Language (XML)

Extensible Markup Language (XML) is a text-based metalanguage used to define other markup languages. It allows content authors to define their own grammar and treelike document structures. XML files are platform and application independent and can be read by both humans and machines. XML is widely supported in industry and in open source software. XML files can be edited in any text editor, but specialized XML editors offer additional features such as syntax highlighting and validation.

XML is widely used in GIS and especially in WebGIS applications. Use cases include XML-based file formats (geometry, attributes, and data modeling), data exchange between products or installations, communication between Web services, styling languages, configuration files, and user interface languages. The majority of Open Geospatial Consortium (OGC) specifications and data formats are based on XML. In addition, many companies introduced their own proprietary XML formats (e.g., Google Earth KML, ESRI ArcXML).

XML is specified and maintained by the World Wide Web Consortium (W3C) and was designed as a simplified subset of Standard Generalized Markup Language (SGML). The idea of structuring documents with markup tags goes back to the 1960s (IBM's Generalized Markup Language [GML]). A tag is a marker that structures the document and often also indicates the purpose or function of an element (see Figure 1 for an example XML file and some XML terms). Tags are surrounded by angle brackets (< and >) to distinguish them from text. Elements with content have an opening and a closing tag (see Figure 2). Empty elements may be closed directly in the opening tag. XML is case sensitive: element and attribute names that differ only in capitalization are treated as different names.
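The element syntax described above can be tried out with a short Python sketch. It uses the standard library module xml.etree.ElementTree; the sample document and its element and attribute names (cities, city, name, capital, id) are invented for illustration and do not come from the figures. The helper is_well_formed is likewise a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; all names and values are invented.
DOC = """<cities>
  <city id="c1">
    <name>Exampleville</name>
    <capital/>
  </city>
</cities>"""

root = ET.fromstring(DOC)
city = root.find("city")
name = city.find("name")        # element with content: <name>...</name>
capital = city.find("capital")  # empty element, closed in its opening tag

print(root.tag)        # cities
print(city.get("id"))  # c1
print(name.text)       # Exampleville
print(capital.text)    # None (an empty element has no text)


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Names are case sensitive, so <City> cannot be closed by </city>.
print(is_well_formed("<City>x</City>"))  # True
print(is_well_formed("<City>x</city>"))  # False
```

The last two calls show why capitalization matters: the parser treats City and city as different names and rejects the mismatched pair.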

XML allows a clean separation of content, presentation, and rules. A set of base technologies built on top of XML can be used to access, manipulate, and transform XML data (see Figure 3). Examples of this base technology layer are Document Type Definitions (DTDs) and schemas for defining rules; the Document Object Model (DOM) with scripting, and XSL/XSLT/XPath, to access, manipulate, style, and transform data; namespaces for mixing multiple XML languages; and XLink/XPointer to link to internal and external resources.
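As a rough illustration of accessing, manipulating, and transforming XML data, the following Python sketch uses the standard library module xml.etree.ElementTree. This module supports only a small subset of XPath and is not an XSLT processor, so the "transform" step here is a plain Python stand-in for what a stylesheet would do. The document, its element names, and all values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data; names and numbers are invented for illustration.
DOC = """<provinces>
  <province code="A"><name>Alpha</name><population>100</population></province>
  <province code="B"><name>Beta</name><population>250</population></province>
</provinces>"""

root = ET.fromstring(DOC)

# Access: path expressions select nodes in the document tree.
names = [n.text for n in root.findall("./province/name")]
beta = root.find(".//province[@code='B']")  # predicate on an attribute

# Manipulate: add an attribute and a child element to the selected node.
beta.set("reviewed", "true")
note = ET.SubElement(beta, "note")
note.text = "hypothetical remark"

# Transform: restructure the tree into (code, population) rows.
rows = [(p.get("code"), int(p.findtext("population")))
        for p in root.iter("province")]

print(names)  # ['Alpha', 'Beta']
print(rows)   # [('A', 100), ('B', 250)]
print(ET.tostring(beta, encoding="unicode"))
```

A full XPath or XSLT implementation offers far more (axes, functions, templates); the sketch only shows the general pattern of selecting nodes by path and then changing or restructuring them.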

Authors can define their own rules in DTDs or schemas (e.g., W3C XML Schema, RELAX NG, or others). Existing XML files may be checked for “well-formedness” and for “validity.” While the former checks only against the general XML syntax rules, the latter checks against the domain-specific rules defined in a DTD or schema. A DTD provides a list of valid elements, valid attributes, and entities and defines how elements may be nested; whether elements or attributes are required or optional; and how often elements may be used (zero, one, or more). DTDs may also be used to define default values. DTDs are not written in XML and are very limited for defining rules. As a consequence, W3C and other organizations introduced more powerful rule languages that are themselves defined in XML. W3C XML Schema and RELAX NG allow more fine-grained rules, such as checking against data types and valid ranges, better constraints and grouping, support for schema inheritance and evolution, and namespace support. XML namespaces can be used to mix various XML languages or to extend existing XML dialects with proprietary extensions. One example of the use of namespaces is the integration of GIS feature attributes in a Scalable Vector Graphics (SVG) graphic (e.g., attaching the population value to an SVG path element representing a province). Another is embedding an SVG graphic directly in an XHTML file.
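The namespace example mentioned above (a population value attached to an SVG path element) might look like the following Python sketch, again using the standard library module xml.etree.ElementTree. The SVG namespace URI is the real one; the gis prefix, its namespace URI (http://example.org/gis), the attribute name, and the value are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
GIS_NS = "http://example.org/gis"  # hypothetical namespace for GIS attributes

# An SVG fragment extended with a foreign-namespace attribute (invented data).
DOC = f"""<svg xmlns="{SVG_NS}" xmlns:gis="{GIS_NS}">
  <path id="province-a" d="M 0 0 L 10 0 L 10 10 Z" gis:population="12345"/>
</svg>"""

root = ET.fromstring(DOC)

# ElementTree expands prefixes to "{namespace-uri}localname".
path = root.find(f"{{{SVG_NS}}}path")
population = int(path.get(f"{{{GIS_NS}}}population"))

print(root.tag)          # {http://www.w3.org/2000/svg}svg
print(path.get("id"))    # province-a
print(population)        # 12345
```

Because the GIS attribute lives in its own namespace, an SVG renderer can ignore it while a GIS-aware application can still read it from the same file. Note that this sketch only checks well-formedness; validating against a DTD or schema requires a validating parser, which the standard library module used here does not provide.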

...
