What are page description and markup languages?

From Seobility Wiki

Definition

Figure: Markup Language - Author: Seobility - License: CC BY-SA 4.0

Markup language is a term used in computer text processing to refer to an organized annotation system (i.e. language) that marks certain parts or elements of a document as different from plain text. Essentially, markup language is used in web documents or applications to format text and to give it a specific structure. Another basic characteristic is that markup language is invisible to the reader of a web page or document since the only way to view it is by accessing the source code. Unlike programming languages, markup languages are not executed: they are read and rendered instead.

How it works

In order to format or structure a page, a markup language uses a series of tags, often enclosed in angle brackets (< and >). These tags function like instructions, and enclosing them in angle brackets makes the tags syntactically different from plain text. Markup tags are read and interpreted by parsers, which then generate output, usually as a page or document that is formatted and styled accordingly while keeping the actual markup hidden from plain sight.

For example, in HTML, the h1 tag marks a web page's main heading, and the browser renders that heading in a bolder and larger font, as per the default HTML conventions. However, a reader browsing this page would only see the formatted heading, not the markup tags.
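The parsing step described above can be sketched with the html.parser module from Python's standard library. This is only an illustration of how a parser tells tags apart from readable text, not how a browser is actually implemented; the class name TagAndTextCollector and the function split_markup are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class TagAndTextCollector(HTMLParser):
    """Records markup tags and reader-visible text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []  # opening tag names, in document order
        self.text = []  # text a reader would actually see

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        # Skip whitespace-only runs between tags.
        if data.strip():
            self.text.append(data.strip())


def split_markup(source):
    """Return (tags, text) found in an HTML string."""
    collector = TagAndTextCollector()
    collector.feed(source)
    collector.close()
    return collector.tags, collector.text


tags, text = split_markup("<h1>Main Heading</h1><p>Sample paragraph</p>")
print(tags)  # ['h1', 'p']
print(text)  # ['Main Heading', 'Sample paragraph']
```

The tags drive formatting decisions, while only the collected text would be shown to the reader.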

Descriptive markup languages vs. procedural markup languages

There are several markup languages available to web developers, and it is possible to use different types of markup languages in the same document. Broadly speaking, they can be classified into two groups depending on their main function: descriptive and procedural languages.

Descriptive markup languages

As their name suggests, the function of descriptive markup languages is to describe or label elements or sections of a web document. This language is purely semantic, which means that it does not give any instructions on how it must be processed. Instead, the focus of descriptive markup is to label parts of a document along conceptually established areas.

HTML and XML are some of the most widely used descriptive markup systems. XML stands for Extensible Markup Language; it was developed in the late 1990s in order to simplify the existing document annotation systems, chiefly SGML. XML markup differs from HTML in that its applications go beyond the display of web documents. XML has been used to support a range of applications, from productivity to communication tools.

HTML, on the other hand, stands for Hypertext Markup Language and appeared in the early 1990s. Over the following decades, it became the most commonly used markup system on the World Wide Web. Common uses include creating web page sections and divisions, and formatting lists, text, and contact forms. A basic HTML document would look as follows:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Page Title</title>
  </head>
  <body>
    <h1>Main Heading</h1>
    <p>Sample paragraph</p>
    <h2>Sample Sub-heading</h2>
    <p>Sample paragraph</p>
    <img src="image-file-name" alt="image title">
  </body>
</html>

In the document above, HTML markup would tell the browser how to format and display elements like title, main heading, and sub-heading, all according to default parameters unless otherwise specified by CSS.

Procedural markup languages

Unlike descriptive markup systems, procedural languages do offer instructions on how their tags should be processed. These systems are used in a variety of applications, from desktop publishing to typesetting and word processing. Some of the most commonly used procedural languages are PostScript and LaTeX.

LaTeX was initially released in the mid-1980s and has since gained wide adoption in academia and the sciences. LaTeX users write a series of tags to structure and style the document they are working on. Its main focus is typesetting and document preparation, and its tags are based on familiar and logical concepts like chapter or table. A sample LaTeX document would look as follows:

\documentclass[12pt]{article}
\usepackage{name of package being used}
\title{Document title}
\begin{document}
\maketitle
\section*{Section name}
\subsection*{Sub-section Name}
\end{document}

In the document above, LaTeX markup specifies the font size applicable to the document (12pt, the largest size option the standard article class accepts) and delimits the different parts of the document (title, section, and sub-section).

Importance for search engine optimization

Your choice of markup language will not impact SEO directly; that is, you can choose HTML, XML, or LaTeX and this choice will not necessarily make a page rank higher or lower. Having said that, certain types of markup make it easier for search engines to understand the content of a page. This is the case with meta tags and Schema.org structured data (schema markup), which Google supports. Similarly, certain types of schema markup allow Google to display rich snippets on the search results page.

Screenshot with a recipe in the SERPs of google.com

Therefore, it is recommended to use these markup types on web pages. Altogether, this helps search engines gain a more accurate understanding of a site’s content so they can serve this content to users who are likely to engage with it.
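As a rough illustration of schema markup, the sketch below builds a Schema.org Recipe description in JSON-LD form and wraps it in the script element that is typically placed in a page's HTML. The function name recipe_jsonld and the sample values are assumptions made for this example, and whether a search engine actually shows a rich snippet depends on its own requirements, which usually call for more properties than the two used here.

```python
import json


def recipe_jsonld(name, ingredients):
    """Return a <script> element carrying Schema.org Recipe data as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": name,
        "recipeIngredient": list(ingredients),
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical sample values, for illustration only.
print(recipe_jsonld("Pancakes", ["flour", "milk", "eggs"]))
```

The resulting block is invisible to readers of the page, but it labels the content in a way that machines can interpret.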

As you begin exploring the world of web design, you will undoubtedly be introduced to a number of words and phrases that are new to you. One of the terms that you will likely hear is "markup" or perhaps "markup language". How is "markup" different than "code" and why do some web professionals seem to use these terms interchangeably? Let's start by taking a look at exactly what a "markup language" is.

When you format text to be displayed on a computer or other device screen, you need to distinguish between the text itself and the instructions for the text. The "markup" is the instructions for displaying or printing the text.

Consider an HTML paragraph. It is made up of an opening tag (<p>), a closing tag (</p>), and the actual text that would be displayed on the screen (this is the text contained between the two tags). Each tag includes a "less than" and "greater than" symbol to designate it as part of the markup.

Markup doesn’t have to be computer-readable. Annotations done in print or in a book are also considered markup. For example, many students in school will highlight certain phrases in their textbooks. This indicates that the highlighted text is more important than the surrounding text. The highlight color is considered markup.

Markup becomes a language when rules are codified around how to write and use that markup. That same student could have their own “note-taking markup language” if they codified rules like “purple highlighter is for definitions, yellow highlighter is for exam details, and pencil notes in the margins are for additional resources.” 

Most markup languages are defined by an outside authority for use by many different people. This is how the markup languages for the Web work. They are defined by the W3C or World Wide Web Consortium.

Nearly every acronym on the Web that has an "ML" in it is a "markup language" (big surprise, that is what the "ML" stands for). Markup languages are the building blocks used to create web pages of all shapes and sizes.

In reality, there are many different markup languages out there in the world. For web design and development, there are three specific markup languages that you will likely run across. These are HTML, XML, and XHTML.

To properly define this term: a markup language is a language that annotates text so that the computer can manipulate that text. Most markup languages are human-readable because the annotations are written in a way that distinguishes them from the text itself. For example, with HTML, XML, and XHTML, the markup tags are delimited by < and >.

Any text that appears between those two characters is considered part of the markup language and not part of the annotated text. For example, in <p>Hello</p>, the <p> and </p> are markup, while Hello is the annotated text.

HTML or HyperText Markup Language is the primary language of the Web and the most common one you will work with as a web designer/developer. In fact, it may be the only markup language you use in your work.

All web pages are written in a flavor of HTML. HTML defines the way that images, multimedia, and text are displayed in web browsers. This language includes elements to connect your documents (hypertext) and make your web documents interactive (such as with forms). Many people call HTML "website code", but in truth, it is really just a markup language. Neither term is strictly wrong and you will hear people, including web professionals, use these two terms interchangeably. 

HTML is a defined standard markup language. It is based upon SGML (Standard Generalized Markup Language). It is a language that uses tags to define the structure of your text. Tags are delimited by the < and > characters.

While HTML is by far the most popular markup language used on the Web today, it is not the only choice for web development. As HTML was developed, it got more and more complicated and the style and content tags combined into one language. Eventually, the W3C decided that there was a need for a separation between the style of a web page and the content. A tag that defines the content alone would remain in HTML while tags that define style were deprecated in favor of CSS (Cascading Style Sheets).

The newest numbered version of HTML is HTML5. This version added more features into HTML and removed some of the strictness that was imposed by XHTML (more on that language shortly). 

The way that HTML is released has been altered with the rise of HTML5. Today, new features and changes are added without there needing to be a new, numbered version released. The latest version of the language is simply referred to as "HTML."

The eXtensible Markup Language is the language that XHTML, another version of HTML, is based on. Like HTML, XML is also based on SGML. It is simpler than SGML and stricter than plain HTML. XML provides the extensibility to create various different languages.

XML is a language for writing markup languages. For example, if you are working on genealogy, you might create tags using XML to define the father, mother, daughter, and son in your XML like this: <father>, <mother>, <daughter>, and <son>. There are also several standardized languages already created with XML: MathML for defining mathematics, SMIL for working with multimedia, XHTML, and many others.
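To make the genealogy idea concrete, here is a minimal sketch that builds such a document with Python's standard xml.etree.ElementTree module. The wrapping family element, the function name build_family, and the sample names are assumptions made for this illustration; XML itself does not prescribe any of these tag names.

```python
import xml.etree.ElementTree as ET


def build_family(father, mother, children):
    """Build a <family> element; `children` is a list of (tag, name) pairs."""
    family = ET.Element("family")
    ET.SubElement(family, "father").text = father
    ET.SubElement(family, "mother").text = mother
    for tag, name in children:
        ET.SubElement(family, tag).text = name
    return family


# Hypothetical sample data.
family = build_family("Tom", "Ann", [("daughter", "Eve"), ("son", "Max")])
print(ET.tostring(family, encoding="unicode"))
# <family><father>Tom</father><mother>Ann</mother><daughter>Eve</daughter><son>Max</son></family>
```

Because the tag names are invented for the task at hand, any program that knows this small vocabulary can read the data back, for example with family.find("son").text.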

XHTML 1.0 is HTML 4.0 redefined to meet the XML standard. XHTML has been replaced in modern web design with HTML5 and the changes that have come since. You are unlikely to find any newer sites using XHTML, but if you are working on a much older site, you may still encounter XHTML out there in the wild. 

There aren't a lot of major differences between HTML and XHTML, but here is what you will notice:

  • XHTML is written in lower case. While HTML tags can be written in UPPER case, MiXeD case, or lower case, XHTML tags must be all lower case to be correct. (Many web professionals write HTML in all lowercase, even though it is not technically required.)
  • All XHTML elements must have an end tag. Elements with only one tag, such as <hr> and <img>, need a closing slash (/) at the end of the tag: <hr /> and <img />.
  • All attributes must be quoted in XHTML. Some people remove the quotes around attributes to save space, but they are required for correct XHTML.
  • XHTML requires that tags are nested correctly. If you open a bold (<b>) element and then an italics (<i>) element, you must close the italics element (</i>) before you close the bold (</b>). (Note that both of these are visual elements; where the emphasis carries meaning, HTML now favors <strong> and <em> in place of these two.)
  • XHTML attributes must have a name and a value. Attributes that are stand-alone in HTML must be declared with values as well; for example, the noshade attribute of the <hr> element would be written noshade="noshade".
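Several of the rules above follow from XHTML having to be well-formed XML, so they can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical. Note that this only tests well-formedness: it does not validate against the XHTML definition, so it will not catch upper-case tag names.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if `fragment` parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# End tags: single-tag elements need the closing slash.
print(is_well_formed("<p>Line one<br />Line two</p>"))  # True
print(is_well_formed("<p>Line one<br>Line two</p>"))    # False

# Nesting: inner elements must close before outer ones.
print(is_well_formed("<p><b><i>text</i></b></p>"))      # True
print(is_well_formed("<p><b><i>text</b></i></p>"))      # False

# Attributes: a value is required, and it must be quoted.
print(is_well_formed('<hr noshade="noshade" />'))       # True
print(is_well_formed("<hr noshade />"))                 # False
print(is_well_formed("<img src=photo.png />"))          # False
```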