Infrastructure:
Markup

Parke Godfrey
21 September 2012
24 September 2012
CSE-2041

Credits

These slides are based in part on ones from the following sources.

Presentation & Rendering

Format Goals

Semantics vs. Presentation & Rendering

Our content format should abstract away from how it is to be rendered.

What should this universal format be?

Markup languages provide this abstraction.

XML (eXtensible Markup Language)

XML is a markup language that is entirely semantics-based: its tags describe what the content means, not how it looks.
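A minimal sketch of what semantic markup buys us, using Python's standard xml.etree.ElementTree. The document and its tag names (course, lecture, title, instructor) are invented for this example; nothing in it says how the data should be displayed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: tag and attribute names are made up for illustration.
DOC = """<course code="CSE-2041">
  <lecture date="2012-09-21">
    <title>Markup</title>
    <instructor>Parke Godfrey</instructor>
  </lecture>
</course>"""

root = ET.fromstring(DOC)
lecture = root.find("lecture")

# The tags tell us what each value means, so we can pull out the data
# without knowing anything about fonts, colours, or layout.
record = {
    "course": root.get("code"),
    "date": lecture.get("date"),
    "title": lecture.findtext("title"),
    "instructor": lecture.findtext("instructor"),
}
print(record)
```

Any renderer (a web page, a printout, a speech reader) could start from the same record and present it in its own way.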

XML
The Spectrum

XML
XML as a database?!

We can think of XML as a database, just like the relational databases we have studied.
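To make the database analogy concrete, here is a small sketch that queries an XML document much as one would query a table. The library/book data is made up, and the helper names titles_by_author and titles_since are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: each <book> plays the role of a row in a table.
DOC = """<library>
  <book year="1999"><title>XML Basics</title><author>Lee</author></book>
  <book year="2008"><title>Web Markup</title><author>Kim</author></book>
  <book year="2011"><title>HTML5 Notes</title><author>Lee</author></book>
</library>"""

root = ET.fromstring(DOC)


def titles_by_author(root, author):
    # Roughly: SELECT title FROM book WHERE author = :author
    return [book.findtext("title")
            for book in root.findall("book")
            if book.findtext("author") == author]


def titles_since(root, year):
    # Roughly: SELECT title FROM book WHERE year >= :year
    return [book.findtext("title")
            for book in root.findall("book")
            if int(book.get("year")) >= year]


print(titles_by_author(root, "Lee"))
print(titles_since(root, 2008))
```

Unlike a relational table, the XML "rows" may nest and need not all have the same shape.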


Model-View-Controller (MVC) Paradigm

The Web Ecosystem
Format

The Web follows the MVC paradigm.

We study each of these in Section III: Client-side.

We study the basics of markup and HTML here.

Hypertext Markup Language (HTML)

“Derived” from XML. Instead of free-form tags, HTML has a fixed, defined list of tags.

HTML
Origins

Originally, derived from Standard Generalized Markup Language (SGML).

Why? This provided existing tools such as parsers.

XML was developed in parallel with HTML (and it, too, originally derives from SGML).

Later standards (XHTML, and the XML syntax of HTML5) define HTML as derived from XML instead. HTML4 itself was still SGML-based.

Why? Tremendous support exists for XML. These tools apply directly to HTML too.
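As a sketch of that reuse: when a page is written as XML (XHTML), a general-purpose XML parser such as Python's xml.etree.ElementTree can read it with no HTML-specific code. The page content below is made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A tiny, made-up XHTML page. It is ordinary XML in the XHTML namespace.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Markup</title></head>
  <body>
    <h1>Infrastructure</h1>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>"""

root = ET.fromstring(PAGE)
ns = {"x": XHTML_NS}  # prefix used only inside our queries

title = root.findtext("x:head/x:title", namespaces=ns)
paragraphs = [p.text for p in root.findall(".//x:p", ns)]

print(title)
print(paragraphs)
```

The same parser would fail on typical hand-written HTML4 pages, which are often not well-formed XML.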

HTML
Whitespace & References

HTML
Structural Elements

HTML
Common Elements

HTML
Figures & Media

Screen units: px, %, em, & pt
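A small worked conversion between these units, as a sketch. It assumes the CSS convention of 96 px and 72 pt per inch, and a 16 px font size for em (a common browser default, but only an assumption here). The function names are hypothetical.

```python
PX_PER_INCH = 96  # CSS defines 1in = 96px
PT_PER_INCH = 72  # and 1in = 72pt


def pt_to_px(pt):
    # Absolute unit: 1pt = 96/72 px.
    return pt * PX_PER_INCH / PT_PER_INCH


def em_to_px(em, font_size_px=16.0):
    # Relative unit: 1em is the current font size (16px is assumed here).
    return em * font_size_px


def percent_to_px(percent, parent_px):
    # Relative unit: a percentage of the parent's size.
    return parent_px * percent / 100.0


print(pt_to_px(12))            # 12pt text
print(em_to_px(1.5))           # 1.5em at the assumed 16px font size
print(percent_to_px(50, 800))  # 50% of an 800px-wide parent
```

px and pt are fixed sizes, while % and em scale with their context, which is why the last two functions need an extra argument.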

HTML
The Anchor Element

<a> also anchors the other side of a link!
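A sketch of both roles of the anchor element, using Python's standard html.parser: an a element with href is the source of a link, and one with id (or the older name attribute) is a destination that "#..." links can point to. The class name AnchorCollector and the sample page are made up.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects link sources (href) and link targets (id/name) from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []    # where links point to
        self.targets = []  # places a link can land on

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if "href" in attrs:
            self.links.append(attrs["href"])
        for key in ("id", "name"):
            if key in attrs:
                self.targets.append(attrs[key])


PAGE = """
<p><a href="#lists">Jump to lists</a> or visit
<a href="http://www.w3.org/">the W3C</a>.</p>
<h2><a id="lists">Lists</a></h2>
"""

collector = AnchorCollector()
collector.feed(PAGE)
collector.close()

# In-page links whose target anchor does not exist on this page.
dangling = [link for link in collector.links
            if link.startswith("#") and link[1:] not in collector.targets]

print(collector.links, collector.targets, dangling)
```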

HTML
Lists

HTML
Tables

HTML
Well‐Formed & Valid

Well‐Formed
declaration (preamble)
one root: <html>
paired open and close tags
no straddling of tag pairs
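The pairing, nesting, and single-root rules can be checked mechanically. A minimal sketch: hand the text to a strict XML parser and see whether it accepts it. The helper name is_well_formed is hypothetical, and this check does not test for the declaration or that the root is html.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if a strict XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


GOOD = "<html><body><p>Hello <b>world</b></p></body></html>"
UNPAIRED = "<html><body><p>Hello</body></html>"                   # <p> never closed
STRADDLING = "<html><body><p>Hello <b>world</p></b></body></html>"  # <b> straddles </p>
TWO_ROOTS = "<html></html><html></html>"                          # more than one root

for sample in (GOOD, UNPAIRED, STRADDLING, TWO_ROOTS):
    print(is_well_formed(sample), sample)
```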
 
Valid
valid element names (tags)
valid element nesting
valid attribute names
valid attribute values
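Validity goes beyond well-formedness: the document must also obey the rules of the language. The sketch below checks element names, nesting, and attribute names against a tiny, hypothetical rule set. Real HTML defines far more (and also constrains attribute values, which this sketch skips); in practice one would use a validator driven by a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced rules: which children each element may contain.
ALLOWED_CHILDREN = {
    "html": {"head", "body"},
    "head": {"title"},
    "title": set(),
    "body": {"p", "ul"},
    "p": {"a"},
    "ul": {"li"},
    "li": {"a"},
    "a": set(),
}
# Hypothetical rules: which attribute names each element may carry.
ALLOWED_ATTRIBUTES = {"a": {"href", "id"}, "p": {"id"}}


def validation_errors(elem):
    """Return a list of rule violations in elem and its descendants."""
    errors = []
    if elem.tag not in ALLOWED_CHILDREN:
        errors.append(f"unknown element <{elem.tag}>")
        return errors
    for name in elem.attrib:
        if name not in ALLOWED_ATTRIBUTES.get(elem.tag, set()):
            errors.append(f"attribute '{name}' not allowed on <{elem.tag}>")
    for child in elem:
        known = child.tag in ALLOWED_CHILDREN
        if known and child.tag not in ALLOWED_CHILDREN[elem.tag]:
            errors.append(f"<{child.tag}> not allowed inside <{elem.tag}>")
        errors.extend(validation_errors(child))
    return errors


VALID = ("<html><head><title>T</title></head>"
         "<body><p><a href='#x'>x</a></p></body></html>")
INVALID = ("<html><body><li>item</li>"
           "<p colour='red'><blink>hi</blink></p></body></html>")

print(validation_errors(ET.fromstring(VALID)))
print(validation_errors(ET.fromstring(INVALID)))
```

Note that INVALID is still well-formed: every tag is properly paired and nested, yet it breaks three of the rules.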

Scruffy versus Neat
strict?

Should the format be loosely or strictly enforced?

strict

-

Harder to author pages.

Harder to maintain valid documents.

+

Automated tools can understand and manipulate content.

Renderer knows how to handle the page.

E.g., XHTML

Scruffy versus Neat
loose?

loose

+

Easier to author pages.

Renderer makes best effort. (Graceful degradation.)

-

Automated tools have a harder time understanding and manipulating content.

Renderer can mess up badly. (Document is harder to parse. Renderer may refuse non-well-formed or invalid documents.)

E.g., HTML4, HTML5
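The trade-off can be seen by feeding the same sloppy markup to a forgiving HTML tokenizer and to a strict XML parser. This is a sketch using Python's standard library; the names TextCollector, lenient_text, and strict_text are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextCollector(HTMLParser):
    """Best-effort reader: keeps the text, ignores how tags are nested."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def lenient_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def strict_text(markup):
    # Raises xml.etree.ElementTree.ParseError if markup is not well-formed.
    return "".join(ET.fromstring(markup).itertext())


SLOPPY = "<p>Hello <b>world</p></b>"  # <b> straddles the end of <p>
CLEAN = "<p>Hello <b>world</b></p>"

print(lenient_text(SLOPPY))  # best effort: the text still comes out
print(strict_text(CLEAN))    # strict parser accepts well-formed input

try:
    strict_text(SLOPPY)
except ET.ParseError as err:
    print("strict parser refused the document:", err)
```

The lenient reader never complains, but it also cannot say which text was meant to be bold; the strict reader either returns an unambiguous tree or nothing at all.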