Chapter 8 - CSS - XML: A Deeper Understanding by John Shirrell

8.1 CSS

Now that you have your XML documents, you are probably anxious to develop visual formatting rules, so your document can be viewed in a browser in a manner you define, rather than just seeing source code. You are in luck, because there are not one but two standards for visual presentation. The simpler one is Cascading Style Sheets (CSS). Cascading Style Sheets are widely used on webpages to format fonts, background colors, table column widths, and that sort of thing. The use of CSS varies slightly between HTML (and XHTML) and XML, so I will cover both applications of CSS. For the most part, the syntax is the same between the two.

CSS is another W3C standard, and it was created to make the visual appearance of web pages more consistent and easier to customize. For example, before CSS, it was common to decrease the font size of capital letters to create a "small caps" effect on webpages. CSS has a small-caps keyword that accomplishes the same effect automatically. There are also many tricks you can do with CSS that you could not do with plain HTML; for example, if you are viewing the online version of the book in a browser other than Internet Explorer 6 or below, you will see the navigation menu fixed in the top-left corner of the window. This is called fixed positioning, and is an example of a place where Internet Explorer 6 deviates from the CSS recommendation. (Internet Explorer 7 displays the fixed menu properly.)

There are two things you really need to know about CSS; the rest is just syntax that you can look up in the specification (I am using CSS2.1 and CSS2, but the CSS1 recommendation is all in one page and easier to follow). A Cascading Style Sheet is made up of rules, and those rules are defined by selectors and properties. A selector is a string that defines where this rule applies. For example, if you want a rule to apply to all b elements in an XHTML document, you would make b your selector. Section 8.2 will cover selectors. A property is an instruction to display the content affected by the selector in a certain way, and that is covered in section 8.3. Here is a quick and dirty example, which shows what your web browser does with b elements using its default stylesheet:

b {
   font-weight: bold;
}

The selector is just the element b. All properties are listed inside the curly braces, with the name of the property first, then a colon, then the property's value, then a semicolon. This one sets the font-weight property to bold. This may raise the question: why would someone need a stylesheet to define that b elements should display in bold? Well, HTML just happens to have another element, strong, which defines strong emphasis. Most browsers display the strong element in bold. However, an author might decide he would like his strong elements to be displayed in a big, red font with a yellow border. He can override the strong element's appearance in the browser's default stylesheet with his own:

strong {
   color: red;
   font-size: 2em;
   border: 1px solid yellow;
}

This rule would change all the strong elements to red, double-size font, with a 1 pixel solid yellow border. The font would, however, still be bold: the new rule does not mention font-weight, so the value from the browser's default stylesheet still applies. Something similar happens between elements. Many properties, such as color and the font properties, are inherited by child elements unless a rule specifically resets them. For example, if you have an i (italics) element within the strong element, the contents of the i element do not fall back to the default font; they are still big, red, and bold, and they sit inside the strong element's border. The italic style is applied to the i element's contents in addition to the properties the text would have had in the parent element. (Box properties such as border are not inherited, so the i element does not get a second border of its own.) Stylesheets are layered in the same way. The browser will first apply the default stylesheet, then any stylesheets that are linked from the document, then any stylesheets embedded directly in the document, then any style attributes set directly on elements (the last two can only be done in HTML). A lower-level stylesheet can reset any property that is set at a higher level. There is an exception to this: !important can be added after any property value to tell the browser that the declaration is important and cannot be overridden by a lower-level rule, unless, of course, that lower-level declaration is also important. This is often used by viewers with visual impairments; in a user stylesheet configured in their browser, they set an important rule to set all font sizes to a certain point size they can see clearly. In CSS2, an important declaration in a user stylesheet wins even over an important declaration from the page's author. Even so, use !important sparingly.
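As a rough illustration of that layering, here is a sketch of three rules for the same element coming from three different places. The values are made up for the example, and the comments about which stylesheet each rule lives in are assumptions, since a single listing cannot show that.

```css
/* Browser default stylesheet (assumed) */
strong {
   font-weight: bold;
}

/* Author stylesheet: adds a color and leaves font-weight alone,
   so strong text ends up both red and bold */
strong {
   color: red;
}

/* User stylesheet (assumed): the !important flag keeps this size
   from being overridden by an ordinary author rule */
strong {
   font-size: 16pt !important;
}
```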

8.2 Selectors

Selectors are used to select the space where a rule will apply. The simplest selector, as seen above, is simply the element name. One handy feature of CSS selectors is that you can group several of them in one rule, delimited with commas. The example below applies to the elements b and strong, but either of those could be a more complicated selector, which I will cover in a moment.

b, strong {
   font-weight: bold;
}

You can also write selectors that match an element only when it is a child or a descendant of another. To define a selector in a case where element d is a descendant of element a, you simply define it like so:

a d { }

Any properties you define in this rule will apply to the d elements that are descendants of an a element (children, grandchildren, etc.) and only those d elements. If you want to limit the rule to children, you add a greater-than sign. The next example matches c elements when c is a child of p:

p > c { }

One way to help remember this syntax is to remember that parents are older than, bigger than, and smarter than children (at least usually so). Or, if your problem is that you can never remember which way the arrow points for greater than, just remember that two parents make each child, and there are two points on the parent side.

Would you believe me if I told you that CSS even has a syntax for the first born child (well, first child, anyway)? Well, it does. This is the first example of a pseudo-class, which is a selector that applies under special circumstances. The name of a pseudo-class comes after a colon in all cases.

c:first-child { }

This rule applies to any element c that is the first child of its parent. It can be any parent element, though. If you want to ensure that it only applies when c is the first child of p, combine the two selectors like this:

p > c:first-child { }
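To see the three forms side by side, here is a small sketch using the deal and location elements from Fred's coupon vocabulary. How those elements are nested in a real coupon document is an assumption here, and the property values are arbitrary.

```css
/* Any location anywhere inside a deal (descendant) */
deal location {
   font-weight: bold;
}

/* Only a location that is a direct child of deal */
deal > location {
   color: darkgreen;
}

/* Only a location that is a direct child of deal
   and is also the first child of that deal */
deal > location:first-child {
   border: 1px solid black;
}
```

A location that is the first child of a deal matches all three rules, so it would pick up all three properties.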

You can use pseudo-classes to apply rules in certain situations that can change dynamically. The most popular (and in some cases, most overused) is the :hover pseudo-class, which applies when a mouse cursor hovers over the selected area. :focus applies to an item that has focus (applies to keyboard navigation and form controls). There are also pseudo-classes for the three types of links in HTML. For a link that has not been visited, use a:link, for a visited link, use a:visited, and for an active link, use a:active. (:active applies to any element that is activated, but only hyperlinks seem to use it.)
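A minimal sketch of these pseudo-classes on HTML links follows; the colors are arbitrary. The order matters: when two rules of equal weight both match, the later one wins, so :hover and :active are usually written after :link and :visited.

```css
a:link {
   color: blue;              /* not yet visited */
}
a:visited {
   color: purple;            /* already visited */
}
a:hover {
   background-color: yellow; /* mouse cursor is over the link */
}
a:focus {
   border: 1px dashed;       /* reached with the keyboard */
}
a:active {
   color: red;               /* being clicked right now */
}
```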

You can select an element b only when it is immediately preceded by an element a that has the same parent:

a + b { }

You can select an element e where attribute a is set to value:

e[a="value"] { }

Or you can select by an element's ID, or identifier. Recall that an identifier is defined uniquely for one instance of an element. You would not select by ID as if it is an attribute like the above example. Select by ID using this syntax:

e#abc { } /* element e with ID abc */

#abc { }  /* ID abc, does not have to be an e element */

Note the comments; in CSS, you use C-style comments. However, CSS does not allow line comments (two forward slashes). You may try one and find that the browser you test in tolerates it, but you cannot rely on that, because line comments are not in the CSS recommendation.

You use a number-sign to select by ID. The first example will only match an element with the ID abc if it appears on an e element. The second will match regardless of what element abc is.

Classes are groups of elements, or specifically, instances of elements, that are related in some way. Classes are often used in HTML to make some div elements have one set of properties, and other div elements have another. In HTML, a class is defined using the attribute class="classname". You could use the attribute selector from above, but in HTML only, you can use this shorthand instead:

e.classname { } /* Only applies to element e */

.classname { }  /* Applies to all elements in class */

The first example combines an element with the class name. Many WYSIWYG HTML editors do this, and I will never understand why. It can be very confusing to define a class, and then have it only apply to one element that has that class.

What if you want to use classes in XML? The CSS2 Recommendation defines this shorthand for HTML/XHTML only, so the only way you can use classes in XML is to use an attribute selector. You can write the attribute selector on its own, with no element name in front of it:

[class="classname"] { }

CSS2 reads a selector like this as if it began with the universal selector *, which is a wildcard that matches any element. The wildcard works anywhere you would like to represent "all" elements, and you can write it out if you find that clearer:

*[class="classname"] { }

Both forms select all elements in that class in your XML document. (They work in HTML and XHTML, too, but why use a more complicated syntax? Just use the shorthand.)
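One detail is worth a sketch: the = form only matches when the attribute value is exactly the given string. If an element lists several class names separated by spaces, CSS2 also offers the ~= form, which matches one word in that list. The class name and the class attribute on coupon below are hypothetical.

```css
/* Matches class="special" only */
*[class="special"] {
   color: red;
}

/* Matches class="special", class="special offer",
   and class="weekly special" */
*[class~="special"] {
   font-weight: bold;
}

/* Matches any coupon element that has a class attribute at all */
coupon[class] {
   border: 1px solid black;
}
```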

These selectors are all you should ever need on a regular basis. However, if you want to have fun, go to the W3C site and look up the pseudo-elements :first-line, :first-letter, :before, and :after. They do roughly what their names suggest: the first two select part of an element, and the last two insert generated content before or after an element's content.

8.3 Properties

The properties are mostly intuitive, and there are many great references that can be consulted when you want to find a property for the style you are looking for. One of my favorite methods is to type "CSS" and the effect I am looking for into Google. I will cover a few basic things that we will need to style Fred's coupon documents.

One of the main considerations for how an item is drawn is whether it is displayed as block or inline. You might remember from HTML that p and div are block-level containers and span is an inline container. These, too, are defined by the default stylesheet in your browser. This is how it might appear:

p, div {
   display: block;
}
span {
   display: inline;
}

The two values you will use most for the display property are block and inline. Inline display flows with the surrounding text, while block display is set apart on its own lines. For most of your XML elements, you will want block display, but there is always the occasional exception. You can also set the display property to none if there is an item you do not want displayed at all.
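For an XML document this matters more than it does in HTML, because inline is the initial value of display and the browser has no built-in rules for your elements. The sketch below uses a hypothetical vocabulary (article, para, term, editor-note) that is not part of this book's examples.

```css
/* Hypothetical XML vocabulary: article, para, term, editor-note */
article, para {
   display: block;      /* each one starts on its own line */
}
term {
   display: inline;     /* flows within the surrounding text */
   font-style: italic;
}
editor-note {
   display: none;       /* not rendered at all */
}
```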

The position property allows you to choose how an element's box is placed on the page. It is most often used on block elements, although the specification allows it on any element.

   position: static;
   position: relative;
   position: absolute;
   position: fixed;

These four positioning schemes each mean something different. Static positioning is the positioning of an element in the normal document flow (top to bottom, left to right). This is the default, and it basically instructs the browser to display this element immediately after the previous element and immediately before the next. Where that ends up will change depending on the browser size and other factors, so the top, bottom, left, and right properties have no effect on a static element. (More on this in a moment.)

Relative positioning begins by positioning the item as if it was static, then it takes a knife and cuts your block out of the document and moves it up, down, left, or right from its original location. For example, top: -20px; moves the whole block up 20 pixels. The following element stays where it was before, leaving a blank space where the original block was cut out. Of the four, I understand the point of this one the least. If any of you readers find this necessary in real life, please tell me about it.

Absolute positioning completely disregards the flow of the document, and places your block wherever you specify with top/bottom/left/right. It is then affixed there on the finished document and scrolls with the page.

Fixed positioning is like absolute positioning, but instead of placing your block in the document, it sticks it to the viewport (the visible area of the browser window) at a fixed location, and it does not scroll with the document. If you are using a browser that properly displays fixed positioning, you should notice this effect on the sidebar of this book.

Now, the properties top/bottom/left/right define where the element is displayed relative to the frame of reference (for absolute, the document or the nearest ancestor that is itself positioned; for fixed, the viewport). As a rule, you should not set both top and bottom, or both left and right, at the same time. The idea is that if you want something in the top-left corner, you would set top and left to 0px, and if you want something in the bottom-right corner, you would set bottom and right to 0px.
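Here is a small sketch of the corner idea. It assumes a hypothetical stamp element inside each coupon (stamp is not part of Fred's vocabulary). Giving coupon relative positioning with no offsets leaves it where it was, but makes it the frame of reference for absolutely positioned elements inside it, so the stamp lands in the corner of the coupon rather than the corner of the page.

```css
coupon {
   display: block;
   position: relative;   /* no offsets: stays in the normal flow */
}

/* stamp is a hypothetical element used only for this example */
coupon stamp {
   display: block;
   position: absolute;
   bottom: 0px;          /* bottom-right corner of the coupon */
   right: 0px;
}
```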

This is an example for how I styled the sidebar on the online version of this book:

#sidebar {
   position: fixed;
   top: 2px;
   left: 2px;
}

The sidebar is fixed to the viewport 2 pixels from the top and 2 pixels from the left. You could use negative measurements, and then the sidebar would be cut off at the edge of the viewport. The important thing is that in CSS, all measurements must include units (the one exception is zero, which may be written as a plain 0). You cannot just set the top property to 2 and leave it at that. The reason is that CSS supports pixels, inches, centimeters, percentages, and numerous other units. You must include the abbreviation px for pixels (or in, cm, or %).

The same applies to height and width, which may be set to a measurement or a percentage.

   height: 100px;
   width: 200px;
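The sketch below shows the same kind of property written with different units. The element names come from Fred's coupon stylesheet, but the sizes are invented for illustration.

```css
coupon {
   width: 50%;       /* half the width of the containing block */
}
serial-number {
   width: 200px;     /* pixels */
}
terms {
   width: 10cm;      /* centimeters */
   margin: 0;        /* zero is the one length that needs no unit */
}
deal {
   width: 20em;      /* relative to the font size of deal */
}
```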

The color property accepts a color written in any of several notations, and applies it to the content (often this sets the font color). I intentionally did not discuss HTML color, because HTML color can be needlessly complicated. In CSS, you may give simple color names (red, green, purple, gray, black, white) or specify colors numerically. Here are five different ways to specify the color red:

   color: red;
   color: #f00;               /* #RGB */
   color: #ff0000;            /* #RRGGBB */
   color: rgb(255,0,0);
   color: rgb(100%, 0%, 0%);

The first is the simplest way to define the color red, but is not an option for more obscure colors (such as terra cotta red). The second is a shorthand in which each hexadecimal digit is doubled, so #f00 means #ff0000; it is fairly confusing, and I would suggest you avoid it. The third is an HTML-style hexadecimal color code, which starts at 00 and goes up to FF for each of red, green, and blue. This is the same syntax that was used in HTML before CSS came along, and if you're comfortable with hexadecimal, it is the most compact of the full notations. The fourth is a decimal representation of the same thing: a range from 0-255 for each of red, green, and blue. The fifth uses percentages. I would suggest using a color picker from a graphics program, or using a color chart like the one at html-color-codes.com instead of using a trial-and-error method. You can also set the background color with the background-color property.

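Font selection can be done in steps, but it is easiest to use the catch-all font property, which allows you to set all the font options you want in one step. The font property must include at least a size and a font name, in that order, with any style or weight keywords placed before the size. You can look up the individual property names (font-style, font-weight, font-size, font-family, and so on) if you only want to change one of them. Keep in mind that anything not specified in a font property is reset to its default, so use the individual properties when you want the other font settings to be inherited.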

   font: bold italic 24pt Arial, Helvetica, sans-serif;
   font: 2em "Courier New";

First, note the quote marks. You should place quote marks around any font name that contains spaces. Browsers will usually accept an unquoted multi-word font name, but that leniency is specific to font names; elsewhere, a value with stray spaces outside quote marks can become ambiguous, and the browser discards the whole declaration.

The commas indicate a chain of fonts. The first font will be chosen; if it is unavailable, the parser chooses the next font. In the first example, the browser will check for the Arial font, and if it is unavailable, the browser checks for Helvetica, and if that is unavailable, the browser chooses a generic sans-serif font. The generic fonts in CSS are serif, sans-serif, cursive, fantasy, and monospace.

This property demonstrates two other units in CSS: points and ems. Point sizes are problematic, because they are actually fractions of an inch (1 point = 1/72 of an inch). The size of an inch on screen can vary depending on the dots-per-inch of the display, so use pixel sizes or relative sizes.

An em is a relative size, equal to the font size of the current element (the name comes from the size of the letter "m" in traditional typesetting). That information is important if you are using ems for width or height, but for font-size, an em value is a multiple of the font size inherited from the parent element. So 1em leaves the font size unchanged, and 2em doubles it. Be careful with ems, though! Always remember that an em is relative to a font size. If you define the font size for the body element in an HTML document as 2em, and then define the size of an h1 element as 10em, your actual h1 font size will be 20 times the default size, because the h1 starts from the font size it inherits from body.
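The compounding described above can be sketched like this, assuming the h1 and p elements are children of body:

```css
body {
   font-size: 2em;    /* 2 x the browser default */
}
h1 {
   font-size: 10em;   /* 10 x the inherited body size = 20 x the default */
}
p {
   font-size: 1em;    /* same as body, so still 2 x the default */
}
```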

An important thing to note is you cannot set color on a font property; you use color for that.

You can also draw borders on any element. This is another catch-all property. This one does accept colors.

   border: 2px solid red;
   border: 1px dashed;
   border: none;

The first rule sets a 2 pixel wide border around the box; the border is solid and red in color. The second sets a 1 pixel wide dashed border, and since no color is given, the border takes the element's text color (probably black). The third specifies no border at all.

Last but not least, there is the content property. This is a good one to use for XML documents, because it allows you to display labels and attributes that otherwise would not be displayed. There is a small (big) catch-22 with this property, though: it is not supported by any version of Internet Explorer up through 7. As a result, your document may be styled differently in IE, and specifically, the labels in your content properties will be gone. However, if you are willing to accept that, and expect most viewers of the XML document to be using Firefox, this can be a handy property. You may only use the content property in :before and :after pseudo-elements.

name:before {
   content: "Hello ";
}

This will appear in Firefox as "Hello " followed by the content of name. What if name has an attribute prefix with values like Mr. and Mrs.? You can chain strings with values of attributes, like so:

name[prefix]:before {
   content: "Hello " attr(prefix) " ";
}

This will say "Hello Mr. " name or "Hello Mrs. " name. This can be useful for styling XML documents, but since it doesn't work in IE, don't get too attached to it. There is a better solution coming up next chapter that works in both browsers.

There are many other properties, so be sure to look them up when you need them.

8.4 CSS Linking

For both HTML and XML, you can link to an external CSS file, and this is the best way to style a document. I will cover the HTML styling methods first, because they are the only ones that will work for HTML/XHTML in most browsers.

In HTML, you can use an external stylesheet by using the link element:

<link rel="stylesheet" type="text/css" href="styles.css" />

You can also embed the stylesheet directly in the document (this is an internal stylesheet):

<style type="text/css">
...
</style>

If you are in a hurry and just want to set a style on one element, you can set a style right on the element (although these element stylesheets can be hard to revise later and I would not recommend their use):

<div style="background-color: blue;">
Blue background here
</div>

In XML, you use xml-stylesheet, which is not a tag but a processing instruction:

<?xml-stylesheet type="text/css" href="styles.css"?>

You could do this for an XHTML document, but if you send the webpage as text/html the browser will ignore it.

As an example, I have designed a full stylesheet for Fred's coupon vocabulary. It should be fairly simple to understand if you look carefully at the selectors and properties. You can view the styled document here.

Before we look at the stylesheet, I'll make one brief explanation of something you will see. When a property value gets long, you can break it off onto a new line. The CSS parser treats a new line between values like any other whitespace, so the value simply continues on the next line until the semicolon. A \ character at the end of a line is only needed when the break falls inside a quoted string; outside a string it is an error, and the browser may discard the declaration.

coupon {
	margin: 5px;
	display: block;
	border: 1px solid black;
	background-color: white;
	color: black;
}
serial-number {
	display: block;
	font: 1em "Courier New";
}
valid-at:before {
	font: italic 1.5em "Times New Roman"; /* style comes before size */
	content: "Valid at: ";
}
valid-at {
	display: block;
	background-color: yellow;
	margin: 5px;
	text-align: center;
	font: bold 1em Arial, sans-serif; /* weight comes before size */
}
valid-at location {
	display: inline;
}
deal {
	display: block;
	background-color: lightcyan;
	font: 1em Verdana;
}
deal location:before {
	content: "Location: ";
	font-weight: normal; /* Otherwise inherits bold from element */
}
deal location {
	font-weight: bold;
}

deal value:before {
	content: "Value: ";
}

deal value {
	color: darkgreen;
}

requirement:before {
	content: "Required: " attr(guests) " Guests "
		attr(dollars) " Dollars ";
}

body {
	display: block;
	font-family: Arial; /* the font shorthand would also need a size */
}
body text[type="header"] {
	display: block;
	font-size: 2em;
	color: blue; /* CSS has no font-color property */
}
body text[type="regular"] {
	display: block;
}
terms {
	display: block;
	font: .7em "Courier New", monospace;
}
boiler:before {
	content: "Boilerplate text: " attr(code);
}
boiler {
	display: inline;
}
terms text {
	display: block;
}

One thing you may notice is that in the absence of either the guests or dollars attribute on the requirement element, you will still see the word "Guests" or "Dollars" in the label. I could have redefined this rule for all four combinations of these two optional attributes, but since this is already a horribly inelegant solution to the problem, I left it alone. We will be making a much better stylesheet using XSLT in the next chapter, so this should be viewed as a temporary solution. Of course, in the case of webpages, CSS is the de facto standard, and it works well in the final phase of a document (when no more transforming or content manipulation is necessary). If you are using XSLT to publish on the web in HTML or XHTML, your final document should still include a Cascading Style Sheet for proper display in a browser. I've only scratched the surface of CSS, but if you want to learn more, there are many websites and books available that will go into more detail than you ever wanted to know about CSS.
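Returning to the requirement rule for a moment: for the curious, this is roughly what covering all four combinations could look like, using attribute selectors that test only whether an attribute is present. It is a sketch rather than the stylesheet actually used for the coupon documents, and these rules would replace the single requirement:before rule.

```css
/* Neither attribute present: no rule matches, so no label is generated */

/* Only guests present */
requirement[guests]:before {
	content: "Required: " attr(guests) " Guests ";
}

/* Only dollars present */
requirement[dollars]:before {
	content: "Required: " attr(dollars) " Dollars ";
}

/* Both present: a selector with two attribute tests is more specific
   than a selector with one, so this rule wins over the two above */
requirement[guests][dollars]:before {
	content: "Required: " attr(guests) " Guests "
		attr(dollars) " Dollars ";
}
```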

8.5 Chapter Review & Exercises

At the end of this chapter, you should know how to define a rule in CSS. You should have a basic understanding of selectors and properties that you can use to control the appearance of a document. You should know what pseudo-classes and pseudo-elements do, as well as classes in HTML. You should know the difference between block and inline display. You should understand relative, absolute, and fixed positioning. You should know what the units of measurement are in CSS, and when to use them. Finally, you should know how to link your CSS file into your HTML or XML document.

Create a CSS for your XHTML webpage. Find a use for all of the selectors and properties you learned in the chapter. Use examples of external, internal, and element stylesheets.
Create a CSS for your computer lab XML vocabulary. Make sure that, at least in Firefox, all of the information is conveyed either by iconography (colors, borders, positioning) or with labels.
Create a CSS for Fred's menu XML vocabulary. Make sure that, at least in Firefox, all of the information is conveyed either by iconography (colors, borders, positioning) or with labels.