How to write better semantic HTML
Here are my tips for producing better HTML markup.
- HTML first, then CSS!
- Perfect semantic correctness isn’t necessarily best!
- Consider other applications of your document (other media, plus DOM manipulation).
- Semantics applies to IDs and Classnames, as well as tags!
- When to use an ID, when to use a Class.
HTML first, then CSS!
The single most useful suggestion I have for writing better HTML is to do it first, before you start applying any styles, or even thinking about the styles.
This is quite a challenge, I know, but this method really makes you think about what you write in your HTML tags! Basically, you’re forcing yourself to compose HTML based on the content alone, separately from the problems of CSS production (“How do I achieve that effect?”), and only when the HTML is written do you start to address the page styling.
You can’t always implement every design with minimal, basic HTML, but it’s a really worthwhile goal. (There are other techniques I’ll demonstrate that can help you implement complex styles that would normally require additional nested HTML elements.)
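As a rough illustration of what content-first markup might look like, here is a short article written before any styling has been considered. The headline, text and link target are invented for the example.

```html
<!-- Content-only markup: no classes, IDs or wrappers added for styling yet -->
<h1>Village fete raises record sum</h1>
<p>This year's fete raised more money than any before it.</p>
<h2>Highlights</h2>
<ul>
  <li>Cake stall</li>
  <li>Dog show</li>
  <li>Raffle</li>
</ul>
<p><a href="/fete/photos">See the photos</a></p>
```

Every tag here is chosen for what the content is (a heading, a list, a link), not for how it should look.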
One good way to know when your HTML markup is right is to show it to another designer or developer, and ask them to read it out loud, explaining what each piece of content means.
Perfect semantic correctness isn’t necessarily best!
This may seem to go against the point of the book, but it’s worth saying. As with anything in web page production, there’s never just one best way to achieve a result. Web pages are complex creations, with many considerations and dependencies beyond the markup itself.
The semantically perfect HTML page would have the absolute minimum number of tags, with the minimum of description (by way of IDs and Classnames) required to communicate the meaning. But the absolute minimum isn’t always the most useful, so some pragmatism is required.
The fact is that sometimes you do need to put in one or two extra tags that may not be required to assign meaning, but simply make your life as a CSS producer so much easier, so it’s worth the trade-off.
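Here is a small sketch of that kind of trade-off. The classname "wrapper" and the page content are made up for illustration; the point is only that the outer div adds no meaning and exists purely as a styling hook.

```html
<!-- The outer div carries no meaning of its own; it is only there for the CSS -->
<div class="wrapper">
  <h1>Site name</h1>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/about">About</a></li>
  </ul>
</div>
```

Remove the div and the document means exactly the same thing, which is a reasonable test of whether a tag is semantic or merely practical.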
You may also need to insert IDs or classes to facilitate DHTML coding, or for the benefit of a middleware developer or 3rd-party system.
At the end of the day, there’s no actual agreed standard for semantic markup.
It’s your page, so do what you feel is best.
Consider other applications of your document (other media, plus DOM manipulation)
Always bear in mind that your web page won’t necessarily only ever be a web page. When you publish content online today, it becomes part of a general mass of information, which may be consumed using all kinds of browsers and other devices, by humans as well as programs.
Another way to interact with an HTML document is by querying and manipulating the DOM (Document Object Model) using DHTML or other methods. We’ll see an example of DHTML manipulation later. Pure, semantic HTML makes all programmatic interaction and manipulation much easier.
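To give a flavour of why clean markup helps here, the sketch below builds a simple outline from a page’s headings. The page content and the "outline" ID are assumptions for the example. It only works because the section titles are marked up as real h2 elements rather than, say, bold paragraphs.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Outline sketch</title>
</head>
<body>
  <h1>Annual report</h1>
  <h2>Income</h2>
  <p>Income rose this year.</p>
  <h2>Spending</h2>
  <p>Spending stayed flat.</p>
  <ul id="outline"></ul>
  <script>
    // Collect every h2 on the page and copy its text into the outline list.
    var headings = document.getElementsByTagName("h2");
    var outline = document.getElementById("outline");
    for (var i = 0; i < headings.length; i++) {
      var item = document.createElement("li");
      item.textContent = headings[i].textContent;
      outline.appendChild(item);
    }
  </script>
</body>
</html>
```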
Semantics applies to IDs and Classnames as well as tags!
Semantic HTML doesn’t stop with tag selection.
Structural meaning is also contained within tag attributes, including alt attributes, IDs and Classnames. These should follow some basic common-sense rules.
In the same way that you should only use a table for structuring tabular data, and not for layout purposes, every HTML element should only have classnames that describe accurately what the element does or contains.
For example, you have a side column on the left of your main content, which contains links to selected sections of your site, plus advertisements. Your first instinct may be to call the column id="leftCol", but is that correct?
The key question to ask is:
What is the property of this element that differentiates it from other content?
Then, try to move up the hierarchy of meaning, striving for a simpler and more generic descriptor, until you can get no more generic and simple without losing the specificity of the descriptor, and you’ll have your answer.
Taking our side left column as an example:
- Obviously we need to give the left column a useful classname or ID attribute that we can use in CSS to shift it alongside the main content.
- “Left” is not an appropriate descriptor, as with CSS you (or someone else) might choose to switch the content over to the right-hand side. Left-ness is not a core property of the content itself – it’s a display property, so it has no place in semantic markup.
- What about “column”? Well, taking the same strict stance, a column is actually a visual organisation of content. It’s a style property, not a semantic property, so really we shouldn’t use that either. (Some smart CSS layouts can switch from laying content out side-by-side in columns, where width allows, to displaying the content in the additional or minor column beneath the main content in narrower displays.)
- So, it’s not left-ness, or side-ness, and it’s not column-ness, because these are all stylistic attributes. What property is it, then, of our left-hand column that differentiates it semantically from the main right-hand content? Well, there’s never just one right answer, but you might find that <div class="minor-content"> or class="secondaryContent" would fit the bill. Something like this would be meaningful enough to a human reader, or indeed a computer program, and would still be flexible enough to make sense if the content were rearranged for some other medium, or even if part of it were borrowed for publication elsewhere.
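Putting that reasoning into markup, the page might be sketched like this. The "minor-content" classname comes from the discussion above; "main-content" and the link targets are invented to complete the example.

```html
<div class="main-content">
  <h1>Article title</h1>
  <p>Main article text.</p>
</div>
<!-- Named for what it holds, not for where it sits on screen -->
<div class="minor-content">
  <ul>
    <li><a href="/news">News</a></li>
    <li><a href="/reviews">Reviews</a></li>
  </ul>
  <p>Advertisement</p>
</div>
```

Neither name says anything about left, right or columns, so the CSS is free to place the blocks wherever the design calls for.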
When to use an ID, when to use a Class
This is another interesting area. I’m finding it appropriate to use more classnames these days, where I might have used IDs in the past.
What’s the difference between IDs and classnames?
Strictly, there should only ever be at most one element with any particular ID on a page. Browsers are forgiving about this where CSS is concerned: you could have several <p id="callout"> elements on the same page, and the styles would still be applied to all of them, even though the markup is invalid.
But IDs aren’t only to do with CSS! They’re actually much more important and useful in the world of Dynamic HTML (DHTML), where you can programmatically manipulate the document using the DOM (Document Object Model).
In DHTML, you can grab any element using the code:
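var someParagraph = document.getElementById("para1");
If you use any ID more than once, getElementById can’t do its job properly: it returns a single element (the first match in the document), so the others can’t be reached this way.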
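For context, here is a minimal page showing that lookup in use. The paragraph text and the "callout" classname are placeholders. Note that the method name is case-sensitive and ends in a lowercase "d".

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>getElementById sketch</title>
</head>
<body>
  <p id="para1">First paragraph.</p>
  <p>Second paragraph.</p>
  <script>
    // "para1" is used once on this page, so the lookup is unambiguous.
    var someParagraph = document.getElementById("para1");
    // Give the paragraph a classname so that CSS can pick it out.
    someParagraph.className = "callout";
  </script>
</body>
</html>
```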
What about classnames? There are a couple of obvious differences first-up.
- You can re-use the same classnames several times within the same document (page)
- An element can have multiple classnames (separated with spaces, e.g. <ul class="nav special banana">)
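Both points can be seen in a short fragment like the one below. The link targets are invented, and the classnames are borrowed from the example above.

```html
<!-- "nav" is reused on two lists; the first list also carries a second classname -->
<ul class="nav special">
  <li><a href="/">Home</a></li>
</ul>
<ul class="nav">
  <li><a href="/contact">Contact</a></li>
</ul>
```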
That’s all well and good, but what other differences are there in the context of semantic HTML?
Basically, use a Class to describe a property of an element (if not already implicit in its tag type).
Use an ID to identify the unique element itself. This is the element’s core, unchangeable essence.
So, going back to our side column example from earlier. We settled on the “minor content-ness” of the element as being the most generic-yet-useful differentiating feature. Now, is this the core essence of the thing, or is it a property?
I guess that it’s theoretically possible to stick in a second side column (which might turn into some other kind of formatting device under different circumstances), so “minor-content” should arguably be a Classname of the element, not its ID (which is what I used in the example anyway). But it’s common to find the ID attribute used for things that are actually circumstantial properties rather than core essence (most of the sites I’ve ever coded probably do it!), so look out for that in your own semantic HTML code.
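One way this might play out is sketched below, with two blocks of minor content on the same page. The IDs "section-links" and "sponsors" are hypothetical names, chosen to say what each block is, while the shared classname describes the property they have in common.

```html
<!-- The ID names the unique thing; the class names a property it shares -->
<div id="section-links" class="minor-content">
  <ul>
    <li><a href="/news">News</a></li>
    <li><a href="/reviews">Reviews</a></li>
  </ul>
</div>
<div id="sponsors" class="minor-content">
  <p>Advertisement</p>
</div>
```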