Saturday, January 10 2009

When I'm doing work that affects the web presentation of projects I'm involved with, I occasionally hop down to the "Validate Local HTML" menu item in Firefox, a function that is available when you have the web development tools installed. (You can also access it via Ctrl-Shift-A, and of course you can always run the validator directly, but that seemingly tiny improvement in ease and efficiency can dramatically increase how often you use it.) In a weak sort of TDD, it is a constant sanity test of at least the fundamental HTML validity of the generated presentation, and I always strive to get it to the rewarding green no-errors-no-warnings state.

Valid XHTML 1.0 Transitional

Does it really matter, though? Ultimately what really matters is whether the site renders as close to expected as possible in the major browsers, and most of them happily overlook even egregious errors. (Internet Explorer was criticized early on for being so forgiving, but given its dominance the other browsers really had no choice but to allow the same sloppiness. Most web publishers weren't about to re-engineer their sites just to ensure that they displayed correctly in Opera, for instance.)

Out of curiosity I decided to check some other sites to see how many ensure that their (X)HTML is clean. The following are the results as they stand at this moment. Of course the state will change as content is added or removed, though a clean site is often clean by intention, with new content automatically filtered to ensure that it is pure.

(I searched around for more good examples to sit in the PASS category, but sadly they are few and far between.)

Should this be normal?

No, it shouldn't.

Some of the errors in the mechanically generated HTML are simply inexcusable, and testify to the general level of sloppiness in the web industry.

Check your HTML. Ensure it conforms to the specs it purports to obey, or accept defeat and step back to a less-demanding level. With tools like one-keystroke validation and the auto-cleanup HTML Tidy (which is available in module form, allowing you to clean up content mechanically inline in your site code; see this entry for an example of using Tidy from .NET code), there's simply no excuse.
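
As a rough illustration of how little code a mechanical check takes, here is a minimal Python sketch that flags mismatched or unclosed tags using the standard library's html.parser module. It is only a crude sanity test and not a substitute for the W3C validator or Tidy: it assumes XHTML-style markup in which every non-void element is explicitly closed, it knows nothing about DTDs, attributes, or nesting rules, and the names TagBalanceChecker and check_markup are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img",
                 "input", "link", "meta", "param"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: records mismatched or unclosed tags.

    Assumes XHTML-style markup where every non-void element is
    explicitly closed, so valid HTML 4.01 that omits optional end
    tags (</p>, </li>) would be reported as a problem.
    """

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of (tag, line) for open elements
        self.problems = []    # human-readable problem descriptions

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        # The closing tag must match the most recently opened element.
        if not self.open_tags or self.open_tags[-1][0] != tag:
            line = self.getpos()[0]
            self.problems.append(f"line {line}: unexpected </{tag}>")
            return
        self.open_tags.pop()


def check_markup(html_text):
    """Return a list of tag-balance problems; empty means balanced."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    # Anything still on the stack was opened but never closed.
    for tag, line in checker.open_tags:
        checker.problems.append(f"line {line}: <{tag}> is never closed")
    return checker.problems


if __name__ == "__main__":
    for problem in check_markup("<p><b>bold</p></b>"):
        print(problem)
```

Calling check_markup on a page's generated output returns an empty list when every tag is balanced, which makes it easy to drop into an existing test suite alongside a real validator.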

Many will wave off such criticism, declaring that if it renders fine that's what really matters. Yet the worry about purity has more to do with the code maintenance process, and with ensuring that an appropriate amount of care and concern is put into the product, in much the same way that you should strive to have 0 warnings in your projects even if the compiled output works fine regardless. It is the same reason that I try (albeit with failures at times) to avoid misspellings and typos, even if the message could be successfully conveyed with them.

   

Reader Comments

I always verify the little HTML that I write/generate, especially as an Opera user, but I can sympathize with those who don't, given the sheer volume of HTML that some people have to work with and the fact that correct HTML isn't rendered as it should be on the major browser(s). It seems to be a bit of a Catch-22 situation.

-Mark
Mark Roddy @ 1/10/2009 4:37:15 PM

Many developers are sloppy, and it comes through in all of their output. I am quite strict about ensuring that my team creates fully compliant output, but clearly many teams aren't.

Making better products isn't difficult, and with the horsepower and free software libraries and tools we have available to us, there is no excuse for slacking.
John Reynolds @ 1/10/2009 6:59:19 PM

Ultimately, having HTML that validates adds zero to the profit of a business. Therefore spending time and money fixing it is a pointless money hole.
James @ 1/11/2009 2:21:28 AM

"Ultimately, having HTML that validates adds zero to the profit of a business. Therefore spending time and money fixing it is a pointless money hole."

You could make such a claim about virtually any code quality effort. Thankfully most in the industry know that it is a ridiculous, foolish way to approach building a product. Quality does matter.

And making compliant HTML is actually ridiculously easy, so long as it is something you pay a minimum amount of attention to while building, and you don't build up a giant technical deficit and then, finished product in hand, discover that it's grossly divergent.
Lurker @ 1/11/2009 5:44:05 AM

Valid HTML markup is an ideal goal. Part of the problem for sites such as Facebook is the user-generated content that comes through the apps created for the community.

The most important thing for us has always been to make sure that the site displays well in the suite of most popular browsers, with a secondary concern being the site accessibility and with HTML conformance coming in third.
Ross Mason @ 1/12/2009 2:01:49 AM

If I have a choice between:

1) Write HTML that renders properly in all browsers but does not validate.

and

2) Write HTML that validates but won't render properly in Opera or Firefox

Then I will choose 1 every time.

The time that it takes to satisfy both criteria can often be far too high and I'm not setting out to create a web site that validates, I'm creating one that can be used by the largest number of people.

I won't go to a site, check if it validates and then not bother using it if it fails.

I will however go to a site, see that it doesn't work properly in Firefox and move on to a different site.

Valid HTML is a great thing to strive for, but as long as the browsers allow (require) sloppy code then that's probably what they will get.
Carl @ 1/12/2009 5:24:26 AM

Hey there Carl. Thanks for the comment.

For sure, there is a very real pragmatism that has to be applied, and cutting what really matters just to get code that validates is not a credible choice.

Looking at many of the deviations from validation on the sites listed, though, they can usually be chalked up to nothing more than sloppiness. The errors and warnings usually relate to items for which there would be no detriment to cross-browser compatibility if they were fixed (if anything there would be an improvement, and behaviours would be more predictable).

Ultimately it's a goal that is very similar to writing clean, properly commented code: you can certainly make a working, great product without doing so, but with a bit of discipline you can achieve both with no real sacrifice beyond care and concern about the product.
Dennis Forbes @ 1/12/2009 6:27:51 AM

PASS - http://www.bbc.co.uk/ no errors as XHTML 1.0 Strict
PASS - http://london.craigslist.co.uk/ no errors as HTML 4.01 Transitional
PASS - http://wikipedia.org/ no errors as XHTML 1.0 Strict
PASS - http://en.wikipedia.org/wiki/Main_Page no errors as XHTML 1.0 Transitional
aka @ 1/13/2009 5:23:41 AM


About the Author
Dennis Forbes is a Toronto-based software architect. While focused primarily on the .NET and SQL Server worlds, Dennis frequently ventures outside of this comfort zone into game development and image processing. He has been published in several industry magazines, has been quoted in the Wall Street Journal and has been interviewed by NPR.

He is a vice president and lead software architect at an innovative New York City hedge fund back-office services firm.

Dennis has been working on solutions for the financial, telecommunications, and power generation markets for over 15 years.





 
