Validating HTML

Jun 5

I’ve been playing around with the W3C HTML validator, and I’ve found, sadly, that there’s no easy way to get this page to validate. There were some problems that I fixed, but when I try to validate against 4.01 Transitional, I get about 50 errors related to the use of “&” in URLs.

Apparently you’re supposed to use the HTML entity for the ampersand (“&amp;”) even in URLs. But this entity isn’t present in the URL in the browser’s address bar, which is where you generally copy the URL from, so how are you supposed to convert these without manually picking through every URL you use? You could try to get funky with regular expressions, but I can’t imagine that would work perfectly in every case.
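For what it's worth, the regular-expression route might look something like this rough Python sketch. The function name `escape_ampersands` and the example URL are made up for illustration. It replaces any "&" that doesn't already begin something shaped like an entity:

```python
import re

# Match "&" only when it is NOT already the start of something that looks
# like an entity: a named one (&amp;), decimal (&#38;) or hex (&#x26;).
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)

def escape_ampersands(text: str) -> str:
    """Replace bare ampersands with &amp;, leaving existing entities alone."""
    return BARE_AMPERSAND.sub("&amp;", text)

# Hypothetical URL, copied straight from an address bar.
url = "http://example.com/search?q=html&page=2&sort=date"
link = '<a href="{}">results</a>'.format(escape_ampersands(url))
print(link)
# <a href="http://example.com/search?q=html&amp;page=2&amp;sort=date">results</a>
```

It also shows why this can't be perfect: a raw URL whose parameter name happens to look like an entity (say, "?a=1&copy;=2") would be left alone, even though that ampersand needs escaping too.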

This brings up a larger point: you can’t really expect to validate a site where a large part of each page’s HTML is provided by people other than the original Web developer. Every entry on this page, which together make up the entire middle section, can be entered by someone else, and how can I make sure they’re entering valid HTML markup?

This is where HTML Tidy integration will work very well in PHP 5. Using this tool, you can validate HTML that people enter before you store it in the database, or before you output it. You can make sure all tags are closed, all tags match, etc. so perhaps you can hope for some sort of valid markup.
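To give a feel for the "all tags are closed, all tags match" part, here is a minimal sketch in Python. It uses the standard library's `html.parser` as a stand-in, not Tidy itself or PHP's Tidy extension. The names `TagBalanceChecker` and `check_markup` are invented for this example, and unlike Tidy it only reports problems rather than repairing them:

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML 4.01.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img",
                 "input", "link", "meta", "param"}

class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatches."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append("unexpected </{}>".format(tag))

def check_markup(fragment):
    """Return a list of problems found in a user-submitted HTML fragment."""
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    errors = list(checker.errors)
    errors.extend("unclosed <{}>".format(tag) for tag in checker.open_tags)
    return errors

print(check_markup("<p>Hello <b>world</b></p>"))  # []
print(check_markup("<b>bold"))                    # ['unclosed <b>']
```

This is stricter than HTML 4.01 in one respect: it treats every non-void element as needing a closing tag, even those (like "p" and "li") whose end tags are optional. A check like this could run before a comment is stored or before it is output.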

But, in an even larger sense, does validation matter much? I’ve never gotten any comment from anyone about the validation of this site. So what if I’m throwing 50 errors because of ampersands in URLs? Can someone provide me with a valid (excuse the pun) reason why this matters?

I understand problems can occur from gross misuse of the HTML spec, but are all validation errors created equal? My apparent misuse of ampersands has got to rank pretty low on the sin list.


Comments

by Philipp Lenssen,   June 5, 2004 12:30 PM  

Yes, ampersands must be written as & in HTML (this includes attribute values). This is to differentiate them from the beginning of an entity. Naturally most browsers can cope with a lonely & in a URL.

However, I always make sure to write &. For one thing, I don't want 50 trivial errors to hide the 51st real error which might matter, and the HTML validator is useful in finding those too.

With a content management system I wrote some years ago (Onpage2.com), all user input would be validated by HTML Tidy before it was stored as an XML snippet in the SQL database. You might not think too much of HTML validation, but if you use XML (to have it be XHTML in the end) validation is absolutely necessary. You cannot handle XML objects which don't validate, really.

Talking about validation, just found some bugs in my current template. Bugs always reintroduce themselves if you don't constantly validate!

Good luck with your ampersands!


by Philipp Lenssen,   June 5, 2004 12:34 PM  

Just noticed you don't correctly escape ampersands in comments either, so some of the meaning above got lost -- you should always convert all "&" entered by the user to "&amp;"...


by Mean Dean,   June 7, 2004 9:20 AM  

You've been blogged ... click on my name


by dz,   June 9, 2004 2:36 PM  

There is a plugin for MT that will perform the '&' to '&amp;' conversion for you, or you could hack one in yourself.


by ,   March 20, 2005 1:52 PM  

That's why the W3C recommends using semicolons rather than ampersands to separate key/value pairs in the query string. Quite a simple solution, really.


by Deane,   March 20, 2005 3:22 PM  

But what Web development languages support that syntax? (And by "support," I mean "automatically parse.")


by emil,   March 21, 2005 6:15 AM  

Any, unless you're afraid of parsing the query string yourself.
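Parsing such a query string by hand takes only a few lines. A minimal sketch in Python, assuming pairs may be separated by either ";" or "&" (the function name `parse_query` and the sample query string are made up for illustration):

```python
from urllib.parse import unquote_plus

def parse_query(query: str) -> dict:
    """Parse a query string whose pairs are separated by ';' or '&'.

    Returns a dict mapping each key to a list of its values, since a
    key may legitimately appear more than once.
    """
    params = {}
    # Normalise separators first; percent-encoded "%26" is untouched here
    # and only decoded afterwards, so literal ampersands in values survive.
    for pair in query.replace("&", ";").split(";"):
        if not pair:
            continue  # skip empty segments such as a trailing ';'
        key, _, value = pair.partition("=")
        params.setdefault(unquote_plus(key), []).append(unquote_plus(value))
    return params

print(parse_query("id=42;tag=html;tag=w3c;q=tom+%26+jerry"))
# {'id': ['42'], 'tag': ['html', 'w3c'], 'q': ['tom & jerry']}
```

A link written as "page?id=42;tag=html" contains no ampersands at all, so nothing in the href needs escaping.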


by http://validator.w3.org like source,   July 8, 2009 5:27 AM  

Hi, I need a site like http://validator.w3.org. Can anyone help me get started building one?

Contact: sayfrndship@gmail.com. Thanks in advance.


