So you've written a bunch of HTML, and you think you're done. Not by a long shot -- at least, not if you want to avoid being a laughingstock on the Web. Before all these great tools were available, an occasional misspelled word or broken link was acceptable, but now you have no excuse for such boo-boos. Before you post pages for public display, you must perform three important checks: HTML validation, spelling, and links. Lots of different tools can do one or more of these jobs, and many of them are available for free on the Web. You can find a number of stand-alone utilities as well, not to mention those already embedded in HTML editors. As we said, you no longer have a valid excuse for mistakes.
Most browsers are forgiving of markup errors. Most don't even require an <HTML> tag to identify an HTML page, and instead look only for an .html or .htm suffix to identify a document as readable. Just because the real world is that way doesn't make it right. You may see a day when browsers can't afford to be so forgiving, and that day is drawing closer as HTML becomes more complicated and precise. It's better to get it right from the beginning and save yourself a bunch of trouble later on.
HTML validation is built into many HTML editors, and although not many standalone HTML validation applications exist, you do have your choice of a number of free online validation systems. We like a couple in particular. The first is the Webtechs HTML Validator, which was one of the original validators. This validator lets you check entire documents or just snippets of HTML.
You can choose which HTML DTD you want to match -- from strict HTML 2.0 to various implementations of 4.0, as well as several browser-specific DTDs in between. The output from this validator can be a little difficult to read, however, because it is in a dialect of English we call DTD speak. Here's the output we received when we intentionally broke a page and submitted it to the validator to check against the 3.2 DTD with the Show Input, Show Formatted Output, and Treat URL Ampersands parameters selected:
nsgmls:<OSFD>0:13:14:E: there is no attribute "BGCOLOR"
nsgmls:<OSFD>0:19:7:E: element "CENTER" undefined
nsgmls:<OSFD>0:23:59:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:29:15:E: character "%" is not allowed in the value of attribute "WIDTH"
nsgmls:<OSFD>0:33:19:E: there is no attribute "CELLPADDING"
nsgmls:<OSFD>0:37:11:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:39:47:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:41:86:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:45:47:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:47:95:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:57:11:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:59:47:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:61:82:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:65:47:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:67:92:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:75:11:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:77:47:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:79:93:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:83:47:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:85:86:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:93:11:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:95:49:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:97:88:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:107:15:E: character "%" is not allowed in the value of attribute "WIDTH"
nsgmls:<OSFD>0:109:96:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:110:98:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:114:15:E: character "%" is not allowed in the value of attribute "WIDTH"
nsgmls:<OSFD>0:125:16:E: general entity "amp" not defined and no default entity
Pretty ugly, if we do say so ourselves. Notice that several errors repeat. Each repetition of "there is no attribute" tells us that the 3.2 DTD doesn't support that attribute on the element where we used it. You have to learn how to read validator output to know what to fix and what to leave alone. Generally, editor validators are kinder and gentler and provide some mechanism in the interface to help you fix errors, rather than presenting you with complicated output like this.
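One way to make a report like this easier to digest is to group the repeated messages and count them. The following is a minimal Python sketch, not part of the Webtechs service. It assumes each report line has the shape shown above, with the two numbers after `<OSFD>0` being the line and column of the error. The names `ERROR_LINE` and `summarize` are our own, chosen for illustration.

```python
import re
from collections import Counter

# Matches report lines such as:
#   nsgmls:<OSFD>0:13:14:E: there is no attribute "BGCOLOR"
# Assumed fields: line number, column number, severity letter, message.
ERROR_LINE = re.compile(r'^nsgmls:<OSFD>\d+:(\d+):(\d+):([A-Z]): (.*)$')


def summarize(report):
    """Count how often each distinct message appears in a validator report."""
    counts = Counter()
    for raw in report.splitlines():
        match = ERROR_LINE.match(raw.strip())
        if match:
            counts[match.group(4)] += 1
    return counts


# A few lines copied from the report above.
sample = '''nsgmls:<OSFD>0:13:14:E: there is no attribute "BGCOLOR"
nsgmls:<OSFD>0:19:7:E: element "CENTER" undefined
nsgmls:<OSFD>0:23:59:E: there is no attribute "BORDER"
nsgmls:<OSFD>0:37:11:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:39:47:E: there is no attribute "WIDTH"
nsgmls:<OSFD>0:41:86:E: there is no attribute "BORDER"'''

# Print the most frequent messages first.
for message, count in summarize(sample).most_common():
    print(f"{count:3d} x {message}")
```

A summary like this shows at a glance that a handful of unsupported attributes account for most of the report, so fixing one kind of mistake clears many lines at once.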
|To check your HTML with the Webtechs Validator, point your Web browser to www.webtechs.com/html-val-svc/.|
If the Webtechs Validator is a little intimidating, you can use the Kinder, Gentler HTML Validator instead. It is based on the Webtechs Validator, and its output is a little less overwhelming and easier to read, but it offers fewer options. You must enter the URL to be checked, make a couple of choices about what kind of information you want returned, and then wait for the results. The URL for this site is ugweb.cs.ualberta.ca/~gerald/validate/.
When we submitted the same Web page as in the previous example to this validator, we received an explanation of not only what was wrong, but why it was wrong. For instance, the page didn't include a <!DOCTYPE> declaration, and the validator provided us with a solid discussion of why the page needed to have this often-overlooked declaration. The error output is much easier to read as well.
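If you want a quick local test for a missing document type declaration before you submit a page to a validator, a few lines of Python can do it. This is only a sketch under our own assumptions: the function name `has_doctype` is hypothetical, and the check only confirms that a declaration is present at the top of the file, not that it names a DTD the page actually conforms to.

```python
import re

# A document type declaration looks like
#   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
# and belongs before the <HTML> tag. Only whitespace and comments
# are allowed to come before it in this check.
DOCTYPE = re.compile(
    r'^\s*(?:<!--.*?-->\s*)*<!DOCTYPE\s+HTML\b',
    re.IGNORECASE | re.DOTALL,
)


def has_doctype(markup):
    """Return True if the markup starts with an HTML DOCTYPE declaration."""
    return DOCTYPE.match(markup) is not None


with_declaration = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<HTML></HTML>'
without_declaration = '<HTML><BODY></BODY></HTML>'

print(has_doctype(with_declaration))
print(has_doctype(without_declaration))
```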
Granted, it takes a bit longer to get through the program's output, and you have no control over the DTD that your HTML is checked against. But if you've never worked with a validator before, this is a good place to start before tackling Webtechs' more detailed (but less intelligible) results.
Regardless of which validator you use, you must check each and every HTML page for accuracy. The more valid your HTML, the better the chance that your pages will look as you intend them to on a variety of browsers.
What is the biggest problem with checking HTML pages for spelling errors? The tags themselves are misspellings, according to Webster's and most other dictionaries. Sitting and clicking the Ignore button for each and every new tag can make spell checking tedious. After your eyes glaze over, you're more apt to miss real misspellings. Once again, many editors include HTML-aware spell checkers that skip markup and check just the text. Because so many editors support this option, few stand-alone utilities are available, and we couldn't find any dedicated online spell checkers.
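The idea behind an HTML-aware spell checker, skipping the markup and checking just the text, can be sketched in a few lines of Python with the standard `html.parser` module. This is an illustration of the idea, not how any particular editor does it. The class `TextExtractor`, the function `misspelled_words`, and the tiny `KNOWN` word list are all hypothetical; a real check would load a full dictionary.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text between tags, ignoring tag and attribute names."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text content only, never for the markup itself.
        # Note: the contents of <SCRIPT> and <STYLE> also arrive here.
        self.chunks.append(data)


def misspelled_words(markup, known_words):
    """Return the sorted, lowercased words in the page text not in known_words."""
    extractor = TextExtractor()
    extractor.feed(markup)
    extractor.close()
    text = " ".join(extractor.chunks)
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    return sorted({w.lower() for w in words if w.lower() not in known_words})


# Hypothetical, deliberately tiny word list for the example.
KNOWN = {"welcome", "to", "my", "home", "page"}
page = ('<HTML><BODY BGCOLOR="white"><CENTER>'
        'Welcome to my hoem page</CENTER></BODY></HTML>')

print(misspelled_words(page, KNOWN))  # prints ['hoem']
```

Tag names such as CENTER and attribute names such as BGCOLOR never reach the word list, so only the real typo is reported.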
|Dr. HTML is an HTML checking tool that performs several different checks, including spell checks, on any HTML document or on an entire site. To investigate this utility and try its analytical skills, please visit this Web site:
Regardless of how you do it, even if it means cutting and pasting text from a browser to a word processor, you must check your pages for spelling errors. Spelling is often taken as an indicator of intelligence and ability, and we wouldn't want anyone to underestimate you.
If you think spelling errors are embarrassing, here's something that's even worse: broken hyperlinks. Hyperlinks make the Web what it is; if you have broken links on your site, that's borderline blasphemous. Seriously, if your text promises a link to a great resource or page but produces the dreaded 404 Object Not Found error when that link is clicked, users will be disappointed and may never revisit your site. The worst broken link is one that points to a resource in your own pages. You can't be held responsible for what others do to their sites, but you are 100 percent accountable for your own site. Don't let broken links happen to you!
As with the other checks, many HTML editors include built-in local link checkers, and some editors even scour the Web for you to check external links. In addition, a majority of Web servers also offer this feature. Checking external links isn't as simple as it sounds, because the checking program must query each link over an active Internet connection. This can be processor intensive, and you should check external links only during off-peak hours, like early morning, to avoid tying up other Web servers as well. A number of scripts and utilities are available on the Web to help you test your links. In the following sections, we share some of our favorites.
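To make the difference between internal and external links concrete, here is a minimal Python sketch of the two steps any link checker has to perform: collect the links on a page, then query each one. It is our own illustration, not the method used by any of the tools described in this section. The names `LinkCollector`, `classify_links`, and `link_status` are hypothetical, and the example.com and example.org addresses are placeholders.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the HREF value of every <A> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def classify_links(markup, page_url):
    """Split a page's HTTP links into (internal, external) absolute URLs."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    site = urlparse(page_url).netloc
    internal, external = [], []
    for href in collector.links:
        absolute = urljoin(page_url, href)
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto: and other non-Web links
        (internal if parts.netloc == site else external).append(absolute)
    return internal, external


def link_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # for example, 404 for a broken link
    except OSError:
        return None  # no connection, unknown host, or timeout


page = ('<A HREF="chapter1.html">One</A> <A HREF="/extras/">Extras</A> '
        '<A HREF="http://www.example.org/tools.html">Tools</A> '
        '<A HREF="mailto:someone@example.com">Mail</A>')
internal, external = classify_links(page, "http://www.example.com/docs/index.html")
print(internal)
print(external)
```

Only `link_status` needs a live Internet connection, which is why the example doesn't call it. In practice you would run it over the external list during off-peak hours and report every URL that returns 404 or None.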
MOMSpider was one of the first link checkers available to Web authors. This link checker is written in Perl and runs on virtually any UNIX machine. The nice thing about MOMSpider is that it needn't reside on the same computer as the site it checks, so even if you don't serve your Web site from UNIX, you can still check links from MOMSpider on a remote system.
Anyone who has some knowledge of Perl can easily configure MOMSpider to create custom output and to check both internal and external links on a site. Don't fret; if you don't know Perl, you can easily find a programmer who can adjust MOMSpider in his or her sleep for a nominal fee. Many ISPs will run MOMSpider on your site for a low monthly fee and will cheerfully handle the configuration and implementation for you.
|To find out more about MOMSpider, visit the official site at
Web Walker is a simpler, annotated version of MOMSpider that non-Perl users can implement themselves with just a little study. Once again it must run on a UNIX server with Perl installed, but the program itself is heavily commented to help you configure it without calling in a programmer. If you feel adventurous and want to try your hand at a little programming, give Web Walker a shot.
|Point your browser at the Web Walker page for
CheckBot is yet another Perl script based on the work of Roy Fielding, the programmer who created MOMSpider, and is similar to Web Walker in that it is a simpler, more annotated version of MOMSpider. CheckBot runs on any server with Perl installed, and you can configure it without too much hassle if you're willing to do a little reading.
|To learn more about CheckBot, take a look at its Web page at
You've probably noticed that all the link checkers we mention are scripts (Perl scripts, to be specific). Let this be your first clue that link checking is not quick and easy, but it is an essential task all the same. We recommend you check all links on a site weekly. If you can't manage that, check them at least monthly. If you don't, you'll have dead links and eventually a dead site.
We've only highlighted a few of the many different validation and checking utilities available on the Web. You can find numerous others, and more will soon appear. For a complete, up-to-date list of these tools, visit the Yahoo! Validation and Checkers page at
E-mail: HTML For Dummies
Webmaster: Natanya Pitts, LANWrights
Revised -- January 16, 1998