HTML Tidy - Frequently Asked Questions

Overview

Certain questions about Tidy come up on a regular basis. These are some that have been culled from postings to the html-tidy@w3.org and tidy-develop@lists.sourceforge.net mailing lists. If you don't see your question addressed here, see How To Get Support below.


What Now?

If you have a popup screen that reads as follows:

HTML Tidy for Windows <vers 1st August 2002; built on Aug 8 2002, at 15:41:13>
Parsing Console input <stdin>

and do not know what to do next, read on.

Tidy is waiting for your HTML to arrive on standard input so that it can parse it. Tidy is fundamentally a tool that reads in HTML, cleans it up, and writes it out again. It was developed as a program you run from the console prompt, but there are GUI encapsulations available, e.g. HTML-Kit, which you might prefer.

If you are using Windows, the first step is to unzip the zip file and place the tidy.exe file in a folder somewhere on your executables path. You may also want to set up a config file to save having to type lots of options each time you run Tidy. From the console prompt you can run Tidy like this:

C> tidy -m mywebpage.html

In this case, the -m option requests Tidy to write the tidied file back to the same filename as it read from (mywebpage.html). Tidy will give you a breakdown of the problems it found and the version of HTML the file appears to be using.
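Because -m overwrites the input file, it can be worth keeping a copy of the original first. The following is a minimal sketch and not part of Tidy itself: the function name tidy_in_place, the .bak suffix, and the TIDY variable used to locate the executable are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tidy): back up a file, then tidy it in place.
# TIDY names the executable to run; it defaults to "tidy" on the PATH.
tidy_in_place() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "tidy_in_place: no such file: $file" >&2
        return 1
    fi
    # Keep the original so the change can be reviewed or undone.
    cp -p "$file" "$file.bak" || return 1
    # -m asks Tidy to write the result back to the same filename.
    "${TIDY:-tidy}" -m "$file"
}
```

A typical call would be tidy_in_place mywebpage.html, after which mywebpage.html.bak holds the untouched original.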

To get a listing of Tidy command line options, just type tidy -?. To see a listing of configuration options, try tidy -help-config. To get more info on the config options, see the Quick Reference.

See also Dave Raggett's User Guide.

If you're not comfortable with the DOS command line, you should try one of the GUI Applications.

How To Get Support

For general HTML Tidy support, the original mailing list html-tidy@w3.org is best. Sometimes developers are the last to know... Also, this list covers both Java and C versions, not to mention various value-added products such as GUI front ends, Perl and Python integration, etc. If you don't get a response after a couple of tries, or if you have a bug fix, bump it over to the developer list at tidy-develop@lists.sourceforge.net. It's not a hard line, but that is the general arrangement.

How to Submit A Bug Report

You are encouraged to report any bugs you find to the Tidy developer team. Tidy's quality depends on your feedback. You can either file your bug report in the SourceForge bug tracker for HTML Tidy (recommended) or send mail to the mailing list at html-tidy@w3.org. Note that you do not need a SourceForge account to file bug reports, nor do you need to be subscribed to html-tidy@w3.org to post messages to the list.

Prior to submitting a bug report, please check that the bug is not already known. Many are. If you are not sure, just ask. If it is a new bug, make sure to include at least the following information in your report:

This information is necessary to reproduce whatever is failing; without it we cannot help you. Additional information, and patches, are very welcome!

Please include only one bug per report. Reports with multiple bugs are harder to track, and some bugs may get missed.

How to Submit A Feature Request

If you want Tidy to do something new that it doesn't do today (or stop doing something), then it is probably a feature request.

The process for submitting a feature request is very similar to that for bug reports. A different tracker is used on SourceForge to denote the difference in subject matter.

As with bugs, please be sure that the feature has not already been requested. If the feature has already been requested, you can add your comments to the feature request tracker, or send mail to the mailing list indicating your wish to also have the feature implemented. If the feature has not already been requested, send the same information as for a bug report, but place special emphasis on the desired output for a given input, desired options, etc. Please be as specific as possible about what you want Tidy to do.

How Do I Control the Output Layout?

There are three primary options that control how Tidy formats your markup: indent, indent-attributes, and vertical-space.

Briefly, indent sets the level of left-to-right indenting and, to some extent, how often elements are put onto a new line. Its values are yes, no, and auto. indent-attributes is a flag that, when set, tells Tidy to put each attribute on a new line. vertical-space is a flag that, when set, tells Tidy to add some empty lines for readability. The default for all three is no. These options may be used in any combination to control how you want your markup to look. The best thing is to experiment a bit to see what you like. Be aware that indent yes is deprecated for production use, as it will cause visual changes in most browsers.

To get Tidy Classic --indent auto layout, use the following options:

indent: auto
indent-attributes: no
vertical-space: yes
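One way to apply these settings on every run is to save them in a config file and pass it to Tidy with the -config option. The sketch below assumes a helper named write_classic_config and a file name of tidy-classic.txt, both chosen here purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write the three layout options above to the
# config file named by the first argument.
write_classic_config() {
    cat > "$1" <<'EOF'
indent: auto
indent-attributes: no
vertical-space: yes
EOF
}

# Example use (file names are placeholders):
#   write_classic_config tidy-classic.txt
#   tidy -config tidy-classic.txt -m mywebpage.html
```

Keeping the options in a file rather than on the command line saves retyping them and makes the layout settings easy to share between machines.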

You can read about more Pretty Print options here.

What Version of Tidy Should I Use?

The current SourceForge builds are recommended. You can find these at http://tidy.sourceforge.net. People continue to report examples where Tidy does not catch some ill-formed HTML or, worse, generates ill-formed HTML. These cases have been significantly reduced. That said, be sure to test Tidy with some representative files from your environment.

For development work, pull the Tidy sources from CVS directly onto your development system. This way you can keep abreast of changes to Tidy and quickly resolve conflicts.

For building a front end (e.g. GUI or language binding), the simplest approach is to use TidyLib. For more information about building and coding with TidyLib, see the Introduction To TidyLib.

How Do I Run A Regression Test?

You might ask, "Why should I run a regression test?" If you are a Tidy user, you might want to compare a new version of Tidy to the version you are currently running. This is a good idea if you are using Tidy in production applications such as web publishing. If you are a Tidy developer, it is a good idea to run the regression test suite to make sure your fix or enhancement doesn't add new bugs.

Detecting new bugs is easier said than done, because sometimes they are subtle and can only be seen in browsers (or one particular browser you don't even have). But you can catch most crashes and many layout problems by running the test suite as described here.

The basic process is simple: run the test suite before and after making changes to TidyLib and compare the output markup and messages. Be aware that the test scripts for WinNT/2K/XP (alltest.cmd) and Linux/Unix (testall.sh) place the output files in tidy/test/tmp. If you forget to run the before test, you can always download a binary from the Project Page. If you are not a TidyLib developer, you can download the Test Suite directly. Here are the steps to evaluate the impact of a TidyLib change.

For Windows

Before making changes:

C:\tidy\test> alltest.cmd
C:\tidy\test> ren tmp baseline

After making changes and building Tidy:

C:\tidy\test> alltest.cmd
C:\tidy\test> windiff tmp baseline

For Linux/Unix

Before making changes:

~/tidy/test$ ./testall.sh
~/tidy/test$ mv tmp baseline

After making changes and building Tidy:

~/tidy/test$ ./testall.sh
~/tidy/test$ diff -u tmp baseline > diff.txt
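For a quick overview before reading the full diff, it can help to list only the files that differ between the two runs. This is a minimal sketch, assuming a diff that supports the -r and -q options (GNU and BSD diff both do). The function name changed_files is an assumption, and the default directory names follow the steps above.

```shell
#!/bin/sh
# Hypothetical helper: list files that differ between two test-output
# directories (defaults match the steps above: baseline and tmp).
changed_files() {
    base=${1:-baseline}
    new=${2:-tmp}
    # diff -rq prints one line per differing or unmatched file.
    # It exits 1 when differences exist, which is expected here.
    diff -rq "$base" "$new"
    status=$?
    # Exit status 2 or higher means diff itself failed,
    # for example because a directory is missing.
    if [ "$status" -ge 2 ]; then
        return "$status"
    fi
    return 0
}
```

Run from the test directory, changed_files | wc -l gives a rough count of test cases whose output changed. An empty listing suggests the change had no visible effect on the test suite output.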