The TidyHTMLTreeBuilder parser can read (almost) arbitrary HTML files and turn
them into well-formed element trees. It uses a library version of Dave
Raggett’s HTML Tidy utility to repair problems in the HTML and convert it to
XHTML (the XML version of HTML) before the element tree is built.
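
To illustrate what "a well-formed element tree built from XHTML" looks like in practice, here is a minimal sketch that uses only the standard library. The `tidied` string is a hand-written stand-in for the kind of XHTML a Tidy-style cleanup might produce; it is an assumption for illustration, not actual Tidy output, and the helper `qname` is hypothetical. The point it shows is that XHTML elements live in the XHTML namespace, so tag names must be namespace-qualified when searching the tree.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumption: a hand-written example of cleaned-up XHTML. With the
# elementtidy package installed, the tree would instead come from the
# TidyHTMLTreeBuilder parser described above.
tidied = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head>"
    "<body><p>First<br />line</p><p>Second</p></body>"
    "</html>"
)

root = ET.fromstring(tidied)


def qname(tag):
    """Hypothetical helper: qualify a tag name with the XHTML namespace."""
    return "{%s}%s" % (XHTML_NS, tag)


# Element names carry the namespace, so lookups must use qualified names.
title = root.findtext("%s/%s" % (qname("head"), qname("title")))

# .text holds only the text before the first child element of each <p>.
paragraphs = [p.text for p in root.iter(qname("p"))]

print(root.tag)
print(title)
print(paragraphs)
```

Searching for a bare `"p"` or `"title"` in such a tree finds nothing, which is a common surprise when moving from plain HTML to XHTML trees.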