HTML Parsers
JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML.
Last Release on Sep 11, 2024
Relocated → net.sf.jtidy » jtidy
3. TagSoup (201 usages)
org.ccil.cowan.tagsoup » tagsoup Apache
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to ...
Last Release on Aug 22, 2011
Jericho HTML Parser is a Java library for analyzing and manipulating parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
Last Release on Oct 25, 2015
The Validator.nu HTML Parser is an implementation of the HTML5 parsing algorithm in Java for applications. The parser is designed to work as a drop-in replacement for the XML parser in applications that already support XHTML 1.x content with an XML ...
Last Release on Jun 7, 2012
Relocated → nu.validator » htmlparser
8. HTML Lexer Jar (15 usages)
org.htmlparser » htmllexer CPL +1
HTML Lexer is the low-level lexical analyzer.
Last Release on Apr 24, 2011
