HTML Parsers

An HTML parser and tag balancer.
Last Release on Apr 17, 2015
A powerful, fast, and easy-to-use HTML and XML parser for Java.
Last Release on Jul 30, 2023
JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML.
Last Release on Jul 20, 2010
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to ...
Last Release on Aug 22, 2011
HTML Lexer is the low-level lexical analyzer.
Last Release on Apr 24, 2011