HTML Parsers

Takes third-party HTML and produces HTML that is safe to embed in your web application. Fast and easy to configure.
Last Release on Mar 13, 2026
NekoHtml is the HTML parser used by HtmlUnit.
Last Release on Dec 28, 2025
Apache Tika HTML Parser Module
Last Release on Mar 23, 2026
An HTML parser and tag balancer.
Last Release on Apr 17, 2015
Powerful, fast, and easy-to-use HTML and XML parser for Java
Last Release on Jul 30, 2023
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to ...
Last Release on Aug 22, 2011
Jericho HTML Parser is a Java library that allows analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
Last Release on Oct 25, 2015
HTML-parser provides a parser for HTML 5 that produces an HTML 5 document object model. It aims to be a Java implementation of http://www.w3.org/TR/html5/.
Last Release on Apr 29, 2024
Neko HTML
Last Release on Mar 23, 2008

Provides HTML parsing functionality
Last Release on Feb 8, 2021