Web Crawlers

Apache Nutch
Last Release on Feb 17, 2026
Crawljax Core
Last Release on Jun 1, 2023
Norconex HTTP Collector is a web spider, or crawler, that aims to be very flexible, easy to extend, and portable.
Last Release on May 25, 2025
A crawler framework. It covers the whole crawler lifecycle: downloading, URL management, content extraction, and persistence.
Last Release on Apr 23, 2024
Open Source Web Crawler for Java
Last Release on Mar 26, 2018
crawler-commons is a set of reusable Java components that implement functionality common to any web crawler.
Last Release on Oct 10, 2014
A Java crawler for information collection
Last Release on Sep 7, 2017
Easy-to-use, lightweight web crawler
Last Release on Jul 4, 2020
Charles is a smart web crawling library.
Last Release on Jan 29, 2017
Simple Java (1.6) crawler for crawling web pages within a single domain.
Last Release on Feb 8, 2014