The Apache PDFBox library is an open source Java tool for working with PDF documents. This artifact contains command-line tools built on Apache PDFBox.

Artifacts using Apache PDFBox Tools (113)

Apache Tika Parser Modules
Last Release on Mar 23, 2026
OpenCms is an enterprise-ready, easy-to-use website content management system based on Java and XML technology. Offering a complete set of features, OpenCms helps content managers worldwide to create and maintain beautiful websites fast and ...
Last Release on Apr 13, 2026
Alfresco Repository
Last Release on Mar 1, 2026
Tess4J: A Java JNA wrapper for the Tesseract OCR API. Tess4J is released and distributed under the Apache License, v2.0.
Last Release on Jan 17, 2026
Apache Solr Content Extraction Library integrates the Apache Tika content extraction framework into Solr.
Last Release on Sep 25, 2024
GATE - general architecture for text engineering - is open source software capable of solving almost any text processing problem. This artifact enables you to embed the core GATE Embedded with its essential dependencies.
Last Release on Mar 9, 2021
Apache Tika PDF Parser Module
Last Release on Mar 23, 2026
Apache PDFBox Application
Last Release on Mar 12, 2026
LogicalDOC Core
Last Release on Apr 15, 2026
OSGi bundle that contains tika-parsers. Repackaged to include the full ooxml-schemas instead of the poi-ooxml-schemas subset.
Last Release on Apr 14, 2021