PDF to text
Positioned PDF text and glyph extraction for semantic diff and comparison pipelines.
Extract text, tables, and structured content from PDF files
Search text in PDF
Text run formatting: inject font-size and color changes with state isolation
A CLI tool to convert Arabic PDFs to text using Google's Gemini API
Text extraction and search geometry for open-redact-pdf.
The fastest Rust PDF library with text extraction: 0.8ms mean, 100% pass rate on 3,830 PDFs. 5× faster than pdf_extract, 17× faster than oxidize_pdf. Extract, create, and edit PDFs.
CLI for pdf-oxide — the fastest PDF toolkit. 22 commands: text extraction, PDF to markdown, search, merge, split, images, compress, encrypt, watermark, forms, and more.
Extract text from email attachments (PDF + image OCR). PDF text via `pdf-extract` (pure Rust); OCR via the `tesseract` CLI subprocess (not linked as a C library). Two-stage fallback for scanned PDFs: try embedded text first, fall back to OCR on the raw bytes if the text is too short. Returns `ExtractionResult` with text + language + confidence + page count + JSON metadata.
A CLI PDF to plain-text converter
elizaOS PDF Plugin - PDF reading and text extraction
No description provided.
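
The two-stage fallback mentioned in the email-attachment entry above (try embedded text first, fall back to OCR when the text is too short) can be sketched roughly as follows. This is a minimal illustration, not that crate's actual code: `extract_with_fallback`, `Source`, and the `min_chars` threshold are hypothetical names, and the two extraction stages are passed in as closures so the sketch does not assume any particular `pdf-extract` or `tesseract` call signature.

```rust
/// Which stage produced the text (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Source {
    Embedded,
    Ocr,
}

/// Two-stage fallback: keep the embedded text unless it is too short,
/// otherwise run OCR on the same raw bytes.
///
/// `embedded` and `ocr` are stand-ins for the real extraction backends;
/// `min_chars` is an assumed threshold, not a value taken from the source.
pub fn extract_with_fallback<E, O>(
    bytes: &[u8],
    min_chars: usize,
    embedded: E,
    ocr: O,
) -> Result<(String, Source), String>
where
    E: Fn(&[u8]) -> Result<String, String>,
    O: Fn(&[u8]) -> Result<String, String>,
{
    // Stage 1: embedded text. A parse failure is treated like empty text,
    // so scanned or malformed PDFs still get a chance at OCR.
    if let Ok(text) = embedded(bytes) {
        if text.trim().chars().count() >= min_chars {
            return Ok((text, Source::Embedded));
        }
    }

    // Stage 2: OCR on the raw bytes; its error, if any, is the final error.
    ocr(bytes).map(|text| (text, Source::Ocr))
}
```

Passing the stages as closures keeps the decision logic testable without a PDF parser or an OCR binary installed. A real implementation would presumably also carry language, confidence, and page count alongside the text.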