Positioned PDF text and glyph extraction for semantic diff and comparison pipelines.
Extract text, tables, and structured content from PDF files
PDF to text
Search text in PDF
Text run formatting: inject font-size and color changes with state isolation
A CLI tool to convert Arabic PDFs to text using Google's Gemini API
Text extraction and search geometry for open-redact-pdf.
Extract text from email attachments (PDF + image OCR). PDF text via `pdf-extract` (pure Rust); OCR via the `tesseract` CLI subprocess (not linked as a C library). Two-stage fallback for scanned PDFs: try embedded text first, fall back to OCR on the raw bytes if the text is too short. Returns `ExtractionResult` with text + language + confidence + page count + JSON metadata.
A CLI PDF to plain-text converter
elizaOS PDF Plugin - PDF reading and text extraction
Local-first MCP server bridging Claude to your Zotero library — search, read, cite, enrich, write — over stdio or streamable-HTTP with OAuth 2.1.
CLI for pdf-oxide — the fastest PDF toolkit. 22 commands: text extraction, PDF to markdown, search, merge, split, images, compress, encrypt, watermark, forms, and more.
This is a ChupaText decomposer plugin to extract text and metadata from PDF. You can use the `pdf` decomposer.
PDF::Reader extension to find text in a PDF and get its page and position
No description provided.
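
The two-stage fallback described in the email-attachment entry above (try embedded text first, fall back to OCR if the text is too short) can be sketched roughly as follows. This is a minimal illustration in Python, not that package's actual code: the function name `extract_with_fallback`, the `embedded` and `ocr` callables, and the `min_chars` threshold of 50 are all assumptions made for the example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical extractor signature: takes raw file bytes, returns text or None.
Extractor = Callable[[bytes], Optional[str]]


def extract_with_fallback(
    data: bytes,
    embedded: Extractor,
    ocr: Extractor,
    min_chars: int = 50,  # assumed threshold for "text is too short"
) -> Tuple[str, str]:
    """Return (text, source), where source is "embedded" or "ocr"."""
    # Stage 1: try the text layer embedded in the PDF.
    text = (embedded(data) or "").strip()
    if len(text) >= min_chars:
        return text, "embedded"

    # Stage 2: the text layer is missing or too short (likely a scanned PDF),
    # so run OCR on the same raw bytes.
    ocr_text = (ocr(data) or "").strip()
    if ocr_text:
        return ocr_text, "ocr"

    # OCR produced nothing either; keep whatever stage 1 found.
    return text, "embedded"
```

Passing the two extractors in as callables keeps the sketch independent of any particular PDF or OCR library; in practice they would wrap whatever text-layer extractor and OCR tool the pipeline uses.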