cross-ecosystem search · live

Results for pdf_to_text

Found in 3 of 7 ecosystems · 1206 matches across other registries

How we search: free-text on npm, crates.io, RubyGems, NuGet and Maven. PyPI and Go do exact-name lookup only.
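To make the difference between the two search modes concrete, here is a minimal Python sketch. The ecosystem-to-mode mapping comes from the note above; everything else (the `SearchMode` enum, the `matches` function, and the token-overlap rule used for free-text) is a hypothetical illustration, not how this site or any registry actually ranks results.

```python
import re
from enum import Enum


class SearchMode(Enum):
    FREE_TEXT = "free-text"
    EXACT_NAME = "exact-name"


# Mapping taken from the search note; the identifiers are illustrative.
SEARCH_MODES = {
    "npm": SearchMode.FREE_TEXT,
    "crates.io": SearchMode.FREE_TEXT,
    "RubyGems": SearchMode.FREE_TEXT,
    "NuGet": SearchMode.FREE_TEXT,
    "Maven": SearchMode.FREE_TEXT,
    "PyPI": SearchMode.EXACT_NAME,
    "Go": SearchMode.EXACT_NAME,
}


def tokens(text: str) -> set:
    """Split on anything that is not a letter or digit, lowercased."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def matches(ecosystem: str, query: str, name: str, description: str = "") -> bool:
    """Return True if a package would match the query in this ecosystem.

    Assumption: free-text means "shares at least one token with the
    package name or description". Real registries use their own scoring.
    """
    if SEARCH_MODES[ecosystem] is SearchMode.EXACT_NAME:
        return name.lower() == query.lower()
    return bool(tokens(query) & (tokens(name) | tokens(description)))
```

Under these assumptions, a query of `pdf_to_text` matches a crates.io package named `pdf_text` through the shared tokens, while on PyPI only a package named exactly `pdf_to_text` would be found.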

crates.io matches

Showing 12 of 1,157 · Rust

PDF to text

Maintained. Niche but actively maintained.

pdf_text v0.1.12

crates.io

Positioned PDF text and glyph extraction for semantic diff and comparison pipelines.

Maintained. Niche but actively maintained.

pdf-text-extract v0.2.0

crates.io

Extract text, tables, and structured content from PDF files

Maintained. Niche but actively maintained.

Search text in PDF

Aging — last published over a year ago — check before adopting.

pdf-text-format v1.0.0-beta.8

crates.io

Text run formatting: inject font-size and color changes with state isolation

Maintained. Niche but actively maintained.

arabic_pdf_to_text v0.1.0

crates.io

A CLI tool to convert Arabic PDFs to text using Google's Gemini API

Aging — last published 11 months ago — check before adopting.

open-redact-pdf-text v0.6.0

crates.io

Text extraction and search geometry for open-redact-pdf.

Maintained. Niche but actively maintained.

pdf_oxide v0.3.60

crates.io

The fastest Rust PDF library with text extraction: 0.8ms mean, 100% pass rate on 3,830 PDFs. 5× faster than pdf_extract, 17× faster than oxidize_pdf. Extract, create, and edit PDFs.

Worth a look. Actively maintained and growing.

pdf_oxide_cli v0.3.60

crates.io

CLI for pdf-oxide — the fastest PDF toolkit. 22 commands: text extraction, PDF to markdown, search, merge, split, images, compress, encrypt, watermark, forms, and more.

Maintained. Niche but actively maintained.

mailrs-attachment-extract v1.0.0

crates.io

Extract text from email attachments (PDF + image OCR). PDF text via `pdf-extract` (pure Rust); OCR via the `tesseract` CLI subprocess (not linked as a C library). Two-stage fallback for scanned PDFs: try embedded text first, fall back to OCR on the raw bytes if the text is too short. Returns `ExtractionResult` with text + language + confidence + page count + JSON metadata.

Maintained. Niche but actively maintained.
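The two-stage fallback described for mailrs-attachment-extract (try embedded text first, fall back to OCR if the text is too short) can be sketched roughly as follows. This is a language-neutral illustration in Python, not the crate's Rust API: the function name, the injected `extract_embedded` and `run_ocr` callables, and the 50-character threshold are all assumptions made for the example.

```python
from typing import Callable, Tuple

# Hypothetical threshold; the crate's actual cutoff is not stated in its description.
MIN_TEXT_CHARS = 50


def extract_with_fallback(
    pdf_bytes: bytes,
    extract_embedded: Callable[[bytes], str],
    run_ocr: Callable[[bytes], str],
    min_chars: int = MIN_TEXT_CHARS,
) -> Tuple[str, str]:
    """Return (text, source), where source is "embedded" or "ocr".

    Stage 1: read the text layer embedded in the PDF.
    Stage 2: if that text is shorter than min_chars (typical for scanned
    documents), run OCR on the same raw bytes instead.
    """
    text = extract_embedded(pdf_bytes).strip()
    if len(text) >= min_chars:
        return text, "embedded"
    return run_ocr(pdf_bytes).strip(), "ocr"
```

Passing the two extractors in as callables keeps the sketch independent of any particular PDF or OCR library; in practice they would wrap whatever text extractor and OCR tool are available.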

cli-pdf-to-text v0.1.20

crates.io

A CLI pdf to plain text converter