cross-ecosystem search · live

Results for pdf_ocr

Found in 4 of 7 ecosystems · 1621 matches across other registries

How we search: free-text on npm, crates.io, RubyGems, NuGet and Maven. PyPI and Go do exact-name lookup only.

PyPI matches

Exact match · Python

pdf-ocr v2.0.0

PyPI

Intelligent software for extracting information from declaration-form PDF files

Aging — last published 11 months ago — check before adopting.

crates.io matches

Showing 12 of 136 · Rust

pdf-ocr v1.0.0-beta.9

crates.io

OCR integration for scanned PDFs with pluggable engine support

Maintained. Niche but actively maintained.

mailrs-attachment-extract v1.0.0

crates.io

Extract text from email attachments (PDF + image OCR). PDF text via `pdf-extract` (pure Rust); OCR via the `tesseract` CLI subprocess (not linked as a C library). Two-stage fallback for scanned PDFs: try embedded text first, fall back to OCR on the raw bytes if the text is too short. Returns `ExtractionResult` with text + language + confidence + page count + JSON metadata.

Maintained. Niche but actively maintained.
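
The two-stage fallback described for mailrs-attachment-extract can be illustrated with a minimal sketch. This is not the crate's actual API: `extract_with_fallback`, `Extracted`, `Source`, and the `min_chars` threshold are hypothetical names chosen for illustration, and the two closures stand in for whatever performs embedded-text extraction (the listing names `pdf-extract`) and OCR (the listing names the `tesseract` CLI).

```rust
/// Which stage produced the text (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
pub enum Source {
    EmbeddedText,
    Ocr,
}

/// Result of the sketch: the text plus the stage it came from.
#[derive(Debug, PartialEq)]
pub struct Extracted {
    pub text: String,
    pub source: Source,
}

/// Two-stage extraction: prefer the PDF's embedded text layer, and run OCR
/// on the raw bytes only when that text is missing or shorter than
/// `min_chars` (an assumed threshold; the listing only says "too short").
pub fn extract_with_fallback<E, O>(
    bytes: &[u8],
    min_chars: usize,
    embedded: E,
    ocr: O,
) -> Result<Extracted, String>
where
    E: Fn(&[u8]) -> Result<String, String>,
    O: Fn(&[u8]) -> Result<String, String>,
{
    // Stage 1: embedded text. In this sketch a parse failure is treated
    // the same as "no usable text", so it also falls through to OCR.
    if let Ok(text) = embedded(bytes) {
        if text.trim().chars().count() >= min_chars {
            return Ok(Extracted { text, source: Source::EmbeddedText });
        }
    }

    // Stage 2: OCR on the raw bytes. An OCR error is returned to the caller.
    let text = ocr(bytes)?;
    Ok(Extracted { text, source: Source::Ocr })
}
```

Passing the two stages as closures keeps the sketch free of external dependencies. A caller would supply one closure that wraps a PDF text extractor and another that invokes an OCR engine, and could inspect `source` to see which stage produced the text.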

spdf-pdf v0.2.0-alpha.2

crates.io

PdfEngine trait + PDFium-backed implementation.

Maintained. Niche but actively maintained.

superbook-pdf v0.1.0

crates.io

High-quality PDF converter for scanned books with AI enhancement, deskew correction, and Japanese OCR

Maintained. Niche but actively maintained.

tbel-pdf v0.2.0

crates.io

PDF processing pipeline for Belarusian financial reports with OCR, table extraction, and data normalization

Maintained. Niche but actively maintained.

engram-ingest v0.1.0

crates.io

Multimodal artifact ingestion and OCR pipeline for Engram

Maintained. Niche but actively maintained.

udataframe_rs v0.1.5

crates.io

A pure Rust library for data frame operations, particularly useful for processing data extracted from PDF files or from OCR recognition

Maintained. Niche but actively maintained.

paperdown v0.2.0

crates.io

A fast CLI tool to batch convert PDFs into Markdown using GLM-OCR.

Maintained. Niche but actively maintained.

harumi v0.7.0

crates.io

Overlay searchable CJK text on PDFs, extract text, merge/split pages — pure Rust, zero C dependencies

Maintained. Niche but actively maintained.

kreuzberg v4.9.8

crates.io

High-performance document intelligence library for Rust. Extract text, metadata, and structured data from PDFs, Office documents, images, and other files (90+ formats), with tree-sitter code intelligence for 300+ programming languages and both async and sync APIs.

Maintained. Niche but actively maintained.

rasterrocket v1.0.1

crates.io

PDF to pixel buffer — pure Rust, zero Poppler. For the CLI tool: cargo install rasterrocket-cli