cross-ecosystem search · live

Results for pdf-ocr-ts

Found in 2 of 7 ecosystems · npm 1–24 of 355,217 · 2 matches across other registries

How we search: free-text on npm, crates.io, RubyGems, NuGet and Maven. PyPI and Go do exact-name lookup only.
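
The split between free-text and exact-name ecosystems described above could be modelled roughly as follows. This is a minimal sketch, not the site's implementation: `SearchMode`, `SEARCH_MODE`, `PlannedQuery` and `planQueries` are hypothetical names introduced only for illustration, and only the ecosystem list and the mode of each ecosystem come from the text.

```typescript
// Hypothetical sketch: none of these names belong to a real API.
type SearchMode = "free-text" | "exact-name";

// Ecosystem -> search mode, as stated in the "How we search" note.
const SEARCH_MODE: Record<string, SearchMode> = {
  npm: "free-text",
  "crates.io": "free-text",
  RubyGems: "free-text",
  NuGet: "free-text",
  Maven: "free-text",
  PyPI: "exact-name",
  Go: "exact-name",
};

interface PlannedQuery {
  ecosystem: string;
  mode: SearchMode;
  term: string;
}

// Build one planned query per ecosystem for a search term.
// No network calls are made; this only records which mode would apply.
function planQueries(term: string): PlannedQuery[] {
  const trimmed = term.trim();
  return Object.entries(SEARCH_MODE).map(([ecosystem, mode]) => ({
    ecosystem,
    mode,
    term: trimmed,
  }));
}

const plan = planQueries("pdf-ocr-ts");

// Ecosystems where only a package named exactly "pdf-ocr-ts" can match.
const exactOnly = plan
  .filter((q) => q.mode === "exact-name")
  .map((q) => q.ecosystem);

console.log(exactOnly);
```

Under this assumption, a term such as `pdf-ocr-ts` can match by description or keyword on the five free-text registries, but on PyPI and Go it matches only a package with exactly that name.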

npm matches

Showing 24 of 355,217 · JavaScript

pdf-ocr-ts v1.0.37

npm

JavaScript-only library to perform OCR on scanned PDFs to turn them into searchable PDFs

Maintained. Actively maintained.

scribe.js-ocr v0.12.6

npm

High-quality OCR and text extraction for images and PDFs.

Maintained. Actively maintained.

@adobe/pdfservices-node-sdk v4.1.0

npm

The Adobe PDF Services Node.js SDK provides APIs for creating, combining, exporting and manipulating PDFs.

Abandoned. Last published over a year ago.

node-ts-ocr v1.0.15

npm

A simple wrapper around command-line utils to assist in PDF / Image OCR (Optical Character Recognition) processing using Tesseract.

Has 1 high-severity advisory. Verify a patched version exists before using.

@firecrawl/pdf-inspector v1.9.5

npm

Fast PDF classification and text extraction. Detect text-based vs scanned PDFs, extract text by region with quality checks. Native Rust performance via napi-rs.

Maintained. Actively maintained.

pdfreader v3.0.8

npm

Read text and parse tables from PDF files. Supports tabular data with automatic column detection, and rule-based parsing.

Aging — last published 7 months ago — check before adopting.

@opendataloader/pdf v2.4.7

npm

A Node.js wrapper for the opendataloader-pdf Java CLI.

Maintained. Actively maintained.

officeparser v7.2.1

npm

A robust, strictly-typed Node.js and Browser library for parsing office files (.docx, .pptx, .xlsx, .odt, .odp, .ods, .pdf, .rtf, .csv, .md, .html) and generating high-fidelity outputs in Markdown, HTML, CSV, RTF, and RAG-focused chunks.

Maintained. Actively maintained.

firecrawl-pdf-inspector v1.0.0

npm

Fast PDF classification and text extraction. Detect text-based vs scanned PDFs, extract text by region with quality checks. Native Rust performance via napi-rs.

Maintained. Actively maintained.

node-tesseract-ocr v2.2.1

npm

A Node.js wrapper for the Tesseract OCR API

Has 1 high-severity advisory. Verify a patched version exists before using.

@docmost/pdf-inspector v1.9.6

npm

Fast PDF classification, text extraction, and image extraction. Native Rust performance via napi-rs.

Maintained. Actively maintained.

pdf-extract v1.0.11

npm

Node PDF is a set of tools that takes in PDF files and converts them to usable formats for data processing. The library supports both extracting text from searchable PDF files and performing OCR on PDFs that are just scanned images of text.

Abandoned. Last published 9 years ago.

n8n-nodes-htmlcsstopdf v3.2.5

npm

n8n community node to convert HTML and CSS to PDF using PdfMunk API - perfect for invoices, reports, certificates, and document generation

Maintained. Actively maintained.

pdf.js-extract v1.0.1

npm

Super-simple async PDF reader, based on pdf.js, that extracts text with x,y page positions

Maintained. Actively maintained.

zerox v1.1.20

npm

OCR documents using gpt-4o-mini

Aging — last published over a year ago — check before adopting.

pdf-parse v2.4.5

npm

Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs. Run directly in your browser or in Node!

Aging — last published 8 months ago — check before adopting.

react-pdf v10.4.1

npm

Display PDFs in your React app as easily as if they were images.

Maintenance: Healthy

Popularity: Widely used

Security: Clean

Deps: Moderate

Solid choice. Well-adopted and actively maintained.

pdf-lib v1.17.1

npm

Create and modify PDF files with JavaScript

Abandoned. Last published 4 years ago.

unpdf v1.6.2

npm

PDF extraction and rendering across all JavaScript runtimes

Maintained. Actively maintained.

mistral-ocr v0.1.1

npm

PDF to Markdown and DOCX conversion powered by Mistral OCR.

Maintained. Actively maintained.

@llamaindex/liteparse v2.0.7

npm

Fast, lightweight PDF and document parsing with spatial text extraction

Maintained. Actively maintained.

@gutenye/ocr-models v1.4.2

npm

Guten OCR is a highly accurate text detection (OCR) JavaScript/TypeScript library that runs on Node.js, Browser, React Native and C++. Based on PaddleOCR and ONNX runtime

Abandoned. Last published 2 years ago.

@pdf-lib/upng v1.0.1

npm

Small, fast and advanced PNG / APNG encoder and decoder

Abandoned. Last published 6 years ago.

@gutenye/ocr-node v1.4.8

npm

Guten OCR is a highly accurate text detection (OCR) JavaScript/TypeScript library that runs on Node.js, Browser, React Native and C++. Based on PaddleOCR and ONNX runtime

Aging — last published over a year ago — check before adopting.