One search box and one honest, consistent read on every open-source library — across every ecosystem.

npm · PyPI · crates.io · RubyGems · Go · Maven · NuGet

© 2026 Modules · A precision instrument for picking dependencies. Data refreshed continuously from public registries, deps.dev & OSV

Results for pdf-extract

Found in 4 of 7 ecosystems. npm: 1–24 of 40,896 · 1,130 matches across other registries

npm 40,896 · crates.io 44 · RubyGems 11 · NuGet 1,075

How we search: free-text on npm, crates.io, RubyGems, NuGet and Maven. PyPI and Go do exact-name lookup only.
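The per-ecosystem behaviour in the note above can be sketched as a small lookup table. This is only an illustration: the names `SearchMode`, `SEARCH_MODES` and `matches` are hypothetical, and the substring rule used for free-text matching is an assumption, not the site's actual ranking or matching logic.

```python
from enum import Enum


class SearchMode(Enum):
    FREE_TEXT = "free-text"
    EXACT_NAME = "exact-name"


# Mapping copied from the "How we search" note; identifiers are illustrative.
SEARCH_MODES = {
    "npm": SearchMode.FREE_TEXT,
    "crates.io": SearchMode.FREE_TEXT,
    "RubyGems": SearchMode.FREE_TEXT,
    "NuGet": SearchMode.FREE_TEXT,
    "Maven": SearchMode.FREE_TEXT,
    "PyPI": SearchMode.EXACT_NAME,
    "Go": SearchMode.EXACT_NAME,
}


def matches(ecosystem: str, query: str, package_name: str, description: str = "") -> bool:
    """Return True if a package would count as a hit (hypothetical matching rule)."""
    mode = SEARCH_MODES[ecosystem]
    q = query.lower()
    if mode is SearchMode.EXACT_NAME:
        # Exact-name ecosystems only hit when the whole name equals the query.
        return package_name.lower() == q
    # Free-text: assume a plain substring match on name or description.
    return q in package_name.lower() or q in description.lower()
```

Under these assumptions, `matches("npm", "pdf-extract", "pdf-extract-image")` is a hit, while the same package name on PyPI would not be, which is one way to read why a query can return thousands of npm results and none from an exact-name ecosystem.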

npm matches

Showing 24 of 40,896 · JavaScript

pdf-extract v1.0.11

Node PDF is a set of tools that takes in PDF files and converts them to usable formats for data processing. The library supports both extracting text from searchable pdf files as well as performing OCR on pdfs which are just scanned images of text

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 9 years ago.

pdf-extract-image v0.0.3

Extract image from pdf without binary dependency

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published over a year ago.

pdf-extract-text v0.1.3

A fast, native Node.js module to extract and process text from PDF files using Rust and N-API. Built with [Tokio](https://tokio.rs/), [`pdf-extract`](https://docs.rs/pdf-extract), and [`text-splitter`](https://crates.io/crates/text-splitter), this package

Maintenance: Aging

Popularity: Unknown

Aging. Last published over a year ago; check before adopting.

pdf-extract-api-client v1.0.2

TypeScript client for the PDF Extract API

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published over a year ago.

PDF extraction and rendering across all JavaScript runtimes

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

text-extract-api-client v1.0.2

TypeScript client for the PDF Extract API

Maintenance: Aging

Popularity: Unknown

Aging. Last published over a year ago; check before adopting.

pdfer-job-pusher v1.0.5

Push new pdf extract jobs out to workers

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published 13 years ago.

pdf-text-extract v1.5.0

Extract text from pdfs that contain searchable pdf text

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published 9 years ago.

mail-export v2.2.1

Parse .eml and .msg files or convert to pdf. Extract headers and attachments from .eml and msg files. Natively in typescript, support mjs & cjs!

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published over a year ago.

@cantoo/pdf-lib v2.7.1

Create and modify PDF files with JavaScript

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

pdf.js-extract v1.0.1

super-simple async PDF reader that extracts text with x,y page positions based on pdf.js

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

pdf-parse-fork v1.2.0

Pure javascript cross-platform module to extract text from PDFs.

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published 2 years ago.

mini-css-extract-plugin v2.10.2

extracts CSS into separate files

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@syncfusion/ej2-pdf-data-extract v33.2.10

This repository provides advanced support for data extraction from PDF documents

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@adobe/pdfservices-node-sdk v4.1.0

The Adobe PDF Services Node.js SDK provides APIs for creating, combining, exporting and manipulating PDFs.

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published over a year ago.

postcss-modules-extract-imports v3.1.0

A CSS Modules transform to extract local aliases for inline imports

Maintenance: Abandoned

Popularity: Top 1%

Abandoned. Last published 2 years ago.

@meistrari/document-sdk v1.9.0

SDK for the Document Processing API, with support for PDF extraction, templates, image conversion, and more.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

extract-zip v2.0.1

unzip a zip file into a directory using 100% javascript

Maintenance: Abandoned

Popularity: Top 1%

Abandoned. Last published 5 years ago.

pdf-parse v2.4.5

Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs. Run directly in your browser or in Node!

Maintenance: Aging

Popularity: Unknown

Aging. Last published 7 months ago; check before adopting.

PDF text extraction in TypeScript

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 2 years ago.

react-pdf v10.4.1

Display PDFs in your React app as easily as if they were images.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

mcp-server-docpipe v1.0.0

MCP server for document processing - PDF extract/merge/split, DOCX to Markdown, image resize/compress

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

ab-pdf-extract v1.0.1

Extract pages from a PDF into canvas elements on the client side

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published 9 years ago.

Create and modify PDF files with JavaScript

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published 4 years ago.

crates.io matches

Showing 12 of 44 · Rust

pdf-extract v0.10.0

A library to extract content from pdfs

Maintenance: Aging

Popularity: Widely used

Aging. Last published 8 months ago; check before adopting.

omniparse v0.4.0

A Rust toolkit for detecting and extracting metadata, text, and content from various file formats

Maintenance: Healthy

Popularity: Rising

Worth a look. Actively maintained and growing.

mailrs-attachment-extract v1.0.0

Extract text from email attachments (PDF + image OCR). PDF text via `pdf-extract` (pure Rust); OCR via the `tesseract` CLI subprocess (not linked as a C library). Two-stage fallback for scanned PDFs: try embedded text first, fall back to OCR on the raw bytes if the text is too short. Returns `ExtractionResult` with text + language + confidence + page count + JSON metadata.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

Self-contained web search MCP server. 9 backends with automatic fallback. Works from any IP.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

zotero-mcp v0.3.1

Local-first MCP server bridging Claude to your Zotero library — search, read, cite, enrich, write — over stdio or streamable-HTTP with OAuth 2.1.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

High-performance PDF text extraction library for vectorization pipelines

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

langchain-ai-rust v5.0.1

Build LLM applications in Rust with type safety: chains, agents, RAG, LangGraph, embeddings, vector stores, and 20+ document loaders. A LangChain port supporting OpenAI, Claude, Gemini, Mistral, Bedrock, Ollama, and more. Includes streaming, structured output, and multi-agent (Deep Agent) workflows.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

transmutation v0.3.2

High-performance document conversion engine for AI/LLM embeddings - 27 formats supported

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

rusty-page-indexer v0.5.5

A high-performance, reasoning-based RAG indexer in Rust following the PageIndex pattern.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

pdfsink-rs v0.2.8

Fast pure-Rust PDF extraction library and CLI — ~10-50x faster than pdfplumber for text, word, table, layout, image, and metadata extraction from PDFs. By Clark Labs Inc.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

TUI for webpage summarisation

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

havocompare v0.8.0

A flexible rule-based file and folder comparison tool and crate including nice html reporting. Compares CSVs, JSON, text files, pdf-texts and images.

Maintenance: Aging

Popularity: Niche

Aging. Last published 10 months ago; check before adopting.

RubyGems matches

Exact match · Ruby

pdf-extract v0.1.1

PDF content extraction tool and library.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 14 years ago.

pdf_extract v0.5.0

description yo

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 12 years ago.

pdf-extract-meta v0.1.1

A command line utility for extracting annotation and field metadata from a PDF in JSON format.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 7 years ago.

pdf-reader-extract-images v0.2.0

Extract all images with format conversions based upon Pdf::Reader library

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 4 years ago.

Grim is a simple gem for extracting a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 2 years ago.

Extract citations from PDFs.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 12 years ago.

Extract tables from PDF as a structured info. Uses ghostscript to print pdf to image, then recognizes table separators optically. No OpenCV or other heavy dependencies

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 3 years ago.

pdfbox_text_extraction v1.2.0

This gem lets you extract plain text from PDF documents. It is a Jruby wrapper for the Apache PDFBox library.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 9 years ago.

chupa-text-decomposer-pdf v1.1.1

This is a ChupaText decomposer plugin to extract text and meta-data from PDF. You can use `pdf` decomposer.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 7 years ago.

fillable-pdf v1.0.1

FillablePDF is an extremely simple and lightweight utility that bridges iText and Ruby in order to fill out fillable PDF forms or extract field values from previously filled out PDF forms.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

textractor v0.2.0

simple wrapper around CLI for extracting text from PDF and Word documents

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 14 years ago.

NuGet matches

Showing 12 of 1,075 · .NET

pdf-extract v1.0.1

No description provided.

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published 10 years ago.

aspose.pdf v26.5.0

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

ironpdf v2026.6.1

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

syncfusion.pdf.net.core v33.2.10

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

ironpdf.slim v2026.6.1

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

bitmiracle.docotic.pdf v9.9.19928

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

xdoc.pdf v12.6.1

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

spire.pdf v12.5.8

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

nreco.pdfgenerator v1.2.1

No description provided.

Maintenance: Abandoned

Popularity: Unknown

Abandoned. Last published 3 years ago.

itextsharp v5.5.13.5

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

ironpdf.linux v2026.6.1

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

aspose.pdf.drawing v26.5.0

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.