One search box and one honest, consistent read on every open-source library — across every ecosystem.

npm · PyPI · crates.io · RubyGems · Go · Maven · NuGet

© 2026 Modules · A precision instrument for picking dependencies. Data refreshed continuously from public registries, deps.dev & OSV

cross-ecosystem search · live

Results for pdf-extraction

Found in 4 of 7 ecosystems. npm: 1–24 of 29,807 · 359 matches across other registries

npm 29,807 · crates.io 43 · RubyGems 11 · NuGet 305

How we search: free-text search on npm, crates.io, RubyGems, NuGet and Maven. PyPI and Go support exact-name lookup only.
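The search behavior described above can be sketched as a small lookup table plus a summary helper. This is only an illustration under stated assumptions: the names `SearchMode`, `SEARCH_MODES`, `plan_search` and `summarize` are hypothetical and do not come from the site, and no real registry is queried.

```python
from enum import Enum


class SearchMode(Enum):
    FREE_TEXT = "free-text"
    EXACT_NAME = "exact-name"


# Hypothetical table mirroring the "How we search" note:
# five registries take free-text queries, two only resolve exact names.
SEARCH_MODES = {
    "npm": SearchMode.FREE_TEXT,
    "crates.io": SearchMode.FREE_TEXT,
    "RubyGems": SearchMode.FREE_TEXT,
    "NuGet": SearchMode.FREE_TEXT,
    "Maven": SearchMode.FREE_TEXT,
    "PyPI": SearchMode.EXACT_NAME,
    "Go": SearchMode.EXACT_NAME,
}


def plan_search(query: str) -> dict[str, tuple[SearchMode, str]]:
    """Pair every ecosystem with the mode it would be searched in."""
    return {eco: (mode, query) for eco, mode in SEARCH_MODES.items()}


def summarize(counts: dict[str, int], primary: str = "npm") -> str:
    """Build a header line from per-ecosystem match counts."""
    hits = {eco: n for eco, n in counts.items() if n > 0}
    others = sum(n for eco, n in hits.items() if eco != primary)
    return (
        f"Found in {len(hits)} of {len(SEARCH_MODES)} ecosystems; "
        f"{others:,} matches outside {primary}"
    )
```

Fed the counts shown on this page, `summarize({"npm": 29807, "crates.io": 43, "RubyGems": 11, "NuGet": 305})` returns "Found in 4 of 7 ecosystems; 359 matches outside npm", which agrees with the header (43 + 11 + 305 = 359).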

npm matches

Showing 24 of 29,807 · JavaScript

pdf-extraction v1.0.2

Pure javascript cross-platform module to extract text from PDFs.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 5 years ago.

PDF extraction and rendering across all JavaScript runtimes

Maintenance: Healthy

Popularity: Widely used

Solid choice. Well-adopted and actively maintained.

@tagspaces/tagspaces-pdf-extraction v4.8.4

TagSpaces pdf extraction module

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

pi-web-access v0.10.7

Web search, URL fetching, GitHub repo cloning, PDF extraction, YouTube video understanding, and local video analysis for Pi coding agent

Maintenance: Healthy

Popularity: Rising

Worth a look. Actively maintained and growing.

edgeparse v0.2.3

High-performance PDF extraction — Rust engine, Node.js interface

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

pdfexcavator v0.1.2

A powerful PDF extraction library for Node.js built on Mozilla's pdf.js - extract text, tables, and visual elements with precision

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

@d0paminedriven/pdfdown v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@okrapdf/cli v0.5.3

OkraPDF command-line interface for PDF extraction and document chat

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-win32-ia32-msvc v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-win32-arm64-msvc v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-linux-arm64-gnu v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-win32-x64-msvc v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-darwin-x64 v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-darwin-arm64 v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-android-arm64 v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-android-arm-eabi v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-linux-x64-gnu v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-freebsd-x64 v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-linux-arm-gnueabihf v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-linux-x64-musl v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-wasm32-wasi v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

@d0paminedriven/pdfdown-linux-arm64-musl v0.9.9

Rust powered PDF extraction for Node

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

mineru-parser v0.1.0

Parse MinerU PDF extraction JSON output into clean Markdown

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

mcp-ansc-server v1.0.3

MCP server for Moldova's National Agency for Solving Complaints (ANSC) — appeals, decisions, hearing schedule, and multi-modal PDF extraction.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

crates.io matches

Showing 12 of 43 · Rust

pdf-extract v0.10.0

A library to extract content from pdfs

Maintenance: Aging

Popularity: Widely used

Aging — last published 8 months ago — check before adopting.

omniparse v0.4.0

A Rust toolkit for detecting and extracting metadata, text, and content from various file formats

Maintenance: Healthy

Popularity: Rising

Worth a look. Actively maintained and growing.

mailrs-attachment-extract v1.0.0

Extract text from email attachments (PDF + image OCR). PDF text via `pdf-extract` (pure Rust); OCR via the `tesseract` CLI subprocess (not linked as a C library). Two-stage fallback for scanned PDFs: try embedded text first, fall back to OCR on the raw bytes if the text is too short. Returns `ExtractionResult` with text + language + confidence + page count + JSON metadata.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

Self-contained web search MCP server. 9 backends with automatic fallback. Works from any IP.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

zotero-mcp v0.3.1

Local-first MCP server bridging Claude to your Zotero library — search, read, cite, enrich, write — over stdio or streamable-HTTP with OAuth 2.1.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

High-performance PDF text extraction library for vectorization pipelines

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

langchain-ai-rust v5.0.1

Build LLM applications in Rust with type safety: chains, agents, RAG, LangGraph, embeddings, vector stores, and 20+ document loaders. A LangChain port supporting OpenAI, Claude, Gemini, Mistral, Bedrock, Ollama, and more. Includes streaming, structured output, and multi-agent (Deep Agent) workflows.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

transmutation v0.3.2

High-performance document conversion engine for AI/LLM embeddings - 27 formats supported

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

rusty-page-indexer v0.5.5

A high-performance, reasoning-based RAG indexer in Rust following the PageIndex pattern.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

pdfsink-rs v0.2.8

Fast pure-Rust PDF extraction library and CLI — ~10-50x faster than pdfplumber for text, word, table, layout, image, and metadata extraction from PDFs. By Clark Labs Inc.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

TUI for webpage summarisation

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

havocompare v0.8.0

A flexible rule-based file and folder comparison tool and crate including nice html reporting. Compares CSVs, JSON, text files, pdf-texts and images.

Maintenance: Aging

Popularity: Niche

Aging — last published 10 months ago — check before adopting.

RubyGems matches

Exact match · Ruby

Grim is a simple gem for extracting a page from a pdf and converting it to an image as well as extract the text from the page as a string. It basically gives you an easy to use api to ghostscript, imagemagick, and pdftotext specific to this use case.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 2 years ago.

Extract citations from PDFs.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 12 years ago.

pdf-extract v0.1.1

PDF content extraction tool and library.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 14 years ago.

Extract tables from PDF as a structured info. Uses ghostscript to print pdf to image, then recognizes table separators optically. No OpenCV or other heavy dependencies

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 3 years ago.

pdf_extract v0.5.0

description yo

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 12 years ago.

pdfbox_text_extraction v1.2.0

This gem lets you extract plain text from PDF documents. It is a Jruby wrapper for the Apache PDFBox library.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 9 years ago.

chupa-text-decomposer-pdf v1.1.1

This is a ChupaText decomposer plugin for to extract text and meta-data from PDF. You can use `pdf` decomposer.

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 7 years ago.

fillable-pdf v1.0.1

FillablePDF is an extremely simple and lightweight utility that bridges iText and Ruby in order to fill out fillable PDF forms or extract field values from previously filled out PDF forms.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

textractor v0.2.0

simple wrapper around CLI for extracting text from PDF and Word documents

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 14 years ago.

nameday_vvc_pdf_extractor v0.1.3

Nameday data extraction from Valsts valodas centrs PDF

Maintenance: Abandoned

Popularity: Niche

Abandoned. Last published 4 years ago.

kreuzberg v4.9.8

Kreuzberg is a high-performance document intelligence library with a Rust core and native Ruby bindings via Magnus. Extract text, metadata, and structured data from 75+ file formats including PDF, DOCX, PPTX, XLSX, HTML, RTF, images (with OCR), email, archives, and more. Features async/sync APIs, text chunking, language detection, and keyword extraction.

Maintenance: Healthy

Popularity: Niche

Maintained. Niche but actively maintained.

NuGet matches

Showing 12 of 305 · .NET

itextsharp v5.5.13.5

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

leadtools.pdf v23.0.0.6

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

cete.dynamicpdf.coresuite.net v12.44.0

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

aspose.words v26.5.0

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

itextsharp.xmlworker v5.5.13.5

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

apitron.pdf.rasterizer v3.1.49

No description provided.

Maintenance: Deprecated

Popularity: Unknown

Deprecated. Don't start a new project on this.

xdoc.pdf.ocr v12.3.5.36

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

edgepdf.viewer.ocr.blazor v12.3.5.36

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

edgepdf.viewer.ocr.aspnet.mvc v12.3.5.36

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.

edgepdf.viewer.ocr.aspnet v12.3.5.36

No description provided.

Maintenance: Healthy

Popularity: Unknown

Maintained. Actively maintained.