pdf_table_extractor — package search across npm, PyPI, crates.io, RubyGems, Go, Maven & NuGet


crates.io matches

Showing 12 of 16 · Rust


pdf2md v0.1.0

crates.io

PDF → Markdown extractor with figure rasterization, table & banner detection. Built on pdfium-render.

Maintained. Niche but actively maintained.

pdfox v0.1.0

crates.io

A pure-Rust PDF library — create, parse, and render PDF documents with zero C dependencies

Maintained. Niche but actively maintained.

project-rag v0.1.0

crates.io

RAG-based codebase indexing and semantic search - dual purpose library and MCP server

Maintained. Niche but actively maintained.

olga v0.1.3

crates.io

Four formats, one engine. PDF, DOCX, XLSX, HTML → Markdown and typed JSON. 15–40× faster than equivalent-quality OSS tools, with pipeline pre-flight and element-level provenance.

Maintained. Niche but actively maintained.

deformat v0.15.1

crates.io

Extract plain text from HTML, PDF, and other document formats

Maintained. Niche but actively maintained.

zeph-memory v0.21.3

crates.io

Semantic memory with SQLite and Qdrant for Zeph agent

Maintained. Niche but actively maintained.

corpus-preproc v0.1.0

crates.io

A preprocessor for text and HTML corpora

Abandoned. Last published 4 years ago.

kowalski-academic-agent v0.5.0

crates.io

Kowalski Academic Agent: A Rust-based agent for interacting with Ollama models

Aging — last published 11 months ago — check before adopting.

readex v0.19.2

crates.io

HTML main-content extraction (article body, title, metadata) — Rust ports of Mozilla Readability, Trafilatura, and htmldate.

Maintained. Niche but actively maintained.

omniparse v0.4.0

crates.io

A Rust toolkit for detecting and extracting metadata, text, and content from various file formats

Worth a look. Actively maintained and growing.

deepwiki-rs v1.5.0

crates.io

deepwiki-rs (also known as Litho) is a high-performance engine, written in Rust, that automatically generates C4 architecture documentation. It analyzes project structure, identifies core components, parses dependency relationships, and uses large language models (LLMs) to generate professional architecture documentation.

Maintained. Niche but actively maintained.

paperoni v0.6.1-alpha1

crates.io

A web article downloader

Abandoned. Last published 4 years ago.