A PDF to Text Extractor
Extract the text from pdf files
MCP server for loading and extracting text from PDF files with chunked pagination and interactive viewer
Extracts text from pdf with pdftotext (poppler)
Extract the text from pdf files and more utils
A lightweight, promise-style functional wrapper around pdf2json for extracting text from PDFs easily.
A lightweight, easy-to-use package to parse text from PDF files on the client side without any server dependency.
A pure JavaScript, cross-platform module designed for extracting text from PDF files.
extract text from pdf documents
Extract text from PDF page range (CJS)
Popplonode quickly extracts metadata and text from PDF files.
Parse text from PDF into CSV format.
A robust Node.js utility for extracting text from PDF, DOCX, DOC, XLSX, and TXT buffers.
MCP server for parsing text from a PDF URL
AI-powered PDF document MCP server for agents. Supports extracting text from PDFs, converting PDFs to Markdown, and merging PDFs. By MEOK AI Labs.
Extract plain text from pdf, docx, xlsx, ...
n8n node for extracting text from PDF pages along with images
Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs. Run directly in your browser or in Node!
Display PDFs in your React app as easily as if they were images.
PDF extraction and rendering across all JavaScript runtimes
Define uninitialized elements
An advanced text layout framework
Grim is a simple gem for extracting a page from a PDF and converting it to an image, as well as extracting the text from the page as a string. It gives you an easy-to-use API to Ghostscript, ImageMagick, and pdftotext specific to this use case.
Scrape text from common file formats (.pdf, .doc, .docx, .sketch, .txt) with a single convenient command.
Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)
This gem lets you extract plain text from PDF documents. It is a JRuby wrapper for the Apache PDFBox library.
Scrape text from a PDF file.
This is a ChupaText decomposer plugin to extract text and metadata from PDF. You can use the `pdf` decomposer.
Simple wrapper around a CLI for extracting text from PDF and Word documents
Build a CV PDF from a YAML text file.
Kreuzberg is a high-performance document intelligence library with a Rust core and native Ruby bindings via Magnus. Extract text, metadata, and structured data from 75+ file formats including PDF, DOCX, PPTX, XLSX, HTML, RTF, images (with OCR), email, archives, and more. Features async/sync APIs, text chunking, language detection, and keyword extraction.
Read text and metadata from files and documents using Apache Tika toolkit
Reads a text file (if supplied as the first argument) and creates a PDF file with the same name but with a .pdf extension in the current directory, using pdflatex (the only requirement besides Ruby itself). If `-h` is the first argument, the program displays the help text and exits. The program can also read the input text from STDIN (standard input), in which case it creates the PDF file in the user's home directory. In this mode no argument is given and the text is piped directly into the program, like this: `$ echo 'Hello' | txt2pdf`. This creates a PDF file containing only 'Hello' and the page number at the bottom of the page. With this, you could map a key binding in your window manager to create a PDF file from the text you selected in any program, be it the terminal, your browser, or your text editor. In my WM of choice, i3, I have added the following to my i3 config: `bindsym $mod+p exec xclip -o | txt2pdf`. This creates a PDF file from the selected text when I press the 'Window Button' and 'p'.
No description provided.