**Lightning-fast text extraction for Office documents — built with pure native JavaScript.**
A robust, strictly-typed Node.js and Browser library for parsing office files (.docx, .pptx, .xlsx, .odt, .odp, .ods, .pdf, .rtf, .csv, .md, .html) and generating high-fidelity outputs in Markdown, HTML, CSV, RTF, and RAG-focused chunks.
Yet another library to extract text from MS Office and PDF files
Full-text PDF, DOCX, PPTX, XLSX search for static sites — Apache Solr for client-side apps, without Solr.
A robust, strictly-typed Node.js and Browser library for parsing office files (.docx, .pptx, .xlsx, .xls, .csv, .odt, .odp, .ods, .pdf, .rtf) into structured AST with rich metadata, formatting, and attachment support.
Converts most common file types into clean text or Markdown
Enhanced n8n document converter with flexible sheet processing. Converts DOCX, XML, YML, XLS, XLSX, CSV, PDF, TXT, PPT, PPTX, HTML, JSON, ODT, ODP, ODS to JSON/text. Features individual sheet workflow items, toggleable metadata, Excel row/column preservat
ECMAScript AST recursive visitor
Optional Office format renderers (XLSX, PPTX/ODP slide decks, legacy Office, cloud viewer).
deterministic `JSON.stringify()` - a faster version of substack's json-stable-stringify without jsonify
Array#isArray for older browsers
Light ECMAScript (JavaScript) Value Notation - human written, concise, typed, flexible
Buffers events from a stream until you are ready to handle them.
Get the command from a shebang
CLI arguments parser. Native port of python's argparse.
## Introduction
Resolve the path of a module like `require.resolve()` but from a given path
JSON parse & stringify that supports binary via bops & base64
Run a function exactly one time
Callback wrapping utility
Fast deep equal
Allows users to use generators in order to write common functions that can be either sync or async.
Astro integration for @icjia/pdf-search-index — adds linked PDFs as first-class search rows.
deterministic JSON.stringify() with custom sorting to get deterministic hashes from stringified results, with no public domain dependencies
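
Several entries above describe a deterministic `JSON.stringify()`. As a rough illustration of the idea only, the sketch below sorts object keys before serializing so that two objects with the same content produce the same string. The name `stableStringify` and its optional `compare` parameter are hypothetical and are not the API of any package listed here; the sketch also assumes acyclic input and does no cycle detection.

```javascript
// Hypothetical helper for illustration; not taken from any listed package.
// Serializes a value with object keys in sorted order so the output is
// independent of key insertion order. Assumes the input has no cycles.
function stableStringify(value, compare) {
  // Primitives (and null) serialize exactly as JSON.stringify would.
  // Note: JSON.stringify returns undefined for undefined and functions.
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  // Respect toJSON (for example on Date), as JSON.stringify does.
  if (typeof value.toJSON === "function") {
    return stableStringify(value.toJSON(), compare);
  }
  if (Array.isArray(value)) {
    // Array order is meaningful, so it is preserved; unserializable
    // items become null, matching JSON.stringify.
    const items = value.map((item) => stableStringify(item, compare) ?? "null");
    return "[" + items.join(",") + "]";
  }
  // Sort keys (default: code unit order, or a caller-supplied comparator).
  const keys = Object.keys(value).sort(compare);
  const parts = [];
  for (const key of keys) {
    const serialized = stableStringify(value[key], compare);
    // Skip properties whose values cannot be serialized.
    if (serialized !== undefined) {
      parts.push(JSON.stringify(key) + ":" + serialized);
    }
  }
  return "{" + parts.join(",") + "}";
}

const first = stableStringify({ b: 1, a: [2, { d: 4, c: 3 }] });
const second = stableStringify({ a: [2, { c: 3, d: 4 }], b: 1 });
console.log(first === second); // same content, same string
```

Passing a comparator, for example `(x, y) => (x < y ? 1 : -1)`, reverses the key order, which is one way a "custom sorting" option could be exposed.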