A modular, open-source library for converting HTML content into professional document formats. Initially focused on HTML-to-DOCX conversion, with planned support for PDF and XLSX. Built with TypeScript, it features a core HTML parsing engine and separate
PDF adapter for html-to-document-core — converts a DocumentElement tree into a .pdf file using the html2pdf.js library.
DOCX adapter for html-to-document-core — converts a DocumentElement tree into a .docx Buffer using the docx library.
PDF deconverter for html-to-document-core — converts PDF files to DocumentElement[] using pdf-parse.
CSS parser plugin for html-to-document-core that harvests <style> tags and appends parsed statements to the per-parse stylesheet.
Core engine that parses HTML into an intermediate DocumentElement tree and exposes a plugin registry so external adapters can convert that tree into DOCX, PDF, XLSX, Markdown and more.
Convert Word documents from docx to simple HTML and Markdown
Language service for HTML
HTML template literals in JavaScript
HTML language support for the CodeMirror code editor
Advanced html to plain text converter
A JavaScript implementation of many web standards
hast utility to parse from HTML
Inlines img, script and link tags into the same file.
This package contains HTML helpers and functionality for Lexical.
A standalone version of the readability library used for Firefox Reader View.
rehype plugin to parse HTML
hast utility that turns HTML into a syntax tree (while being small in browsers)
A minimal DOM implementation
Format for representing rich text documents and changes.
A DOM implementation based on triple-linked lists
A CSS selector engine.
Fast implementation of JSON-Patch (RFC-6902) with duplex (observe changes) capabilities
A gem to convert PDF, Word, Excel, and image files to rich HTML format compatible with Trix editor
RDoc produces HTML and command-line documentation for Ruby projects. RDoc includes the `rdoc` and `ri` tools for generating and displaying documentation from the command line.
Convert HTML document to Microsoft Word document. This gem is in active development.
Inject content into an existing HTML document.
Loofah is a general library for manipulating and transforming HTML/XML documents and fragments, built on top of Nokogiri. Loofah also includes some HTML sanitizers based on `html5lib`'s safelist, which are a specific application of the general transformation functionality.
Nokogiri::Diff adds the ability to calculate the differences (added or removed nodes) between two XML/HTML documents.
isodoc converts documents in the IsoDoc document model into Microsoft Word and HTML. This gem is in active development.
Uses command-line pdftohtml tools to convert PDF files to HTML.
Generate and manipulate HTML documents and nodes.
Email services like Gmail and Hotmail don't like stylesheets. The only way around it is to use inline tags. Replacing stylesheet references with inline tags is a pain in the arse. Use this tool to do the dirty work for you.
YARD plugin to build a sitemap.xml for generated HTML documentation.
Generates an HTML calendar from a Polyrex document
No description provided.