HTML to DOM parser.
Prototyped HTML DOM Parser and compiler
Fast & forgiving HTML/XML parser
A very fast HTML parser, generating a simplified DOM, with basic element query support.
Server-side HTML DOM parser
An evented streaming XML parser in JavaScript
Fast parser for HTML strings, with many options.
Utilities for working with htmlparser2's DOM
An evented streaming XML parser in JavaScript
Handler for htmlparser2 that turns pages into a DOM
JSSoup: the JavaScript HTML DOM parser for node.js
Quickly parses well-formed HTML (meaning all tags closed) into an AST and back.
This package contains HTML helpers and functionality for Lexical.
A very fast HTML parser, generating a simplified DOM, with basic element query support.
Render domhandler DOM nodes to a string
Validate XML, Parse XML, Build XML without C/C++ based libraries
Simple HTML parser utility
HTML to React parser.
Scanner and parser for JSON with comments.
Fast DOM parser based on regexps
A javascript library to extract Exif metadata from images, in node and in the browser.
Streaming HTML parser with scripting support.
lezer-based HTML grammar
Apricot is an HTML/DOM parser and scraper for Node.js. It is inspired by Ruby's Hpricot and designed to fetch, iterate over, and augment HTML or HTML fragments.
This gem has three main functions: reading HTML, searching data, and rebuilding HTML.
Scrapetor is a Ruby HTML parsing + scraping toolkit. The parser is a native C arena DOM with structural indexes built at parse time and NEON SIMD scanners in the SAX hot loop. A streaming extraction engine compiles the schema DSL into a single forward pass — no DOM materialised, one Ruby boundary crossing per document. On builds where libcurl is available, Scrapetor::Fetcher adds an HTTP/2-capable fetch layer with per-thread connection cache, shared DNS + TLS session pool, in-process gzip / deflate / brotli / zstd decoding, iconv charset transcoding, retry + exponential backoff, ETag / Last-Modified disk cache with bulk revalidation, per-host throttle, cookie jar, basic + bearer auth, proxy, and three bulk concurrency models (parallel_fetch / multi_fetch / streaming multi_each). Scrapetor::Session ties the cookie / auth / throttle / retry policies together. Also ships robots.txt + sitemap.xml parsers, a bounded-memory streaming HTML parser, and structured-data extractors (JSON-LD, OpenGraph, Schema.org, Microdata, RDFa, Twitter Cards). The Net::HTTP-based Scrapetor.fetch is preserved as the no-libcurl fallback.
No description provided.