This library converts HTML to JSON, or JSON to an HTML tree.
HTML to JSON parser
HTML to JSON parser
Simple HTML to JSON parser that uses regular expressions and String.indexOf
HTML to JSON parser
HTML to JSON parser
Parse, Resolve, and Dereference JSON Schema $ref pointers
HTML to JSON parser for easyplate
Validate XML, Parse XML, Build XML without C/C++ based libraries
lezer-based JSON grammar
Scanner and parser for JSON with comments.
Parse, Resolve, and Dereference JSON Schema $ref pointers
A very fast HTML parser, generating a simplified DOM, with basic element query support.
An evented streaming XML parser in JavaScript
HTML to React parser.
HTML to DOM parser.
HTML to JSON parser
lezer-based HTML grammar
A micro-library of stream components for building custom JSON and JSONC processing pipelines with a minimal memory footprint — parse, filter, and transform JSON far larger than available memory with a SAX-inspired token API, on Node.js or Web Streams.
HTML to JSON parser that works with weapp rich-text
Parser for @html-eslint/eslint-plugin
Node.js body parsing middleware
HTML to JSON parser
Fast & forgiving HTML/XML parser
feedparser - web feed parser and normalizer (RSS, Atom, JSON Feed, HTML h-entry, etc.)
`article_json` is a format definition for news articles and a Ruby gem that converts between formats. It includes a parser for Google Doc HTML exports and converters to simple HTML, AMP, Apple News Format (ANF), plain text, and JSON.
Scrapetor is a Ruby HTML parsing + scraping toolkit. The parser is a native C arena DOM with structural indexes built at parse time and NEON SIMD scanners in the SAX hot loop. A streaming extraction engine compiles the schema DSL into a single forward pass — no DOM materialised, one Ruby boundary crossing per document. On builds where libcurl is available, Scrapetor::Fetcher adds an HTTP/2-capable fetch layer with per-thread connection cache, shared DNS + TLS session pool, in-process gzip / deflate / brotli / zstd decoding, iconv charset transcoding, retry + exponential backoff, ETag / Last-Modified disk cache with bulk revalidation, per-host throttle, cookie jar, basic + bearer auth, proxy, and three bulk concurrency models (parallel_fetch / multi_fetch / streaming multi_each). Scrapetor::Session ties the cookie / auth / throttle / retry policies together. Also ships robots.txt + sitemap.xml parsers, a bounded-memory streaming HTML parser, and structured-data extractors (JSON-LD, OpenGraph, Schema.org, Microdata, RDFa, Twitter Cards). The Net::HTTP-based Scrapetor.fetch is preserved as the no-libcurl fallback.