Declarative HTML parser, an extension of jsonframe-cheerio
lezer-based JSON grammar
Parse, Resolve, and Dereference JSON Schema $ref pointers
Validate, parse, and build XML without C/C++-based libraries
Scanner and parser for JSON with comments.
A very fast HTML parser, generating a simplified DOM, with basic element query support.
HTML to React parser.
Parser for @html-eslint/eslint-plugin
HTML to DOM parser.
An evented streaming XML parser in JavaScript
Streaming HTML parser with scripting support.
A micro-library of stream components for building custom JSON and JSONC processing pipelines with a minimal memory footprint — parse, filter, and transform JSON far larger than available memory with a SAX-inspired token API, on Node.js or Web Streams.
Parse incomplete JSON text on a best-effort basis
lezer-based HTML grammar
Fast & forgiving HTML/XML parser
Node.js body parsing middleware
Parse HTML/XML to PostHTMLTree
Parser adapter for parsing JSON documents into OpenAPI 2.0 namespace.
Parser adapter for parsing JSON documents into Arazzo 1.x.y namespace.
Parser adapter for parsing JSON documents into OpenAPI 3.0.x namespace.
Parser adapter for parsing JSON documents into API Design Systems namespace.
Parser adapter for parsing JSON documents into AsyncAPI 2.x.y namespace.
HTML parser and serializer.
feedparser - web feed parser and normalizer (RSS, Atom, JSON Feed, HTML h-entry, etc.)
`article_json` is a format definition for news articles and a Ruby gem that offers conversions from and to different formats:
- Parser for Google Doc HTML exports
- Converter to:
  - simple HTML format
  - AMP format
  - Apple News Format (ANF)
  - plain text
  - JSON
Scrapetor is a Ruby HTML parsing + scraping toolkit. The parser is a native C arena DOM with structural indexes built at parse time and NEON SIMD scanners in the SAX hot loop. A streaming extraction engine compiles the schema DSL into a single forward pass — no DOM materialised, one Ruby boundary crossing per document. On builds where libcurl is available, Scrapetor::Fetcher adds an HTTP/2-capable fetch layer with per-thread connection cache, shared DNS + TLS session pool, in-process gzip / deflate / brotli / zstd decoding, iconv charset transcoding, retry + exponential backoff, ETag / Last-Modified disk cache with bulk revalidation, per-host throttle, cookie jar, basic + bearer auth, proxy, and three bulk concurrency models (parallel_fetch / multi_fetch / streaming multi_each). Scrapetor::Session ties the cookie / auth / throttle / retry policies together. Also ships robots.txt + sitemap.xml parsers, a bounded-memory streaming HTML parser, and structured-data extractors (JSON-LD, OpenGraph, Schema.org, Microdata, RDFa, Twitter Cards). The Net::HTTP-based Scrapetor.fetch is preserved as the no-libcurl fallback.