Async and sync crawler for JSON objects
Agnostic tree traversal library.
A specification compliant robots.txt parser with wildcard (*) matching support.
Official Firecrawl nodes for n8n - scrape, crawl, map, search, and extract data from websites. Supports AI Agent tool usage.
The fastest directory crawler & globbing alternative to glob, fast-glob, & tiny-glob. Crawls 1m files in < 1s
Schema-driven relayfile adapter generator and runtime
CLI tool for Genspark Tool API - search, crawl, analyze images, generate media
A community node for n8n to integrate Tavily API for web search and content extraction.
SEO analytics and indexing diagnostics module with Google Search Console integration and AI-ready reports
Fast, token-efficient web content extraction - fetch web pages and convert to clean Markdown
JavaScript SDK for Firecrawl API
Agent-first open-source AEO operating platform - track how answer engines cite your domain
Mdream Crawl generates comprehensive llms.txt artifacts from a single URL, using mdream to convert HTML to Markdown.
JavaScript SDK for Firecrawl API
n8n nodes for Crawl4AI v0.8.5 web crawler and data extraction with enhanced features
Crawls and extracts data from the Mauve API.
Pi extension that exposes Firecrawl web scraping and crawling tools.
Parse JSON with more helpful errors
MCP server for advanced web search using Tavily
Accessibility utilities for PatternFly.
Val: a 100% MCP QA agent for vibecoders. Drives a real browser to catch UX bugs (broken links, 404s, console errors, broken images) so your coding agent can fix them.
Headless Playwright static publisher for WordPress/Elementor sites with sitemap-only page discovery, strict asset capture, escaped URL rewrite, structured logs, and targeted retry modes.
Allow parsing of the U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR in JS strings
Escape U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR in JS strings
Create an instance (i.e. a JSON representation of site crawling rules), set Server Node addresses with callback URLs, and choose a schedule time.
This gem crawls the latest CircleCI artifact file you specify. For example, you can get the result JSON of simplecov.gem.
Nous crawls same-host web pages, extracts readable content, and serializes clean Markdown as text or JSON.
Crawl a site and check various health indicators, such as HTTP 4XX/5XX statuses, valid HTML/XML/JSON, missing image alt attributes, and missing HTML title or description.
SidekiqStatusMonitor adds an HTTP server to the Sidekiq instance. It can be used for Kubernetes livenessProbe and readinessProbe checks, and for other liveness checks, since the server returns 200/500 status codes. It also provides an HTTP JSON interface for crawling metrics.
No description provided.