Results for the-crawl

Abandoned. Last published 7 years ago.

Crawler for the-frameworks

json-crawlv0.5.3

Abandoned. Last published 2 years ago.

Async and sync crawler for JSON objects

tree-crawlv1.2.2

Abandoned. Last published over a year ago.

Agnostic tree traversal library.

fdirv6.5.0

Aging — last published 10 months ago — check before adopting.

The fastest directory crawler & globbing alternative to glob, fast-glob, & tiny-glob. Crawls 1m files in < 1s

robots-parserv3.0.1

Abandoned. Last published 3 years ago.

A specification compliant robots.txt parser with wildcard (*) matching support.

@mendable/n8n-nodes-firecrawlv2.1.2

Official Firecrawl nodes for n8n - scrape, crawl, map, search, and extract data from websites. Supports AI Agent tool usage.

@mendable/firecrawl-jsv4.25.2

JavaScript SDK for Firecrawl API

@tavily/n8n-nodes-tavilyv0.5.1

A community node for n8n to integrate Tavily API for web search and content extraction.

firecrawlv4.25.2

JavaScript SDK for Firecrawl API

@relayfile/adapter-corev0.3.36

Schema-driven relayfile adapter generator and runtime

@mdream/crawlv1.3.0

Mdream Crawl generates comprehensive llms.txt artifacts from a single URL, using mdream to convert HTML to Markdown.

@genspark/cliv1.0.23

CLI tool for Genspark Tool API - search, crawl, analyze images, generate media

cypress-mapv1.56.0

Extra Cypress query commands for v12+

recrawlv2.2.1

Abandoned. Last published 5 years ago.

cypress-data-sessionv3.0.0

Cypress command for flexible test data setup

@crawlee/typesv3.17.0

Shared types for the crawlee projects

n8n-nodes-firecrawlv0.3.0

Abandoned. Last published over a year ago.

FireCrawl nodes for n8n

@ainyc/canonryv4.71.0

Agent-first open-source AEO operating platform - track how answer engines cite your domain

@crawlee/utilsv3.17.0

A set of shared utilities that can be used by crawlers

@xapp/arachne-cliv1.17.2

No description provided.

@crawlee/playwrightv3.17.0

The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

@crawlee/memory-storagev3.17.0

A simple in-memory storage implementation of the Apify API

@crawlee/browser-poolv3.17.0

Rotate multiple browsers using popular automation libraries such as Playwright or Puppeteer.

@crawlee/puppeteerv3.17.0

The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

RubyGems matches

Exact match · Ruby

proxycrawlv1.0.2

Abandoned. Last published 2 years ago.

Ruby-based client for the ProxyCrawl API that helps developers crawl or scrape thousands of web pages anonymously

twittercrawlerv0.0.12

Crawls Twitter

Abandoned. Last published 8 years ago.

Crawl websites

Abandoned. Last published 15 years ago.

Crawling framework

Abandoned. Last published 7 years ago.

aranhav0.20.0

Aging — last published 12 months ago — check before adopting.

Ruby utilities for web crawling.

fassbinderv0.0.15

Abandoned. Last published 15 years ago.

Fassbinder crawls book offers on Amazon.

spiekerv0.0.10

Abandoned. Last published 12 years ago.

Easily crawl a website

linkedincrawlerv0.0.20

Abandoned. Last published 9 years ago.

Crawls public LinkedIn profiles via Google

indeedcrawlerv0.0.5

Abandoned. Last published 9 years ago.

Crawls Indeed resumes

simplecrawlerv0.1.8

Abandoned. Last published 14 years ago.

The SimpleCrawler module is a library for crawling web sites. The crawler provides comprehensive data from the page crawled which can be used for page analysis, indexing, accessibility checks etc. Restrictions can be specified to limit crawling of binary files.

wayfarerv0.4.10