A web crawler and scraper, building blocks for data curation workloads.
gottem CLI: universal scraper that reliably gets the data.
dyer is designed for reliable, flexible, and fast request-response based services, including data processing and web crawling, with friendly, comprehensive features that do not compromise speed.
A concurrent, asynchronous web scraping framework.
Blazing fast pure Rust Spatial Bench data generation library.
Rust port of trafilatura, a web content extraction library
Fetch web pages and convert to clean Markdown for LLM pipelines
Pure Rust Computational Photonics & Optical Simulation Framework
Texting Robots: A Rust native `robots.txt` parser with thorough unit testing.
A Mock API for every need and more
Detect if a user-agent is a known bot