supercrawler
On the site since December 16, 2022 12:31
Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
Supercrawler is a Node.js web crawler. It is designed to be highly configurable and easy to use.
When Supercrawler successfully crawls a page (which could be an image, a text document or any other file), it will fire your custom content-type handlers. Define your own custom handlers to parse pages, save data and do anything else you need.
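To make the idea of content-type handlers concrete, here is a minimal, dependency-free sketch of how a crawler might dispatch a fetched response to the handlers registered for its content type. It is an illustration only, not Supercrawler's implementation or API: the `HandlerRegistry` class, its `addHandler` and `fire` methods, and the `url`, `contentType` and `body` fields of the context object are all hypothetical names chosen for this example.

```javascript
// Hypothetical registry illustrating content-type handler dispatch.
// NOT Supercrawler's API: every name here is an assumption for illustration.
class HandlerRegistry {
  constructor() {
    this.handlers = []; // entries of the form { contentType, fn }
  }

  // Register fn for one content type. Pass only a function to match
  // every response regardless of type.
  addHandler(contentType, fn) {
    if (typeof contentType === "function") {
      fn = contentType;
      contentType = null;
    }
    this.handlers.push({ contentType, fn });
  }

  // Call each matching handler in registration order and collect the results.
  fire(context) {
    // Drop parameters such as "; charset=utf-8" before comparing types.
    const type = context.contentType.split(";")[0].trim().toLowerCase();
    return this.handlers
      .filter((h) => h.contentType === null || h.contentType === type)
      .map((h) => h.fn(context));
  }
}

const registry = new HandlerRegistry();
registry.addHandler("text/html", (ctx) => `html:${ctx.url}`);
registry.addHandler((ctx) => `any:${ctx.body.length}`);

// Simulate a successfully crawled HTML page.
const results = registry.fire({
  url: "http://example.com/",
  contentType: "text/html; charset=utf-8",
  body: "<html></html>",
});
console.log(results);
```

In this sketch the HTML handler and the catch-all handler both run for the simulated page, while a response of another type (an image, say) would reach only the catch-all handler. A real crawler would also wait for asynchronous handlers to finish and decide how to treat handler errors, both of which are left out here.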