http://www.screamingfrog.co.uk/seo-spider/
Screaming Frog SEO Spider is a tool that can have a huge impact on the stability and performance of a Hypernode.
You can easily whitelist the user agent and remote IP on your own Hypernode using: https://support.hypernode.com/knowledgebase/resolving-429-many-requests/#Ratelimitting_against_bots_and_crawlers
-
Jen commented
XML sitemap creator: https://gringomarketing.com/
-
Hans Kuijpers commented
The SEO Spider Tool Crawls & Reports On The Following
A quick summary of some of the data collected in a crawl includes:
Errors – Client errors such as broken links and server errors (no responses, 4XX, 5XX).
Redirects – Permanent or temporary redirects (3XX responses).
External Links – All external links and their status codes.
Protocol – Whether the URLs are secure (HTTPS) or insecure (HTTP).
URI Issues – Non ASCII characters, underscores, uppercase characters, parameters, or long URLs.
Duplicate Pages – Hash value / MD5 checksums algorithmic check for exact duplicate pages.
Page Titles – Missing, duplicate, over 65 characters, short, pixel width truncation, same as h1, or multiple.
Meta Description – Missing, duplicate, over 156 characters, short, pixel width truncation or multiple.
Meta Keywords – Mainly for reference, as they are not used by Google, Bing or Yahoo.
File Size – Size of URLs & images.
Response Time.
Last-Modified Header.
Page Depth Level.
Word Count.
H1 – Missing, duplicate, over 70 characters, multiple.
H2 – Missing, duplicate, over 70 characters, multiple.
Meta Robots – Index, noindex, follow, nofollow, noarchive, nosnippet, noodp, noydir etc.
Meta Refresh – Including target page and time delay.
Canonical link element & canonical HTTP headers.
X-Robots-Tag.
rel=“next” and rel=“prev”.
AJAX – The SEO Spider obeys Google’s AJAX Crawling Scheme.
Inlinks – All pages linking to a URI.
Outlinks – All pages a URI links out to.
Anchor Text – All link text. Alt text from images with links.
Follow & Nofollow – At page and link level (true/false).
Images – All URIs with the image link & all images from a given page. Images over 100 KB, missing alt text, alt text over 100 characters.
User-Agent Switcher – Crawl as Googlebot, Bingbot, Yahoo! Slurp, mobile user-agents or your own custom UA.
Redirect Chains – Discover redirect chains and loops.
Custom Source Code Search – The SEO Spider allows you to find anything you want in the source code of a website! Whether that’s Google Analytics code, specific text, or code etc. (Please note – This is not a data extraction or scraping feature yet.)
XML Sitemap Generator – You can create an XML sitemap and an image sitemap using the SEO spider.
The Screaming Frog SEO Spider is an SEO auditing tool, built by real SEOs with thousands of users worldwide.
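To illustrate the "Duplicate Pages" item in the list above, here is a minimal Python sketch of how an MD5 checksum can flag exact duplicate pages. It is not Screaming Frog's actual implementation. The function name `find_exact_duplicates`, the `pages` dictionary and the example.com URLs are all made up for the example, and it assumes the page bodies have already been fetched as bytes.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(pages):
    """Group URLs whose response bodies are byte-for-byte identical.

    `pages` maps a URL to its raw response body (bytes). This is a
    hypothetical helper for illustration only.
    """
    by_digest = defaultdict(list)
    for url, body in pages.items():
        # Identical bodies always produce the same MD5 digest.
        digest = hashlib.md5(body).hexdigest()
        by_digest[digest].append(url)
    # Keep only digests shared by more than one URL.
    return [sorted(urls) for urls in by_digest.values() if len(urls) > 1]


# Hypothetical crawl results; the URLs and bodies are invented.
pages = {
    "https://example.com/": b"<html><body>Home</body></html>",
    "https://example.com/index.html": b"<html><body>Home</body></html>",
    "https://example.com/about": b"<html><body>About</body></html>",
}
print(find_exact_duplicates(pages))
```

A check like this only catches pages that are exactly identical. A single differing byte, such as a timestamp in the footer, gives a different checksum.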