Screaming Frog SEO Spider supports several kinds of crawl, each suited to a different type of website analysis. Some are dedicated modes (such as Spider and List), while others are configuration options applied on top of a crawl. Here’s an overview of the main crawl types and related features in Screaming Frog:
1. Standard Crawl (Spider Mode, the Default)
- Description: This is the default mode. Screaming Frog starts from the URL you enter and follows links to discover the pages of the site, checking both internal and external URLs along the way.
- Use Case: Best for general website audits, identifying broken links, on-page SEO issues, duplicate content, and gathering basic metadata like titles and descriptions.
- Crawls: HTML, CSS, JavaScript, images, PDFs, and other linked content.
2. List Mode
- Description: In this mode, you provide a list of specific URLs to crawl, either from a file or pasted directly into the tool. Screaming Frog crawls only those URLs.
- Use Case: Useful when you already know which URLs matter, such as auditing a batch of landing pages, re-checking specific URLs over time, or comparing pages before and after content changes.
- Crawls: Only the URLs on the list; by default it does not follow links to new pages.
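A URL list for List mode is easier to work with when it is deduplicated and free of malformed entries. The following is a minimal sketch of that clean-up step; the function name `prepare_url_list` and the example URLs are illustrative assumptions, not part of Screaming Frog.

```python
from urllib.parse import urlsplit


def prepare_url_list(raw_urls):
    """Return unique, absolute http(s) URLs in their original order."""
    seen = set()
    cleaned = []
    for raw in raw_urls:
        url = raw.strip()
        parts = urlsplit(url)
        # Skip blanks and anything that is not an absolute http(s) URL.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


urls = prepare_url_list([
    "https://www.example.com/landing-a",
    "  https://www.example.com/landing-b  ",
    "https://www.example.com/landing-a",  # duplicate, dropped
    "not a url",                          # malformed, dropped
])

# One URL per line, ready to paste into the tool or save as a text file.
url_list_text = "\n".join(urls)
```

The resulting text can be pasted into the tool or saved to a file and uploaded.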
3. JavaScript Rendering (Crawling Dynamic Content)
- Description: With JavaScript rendering enabled in the Spider configuration, Screaming Frog renders each page in a headless Chromium-based browser before analyzing it, so content generated in the browser is included. This covers AJAX-loaded content and single-page applications (SPAs).
- Use Case: Use this when a website relies heavily on JavaScript for rendering key content or links, such as React, Angular, or Vue.js-based sites.
- Crawls: Fully rendered HTML after the JavaScript execution, including dynamically loaded content.
4. Spider Mode with Crawl Limits
- Description: Spider mode can be constrained with limits on how far and how deep the crawl goes. You can restrict the crawl by depth, total number of URLs, subdomains, protocols (HTTP vs HTTPS), or resource types (HTML, images, PDFs, etc.).
- Use Case: Use this for focused crawling, such as only analyzing a specific directory, or stopping the crawl after a certain number of pages.
- Crawls: Defined by user restrictions, focusing only on certain site areas or content types.
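One common way to focus a crawl on a single directory is an include pattern written as a regular expression. The sketch below tests such a pattern against sample URLs before it is pasted into the tool. The pattern, the URLs, and the assumption that the pattern must match the whole URL are all illustrative; Screaming Frog evaluates patterns with its own regex engine, which may differ from Python's in edge cases.

```python
import re

# Hypothetical include pattern: only URLs under /blog/ on one host.
INCLUDE_PATTERN = r"https://www\.example\.com/blog/.*"


def is_included(url, pattern=INCLUDE_PATTERN):
    """Return True if the whole URL matches the include pattern."""
    return re.fullmatch(pattern, url) is not None


candidates = [
    "https://www.example.com/blog/post-1",
    "https://www.example.com/shop/item-9",
    "https://www.example.com/blog/",
]

# URLs that would stay in scope under this pattern.
in_scope = [u for u in candidates if is_included(u)]
```

Checking a pattern this way helps catch unescaped dots or a missing trailing wildcard before starting a long crawl.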
5. XML Sitemap Crawl
- Description: Screaming Frog can crawl the URLs listed in an XML sitemap, either by loading the sitemap in List mode or by enabling sitemap crawling in the Spider configuration. This lets you check that the pages included in the sitemap are valid and accessible.
- Use Case: Best for checking the health of pages listed in your XML sitemap, for example finding sitemap URLs that return errors or redirects. It also helps confirm that the sitemap reflects the actual site structure and reveals orphaned pages that appear in the sitemap but are not linked from the site.
- Crawls: URLs taken from the XML sitemap file or sitemap URL you provide.
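To see what a sitemap crawl starts from, it can help to look at how URLs are stored in the file. The sketch below extracts the `<loc>` entries from a small sitemap using Python's standard library; the sample sitemap and the function name `sitemap_urls` are assumptions for illustration, and sitemap index files are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> values of every <url> entry in a sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>"""

listed_urls = sitemap_urls(sample_sitemap)
```

Each extracted URL is what the crawler would request and check for status, indexability, and on-page issues.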
6. Crawl with Custom Extraction (XPath, CSS Path, or Regex)
- Description: Custom extraction is a configuration feature rather than a separate mode. It pulls specific pieces of data from each page using XPath, CSS Path, or regex expressions while the crawl runs.
- Use Case: Ideal for pulling out custom data like product pricing, SKU numbers, or other non-SEO data from e-commerce sites or content-rich pages.
- Crawls: HTML as in a normal crawl, with the targeted elements extracted into additional columns.
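The kind of expression used for custom extraction can be tried out on a small snippet first. This sketch uses the limited XPath subset in Python's standard library on a well-formed sample page; the markup and class names are hypothetical, and real HTML is often not well-formed XML, so a more lenient parser would be needed outside of a toy example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product markup.
page = """<html><body>
  <div class="product">
    <span class="sku">SKU-1001</span>
    <span class="price">19.99</span>
  </div>
</body></html>"""

root = ET.fromstring(page)

# ElementTree paths start with ".//"; the equivalent full XPath
# expression would be //span[@class='price'].
price = root.findtext(".//span[@class='price']")
sku = root.findtext(".//span[@class='sku']")
```

In a crawl, the same selector would be evaluated on every page, producing one extracted value per URL.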
7. Log File Analysis
- Description: Log analysis is handled by a separate tool, the Screaming Frog Log File Analyser, rather than by a crawl mode in the SEO Spider. You import server log files and analyze how search engine bots and other clients actually request pages on the site.
- Use Case: Useful for identifying issues related to search engine crawling, such as crawl budget, crawl frequency, and which pages bots request. It can also help detect errors served to bots and wasted crawl resources.
- Crawls: No actual crawling is done. The analysis is based on imported log file data.
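The underlying idea of log file analysis can be sketched in a few lines: parse each request, keep the ones from a search engine bot, and count them per URL and status code. The example below assumes the common "combined" access log format and uses made-up log lines. Matching on the user-agent string alone does not verify that a request really came from Googlebot.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about the server).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per (path, status) whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[(match.group("path"), match.group("status"))] += 1
    return hits


sample_lines = [
    '66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2024:13:56:02 +0000] "GET /old-page HTTP/1.1" 404 312 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2024:13:57:11 +0000] "GET /products/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

hits = googlebot_hits(sample_lines)
```

A bot request that returns a 404, as in the second sample line, is the kind of wasted crawl activity this analysis is meant to surface.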
8. API Integrations Crawl
- Description: Screaming Frog can integrate with various third-party APIs (e.g., Google Analytics, Google Search Console, PageSpeed Insights) to enhance your crawl data with performance and traffic metrics.
- Use Case: Best for combining SEO crawl data with external metrics, such as comparing crawl data with search traffic, identifying high-traffic pages with SEO issues, or reviewing PageSpeed scores alongside crawl findings.
- Crawls: Standard crawl, but enriched with API data.
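The "high-traffic pages with SEO issues" use case amounts to joining crawl rows with traffic figures by URL. The sketch below shows that join on in-memory data. The field names (`url`, `status`, `title`), the click counts, and the threshold are hypothetical and do not reflect the column names of any real export.

```python
# Hypothetical crawl results, one dict per URL.
crawl_rows = [
    {"url": "https://www.example.com/", "status": 200, "title": "Home"},
    {"url": "https://www.example.com/pricing", "status": 200, "title": ""},
    {"url": "https://www.example.com/old", "status": 404, "title": ""},
]

# Hypothetical search clicks per URL from an analytics source.
clicks_by_url = {
    "https://www.example.com/": 1200,
    "https://www.example.com/pricing": 450,
    "https://www.example.com/old": 3,
}


def high_traffic_issues(rows, clicks, min_clicks=100):
    """Return URLs that have an issue and at least min_clicks clicks."""
    flagged = []
    for row in rows:
        # An "issue" here is a non-200 status or a missing title.
        has_issue = row["status"] != 200 or not row["title"]
        if has_issue and clicks.get(row["url"], 0) >= min_clicks:
            flagged.append(row["url"])
    return flagged


flagged = high_traffic_issues(crawl_rows, clicks_by_url)
```

Here only the pricing page is flagged: it has traffic and a missing title, while the 404 page receives too few clicks to pass the threshold.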
9. Headless Mode (Command Line)
- Description: In headless mode, Screaming Frog runs from the command line without its graphical user interface. The crawl itself behaves as configured; only the interface is absent.
- Use Case: Useful for scheduled or automated crawls, or for running crawls on a server where no desktop environment is available.
- Crawls: Whatever the supplied configuration specifies, with or without JavaScript rendering.
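In the tool's command-line interface, headless refers to running without the graphical user interface. The sketch below only assembles such a command as an argument list and does not execute it. The executable name shown is the one typically used on Linux and is an assumption here, as are the start URL and output folder; check the options against the documentation for your installed version.

```python
import shlex


def build_crawl_command(start_url, output_folder,
                        binary="screamingfrogseospider"):
    """Assemble arguments for a headless crawl (binary name is an assumption)."""
    return [
        binary,
        "--crawl", start_url,              # URL to start crawling from
        "--headless",                      # run without the user interface
        "--save-crawl",                    # save the crawl file when finished
        "--output-folder", output_folder,  # where results are written
    ]


cmd = build_crawl_command("https://www.example.com", "/tmp/sf-crawl")

# Shell-safe string form, e.g. for a scheduled job definition.
command_line = shlex.join(cmd)
```

The list form can be passed to `subprocess.run` from a scheduler script, which avoids shell quoting problems.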
10. Crawling Canonical and Hreflang URLs
- Description: Configure whether Screaming Frog stores and crawls the URLs referenced in canonical tags and hreflang annotations (alternative language versions).
- Use Case: Useful for auditing international websites or ensuring that canonical tags are properly set up and not leading to crawl issues.
- Crawls: URLs discovered through canonical or hreflang attributes in addition to normal links, depending on the configuration.
