Why is it important?
Both are right.
How HTTP requests work
What is the DOM?
DOM stands for Document Object Model:
- Document: This is the web page.
- Object: Every element on the web page.
- Model: Describes the hierarchy within the document (e.g. <title></title> goes into the <head></head> section).
The issues this presents
If it took your browser several seconds to fully render the web page, and the page source didn’t contain much body content, then how will search engines figure out what this page is about?
They’ll need to render the page, similar to what your browser just did, but without having to display it on a screen. Search engines use a so-called “headless browser.”
Note: we keep talking about “search engines,” but from here on out, we’ll be focusing on Google specifically. Mind you, Bing (which also powers Yahoo and DuckDuckGo) renders JavaScript as well, but since its market share is much smaller than Google’s, we’ll focus on Google.
We’ll explain every step of the process:
1. Crawl Queue: It keeps track of every URL that needs to be crawled, and it is updated continuously.
2. Crawler: When the crawler (“Googlebot”) receives URLs from the Crawl Queue, it requests their HTML.
3. Processing: The HTML is analyzed, and
a) URLs found are passed on to the Crawl Queue for crawling.
b) The need for indexing is assessed. For instance, if the HTML contains a meta robots noindex, then the page won’t be indexed (and will not be rendered either!). The HTML will also be checked for any new and changed content. If the content didn’t change, the index isn’t updated.
c) URLs that need to be rendered are added to the Render Queue. Please note that Google can already use the initial HTML response while rendering is still in progress.
d) URLs are canonicalized (note that this goes beyond the canonical link element; other canonicalization signals, such as XML sitemaps and internal links, are taken into account as well).
4. Render Queue: It keeps track of every URL that needs to be rendered, and, similar to the Crawl Queue, it’s updated continuously.
5. Renderer: When the renderer (Web Rendering Service, or “WRS” for short) receives URLs, it renders them and sends back the rendered HTML for processing. Steps 3a, 3b, and 3d are repeated, but now using the rendered HTML.
6. Index: It analyzes content to determine relevance, structured data, and links, and it (re)calculates the PageRank and layout.
7. Ranking: The ranking algorithm pulls information from the index to provide Google users with the most relevant results.
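To make the flow of the steps above more concrete, here is a heavily simplified sketch in Python. It is a toy model, not how Google actually implements any of this: the `Page` class, the `run_pipeline` function, and the crude `"noindex" in html` check are all assumptions made for illustration, and canonicalization and ranking are left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy stand-in for a URL on a site (every field is an illustrative assumption)."""
    initial_html: str                                    # what the crawler receives
    rendered_html: str                                   # what a headless browser would produce
    initial_links: list = field(default_factory=list)    # links present in the initial HTML
    rendered_links: list = field(default_factory=list)   # links that only exist after rendering
    relies_on_js: bool = False


def run_pipeline(site, start_url):
    """Walk a toy site through the crawl queue, processing, and the render queue."""
    crawl_queue = deque([start_url])
    render_queue = deque()
    index = {}
    seen = {start_url}

    def enqueue(links):
        # Pass newly discovered URLs on to the crawl queue.
        for link in links:
            if link in site and link not in seen:
                seen.add(link)
                crawl_queue.append(link)

    while crawl_queue or render_queue:
        if crawl_queue:
            url = crawl_queue.popleft()
            page = site[url]                      # the crawler requests the HTML
            enqueue(page.initial_links)
            if "noindex" in page.initial_html:    # crude check: not indexed, not rendered
                continue
            index[url] = page.initial_html        # the initial HTML can already be used
            if page.relies_on_js:
                render_queue.append(url)
        else:
            url = render_queue.popleft()
            page = site[url]
            enqueue(page.rendered_links)          # second pass, now on rendered HTML
            if "noindex" in page.rendered_html:
                index.pop(url, None)
            else:
                index[url] = page.rendered_html
    return index
```

Note how a page that relies on JavaScript passes through processing twice, and how links that only exist in the rendered HTML are discovered one full cycle later.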
- Forward URLs that need to be crawled to the Crawl Queue.
- Forward information that needs to be indexed to the Index phase.
This makes the whole crawling and indexing process very inefficient and slow.
Imagine having a site with 50,000 pages, where Google needs to do a double-pass and render all of those pages. That doesn’t go down great, and it negatively impacts your SEO performance—it will take forever for your content to start driving organic traffic and deliver ROI.
Rest assured, when you continue reading you’ll learn how to tackle this.
Avoid search engines having to render your pages
Based on the initial HTML response, search engines need to be able to fully understand what your pages are about and what your crawling and indexing guidelines are. If they can’t, you’re going to have a hard time getting your pages to rank competitively.
Include essential content in initial HTML response
If you can’t prevent your pages from needing to be rendered by search engines, then at least make sure essential content, such as the title and meta elements that go into the <head> section, is included in the initial HTML response. This enables Google to get a good first impression of your page.
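One way to sanity-check this is to inspect the raw HTML your server returns, before any JavaScript has run. The sketch below uses Python's standard `html.parser` module. The names `HeadAudit` and `missing_essentials` are made up for this example, and which elements count as "essential" is an assumption you would adapt to your own site.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collects the <title> text and named <meta> elements from raw HTML."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def missing_essentials(initial_html, required_meta=("description",)):
    """Return the essentials that are absent from the initial HTML response."""
    audit = HeadAudit()
    audit.feed(initial_html)
    missing = []
    if not audit.title.strip():
        missing.append("title")
    for name in required_meta:
        if not audit.meta.get(name, "").strip():
            missing.append(f"meta:{name}")
    return missing
```

Feeding it a typical client-side rendered shell (an empty `<head>` plus a script tag) should flag both the title and the meta description as missing.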
All pages should have unique URLs
Every page on your site needs to have a unique URL; otherwise Google will have a really hard time exploring your site and figuring out what your pages need to rank for.
Don’t use fragments in URLs to load new pages, as Google will mostly ignore these. While it may be fine for visitors to check out your “About Us” page on https://example.com#about-us, search engines will often disregard the fragment, meaning they won’t learn about that URL.
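To see why fragment-based "pages" collapse into a single URL, you can strip the fragment the way a crawler effectively does. This is a small sketch using `urllib.parse.urldefrag` from Python's standard library; the `crawlable_url` helper and the example URLs are made up for illustration.

```python
from urllib.parse import urldefrag


def crawlable_url(url):
    """Return the URL without its fragment, roughly what a crawler keeps."""
    return urldefrag(url).url


fragment_pages = [
    "https://example.com#about-us",
    "https://example.com#contact",
]
real_pages = [
    "https://example.com/about-us/",
    "https://example.com/contact/",
]

# Both fragment URLs reduce to the same address; the path-based URLs stay distinct.
print({crawlable_url(u) for u in fragment_pages})
print({crawlable_url(u) for u in real_pages})
```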
Did you know...
…that technical SEOs recommend using fragments for faceted navigation on places like eCommerce sites to preserve crawl budget?
Include navigational elements in your initial HTML response
All navigational elements should be present in the HTML response. Including your main navigation is a no-brainer, but don’t forget about your sidebar and footer, which contain important contextual links.
And especially in eCommerce, this one is important: pagination. While infinite scrolling makes for a cool user experience, it doesn’t work well for search engines, as they don’t interact with your page. So they can’t trigger any events required to load additional content.
Here’s an example of what you need to avoid, as it requires Google to render the page to find the navigation link:
Instead, do this:
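As a hypothetical illustration of the difference, the sketch below extracts links from raw HTML the way a non-rendering crawler might. The two markup snippets and the `goto()` click handler are assumptions made up for this example, as are the `LinkExtractor` and `links_in_initial_html` names.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> elements without executing any JavaScript."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_in_initial_html(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.hrefs


# Hypothetical link that only works once JavaScript handles the click.
JS_ONLY_LINK = """<a onclick="goto('https://example.com/shoes/')">Shoes</a>"""

# Hypothetical plain link that is visible in the initial HTML.
PLAIN_LINK = """<a href="https://example.com/shoes/">Shoes</a>"""

print(links_in_initial_html(JS_ONLY_LINK))
print(links_in_initial_html(PLAIN_LINK))
```

Only the second snippet exposes a URL that can be forwarded for crawling without rendering the page first.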
Send clear, unambiguous indexing signals
Meta robots directives
- If you’ve got a <meta name="robots" content="noindex, follow" /> included in the initial HTML response that you’ve overwritten with an index directive using JavaScript, Google won’t pick up on the change: because they see the noindex, they decide not to spend precious rendering resources on the page.
On top of that, even if they were to discover that the noindex has been changed to index, Google generally adheres to the most restrictive directives, which is the noindex in this case.
- But what if you do it the other way around, having an index directive in the initial HTML response and then overwriting it with a noindex using JavaScript?
In that case, Google is likely to just index the page because it’s allowed according to the initial HTML response. However, only after the page has been rendered does Google find out about the noindex and remove the page from its index. For a (brief) period of time, the page that you didn’t want indexed was in fact indexed and possibly even ranking.
Overwriting canonical links causes just as much mayhem.
John Mueller said: “We (currently) only process the rel=canonical on the initially fetched, non-rendered version.” Martin Splitt has said that this “undefined behaviour” leads to guesswork on Google’s part, and that’s really something you should avoid.
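A rough way to spot these conflicts is to compare the meta robots directive and canonical link in the initial HTML with those in the rendered HTML. The sketch below assumes you already have both versions as strings (obtaining the rendered HTML requires a headless browser and is out of scope here). `SignalParser`, `indexing_signals`, and `conflicting_signals` are names made up for this example.

```python
from html.parser import HTMLParser


class SignalParser(HTMLParser):
    """Picks up the meta robots directive and canonical link from an HTML string."""

    def __init__(self):
        super().__init__()
        self.robots = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = (attrs.get("content") or "").lower()
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")


def indexing_signals(html):
    parser = SignalParser()
    parser.feed(html)
    return {"robots": parser.robots, "canonical": parser.canonical}


def conflicting_signals(initial_html, rendered_html):
    """Return each signal whose value differs, as (initial, rendered) pairs."""
    before = indexing_signals(initial_html)
    after = indexing_signals(rendered_html)
    return {key: (before[key], after[key]) for key in before if before[key] != after[key]}
```

An empty result means JavaScript did not change either signal; anything else is worth fixing in the initial HTML response.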
rel="nofollow" link attribute value
The same goes for adding the rel="nofollow" link attribute value using JavaScript. Again, this is a waste of crawl budget and only leads to confusion.
Don’t forget to include other directives in your initial HTML response as well, such as for instance:
Remove render-blocking CSS
Leverage code splitting and lazy loading
Implement image lazy loading with loading attribute
Lazy-loading images is a great way to improve page load speed, but you don’t want Google to have to fully render a page to figure out what images are included.
Here’s an example of an image that uses the loading attribute:
<img src="/images/cat.png" loading="lazy" alt="Black cat" width="250" height="250">
By including images via the loading attribute, you get the best of both worlds:
- Search engines are able to extract the image URLs directly from the HTML (without having to render it).
- Your visitors’ browsers know to lazy-load the image.
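The first point can be checked with a small sketch that pulls image URLs out of raw HTML without rendering it. The `data-src` pattern in the second snippet is an assumption about how a JavaScript-based lazy loader might mark up images, and `ImageExtractor` and `images_in_initial_html` are names made up for this example.

```python
from html.parser import HTMLParser


class ImageExtractor(HTMLParser):
    """Collects (src, loading) pairs for <img> elements that have a src attribute."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            if attrs.get("src"):
                self.images.append((attrs["src"], attrs.get("loading")))


def images_in_initial_html(html):
    parser = ImageExtractor()
    parser.feed(html)
    return parser.images


# The image from the example above: the URL is in the HTML, the browser lazy-loads it.
NATIVE_LAZY = '<img src="/images/cat.png" loading="lazy" alt="Black cat" width="250" height="250">'

# Assumed script-based pattern: the real URL only becomes the src after JavaScript runs.
SCRIPT_LAZY = '<img data-src="/images/cat.png" class="lazy" alt="Black cat">'

print(images_in_initial_html(NATIVE_LAZY))
print(images_in_initial_html(SCRIPT_LAZY))
```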
Don’t assume everyone has the newest iPhone and access to fast internet
Don’t make the mistake of assuming everyone is walking around with the newest iPhone and has access to 4G and a strong WiFi signal. Be sure to test your site’s performance on different and older devices, and on slower connections. And don’t just rely on lab data; instead, rely on field data.
While there are lots of rendering options (e.g. pre-rendering) out there, covering them all is outside of the scope of this article. Therefore, we'll cover the most common rendering options to help provide search engines (and users!) a better experience:
- Server-side rendering
- Dynamic rendering
Server-side rendering is the process of rendering web pages on the server before sending them to the client (browser or crawler), instead of just relying on the client to render them.
Pros:
- Every element that matters for search engines is readily available in the initial HTML response.
- It provides a fast First Contentful Paint ("FCP").
Cons:
- Slow Time to First Byte ("TTFB"), because the server has to render web pages on the fly.
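As a minimal sketch of the idea, the function below builds the complete HTML on the server from page data, so the title, meta description, and body content are all present in the initial response. The `render_product_page` function and the shape of the `product` dictionary are assumptions for illustration; a real setup would typically rely on a framework's server-side rendering support instead.

```python
from html import escape


def render_product_page(product):
    """Build the full HTML on the server, so no client-side rendering is needed."""
    name = escape(product["name"])
    description = escape(product["description"])
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"  <title>{name}</title>\n"
        f'  <meta name="description" content="{description}">\n'
        "</head>\n<body>\n"
        f"  <h1>{name}</h1>\n"
        f"  <p>{description}</p>\n"
        "</body>\n</html>"
    )


print(render_product_page({"name": "Black cat", "description": "A very black cat."}))
```

The work done inside this function on every request is also where the slower TTFB comes from, which is why server-rendered pages are often cached.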
Dynamic Rendering means that a server responds differently based on who made a request. If it’s a crawler, the server renders the HTML and sends that back to the client, whereas a visitor needs to rely on client-side rendering.
Pros:
- Every element that matters for search engines is readily available in the initial HTML response sent to search engines.
- It’s often easier and faster to implement.
Cons:
- It makes debugging issues more complex.
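The core decision in dynamic rendering can be sketched as a check on the User-Agent request header. The token list below is an illustrative assumption and far from exhaustive, and `is_crawler` and `respond` are hypothetical helpers, not part of any library.

```python
# Assumption: a short, illustrative list of crawler User-Agent tokens.
CRAWLER_TOKENS = ("googlebot", "bingbot")


def is_crawler(user_agent):
    """Return True if the User-Agent string contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def respond(user_agent, prerendered_html, client_side_shell):
    """Serve rendered HTML to crawlers and the client-side app shell to visitors."""
    if is_crawler(user_agent):
        return prerendered_html
    return client_side_shell
```

Because crawlers and visitors now receive different responses, an issue may show up in one version only, which is part of why debugging becomes more complex.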
How do I check what my rendered pages look like?
Note that you can also access the HTML tab, which shows you the rendered HTML. This can be helpful for debugging.
And as you can imagine, Google needs to set priorities in rendering, because not all websites are equal. Therefore, websites have an assigned render budget. This allows Google to dedicate more time to rendering pages that they expect visitors to search for more often.
What about social media crawlers?
Social media crawlers like those from Facebook, Twitter, LinkedIn, and Slack need to have easy access to your HTML as well so that they can generate meaningful snippets.
If they can’t find a page’s Open Graph or Twitter Card markup or, if those aren’t available, your title and meta description, they won’t be able to generate a snippet. This means your snippet will look bad, and it’s likely you won’t get much traffic from these social media platforms.
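The fallback described above can be sketched as follows. The exact order in which each platform consults these tags differs and is an assumption here, and `SnippetParser` and `social_snippet` are names made up for this example. The important part is that everything is read from the initial HTML, since no JavaScript is executed.

```python
from html.parser import HTMLParser


class SnippetParser(HTMLParser):
    """Collects <title> text plus meta tags keyed by their property or name."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key:
                self.meta[key.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def social_snippet(initial_html):
    """Pick a title and description using an assumed fallback order."""
    parser = SnippetParser()
    parser.feed(initial_html)
    meta = parser.meta
    title = meta.get("og:title") or meta.get("twitter:title") or parser.title.strip() or None
    description = (
        meta.get("og:description")
        or meta.get("twitter:description")
        or meta.get("description")
        or None
    )
    return {"title": title, "description": description}
```

If both values come back as `None` for your initial HTML, a social media crawler has nothing to build a snippet from.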
What do Google bots look for?
Google's crawlers continuously look for new content, and when they find it a separate process will process it. That process will first look at the initial HTML response. This response needs to include all of the essential content for search engines to understand what the page is about, and what relations it has to other pages.
Social media crawlers, such as Facebook's, can't render JavaScript. Therefore, it's highly recommended to use either server-side rendering or dynamic rendering. Otherwise, Facebook (and other social media platforms) won't be able to generate a good snippet for your URL when it's shared.