SEO Requirements for a New Website
Building websites is hard. Building SEO-friendly websites is even harder. It's all too common that a development team's members are high-fiving each other after a massive project, and yet it turns out that the website's not up to contemporary SEO standards.
There are many reasons why this can happen. The four most common are:
- The development team wasn't briefed properly.
- The development team wasn't educated well enough by the SEO specialists.
- SEOs were involved way too late in the process.
- The development team thought they had "this SEO thing" covered.
Requirements engineering—figuring out what you need to build—is a true science and is essential for the success of every development project, on and off the web.
If you don't know what the project's goals are, what to build, and what standards to adhere to, how can you make the project successful and build exactly what your customer wants?
In this article we'll cover all the SEO requirements you need to consider when building or maintaining a website. It's useful for everyone in the process: the customer who wants to have the website built, the SEO specialists who are tasked with ensuring the new website is SEO-proof, and the web development firms that are doing the actual building.
When do SEO specialists need to be involved?
SEO specialists need to be involved before, during, and after the building of a new website. It's important to keep them informed and updated whenever new websites are being built and changes are being made.
When you're launching a new website development project, involve SEOs from the get-go, because the new website's SEO requirements will heavily influence its price.
All too often, SEO specialists' involvement in the development process starts way too late. They then quickly realize that the new website isn't going to be SEO-proof as-is, so they compile a list of SEO requirements for which an additional budget is required. This results in a messy project flow, higher costs, and delays… none of which makes the customer happy!
Now let's get our hands dirty and move on to the actual SEO requirements!
All the requirements in this section are no-brainers. They've been industry standards for years, so they should come as no surprise to anyone.
A responsive design means that the design adjusts itself to the device it's being used on. Responsive designs are great for both visitors and search engine crawlers. They provide a consistent experience for visitors to your website, regardless of the device they're using. Google values websites that provide their visitors with a good mobile experience (and other search engines are doing the same). The reason behind this is that in late 2015, the number of mobile searches in Google surpassed the number of desktop searches. So making sure your website caters to mobile visitors is very important.
Responsive designs make SEO specialists' lives easier as well, because they only have one URL to promote per page. In the past, you'd have separate desktop and mobile sites that would both get linked to from other websites, and you'd need to consolidate those link and relevancy signals. This would always be sub-optimal compared to just getting links to a single URL.
Accelerated Mobile Pages
If you're working on a website for a publisher, then you need to consider implementing Accelerated Mobile Pages (AMP). We don't recommend implementing AMP for any other type of website.
AMP is an open-source initiative and format by Google with the aim of speeding up the web experience for mobile users. AMP pages are essentially stripped-down versions of pages that are optimized to load fast on mobile devices.
So why is AMP only useful for publishers? Because for publishers, it enables you to get into the Google News carousel, which can drive lots and lots of traffic… but other types of sites can't benefit from this, so for them the cons outweigh the pros.
HTTPS stands for Hyper Text Transfer Protocol Secure. It's the secure version of HTTP, the protocol over which data is sent between your browser and the website you're visiting. Using HTTPS makes it harder for people to try and eavesdrop on you.
Google's been pushing for websites to adopt HTTPS and has made it a minor ranking factor. While it may help a little, it's not going to provide a significant competitive edge in terms of SEO. To support their cause, though, they're showing a "Not Secure" notice in the browser's address bar for sites that contain forms and aren't running on HTTPS.
HTTPS is a must-have. Any new website that's being built today should be served over HTTPS. That's why ContentKing can check a website to see if it's available via HTTPS and whether its HTTPS certificate is valid:
Make sure to load all resources over HTTPS, because you want to avoid so-called "mixed content" issues. Mixed content issues occur when some resources are loaded over HTTP instead of HTTPS, thereby leaving the page unsafe.
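As an illustration (the URLs are hypothetical), a single hard-coded HTTP resource is enough to trigger a mixed content warning on an HTTPS page:

```html
<!-- Mixed content: this stylesheet loads over plain HTTP on an HTTPS page -->
<link rel="stylesheet" href="http://www.example.com/css/main.css">

<!-- Fixed: load the same resource over HTTPS -->
<link rel="stylesheet" href="https://www.example.com/css/main.css">
```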
Avoid relying on client-side JavaScript rendering for content you want indexed. The reasoning here is:
- Search engines' page-rendering resources are limited. Rendering a page can easily cost twenty times as many resources as crawling a regular HTML page, so search engines can only allocate a small portion of their resources to this. This results in waiting days, if not weeks, for your content to be indexed.
- Content that isn't indexed doesn't rank. Until your content is indexed, you'll get zero traffic from search engines.
SEO aside, client-side rendering also makes for a higher Time to Interactive (TTI). This means that visitors will have to wait longer before they're able to interact with the page.
Studies have shown that visitors like fast-loading pages. They decrease bounce rates and raise conversion rates. Amazon found that their revenue increased by 1% for every 100ms decrease in load time. On top of that, having fast loading pages helps your SEO. Especially on the first page of Google, page speed can really make a difference. And, as of May 2021, Google will take into account more metrics that aim to measure user experience. This new set of metrics is called Core Web Vitals.
There are hundreds of tweaks you can make to your website and web server that will help you get better page speed, but here are the most common best practices that you should take into account:
- Use a content delivery network
- Optimize your images
- Reduce server response time to <500 ms
- Use browser caching and file compression
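As a sketch of the last two items, here's what browser caching and compression might look like in an nginx configuration (the file types and lifetimes are illustrative, not recommendations):

```nginx
# Enable gzip compression for text-based resources
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;

# Tell browsers to cache static assets for 30 days
location ~* \.(css|js|png|jpg|jpeg|webp|svg)$ {
    expires 30d;
    add_header Cache-Control "public";
}
```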
Hierarchy of headings
Headings, H1–H6, are used to provide hierarchy and clarity for your web pages. By using headings appropriately, you can ensure that visitors can scan your pages quickly, and help search engine crawlers grasp your pages' content and structure.
An H1 heading on a page should convey its main topic. For that reason, you shouldn't put the H1 heading around the logo, and you should only use one H1 heading per page.
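A hypothetical product page might structure its headings like this:

```html
<h1>Running shoes</h1>          <!-- one H1: the page's main topic -->
  <h2>Trail running shoes</h2>
    <h3>Waterproof models</h3>
  <h2>Road running shoes</h2>
```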
The robots.txt file contains the rules of engagement for crawlers. You use it to tell crawlers what sections they can't access and to give them hints that help them discover your content efficiently by referencing the location of your XML sitemap.
Here's what's important to keep in mind when dealing with robots.txt:
- Different search engines interpret the robots.txt differently.
- Be careful not to Disallow files that are required to render pages. This keeps search engines from rendering your pages, and it could hurt your SEO performance.
- And last but not least: robots.txt is very powerful. Be careful around it, as you can easily make your whole site inaccessible to search engines. Therefore it's important to monitor your robots.txt file.
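A minimal robots.txt illustrating these points could look like this (the paths are hypothetical):

```txt
User-agent: *
# Keep crawlers out of internal search results
Disallow: /search/
# Note: don't disallow directories with assets needed for rendering, e.g. /css/ or /js/

Sitemap: https://www.example.com/sitemap.xml
```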
XML sitemaps are an efficient way of telling search engines about the content you have on your website. Therefore, they play an important role in making sure your content is crawled quickly after publishing or updating.
Best practices surrounding XML sitemaps:
- Keep the XML Sitemap up to date with your website's content.
- Make sure it's clean: only indexable pages should be included.
- Reference the XML Sitemap from your robots.txt file.
- Don't list more than 50,000 URLs in a single XML Sitemap.
- Make sure the (uncompressed) file size doesn't exceed 50MB.
- Don't obsess about the priority attribute; major search engines largely ignore it.
Keep in mind that there are special XML sitemaps for images and news articles.
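A minimal XML sitemap following these practices might look like this (the URLs and dates are made up):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2021-05-01</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/example-article</loc>
    <lastmod>2021-04-20</lastmod>
  </url>
</urlset>
```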
HTTP status codes
It's all too common that "Page not found" error pages aren't returning HTTP status code 404. This status code clearly communicates that the page doesn't exist.

HTTP status code 404 should be used for pages that never existed and can be used for pages that used to exist. We're deliberately writing can because there's an alternative status code that's more definitive: HTTP status code 410. HTTP status code 410 signals to search engines that the page has been removed and will never return. Because of its definitive nature, be careful with this one as there's no turning back.
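In an nginx configuration, deliberately serving a 410 for permanently removed content could look like this (the path is hypothetical):

```nginx
# This product line was removed for good and will never return
location /old-product-line/ {
    return 410;
}

# Anything that simply doesn't exist falls through to a regular 404 page
error_page 404 /404.html;
```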
Information Architecture is the art and science of designing a structure for presenting your website's content. It's about defining what content is present and how it's made accessible. Information Architecture is where User Experience (UX) and Search Engine Optimization (SEO) meet. Everyone benefits from having a good information architecture, so this is an important phase in building a new website.
In this section, we'll be focusing on the Information Architecture SEO requirements for web development. Part of that is defining the URL structure, and how it behaves under certain conditions.
URLs should be lowercase, descriptive, readable, and short.
Ideally, URLs shouldn't have extensions such as .html, .php, and .aspx, as omitting them allows you to switch platforms without having to redirect these URLs. Whatever structure you pick, make sure it's used consistently across your entire website.
The website needs to support defining a template that details how URLs are constructed.
Things to consider:
- Are you using subdirectories in your URLs or not?
- Are you using a trailing slash (a slash at the end of each URL) or not?
- Also, make sure you're able to manually overwrite the URL template if need be.
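To illustrate trailing-slash consistency, a site could redirect the non-canonical form with a rule along these lines (a sketch, assuming nginx and a no-trailing-slash convention):

```nginx
# Redirect URLs with a trailing slash to the version without one
rewrite ^/(.+)/$ /$1 permanent;
```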
Filters, variants, and multiple categories
If the new website contains filters, describe how they're going to affect the URL structure, and whether the resulting URLs should be accessible and indexable for search engines or not.
Do the same for page variants.
Example: you're going to be building a new eCommerce website for selling shoes. Each shoe is available in different colors and sizes, easily leading to dozens of product variants. Which pages do you want to be accessible and indexable, and which not?
And what about blog articles that are in multiple categories? Or products that are in multiple categories? There may be very good reasons to do so from a user point of view, but from an SEO point of view, it can really be a headache.
A blog article is in categories A, B, and C, leading to the following URLs:

https://example.com/blog/a/example-article
https://example.com/blog/b/example-article
https://example.com/blog/c/example-article

Category A is the primary category, so this article will be indexable on https://example.com/blog/a/example-article, while https://example.com/blog/b/example-article and https://example.com/blog/c/example-article should both be canonicalized to the primary URL.
Smart templates that save you time
From an SEO standpoint, it's important to be able to define templates for the title tag, meta description, and headings. They help you optimize consistently, while also saving you a lot of work.
You need to be able to define these templates on a website level, as well as for category pages and for all pages in a certain category. Specificity wins, so the most specific template for a page will be used.
For example, with $pageName as a placeholder for the page's name:

Website level:
$pageName - HappyShoes

Pages in category "adidas":
$pageName - Buy adidas shoes

You've got a page named "adidas Ultra Boost size 44", so the most specific template applies and its title tag becomes: adidas Ultra Boost size 44 - Buy adidas shoes.
Manually overwriting smart defaults
On a page level, you should be able to overwrite these templates with manually defined ones.
That way, templates give you sensible defaults for all of your blog articles, or all the pages surrounding a certain service, while still letting you fine-tune individual pages.
Crawling & Indexing
All the requirements described in this section are aimed at letting search engines crawl and index your website efficiently. You want search engines to learn quickly about any new or updated content. And upon crawling your content, you want them to quickly understand it as well as possible.
Robots directives let you choose how crawlers should treat your pages, with the most well-known directives being the noindex and nofollow directives. You can define these robots directives using the meta robots tag in the <head> section of a page, but you can also do it through the X-Robots-Tag HTTP header.

By default, the robots directives should allow indexing and should not have the nofollow directive applied.
Here again, you want to define the robots directives on the following three levels:
- The website level
- The category/segment level
- The page level
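For instance, to keep a page out of the index while still letting crawlers follow its links, the meta robots tag would look like this:

```html
<!-- In the page's <head> section -->
<meta name="robots" content="noindex, follow">
```

The same directives can alternatively be sent as an HTTP header: X-Robots-Tag: noindex, follow.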
Canonical URLs inform search engines that they should prefer one page over other identical or similar pages. For instance, if you have three near identical pages—A, B, and C—and you want page A to be indexed, you then canonicalize both B and C to A.
By default, canonical URLs should be self-referencing—telling search engines they're the right variant to index.
When it comes to canonicals, you only want to define these on the page level.
Best practices surrounding canonical URLs:
- Use absolute URLs, including the domain and protocol.
- Define only one canonical URL per page.
- Define the canonical URL in the page's <head> section or HTTP header.
- Point to an indexable page.
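Following these practices, page B from the A/B/C example above would carry a tag like this in its <head> section (the URLs are illustrative):

```html
<!-- On page B: point search engines at page A as the preferred version -->
<link rel="canonical" href="https://www.example.com/page-a" />
```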
For a website that's available in multiple languages and/or regions, be sure to use the hreflang attribute. The hreflang attribute is used to indicate what language your content is in and what geographical region your content is meant for. You can define the hreflang attribute by including it in the <head> section, or by using the HTTP header.
Say you have an English, Dutch, and French version of the website. You can use the hreflang attribute to point search engines to the translated versions of your pages. The page's English version would have the following in its <head> section:

```html
<link rel="canonical" href="https://www.example.com/" />
<link rel="alternate" hreflang="en" href="https://www.example.com/" />
<link rel="alternate" hreflang="nl" href="https://www.example.nl/" />
<link rel="alternate" hreflang="fr" href="https://www.example.fr/" />
<link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
```
Best practices surrounding hreflang:
- Reference both the page itself and its translated variants.
- Make sure to have bidirectional hreflang references: every page referenced must link back.
- Correctly define language and region combinations.
- Always set an x-default.
- The URL in the hreflang attribute and the canonical URL must match.
- Use absolute URLs when defining the hreflang attribute.
- Use only one method to implement the hreflang attribute.
Using structured data, you can provide additional information about your content. For instance, you can use Schema.org to do this for search engines, while Open Graph serves platforms such as Facebook, LinkedIn, and Slack, and Twitter Cards do this for Twitter. By using structured data, you ensure you have full control of how your content is presented.
Schema.org is often used to mark up reviews and to communicate who wrote an article and what organization a website belongs to. Or in the case of a local business, you can use Schema.org to explain the type of business and what its opening hours are. There are lots of opportunities here, so make sure your website supports the defining of Schema.org properties. The markup should be added to the HTML.
There are multiple ways to define Schema, but the most popular (and Google-preferred) one is using the JSON-LD format.
You should be able to define these on the following levels:
- The website level (e.g. for defining Organization)
- The page level (e.g. for defining Reviews)
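A minimal website-level Organization definition in the JSON-LD format could look like this (the organization details are made up):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "HappyShoes",
  "url": "https://www.example.com/",
  "logo": "https://www.example.com/logo.png"
}
</script>
```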
Open Graph markup
There are four required Open Graph properties that you should be able to define: og:title, og:type, og:image, and og:url.
There are also two recommended properties; use these to provide even more context about the content: og:description and og:site_name.
You should be able to define these on the following levels:
- The Website level (e.g. for defining a default image)
- The Category/Segment level (e.g. for defining a default image for all blog articles)
- The Page level (e.g. for defining Reviews)
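Put together, the Open Graph markup for a hypothetical blog article might look like this:

```html
<meta property="og:title" content="Example article" />
<meta property="og:type" content="article" />
<meta property="og:image" content="https://www.example.com/images/example.jpg" />
<meta property="og:url" content="https://www.example.com/blog/example-article" />
<meta property="og:description" content="A short summary of the article." />
<meta property="og:site_name" content="HappyShoes" />
```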
Twitter Card markup
Twitter Cards are quite similar to Open Graph markup, but with Twitter Cards there are four different card types:
- Summary Card
- Summary Card with Large Image
- App Card
- Player Card
Here are the required Twitter Card properties: twitter:card and twitter:title.
But we highly recommend also including the three properties below to provide more context about the content: twitter:site, twitter:description, and twitter:image.
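For a hypothetical article, a Summary Card with Large Image could be defined like this:

```html
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="Example article" />
<meta name="twitter:site" content="@happyshoes" />
<meta name="twitter:description" content="A short summary of the article." />
<meta name="twitter:image" content="https://www.example.com/images/example.jpg" />
```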
While there are multiple factors that determine whether or not an image ranks highly, the website should support defining an alt attribute and a title attribute for each image. Also, when you're uploading media, you need to make sure the URL path makes sense.
For instance, WordPress' default URL path, which contains the year and month, makes for very long URLs, and can also give people the false impression that your media is outdated when it's served from a URL containing an old date.
Using image compression to decrease the file size of your images is recommended too. This improves page load speed.
To ensure that search engines learn about your images quickly and easily, be sure to use an image XML sitemap.
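An entry in an image XML sitemap could look like this (the URLs are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.example.com/blog/example-article</loc>
    <image:image>
      <image:loc>https://www.example.com/images/example.jpg</image:loc>
    </image:image>
  </url>
</urlset>
```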
Navigation is how both visitors and crawlers find your most important pages, so it's essential that you're able to manage it. There are various navigation types, such as:
- Main navigation
- Sidebar navigation
- Footer navigation
Make sure you can manage these for the website, and also specifically for certain sections of the website. Take for example an eCommerce website that sells shoes, pants, and sweaters. In the shoes section, you may want to define a sidebar and footer that contain nothing but links to your most important product pages and sub-category pages about shoes.
In order for any new website to be SEO-proof when it's launched, it has to have good specifications. Use this article to your advantage, and be sure to click through to the in-depth articles about the topics mentioned for additional information.