SEO Requirements for a New Website
Building websites is hard. Building SEO-friendly websites is even harder. It's all too common that a development team's members are high-fiving each other after a massive project, and yet it turns out that the website's not up to contemporary SEO standards.
There are many reasons why this can happen. The four most common are:
- The development team wasn't briefed properly.
- The development team wasn't educated well enough by the SEO specialists.
- SEOs were involved way too late in the process.
- The development team thought they had "this SEO thing" covered.
Requirements engineering—figuring out what you need to build—is a true science and is essential for the success of every development project, on and off the web.
If you don't know what the project's goals are, what to build, and what standards to adhere to, how can you make the project successful and build exactly what your customer wants?
In this article we'll cover all the SEO requirements you need to consider when building or maintaining a website. It's useful for everyone in the process: the customer who wants to have the website built, the SEO specialists who are tasked with ensuring the new website is SEO-proof, and the web development firms that are doing the actual building.
When do SEO specialists need to be involved?
SEO specialists need to be involved before, during, and after the building of a new website. It's important to keep them informed and updated whenever new websites are being built and changes are being made.
When you're launching a new website development project, involve SEOs from the get-go, because the new website's SEO requirements will heavily influence its price.
All too often, SEO specialists' involvement in the development process starts way too late. They then quickly realize that the new website isn't going to be SEO-proof as-is, so they compile a list of SEO requirements for which an additional budget is required. This results in a messy project flow, higher costs, and delays… none of which makes the customer happy!
Now let's get our hands dirty and move on to the actual SEO requirements!
No-brainers
All the requirements in this section are no-brainers. They've been industry standards for years, so they should come as no surprise to anyone.
Responsive design
A responsive design means that the design adjusts itself to the device it's being used on. Responsive designs are great for both visitors and search engine crawlers. They provide a consistent experience for visitors to your website, regardless of the device they're using. Google values websites that provide their visitors with a good mobile experience (and other search engines are doing the same). The reason behind this is that in late 2015, the number of mobile searches in Google surpassed the number of desktop searches. So making sure your website caters to mobile visitors is very important.
Responsive designs make SEO specialists' lives easier as well, because they only have one URL to promote per page. In the past, you'd have separate desktop and mobile sites that would both get linked to from other websites, and you'd need to consolidate those link and relevancy signals. This would always be sub-optimal compared to just getting links to a single URL.
Accelerated Mobile Pages
If you're working on a website for a publisher, then you need to consider implementing Accelerated Mobile Pages (AMP). We don't recommend implementing AMP for any other type of website.
AMP is an open-source initiative and format by Google with the aim of speeding up the web experience for mobile users. AMP pages are essentially stripped-down versions of pages that are optimized to load fast on mobile devices.
So why is AMP only useful for publishers? Because for publishers, it enables you to get into the Google News carousel, which can drive lots and lots of traffic… but other types of sites can't benefit from this, which makes the cons outweigh the pros.
HTTPS
HTTPS stands for Hypertext Transfer Protocol Secure. It's the secure version of HTTP, the protocol over which data is sent between your browser and the website you're visiting. Using HTTPS makes it harder for others to eavesdrop on you.
Google's been pushing for websites to adopt HTTPS and has made it into a minor ranking factor because of that. While it may help a little, it's not going to provide a significant competitive edge in terms of SEO. To support their cause though, they're showing the "Not Secure" notice in the address bar for sites that contain forms and aren't running on HTTPS.
HTTPS is a must-have. Any new website that's being built today should be served over HTTPS. That's why ContentKing can check a website to see if it's available via HTTPS and whether its HTTPS certificate is valid.
Make sure to load all resources over HTTPS, because you want to avoid having so-called "mixed content" issues. Mixed content issues occur when some resources are loaded over HTTP instead of HTTPS, thereby leaving the page unsafe.
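To make the mixed content check concrete, here's a minimal sketch that scans a page's HTML for resources referenced over plain HTTP. It only uses Python's standard library. The names MixedContentFinder and find_mixed_content are made up for this illustration, and the list of tags it inspects is an assumption, not a complete inventory of everything a browser loads.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collects resources that the page loads over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: only a few common resource tags are checked here.
        if tag in ("img", "script", "iframe"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        else:
            # Regular <a href> links are navigated to, not loaded, so skip them.
            return
        if url and url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(html):
    """Return (tag, url) pairs for every resource referenced via http://."""
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure
```

Running find_mixed_content on the HTML of an HTTPS page gives you a list of resources to fix before launch. An empty list means this particular check found nothing, not that the page is guaranteed to be free of mixed content.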
If your website is still not running on HTTPS, we recommend migrating to HTTPS as soon as possible, because HTTPS is one of the Page Experience signals that Google has treated as a ranking factor since 2021.
Limit your use of JavaScript
A large part of SEO is focused on making it as easy as possible for search engines to crawl and index content correctly. Building a website using a JavaScript framework and relying on client-side rendering results in the complete opposite.
Although search engines can render web pages nowadays, it's not a best practice to actually rely on that, because it results in your content getting crawled and indexed slowly. Instead, make sure to just feed plain HTML to search engines. And if you absolutely must use a JavaScript framework, make sure to use server-side rendering or a pre-rendering solution such as prerender.io.
The reasoning here is:
- Search engines' page-rendering resources are limited. Rendering a page can easily cost twenty times as many resources as crawling a regular HTML page, so search engines can only allocate a small portion of their resources to this. This results in waiting days, if not weeks, for your content to be indexed.
- Content that isn't indexed doesn't rank. Until your content is indexed, you'll get zero traffic from search engines.
SEO aside, client-side rendering also makes for a higher Time to Interactive (TTI). This means that visitors will have to wait longer before they're able to interact with the page.
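A quick way to see whether you're feeding plain HTML to search engines is to check if your key content is present in the HTML the server sends, before any JavaScript runs. The sketch below does that for a string of raw HTML. The function name missing_from_initial_html and the sample phrase in the usage note are hypothetical, and the check is a rough heuristic rather than a replacement for testing with a search engine's own tools.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_from_initial_html(html, phrases):
    """Return the phrases that do not appear in the unrendered HTML's text."""
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in phrases if phrase.lower() not in text]
```

If a phrase such as a product name is reported as missing for the raw HTML but is visible in the browser, the content most likely depends on client-side rendering.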
Page Speed
Studies have shown that visitors like fast-loading pages. They decrease bounce rates and raise conversion rates. Amazon found that their revenue increased by 1% for every 100ms decrease in load time. On top of that, having fast-loading pages helps your SEO. Especially on the first page of Google, page speed can really make a difference. And since 2021, Google has also taken into account more metrics that aim to measure user experience. This set of metrics is called Core Web Vitals.
There are hundreds of tweaks you can make to your website and web server that will help you get better page speed, but here are the most common best practices that you should take into account:
- Use a content delivery network
- Limit the number of JavaScript libraries that are loaded
- Minify JavaScript and CSS files
- Optimize your images
- Reduce server response time to <500 ms
- Use browser caching and file compression
Hierarchy of headings
Headings, H1–H6, are used to provide hierarchy and clarity for your web pages. By using headings appropriately, you can ensure that visitors can scan your pages quickly and help search engine crawlers to grasp your content's topic and structure.
An H1 heading on a page should convey its main topic. For that reason, you shouldn't put the H1 heading around the logo, and you should only use one H1 heading per page.
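These heading rules are easy to check automatically. Here's a minimal sketch that reports when a page doesn't have exactly one H1. It also flags skipped levels (for example an H1 followed directly by an H3), which is our own assumption about what a clean hierarchy looks like rather than a hard rule. The name heading_issues is made up for this example.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Records the level of every h1-h6 tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html):
    """Return a list of human-readable problems with the heading structure."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    issues = []
    h1_count = levels.count(1)
    if h1_count != 1:
        issues.append(f"expected exactly one h1, found {h1_count}")
    # Assumption: going down more than one level at a time is worth flagging.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"h{previous} is followed by h{current}, skipping a level")
    return issues
```

A page that wraps both the logo and the page title in an H1 would show up here with "found 2", which is exactly the situation described above.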
Robots.txt
The robots.txt file contains the rules of engagement for crawlers. You use it to tell crawlers what sections they can't access and to give them hints that help them discover your content efficiently by referencing the location of your XML sitemap.
Here's what's important to keep in mind when dealing with robots.txt:
- Different search engines interpret the robots.txt differently.
- Be careful not to Disallow files that are required to render pages. Doing so keeps search engines from rendering your pages, and it could hurt your SEO performance.
- And last but not least: robots.txt is very powerful. Be careful around it, as you can easily make your whole site inaccessible to search engines. Therefore it's important to monitor your robots.txt file.
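One way to guard against accidentally blocking render-critical files is to test your robots.txt rules against a list of resource URLs before deploying them. The sketch below uses Python's built-in urllib.robotparser for that. The function name blocked_resources and the example rules are hypothetical. Keep in mind that Python's parser is just one interpretation of robots.txt, so real crawlers may read edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules: the /assets/ directory holds JavaScript and CSS files.
rules = """\
User-agent: *
Disallow: /assets/
Sitemap: https://example.com/sitemap.xml
"""

urls = [
    "https://example.com/assets/app.js",
    "https://example.com/blog/",
]
```

Calling blocked_resources(rules, urls) reports the JavaScript file as blocked, which is the kind of rule you'd want to catch before it goes live.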
XML sitemaps
XML sitemaps are an efficient way of telling search engines about the content you have on your website. Therefore, they play an important role in making sure your content is crawled quickly after publishing or updating.
Best practices surrounding XML sitemaps:
- Keep the XML Sitemap up to date with your website's content.
- Make sure it's clean: only indexable pages should be included.
- Reference the XML Sitemap from your robots.txt file.
- Don't list more than 50,000 URLs in a single XML Sitemap.
- Make sure the (uncompressed) file size doesn't exceed 50MB.
- Don't obsess about the lastmod, priority and changefreq properties.
Keep in mind that there are special XML sitemaps for images and news articles.
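The 50,000 URL limit means that larger websites need to split their URLs over multiple XML sitemaps. Here's a minimal sketch of how that could look, using only the standard library. The function name build_sitemaps is made up for this example, and it assumes you've already filtered the input down to indexable URLs. It doesn't check the 50MB file size limit.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def build_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Return a list of sitemap XML strings, each holding at most max_urls URLs."""
    documents = []
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            # Each entry is a <url> element with a single <loc> child.
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents
```

With the default limit, a list of 120,000 URLs would result in three sitemap files. Each of those files would then still need to be referenced, for example from your robots.txt file.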
HTTP status codes
It's all too common that "Page not found" error pages aren't returning an HTTP status code 404. This status code clearly communicates that the page doesn't exist.
HTTP status code 404 should be used for pages that never existed and can be used for pages that used to exist. We're deliberately writing "can" because there's an alternative status code that's more definitive: HTTP status code 410. HTTP status code 410 signals to search engines that the page has been removed and will never return. Because of its definitive nature, be careful with this one as there's no turning back.
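In code, the decision between these status codes can be as simple as the sketch below. The function name status_for and the two sets of paths are hypothetical. In a real website this logic would live in your framework's routing or error handling.

```python
from http import HTTPStatus


def status_for(path, live_pages, permanently_removed):
    """Pick the HTTP status code for a requested path."""
    if path in live_pages:
        return HTTPStatus.OK  # 200: the page exists
    if path in permanently_removed:
        return HTTPStatus.GONE  # 410: removed on purpose, will never return
    return HTTPStatus.NOT_FOUND  # 404: never existed, or we're not sure


live_pages = {"/", "/shoes/adidas"}
permanently_removed = {"/shoes/discontinued-model"}
```

Only paths that you've explicitly marked as permanently removed get a 410. Everything else that can't be found falls back to a 404, which keeps the definitive status code an opt-in decision.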
Information Architecture
Information Architecture is the art and science of designing a structure for presenting your website's content. It's about defining what content is present and how it's made accessible. Information Architecture is where User Experience (UX) and Search Engine Optimization (SEO) meet. Everyone benefits from having a good information architecture, so this is an important phase in building a new website.
In this section, we'll be focusing on the Information Architecture SEO requirements for web development. Part of that is defining the URL structure, and how it behaves under certain conditions.
URL structure
URLs should be lowercase, descriptive, readable, and short.
Ideally, URLs shouldn't have extensions such as .html, .php and .aspx. Leaving them out allows you to switch platforms without having to redirect these URLs. Whatever structure you pick, make sure it's used consistently across your entire website.
Furthermore, avoid using parameters in URLs as much as possible. They don't show visitors what to expect when going to a page, and they can also cause crawl issues.
The website needs to support the defining of a template that details how URLs are built up.
Things to consider:
- Are you using subdirectories in your URLs or not?
- Are you using a trailing slash (a slash at the end of each URL) or not?
- Also, make sure you're able to manually overwrite the URL template if need be.
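To illustrate what such a URL template could look like, here's a minimal sketch that builds lowercase, readable paths from page names, with a switch for the trailing slash and a manual override. The names slugify and build_path are made up for this example, and the transliteration approach is a simple assumption that won't suit every language.

```python
import re
import unicodedata


def slugify(text):
    """Turn a page or category name into a lowercase, hyphenated URL segment."""
    # Strip accents by decomposing characters and dropping the non-ASCII parts.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def build_path(segments, trailing_slash=False, override=None):
    """Build a URL path from name segments, unless a manual override is given."""
    if override is not None:
        return override
    path = "/" + "/".join(slugify(segment) for segment in segments)
    return path + "/" if trailing_slash else path
```

For example, build_path(["Shoes", "adidas"], trailing_slash=True) gives /shoes/adidas/, while passing override="/sale" skips the template entirely. Whichever trailing slash setting you choose, apply it to the whole website.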
Filters, variants, and multiple categories
If the new website contains filters, describe how they're going to affect the URL structure, and whether the resulting URLs should be accessible and indexable for search engines or not.
Do the same for page variants.
Example: you're going to be building a new eCommerce website for selling shoes. Each shoe is available in different colors and sizes, easily leading to dozens of product variants. Which pages do you want to be accessible and indexable, and which not?
And what about blog articles that are in multiple categories? Or products that are in multiple categories? There may be very good reasons to do so from a user point of view, but from an SEO point of view, it can really be a headache.
We recommend making sure that your articles and products have a primary category that can be indexed. Other categories an article or product is in should be unindexable, in order to prevent duplicate content.
Example
A blog article is in categories A, B and C, leading to the following URLs:
https://example.com/blog/a/example-article
https://example.com/blog/b/example-article
https://example.com/blog/c/example-article
Category A is the primary category, so this article will be indexable on https://example.com/blog/a/example-article. The URLs https://example.com/blog/b/example-article and https://example.com/blog/c/example-article should both be canonicalized to the primary URL.
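The example above can be expressed as a small mapping from each category URL to the canonical URL it should declare. This is a sketch under the assumption that URLs follow the base/category/slug pattern used in the example. The function name category_urls is hypothetical.

```python
def category_urls(base, slug, categories, primary):
    """Map each category URL of an article to the canonical URL it should declare."""
    if primary not in categories:
        raise ValueError("the primary category must be one of the article's categories")
    canonical = f"{base}/{primary}/{slug}"
    return {f"{base}/{category}/{slug}": canonical for category in categories}


mapping = category_urls(
    "https://example.com/blog",
    "example-article",
    ["a", "b", "c"],
    primary="a",
)
```

Every URL in mapping, including the primary one, points to https://example.com/blog/a/example-article, so the primary URL ends up with a self-referencing canonical.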
Smart templates that save you time
From an SEO standpoint, it's important to be able to define templates for the title tag, meta description, and headings. They help you optimize consistently, while also saving you a lot of work.
You need to be able to define these templates on a website level, as well as for category pages and for all pages in a certain category. Specificity wins, so the most specific template for a page will be used.
Example #1
Type | Element | Template |
---|---|---|
Website | Title | $pageName - HappyShoes |
You've got a page named "Privacy Policy," so the title becomes: Privacy Policy - HappyShoes.
Example #2
Type | Element | Template |
---|---|---|
Website | Title | $pageName - HappyShoes |
Pages in Category "adidas" | Title | $pageName - Buy adidas shoes |
You've got a page named "adidas Ultra Boost size 44", so the title becomes: adidas Ultra Boost size 44 - Buy adidas shoes.
Manually overwriting smart defaults
On a page level, you should be able to overwrite these templates with manually defined ones.
That way you can quickly define a default title tag template for all of your blog articles, or for all the pages surrounding a certain service.
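The "specificity wins" rule for templates can be sketched in a few lines. The example below reuses the $pageName placeholder from the tables above, which happens to match the syntax of Python's string.Template. The function name resolve_title and its parameters are made up for this illustration.

```python
from string import Template


def resolve_title(page_name, website_template, category_template=None, page_title=None):
    """Return the title for a page, letting the most specific definition win."""
    if page_title is not None:
        # A manually defined title on the page level overrides every template.
        return page_title
    template = category_template or website_template
    return Template(template).substitute(pageName=page_name)


site = "$pageName - HappyShoes"
adidas = "$pageName - Buy adidas shoes"
```

With these templates, resolve_title("Privacy Policy", site) falls back to the website template, while resolve_title("adidas Ultra Boost size 44", site, adidas) uses the category template. The same approach would work for meta descriptions and headings.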
Crawling & Indexing
All the requirements described in this section are aimed at letting a search engine's crawling and indexing process run smoothly. You want search engines to learn quickly about any new or updated content. And once they've crawled your content, you want them to understand it as quickly and as well as possible.
Robots directives
Robots directives let you choose how crawlers should treat your pages, with the most well-known directives being the noindex and nofollow directives. You can define these robots directives using the robots meta tag in the <head> section of a page, but you can also do it through X-Robots-Tag in the HTTP header.
By default, the robots directives should allow indexing, and not have the nofollow directive applied.
Here again, you want to define the robots directives on the following three levels:
- The website level
- The category/segment level
- The page level
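Here's a minimal sketch of how the three levels could be resolved, and how the result could be output as either a meta tag or an HTTP header. The function names are hypothetical, and the assumption is that the most specific level that has a value wins, just like with the title templates.

```python
def resolve_robots(website=None, category=None, page=None):
    """Return the robots directives for a page; the most specific level wins."""
    for level in (page, category, website):
        if level is not None:
            return level
    # Default: allow indexing and don't apply nofollow.
    return "index, follow"


def robots_meta_tag(directives):
    """Render the directives as a meta tag for the <head> section."""
    return f'<meta name="robots" content="{directives}">'


def robots_header(directives):
    """Render the directives as an (HTTP header name, value) pair."""
    return ("X-Robots-Tag", directives)
```

For example, resolve_robots(website="index, follow", category="noindex, follow") returns "noindex, follow" for every page in that category, unless a page defines its own directives.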
Canonical URLs
Canonical URLs inform search engines that they should prefer one page over other identical or similar pages. For instance, if you have three near identical pages—A, B, and C—and you want page A to be indexed, you then canonicalize both B and C to A.
By default, canonical URLs should be self-referencing—telling search engines they're the right variant to index.
When it comes to canonicals, you only want to define these on the page level.
Best practices surrounding canonical URLs:
- Use absolute URLs, including the domain and protocol.
- Define only one canonical URL per page.
- Define the canonical URL in the page's <head> section or HTTP header.
- Point to an indexable page.
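Some of these best practices can be verified automatically. The sketch below checks a page's HTML for the first two: that exactly one canonical URL is defined, and that it's an absolute URL. The name canonical_issues is made up for this example. It doesn't check canonicals sent via the HTTP header, or whether the target page is indexable.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.hrefs.append(attrs.get("href") or "")


def canonical_issues(html):
    """Return a list of problems with the canonical URL(s) found in the HTML."""
    collector = CanonicalCollector()
    collector.feed(html)
    if not collector.hrefs:
        return ["no canonical URL defined"]
    issues = []
    if len(collector.hrefs) > 1:
        issues.append("more than one canonical URL defined")
    for href in collector.hrefs:
        parts = urlsplit(href)
        # An absolute URL needs both a protocol and a domain.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            issues.append(f"canonical URL is not absolute: {href}")
    return issues
```

An empty list means the page passed these two checks.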
International SEO
The requirements described in this section apply to websites that target multiple languages or regions.
Hreflang attribute
For a website that's available in multiple languages and/or regions, be sure to use the hreflang attribute. The hreflang attribute is used to indicate what language your content is in and what geographical region your content is meant for. You can define the hreflang attribute by including it in the <head> section, or using the HTTP header.
Say you have an English, Dutch, and French version of the website. You can use the hreflang attribute to point search engines to the translated versions of your pages. The page's English version would have the following in its <head> section:
<link rel="canonical" href="https://www.example.com/" />
<link rel="alternate" hreflang="en" href="https://www.example.com/" />
<link rel="alternate" hreflang="nl" href="https://www.example.nl/" />
<link rel="alternate" hreflang="fr" href="https://www.example.fr/" />
<link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
Best practices surrounding hreflang attributes:
- Reference both the page itself and its translated variants.
- Make sure to have bidirectional hreflang attribute references.
- Correctly define language and region combinations.
- Always set hreflang="x-default".
- The hreflang attribute and the canonical URL must match.
- Use absolute URLs when defining the hreflang attribute.
- Use only one method to implement the hreflang attribute.
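A simple way to get several of these best practices right at once is to generate the same set of hreflang tags for every language version of a page. Here's a minimal sketch of that idea. The function name hreflang_tags is hypothetical, and it assumes you pass in absolute URLs and valid language or language-region codes, because it doesn't validate them.

```python
def hreflang_tags(versions, default_language):
    """Build the hreflang link tags for one page.

    versions maps a language (or language-region) code to an absolute URL.
    """
    tags = [
        f'<link rel="alternate" hreflang="{code}" href="{url}" />'
        for code, url in versions.items()
    ]
    # x-default tells search engines which version to use when nothing else matches.
    tags.append(
        f'<link rel="alternate" hreflang="x-default" href="{versions[default_language]}" />'
    )
    return tags


versions = {
    "en": "https://www.example.com/",
    "nl": "https://www.example.nl/",
    "fr": "https://www.example.fr/",
}
tags = hreflang_tags(versions, "en")
```

Because the English, Dutch, and French versions all output this identical list, every page references itself and its translated variants, and the references are bidirectional by construction.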
Structured data
Using structured data, you can provide additional information about your content. For instance, you can use Schema.org to do this for search engines, while Open Graph serves platforms such as Facebook, LinkedIn, and Slack, and Twitter Cards do this for Twitter. By using structured data, you gain more control over how your content is presented.
Schema.org
Schema.org is often used to mark up reviews and to communicate who wrote an article and what organization a website belongs to. Or in the case of a local business, you can use Schema.org to explain the type of business and what its opening hours are. There are lots of opportunities here, so make sure your website supports the defining of Schema.org properties. The markup should be added to the HTML.
There are multiple ways to define Schema, but the most popular (and Google-preferred) one is using the JSON-LD format.
You should be able to define these on the following levels:
- The website level (e.g. for defining Organization)
- The page level (e.g. for defining Reviews)
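To give an idea of what website-level Schema.org markup in the JSON-LD format could look like, here's a minimal sketch that outputs an Organization snippet. The function name organization_json_ld and the example values in the usage note are made up, and a real implementation would typically include more properties.

```python
import json


def organization_json_ld(name, url, logo):
    """Return a JSON-LD script tag describing the organization behind a website."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
    }
    # Note: values are assumed to be trusted; json.dumps doesn't escape "</script>".
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"
```

Calling organization_json_ld("HappyShoes", "https://example.com/", "https://example.com/logo.png") gives you a snippet that can be added to the HTML of every page. Page-level markup, such as reviews, could be generated the same way with a different @type.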
Open Graph markup
There are four required Open Graph properties that you should be able to define:
og:url
og:title
og:description
og:image
There are also two recommended properties; use these to provide even more context about the content:
og:type
og:locale
You should be able to define these on the following levels:
- The Website level (e.g. for defining a default image)
- The Category/Segment level (e.g. for defining a default image for all blog articles)
- The Page level (e.g. for defining a page-specific title, description, and image)
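Here's a minimal sketch of how Open Graph properties from these three levels could be merged, with the page level overriding the category level, and the category level overriding the website level. The function name open_graph_tags is hypothetical. The list of required properties simply follows the four listed above.

```python
from html import escape


def open_graph_tags(page, category_defaults=None, website_defaults=None):
    """Merge Open Graph properties from three levels and render them as meta tags."""
    # Later dictionaries win, so the page level is the most specific.
    merged = {**(website_defaults or {}), **(category_defaults or {}), **page}
    required = ("og:url", "og:title", "og:description", "og:image")
    missing = [name for name in required if name not in merged]
    if missing:
        raise ValueError(f"missing Open Graph properties: {missing}")
    return [
        f'<meta property="{name}" content="{escape(value)}" />'
        for name, value in merged.items()
    ]


og_tags = open_graph_tags(
    {
        "og:url": "https://example.com/blog/a/example-article",
        "og:title": "Example article",
        "og:description": "An example.",
    },
    category_defaults={"og:image": "https://example.com/blog.png"},
    website_defaults={"og:image": "https://example.com/default.png", "og:type": "website"},
)
```

In this example the article doesn't define its own image, so it picks up the default image for blog articles instead of the website-wide default.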
Twitter Card markup
While Twitter Cards are quite similar to Open Graph, with Twitter Cards, there are four different types:
- Summary Card
- Summary Card with Large Image
- App Card
- Player Card
Here are the required Twitter Card properties:
twitter:card
twitter:title
But we highly recommend also including the three properties below to provide more context about the content:
twitter:site
twitter:description
twitter:image
Media optimization
Media such as images and PDF files can drive a lot of traffic, and therefore it's essential that you be able to optimize them.
While there are multiple factors that determine whether or not an image ranks highly, the website should support the defining of an alt attribute and a title attribute for each image. Also, when you're uploading media you need to make sure the URL path makes sense.
For instance, WordPress' default URL path, which contains the year and month, makes for very long URLs, and can also give people the false impression that your media is outdated when it's served from https://example.com/wp-content/uploads/2017/06/.
Using image compression to decrease the file size of your images is recommended too. This improves page load speed.
To ensure that search engines learn about your images quickly and easily, be sure to use an image XML sitemap.
Navigation management
As we described in the Information Architecture section, the way in which content is accessible for visitors and search engines plays a key role in SEO.
Therefore, it's essential that you be able to manage your navigation. There are various navigation types, such as:
- Main navigation
- Sidebar navigation
- Footer navigation
Make sure you can manage these for the website, and also specifically for certain sections of the website. Take for example an eCommerce website that sells shoes, pants, and sweaters. In the shoes section, you may want to define a sidebar and footer that contain nothing but links to your most important product pages and sub-category pages about shoes.
Conclusion
In order for any new website to be SEO-proof when it's launched, it has to have good specifications. Use this article to your advantage, and be sure to click through to the in-depth articles about the topics mentioned for additional information.
And once you have all the SEO requirements for the new website down, be sure to get ready for the next step: preparing for the website migration.