URL canonicalization is the process of deciding on a definitive version of a URL and then adding a canonical tag that points to it on every duplicate or near-duplicate version of the page.
What is a Canonical Tag?
The canonical tag, also called "rel canonical," is an HTML tag that tells search engines that the enclosed URL is the original, definitive version of the page: the canonicalized URL.
The tag goes in the page's <head> section and looks like this:
<link rel="canonical" href="https://www.example.com">
Practically speaking, the canonical tag tells Google which page you want to appear in search results.
Why Do Canonical Tags Matter?
Humans tend to think of pages that look the same and have the same content as one page: the homepage is the homepage is the homepage. Search engines, though, don't work the same way. They see different URLs as different pages, even if they serve the same purpose.
So to Google, all of these URLs are unique pages:
- https://www.example.com
- www.example.com/
- https://www.example.com/index.php
- example.com
To Google, you've got 4 unique copies of your homepage, even though all humans will see is one page. This situation can cause you to suffer some of the issues associated with duplicate content.
Left on its own, Google crawls each of those URLs as a separate page. Adding canonical tags to the 3 copies, each pointing to your canonical URL, tells Google to ignore the copies and move on to the original.
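To see why these count as separate pages, it can help to treat the URLs the way a crawler first sees them: as plain strings. The sketch below is only an illustration, not a description of how Google works internally. The function name canonical_form and the preferred policy (https, www, no index.php, no trailing slash) are assumptions made for this example.

```python
from urllib.parse import urlsplit

# Assumed policy for this sketch: https, the www host, no index.php,
# and no trailing slash on the homepage.
PREFERRED_SCHEME = "https"

def canonical_form(url):
    """Map a homepage variant onto the one preferred URL."""
    # Input without a scheme, such as "example.com", would be parsed as a
    # path, so add a placeholder scheme before splitting.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path
    if path.endswith("/index.php"):
        path = path[: -len("index.php")]
    path = path.rstrip("/")
    return f"{PREFERRED_SCHEME}://{host}{path}"

variants = [
    "https://www.example.com",
    "www.example.com/",
    "https://www.example.com/index.php",
    "example.com",
]

# Four different strings, but only one preferred URL after normalizing.
unique_targets = {canonical_form(u) for u in variants}
print(len(set(variants)), "variants ->", unique_targets)
```

The value every variant maps to is the URL you would put in each copy's canonical tag.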
Modern content management systems and websites can cause this problem when they display content dynamically based on the user. Ecommerce platforms can also be a culprit here because they show multiple versions of the same product (color, size, model number, etc.).
Duplicate content can also be purposeful, like when you create landing pages that are only very slightly different. This is a relatively common practice for PPC and email campaigns.
How does this help your website?
While there is no "duplicate content penalty" in Google, hosting copied pages can cause serious issues for your search engine optimization:
- Diluting link building: People don't always link to the right version of a URL. They often leave off the https part or add a trailing slash. Without a canonical tag, the link juice passed by those links won't be assigned to the right page.
- Discouraging crawling: Hosting duplicate content makes Google's crawlers "waste" their time looking at copies of content they've already seen. Google is less likely to crawl more pages on your site if it thinks the site is mostly duplicates.
Using the canonical tag to consolidate duplicate pages helps prevent these issues.
Adding Canonical Tags to Your Pages
As mentioned above, the code for a canonical tag looks like this:
<link rel="canonical" href="https://www.example.com">
It goes in the <head> section of a page. A page's head is all of the code that appears between the <head> and </head> tags of a page's HTML code.
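However the tag gets added, you can check that it really ends up inside the head. The following is a minimal sketch using Python's standard html.parser module. The class name CanonicalFinder and the helper find_canonicals are made up for this example, and the sample HTML is hypothetical.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect href values of <link rel="canonical"> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def find_canonicals(html):
    """Return every canonical href declared in the page's head."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals

# Hypothetical page: one canonical in the head, one stray tag in the body.
sample = """<html><head>
<title>Home</title>
<link rel="canonical" href="https://www.example.com">
</head><body>
<link rel="canonical" href="https://www.example.com/stray">
</body></html>"""

print(find_canonicals(sample))
```

An empty list means the page has no canonical tag in its head, and more than one entry means the page is sending conflicting signals.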
How you go about adding canonical tags to your pages will depend entirely on what type of site you have. If you've got a WordPress site, you can use an SEO plugin to add canonicals.
Canonical Tag Best Practices
While the canonical tag is a relatively simple piece of code to add, it's absolutely vital to follow best practices when using it. Since canonicals essentially tell Google to ignore the page they're on and move on to the canonical URL, you can easily screw up your site's SEO.
Self-referencing canonicals
Pages don't have to use self-referencing canonical tags, but it doesn't hurt to do so. A self-referencing canonical is when a page contains a canonical tag linking to itself. For example, when https://www.example.com/page1 contains this canonical tag:
<link rel="canonical" href="https://www.example.com/page1">
Again, you don't have to do this, but doing so doesn't hurt. This might seem obvious at first glance, but it's a common question in the SEO world.
Make sure your canonical URL is accessible
Again, this is probably obvious. Why would you say a page is the definitive version and then redirect Google to another URL? Or to a URL that returns a 404? Or a URL that is blocked by robots.txt? But it does happen.
Only canonicalize to URLs that return a 200 status code and are allowed in your robots.txt file. Make sure your canonical URL is also listed in your site's XML sitemap, and check that it doesn't carry a noindex tag.
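These checks are easy to script once you have the data in hand. The sketch below assumes you have already fetched the status code, the robots.txt text, the page HTML, and the sitemap URLs by some other means. The function names are invented for this example, and the noindex detection is a rough string match rather than a full HTML parse.

```python
import re
from urllib.robotparser import RobotFileParser

def has_noindex(html):
    """Rough check: any <meta> tag mentioning both robots and noindex."""
    for tag in re.findall(r"<meta\b[^>]*>", html, flags=re.I):
        lowered = tag.lower()
        if "robots" in lowered and "noindex" in lowered:
            return True
    return False

def canonical_target_problems(url, status_code, robots_txt, html, sitemap_urls):
    """Return a list of reasons this URL is a poor canonical target."""
    problems = []
    if status_code != 200:
        problems.append(f"returns {status_code}, not 200")
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch("Googlebot", url):
        problems.append("blocked by robots.txt")
    if has_noindex(html):
        problems.append("has a noindex tag")
    if url not in sitemap_urls:
        problems.append("missing from XML sitemap")
    return problems

# Hypothetical inputs for illustration only.
robots_txt = "User-agent: *\nDisallow: /private/\n"
good = canonical_target_problems(
    "https://www.example.com/page1", 200, robots_txt,
    "<head><title>Page 1</title></head>",
    {"https://www.example.com/page1"},
)
bad = canonical_target_problems(
    "https://www.example.com/private/old", 404, robots_txt,
    '<head><meta name="robots" content="noindex"></head>',
    set(),
)
print(good)
print(bad)
```

An empty list means the URL passed all four checks from the paragraph above.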
Double-check your canonical tags if you're using a program that creates them dynamically. Some plugins and CMS platforms will write a unique self-referencing canonical for every URL published on your site, duplicates included, which completely defeats the purpose of URL canonicalization.
Use absolute URLs
When adding a URL to a canonical tag, always include the full URL. Meaning the URL must include these parts:
- The https://
- The www (if it's part of your preferred domain)
- Your domain name
- The top-level domain (the .com part)
These are known as "absolute URLs." URLs that only include the part that comes after the ".com" are known as "relative URLs." Google resolves a relative URL against the address of the page the tag appears on, so the same tag can end up pointing somewhere different on each copy of the page.
If you don't use an absolute URL, Google may end up with a canonical you didn't intend.
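A quick way to catch hrefs that aren't absolute is to check for a scheme and a host. This is a minimal sketch using Python's urllib.parse. The function name is_absolute is an assumption for this example, and the second half shows how a relative href resolves differently depending on the page it sits on.

```python
from urllib.parse import urljoin, urlsplit

def is_absolute(href):
    """True when the href carries both an http(s) scheme and a host."""
    parts = urlsplit(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

hrefs = [
    "https://www.example.com/page1",  # absolute
    "/page1",                         # relative: path only
    "www.example.com/page1",          # missing the https:// part
    "//www.example.com/page1",        # protocol-relative, still not absolute
]
for href in hrefs:
    print(href, "->", is_absolute(href))

# The same relative href resolves against whichever page contains it.
on_www = urljoin("https://www.example.com/blog/post", "/page1")
on_bare = urljoin("http://example.com/blog/post", "/page1")
print(on_www)
print(on_bare)
```

The last two lines print different URLs for the same tag, which is the reason to spell out the full address.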
You CAN canonicalize across domains
If you own website A (websitea.com) and website B (websiteb.com), you can point a canonical from site A to site B. This makes sense for media companies that publish the same content across multiple properties, but only want one website to rank.
Don't create "canonical chains"
We just made "canonical chains" up, but think of them as adding a canonical tag to page A pointing to page B, and then adding a canonical tag to page B pointing to page C. This sends an ambiguous signal to Google, since you're telling it 2 different URLs are the definitive version of the page.
Ambiguous canonical situations like this will likely cause Google to ignore your canonicals.
Instead, decide on a canonical version (page C) and then point all versions at that. So put canonical tags linking to page C on both page A and page B.
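If you have a crawl export mapping each page to the URL in its canonical tag, chains can be found and flattened with a short script. The sketch below assumes such a mapping exists as a Python dict. The names resolve_canonical and flatten_chains, and the page URLs, are hypothetical.

```python
def resolve_canonical(page, canonical_of, max_hops=10):
    """Follow canonical tags from page; return (final_url, number_of_hops)."""
    seen = [page]
    current = page
    while True:
        # A page with no entry is treated as canonical for itself.
        target = canonical_of.get(current, current)
        if target == current:
            return current, len(seen) - 1
        if target in seen or len(seen) > max_hops:
            raise ValueError(f"canonical loop or overly long chain from {page}")
        seen.append(target)
        current = target

def flatten_chains(canonical_of):
    """Point every page straight at the final URL of its chain."""
    return {page: resolve_canonical(page, canonical_of)[0] for page in canonical_of}

# Hypothetical crawl data: A points to B, B points to C, C points to itself.
A = "https://www.example.com/a"
B = "https://www.example.com/b"
C = "https://www.example.com/c"
canonical_of = {A: B, B: C, C: C}

final_url, hops = resolve_canonical(A, canonical_of)
print(final_url, hops)  # more than one hop means page A is part of a chain
print(flatten_chains(canonical_of))
```

After flattening, pages A and B both point directly at page C, which matches the fix described above.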
Common Canonical Tag Errors
Nobody's perfect, so every once in a while you might make a mistake with your canonical tags. Here are the most common errors people make with their URL canonicalization.
Canonicals and pagination
You can use canonical tags in conjunction with paginated content. That's not an error. Errors often happen, however, when people accidentally add a canonical tag to every page pointing back to page 1. So, for example, https://www.example.com/content_page1 should have this canonical tag:
<link rel="canonical" href="https://www.example.com/content_page1">
While the next page in the chain, https://www.example.com/content_page2, should have this tag:
<link rel="canonical" href="https://www.example.com/content_page2">
Where people run into trouble is when they add this tag to https://www.example.com/content_page2:
<link rel="canonical" href="https://www.example.com/content_page1">
This can prevent Google from indexing the second page of content.
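The same kind of crawl mapping can flag this mistake automatically. The following sketch assumes paginated URLs end in _page followed by a number, as in the example above; that pattern and the function name pagination_errors are assumptions, so adjust them to your own URL scheme.

```python
import re

# Assumed URL scheme, taken from the example URLs: ..._page1, ..._page2, etc.
PAGE_PATTERN = re.compile(r"_page(\d+)$")

def pagination_errors(canonical_of):
    """Return (page, canonical) pairs where a paginated page points elsewhere."""
    errors = []
    for page, target in canonical_of.items():
        if PAGE_PATTERN.search(page) and target != page:
            errors.append((page, target))
    return errors

page1 = "https://www.example.com/content_page1"
page2 = "https://www.example.com/content_page2"

correct = {page1: page1, page2: page2}
broken = {page1: page1, page2: page1}

print(pagination_errors(correct))
print(pagination_errors(broken))
```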
Canonicals and hreflang
Again, using canonical tags and hreflang tags together is perfectly fine. But they're easy to mess up by accidentally canonicalizing a page in one language to the same page in another. Pointing to the other-language version is the hreflang tag's job.
If you're using canonicals and hreflangs together, double-check that the English page canonicalizes to the English URL and the Spanish page canonicalizes to the Spanish URL.
For example, if https://www.example.com is the canonical English URL and the page is also available in Spanish, the canonical and hreflang tags on the English page should look like this:
<link rel="canonical" href="https://www.example.com">
<link rel="alternate" hreflang="en" href="https://www.example.com">
<link rel="alternate" hreflang="es" href="https://www.example.com/es">
Simple enough, but it's easy to create an error if you're not paying close attention.
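A small check can catch a canonical that points at another language's URL. This sketch assumes you already know each page's language, its canonical, and its hreflang alternates. The function name hreflang_canonical_conflict is invented for this example.

```python
def hreflang_canonical_conflict(page_lang, canonical, alternates):
    """Return the language the canonical wrongly points at, or None if fine."""
    for lang, url in alternates.items():
        if lang != page_lang and url == canonical:
            return lang
    return None

# hreflang alternates from the example above.
alternates = {
    "en": "https://www.example.com",
    "es": "https://www.example.com/es",
}

# Spanish page correctly canonicalized to the Spanish URL.
ok = hreflang_canonical_conflict("es", "https://www.example.com/es", alternates)
# Spanish page mistakenly canonicalized to the English URL.
conflict = hreflang_canonical_conflict("es", "https://www.example.com", alternates)

print(ok)
print(conflict)
```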
Using canonical tags on pages that aren't similar
It 's common for websites to have multiple pages covering the same topic. At WooRank, we have multiple blog posts about keyword research, content marketing, advanced SEO and lots of other topics. An ecommerce site might have 2 different products that are very similar in description and specifications.
However, these pages serve different purposes, and none of them, however similar, should include canonical links to the others.
If you use canonical tags a little too aggressively, Google could decide to stop trusting them on your site altogether. That means it won't honor canonical tags on any page, leading to the potential duplicate content issues mentioned above.
What Next?
Once you understand canonical tags, it's not such a complicated subject. If you follow best practices, you can easily use canonical tags to keep your website optimized for Google's crawlers.
If you're already using canonical tags, consider auditing your website with WooRank's Site Crawl or a different website crawling software to ensure you're following canonical best practices.