05 Nov The Ultimate Guide To The Canonical Tag
This Article Is All About The Canonical Tag
If you are a potential client of a web team, have worked with HTML, are a web developer or web designer, or manage a web team, you will eventually come across this curious tag called the “Canonical Tag”. It is also referred to as the “Rel=Canonical” tag.
According to Moz, a canonical tag (aka “rel canonical”) is a way of telling search engines that a specific URL represents the master copy of a page. Using the canonical tag prevents problems caused by identical or “duplicate” content appearing on multiple URLs. Moz has a simple description of a canonical, but there is much more involved. Search engines use the canonical tag to combat duplicate content problems and to assign the ranking value of the content to the page designated as the original “source” URL.
The word canonical comes from canon, which originally referred to secular or biblical rules and laws and a standard for judgement. Eventually canon was used to refer to a writer’s works that had been accepted as real and authentic.
But first, let’s briefly go over what a tag is, especially if you are new to HTML in general. I like to call all these tags Meta Tags.
A tag is a little bit of HTML code that is always surrounded by angle brackets, for example <meta something… /> or <link rel="something" />. Notice how the tags open and close. Meta tags are what I like to call hidden or invisible tags, because people never see them on the browser screen. However, Google, Bing and any other search engine left on the planet do see these meta tags. If you open up the source of a web page, you will see a lot of other types of tags as well.
Which leads us to the Canonical Tag. When you actually look at it, it does not use the word Meta at the beginning of the tag itself:
<link rel="canonical" href="https://www.seoturbobooster.com/" />
It uses “Link” in the tag. That’s because this meta tag is a reference to something that is an HTTP link, URL or permalink, whatever floats your boat on the wording. The link should be a URL that points to a web page. If you are looking at the page source, it is very likely that this URL is the page you are looking at, but it may not be. How is that possible?
So, why all this fuss about canonical tagging? It’s all about stopping duplication. You don’t want Google to conclude that you have exact duplicate web pages. And there are cases where a page with one URL is totally duplicated at another URL. A good example is WordPress with permalinks. Permalinks are the system that makes the URL read more like natural language, which is a directive from Google as of October 2019.
So you create a page in WordPress like:
but it also has a nicer URL version:
So, it is important that Google’s crawler does not treat these as two pages. When Google processes the canonical tag, it will see that both URLs have the same canonical URL, and that will keep Google from indexing both as separate pages.
Another way of understanding this is looking all the ways your homepage may get found by search engine crawlers for seoturbobooster.com:
Meanwhile all these pages are really the same final page and content. To the search engines, each of these URLs is a unique “page.” You can observe there are five copies of the same homepage displayed in different ways.
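The way several URLs collapse into one homepage can be sketched in Python. This is only an illustration: the variant URLs and the `same_homepage` helper are assumptions made up for this example, not output from a real crawl.

```python
from urllib.parse import urlsplit

# Hypothetical spellings of the same homepage. A crawler that only compares
# URL strings would count each of these as a separate page.
VARIANTS = [
    "http://seoturbobooster.com",
    "http://www.seoturbobooster.com/",
    "https://seoturbobooster.com/",
    "https://www.seoturbobooster.com/index.php",
    "https://www.seoturbobooster.com/",
]

# The one URL we would want every variant to name as its canonical.
CANONICAL = "https://www.seoturbobooster.com/"


def same_homepage(url: str) -> bool:
    """Return True if url is just another spelling of the homepage."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path or "/"
    return host == "seoturbobooster.com" and path in ("/", "/index.php", "/index.html")


for variant in VARIANTS:
    if same_homepage(variant):
        print(variant, "-> canonical", CANONICAL)
```

Every variant passes the check, which is exactly the situation a single canonical URL is meant to resolve.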
It is also important not to have a high volume of repeat pages with exactly the same content. This will dilute your SEO and make it difficult for your site to be indexed by Google. There is also a possibility of having the wrong page indexed by Google, and then Google won’t index the right page.
Duplicate content is the legitimate reason for canonical tags. Here are a bunch of real reasons for duplicate content, mainly because automatic systems generate URLs. These include:
- Multiple Links/URLs
This is true on eCommerce sites where URLs are created through filter options for size, price, color, rating, etc.
- Session ID/Tracking URLs
Session IDs may be automatically generated by your system. The same applies to tracking URLs, breadcrumb links, printer friendly versions, and permalinks in certain CMS.
- NON-WWW, WWW, HTTP, HTTPS
Without canonical tags, search engines see https://www.seoturbobooster.com, https://seoturbobooster.com and http://www.seoturbobooster.com as distinct, independent pages, and will crawl and possibly index them as such, hurting the overall site authority.
- URL, File, Folder Letter Case
While users, and most browsers, treat upper and lower case the same, with the two almost interchangeable, the same is not necessarily true for search engines. If your website mixes up case in filenames and folder structure, especially under a Linux server, you need to use the canonical tag.
- CNAME and Mobile URLs
When using a special URL for the mobile version of your website or a CNAME, the exact same content needs to be canonicalized to one source domain.
- Country Links/URL
When using multiple country specific URLs, the content largely remains the same, with only a few minor differences. This does not apply if the language is different, in which case you want the search engines to return separate results.
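As a rough sketch of how session and tracking URLs turn into duplicates, the Python below strips that noise to recover the URL a canonical tag would typically name. The parameter names in `NOISE_PARAMS`, the example URLs and the `strip_noise` helper are assumptions for illustration. Your own system may use different names, and filter parameters such as color may or may not count as duplicates depending on the site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change the page content.
NOISE_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}


def strip_noise(url: str) -> str:
    """Drop session/tracking parameters and lowercase the host name."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in NOISE_PARAMS
    ]
    # Rebuild the URL without the noisy parameters and without a fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


print(strip_noise(
    "https://www.seoturbobooster.com/shoes?color=red&sessionid=abc123&utm_source=news"
))
```

The path is left untouched on purpose: on a case-sensitive server, lowercasing it blindly could point at a file that does not exist.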
Here are some important points to remember
- Do You Need A Canonical Tag?
The answer is no, you don’t need a canonical tag, especially if there is no duplicate page on your website! However, if you use a plugin such as Yoast you will usually get one anyway, and in general it is good practice to use the canonical tag.
- Make Sure There Is A Canonical Tag
Sometimes WordPress, Shopify or another CMS does not create the canonical tag. It is very easy from the desktop to “view source” and check that it is there. Just use control-f to search for the word “canonical”. It would look something like <link rel="canonical" href="https://www.seoturbobooster.com/" />
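If you would rather check with a script than by eye, here is a minimal sketch using only Python’s standard library. The `CanonicalFinder` class and `find_canonicals` function are names invented for this example, and the HTML string stands in for a page source you have already downloaded.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before testing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html: str) -> list:
    """Return all canonical URLs declared in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page_source = (
    '<html><head><title>Home</title>'
    '<link rel="canonical" href="https://www.seoturbobooster.com/" />'
    '</head><body>Hello</body></html>'
)
print(find_canonicals(page_source))
```

An empty list means the tag is missing, and a list with more than one entry means the tag appears more than once.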
- Check Canonical In Home Page
Because the home page is the most likely to be duplicated in many ways, and is the most important page, make sure that at least that page has the canonical tag in it.
- Never Chain Canonical Tags
If one page has 1 or 3 duplicates, always choose one page to be the main page and list that one in every canonical. Never have 1 reference 2, then 2 reference 3, and then 3 reference 1. This silliness can happen sometimes, especially if you are a software developer.
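A chain or loop like this is easy to detect once you have a map of each page to the canonical it declares. The sketch below assumes you have already collected that map. The `resolve_canonical` helper and the short paths are hypothetical.

```python
def resolve_canonical(url, canonical_of):
    """Follow canonical references starting at url.

    Returns (final_url, hops). Raises ValueError if the references loop.
    A page missing from the map is treated as its own canonical.
    """
    seen = [url]
    while canonical_of.get(url, url) != url:
        url = canonical_of[url]
        if url in seen:
            raise ValueError("canonical loop: " + " -> ".join(seen + [url]))
        seen.append(url)
    return url, len(seen) - 1


# Good: both duplicates point straight at the main page (at most one hop).
good = {"/main": "/main", "/copy-1": "/main", "/copy-2": "/main"}
print(resolve_canonical("/copy-1", good))

# Bad: a chain, where /a points at /b, which points at /c.
chained = {"/a": "/b", "/b": "/c", "/c": "/c"}
print(resolve_canonical("/a", chained))
```

More than one hop means there is a chain to flatten, and a `ValueError` means the canonicals point at each other in a circle.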
- Don’t Canonicalize Near Duplicates
There are situations where pages are almost exactly the same. The fact is, if they are not exact duplicates, they should not be the same page. You can create confusion and complications if you use one canonical for many pages that are different. It’s not right.
- Ranking Power Through Cross-Domain Duplication
There have been situations, especially for content providers, where they want to let the search engines know that a particular page is an exact duplicate of a page on another domain. This is used to consolidate the page power for Google search results onto one domain. It does not happen too often, but it can be done. You would have a canonical that points to a URL on the other domain.
- 301 Redirect vs. Canonicals
Suppose you are about to do a big site redesign and the URL structure will change across the website. A 301 redirect tells Google the page has permanently moved to the new URL, while a canonical tag that names another page’s URL is more of a way to consolidate duplicate content without moving a URL. If you move the URL with a 301, you keep the old Google ranking, whereas a canonical means this page is not the one to be indexed. A 301 is preferred in a move.
- Should A Page Have A Self-Referencing Canonical URL?
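The difference can be sketched as a tiny routing function. This is not a real web server, only an illustration of what the visitor or crawler receives. The paths and the `respond` helper are hypothetical.

```python
# Hypothetical map of old URLs to their new homes after a redesign.
REDIRECTS = {
    "/old-services.html": "/services/",
    "/about-us.php": "/about/",
}


def respond(path):
    """Return (status_code, headers) for a requested path."""
    if path in REDIRECTS:
        # A moved page answers with 301 and never serves its old content.
        return 301, {"Location": REDIRECTS[path]}
    # A page that still exists is served normally; any canonical tag
    # lives inside its HTML rather than in the status code.
    return 200, {}


print(respond("/old-services.html"))
print(respond("/services/"))
```

With a redirect the old URL stops serving content altogether, while with a canonical both URLs still load and only the indexing hint differs.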
The answer to this is pretty much yes, there should be a canonical link in a page, even if it is the primary content page.
- Using Canonical Tags Across Paginated Pages?
The answer is you should not do it. Say there are 5 pages that continue the one initial page. The first page should have a rel=canonical, but the following pages should have a rel=prev or a rel=next tag for the additional pages instead of the canonical.
- Leave the HTTP:// or HTTPS:// In The Canonical
Some sites want to have all relative links, and that makes sense. In that case the canonical for this page would be something like href="/this-web-page". But it is better to include the full URL in canonical tags, like href="https://seoturbobooster.com/this-web-page".
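A quick way to test whether a canonical is a full URL, and to see what a relative one would resolve to, is Python’s `urllib.parse`. The helper names and the example page URL below are assumptions for illustration.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href: str) -> bool:
    """True if href carries its own http(s) scheme and host name."""
    parts = urlsplit(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def absolute_canonical(page_url: str, href: str) -> str:
    """Resolve href against the page it appears on, as a relative link would."""
    return urljoin(page_url, href)


page_url = "https://seoturbobooster.com/blog/some-post"  # hypothetical page
print(is_absolute("/this-web-page"))
print(is_absolute("https://seoturbobooster.com/this-web-page"))
print(absolute_canonical(page_url, "/this-web-page"))
```

The relative form only means something once it is combined with the page it sits on, which is why writing out the full URL leaves less room for mistakes.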
- Canonical Link HTTP Header In Non-HTML Content
Google supports a canonical link HTTP header in non-HTML content pages. The header looks like this: Link:
Canonical link HTTP headers can be very useful when needing canonical techniques in files like PDFs. It’s good to know that the option exists.
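As a hedged sketch, such a header is usually written as the target URL in angle brackets followed by rel="canonical". The helper name and the PDF URL below are made up for illustration. How you actually attach the header depends on your web server.

```python
def canonical_link_header(url: str) -> str:
    """Build a Link header line that names url as the canonical version."""
    return f'Link: <{url}>; rel="canonical"'


# Hypothetical PDF that duplicates an HTML guide on the same site.
print(canonical_link_header("https://www.seoturbobooster.com/guide.pdf"))
```

The header is sent with the HTTP response for the file itself, since a PDF has no HTML head section to put a tag in.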
- Important Use Of Canonical Tagging In Landing Page Variations
In our SEO Turbo Booster implementation we use canonical tagging to help keep unique landing pages indexed on their own by city and by keyword. It is important to have canonical tagging in these pages so that they get found properly.
Common problems in implementing Canonical Links:
These are problems that we have seen over the years
- More Than One Canonical Tag
Running multiple SEO tools, such as Yoast alongside another plugin, could cause the canonical to appear more than once. This is not correct, and only one will be recognized.
- Too Low In The Page HTML
The higher in the page the better for the canonical, because it saves Google time parsing the webpage. If the canonical and the page URL do not match, Google will stop crawling.
- Multiple Links/URLs