Tag Archives: robots.txt

What is a robots.txt file?

Robots.txt is a small text file located in the root directory of your website. Its job is to tell search engine robots which content on your website not to visit.

Why would I want to do this?

The main reason to do this, from an SEO standpoint, is to avoid duplicate content issues. For example, if you have a printer-friendly page on your website, you would not want it indexed because it would compete with the main page.
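To make the printer-friendly example concrete, here is a minimal sketch in Python that checks which URLs a robots.txt file blocks, using the standard library's `urllib.robotparser`. The `/print/` path and the `example.com` URLs are assumptions made up for illustration, as is the `is_allowed` helper. Keep in mind that robots.txt controls what robots crawl, so treat this as a check of the rules and not as a guarantee about what ends up indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The /print/ directory is an assumed
# location for printer-friendly pages, not something every site has.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/widgets.html",
        "https://www.example.com/print/widgets.html",
    ):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

With these rules the main page stays open to robots while anything under `/print/` is blocked, which is the behaviour described above.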

Read More…

How to avoid Duplicate Content

Duplicate content is when you have identical or near-identical pages within your website. This is a fairly big sin as far as Google and other search engines are concerned.

There are two reasons for this:

  1. Search engines may think you stole or simply reproduced content from somewhere else
  2. Your webpages are unnecessarily competing with each other for rankings

Most of the time this is not intentional, and many websites have duplicate content without even realising it. The main cause is sites making the same content available via different URLs.

Fantastic. But how do I identify and avoid it? Below are some of the main causes and solutions to this problem.
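One rough way to spot the same content living under different URLs is to normalise each URL and see which ones collapse to the same address. The Python sketch below does that under some loudly stated assumptions: that `www` and non-`www` hosts serve the same site, that `index.html` and `index.php` are default documents, and that the listed tracking parameters do not change the page. The function names, the rules, and the `example.com` URLs are all hypothetical, so adjust them to how your own site behaves.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed rules for illustration only; they will not fit every website.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}
DEFAULT_DOCUMENTS = ("index.html", "index.php")


def normalise_url(url: str) -> str:
    """Reduce a URL to one assumed canonical form."""
    parts = urlsplit(url)

    # Treat www and non-www hosts as the same site, ignoring letter case.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]

    # Treat /shoes/index.html, /shoes/ and /shoes as the same page.
    path = parts.path
    for doc in DEFAULT_DOCUMENTS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]
    path = path.rstrip("/") or "/"

    # Drop tracking parameters and sort the rest so their order is irrelevant.
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS)

    # Fold http and https together and discard any #fragment.
    return urlunsplit(("https", host, path, urlencode(kept), ""))


def find_duplicates(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs by normalised form, keeping only groups with more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalise_url(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    sample = [
        "http://www.example.com/shoes/",
        "https://example.com/shoes/index.html",
        "https://example.com/shoes?utm_source=newsletter",
        "https://example.com/hats",
    ]
    for canonical, found in find_duplicates(sample).items():
        print(canonical, "is reachable via", found)
```

This only compares addresses, not page text, so it flags candidates to review and will miss near-identical pages that live at unrelated URLs.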

Read More…