Which is better — Meta Robot tags or robots.txt?

Started by siyajoshi, Jul 01, 2022, 08:23 AM


siyajoshi (Topic starter)

Which is better — Meta Robot tags or robots.txt?


robots.txt files are best for disallowing a whole section of a site, such as a category, whereas a meta tag is more efficient at disallowing single files and pages.

ezhabchik

robots.txt is a text file that tells search engine bots (also known as crawlers or spiders) which parts of a website they may crawl. A robots.txt file should be placed in the top-level directory of your website so that robots can find its instructions right away. Crawlers look for the file only at that location, so a robots.txt placed in a subdirectory is ignored.
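
To illustrate the "top-level directory" point, here is a minimal Python sketch that works out where a crawler would look for robots.txt given any page address. The function name and the example.com URL are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url: str) -> str:
    """Return the top-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# Hypothetical page deep inside a category.
print(robots_txt_url("https://example.com/category/shoes/page-2.html?sort=asc"))
# prints https://example.com/robots.txt
```

However deep the page sits, the file is always looked up at the root of the host.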

Meta robots tags are the better and more reliable option for setting index or noindex on a particular page/URL.

davidkeller

The robots meta tag lets us control the indexing of our blog by specifying blocking commands for each individual page.
By the way, there is no single spelling of the term "meta tag". Even Google writes it differently across its reference materials: "meta tag", "meta-tag" and "metatag" all mean the same thing and are all in use on the web, although Google most often writes "meta tag". Let's first understand what the robots meta tag is in general. By default, whether you specify this meta tag or not, its value is "all", which means the page is indexed. That is, the following three "states" of the meta tag are equivalent:

Its complete absence.
<meta name="robots" content="all" />
<meta name="robots" content="index, follow" />

All this means that the page will be indexed. Therefore, if you do not need to prohibit the page from being indexed, then the first option is used, i.e. we don't use anything at all. If you want to completely prohibit the page from being indexed, then the entry will be like this:
<meta name="robots" content="noindex, nofollow"/>
or a shorter version
<meta name="robots" content="none"/>
Why do you think the value has two parameters - index/noindex and follow/nofollow?
The index/noindex value only applies to the page text.
The follow/nofollow value only applies to links on the page.
Here, as well as in the definition itself, lies one significant advantage of the Robots meta tag over the file of the same name.
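
The two independent parameters can be sketched in a few lines of Python. This is only an illustration of the values listed above, not how any search engine actually parses the tag, and the function name is made up.

```python
def parse_robots_content(content: str) -> dict:
    """Reduce a robots meta `content` value to two yes/no answers."""
    tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
    # Defaults: no tag, "all" and "index, follow" all allow both actions.
    result = {"index": True, "follow": True}
    if "none" in tokens:
        # "none" is the short form of "noindex, nofollow".
        tokens |= {"noindex", "nofollow"}
    if "noindex" in tokens:
        result["index"] = False   # applies to the page text
    if "nofollow" in tokens:
        result["follow"] = False  # applies to the links on the page
    return result

for value in ["", "all", "index, follow", "noindex, follow", "nofollow", "none"]:
    print(repr(value), "->", parse_robots_content(value))
```

The empty string stands for a page with no robots meta tag at all.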

If you compare both definitions, you will see that they are practically the same: both the file and the meta tag tell search robots what they may do with your pages, and one cannot be said to be more important than the other. But, as I said, they do differ in how they are applied.


Generally, meta tags were invented not as a counterweight to the file, but to make life easier for webmasters who do not have access to the root folder of their site, as happens, for instance, on Blogger. That is, search engines themselves recommend setting up the robots.txt file when there is access to the site folders, and using the meta tag when there is no such access.

Advantages of the Robots.txt file over the meta tag
In my opinion, the advantage is that in the Robots.txt file we can specify entire directories of our site, prohibit all tags, categories and any other directories from being indexed at once. Moreover, this ban is set in a single line. If we want to prohibit the entire directory, but at the same time allow one or two pages for indexing, then we can also set up exceptions in the file. I wrote about all this in the article to which I gave a link above, so now I will briefly convey the essence.
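
Here is a rough sketch of such a file with one exception, checked with Python's standard urllib.robotparser. The paths and the example.com host are hypothetical. Note that Python's parser applies the first matching rule, so the Allow line is listed before the Disallow it overrides; Google instead picks the most specific (longest) matching rule, so for Google the order does not matter.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: close two whole directories, keep one page open.
ROBOTS_TXT = """\
User-agent: *
Allow: /tag/featured.html
Disallow: /tag/
Disallow: /category/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_crawl(path: str) -> bool:
    """True if a generic robot may fetch the given path on the example site."""
    return parser.can_fetch("*", "https://example.com" + path)

print(may_crawl("/tag/python.html"))    # False: the whole /tag/ directory is closed
print(may_crawl("/tag/featured.html"))  # True: the exception
print(may_crawl("/category/news/"))     # False
print(may_crawl("/about.html"))         # True: not mentioned, so allowed
```

Each Disallow line closes an entire directory, which is the "single line" ban mentioned above.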

How are things with the meta tag? The meta tag cannot be set once for an entire directory; it is set for each page separately. That is, it is convenient when you decide with each new publication whether to allow the search robot to index that page or not.

Personally, it is difficult for me to imagine a site where this would be needed. But the fact remains: if you do not set up the robots.txt file and still close many pages from indexing, then you have to stay alert every time so as not to forget to close a page. You'll agree that it's inconvenient.

If you want to be free of such a routine, it is always much more convenient and easier to set up the robots.txt file once and not think about it anymore.

Benefits of the Robots meta tag over a file or when to use a meta tag
I have already drawn your attention to the fact that the meta tag can be set on each individual page, and that it accepts separate index/noindex and follow/nofollow commands. Under certain circumstances, all this is a big advantage over the file.


Situation 1. You publish non-unique content. It does not have to be copy-paste (stolen content); it can be official documents, legislative acts, articles of codes, i.e. any materials that create a large amount of non-unique content on your site. The pages with non-unique content do not have a separate directory but are mixed in with the main content. You can prevent such pages from being indexed either completely, by specifying the meta tag

<meta name="robots" content="none"/>
or partially, prohibiting indexing of the content only while still allowing the links to be followed:
<meta name="robots" content="noindex, follow"/>

or simply

<meta name="robots" content="noindex"/>

Situation 2. It also makes sense to use a meta tag when you publish a large number of links on a page. For instance, you want to share interesting links with your users, but you don't want to compromise yourself in the eyes of search engines by publishing a large number of external links. In this case, you can prohibit the page from being indexed while it stays available to your visitors. Only do this when you are not obliged to anyone; do not do it if you are exchanging links with someone.
Again, a complete ban on indexing looks like this:
<meta name="robots" content="none"/>
If you want the text content of the page to be indexed but the links not to be followed, then the entry should look like this:
<meta name="robots" content="index, nofollow"/>
or an equivalent notation
<meta name="robots" content="nofollow"/>
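
If your templates generate the tag, the choices from both situations can be sketched as a small Python helper. The function name and the page kinds are hypothetical; only the content values come from the examples above.

```python
def robots_meta_tag(index: bool = True, follow: bool = True) -> str:
    """Build a robots meta tag from two independent yes/no choices."""
    content = ", ".join([
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ])
    return f'<meta name="robots" content="{content}"/>'

# Hypothetical page kinds mapped to the choices described in the thread.
PAGE_RULES = {
    "non_unique_document": {"index": False, "follow": True},
    "external_links_page": {"index": True, "follow": False},
    "fully_hidden_page": {"index": False, "follow": False},
}

for kind, rule in PAGE_RULES.items():
    print(kind, "->", robots_meta_tag(**rule))
```

The helper always writes both values explicitly, so "noindex, nofollow" appears instead of the shorter "none"; the two are equivalent.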

We all know that archives, categories and tags create duplicate content. But it is not at all necessary to completely close these pages from indexing, because they contain links to our own pages, and these links take part in internal linking, passing their weight to article pages, the main page and others. In the robots meta tag, we can tell the search engine not to index the text, because it creates duplication on the site, but still allow the links on those pages to be followed.
This way, internal linking on the site is not broken; on the contrary, it gives us an additional tool for increasing the static weight of pages within the website.
So, you can use the meta tag value from situation 1 (noindex, follow) for internal linking on the site. I described how to calculate the internal weight of pages and set up linking in the article "How to check and do the right linking on the site"; if you still don't know what linking is, I recommend that you first read the article "Linking Secrets".

If you study the reference materials of search engines, in particular Google, about this meta tag, you will find that it accepts other values besides index/noindex and follow/nofollow.

In addition to the values mentioned above, search engines also understand the noarchive command:

<meta name="robots" content="noarchive"/>

You can use this value if you do not want search engine users to see the "Saved copy" link (Google) in the search results, which leads to a stored copy of your page. Google also understands several more values, which I recommend that you look up in its documentation.
And the last thing I want to draw your attention to in particular.

For a search engine, both the robots.txt file and the robots meta tag are valid ways to give indexing commands, but be careful when the two conflict. If robots.txt allows a page and the meta tag says noindex, the robot takes the stricter command into account and does not index the web page. If robots.txt prohibits the page, the robot does not fetch it at all, so whatever meta tag you put on it manually, whether "all" or "noindex", is never read. The prohibition in robots.txt then applies, although, at least in Google's case, the bare URL can still appear in search results if other pages link to it. Therefore, be careful if you use both versions of robots on the website at the same time.
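
One way to reason about combining the two is sketched below in Python. It assumes Google-like behaviour, in which a page blocked in robots.txt is never fetched, so its meta tag is never read. The function name and the outcome strings are made up for illustration.

```python
def outcome(robots_txt_allows_crawl: bool, meta_content: str = "") -> str:
    """Describe what a robot does with a page, given both kinds of instruction."""
    if not robots_txt_allows_crawl:
        # Assumption: a blocked page is not downloaded, so its meta tag is unseen.
        return "not crawled; meta tag never read"
    tokens = {t.strip().lower() for t in meta_content.split(",")}
    if tokens & {"noindex", "none"}:
        return "crawled, not indexed"
    return "crawled, may be indexed"

print(outcome(True, "all"))       # crawled, may be indexed
print(outcome(True, "noindex"))   # crawled, not indexed
print(outcome(False, "all"))      # not crawled; meta tag never read
print(outcome(False, "noindex"))  # not crawled; meta tag never read
```

In practice this means that a page you want removed from the index with noindex has to stay crawlable in robots.txt, otherwise the robot never gets to see the tag.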