What is robots.txt and what is the robots meta tag?

Started by stellarhomes, Sep 28, 2022, 04:24 AM

stellarhomes (Topic starter)


What is robots.txt and what is the robots meta tag?

sbglobal

The <meta name="robots"> tag is written in the <head> section of a web page. It serves a purpose similar to the robots.txt file: it lets you manage the indexing of content and the following of links through the values of the content attribute:
follow/nofollow – follow or do not follow the links on the page;
index/noindex – index or do not index the page content (usually the text);
all/none – shorthand for "index, follow" and "noindex, nofollow" respectively;
noarchive – prohibits archiving of the page, so the page cannot be opened from the search engine's saved copy.
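To make the values above concrete, here is a minimal Python sketch that turns a content string into flags. The function name parse_robots_content and the flag names are made up for illustration, and real search engines may recognise more values than the ones listed in this post.

```python
def parse_robots_content(content: str) -> dict:
    """Translate a robots meta 'content' value into allow/deny flags."""
    # With no restricting value, indexing, link following and archiving are allowed.
    flags = {"index": True, "follow": True, "archive": True}
    for token in content.lower().split(","):
        token = token.strip()
        if token == "none":
            # "none" is shorthand for "noindex, nofollow".
            flags["index"] = False
            flags["follow"] = False
        elif token in ("noindex", "nofollow", "noarchive"):
            # Strip the "no" prefix to find the flag to switch off.
            flags[token[2:]] = False
        # "index", "follow" and "all" only restate the defaults, so they change nothing.
    return flags
```

For example, parse_robots_content("noindex, follow") leaves follow and archive switched on and switches index off.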

Unlike the robots.txt file, the robots meta tag has a somewhat limited scope.
It applies only to the page it is placed on, whereas the file can close entire directories with a single rule. In other respects the two methods are largely equivalent. Search engines recommend using the robots meta tag when the webmaster does not have access to the root folder of the site.
When the instructions conflict, the stricter one takes priority. For instance, if the file allows the page and the meta tag says none, the search robot will not index the text.
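The "stricter instruction wins" rule can be sketched in Python with the standard urllib.robotparser module. This is only an illustration under assumptions: the function name text_will_be_indexed, the example.com URLs and the sample rules are invented, and real search engines apply their own logic.

```python
from urllib.robotparser import RobotFileParser


def text_will_be_indexed(robots_lines, url, meta_content, agent="*"):
    """Apply the stricter of the robots.txt rule and the robots meta tag."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    # A robot that may not fetch the page never reads its meta tag.
    if not parser.can_fetch(agent, url):
        return False
    tokens = {token.strip() for token in meta_content.lower().split(",")}
    # "noindex" and "none" both forbid indexing the page text.
    return not (tokens & {"noindex", "none"})


# Hypothetical rules: everything under /private/ is closed to all robots.
robots_lines = ["User-agent: *", "Disallow: /private/"]
```

With these rules, a page under /private/ is treated as not indexed whatever its meta tag says, and a page outside it is not indexed when its meta tag says none.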

Main use cases
All of these combinations are used when configuring a site. They let you control how link weight is distributed and give some extra protection against search engine filters for non-unique content.

Applying "content = follow, index" or "content = all"
Indexing of text and following of links are allowed by default, so in most cases there is no point in writing these values in a meta tag. The exception sometimes cited is a page located in a directory that is completely closed via robots.txt.
But this approach complicates work with the site because the behavior is not obvious, and a robot that robots.txt keeps from fetching the page will not see the tag anyway. It is better to avoid it.

The use of "content = nofollow, index"
Indexing the content without following or counting the links on the page is used quite often. This allows you to:

avoid weight leakage to other resources or pages;
manage PR more clearly;
create pages with a lot of links, for example lists of useful links for users. Closing the links shows the search robot that this is not spam or over-optimization.

A meta tag with this content should not be used on pages that take part in a link exchange with another site. It is unethical at the very least and may end the cooperation.

Using "content = follow, noindex"
This value is suitable for announcements, the second and subsequent pagination pages, previews, and other elements that duplicate the main content.
Link weight is still passed on: the search robot follows the links and indexes the pages they lead to.
The same value is suitable when non-unique content is placed on the page, for instance regulatory and legislative acts and other common documents that should not be altered.
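For the pagination case, a site template could choose the tag per page. The Python sketch below assumes a hypothetical helper robots_meta_for_page and that page 1 is the main listing. It is not taken from any particular CMS.

```python
from html import escape


def robots_meta_for_page(page_number: int) -> str:
    """Return the robots meta tag for one page of a paginated listing."""
    # Page 1 is indexed normally; later pages duplicate it, so only their links count.
    content = "index, follow" if page_number == 1 else "noindex, follow"
    return f'<meta name="robots" content="{escape(content)}" />'
```

The returned string goes into the <head> block of the page being rendered.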

Applying "content = nofollow, noindex" or "content = none"
A ban on indexing both text and links is meant for pages with confidential information that should not be reachable through search queries. But such measures do not always work, so some pages that carry a ban on indexing may still get into the search results. This is especially true for the Google search engine: it is important to remember that for Google the meta tag is advisory in nature.
There is no such problem in Yandex search. To reliably keep a page away from the search robot, you should protect it with a password or other identification that requires manual input.

Hanna-banana23

Once a site is online, search engine robots visit it to index it and determine its place in the search results. They access the robots.txt file. You can check that it exists by adding /robots.txt to the end of the site address. If the file is missing, it is worth adding one, otherwise robots may index the wrong information and misjudge the site's place in the search. The file must be in the root folder of the site. The meta tag, in turn, controls indexing of an individual page. It is placed in the <head> block of the page markup, looks like <meta name="robots" content=" " />, and accepts several values that allow or prohibit the robot from indexing the page content.
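The availability check can also be scripted. This is a rough Python sketch using only the standard library. The helper names robots_url and has_robots_txt are invented for the example, and a real check might also want to handle redirects or unusual status codes.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(site_address: str) -> str:
    """Build the robots.txt address, which always sits in the site root."""
    # A leading slash makes urljoin discard any path in the original address.
    return urljoin(site_address, "/robots.txt")


def has_robots_txt(site_address: str, timeout: float = 5.0) -> bool:
    """Return True if the site answers the robots.txt request with HTTP 200."""
    try:
        with urlopen(robots_url(site_address), timeout=timeout) as response:
            return response.status == 200
    except (HTTPError, URLError, TimeoutError):
        # A missing file (404), an unreachable host or a timeout all count as "no file".
        return False
```

Whatever page address you start from, robots_url points at the root of the site, which is where robots look for the file.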

Seattle

The robots meta tag is used to fine-tune indexing: you can close the content but leave the links open (content="noindex, follow" in the robots meta tag) and vice versa.
robots.txt offers no such possibility. And when you cannot access the root directory of the website, you cannot edit robots.txt at all. That is when the meta tag of the same name comes to the rescue.
In robots.txt you can close an entire directory in order to keep bots away from all the pages it contains, whereas a meta tag has to be added to each of those pages.
In that case it is more convenient to make the settings in the file. But if some pages inside the directory still need to stay open, a meta tag is more convenient. To manage the indexing of website pages, it is acceptable to use the robots meta tag and the robots.txt file at the same time. They can give search bots instructions about different pages or duplicate each other's commands.
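How one line in the file covers a whole directory can be sketched with Python's standard urllib.robotparser. The /drafts/ directory, the example.com addresses and the function name allowed_urls are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a single rule closes everything under /drafts/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""


def allowed_urls(robots_txt: str, urls: list[str], agent: str = "*") -> dict[str, bool]:
    """Map each URL to whether the given robot may fetch it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(agent, url) for url in urls}
```

Every address under /drafts/ comes back as closed, including pages in nested folders, while pages elsewhere on the site stay open. Doing the same with meta tags would mean editing each page.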

But if they contain conflicting directives about the same pages, search engine robots will not always do what you intended: by default, the stricter instruction is applied. As a result, pages (or the links on them) for which robots.txt and the robots meta tag disagree will not be indexed.
The ability to manage website indexing is a very useful tool for SEO promotion. The main thing is to learn to determine correctly which of the methods you now know is more effective in a given situation.

lipikatech

A robots.txt file is best for disallowing a whole section of a site, such as a category, whereas a meta tag is more efficient at disallowing single files and pages. You could choose to use both a meta robots tag and a robots.