
How to Set a Custom robots.txt for Blogger

This time I will share some tips and a short tutorial on setting up robots.txt for Blogger users. In this post I will only explain the code used to configure robots.txt the way we want; to learn how to upload the finished robots.txt, you can read the article How to update robots.txt in blogger. Before we start, let's learn a little about what robots.txt is.

 

  • What is robots.txt?

In my view, robots.txt is a set of rules for search engine crawlers accessing our website: it tells them what they can and cannot crawl, which in turn determines what appears in the crawler owner's search results.

 

Meanwhile, according to Google, which hosts Blogspot (Blogger), robots.txt is:

A file that tells search engine crawlers which pages or files can or cannot be requested from a site. This file is used primarily to prevent your site from being overloaded with requests; this file is not a mechanism to hide web pages from Google.

 

And this is the default robots.txt that our blog comes with:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://mmtipsdantrik.blogspot.com/sitemap.xml
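If you have Python installed, you can check locally what these default rules permit using the standard library's urllib.robotparser module. This is just a quick sketch; the post URL used below is a made-up example path.

```python
from urllib import robotparser

# The default Blogger robots.txt shown above.
rules = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://mmtipsdantrik.blogspot.com/sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# /search is blocked for generic crawlers, but ordinary posts are not.
print(rp.can_fetch("*", "/search?q=test"))           # False
print(rp.can_fetch("*", "/2023/01/some-post.html"))  # True

# The AdSense crawler (Mediapartners-Google) has an empty Disallow,
# so it may fetch everything, including /search.
print(rp.can_fetch("Mediapartners-Google", "/search"))  # True
```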

 

You can see your own blog's robots.txt by adding /robots.txt to the end of the website URL, like the example below:

Example: http://mmtipsdantrik.blogspot.com/robots.txt


In other words, robots.txt is a file created to control the behavior of crawlers (search engine agents) toward our website and its content.


Alright, I'll now explain the code contained in the default robots.txt. For more uses beyond the default directives, please read the article below:

Prevent article images from appearing on Google search using robots.txt

The basic robots.txt file above has 4 important components: User-agent, Allow, Disallow, and Sitemap.

First, User-agent is the opening code that determines which type of search engine crawler the rules apply to. The example above has two User-agent lines; in the first one, the value after the colon means that we are targeting the crawler from Google AdSense (Mediapartners-Google),

 

while in the second one, the asterisk (*) after the colon means the rule applies to all crawlers from the various other search engines, not only Google.

 

Besides Mediapartners-Google and *, there are other user-agents that can be targeted separately, such as Googlebot-Video and Googlebot-Image.

For more about the Googlebot-Image user-agent, which controls whether our images appear in search engines, see:

Prevent article images from appearing on Google search using robots.txt


Second, Disallow means that, after specifying a user-agent, we can exclude a page or other subdirectory of our website so that the user-agent is not allowed to crawl it.

 

Third, Allow: after we block a user-agent's access with Disallow, we can grant an exception, for example a single image out of all the images that have been disallowed, by writing its path in detail.


These three codes are the basis for creating a new rule that applies only to one type of content or to other specific cases.

Example:

Block more than one page for all user-agents:

User-agent: *
Disallow: /p/privacypolicy.html
Disallow: /p/aboutme.html

This means that the privacy policy and about me pages will not be crawled.
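As a quick sketch with urllib.robotparser (the /p/contact.html page is a hypothetical example), you can verify that only the two listed pages are blocked:

```python
from urllib import robotparser

# The two static pages blocked for every user-agent, as above.
rules = """\
User-agent: *
Disallow: /p/privacypolicy.html
Disallow: /p/aboutme.html
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "/p/privacypolicy.html"))  # False
print(rp.can_fetch("*", "/p/aboutme.html"))        # False
# Pages with no matching Disallow line stay crawlable.
print(rp.can_fetch("*", "/p/contact.html"))        # True
```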

 

Target more than one type of user-agent:

User-agent: Googlebot-Video
User-agent: Googlebot-Image
Disallow: /

This means that the images and videos on the website will not be crawled and will not appear in search engines.
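Stacked User-agent lines share the rules that follow them, which you can confirm with urllib.robotparser (the file paths and the Bingbot check are illustrative):

```python
from urllib import robotparser

# Two user-agents grouped above one shared Disallow rule.
rules = """\
User-agent: Googlebot-Video
User-agent: Googlebot-Image
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Both grouped crawlers are blocked everywhere...
print(rp.can_fetch("Googlebot-Image", "/2023/01/photo.jpg"))  # False
print(rp.can_fetch("Googlebot-Video", "/2023/01/clip.mp4"))   # False
# ...while a crawler not listed in the group is unaffected.
print(rp.can_fetch("Bingbot", "/2023/01/some-post.html"))     # True
```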

 

Allow one of the images in a post:

User-agent: Googlebot-Image
Allow: /UrlPostingan.html/imagesname.jpg

This means that imagesname.jpg from the UrlPostingan.html article is allowed to be crawled and can appear in search engines.
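Here is a sketch of the same idea with the blanket Disallow from the previous example included (the paths are placeholders). One caveat: Python's urllib.robotparser applies the first matching rule, so the Allow line is placed before Disallow here; Googlebot itself uses the most specific (longest) matching rule, so the order would not matter for it.

```python
from urllib import robotparser

# Block all content for the image crawler except one image.
rules = """\
User-agent: Googlebot-Image
Allow: /UrlPostingan.html/imagesname.jpg
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The single excepted image is crawlable...
print(rp.can_fetch("Googlebot-Image", "/UrlPostingan.html/imagesname.jpg"))  # True
# ...while every other image stays blocked.
print(rp.can_fetch("Googlebot-Image", "/2023/01/other.jpg"))                 # False
```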


Fourth, Sitemap: this code points to the sitemap of our blog. By adding the sitemap link here, we can optimize the crawl rate of the blog for all user-agents.

 

This means that whenever a search engine crawler scans the robots.txt file, there is a better chance it will crawl all of your blog posts without skipping any of them.
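The Sitemap line can also be read programmatically: Python's urllib.robotparser (3.8+) exposes the Sitemap URLs it finds via site_maps(), a quick way to confirm the link is being picked up (sketch using this blog's sitemap URL from above):

```python
from urllib import robotparser

rules = """\
User-agent: *
Disallow: /search
Allow: /

Sitemap: https://mmtipsdantrik.blogspot.com/sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# site_maps() returns every Sitemap URL found, or None if there are none.
print(rp.site_maps())  # ['https://mmtipsdantrik.blogspot.com/sitemap.xml']
```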

More details about the sitemap

How to Create and Edit a Responsive Sitemap


Alright, that's the tutorial and tips for this time. If you want to know how to upload the robots.txt you have set, please click the link below:

How to update robots.txt in blogger

 

Thank you for visiting. If you have any questions, please comment below.


