On January 27, I published my first blog post on this website. Before that, on January 21, I removed all the old content and started over. I gave search engines a little time to understand what had changed and where the old URLs are now 301-redirected.

If you missed the first post, where I covered a quick way to find and fix out-of-stock products, here is the link to take a look.

I wanted to know how quickly search engines would discover and index that blog post, so I tracked them with some log file analysis.

One detail that might matter:

Bingbot and Yandexbot could not crawl my website because they were getting a “400 Bad Request” response. A web application firewall can return this status code when it blocks a page for particular countries, bots, IP addresses, and so on. I never intended to block Yandexbot or Bingbot, and I solved the issue on January 24 by disabling ModSecurity in cPanel.
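A quick way to spot this kind of accidental blocking is to tally response status codes per crawler in the raw access log. The following is a minimal sketch, assuming the server writes the common Apache/Nginx "combined" log format. The helper name `status_by_bot`, the sample lines, and the documentation-range IP addresses are all made up for illustration and are not taken from the real log file.

```python
import re
from collections import Counter, defaultdict

# Assumed combined log format:
# IP - - [time] "METHOD path PROTO" status size "referer" "user agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)

# Lowercase substrings that identify each crawler in the User-Agent header.
BOT_TOKENS = {"bingbot": "Bingbot", "yandex": "Yandexbot", "googlebot": "Googlebot"}


def status_by_bot(lines):
    """Return {bot name: Counter of status codes} for the known crawlers."""
    result = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        for token, name in BOT_TOKENS.items():
            if token in agent:
                result[name][match.group("status")] += 1
                break
    return result


# Illustrative sample lines only (192.0.2.0/24 is a documentation range).
SAMPLE = [
    '192.0.2.10 - - [23/Jan/2022:10:00:00 +0000] "GET / HTTP/1.1" 400 226 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.20 - - [23/Jan/2022:10:05:00 +0000] "GET / HTTP/1.1" 400 226 "-" '
    '"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
    '192.0.2.30 - - [23/Jan/2022:10:06:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [25/Jan/2022:09:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
]

for bot, codes in status_by_bot(SAMPLE).items():
    print(bot, dict(codes))
```

If one crawler shows only 4xx responses while the others get 200s, a firewall or security rule is a likely suspect.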

I thought these details would be helpful to know before getting into the story-based SEO research.

Which things did I examine?

Which search engine bot crawled this post first?

Well, Bingbot was fast, and it’s the clear winner here. I published the blog post on January 27, 2022, at 3:00 PM. According to the log file events I have, Bingbot crawled it 2 hours later on the same day. This log event matches the URL inspection details in Bing Webmaster Tools.

Bing Webmasters URL Inspection

5 hours later, Yandexbot crawled it twice in a row.

Yandex Webmaster Crawl Statistics

As far as I can see, the Yandex Webmaster tool doesn’t show the exact time in its crawl statistics. But I was able to catch it in the log file events. You might be interested to see what that looks like, so here it is.

Screaming Frog Log File Analyser Bingbot vs Yandexbot Events

How many times did Bingbot and Yandexbot crawl the post before indexing it?

Bingbot and Yandexbot crawled the blog post only on the day I published it and never came back.

In total, Bingbot crawled it once, and Yandexbot crawled it twice.
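The same two questions (who crawled first, and how many times) can also be answered directly from the raw log. Below is a minimal sketch, again assuming the combined log format. The function name `crawl_summary`, the path `/blog/example-post/`, and the sample lines are hypothetical and only illustrate the idea; they are not the real events from this site.

```python
import re
from datetime import datetime

# Pulls the timestamp, requested path, and user agent out of a combined-format line.
LOG_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def crawl_summary(lines, path, bot_tokens=("bingbot", "yandex", "googlebot")):
    """Return {bot token: (first crawl time, number of crawls)} for one URL path."""
    summary = {}
    for line in lines:
        match = LOG_RE.search(line)
        if not match or match.group("path") != path:
            continue
        agent = match.group("agent").lower()
        token = next((t for t in bot_tokens if t in agent), None)
        if token is None:
            continue  # not one of the crawlers we track
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        first, count = summary.get(token, (when, 0))
        summary[token] = (min(first, when), count + 1)
    return summary


# Illustrative sample lines only.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Feb/2022:17:00:00 +0000] "GET /blog/example-post/ HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.20 - - [01/Feb/2022:20:00:00 +0000] "GET /blog/example-post/ HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
    '192.0.2.20 - - [01/Feb/2022:20:00:05 +0000] "GET /blog/example-post/ HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
    '192.0.2.30 - - [01/Feb/2022:21:00:00 +0000] "GET /sitemap.xml HTTP/1.1" 200 900 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for bot, (first, count) in crawl_summary(SAMPLE_LOG, "/blog/example-post/").items():
    print(bot, first.isoformat(), count)
```

Passing a sitemap path instead of the post path gives the same counts for sitemap requests. Note that matching on the user agent alone does not prove the request came from a genuine crawler.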

Did all bots crawl the XML sitemap?

No. But in this case, it was clear that Googlebot pays attention to XML sitemaps. Since January 21:

  • Googlebot crawled the main XML sitemap 13 times and the post-based sitemap 4 times.
  • On the day I published the blog post, Googlebot crawled the main XML sitemap 7 times, but it crawled the other sitemap, which lists blog posts, only once.
  • Yandexbot crawled the main sitemap about 2 times, and both times it also crawled the post-based sitemap.
  • Bingbot never crawled any of the XML sitemaps.

Are those log events by a verified crawler?

Yes. I configured Screaming Frog Log File Analyser to analyze only verified bots. Here are the IP addresses I had collected so far.

Top 5 Yandexbot IP addresses (by the number of events):

  • 5.255.253.114
  • 213.180.203.5
  • 5.255.253.124
  • 5.255.253.110
  • 5.45.207.99

These addresses are easy to find publicly on the web, so I didn’t hesitate to share them with you.

Top 5 Bingbot IP addresses (by the number of events):

  • 40.77.167.99
  • 157.55.39.28
  • 207.46.13.230
  • 207.46.13.100
  • 40.77.167.39

You can check if the crawler IP address belongs to Bingbot or not by using Bing’s Toolbox or Bing Webmasters’ Verify Bingbot.
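A common way to verify a crawler yourself is forward-confirmed reverse DNS: look up the hostname for the IP address, check that it ends in a domain the search engine uses for its crawlers, then resolve that hostname back and confirm it returns the same IP. The sketch below is one possible implementation and is not necessarily how Screaming Frog or Bing's tool performs the check. The function name `verify_crawler` and the injectable `reverse`/`forward` parameters are assumptions made so the logic can be tried without network access.

```python
import socket

# Hostname suffixes used by each search engine's crawlers.
CRAWLER_DOMAINS = {
    "Bingbot": (".search.msn.com",),
    "Yandexbot": (".yandex.ru", ".yandex.net", ".yandex.com"),
    "Googlebot": (".googlebot.com", ".google.com"),
}


def verify_crawler(ip, bot, reverse=socket.gethostbyaddr,
                   forward=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves to one of the bot's domains and
    that hostname resolves back to the same IP address."""
    try:
        hostname = reverse(ip)[0].lower()
    except OSError:
        return False  # no PTR record, or the lookup failed
    if not hostname.endswith(CRAWLER_DOMAINS[bot]):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses


# Offline demonstration with stand-in resolvers (hypothetical hostname and IP).
def fake_reverse(ip):
    return ("msnbot-192-0-2-1.search.msn.com", [], [ip])


def fake_forward(hostname):
    return (hostname, [], ["192.0.2.1"])


print(verify_crawler("192.0.2.1", "Bingbot", fake_reverse, fake_forward))
```

With network access, calling `verify_crawler("40.77.167.99", "Bingbot")` without the stand-in resolvers performs real DNS lookups; the result depends on the DNS records at the time of the check.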

 

Findings

In this research, I discovered something weird, which might apply only to my site.

When I checked in Google Search Console whether Googlebot had even discovered my blog post, the URL was reported as discovered but currently not indexed.

Google Search Console – URL Inspect, discovered but not indexed yet

However, no discovery date was shown for the blog post. So I checked the “Discovered – currently not indexed” coverage report, but the blog post URL wasn’t listed there.

Google Search Console (GSC) – Discovered - currently not indexed issue.

At first glance, I didn’t know the reason for it. But then I thought of two things:

  1. Googlebot crawled my XML sitemaps.
  2. Googlebot crawled my homepage and the category of the blog post.

I was right, and Google was also right not to show the URL in the “Discovered – currently not indexed” report list. When I searched Google for the page title, the homepage and the author page appeared in the results.

Searching on Google by Page Title

 

Sometimes we need to look at complex things simply so that we can understand them faster and better.

Conclusions

  • Apparently, Googlebot was busy making sense of the new version of my website.
  • Bing and Yandex indexed my blog post on the day I published it.
  • After three days, Google still hadn’t indexed my blog post.

Update — February 6, 2022:

  • The blog post was finally indexed by Google on February 4, after 8 days.