
Harness the Potential of Web Crawling to Improve Your SERP Ranking – Here’s How

Marvin Magusara

Marvin has over 7 years of experience working with some of the largest global brands and digital agencies. Despite his vast experience, Marvin is still obsessed with learning and testing SEO theories to stay ahead of the curve.

Search engines use crawlers to find new and updated content. Content can range from webpages to PDF files and images. What all content has in common, however, is that it’s discovered by links.

The most important crawler is Googlebot, Google’s crawler, because more than 90% of searches are performed via Google. It works by collecting several web pages and then following the links on them to find new pages. By following this link path, Googlebot finds new and updated content and adds it to an index. When someone types in a query, any relevant information is retrieved from this index.
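To make the link-following idea concrete, here is a minimal sketch of how a crawler could discover pages from a starting point. It is not how Googlebot is implemented; the `LINKS` graph, the page paths, and the `discover` function are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/": ["/blog", "/services"],
    "/blog": ["/blog/crawling", "/"],
    "/services": ["/contact"],
    "/blog/crawling": [],
    "/contact": [],
    "/orphan": [],  # no page links here, so a link-following crawler never finds it
}


def discover(seed_pages, links):
    """Return pages in the order a breadth-first, link-following crawl finds them."""
    found = list(seed_pages)
    seen = set(seed_pages)
    queue = deque(seed_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found


discovered = discover(["/"], LINKS)
print(discovered)
```

In this toy graph, `/orphan` is never discovered because nothing links to it, which is the practical reason internal links matter for crawling.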

Clicks and Rankings

When you do a search, Google will check its index for relevant content and rank it in order to solve your query. This is basically what ranking is. In theory, the highest-ranked result will be most relevant to your query. In practice, that’s often not the case.

Our SEO data scientist set out to establish how Googlebot crawls affect page ranking, as well as the dependency between crawl frequency and URL performance. Here is what they discovered.

The Study Dataset

The analysis is based on real data from a UK-based site with 1.2M average monthly organic search impressions and 3.5K indexed pages.

For this analysis, the team combined data from filtered Googlebot crawl events (GET requests from google.com/bot) with Google Search Console’s daily URL performance data for a period of 15 days in September 2020. For each URL, they calculated the number of crawls per day. The crawl frequency for each URL, expressed as one crawl every N days, was calculated by dividing 15 (the total number of days) by the number of days on which there was at least one crawl.
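The frequency calculation described above can be sketched in a few lines. This is only an illustration of the formula (15 divided by the number of days with at least one crawl). The log rows, URLs, dates, and the `crawl_interval_days` function are made up and are not the team's actual data or code.

```python
from collections import defaultdict
from datetime import date

STUDY_DAYS = 15  # length of the study window, as stated in the article

# Hypothetical crawl log rows: (URL, date of a Googlebot GET request).
crawl_events = [
    ("/pricing", date(2020, 9, 1)),
    ("/pricing", date(2020, 9, 1)),  # second hit on the same day counts once
    ("/pricing", date(2020, 9, 4)),
    ("/pricing", date(2020, 9, 9)),
    ("/about", date(2020, 9, 7)),
]


def crawl_interval_days(events, study_days=STUDY_DAYS):
    """Map each URL to 'crawled once every N days' over the study window."""
    days_with_crawl = defaultdict(set)
    for url, day in events:
        days_with_crawl[url].add(day)
    return {url: study_days / len(days) for url, days in days_with_crawl.items()}


intervals = crawl_interval_days(crawl_events)
print(intervals)  # {'/pricing': 5.0, '/about': 15.0}
```

Here `/pricing` was crawled on three distinct days, so it counts as crawled once every five days, while `/about` was crawled on one day only.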

Predictably, the more frequently a URL was crawled, the more daily impressions and clicks it received, and the higher it ranked. In this study, 81 URLs were crawled every day. They had 211 daily impressions and 3.59 daily clicks on average, and an average position of 15. At the other end, 1866 URLs were crawled once every two weeks. They had two daily impressions, 0.01 clicks, and a position of 27 on average.

| Average crawl frequency (once in) | Nr of URLs | Avg daily impressions | Avg daily clicks | Avg position |
|---|---|---|---|---|
| Every day | 81 | 211 | 3.59 | 15 |
| Once in 2 days | 540 | 25 | 0.42 | 19 |
| Once in 3 days | 162 | 11 | 0.08 | 22 |
| Once in 4 days | 343 | 4 | 0.02 | 27 |
| Once in 5 days | 677 | 3 | 0.02 | 28 |
| Once in a week | 1068 | 2 | 0.02 | 26 |
| Once in 2 weeks | 1866 | 2 | 0.01 | 27 |

[Figure: frequency of crawls vs. impressions vs. average position]

Introducing Pearson’s Coefficient of Linear Correlation

You’ll agree that with age, children tend to get taller. A statistician would say there was a very strong positive relationship between age and height. The Pearson coefficient of correlation would range from 0.70 to 1 depending on the study sample. Likewise, the higher the speed of a car, the shorter the traveling time; if you drive faster, you’ll arrive at your destination sooner. We then speak of a very strong negative relationship between speed and travel time. The coefficient would range from -0.70 to -1.
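For readers who want to see how the coefficient is computed, here is a small sketch of the standard Pearson formula in plain Python. The `pearson` function name and all of the sample numbers (ages, heights, speeds, travel times for a 100 km trip) are invented for illustration and do not come from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson coefficient of linear correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of the same length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return covariation / sqrt(spread_x * spread_y)


# Made-up samples that mirror the two examples in the text.
ages = [4, 6, 8, 10, 12]
heights_cm = [102, 115, 128, 138, 149]
speeds_kmh = [40, 60, 80, 100, 120]
travel_minutes = [150, 100, 75, 60, 50]  # time to cover 100 km at each speed

r_age_height = pearson(ages, heights_cm)
r_speed_time = pearson(speeds_kmh, travel_minutes)
print(round(r_age_height, 2), round(r_speed_time, 2))
```

With these invented numbers, the age and height coefficient lands close to 1 and the speed and travel time coefficient lands close to -1, matching the "very strong" ranges described above.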

The team calculated the Pearson correlation coefficient between the number of days with crawls and the average daily impressions, a metric counting how many times a URL appeared in search results shown to users. They also calculated the coefficient between the number of crawls and both clicks and position (ranking).

| Metric | Nr of crawls |
|---|---|
| Avg impressions | 0.69 |
| Daily clicks | 0.66 |
| Position | -0.69 |

The coefficient between the number of crawls and average impressions was 0.69, which shows a strong positive relationship between them. It was 0.66 between the number of crawls and the clicks per day, and -0.69 between the number of crawls and the position of the URL. This negative value shows that a high crawl frequency is associated with a low average position number. In other words, URLs that are crawled more often sit closer to the top of the SERP.

The team compared two types of URLs: high-ranking URLs with low impressions (average position 1-5 and average daily number of impressions <=5) and low-ranking URLs with high impressions (position >20 and daily number of impressions >=50). They found that the low-ranking URLs with more impressions were crawled almost twice as often: 8.7 vs. 4.6. This led them to conclude that impressions were a more important factor for crawl frequency than ranking position.

How Are Crawl Frequency and Page Ranking Connected?

To determine how Googlebot affects page ranking in the days after a crawl, the researchers combined data from three sources: a list of crawls, the crawl frequency of each URL over a period of two weeks in September 2020, and daily position by keyword for the corresponding period. They determined the position of each URL and keyword on the day before the last crawl, on the day of the crawl, and on each of the five days after the crawl.

They filtered out URLs which were crawled more frequently than once every three days, because these could have been affected by previous crawls, potentially resulting in data inaccuracy. They also filtered out keywords which had no position data on the day before the crawl.

The remaining keyword and URL data, spanning 153 rows, showed an above-average position change on the day of the crawl and two days after it. The biggest change was observed on the day right after the crawl.
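The filtering and the two kinds of position change can be sketched as follows. This is an illustration under stated assumptions, not the researchers' code: the sample rows, field names, and helper functions are hypothetical, and the sketch assumes changes are measured as absolute differences in position, which the article does not specify.

```python
# Hypothetical URL/keyword rows. Positions are keyed by day offset relative
# to the last crawl: -1 = day before, 0 = day of crawl, 1..5 = days after.
pairs = [
    {"url": "/pricing", "keyword": "seo pricing", "crawl_interval_days": 5.0,
     "positions": {-1: 30, 0: 29, 1: 25, 2: 24, 3: 24, 4: 25, 5: 24}},
    {"url": "/blog", "keyword": "seo blog", "crawl_interval_days": 1.0,
     "positions": {-1: 8, 0: 8, 1: 8, 2: 7, 3: 7, 4: 7, 5: 7}},
    {"url": "/about", "keyword": "seo agency", "crawl_interval_days": 7.5,
     "positions": {0: 41, 1: 40, 2: 40, 3: 39, 4: 39, 5: 39}},
]


def keep(pair):
    """Drop URLs crawled more often than once every three days and rows
    with no position on the day before the crawl."""
    return pair["crawl_interval_days"] >= 3 and -1 in pair["positions"]


def day_to_day_changes(positions):
    """Absolute change in position versus the previous day, per day offset."""
    offsets = sorted(positions)
    return {day: abs(positions[day] - positions[prev])
            for prev, day in zip(offsets, offsets[1:])}


def changes_vs_day_before(positions):
    """Absolute change in position versus the day before the crawl."""
    baseline = positions[-1]
    return {day: abs(pos - baseline)
            for day, pos in sorted(positions.items()) if day >= 0}


kept = [pair for pair in pairs if keep(pair)]
dtd = day_to_day_changes(kept[0]["positions"])
vs_before = changes_vs_day_before(kept[0]["positions"])
print(dtd)        # {0: 1, 1: 4, 2: 1, 3: 0, 4: 1, 5: 1}
print(vs_before)  # {0: 1, 1: 5, 2: 6, 3: 6, 4: 5, 5: 6}
```

Averaging these per-row values across all remaining rows would give figures comparable in shape to the study's averages, though the numbers here are invented.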

| Position change | Avg DtD change | Day of last crawl | 1 day after last crawl | 2 days after last crawl | 3 days after last crawl | 4 days after last crawl | 5 days after last crawl |
|---|---|---|---|---|---|---|---|
| Avg DtD change | 1.1 | 1.6 | 3.5 | 2.2 | 1.3 | 1.2 | 1.6 |
| Avg % DtD change | 3.71% | 5% | 6% | 4% | 3% | 1% | 3% |
| Avg change to the day before crawl | | 1.6 | 3.0 | 3.5 | 3.2 | 2.4 | 2.9 |
| Avg % change to the day before crawl | | 5% | 8% | 11% | 11% | 8% | 10% |

The share of URLs whose position improved after the crawl was 64%; their position on the third day after the crawl was higher than before it. The share of URLs whose position deteriorated after the crawl was 36%.

High-ranking URLs were less volatile. For first, second, and third-ranked URLs, the average day-to-day change was just 0.13, compared to roughly 2.1 for URLs ranking 21st or lower.

Avg DtD change in position depending on avg position

| Avg position | Avg DtD change |
|---|---|
| 1-3 | 0.13 |
| 4-10 | 0.15 |
| 11-20 | 0.64 |
| >20 | 2.06 |

What Do These Findings Mean?

If your site isn’t crawled and indexed, it won’t appear on search results pages. If you have a site, check how many of its pages are indexed. This will show whether Google is crawling and finding all of the pages that need to be crawled, or wasting crawl budget on URLs that you don’t want crawled. You can check whether all of your important pages have been indexed in Google Search Console.

You can raise a URL’s crawl priority by increasing the quantity and quality of internal and external links pointing to it. You can also avoid wasting crawl budget on low-priority pages by reducing the internal links pointing to them, adding a nofollow attribute to those links, or adding a rule to your robots.txt file that blocks crawling of these URLs. These steps point Googlebot in the right direction when it crawls your web content, which increases your control over what ultimately appears in the index.
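Before publishing a robots.txt change, it can help to check which paths the rules actually block. The sketch below uses Python's standard `urllib.robotparser` module. The rules and paths shown (`/search/`, `/cart/`, `/services/`) are hypothetical examples of low-priority and important sections, not recommendations for any particular site, and `crawl_allowed` is a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of two low-priority sections.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawl_allowed(path, user_agent="Googlebot"):
    """Return True if the rules above let this user agent fetch the path."""
    return parser.can_fetch(user_agent, path)


for path in ["/services/", "/search/?q=seo", "/cart/checkout"]:
    print(path, crawl_allowed(path))
```

Running a list of your most important URLs through a check like this is a quick way to confirm that a new rule does not block pages you want crawled.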



© 2019 ToTheTop.