Content, Shares, and Links: Insights from Analyzing 1 Million Articles

Posted by Steve_Rayson

This summer BuzzSumo teamed up with Moz to analyze the shares and links of over 1m articles. We wanted to look at the correlation of shares and links, to understand the content that gets both shares and links, and to identify the formats that get relatively more shares or links.

What we found is that the majority of content published on the internet is ignored when it comes to shares and links. The data suggests most content is not worthy of sharing or linking, and also that people are very poor at amplifying content. It may sound harsh, but it seems most people are wasting their time either producing poor content or failing to amplify it.

On a more positive note we also found some great examples of content that people love to both share and link to. It was not a surprise to find content gets far more shares than links. Shares are much easier to acquire. Everyone can share content easily and it is almost frictionless in some cases. Content has to work much harder to acquire links. Our research uncovered:

  • The sweet spot content that achieves both shares and links
  • The content that achieves higher than average referring domain links
  • The impact of content formats and content length on shares and links

Our summary findings are as follows:

  1. The majority of posts receive few shares and even fewer links. In a randomly selected sample of 100,000 posts over 50% had 2 or fewer Facebook interactions (shares, likes or comments) and over 75% had zero external links. This suggests there is a lot of very poor content out there and also that people are very poor at amplifying their content.
  2. When we looked at a bigger sample of 750,000 well shared posts we found over 50% of these posts still had zero external links. This suggests that while many posts acquire shares, and in some cases large numbers of shares, they find it far harder to acquire links.
  3. Shares and links are not normally distributed around an average. There are high performing outlier posts that get a lot of shares and links but most content is grouped at the low end, with close to zero shares and links. For example, over 75% of articles from our random sample of 100,000 posts had zero external links and just 1 or fewer referring domain links.
  4. Across our total sample of 1m posts there was NO overall correlation of shares and links, implying people share and link for different reasons. The correlation of total shares and referring domain links across 750,000 articles was just 0.021.
  5. There are, however, specific content types that do have a strong positive correlation of shares and links. This includes research backed content and opinion forming journalism. We found these content formats achieve both higher shares and significantly more links.
  6. 85% of content published (excluding videos and quizzes) is less than 1,000 words long. However, long form content of over 1,000 words consistently receives more shares and links than shorter form content. Either people ignore the data or it is simply too hard for them to write quality long form content.
  7. Content formats matter. Formats such as entertainment videos and quizzes are far more likely to be shared than linked to. Some quizzes and videos get hundreds of thousands of shares but no links.
  8. List posts and videos achieve much higher shares on average than other content formats. However, in terms of achieving links, list posts and why posts achieve a higher number of referring domain links than other content formats on average. While we may love to hate them, list posts remain a powerful content format.

We have outlined the findings in more detail below. You can download the full 30 page research report from the BuzzSumo site:

Download the full 30-page research report

The majority of posts receive few shares and even fewer links

We pulled an initial sample of 757,000 posts from the BuzzSumo database. 100,000 of these posts were pulled at random and acted as a control group. As we wanted to investigate certain content formats, the other 657,000 were well shared ‘how to’ posts, list posts, quizzes, infographics, why posts and videos. The overall sample therefore had a specific bias to well shared posts and specific content formats. However, despite this bias towards well shared articles, 50% of our 757,000 articles still had 11 or fewer Twitter shares and 50% of the posts had zero external links.

By comparison, 50% of the 100,000 randomly selected posts had 2 or fewer Twitter shares, 2 or fewer Facebook interactions, 1 or fewer Google+ shares and zero LinkedIn shares. 75% of the posts had zero external links and 1 or fewer referring domain links.

75% of randomly selected articles had zero external links

Shares and links are not normally distributed

Shares and links are not distributed normally around an average. Some posts go viral and get very high numbers of shares and links. This distorts the average: the vast majority of posts receive very few shares or links and sit at the bottom of a very skewed distribution curve, as shown below.

This chart is cut off on the right at 1,000 shares. In fact, the long thin tail would extend a very long way, as a number of articles received over 1m shares and one received 5.7m shares.

This long tail distribution is the same for shares and links across all the domains we analyzed. The skewed nature of the distribution means that averages can be misleading due to the long tail of highly shared or linked content. In the example below we show the distribution of shares for a domain. In this example the average is the blue line but 50% of all posts lie to the left of the red line, the median.
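To make the gap between average and median concrete, here is a minimal sketch. The share counts are invented for illustration and are not taken from our sample; they only mimic the shape of a long-tailed distribution, where one viral post drags the average far above what a typical post achieves.

```python
from statistics import mean, median

# Hypothetical share counts for ten posts on one domain (illustrative only):
# most posts get almost nothing, one post goes viral.
shares = [0, 1, 1, 2, 2, 3, 5, 8, 40, 12000]

avg = mean(shares)    # pulled far to the right by the single viral post
mid = median(shares)  # half of the posts sit at or below this value

# Count how many posts actually beat the average.
above_average = sum(1 for s in shares if s > avg)

print(f"mean = {avg:.1f}, median = {mid}, posts above the mean = {above_average}")
```

In this made-up sample only one post out of ten beats the average, which is why the median is usually the more honest summary of how a typical post performs.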

There is NO correlation of shares and links

We used the Pearson correlation coefficient, a measure of the linear correlation between two variables. The results can range from 1 (a total positive correlation) through 0 (no correlation) to −1 (a total negative correlation).
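For readers who want to run the same kind of check on their own data, the coefficient can be sketched in a few lines of plain Python. This is not the code used for the study; the function name `pearson` and the sample numbers are assumptions made for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-article figures (illustrative only).
total_shares = [10, 200, 35, 4000, 12, 90]
referring_domains = [1, 0, 2, 1, 0, 3]

r = pearson(total_shares, referring_domains)
print(f"correlation of shares and referring domains: {r:.3f}")
```

A value near 0 on your own articles would mirror the overall finding here; a value closer to 1 would suggest your content sits nearer the overlap of sharing and linking.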

The overall correlations for our sample were:

  • Total shares and Referring Domain Links 0.021
  • Total shares and Sub-domain Links 0.020
  • Total shares and External Links 0.011

The results suggest that people share and link to content for different reasons.

We also looked at different social networks to see if there were more positive correlations for specific networks. We found no strong positive correlation of shares to referring domain links across the different networks as shown below.

  • Facebook total interactions 0.0221
  • Twitter 0.0281
  • LinkedIn 0.0216
  • Pinterest 0.0065
  • Google+ 0.0058

Whilst there is no correlation by social network there is some evidence that very highly shared posts have a higher correlation of shares and links. This can be seen below.

Content sample Average total shares Median shares Average referring domain links Median referring domain links Correlation total shares – referring domains
Full sample of posts 4,393 202 3.77 1 0.021
Posts with over 10,000 total shares 35,080 18,098 7.06 2 0.101

The increase in correlation is relatively small. However, it does indicate that very popular sites, other things being equal, would have slightly higher correlations of shares and links.

Our finding that there is no overall correlation contradicts previous studies that have suggested there is a positive correlation of shares and links. We believe the previous findings may have been due to inadequate sampling as we will discuss below.

The content sweet spot: content with a positive correlation of shares and links

Our research found there are specific content types that have a high correlation of shares and links. This content attracts both shares and links, and as shares increase so do referring domain links. Thus whilst content is generally shared and linked to for different reasons, there appears to be an overlap where some content meets the criteria for both sharing and linking.

The content that falls into this overlap area, our sweet spot, includes content from popular domains such as major publishers. In our sample the content also included authoritative, research backed content, opinion forming journalism and major news sites.

In our sample of 757,000 well shared posts the following were examples of domains that had a high correlation of shares and links.

Site Number of articles in sample Referring domain links – total shares correlation
The Breast Cancer Site 17 0.90
New York Review of books 11 0.95
Pew Research 25 0.86
The Economist 129 0.73

We were very cautious about drawing conclusions from this data as the individual sample sizes were very small. We therefore undertook a second, separate sampling exercise for domains with high correlations. This analysis is outlined in the next section below.

Our belief is that previous studies may have sampled content disproportionately from popular sites within the area of overlap. This would explain a positive correlation of shares and links. However, the data shows that the domains in the area of overlap are actually outliers when it comes to shares and links.

Sweet-spot content: opinion-forming journalism and research-backed content

In order to explore further the nature of content on sites with high correlations we looked at a further 250,000 random articles from those domains.

For example, we looked at 49,952 articles from the New York Times and 46,128 from the Guardian. These larger samples had a lower correlation of links and shares, as we would expect due to the samples having a lower level of shares overall. The figures were as follows:


[Table columns: Number of articles in sample, Average Total Shares, Average Referring Domain Links, Correlation of Total Shares to Domain Links; values missing]

We then subsetted various content types to see if particular types of content had higher correlations. During this analysis we found that opinion content from these sites, such as editorials and columnists, had significantly higher average shares and links, and a higher correlation. For example:

[Table, opinion content. Columns: Number of articles in sample, Average Total Shares, Average Referring Domain Links, Correlation of Total Shares to Domain Links; values missing]

The higher shares and links may be because opinion content tends to be focused on current trending areas of interest and because the authors take a particular slant or viewpoint that can be controversial and engaging.

We decided to look in more detail at opinion forming journalism. For example, we looked at over 20,000 articles from The Atlantic and New Republic. In both cases we saw a high correlation of shares and links combined with a high number of referring domain links as shown below.


[Table columns: Number of articles in sample, Average Total Shares, Average Referring Domain Links, Correlation of Total Shares to Domain Links; values missing]

This data appears to support the hypothesis that authoritative, opinion shaping journalism sits within the content sweet spot. It particularly attracts more referring domain links.

The other content type that had a high correlation of shares and links in our original sample was research backed content. We therefore sampled more data from sites that publish a lot of well researched and evidenced content. We found content on these sites had a significantly higher number of referring domain links. The content also had a higher correlation of links and shares as shown below.


[Table columns: Number of articles in sample, Average Total Shares, Average Referring Domain Links, Correlation of Total Shares to Domain Links; values missing]

Thus whilst overall there is no correlation of shares and links, there are specific types of content that do have a high correlation of shares and links. This content appears to sit close to the center of the overlap of shares and links, our content sweet spot.

The higher correlation appears to be caused by the content achieving a higher level of referring domain links. Shares are generally much easier to achieve than referring domain links. You have to work much harder to get such links and it appears that research backed content and authoritative opinion shaping journalism is better at achieving referring domain links.

Want shares and links? Create deep research or opinion-forming content

Our conclusion is that if you want to create content that achieves a high level of both shares and links then you should concentrate on opinion forming, authoritative content on current topics or well researched and evidenced content. This post falls very clearly into the latter category, so we shall see if this proves to be the case here.

The impact of content format on shares and links

We specifically looked at the issue of content formats. Previous research had suggested that some content formats may have a higher correlation of shares and links. Below are the details of shares and links by content format from our sample of 757,317 posts.

[Table columns: Content Type, Number in sample, Average Total Shares, Average Referring Domain Links, Correlation of total shares & referring domain links. Surviving row labels: List post, Why post, How to post; values missing]

What stands out is the high level of shares for list posts and videos.

By contrast the average level of shares for infographics is very low. Whilst the top infographics did well (there were 343 infographics with more than 10,000 shares) the majority of infographics in our sample performed poorly. Over 50% of infographics (53,000 in our sample) had zero external links and 25% had fewer than 10 shares in total across all networks. This may reflect a recent trend to turn everything into an infographic, leading to many poor pieces of content.

What also stands out is the relatively low number of referring domain links for quizzes. People may love to share quizzes but they are less likely to link to them.

In terms of the correlation of total shares and referring domain links, why posts had a higher correlation than all other content types, at 0.125. List posts and videos also had a higher correlation than the overall sample correlation, which was 0.021.

List posts appear to perform consistently well as a content format in terms of both shares and links.

Some content types are more likely to be shared than linked to

Surprising, unexpected and entertaining images, quizzes and videos have the potential to go viral with high shares. However, this form of content is far less likely to achieve links.

Entertaining content such as Vine videos and quizzes often had zero links despite very high levels of shares. Here are some examples.


[Table columns: Total Shares, External Links, Referring Domain Links. Rows: Vine video, Vine video, Disney Dog Quiz…, Brainfall Quiz…; values missing]

Long form content consistently receives more shares and links than shorter-form content

We removed videos and quizzes from our initial sample to analyze the impact of content length. This gave us a sample of 489,128 text based articles which broke down by content length as follows:

Length (words) Number in sample Percent
<1,000 418,167 85.5
1,000-2,000 58,642 12.0
2,000-3,000 8,172 1.7
3,000-10,000 3,909 0.8

Over 85% of articles had fewer than 1,000 words.
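The percentages above can be reproduced from the counts with a short sketch. The counts and the 489,128 total come from the text; the variable names are ours. Note that the four listed bands add up to slightly less than the stated total, and the table does not say what the remaining articles are.

```python
# Article counts by word-count band, as reported in the table above.
counts = {
    "<1,000": 418_167,
    "1,000-2,000": 58_642,
    "2,000-3,000": 8_172,
    "3,000-10,000": 3_909,
}
total_sample = 489_128  # text-based articles after removing videos and quizzes

# Share of the total sample that falls in each band, to one decimal place.
percentages = {
    band: round(100 * n / total_sample, 1) for band, n in counts.items()
}

# Articles in the total that are not covered by any listed band.
unlisted = total_sample - sum(counts.values())

for band, pct in percentages.items():
    print(f"{band:>12}: {pct}%")
print(f"not in any listed band: {unlisted}")
```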

We looked at the impact of content length on total shares and domain links.

Length (words) Average Total Shares Average Referring Domain Links
<1,000 2,823 3.47
1,000-2,000 3,456 6.92
2,000-3,000 4,254 8.81
3,000-10,000 5,883 11.07

We can see that long form content consistently gets higher average shares and significantly higher average links. This supports our previous research findings, although there are exceptions, particularly with regard to shares. One such exception we identified is IFL Science, which publishes short form content shared by its 21m Facebook fans. The site curates images and videos to explain scientific research and findings. This article examines how they create their short form viral content. However, IFL Science is very much an exception. On average long form content performs better, particularly when it comes to links.

When we looked at the impact of content length on the correlation of shares and links, we found that content of over 1,000 words had a higher correlation but the correlation did not increase further beyond 2,000 words.

Length (words) Correlation Shares/Links
<1,000 0.024
1,000-2,000 0.113
2,000-3,000 0.094
3,000+ 0.072

The impact of combined factors

We have not undertaken any detailed linear regression modelling or built any predictive models but it does appear that a combination of factors can increase shares, links and the correlation. For example, when we subsetted List posts to look at those over 1,000 words in length, the average number of referring domain links increased from 6.19 to 9.53. Similarly in our original sample there were 1,332 articles from the New York Times. The average number of referring domain links for the sample was 7.2. When we subsetted out just the posts over 1,000 words the average number of referring domain links increased to 15.82. When we subsetted out just the List posts the average number of referring domain links increased further to 18.5.

The combined impact of factors such as overall site popularity, content format, content type and content length is an area for further investigation. However, the initial findings do indicate that shares and/or links can be increased when some of these factors are combined.

Steve will be discussing the findings at a Mozinar on September 22, at 10.30am Pacific Time. You can register and save your place here…

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Source: moz

Clean Your Site’s Cruft Before It Causes Rankings Problems – Whiteboard Friday

Posted by randfish

We all have it. The cruft. The low-quality, or even duplicate-content pages on our sites that we just haven’t had time to find and clean up. It may seem harmless, but that cruft might just be harming your entire site’s ranking potential. In today’s Whiteboard Friday, Rand gives you a bit of momentum, showing you how you can go about finding and taking care of the cruft on your site.

Cleaning the Cruft from Your Site Before it Causes Pain and Problems with your Rankings Whiteboard

Video transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re chatting about cleaning out the cruft from your website. By cruft what I mean is low quality, thin quality, duplicate content types of pages that can cause issues even if they don’t seem to be causing a problem today.

What is cruft?

If you were to, for example, launch a large number of low quality pages, pages that Google thought were of poor quality, that users didn’t interact with, you could find yourself in a seriously bad situation, and that’s for a number of reasons. So Google, yes, certainly they’re going to look at content on a page by page basis, but they’re also considering things domain wide.

So they might look at a domain and see lots of these green pages, high quality, high performing pages with unique content, exactly what you want. But then they’re going to see like these pink and orange blobs of content in there, thin content pages with low engagement metrics that don’t seem to perform well, duplicate content pages that don’t have proper canonicalization on them yet. This is really what I’m calling cruft, kind of these two things, and many variations of them can fit inside those.

One issue with cruft, for sure, is that it can cause Panda issues. So Google’s Panda algorithm is designed to look at a site and say, “You know what? You’re tipping over the balance of what a high quality site looks like to us. We see too many low quality pages on the site, and therefore we’re not just going to hurt the ranking ability of the low quality pages, we’re going to hurt the whole site.” Very problematic, really, really challenging and many folks who’ve encountered Panda issues over time have seen this.

There are also other things, probably not directly Panda related, like site-wide algorithmic looks at engagement and quality. So, for example, there was a recent analysis of the Phantom II update that Google did, which hasn’t really been formalized very much and Google hasn’t said anything about it. But one of the things that they looked at in that Phantom update was the engagement of pages on the sites that got hurt versus the engagement of pages on the sites that benefited, and you saw a clear pattern. Engagement on sites that benefited tended to be higher. On those that were hurt, it tended to be lower. So again, it could be not just Panda but other things that will hurt you here.

It can waste crawl bandwidth, which sucks. Especially if you have a large site or complex site, if the engine has to go crawl a bunch of pages that are cruft, that is potentially less crawl bandwidth and less frequent updates for crawling to your good pages.

It can also hurt from a user perspective. User happiness may be lowered, and that could mean a hit to your brand perception. It could also drive down better converting pages. It’s not always the case that Google is perfect about this. They could see some of these duplicate content, some of these thin content pages, poorly performing pages and still rank them ahead of the page you wish ranked there, the high quality one that has good conversion, good engagement, and that sucks just for your conversion funnel.

So all sorts of problems here, which is why we want to try and proactively clean out the cruft. This is part of the SEO auditing process. If you look at a site audit document, if you look at site auditing software, or step-by-step how-to’s, like the one from Annie that we use here at Moz, you will see this problem addressed.

How do I identify what’s cruft on my site(s)?

So let’s talk about some ways to proactively identify cruft and then some tips for what we should do afterwards.

Filter that cruft away!

One of those ways for sure that a lot of folks use is Google Analytics or Omniture or Webtrends, whatever your analytics system is. What you’re trying to design there is a cruft filter. So I got my little filter. I keep all my good pages inside, and I filter out the low quality ones.

What I can use is one of two things. First, a threshold for bounce or bounce rate or time on site, or pages per visit, any kind of engagement metric that I like I can use that as a potential filter. I could also do some sort of a percentage, meaning in scenario one I basically say, “Hey the threshold is anything with a bounce rate higher than 90%, I want my cruft filter to show me what’s going on there.” I’d create that filter inside GA or inside Omniture. I’d look at all the pages that match that criteria, and then I’d try and see what was wrong with them and fix those up.

The second one is basically I say, “Hey, here’s the average time on site, here’s the median time on site, here’s the average bounce rate, median bounce rate, average pages per visit, median, great. Now take me 50% below that or one standard deviation below that. Now show me all that stuff, filters that out.”

This process is going to capture thin and low quality pages, the ones I’ve been showing you in pink. It’s not going to catch the orange ones. Duplicate content pages are likely to perform very similarly to the thing that they are a duplicate of. So this process is helpful for one of those, not so helpful for other ones.
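Here is a rough sketch of what those two cruft filters could look like once you have exported page-level stats from your analytics tool. The page data, field names, and function names are all made up for illustration; they are not a GA or Omniture API. The second filter uses the "one standard deviation below the average" variant, applied to time on page.

```python
from statistics import mean, pstdev

# Hypothetical analytics export: one row per page (illustrative values only).
pages = [
    {"url": "/guide", "bounce_rate": 0.35, "avg_time_on_page": 210.0},
    {"url": "/pricing", "bounce_rate": 0.42, "avg_time_on_page": 180.0},
    {"url": "/tag/misc", "bounce_rate": 0.95, "avg_time_on_page": 150.0},
    {"url": "/blog/post-1", "bounce_rate": 0.50, "avg_time_on_page": 160.0},
    {"url": "/print/guide", "bounce_rate": 0.97, "avg_time_on_page": 5.0},
    {"url": "/about", "bounce_rate": 0.60, "avg_time_on_page": 170.0},
]

def over_bounce_threshold(pages, threshold=0.90):
    """Scenario one: a fixed threshold, e.g. bounce rate higher than 90%."""
    return [p["url"] for p in pages if p["bounce_rate"] > threshold]

def below_typical_time(pages, num_std=1.0):
    """Scenario two: pages more than num_std standard deviations below
    the site-wide average time on page."""
    times = [p["avg_time_on_page"] for p in pages]
    cutoff = mean(times) - num_std * pstdev(times)
    return [p["url"] for p in pages if p["avg_time_on_page"] < cutoff]

print("High bounce:", over_bounce_threshold(pages))
print("Low engagement:", below_typical_time(pages))
```

Either list is just a set of candidates to review by hand, not an automatic delete list.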

Sort that cruft!

For that process, you might want to use something like Screaming Frog or, which is a great tool, or Moz Analytics, comes from some company I’ve heard of.

Basically, in this case, you’ve got a cruft sorter that is essentially looking at filtration, items that you can identify in things like the URL string or in title elements that match or content that matches, those kinds of things, and so you might use a duplicate content filter. Most of these pieces of software already have a default setting. In some of them you can change that. I think and Screaming Frog both let you change the duplicate content filter. Moz Analytics not so much, same thing with Google Webmaster Tools, now Search Console, which I’ll talk about in a sec.

So I might say like, “Hey, identify anything that’s more than 80% duplicate content.” Or if I know that I have a site with a lot of pages that have only a few images and a little bit of text, but a lot of navigation and HTML on them, well, maybe I’d turn that up to 90% or even 95% depending.

I can also use some rules to identify known duplicate content violators. So for example, if I’ve identified that everything with a query string like ?refer=bounce or ?refer=partner is a duplicate, well, okay, now I just need to filter for that particular URL string, or I could look for titles. So if I know that, for example, one of my pages has been heavily duplicated throughout the site or a certain type, I can look for all the titles containing those and then filter out the dupes.

I can also do this for content length. Many folks will look at content length and say, “Hey, if there’s a page with fewer than 50 unique words on it in my blog, show that to me. I want to figure out why that is, and then I might want to do some work on those pages.”
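The three rules just described (a known URL pattern, a duplicate-content percentage, and a minimum unique word count) can be sketched with the Python standard library. This is not how Screaming Frog or Moz Analytics work internally; the `refer` parameter name, the function names, and the use of `difflib` as a similarity measure are assumptions for illustration.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse, parse_qs

def has_dupe_param(url, param="refer"):
    """True if the URL carries a query parameter known to create duplicates."""
    return param in parse_qs(urlparse(url).query)

def unique_word_count(text):
    """Number of distinct words, ignoring case (splits on whitespace only)."""
    return len({word.lower() for word in text.split()})

def similarity(text_a, text_b):
    """Rough similarity between two page texts, from 0.0 to 1.0."""
    return SequenceMatcher(None, text_a, text_b).ratio()

def is_probable_duplicate(text_a, text_b, threshold=0.80):
    """Flag pairs above the chosen threshold, e.g. 80% duplicate content."""
    return similarity(text_a, text_b) >= threshold

def is_thin(text, min_unique_words=50):
    """Flag pages with fewer than the chosen number of unique words."""
    return unique_word_count(text) < min_unique_words

print(has_dupe_param("https://example.com/page?refer=bounce"))
print(is_thin("Just a few words on this page"))
```

As with the threshold in the text, you would raise the duplicate cutoff to 0.90 or 0.95 on sites where most of each page is shared navigation and template.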

Ask the SERP providers (cautiously)

Then the last one that we can do for this identification process is Google and Bing Webmaster Tools/Search Console. They have existing filters and features that aren’t very malleable. We can’t do a whole lot with them, but they will show you potential site crawl issues, broken pages, sometimes dupe content. They’re not going to catch everything though. Part of this process is to proactively find things before Google and Bing find them and start considering them a problem on our site. So we may want to do some of this work before we go, “Oh, let’s just shove an XML sitemap to Google and let them crawl everything, and then they’ll tell us what’s broken.” A little risky.

Additional tips, tricks, and robots

A couple of additional tips. Analytics stats, like the ones from GA or Omniture or Webtrends, can totally mislead you, especially for pages with very few visits, where you just don’t have enough of a sample set to know how they’re performing, or ones that the engines haven’t indexed yet. So if something hasn’t been indexed or it just isn’t getting search traffic, it might show you misleading metrics about how users are engaging with it that could bias you in ways that you don’t want to be biased. So be aware of that. You can control for it generally by looking at other stats or by using these other methods.

When you’re doing this, the first thing you should do any time you identify cruft is remove it from your XML sitemaps. That’s just good hygiene, good practice. Oftentimes that alone is enough of a preventative measure to keep you from getting hurt here.
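If your sitemap is a plain XML file, pulling cruft URLs out of it can be scripted. The sketch below uses Python's standard `xml.etree.ElementTree` and the standard sitemap namespace; the function name and the example URLs are hypothetical, and a real site may generate its sitemap from a CMS instead, in which case you would exclude the pages there.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def remove_from_sitemap(sitemap_xml, cruft_urls):
    """Return sitemap XML with every <url> whose <loc> is in cruft_urls removed."""
    # Keep the default namespace unprefixed in the output.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and loc.text and loc.text.strip() in cruft_urls:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sitemap and cruft list (illustrative only).
sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/guide</loc></url>
  <url><loc>https://example.com/print/guide</loc></url>
</urlset>"""

cleaned = remove_from_sitemap(sitemap, {"https://example.com/print/guide"})
print(cleaned)
```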

However, there’s no one-size-fits-all methodology after the “don’t include it in your XML sitemap” step. If it’s a duplicate, you want to canonicalize it. I don’t want to delete all these pages maybe. Maybe I want to delete some of them, but I need to be thoughtful about that. Maybe they’re printer friendly pages. Maybe they’re pages that have a specific format. It’s a PDF version instead of an HTML version. Whatever it is, you want to identify those and probably canonicalize.

Is it useful to no one? Like literally, absolutely no one. You don’t want engines visiting. You don’t want people visiting it. There’s no channel that you care about that page getting traffic to. Well, you have two options. You can 301 it: if it’s already ranking for something or it’s on the topic of something, send it to the page that will perform well that you wish that traffic was going to. Or you can completely 404 it. Of course, if you’re having serious trouble or you need to remove it entirely from engines ASAP, you can use the 410, which tells engines the page is permanently gone. Just be careful with that.

Is it useful to some visitors, but not search engines? Like you don’t want searchers to find it in the engines, but if somebody goes and is paging through a bunch of pages and that kind of thing, okay, great, I can use noindex, follow for that in the meta robots tag of a page.

If there’s no reason bots should access it at all, like you don’t care about them following the links on it, this is a very rare use case, but there can be certain types of internal content that maybe you don’t want bots even trying to access, like a huge internal file system that particular kinds of your visitors might want to get access to but nobody else, you can use the robots.txt file to block crawlers from visiting it. Just be aware it can still get into the engines if it’s blocked in robots.txt. It just won’t show any description. They’ll say, “We are not showing a site description for this page because it’s blocked by robots.”
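Before you ship a robots.txt change like this, it is worth checking that the rule blocks what you think it blocks. A small sketch using Python's standard `urllib.robotparser`; the `/internal-files/` path and the example URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules blocking an internal file area.
rules = """
User-agent: *
Disallow: /internal-files/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# can_fetch() answers: may this user agent crawl this URL under these rules?
blocked = not parser.can_fetch("*", "https://example.com/internal-files/report.pdf")
allowed = parser.can_fetch("*", "https://example.com/guide")

print(f"internal file blocked: {blocked}, guide crawlable: {allowed}")
```

This only tests crawl access. As noted above, a blocked URL can still show up in the engines without a description.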

If the page is almost good, like it’s on the borderline between pink and green here, well just make it good. Fix it up. Make that page a winner, get it back in the engines, make sure it’s performing well, find all the pages like that have those problems, fix them up or consider recreating them and then 301’ing them over if you want to do that.
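The choices walked through above can be summarized as a small decision helper. This is only a sketch of the reasoning in this section: the `Page` fields, the function name, and the order in which the checks run are our own assumptions, and real pages will need human judgment.

```python
from dataclasses import dataclass

@dataclass
class Page:
    """Hypothetical traits you would fill in by hand or from an audit."""
    url: str
    is_duplicate: bool = False
    almost_good: bool = False
    useful_to_visitors: bool = True
    useful_in_search: bool = True
    bots_should_crawl: bool = True
    has_better_target: bool = False  # a stronger page the traffic should go to

def recommend_action(page):
    """Map a page's traits to one of the clean-up options discussed above."""
    if page.is_duplicate:
        return "canonicalize"
    if page.almost_good:
        return "improve the page"
    if not page.useful_to_visitors and not page.useful_in_search:
        # Useful to no one: redirect if a better page exists, else remove.
        return "301 redirect" if page.has_better_target else "404 or 410"
    if not page.bots_should_crawl:
        return "block in robots.txt"
    if not page.useful_in_search:
        return "meta robots noindex, follow"
    return "keep"

print(recommend_action(Page("/print/guide", is_duplicate=True)))
print(recommend_action(Page("/page/7", useful_in_search=False)))
```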

With this process, hopefully you can prevent yourself from getting hit by the potential penalties, or being algorithmically filtered, or just being identified as not that great a website. You want Google to consider your site as high quality as they possibly can. You want the same for your visitors, and this process can really help you do that.

Looking forward to the comments, and we’ll see you again next week for another edition of Whiteboard Friday. Take care.

New Study: Data Reveals 67% of Consumers are Influenced by Online Reviews

Posted by Dhinckley

Google processed over 1 trillion search queries in 2014. As Google Search continues to further integrate into our normal daily activities, those search results become increasingly important, especially when individuals are searching for information about a company or product.

To better understand just how much of an impact Google has on an individual’s purchasing decisions, we set up a research study with a group of 1,000 consumers through Google Consumer Surveys. The study investigates how individuals interact with Google and other major sites during the buying process.

Do searchers go beyond page 1 of Google?

We first sought to understand how deeply people went into the Google search results. We wanted to know if people tended to stop at page 1 of the search results, or if they dug deeper into page 2 and beyond. A better understanding of how many pages of search results are viewed provides insight into how many result pages we should monitor related to a brand or product.

When asked, 36% of respondents claimed to look through the first two pages or more of search results. But actual search data shows that in fewer than 2% of searches do individuals view results below the top five on the first page. Actual consumer behavior clearly differs from self-reported search activity.

Do Searchers Go Beyond Page 1 of Google?

Takeaway: People are willing to view as many as two pages of search results but rarely do so during normal search activities.

Are purchasing decisions affected by online reviews?

Google has integrated reviews into the Google+ Local initiative and often displays these reviews near the top of search results for businesses. Other review sites, such as Yelp and TripAdvisor, will also often rank near the top for search queries for a company or product. Because of the prevalence of review sites appearing in the search results for brands and products, we wanted a better understanding of how these reviews impacted consumers’ decision-making.

We asked participants, “When making a major purchase such as an appliance, a smart phone, or even a car, how important are online reviews in your decision-making?”

The results revealed that online reviews impact 67.7% of respondents’ purchasing decisions. More than half of the respondents (54.7%) admitted that online reviews are fairly, very, or absolutely an important part of their decision-making process.

Purchasing Decisions and Online Reviews

Takeaway: Companies need to take reviews seriously. Restaurant review stories receive all the press, but most companies will eventually have pages from review sites ranking for their names. Building a strong base of positive reviews now will help protect against any negative reviews down the road.

When do negative reviews cost your business customers?

Our research also uncovered that businesses risk losing as many as 22% of customers when just one negative article is found by users considering buying their product. If three negative articles pop up in a search query, the potential for lost customers increases to 59.2%. Have four or more negative articles about your company or product appearing in Google search results? You’re likely to lose 70% of potential customers.

Negative Articles and Sales

Takeaway: It is critical to keep page 1 of your Google search results clean of any negative content or reviews. Having just one negative review could cost you nearly a quarter of all potential customers who began researching your brand (which means they were likely deep in the conversion funnel).
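To make the arithmetic concrete, here is a minimal sketch that applies the loss rates quoted above to a number of prospects. The function name and the 1,000-prospect figure are hypothetical, and the rates are treated as upper-bound estimates ("as many as"). The study as quoted gives no figure for exactly two negative articles, so the sketch returns None for that case rather than guessing.

```python
from typing import Optional

# Loss rates as quoted in the study above, keyed by the number of
# negative articles found in the search results. No rate is quoted
# for exactly two articles, so that case is deliberately missing.
QUOTED_LOSS_RATES = {1: 0.22, 3: 0.592}
LOSS_RATE_FOUR_OR_MORE = 0.70


def estimated_lost_customers(prospects: int, negative_articles: int) -> Optional[int]:
    """Upper-bound estimate of prospects lost, or None if no rate is quoted."""
    if negative_articles < 1:
        raise ValueError("expects at least one negative article")
    if negative_articles >= 4:
        rate = LOSS_RATE_FOUR_OR_MORE
    else:
        rate = QUOTED_LOSS_RATES.get(negative_articles)
        if rate is None:
            return None  # the study quotes no figure for this count
    return round(prospects * rate)


# Hypothetical example: 1,000 prospects researching the brand.
for count in (1, 2, 3, 4):
    print(count, estimated_lost_customers(1000, count))
```

With one negative article, roughly 220 of every 1,000 prospects are at risk, which is the "nearly a quarter" mentioned in the takeaway.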

What sites do people visit before buying a product or service?

Google Search is just one of the sites that consumers can visit to research a brand or product. We thought it would be interesting to identify other popular consumer research sites.

Interestingly, most people didn’t seem to remember visiting any of the popular review sites. Instead, the brand site that got the most attention was Google+ Local reviews. Another noteworthy finding was that Amazon came in second, with half the selections that Google received. Finally, the stats show that more people look to Wikipedia for information about a company than to Yelp or TripAdvisor.

Prepurchase Research Sources

Takeaway: Brands should invest time and effort into building a strong community on Google+, which could lead to receiving more positive reviews on the social platform.

Online reviews impact the bottom line

The results of the study show that online reviews have a significant influence on the decision-making process of consumers. The data supports the fact that Internet users are generally willing to look at the first and second page of Google search results when searching for details about a product or company.

We can also conclude that online review sites like Google+ Local are heavily visited by potential customers looking for information, and the more negative content they find there, the less likely they will be to purchase your products or visit your business.

All this information paints a clear picture that what is included in Google search results for a company or product name will inevitably have an impact on the profitability of that company or product.

Internal marketing teams and public relations (PR) firms must consider the results that Google displays when they search for their company name or merchandise. Negative reviews, negative press, and other damaging feedback can have a lasting impact on a company’s ability to sell its products or services.

How to protect your company in Google search results

A PR or marketing team must be proactive to effectively protect a company’s online reputation. The following tactics can help prevent a company from suffering from deleterious online reviews:

  • First, identify if negative articles already exist on the first two pages of search results for a Google query of a company name or product (e.g., “Walmart”). This simple task should be conducted regularly. Google often shifts search results around, so a negative article—which typically attracts a higher click-through rate—is unfortunately likely to climb the rankings as individuals engage with the piece.
  • Next, monitor and analyze the current sentiment of reviews on popular review sites like Google+ and Amazon. Other sites, like Yelp or TripAdvisor, should also be checked often, as they can quickly climb Google search results. Do not attempt to artificially alter the results, but instead look for best practices on how to improve reviews on Yelp and other review sites and implement them. The goal is to naturally improve the general buzz around your business.
  • If negative articles exist, there are solutions for improvement. A company’s marketing and public relations team may benefit by highlighting and/or generating positive press and reviews about the product or service through SEO and ORM efforts. By gaining control of the search results for your company or product, you will be in control of the main message that individuals see when looking for more information about your business. That’s done by working to ensure prospects and customers enjoy a satisfying experience when interacting with your brand, whether online or offline.

Being proactive with a brand’s reputation, as viewed in the Google search results and on review sites, does have an impact on the bottom line.

As we see in the data, people are less likely to make a purchase as the number of negative reviews in Google search results increases.

By actively ensuring that honest, positive reviews appear, you can win over potential customers.

Source: moz

A Beginner’s Guide to Google Search Console

Posted by Angela_Petteys

If the name “Google Webmaster Tools” rings a bell for you, then you might already have an idea of what Google Search Console is. Since Google Webmaster Tools (GWT) has become a valuable resource for so many different types of people besides webmasters—marketing professionals, SEOs, designers, business owners, and app developers, to name a few—Google decided to change its name in May of 2015 to be more inclusive of its diverse group of users.

If you aren’t familiar with GWT or Google Search Console, let’s head back to square one. Google Search Console is a free service that lets you learn a great deal of information about your website and the people who visit it. You can use it to find out things like how many people are visiting your site and how they are finding it, whether more people are visiting your site on a mobile device or desktop computer, and which pages on your site are the most popular. It can also help you find and fix website errors, submit a sitemap, and create and check a robots.txt file.

Ready to start taking advantage of all that Google Search Console has to offer? Let’s do this.

Adding and verifying a site in Google Search Console

If you’re new to Google Search Console, you’ll need to add and verify your site(s) before you can do anything else. Adding and verifying your site in Search Console proves to Google that you’re either a site’s owner, webmaster, or other authorized user. After all, Search Console provides you with all sorts of incredibly detailed information and insights about a site’s performance. Google doesn’t want to hand that kind of information over to anybody who asks for it.

Adding a site to Search Console is a very simple process. First, log into your Search Console account. Once you’re logged in, you’ll see a box next to a red button which says “Add Property.”

Add a Site to Search Console.png

Enter the URL of the site you’re trying to add in the box and click “Add Property.” Congratulations, your site is now added to your Search Console account!

Next, you will be asked to verify your site. There are a few different ways you can go about this. Which method will work best for you depends on whether or not you have experience working with HTML, if you have access to upload files to the site, the size of your site, and whether or not you have other Google programs connected to your site. If this sounds overwhelming, don’t worry—we’ll help you figure it out.

Adding an HTML tag

This verification method is best for users and site owners who have experience working with HTML code.

Manage Property.png

From the Search Console dashboard, select “Manage Property,” then “Verify this property.” If the “HTML Tag” option does not appear under “Recommended method,” then you should click on the “Alternate methods” tab and select “HTML tag.” This will provide you with the HTML code you’ll need for verification.

Verify HTML Tag Edit.png

Copy the code and use your HTML editor to open the code for your site’s homepage. Paste the code provided within the <Head> section of the HTML code. If your site already has a meta tag or other code in the <Head> section, it doesn’t matter where the verification code is placed in relation to the other code; it simply needs to be in the <Head> section. If your site doesn’t have a <Head> section, you can create one for the sake of verifying the site.

Once the verification code has been added, save and publish the updated code, and open your site’s homepage. From there, view the site’s source code. The verification code should be visible in the <Head> section.

Once you’re sure the code is added to your site’s homepage, go back to Search Console and click “Verify.” Google will then check your site’s code for the verification code. If the code is found, you will see a screen letting you know the site has been verified. If not, you will be provided with information about the errors it encountered.

When your site has been verified by Search Console, do not remove the verification code from your site. If the code is removed, it will cause your site to become unverified.
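If you’d rather not eyeball the source code, the check can be sketched in a few lines of Python. This is only an illustration, not Google’s own verification logic: it assumes the verification code is a meta tag named "google-site-verification", and the sample page, the class and function names, and the PLACEHOLDER_TOKEN value are all made up for the example.

```python
from html.parser import HTMLParser


class VerificationTagFinder(HTMLParser):
    """Collects google-site-verification meta tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <Head> and <head> both match.
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            name = (attributes.get("name") or "").lower()
            if name == "google-site-verification":
                self.tokens.append(attributes.get("content"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_verification_tokens(html: str) -> list:
    """Return the content values of verification tags placed in <head>."""
    finder = VerificationTagFinder()
    finder.feed(html)
    return finder.tokens


# Hypothetical homepage source with a placeholder token.
sample_page = """<html><head>
<title>Example</title>
<meta name="google-site-verification" content="PLACEHOLDER_TOKEN" />
</head><body><p>Hello</p></body></html>"""

print(find_verification_tokens(sample_page))
```

An empty list means the tag is missing or sits outside the <Head> section, which is worth fixing before you click “Verify.”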

Uploading an HTML file

To use this method, you must be able to upload files to a site’s root directory.

From the Search Console dashboard, select “Manage site,” then “Verify this site.” If “HTML file upload” is not listed under “Recommended method,” it should be listed under the “Alternate method” tab.

HTML File Method.png

When you select this method, you will be asked to download an HTML file. Download it, then upload it to the specified location. Do not make any changes to the content of the file or the filename; the file needs to be kept exactly the same. If it is changed, Search Console will not be able to verify the site.

After the HTML file has been uploaded, go back to Search Console and click “Verify.” If everything has been uploaded correctly, you will see a page letting you know the site has been verified.

Once you have verified your site using this method, do not delete the HTML file from your site. This will cause your site to become unverified.

Verifying via domain name provider

The domain name provider is the company you purchased a domain from or where your website is hosted. When you verify using your domain name provider, it not only proves you’re the owner of the main domain, but that you also own all of the subdomains and subdirectories associated with it. This is an excellent option if you have a large website.

From the Search Console dashboard, select “Manage site,” then “Verify this site.” If you don’t see the “Domain name provider” option listed under “Recommended method,” look under the “Alternate method” tab.

Domain Name Provider Method.png

When you select “Domain name provider,” you will be asked to choose your domain name provider from a list of commonly used providers. If your provider is not on this list, choose “Other” and you will be given instructions on how to create a DNS TXT record for your provider. If a DNS TXT record doesn’t work for your provider, you will have the option of creating a CNAME record instead.

Adding Google Analytics code

If you already use Google Analytics (GA) to monitor your site’s traffic, this could be the easiest option for you. But first, you’ll need to be able to check the site’s HTML code to make sure the GA tracking code is placed within the <Head> section of your homepage’s code, not in the <Body> section. If the GA code is not already in the <Head> section, you’ll need to move it there for this method to work.

From the Search Console dashboard, select “Manage site,” then “Verify this site.” If you don’t see the “Google Analytics tracking code” option under the “Recommended method,” look under the “Alternate method” tab. When you select “Google Analytics tracking method,” you’ll be provided with a series of instructions to follow.

Google Analytics Code Method 2.png

Once your site has been verified, do not remove the GA code from your site, or it will cause your site to become unverified.

Using Google Tag Manager

If you already use Google Tag Manager (GTM) for your site, this might be the easiest way to verify your site. If you’re going to try this method, you need to have “View, Edit, and Manage” permissions enabled for your account in GTM. Before trying this method, look at your site’s HTML code to make sure the GTM code is placed immediately after your site’s <Body> tag.

From the Search Console dashboard, select “Manage site,” then “Verify this site.” If you don’t see the “Google Tag Manager” option listed under “Recommended method,” it should appear under “Alternate method.”

Google Tag Manager Method.png

Select “Google Tag Manager” and click “Verify.” If the Google Tag Manager code is found, you should see a screen letting you know your site has been verified.

Once your site is verified, do not remove the GTM code from your site, or your site will become unverified.

How to link Google Analytics with Google Search Console

Google Analytics and Google Search Console might seem like they offer the same information, but there are some key differences between these two Google products. GA is more about who is visiting your site: how many visitors you’re getting, how they’re getting to your site, how much time they’re spending on your site, and where your visitors are coming from (geographically speaking). Google Search Console, in contrast, is geared more toward internal information: who is linking to you, whether there is malware or other problems on your site, and which keyword queries your site is appearing for in search results. Analytics and Search Console also do not treat some information in exactly the same way, so even if you think you’re looking at the same report, you might not be getting the exact same information in both places.

To get the most out of the information provided by Search Console and GA, you can link accounts for each one together. Having these two tools linked will integrate the data from both sources to provide you with additional reports that you will only be able to access once you’ve done that. So, let’s get started:

Has your site been added and verified in Search Console? If not, you’ll need to do that before you can continue.

From the Search Console dashboard, click on the site you’re trying to connect. In the upper righthand corner, you’ll see a gear icon. Click on it, then choose “Google Analytics Property.”

Google Analytics Property.jpg

This will bring you to a list of Google Analytics accounts associated with your Google account. All you have to do is choose the desired GA account and hit “Save.” Easy, right? That’s all it takes to start getting the most out of Search Console and Analytics.

Adding a sitemap

Sitemaps are files that give search engines and web crawlers important information about how your site is organized and the type of content available there. Sitemaps can include metadata, with details about your site such as information about images and video content, and how often your site is updated.

By submitting your sitemap to Google Search Console, you’re making Google’s job easier by ensuring they have the information they need to do their job more efficiently. Submitting a sitemap isn’t mandatory, though, and your site won’t be penalized if you don’t submit a sitemap. But there’s certainly no harm in submitting one, especially if your site is very new and not many other sites are linking to it, if you have a very large website, or if your site has many pages that aren’t thoroughly linked together.
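If you don’t have a sitemap yet, here is a rough sketch of what one contains and how it could be generated. It assumes the XML format defined by the sitemaps.org protocol; the build_sitemap function, the example.com URLs, and the dates are hypothetical, and most content management systems or plugins will produce this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from dicts with 'loc' and optional metadata fields."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional metadata: last modification date and update frequency.
        for field in ("lastmod", "changefreq"):
            if field in page:
                ET.SubElement(url, field).text = page[field]
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages for illustration only.
pages = [
    {"loc": "https://www.example.com/", "lastmod": "2015-10-01", "changefreq": "weekly"},
    {"loc": "https://www.example.com/blog/"},
]
print(build_sitemap(pages))
```

Each url entry needs only a loc; the lastmod and changefreq fields are the kind of optional metadata mentioned above.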

Before you can submit a sitemap to Search Console, your site needs to be added and verified in Search Console. If you haven’t already done so, go ahead and do that now.

From your Search Console dashboard, select the site you want to submit a sitemap for. On the left, you’ll see an option called “Crawl.” Under “Crawl,” there will be an option marked “Sitemaps.”

Crawl Sitemap.png

Click on “Sitemaps.” There will be a button marked “Add/Test Sitemap” in the upper righthand corner.

Add Test Sitemap 4.png

This will bring up a box with a space to add text to it.

Add Test Sitemap Submit.png

Type “system/feeds/sitemap” in that box and hit “Submit sitemap.” Congratulations, you have now submitted a sitemap!

Checking a robots.txt file

Having a website doesn’t necessarily mean you want to have all of its pages or directories indexed by search engines. If there are certain things on your site you’d like to keep out of search engines, you can accomplish this by using a robots.txt file. A robots.txt file placed in the root of your site tells search engine robots (i.e., web crawlers) what you do and do not want indexed by using commands known as the Robots Exclusion Standard.

It’s important to note that robots.txt files aren’t necessarily guaranteed to be 100% effective in keeping things away from web crawlers. The commands in robots.txt files are instructions, and although the crawlers used by credible search engines like Google will accept them, it’s entirely possible that a less reputable crawler will not. It’s also entirely possible for different web crawlers to interpret commands differently. Robots.txt files also will not stop other websites from linking to your content, even if you don’t want it indexed.

If you want to check your robots.txt file to see exactly what it is and isn’t allowing, log into Search Console and select the site whose robots.txt file you want to check. Haven’t already added or verified your site in Search Console? Do that first.

Search Console Crawl Robots 2.png

On the lefthand side of the screen, you’ll see the option “Crawl.” Click on it and choose “robots.txt Tester.” The robots.txt Tester tool will let you look at your robots.txt file, make changes to it, and it will alert you to any errors it finds. You can also choose from a selection of Google’s user-agents (names for robots/crawlers), enter a URL you wish to allow/disallow, and run a test to see if the URL is recognized by that crawler.
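You can run a similar allow/disallow check locally before touching the live file. The sketch below uses Python’s standard-library robots.txt parser; the rules, the user-agent string, and the example.com URLs are hypothetical. Keep in mind that this parser is its own interpretation of the standard, so it may not match Google’s tester in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep every crawler out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let this user-agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in (
    "https://www.example.com/blog/post",
    "https://www.example.com/private/report.html",
):
    print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

If a URL you want indexed comes back as disallowed, that is a sign one of your rules is broader than you intended.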

Robots txt Tester Tool.png

If you make any changes to your robots.txt file using Google’s robots.txt tester, the changes will not be automatically reflected in the robots.txt file hosted on your site. Luckily, it’s pretty easy to update it yourself. Once your robots.txt file is how you want it, hit the “Submit” button underneath the editing box in the lower righthand corner. This will give you the option to download your updated robots.txt file. Simply upload that to your site in the same directory where your old one was. Your robots.txt file should always be named “robots.txt” and the file needs to be saved in the root of your domain, not in a subdirectory.

Back on the robots.txt testing tool, hit “Verify live version” to make sure the correct file is on your site. Everything correct? Good! Click “Submit live version” to let Google know you’ve updated your robots.txt file and they should crawl it. If not, re-upload the new robots.txt file to your site and try again.

Fetch as Google and submit to index

If you’ve made significant changes to a website, the fastest way to get the updates indexed by Google is to submit it manually. This will allow any changes done to things such as on-page content or title tags to appear in search results as soon as possible.

The first step is to sign in to Google Search Console. Next, select the site you need to submit. If the website does not use the ‘www.’ prefix, then make sure you click on the entry without it (or vice versa).

On the lefthand side of the screen, you should see a “Crawl” option. Click on it, then choose “Fetch as Google.”

Fetch as Google Edit.png

Clicking on “Fetch as Google” should bring you to a screen that looks something like this:

Fetch as Google 2.png

If you need to fetch the entire website (such as after a major site-wide update, or if the homepage has had a lot of remodeling done) then leave the center box blank. Otherwise, use it to enter the full address of the page you need indexed. Once you enter the page you need indexed, click the “Fetch and Render” button. Fetching might take a few minutes, depending on the number/size of pages being fetched.

After the indexing has finished, there will be a “Submit to Index” button that appears in the results listing at the bottom (near the “Complete” status). You will be given the option to either “Crawl Only This URL,” which is the option you want if you’re only fetching/submitting one specific page, or “Crawl This URL and its Direct Links,” if you need to index the entire site.

Click this, wait for the indexing to complete, and you’re done! Google now has sent its search bots to catalog the new content on your page, and the changes should appear in Google within the next few days.

Site errors in Google Search Console

Nobody wants to have something wrong on their website, but sometimes you might not realize there’s a problem unless someone tells you. Instead of waiting for someone to tell you about a problem, Google Search Console can immediately notify you of any errors it finds on your site.

If you want to check a site for internal errors, select the site you’d like to check. On the lefthand side of the screen, click on “Crawl,” then select “Crawl Errors.”

Site Errors Tool.png

You will then be taken directly to the Crawl Errors page, which displays any site or URL errors found by Google’s bots while indexing the page. You will see something like this:

Errors Page.png

Any URL errors found will be displayed at the bottom. Click on any of the errors for a description of the error encountered and further details.

Error Details.png

Record any encountered errors, including screenshots if appropriate. If you aren’t responsible for handling site errors, notify the person who is so they can correct the problem(s).

We hope this guide has been helpful in acquainting you with Google Search Console. Now that everything is set up and verified, you can start taking in all the information that Google Search Console has for you.

Source: moz

Traffic and Engagement Metrics and Their Correlation to Google Rankings

Posted by Royh

When Moz undertook this year’s Ranking Correlation Study (Ranking Factors), there was a desire to include data points never before studied. Fortunately, SimilarWeb had exactly what was needed. For the first time, Moz was able to measure ranking correlations with both traffic and engagement metrics.

Using Moz’s ranking data on over 200,000 domains, combined with multiple SimilarWeb data points—including traffic, page views, bounce rate, time on site, and rank—the Search Ranking Factors study was able to measure how these metrics corresponded to higher rankings.

These metrics differ from the traditional SEO parameters Moz has measured in the past in that they are primarily user-based metrics. This means that they vary based on how users interact with the individual websites, as opposed to static features such as title tag length. We’ll find these user-based metrics important as we learn how search engines may use them to rank webpages, as illustrated in this excellent post by Dan Petrovic.

Every marketer and SEO professional wants to know if there is a correlation between web search ranking results and the website’s actual traffic. Here, we’ll examine the relationship between website rankings and traffic engagement to see which metrics have the biggest correlation to rankings.

You can view the results below:

Traffic correlated to higher rankings

For the study, we examined both direct and organic search visits over a three-month period. SimilarWeb’s traffic results show that there is generally a high correlation between website visits and Google’s search rankings.

Put simply, the more traffic a site received, the higher it tended to rank. Practically speaking, this means that you would expect to see sites like Amazon and Wikipedia higher up in the results, while smaller sites tend to rank slightly worse.

This doesn’t mean that Google uses traffic and user engagement metrics as an actual ranking factor in its search algorithm, but it does show that a relationship exists. Hypothetically, we can think of many reasons why this might be the case:

  • A “brand” bias, meaning that Google may wish to treat trusted, popular, and established brands more favorably.
  • Possible user-based ranking signals (described by Dan here) where users are more inclined to choose recognizable brands in search results, which in theory could push their rankings higher.
  • Which came first—the chicken or the egg? Alternatively, it could simply be the case that high-ranking websites become popular simply because they are ranking highly.

Regardless of the exact cause, it seems logical that the more you improve your website’s visibility, trust, and recognition, the better you may perform in search results.

Engagement: Time on site, bounce rate, and page views

While not as large as the traffic correlations, we also found a positive correlation between a website’s user engagement and its rank in Google search results. For the study, we examined three different engagement metrics from SimilarWeb.

  • Time on site: 0.12 is not considered a strong correlation by any means within this study, but it does suggest there may be a slight relationship between how long a visitor spends on a particular site and its ranking in Google.
  • Page views: Similar to time on site, the study found a small correlation of 0.10 between the number of pages a visitor views and higher rankings.
  • Bounce rate: At first glance, with a correlation of -0.08, the correlation between bounce rate and rankings may seem out-of-whack, but this is not the case. Keep in mind that lower bounce rate is often a good indication of user engagement. Therefore, we find as bounce rates rise (something we often try to avoid), rankings tend to drop, and vice-versa.

This means that sites with lower bounce rates, longer time-on-site metrics, and more page views—some of the data points that SimilarWeb measures—tend to rank higher in Google search results.
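For readers who want to see how a figure like -0.08 comes about, here is a minimal sketch of a rank correlation on toy data. It assumes a Spearman-style correlation, which this section does not confirm as the study’s method, and the positions and bounce rates are invented for the example. Positions are negated so that a positive result means "more of the metric goes with better rankings," which is also an assumption about the sign convention.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranked values."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy data (invented): search positions 1-5 and each site's bounce rate.
positions = [1, 2, 3, 4, 5]
bounce_rates = [0.30, 0.45, 0.40, 0.55, 0.60]

# Negate positions so that a larger number means a better ranking.
ranking_quality = [-p for p in positions]
print(spearman(bounce_rates, ranking_quality))
```

The toy data is far more orderly than real search results, so the result is strongly negative; across thousands of real queries, values close to zero such as -0.08 are what you would expect to see.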

While these individual correlations aren’t large, collectively they do lend credence to the idea that user engagement metrics can matter to rankings.

To be clear, this doesn’t mean to imply that Google or other search engines use metrics like bounce rate or click-through rate directly in their algorithm. Instead, a better way to think of this is that Google uses a number of user inputs to measure relevance, user satisfaction, and quality of results.

This is exactly the same argument the SEO community is currently debating over click-through rate and its possible use by Google as a ranking signal. For an excellent, well-balanced view of the debate, we highly recommend reading AJ Kohn’s thoughts and analysis.

It could be that Google is using Panda-like engagement signals. A negative correlation for bounce rate means that healthier sites, which tend to have lower bounce rates, also tend to rank higher. Similarly, sites where users spend more time and view more pages tend to rank higher in Google SERPs.

Global Rank correlations

SimilarWeb’s Global Rank is calculated by data aggregation, and is based on a combination of website traffic from six different sources and user engagement levels. We include engagement metrics to make sure that we’re portraying an accurate picture of the market.

If the website has a lower Global Rank on SimilarWeb, then the website will generally have more visitors and good user engagement.

As Global Rank is a combination of traffic and engagement metrics, it’s no surprise that it was one of the highest correlated features of the study. Again, even though the correlation is negative at -0.24, a low Global Rank is actually a good thing. A website with a Global Rank of 1 would be the highest-rated site on the web. This means that the lower the Global Rank, the better the relationship with higher rankings.

As a side note, SimilarWeb’s Website Ranking provides insights for estimating any website’s value and benchmarking your site against it. You can use its tables to find out who’s leading per industry category and/or country.


The Moz Search Engine Ranking Factors study examined the relationship between web search results and links, social media signals, visitor traffic and usage signals, and on-page factors. The study compiled datasets and conducted search result queries in English with Google’s search engine, focusing exclusively on US search results.

The dataset included a list of 16,521 queries taken from 22 top-level Google AdWords categories. Keywords were taken from head, middle, and tail queries. The searches ranged from infrequent (fewer than 1,000 queries per month), to frequent (more than 20,000 per month), to enormously frequent, with keywords being searched more than one million times per month!

The top 50 US search results for each query were pulled from the datasets in a location- and personalization-agnostic manner.

SimilarWeb checked the traffic and engagement stats of more than 200,000 websites, and we have analytics on more than 90% of them. After we pulled the traffic data, we checked for a correlation using keywords from the Google AdWords tool to see what effect metrics like search traffic, time on site, page views, and bounce rates—especially with organic searches—have upon Google’s rankings.


We found a positive correlation between websites that showed highly engaging user traffic metrics on SimilarWeb’s digital measurement platform, and higher placement on Google search engine results pages. SimilarWeb also found that a brand’s popularity correlates to higher placement results in Google searches.

With all the recent talk of user engagement metrics and rankings, we’d love to hear your take. Have you observed any relationship, improvement, or drop in rankings based on engagement? Share your thoughts in the comments below.

Source: moz