Google’s John Mueller addressed the topic of page pruning again, but this time explained that Google can only assess overall site quality based on the pages of that site that are in its index. So if you block Google from crawling and indexing sections of your site via noindex or nofollow, Google will not factor the pages that are not in its index into the overall site quality.
John said this at the 20:05 mark in a video hangout from earlier this month. The question was:
About 15% of our crawlable pages have a noindex, nofollow tag to prevent duplicate content and other low-quality pages from being indexed. Could this affect the overall site quality, or does Google only consider the indexed pages when evaluating the quality of the site?
Yes. We only look at the indexed pages when it comes to understanding the quality of a web site.
But John went on to explain that in this case, it probably makes more sense not to block the pages, but instead to use the rel=canonical attribute to point those signals at a single page, rather than just killing off the page completely in Google. John added:
In general though, so one thing maybe just taking a little step back here, you mentioned you’re using this noindex as well for duplicate content. In general, I’d recommend using a rel canonical for duplicate content as opposed to a noindex. With a noindex you’re telling us this page shouldn’t be indexed at all. With a canonical you’re telling us this page is basically the same as this other page, yeah. And that helps us because then we can take all the signals that we have for both of these pages and combine them into one. Whereas if you just have a noindex, or if you block it with robots.txt, then all the signals that are associated with that page that’s blocked or has a noindex on it are essentially lost. So if someone were to link to that page and you have it set to noindex, well, they’re linking to nowhere. Whereas if you had a rel canonical, we would see that link going to their page, follow the rel canonical to the page you’d like to have indexed, and use that one for indexing.
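To make John’s distinction concrete, here is what the two approaches look like in a page’s HTML head. This is a minimal sketch; the canonical URL shown is a hypothetical placeholder:

```html
<!-- Option 1: noindex, nofollow — tells Google not to index this page
     at all. Links pointing to this page effectively lead nowhere,
     so their signals are lost. -->
<meta name="robots" content="noindex, nofollow">

<!-- Option 2: rel=canonical — tells Google this page is essentially a
     duplicate of another page, so signals from both pages can be
     consolidated onto the canonical URL.
     (example.com URL is a hypothetical placeholder.) -->
<link rel="canonical" href="https://www.example.com/preferred-page/">
```

Per John’s advice above, for duplicate content the second option lets Google combine the signals rather than discard them.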
Glenn Gabe has a terrific tweet and GIF summing it all up:
Via @johnmu: Google only looks at indexed pages when evaluating quality for a site. You can nuke or improve low-quality content, but for duplicate content, use rel canonical instead of noindex. Then Google can pass all signals to the canonical page: https://t.co/OZ2cWebV3J pic.twitter.com/idwpda74Bp