Sunday, December 22, 2024
spot_img

Top 5 This Week

Related Posts

Technical SEO testing: How Googlebot handles iframes

Earlier this year at SMX Advanced, I presented results from our Peak Ace test lab. These tests shed some light on several technical implementation points and how Googlebot would deal with them. 

One of my favorite tests examined Google’s indexing of iFramed URLs and their content. In my SMX Advanced presentation, I touched on various scenarios that may lead Google to index the content inside an iFrame, while “assigning” that content to its parent URL.

iFrame content will be attributed to its parent URL post-render.

The parent URL can, in some instances, rank for content that only exists in the iFramed URL and not in the parent URL.

Post-render, the parent page can now be found for content within the iFrame.Post-render, the parent page can now be found for content within the iFrame.

Naturally, this excited people – and all sorts of follow-up questions arose. Here are a few of them with my answers.

In the iFrame test, was the iFramed content coming from the same domain or a different one? 

My example showed two URLs that live on the same domain: domain.com/test.html would iFrame domain.com/tobeframedA.html, so that test.html could rank for content that only exists in tobeframedA.html

The same also works for externaldomain.com/tobeframedB.html – which can still cause test.html to rank for content only present in tobeframedB.html, as well as for iFrames residing on subdomains. We tested every combination we could think of and concluded that it made no difference where the iFrame content was hosted.

If you want to prevent someone from loading (and ranking for) your content in an iFrame, it would be a good idea to look into the X-Frame-Options Header. This indicates whether a browser should be allowed to render a page in an iFrame. 

If we were to use iFrames with a no-indexed content page, would the parent page still rank for the listed content with the intent to improve page speed?

As soon as the iFramed URL contains a meta robots noindex directive, the parent URL won’t be able to rank for the content from the iFramed URL.

iFramed URL containing meta robots noindex directive.iFramed URL containing meta robots noindex directive.

The same is true if you iFrame a URL that would be served with an X-Robots noindex header directive or is actively blocked using robots.txt.

As far as page speed is concerned, iFrames support the loading="lazy" attribute, which would defer offscreen iFrames from being loaded until a user scrolls near them. This is an elegant solution for speeding up loading times for URLs that depend on iFramed content.

Does Google give full value to semi-hidden content (content that typically comes after ‘Read more’)?

There doesn’t seem to be too much love for using “Read more” functionality within the ranks of Google. John Mueller went on record a couple of times here and here, questioning the use of the functionality in its entirety. Mueller added, “I don’t think you’d see a noticeable, direct change in SEO, […]”. 

When we tested it, the purpose of the test was to understand what difference the technical implementation could potentially make – and if, in general, content behind a “Read more” would be indexed (if correctly set up). 

The short answer: whether or not it was visible, the content would be indexed, found and returned.

However, content that was invisible during loading didn’t get highlighted in the snippet. The technical implementation didn’t make a difference (as long as the content was part of the HTML DOM at load), leaving you free to use display:none, opacity:0, visibility:hidden, etc.

That said, in my opinion, it is impossible – due to various factors outside of our control – to create a test setup that (including results) could provide an accurate answer regarding the “full value” part of the question. 

Did you mention that duplication in certain areas of the content can be fixed by CSS implementation since it is not indexed?

I did present some behavior that I find fairly interesting regarding CSS selectors. What technically happens is that selectors such as ::before create a pseudo element that is the first child of the selected element. In practice, this is often used to add cosmetic content to an HTML element. 

This could also be useful from an SEO point of view because Googlebot seems to treat this just as it would treat Chrome on desktop/smartphone. The rendered DOM remains unchanged (which is to be expected since it’s a pseudo class). As a result, content from within said selectors won’t be indexed.

Content CSS Selectors Not IndexedContent CSS Selectors Not Indexed

So, ultimately you could use this to prevent certain content from being indexed without keeping it from being displayed on the website. Maybe you have to display certain content that gets classified as “boilerplate” (e.g., shipping info, or legal info) or you want to create a certain content footprint. This opens up a great many possibilities to explore further.

Watch: Technical SEO testing in 2022: Separating fact from fiction

Below is the complete video of my SMX Advanced presentation.


Contributing authors are invited to create content for Search Engine Land and are chosen for their expertise and contribution to the search community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. The opinions they express are their own.


Popular Articles