Obey the Testing Goat!

How to get Selenium to wait for page load after a click

Oft-heard is the forlorn cry...

Every so often you get bitten by a weird behaviour in one of your Selenium tests. You tell it to click a link, and then you ask it something about the new page, and it returns you something from the old page:

old_value = browser.find_element_by_id('my-id').text
browser.find_element_by_link_text('my link').click()
new_value = browser.find_element_by_id('my-id').text
assert new_value != old_value ## fails unexpectedly

(There's another example, in chapter 20 of my book)

You scratch your head, and eventually conclude Selenium must be fetching the element from the old page. "Why would it do that?!", you exclaim in a programmer-rage, "In real life, when you click on a link, you see the browser starts to load a new page, and you wait for it to load, right? That's obviously what you'd want Selenium to do too, and it should be totally trivial to implement!"

"Selenium should just wait until the page has completed loading after you click!"

browser.find_element_by_link_text('my link').click() # should just block until the next page has loaded

... with a sane timeout perhaps. There's even a document.readyState API for checking on whether a page has loaded! Grrr...

The thing is that, from Selenium's point of view, it's not that simple. (I'm grateful to David from Mozilla (@AutomatedTester) for patiently explaining this to me, more than once.)

You see, Selenium has no way of telling whether you've asked it to click on a "real" hyperlink that goes to a new URL, or whether the link goes to the same page, or whether the click is going to be intercepted by some sort of JavaScript to do some rich UI stuff on the same page.

More than that, since Selenium webdriver has become more advanced, clicks are much more like "real" clicks. That has the benefit of making our tests more realistic, but it also means it's hard for Selenium to track the impact that a click has on the browser's internals. It might try to poll the browser for its page-loaded status immediately after clicking, but that's open to a race condition: the browser was multitasking, hasn't quite got round to dealing with the click yet, and gives you the .readyState of the old page.
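
To make that race concrete, here's a toy simulation in plain Python, with no Selenium involved. FakeBrowser and all of its methods are hypothetical stand-ins invented for this sketch; a real browser is far messier, but the ordering problem is the same.

```python
class FakeBrowser:
    """Hypothetical stand-in for a browser; not part of Selenium."""

    def __init__(self):
        self.page = 'old page'
        self.ready_state = 'complete'  # the old page finished loading long ago
        self._pending_click = False

    def click_link(self):
        # The click is only queued; the browser has not acted on it yet.
        self._pending_click = True

    def process_events(self):
        # Later, the browser gets round to the click and starts navigating.
        if self._pending_click:
            self._pending_click = False
            self.page = 'new page'
            self.ready_state = 'loading'

    def finish_loading(self):
        self.ready_state = 'complete'


browser = FakeBrowser()
browser.click_link()

# Polling immediately after the click: we are still looking at the old page.
state_right_after_click = (browser.page, browser.ready_state)

browser.process_events()
state_while_loading = (browser.page, browser.ready_state)

browser.finish_loading()
state_after_load = (browser.page, browser.ready_state)

print(state_right_after_click)  # ('old page', 'complete')
print(state_while_loading)      # ('new page', 'loading')
print(state_after_load)         # ('new page', 'complete')
```

The first and last readings both say 'complete', so a check on the ready state alone can't tell "hasn't started yet" apart from "has finished".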

So, instead, Selenium does its best. Calling implicitly_wait will at least put a little retry loop in if you try to fetch an element that doesn't exist on the old page:

browser.implicitly_wait(3)
old_value = browser.find_element_by_id('thing-on-old-page').text
browser.find_element_by_link_text('my link').click()
new_value = browser.find_element_by_id('thing-on-new-page').text # will block for up to 3 seconds until thing-on-new-page appears
assert new_value != old_value

But the problem comes when #thing-on-new-page also exists on the old page. So what to do?

The "recommended" solution is an explicit wait:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions

old_value = browser.find_element_by_id('thing-on-old-page').text
browser.find_element_by_link_text('my link').click()
WebDriverWait(browser, 3).until(
    expected_conditions.text_to_be_present_in_element(
        (By.ID, 'thing-on-new-page'),
        'expected new text'
    )
)

Several problems with that though:

HIDEOUSLY UGLY *
It's not generic -- even if I do write a nice wrapper, it's tedious to have to call it every time I click on a thing, specifying a different other thing to wait for each time
And it won't work for the case when I want to check that some text stays the same between page loads.

Really, I just want a reliable way of waiting until the page has finished loading after I click on a thing. I totally understand that David and pals aren't going to provide that for me by default, because they can't tell what's a JavaScript click and what's a click that goes to a new page. But I know which is which. So how to do it?

Some things that won't work

The naive attempt would be something like this:

import time


def wait_for(condition_function):
    start_time = time.time()
    while time.time() < start_time + 3:
        if condition_function():
            return True
        else:
            time.sleep(0.1)
    raise Exception(
        'Timeout waiting for {}'.format(condition_function.__name__)
    )


def click_through_to_new_page(link_text):
    browser.find_element_by_link_text(link_text).click()

    def page_has_loaded():
        page_state = browser.execute_script(
            'return document.readyState;'
        ) 
        return page_state == 'complete'

    wait_for(page_has_loaded)

The wait_for helper function is good, but unfortunately click_through_to_new_page is open to the race condition where we manage to execute the script in the old page, before the browser has started processing the click, and page_has_loaded just returns true straight away.
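
The polling helper can be tried out on its own, without a browser. The sketch below is a variation on wait_for rather than the version above: the timeout and poll_interval parameters, the use of time.monotonic, and the TimeoutError are all assumptions added for illustration, and the two condition functions are made-up examples.

```python
import time


def wait_for(condition_function, timeout=3, poll_interval=0.1):
    """Poll condition_function until it returns a truthy value,
    or raise TimeoutError once `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition_function():
            return True
        time.sleep(poll_interval)
    raise TimeoutError(
        'Timeout waiting for {}'.format(condition_function.__name__)
    )


# Demo with plain Python conditions instead of a browser.
calls = []


def third_time_lucky():
    # Becomes true on the third poll.
    calls.append(1)
    return len(calls) >= 3


def never_true():
    return False


result = wait_for(third_time_lucky, timeout=1, poll_interval=0.01)

try:
    wait_for(never_true, timeout=0.05, poll_interval=0.01)
    timed_out = False
    message = None
except TimeoutError as e:
    timed_out = True
    message = str(e)
```

Using the function's __name__ in the error message is why it pays to give condition functions descriptive names.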

Our current working solution

Full credit to @ThomasMarks for coming up with this: if you keep some references to elements from the old page lying around, then they will become stale once the DOM refreshes, and stale elements cause Selenium to raise a StaleElementReferenceException if you try to interact with them. So just poll one until you get an error. Bulletproof!

from selenium.common.exceptions import StaleElementReferenceException


def click_through_to_new_page(link_text):
    link = browser.find_element_by_link_text(link_text)
    link.click()

    def link_has_gone_stale():
        try:
            # poll the link with an arbitrary call
            link.find_elements_by_id('doesnt-matter') 
            return False
        except StaleElementReferenceException:
            return True

    wait_for(link_has_gone_stale)
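
The staleness check can also be seen in isolation with stand-in objects. Everything here is hypothetical and exists only for this sketch: FakePage, FakeElement, and a local StaleElementReferenceException class that imitates the real one from selenium.common.exceptions so the example runs without Selenium installed.

```python
class StaleElementReferenceException(Exception):
    """Local stand-in for Selenium's exception of the same name."""


class FakePage:
    """Hypothetical page; in real life the browser owns the DOM."""

    def __init__(self):
        self.unloaded = False


class FakeElement:
    """Hypothetical element that remembers which page it came from."""

    def __init__(self, page):
        self._page = page

    def find_elements_by_id(self, _id):
        # Any interaction with an element from a discarded page fails.
        if self._page.unloaded:
            raise StaleElementReferenceException(
                'element is no longer attached to the DOM'
            )
        return []


def has_gone_stale(element):
    try:
        # poll the element with an arbitrary call
        element.find_elements_by_id('doesnt-matter')
        return False
    except StaleElementReferenceException:
        return True


old_page = FakePage()
link = FakeElement(old_page)

stale_before_navigation = has_gone_stale(link)
old_page.unloaded = True  # simulate the browser throwing away the old DOM
stale_after_navigation = has_gone_stale(link)
```

Unlike the readyState check, this flips from False to True exactly once, when the old page goes away, so there's no ambiguous "complete" reading from the old page.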

Or, here's a genericised, sanitized version of the same thing, based on comparing Selenium's internal "IDs" for an object, and made into a nice Pythonic context manager:

[update 2014-09-06 -- see the comments, it's possible that comparing ids is not as effective as waiting for stale reference exceptions. Will investigate, but be warned that YMMV for now.]

class wait_for_page_load(object):

    def __init__(self, browser):
        self.browser = browser

    def __enter__(self):
        self.old_page = self.browser.find_element_by_tag_name('html')

    def page_has_loaded(self):
        new_page = self.browser.find_element_by_tag_name('html')
        return new_page.id != self.old_page.id

    def __exit__(self, *_):
        wait_for(self.page_has_loaded)

And now we can do:

with wait_for_page_load(browser):
    browser.find_element_by_link_text('my link').click()

And I think that might just be bulletproof!

And for bonus points...

(credit to Tommy Beadle for this solution)

It turns out Selenium has a built-in condition called staleness_of, as well as its own wait-for implementation. Use them, alongside the @contextmanager decorator and the magical-but-slightly-scary yield keyword, and you get:

from contextlib import contextmanager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import \
    staleness_of

class MySeleniumTest(SomeFunctionalTestClass):
    # assumes self.browser is a selenium webdriver

    @contextmanager
    def wait_for_page_load(self, timeout=30):
        old_page = self.browser.find_element_by_tag_name('html')
        yield
        WebDriverWait(self.browser, timeout).until(
            staleness_of(old_page)
        )

    def test_stuff(self):
        # example use
        with self.wait_for_page_load(timeout=10):
            self.browser.find_element_by_link_text('a link').click()
            # nice!

Note that this solution only works for "non-JavaScript" clicks, i.e. clicks that cause the browser to load a brand new page, and thus a brand new html element (the one the code keeps a reference to).
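
If the yield inside @contextmanager feels too magical, this browser-free sketch shows the order in which things run. The events list and the wait_for_something name are made up for illustration; the strings just describe what the real version would be doing at each step.

```python
from contextlib import contextmanager

events = []


@contextmanager
def wait_for_something():
    # Runs on entering the with block.
    events.append('before yield: grab a reference to the old page')
    yield
    # Runs when the with block finishes without an error.
    events.append('after yield: wait for the old page to go stale')


with wait_for_something():
    events.append('inside with: click the link')

for event in events:
    print(event)
```

So everything before the yield plays the role of __enter__, and everything after it plays the role of __exit__, which is why the wait only starts once the click has been issued.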

Let me know what you think!
