Automated test optimization at 500px
The primary automated test suite at 500px contains over 11,000 examples written in RSpec with Capybara. These tests run on development branches, as well as each time a commit is merged into master. They take an average of 20 minutes to run across six threads on Semaphore. We have dedicated test engineers (and co-ops!) responsible for helping maintain this test infrastructure as part of the 500px quality process. Reducing the run time of the test suite can have a significant impact on developer productivity. Let’s dive into two fixes we implemented to keep our test suite running smoothly.
Capybara wait time
Override the default max wait time for Capybara.
Capybara’s “have_selector” matcher is used to detect whether a given selector matches an element on the page. It takes an optional “wait” parameter. When “wait” is not specified, it uses the configured global maximum wait time, which for us was 90 seconds. Often, tests want to verify that an element doesn’t exist on the page, which looks like:
expect(page).not_to have_selector(…)
In these cases, it’s important to consider wait times. If you don’t specify a wait time, the negated matcher will keep retrying for up to the default wait time, so that any code responsible for removing the element has time to complete. But if the code being tested isn’t time-sensitive, this can cause large, unnecessary delays in the completion of the test. To avoid this problem, specify the wait time:
expect(page).not_to have_selector("#element", wait: 3)
This makes Capybara wait at most 3 seconds for the element to disappear, rather than the default 90.
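For illustration, the global default lives in Capybara’s configuration and the override is passed per assertion. It looks something like this (the selector and file path are placeholders, not our actual code):

# spec/support/capybara.rb
Capybara.default_max_wait_time = 90 # any matcher without an explicit wait can retry for up to 90 seconds

# In a spec where the element should already be gone, cap the retry window:
expect(page).not_to have_selector("#promo-banner", wait: 3)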
Instances of this issue can add up quickly. An audit found 6 instances of this pattern in our test suite, which together were adding 9 minutes to the total run time.
Load balancing
Don’t waste time waiting
As mentioned before, we use Semaphore for our CI, which runs our tests across 6 jobs. To optimize total run time, we need to put some thought into how we split up the tests. Previously, we balanced tests across threads based on the number of lines in each test file, but this wasn’t very effective: the number of lines in a test file doesn’t necessarily reflect its run time. Now we use a gem called Knapsack. Knapsack uses a JSON manifest that tracks every test file’s run time and uses that information to balance which jobs get which tests, so that they all finish around the same time. This change reduced our test suite’s average run time from 35+ minutes to 20 minutes.
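For reference, a typical Knapsack setup looks roughly like the following; the paths and node numbers here are illustrative rather than a copy of our configuration:

# Rakefile: expose the knapsack:rspec task
require 'knapsack'
Knapsack.load_tasks

# spec/spec_helper.rb: hook Knapsack into RSpec so it can track per-file run times
require 'knapsack'
Knapsack::Adapters::RSpecAdapter.bind

Each CI job then runs its own slice of the suite, for example CI_NODE_TOTAL=6 CI_NODE_INDEX=0 bundle exec rake knapsack:rspec, and Knapsack assigns spec files to that node based on the timings in knapsack_rspec_report.json. Setting KNAPSACK_GENERATE_REPORT=true makes a run write fresh timings back out to that file.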

However, this solution had a problem: over time, the JSON manifest that Knapsack depends on would get out of date, because we’re constantly adding and removing tests. Ideally, the JSON manifest would update automatically.
To automate this, we needed to update the JSON manifest based on the results of each test run. Since these tests run in parallel across 6 threads, we could have had each thread write its results directly to the manifest. In practice this proved impractical because of lock contention: the threads finish around the same time and end up waiting to write to the JSON manifest one by one. Instead, we decided to have all 6 threads upload their individual results after each run, and then read those files and combine them into the complete manifest at the start of each run.
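The merge step itself is small. As a rough sketch (the file locations and naming are assumptions, since they depend on where each job uploads its report), combining the per-thread reports into the manifest Knapsack reads can look like this:

# merge_knapsack_reports.rb: build the combined manifest from one report per CI job
require 'json'

combined = {}
Dir.glob('tmp/knapsack_reports/node_*.json').sort.each do |path|
  # each report is a JSON hash of spec file path => run time in seconds
  combined.merge!(JSON.parse(File.read(path)))
end

File.write('knapsack_rspec_report.json', JSON.pretty_generate(combined))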
In the end

Our automation now load balances itself and updates the run times of each spec in preparation for the next run. The tests are less flaky, and timeout failures are less costly. Adding specs is effortless, and test run times are minimized. Faster tests = happy devs!