Stability through Randomness

We recently enabled test randomization and found that some of our tests began failing. While fixing them, we learned lessons that could make the same migration less painful for others.

If you’re writing tests for your Flutter application, it’s safe to assume that your goal is to build a robust, reliable piece of software that you can be confident in. However, if your tests aren’t run in random order, you may have a false sense of confidence that the assertions you’re making in them are actually accurate. By default, running flutter test will run your tests in the order they’re written within your test file.

In other words, the following test file will always exit successfully, despite the fact that there are obvious issues with how it’s set up.

example test that will always exit
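The original snippet is not reproduced in this text, so the following is only a minimal sketch of the kind of file being described. It assumes the tests import package:flutter_test, and the variable and test names are illustrative rather than taken from the original.

```dart
import 'package:flutter_test/flutter_test.dart';

void main() {
  // Shared mutable state, created once for the whole file.
  var value = 0;

  test('increments value', () {
    value++;
    expect(value, 1);
  });

  test('value equals 1', () {
    // Passes only because the previous test already incremented value.
    // If this test ran first, value would still be 0 and it would fail.
    expect(value, 1);
  });
}
```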

In this case, our second test is relying on the side effects of the first test. Since the first test will always run before the second test, the test run never reveals this dependency. In an example this small the problem is easy to spot by reading the code, but in more complex testing scenarios it won’t be as obvious.

To avoid test inter-dependency issues, we can instead run our tests in a random order (per file) by passing the --test-randomize-ordering-seed flag to flutter test. The flag takes a seed that is either a 32-bit unsigned integer or the word “random”.

To get a different order on every run, always pass “random” as the seed. The benefit of being able to pass an integer instead becomes apparent once you come across a test that fails when run in an order other than the one in which it was defined. The test runner prints the seed it chose at the beginning of test execution, so you can reuse that seed to reproduce the failure reliably and be confident in your fix once the test begins passing.

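For example, to rerun with a specific seed (12345 is only a placeholder for whatever value the runner printed):

flutter test --test-randomize-ordering-seed=12345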

If you have been using the randomization flag since the inception of your codebase, you’re in a fantastic position and can be confident in your tests! If you haven’t, there’s no better time to start than now. Of course, introducing the flag may cause some tests to begin failing. Whether you choose to skip those tests while you work on fixing them so the rest of your team can keep chugging away, or address the issues immediately, the following tips should help you quickly identify where the issues are coming from and how to resolve them.

Tip 1: Assume every test within a test file will run first

The first snippet above highlights the anti-pattern of assuming a consistent test execution order. We can rewrite this test so that each test would pass if it were run first. Hopefully it’s easy to look past the trivial nature of using an int and imagine how this might apply to a more complex test case.

rewritten tests to ensure passing on first run
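As a hedged sketch of that rewrite (the original snippet is not reproduced here), assume a test file built around a shared int. Resetting the value before every test means neither test depends on the other. The names and the package:flutter_test import are assumptions.

```dart
import 'package:flutter_test/flutter_test.dart';

void main() {
  late int value;

  // Runs before every test, so each test starts from the same state.
  setUp(() {
    value = 0;
  });

  test('increments value', () {
    value++;
    expect(value, 1);
  });

  test('decrements value', () {
    // Starts from 0 regardless of which test ran before it.
    value--;
    expect(value, -1);
  });
}
```

Either test passes when run first, last, or on its own.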

Tip 2: Keep all initialization & configuration code inside of setUp() methods

While it may be tempting to set up certain test objects directly in your main function, this can cause sneaky issues to crop up, especially when mocking or using mutable objects. 

Don’t

don't do this
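The original snippet is missing from this text. A minimal sketch consistent with the output discussed next might look like the following, where the print calls stand in for real initialization code. The layout is an assumption, as is the package:flutter_test import.

```dart
import 'package:flutter_test/flutter_test.dart';

void main() {
  // Runs once, as soon as main is invoked.
  print('A');

  group('group', () {
    // Runs once, while the group is being declared.
    print('B');

    test('first test', () {
      print('C');
    });

    // Still declaration time: this runs before either test body.
    print('D');

    test('second test', () {
      print('E');
    });
  });
}
```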

Did you know that even when run sequentially, this will print A,B,D,C,E? This is because code in the body of the main function and in the bodies of groups runs only once, immediately, while the tests are being registered and before any test body executes. This can introduce subtle testing bugs that may not surface until the tests themselves run in random order.

Do

do this
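Again the original snippet is not available, so this is a sketch under the same assumptions (illustrative names, package:flutter_test import). The initialization moves into setUp, and everything else lives inside the test bodies.

```dart
import 'package:flutter_test/flutter_test.dart';

void main() {
  group('group', () {
    // Runs before each test in this group.
    setUp(() {
      print('A');
    });

    test('first test', () {
      print('B');
      print('C');
    });

    test('second test', () {
      print('D');
      print('E');
    });
  });
}
```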

When run sequentially, this will correctly print A,B,C,A,D,E (A prints twice because setUp runs before each test).

Tip 3: Scope test objects as closely as possible to the tests that need them

In the same way that we prefer to keep shared state as low in the Widget tree as possible, keep your test objects close to the tests that use them. Not only does this make tests more readable (each setUp method sets up only the dependencies needed by the tests below it, within the same scope of the testing tree), but it also reduces the scope for potential problems.

Don’t

don't do this
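The original snippet is not reproduced here. As an illustration only, the sketch below uses a hypothetical User class and a file-wide setUp that builds a user for every test, even though only one group needs it. It assumes a package:flutter_test import.

```dart
import 'package:flutter_test/flutter_test.dart';

// Hypothetical model, defined here only to keep the sketch self-contained.
class User {
  User({required this.isLoggedIn});

  bool isLoggedIn;
}

void main() {
  // Visible to every group in the file, although only one group uses it.
  late User user;

  setUp(() {
    user = User(isLoggedIn: true);
  });

  group('when a user is logged in', () {
    test('is authenticated', () {
      expect(user.isLoggedIn, isTrue);
    });
  });

  group('formatting helpers', () {
    test('pads single digits', () {
      // Needs no user, yet the file-wide setUp still builds one, and
      // nothing stops a later change here from touching it.
      expect('7'.padLeft(2, '0'), '07');
    });
  });
}
```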

Do

do this
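A sketch of the tighter scoping, again with a hypothetical User class and an assumed package:flutter_test import: each group that needs a user declares and sets up its own, and a group that does not need one never sees it.

```dart
import 'package:flutter_test/flutter_test.dart';

// Hypothetical model, defined here only to keep the sketch self-contained.
class User {
  User({required this.isLoggedIn});

  bool isLoggedIn;
}

void main() {
  group('when a user is logged in', () {
    late User user;

    // Scoped to this group: rebuilt before each of its tests.
    setUp(() {
      user = User(isLoggedIn: true);
    });

    test('is authenticated', () {
      expect(user.isLoggedIn, isTrue);
    });
  });

  group('when a user is logged out', () {
    late User user;

    setUp(() {
      user = User(isLoggedIn: false);
    });

    test('is not authenticated', () {
      expect(user.isLoggedIn, isFalse);
    });
  });

  group('formatting helpers', () {
    test('pads single digits', () {
      // No user is in scope here, so it cannot be depended on by accident.
      expect('7'.padLeft(2, '0'), '07');
    });
  });
}
```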

By keeping test dependencies tightly scoped to where they’re used, we avoid the possibility that a test will be added or changed in a way that affects the tests already consuming the dependency. Instead, when a new test is introduced that requires that dependency, we can decide either to share it in such a way that its state is reset before each test, or not to share it at all and have each test create and set up the dependency itself.

Keep in mind that descriptive group names go a long way toward clarifying which dependencies a group relies on. For example, a group named “when a user is logged in” tells me that the group of tests relies on a user in the authenticated state. If I add another group named “when a user is logged out”, I would expect both groups to have setUp() methods that create or set up the user model with the correct authentication state.

Following the above tips should put you well on your way to fixing existing problems in your test suite or preventing them altogether. So if you haven’t already, make sure to enable test randomization in your Flutter codebase today!