Meet our writer
Staff Engineer, Betterment
Greg graduated from Cornell University with a Master's degree in computer science. As a software engineer, he has designed and built back-end systems and programming interfaces for corporate clients. He worked with Group Commerce, a white-label e-commerce platform for publishers, before joining Betterment in 2013.
Articles by Greg Sadowski
Guidelines for Testing Rails Applications
Discusses the different responsibilities of model, request, and system specs, and other ...Guidelines for Testing Rails Applications Discusses the different responsibilities of model, request, and system specs, and other high level guidelines for writing specs using RSpec & Capybara. Testing our Rails applications allows us to build features more quickly and confidently by proving that code does what we think it should, catching regression bugs, and serving as documentation for our code. We write our tests, called “specs” (short for specification) with RSpec and Capybara. Though there are many types of specs, in our workflow we focus on only three: model specs, request specs, and system specs. This blog post discusses the different responsibilities of these types of specs, and other related high level guidelines for specs. Model Specs Model specs test business logic. This includes validations, instance and class method inputs and outputs, Active Record callbacks, and other model behaviors. They are very specific, testing a small portion of the system (the model under test), and cover a wide range of corner cases in that area. They should generally give you confidence that a particular model will do exactly what you intended it to do across a range of possible circumstances. Make sure that the bulk of the logic you’re testing in a model spec is in the method you’re exercising (unless the underlying methods are private). This leads to less test setup and fewer tests per model to establish confidence that the code is behaving as expected. Model specs have a live database connection, but we like to think of our model specs as unit tests. We lean towards testing with a bit of mocking and minimal touches to the database. We need to be economical about what we insert into the database (and how often) to avoid slowing down the test suite too much over time. Don’t persist a model unless you have to. For a basic example, you generally won’t need to save a record to the database to test a validation. Also, model factories shouldn’t by default save associated models that aren’t required for that model’s persistence. At the same time, requiring a lot of mocks is generally a sign that the method under test either is doing too many different things, or the model is too highly coupled to other models in the codebase. Heavy mocking can make tests harder to read, harder to maintain, and provide less assurance that code is working as expected. We try to avoid testing declarations directly in model specs - we’ll talk more about that in a future blog post on testing model behavior, not testing declarations. Below is a model spec skeleton with some common test cases: System Specs System specs are like integration tests. They test the beginning to end workflow of a particular feature, verifying that the different components of an application interact with each other as intended. There is no need to test corner cases or very specific business logic in system specs (those assertions belong in model specs). We find that there is a lot of value in structuring a system spec as an intuitively sensible user story - with realistic user motivations and behavior, sometimes including the user making mistakes, correcting them, and ultimately being successful. There is a focus on asserting that the end user sees what we expect them to see. System specs are more performance intensive than the other spec types, so in most cases we lean towards fewer system specs that do more things, going against the convention that tests should be very granular with one assertion per test. One system spec that asserts the happy path will be sufficient for most features. Besides the performance benefits, reading a single system spec from beginning to end ends up being good high-level documentation of how the software is used. In the end, we want to verify the plumbing of user input and business logic output through as few large specs per feature that we can get away with. If there is significant conditional behavior in the view layer and you are looking to make your system spec leaner, you may want to extract that conditional behavior to a presenter resource model and test that separately in a model spec so that you don’t need to worry about testing it in a system spec. We use SitePrism to abstract away bespoke page interactions and CSS selectors. It helps to make specs more readable and easier to fix if they break because of a UI or CSS change. We’ll dive more into system spec best practices in a future blog post. Below is an example system spec. Note that the error path and two common success paths are exercised in the same spec. Request Specs Request specs test the traditional responsibilities of the controller. These include authentication, view rendering, selecting an http response code, redirecting, and setting cookies. It’s also ok to assert that the database was changed in some way in a request spec, but like system specs, there is no need for detailed assertions around object state or business logic. When controllers are thin and models are tested heavily, there should be no need to duplicate business logic test cases from a model spec in a request spec. Request specs are not mandatory if the controller code paths are exercised in a system spec and they are not doing something different from the average controller in your app. For example, a controller that has different authorization restrictions because the actions it is performing are more dangerous might require additional testing. The main exception to these guidelines is when your controller is an API controller serving data to another app. In that case, your request spec becomes like your system spec, and you should assert that the response body is correct for important use cases. API boundary tests are even allowed to be duplicative with underlying model specs if the behavior is explicitly important and apparent to the consuming application. Request specs for APIs are owned by the consuming app’s team to ensure that the invariants that they expect to hold are not broken. Below is an example request spec. We like to extract standard assertions such as ones relating to authentication into shared examples. More on shared examples in the section below. Why don’t we use Controller Specs? Controller specs are notably absent from our guide. We used to use controller specs instead of request specs. This was mainly because they were faster to run than request specs. However, in modern versions of Rails, that has changed. Under the covers, request specs are just a thin wrapper around Rails integration tests. In Rails 5+, integration tests have been made to run very fast. Rails is so confident in the improvements they’ve made to integration tests that they’ve removed controller tests from Rails core in Rails 5.1. Additionally, request specs are much more realistic than controller specs since they actually exercise the full request / response lifecycle – routing, middleware, etc – whereas controller specs circumvent much of that process. Given the changes in Rails and the limitations of controller specs, we’ve changed our stance. We no longer write controller specs. All of the things that we were testing in controller specs can instead be tested by some combination of system specs, model specs, and request specs. Why don’t we use Feature Specs? Feature specs are also absent from our guide. System specs were added to Rails 5.1 core and it is the core team’s preferred way to test client-side interactions. In addition, the RSpec team recommends using system specs instead of feature specs. In system specs, each test is wrapped in a database transaction because it’s run within a Rails process, which means we don’t need to use the DatabaseCleaner gem anymore. This makes the tests run faster, and removes the need for having any special tables that don’t get cleaned out. Optimal Testing Because we use these three different categories of specs, it’s important to keep in mind what each type of spec is for to avoid over-testing. Don’t write the same test three times - for example, it is unnecessary to have a model spec, request spec, and a system spec that are all running assertions on the business logic responsibilities of the model. Over-testing takes more development time, can add additional work when refactoring or adding new features, slows down the overall test suite, and sets the wrong example for others when referencing existing tests. Think critically about what each type of spec is intended to be doing while writing specs. If you’re significantly exercising behavior not in the layer you’re writing a test for, you might be putting the test in the wrong place. Testing requires striking a fine balance - we don’t want to under-test either. Too little testing doesn’t give any confidence in system behavior and does not protect against regressions. Every situation is different and if you are unsure what the appropriate test coverage is for a particular feature, start a discussion with your team! Other Testing Recommendations Consider shared examples for last-mile regression coverage and repeated patterns. Examples include request authorization and common validation/error handling: Each spec’s description begins with an action verb, not a helping verb like “should,” “will” or something similar.
Dealing With the Uncertainty of Legacy Code
To complete our portfolio optimization, we had to tackle a lot of legacy code. And then ...Dealing With the Uncertainty of Legacy Code To complete our portfolio optimization, we had to tackle a lot of legacy code. And then we applied our learnings going forward. Last fall, Betterment optimized its portfolio, moving from the original platform to an upgraded trading platform that included more asset classes and the ability to weight exposure of each asset class differently for every level of risk. For Betterment engineers, it meant restructuring the underlying portfolio data model for increased flexibility. For our customers, it should result in better expected, risk-adjusted returns for investments. However, as our data model changed, pieces of the trading system also had to change to account for the new structure. While most of this transition was smooth, there were a few cases where legacy code slowed our progress. To be sure, we don't take changing our system lightly. While we want to iterate rapidly, we strive to never compromise the security of our customers nor the correctness of our code. For this reason, we have a robust testing infrastructure and only peer-reviewed, thoroughly-tested code gets pushed through to production. What is legacy code? While there are plenty of metaphors and ways to define legacy code, it has this common feature: It’s always tricky to work with it. The biggest problem is that sometimes you're not always sure the original purpose of older code. Either the code is poorly designed, the code has no tests around it to specify its behavior, or both. Uncertainty like this makes it hard to build new and awesome features into a product. Engineers' productivity and happiness decrease as even the smallest tasks can be frustrating and time-consuming. Thus, it’s important for engineers to do two things well: (a) be able to remove existing legacy code and (b) not to write code that is likely to become legacy code in the future. Legacy code is a form of technical debt—the sooner it gets fixed, the less time it will take to fix in the future. How to remove legacy code During our portfolio optimization, we had to come up with a framework for dealing with pieces of old code. Here’s what we considered: We made sure we knew its purpose. If the code is not on any active or planned future development paths and has been working for years, it probably isn't worth it. Legacy code can take a long time to properly test and remove. We made a good effort to understand it. We talked to other developers who might be more familiar with it. During the portfolio update project, we routinely brought a few engineers together to diagram trading system flow on a whiteboard. We wrote tests around the methods in question. It's important to have tests in place before changing code to be as confident as possible that the behavior of the code is not changing during refactoring. Hopefully, it is possible to write unit tests for at least a part of the method's behavior. Write unit tests for a piece of the method, then refactor that piece. Test, repeat, test. Once the tests are passing, write more tests for the next piece, and repeat the test, refactor, test, refactor process. Fortunately, we were able to get rid of most of the legacy code encountered during the portfolio optimization project using this method. Then there are outliers Yet sometimes even the best practices still didn’t apply to a piece of legacy code. In fact, sometimes it was hard to even know where to start to make changes. In my experience, the best approach was to jump in and rewrite a small piece of code that was not tested, and then add tests for the rewritten portion appropriately. Write characterization tests We also experimented with characterization tests. First proposed by Michael Feathers (who wrote the bible on working with legacy code) these tests simply take a set of verified inputs/outputs from the existing production legacy code and then assert that the output of the new code is the same as the legacy code under the same inputs. Several times we ran into corner cases around old users, test users, and other anomalous data that caused false positive failures in our characterization tests. These in turn led to lengthy investigations that consumed a lot of valuable development time. For this reason, if you do write characterization tests, we recommend not going too far with them. Handle a few basic cases and be done with them. Get better unit or integration tests in place as soon as possible. Build extra time into project estimates Legacy code can also be tricky when it comes to project estimates. It is notoriously hard to estimate the complexity of a task when it needs to be built into or on top of a legacy system. In our experience, it has always taken longer than expected. The portfolio optimization project took longer than initially estimated. Also, if database changes are part of the project (e.g. dropping a database column that no longer makes sense in the current code structure), it's safe to assume that there will be data issues that will consume a significant portion of developer time, especially with older data. Apply the learnings to future The less legacy code we have, the less we have to deal with the aforementioned processes. The best way to avoid legacy code is to make a best effort at not writing in the first place. The best way to avoid legacy code is to make a best effort at not writing it in the first place. For example, we follow a set of pragmatic design principles drawn from SOLID (also created by Michael Feathers) to help ensure code quality. All code is peer reviewed and does not go to production if there is not adequate test coverage or if the code is not up to design standards. Our unit tests are not only to test behavior and drive good design, but should also be readable to the extent that they help document the code itself. When writing code, we try to keep in mind that we probably won't come back later and clean up the code, and that we never know who the next person to touch this code will be. Betterment has also established a "debt day" where once every month or two, all developers take one day to pay down technical debt, including legacy code. The Results It's important to take a pragmatic approach to refactoring legacy code. Taking the time to understand the code and write tests before refactoring will save you headaches in the future. Companies should strive for a fair balance between adding new features and refactoring legacy code, and should establish a culture where thoughtful code design is a priority. By incorporating many of these practices, it is steadily becoming more and more fun to develop on the Betterment platform. And the Betterment engineering team is avoiding the dreaded productivity and happiness suck that happens when working on systems with too much legacy code. Interested in engineering at Betterment? Betterment is an engineering-driven company that has developed the most-trusted online financial advisor based on the principles of optimization and efficiency. Learn more about engineering jobs and our culture. Determination of most trusted online financial advisor reflects Betterment LLC's distinction of having the most customers in the industry, made in reliance on customer counts, self-reported pursuant to SEC rules, across all online-only registered investment advisors.