Free for 90 days: Sign up now and get 90 days managed free after your first deposit. See offer details

Coming soon: our new one-on-one advice packages. Learn more

Now available: our new one-on-one advice packages. Learn more

Get your entire Smart Saver balance managed free for 3 months. Enroll today

Get your entire Smart Saver balance managed free for 6 months. Enroll today

Introducing Smart Saver: You could earn 1.83% with our low-risk investing account for your extra cash.* Learn more

Engineering at Betterment

Keeping Our Code Base Simple, Optimally

Betterment engineers turned regulatory compliance rules into an optimization problem to keep the code base simple. Here's how they did it.

Articles by Mike

By Mike Matsui
  |  Published: September 4, 2013

Interested in working at Betterment as an engineer? Learn more here.

At Betterment, staying compliant with regulators, such as the Securities and Exchange Commission, is a part of everyday life.  We’ve talked before about how making sure everything is running perfectly — especially given all the cases we need to handle — makes us cringe at the cyclomatic complexity of some of our methods.

It’s a constant battle to keep things maintainable, readable, testable, and efficient. We recently put some code into production that uses an optimizer to cut down on the  amount of code we’re maintaining ourselves, and it turned out to be pretty darn cool. It makes communicating with our regulators easier, and is doing so in a pretty impressive fashion.

We were tasked with coming up with an algorithm that, at first pass, made me nervous about all the different cases it would need to handle in order to do things intelligently. Late one night, we started bouncing ideas off each other on how to pull it off. We needed to make decisions at a granular level, test how they affected the big picture, and then adjust accordingly.

To use a Seinfield analogy, the decisions we would make for Jerry had an effect on what the best decisions were for Elaine. But, if Elaine was set up a certain way, we wanted to go back to Jerry and adjust the decisions we made for him.  Then George. Then Newman. Then Kramer. Soon we had thought about so many if-statements that they no longer seemed like if-statements, and all the abstractions I was formulating were already leaking.

Then a light came on. We could not only make good decisions for Elaine, Jerry, and Newman, we could make those decisions optimally.

A little bit of disclaimer here before we start digging in a little more: I can barely scratch the surface of how solvers work. I just happen to know that it was a tool available to us, and it happened to model the problem we needed to solve very well. This is meant as an introduction to using one specific solver as a way to model and solve a problem.

An example

Let’s say at the last minute, the Soup Nazi is out to make the biggest batch of soup he possibly can. For his recipe he needs a ratio of:

  • 40% chicken
  • 12% carrots
  • 8% thyme
  • 15% onions
  • 15% noodles
  • 5% garlic
  • 5% parsley

All of the stores around him only keep limited amounts in stock. He calls around to all the stores just to see what the have in stock and puts together each store’s inventory:

Ingredients in stock (lbs)

Elaine’s George’s Jerry’s Newman’s
Chicken 5 6 2 3
Carrots 1 8 5 2
Thyme 3 19 16 6
Onions 6 12 10 4
Noodles 5 0 3 9
Garlic 2 1 1 0
Parsley 3 6 2 1

Also, the quality of the bags at all of the stores vary, limiting the total number pounds of food the Soup Nazi can carry back. (We’re also assuming he only wants to make at most one visit to each store.)

Pound of food limits

Elaine’s 12
George’s 8
Jerry’s 15
Newman’s 17

With the optimizer, the function that we are trying to minimize or maximize is called the objective function. In this example, we are trying to maximize the number of pounds of ingredients he can buy because that will result in the most soup. If we say that,


a1 = pounds of chicken purchased from Elaine’s
a2 = pounds of carrots purchased from Elaine’s
a3 = pounds of thyme purchased from Elaine’s

a7 = pounds of parsley purchased from Elaine’s
b1 = pounds of chicken purchased from George’s

c1 = pounds of chicken purchased from Jerry’s

d1 = pounds of chicken purchased from Newman’s

We’re looking to maximize,

a1 + a2 + a3 … + b1 + … + d7 = total pounds

We then have to throw in all of the constraints to our problem. First to make sure the Soup Nazi gets the ratio of ingredients he needs:

.40 * total pounds = a1 + b1 + c1 + d1 
.12 * total pounds = a2 + b2 + c2 + d2
.08 * total pounds = a3 + b3 + c3 + d3
.15 * total pounds = a4 + b4 + c4 + d4
.15 * total pounds = a5 + b5 + c5 + d5
.05 * total pounds = a6 + b6 + c6 + d6
.05 * total pounds = a7 + b7 + c7 + d7

Then to make sure that the Soup Nazi doesn’t buy more pounds of food from one store than he can carry back:

a1 + a2 + … + a7 <= 12
b1 + b2 + … + b7 <= 8
c1 + c2 + … + c7 <= 15
d1 + d2 + … + d7 <= 17

We then have to put bounds on all of our variables to say that we can’t take more pounds of any ingredient than any store has in stock.

0 <= a1 <= 5
0 <= a2 <= 1
0 <= a3 <= 3
0 <= a4 <= 6

0 <= d7 <= 1

That expresses all of the constraints and bounds to our problem and the optimizer works to maximize or minimize the objective function subject to those bounds and constraints. The optimization package we’re using in this example, python’s scipy.optimize, provides a very expressive interface for specifying all of those bounds and constraints.

Translating the problem into code

If you want to jump right in, the full code sample can be found here. However, there are still a few more things to note:

  • Get numpy and scipy installed.

  • The variables we’re solving for are put into a single list. That means, x = [a1, a2, … , a7, b1, b2 … d7]. With python, it’s helpful to know that we can pull the pounds of food for a particular ingredient out of x,  i.e, [a1, b1, c1, d1] with

    x[ingredient_index :: num_of_ingredients]

    Likewise, we can pull out the ingredients for a given store with

    x[store_index * num_of_ingredients : store_index * num_of_ingredients + num_of_ingredients]

    e.g, [b1, b2, b3, b4, b5, b6, b7]

  • For this example, we’re using the scipy.optimize.minimize function using the ‘NLSQP’ method.

Arguments provided to the minimize function

Objective function

With the package we’re using, there is no option to maximize. This might seem like a show stopper, but we get around it by negating our objective function, minimizing, and then negating the results. Therefore our objective function becomes,

 −a1 − a2 − a3 − a4 − … − d6 − d7

And expressing that with numpy is pretty painless:

numpy.sum(x) * −1.0

Bounds

Bounds make sure that we don’t take more than any one ingredient than the store has in stock. The minimize function takes this in as a list of tuples where the indices line up with x. We can’t take negative ingredients from the store, so the lower bound it always 0. Therefore, 

[(0, 5), (0, 1) … (0, 1)]

In the code example, for readability, I threw all of the inputs into the program into some globals dictionaries. Therefore, we can calculate our bounds with,


def calc_bounds():
    bounds = []

    for s in stores:
        for i in ingredients:
            bounds.append((0, store_inventory[s][i]))

    return bounds

Guess

Providing a good initial guess can go a long way in getting you to a desirable solution. It can also dramatically reduce the amount of time it takes to solve a problem. If you’re not seeing numbers you expect, or it is taking a long time to come up with a solution, the initial guess is often the first place to start. For this problem, we made our initial guess to be  what each store had in stock, and we supplied it to the minimize method as a list.

Constraints

One thing to note is that for the packages we’re using, constraints only deal with ‘ineq’ and ‘eq’ where ‘ineq’ means greater than. The right hand side of the equation is assumed to be zero. Also, we are providing the constraints as tuple of dictionaries.

(a1 + b1 + c1 + d1) − (.40 * total pounds) > 0
...
(a7 + b7 + c7 + d7) − (.05 * total pounds) > 0

Note here that I changed the constraints from equal-to to greater-than because comparing floats to be exactly equal is a hard problem when you’re multiplying and adding numbers. Therefore, to make sure we limit chicken to 40% of the overall ingredients, one element of the constraints tuple will be,

{'type' : 'ineq',
 'fun' : lambda x : sum(extract_ingredient_specific_pounds(x, chicken)) − (calc_total_pounds_of_food(x) * .4) }

Making sure the soup nazi is able to carry everything back from the store:

12 − a1 − a2 − … − a7 >= 0

17 − d1 − d2 − … − d7 >=  17

Leads to,

{'type' : 'ineq',
  'fun' : lambda x : max_per_store[store] − np.sum(extract_store_specific_pounds(x, store))}

Hopefully this gives you enough information to make sense of the code example.

The Results?

Pretty awesome. The Soup Nazi should only buy a total of 40 lbs worth ingredients because Elaine, George, Jerry, and Newman just don’t have enough chicken.

9.830 lbs of food from Elaine's. Able to carry 12.0 pounds.
  chicken: 5.000 lbs (5.0 in stock)
  carrots: 0.000 lbs (1.0 in stock)
  thyme: 0.000 lbs (3.0 in stock)
  onions: 0.699 lbs (6.0 in stock)
  noodles: 1.000 lbs (5.0 in stock)
  garlic: 1.565 lbs (2.0 in stock)
  parsley: 1.565 lbs (3.0 in stock)
7.582 lbs of food from George's. Able to carry 8.0 pounds.
  chicken: 6.000 lbs (6.0 in stock)
  carrots: 0.667 lbs (8.0 in stock)
  thyme: 0.183 lbs (19.0 in stock)
  onions: 0.733 lbs (12.0 in stock)
  noodles: 0.000 lbs (0.0 in stock)
  garlic: 0.000 lbs (1.0 in stock)
  parsley: 0.000 lbs (6.0 in stock)
13.956 lbs of food from Jerry's. Able to carry 15.0 pounds.
  chicken: 2.000 lbs (2.0 in stock)
  carrots: 3.501 lbs (5.0 in stock)
  thyme: 3.017 lbs (16.0 in stock)
  onions: 4.568 lbs (10.0 in stock)
  noodles: 0.000 lbs (3.0 in stock)
  garlic: 0.435 lbs (1.0 in stock)
  parsley: 0.435 lbs (2.0 in stock)
8.632 lbs of food from Newman's. Able to carry 17.0 pounds.
  chicken: 3.000 lbs (3.0 in stock)
  carrots: 0.632 lbs (2.0 in stock)
  thyme: 0.000 lbs (6.0 in stock)
  onions: 0.000 lbs (4.0 in stock)
  noodles: 5.000 lbs (9.0 in stock)
  garlic: 0.000 lbs (0.0 in stock)
  parsley: 0.000 lbs (1.0 in stock)

16.000 lbs of chicken. 16.0 available across all stores. 40.00%
4.800 lbs of carrots. 16.0 available across all stores. 12.00%
3.200 lbs of thyme. 44.0 available across all stores. 8.00%
6.000 lbs of onions. 32.0 available across all stores. 15.00%
6.000 lbs of noodles. 17.0 available across all stores. 15.00%
2.000 lbs of garlic. 4.0 available across all stores. 5.00%
2.000 lbs of parsley. 12.0 available across all stores. 5.00%

Bringing it all together

Hopefully this gives you a taste of the types of problems optimizers can be used for. At Betterment, instead of picking pounds of ingredients from a given store, we are using it to piece together a mix of securities, in order to keep us compliant with certain regulatory specifications.

While there was a lot of work involved in making our actual implementation production-ready (and a lot more work can be done to improve it), being able to express rules coming out of a regulatory document as a series of bounds and constraints via anonymous functions was a win for the readability of our code base. I’m also hoping that it will make tacking on additional rules painless in comparison to weaving them into a one off algorithm.

Recommended Content

View All Resources
Get All the Returns You Deserve

Get All the Returns You Deserve

We set out to make investing more efficient—and we have. With our platform, your investor returns can get a boost of 2.66%.

The Difference Between Vanguard and Betterment

The Difference Between Vanguard and Betterment

I’ve been asked a lot in the last few months in the transition from my old role at Vanguard to my new role as Chief Growth Officer at Betterment – what’s the difference between the two companies?

How We Engineered Betterment’s Tax-Coordinated Portfolio™

How We Engineered Betterment’s Tax-Coordinated Portfolio™

For our latest tax-efficiency feature, Tax-Coordinated Portfolio, Betterment’s solver-based portfolio management system enabled us to manage and test our most complex algorithms.

How would you like to get started?

Your first step toward a smarter investing future starts here.

Create a Betterment account

Go ahead and join the smart, modern way to invest.

See what we can do for you

Tell us a bit about yourself, and we'll show you the benefits of investing with us.

Get a free investing checkup

Help us get a sense of your investing approach and see how you could improve.

Transfer a 401(k) or an IRA

Move an existing retirement account into a Betterment IRA.

Download the mobile app

Enjoy the Betterment experience anywhere on the go.

Search our site

For more information and disclosures about the Betterment Resource Center, click here. | See our contributors.