In Part 1 we looked at how Intuit’s culture of design thinking has evolved to embrace rapid online experimentation. In Part 2 we will look into the Design of Experiments.

Any experiment is a continuous process of design, execution and analysis. Let’s take a closer look at each.

Fig 1: Experimentation lifecycle flywheel
Design

There are some basic concepts in experimentation. For example, the following two are treatments on the QuickBooks home page.
With the basic concepts defined, we conduct experiments at several distinct stages of the product lifecycle with the goal of continuously learning and iterating.
The experimentation funnel starts with simulations in our big data systems. These generally help us validate a few hypotheses quickly and, more importantly, discard the ones that don’t make sense. Once a minimum viable product is developed we launch an Alpha. Alphas are playgrounds launched internally to our employees to get feedback, with the goal of gathering qualitative data. After the alpha phase we launch Betas, which are opt-ins (in our QuickBooks labs), again with the goal of gathering qualitative data. Finally, A/B or multivariate tests are online controlled experiments that we conduct to gather quantitative data and test for statistical significance. These are full-blown experiments.
Our experimentation tool has a simple self-service interface to create an experiment by specifying the name, duration (start date and end date) and region, and to set up treatments by specifying their allocations (the percentage of traffic directed to each).
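To make this concrete, here is a minimal sketch of what such an experiment definition might look like. The field names and shape are illustrative assumptions, not the platform’s actual schema:

```python
# Hypothetical experiment definition; field names are illustrative
# assumptions, not the platform's actual schema.
experiment = {
    "name": "home-page-cta-test",
    "start_date": "2021-06-01",
    "end_date": "2021-06-30",
    "region": "US",
    "treatments": [
        {"name": "control",   "allocation": (0, 5)},   # 5% of traffic
        {"name": "treatment", "allocation": (5, 10)},  # 5% of traffic
    ],
}

# Sanity check: allocation ranges must not overlap and together
# must not exceed 100% of traffic.
total = 0
for t in experiment["treatments"]:
    low, high = t["allocation"]
    total += high - low
assert total <= 100
print(total)  # 10 (percent of traffic used by this experiment)
```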
The treatment screen captures the name, optionally the factors, and the allocation range. The allocation range determines whether a user experiences the treatment: if userId modulo 100 falls within this range, then the user experiences this treatment.
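The bucketing rule above can be sketched in a few lines. Whether the range is half-open is an assumption here; the article only says the user’s bucket must fall in the range:

```python
def bucket(user_id: int) -> int:
    """Map a user to one of 100 buckets via userId modulo 100."""
    return user_id % 100

def in_treatment(user_id: int, allocation_range: tuple) -> bool:
    """A user experiences the treatment if their bucket falls in the
    treatment's allocation range [low, high) -- half-open by assumption."""
    low, high = allocation_range
    return low <= bucket(user_id) < high

# A treatment allocated the range (0, 5) receives 5% of users.
print(in_treatment(1204, (0, 5)))  # user 1204 -> bucket 4  -> True
print(in_treatment(1250, (0, 5)))  # user 1250 -> bucket 50 -> False
```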
A simple experiment on Sign-Up allocates 5% each to control and treatment. A user with userId modulo 100 = 25 maps to the control group and will be shown the default Sign-Up experience. Another experiment on Home Page allocates 10% each to control and treatment. Another user with userId modulo 100 = 45 maps to the Home Page treatment. In the diagram above, Sign-Up and Home Page are set up as two mutually exclusive experiments, i.e. at any given time a user is in only one of the two experiments.
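Mutual exclusion falls out of giving each experiment disjoint bucket ranges. The specific ranges below are assumptions chosen to match the article’s two example users; the real allocations are configured in the platform:

```python
# Hypothetical, disjoint bucket ranges; chosen only so the article's
# example users (buckets 25 and 45) land where the text says they do.
EXPERIMENTS = {
    "Sign-Up":   {"control": range(25, 30), "treatment": range(30, 35)},
    "Home Page": {"control": range(35, 45), "treatment": range(45, 55)},
}

def assign(user_id: int):
    """Return (experiment, group) for the user, or None.
    Because the ranges are disjoint, a user is in at most one experiment."""
    b = user_id % 100
    for exp, groups in EXPERIMENTS.items():
        for group, buckets in groups.items():
            if b in buckets:
                return exp, group
    return None

print(assign(125))  # bucket 25 -> ('Sign-Up', 'control')
print(assign(45))   # bucket 45 -> ('Home Page', 'treatment')
```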
While exclusive segments are a good way to separate traffic and avoid collisions, they do not scale to the growing need for experimentation within Intuit. Since at any given time we need to run thousands of experiments, we need a way to create overlapping segments. As such, we need experiments that can run orthogonal to each other with uniform random collisions.
The figure above shows 5 groups of experiments that illustrate the power of exclusive and orthogonal spectrums.
The groups A, B, C and D use an exclusive spectrum: the experiments across them don’t statistically collide with each other, as collisions can pollute the results. Inside each group there may be overlapping orthogonal experiments that collide uniformly across control and treatments.
The Hashing Constant (hashId) (in Fig. 5) serves as an input to an MD5-based algorithm used to uniquely define the orthogonal plane. Since we want to run thousands of concurrent experiments, different hashIds ensure that the randomizations between active experiments are orthogonal; Google, Microsoft Bing and LinkedIn use similar approaches.
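A minimal sketch of this idea, assuming the hashId is simply mixed into the hash input (the platform’s exact construction is not described in the article): hashing (hashId, userId) with MD5 gives each experiment layer its own independent bucketing, so two experiments on different planes overlap uniformly at random.

```python
import hashlib

def bucket(user_id: str, hash_id: str, buckets: int = 100) -> int:
    """Hash (hashId, userId) with MD5 and map the digest to a bucket.

    The same user gets a deterministic bucket within one plane, but
    different hashIds yield independent (orthogonal) randomizations,
    so collisions between planes are uniform rather than systematic.
    """
    digest = hashlib.md5(f"{hash_id}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % buckets

# The same user lands in unrelated buckets on different orthogonal planes.
print(bucket("user-42", "layer-A"))
print(bucket("user-42", "layer-B"))
```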
We also allow experimenters to further segment users to target different sub-populations.
Built-in Attributes — The platform gives experimenters access to more than 300 built-in customer attributes. They range from static attributes such as user subscription status to dynamic attributes such as a member’s last login date. These attributes are either computed daily as part of our data pipelines or stored in real time in our profile infrastructure.
Contextual Attributes — These attributes are only available at runtime, such as the browser type, user country or mobile device. For example, to target only requests coming from iPhone9, one just needs to inform the platform that an attribute called “deviceType” is to be evaluated at runtime, and target only those requests whose value equals “iPhone9”.
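Targeting of this kind reduces to matching a set of rules against whatever attributes are known at request time. The rule shape below is an illustrative assumption (simple equality checks), not the platform’s actual targeting API:

```python
# Illustrative targeting-rule evaluation; the rule and attribute shapes
# are assumptions, not the platform's actual API.
def matches(rules: dict, attributes: dict) -> bool:
    """A request qualifies only if every targeting rule matches the
    corresponding built-in or runtime attribute."""
    return all(attributes.get(key) == value for key, value in rules.items())

rules = {"deviceType": "iPhone9", "country": "US"}

print(matches(rules, {"deviceType": "iPhone9", "country": "US"}))  # True
print(matches(rules, {"deviceType": "Pixel", "country": "US"}))    # False
```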
Hopefully Part 2 gave you a good sense of how we design and set up experiments, and how the platform scales to support thousands of concurrent experiments. In Part 3 we will look into the execution engine that serves experiments.
Experimentation @Intuit — Part 2 — Design of Experiments was originally published in QuickBooks Engineering on Medium.