It’s all One Big Decision: A/B Testing, Predictive Analytics, and Behavioral Targeting
In all the excitement around Big Data and analytics right now, even otherwise-savvy users of business intelligence can get a bit confused between the concepts of A/B Testing, Predictive Analytics, and Personalization. These terms are all related, and when you boil each of them down, you are left with the same concept at their root: Decision Rules.
What is a Decision Rule?
‘A decision rule is a function which maps an observation to an appropriate action’. – Wikipedia
Decision rules are instructions on how to act in a specific situation. In common language, we often phrase these rules in the form ‘If This Then That’. In these phrases, the ‘If This’ is the observation or context and ‘Then That’ is the action that should be taken.
So for example, a digital application might have a rule that visitors from mobile phones should get a site version appropriate for a mobile device. So the rule might look like:
If ‘Mobile Device’ then ‘Show Mobile Version’.
This is a fixed decision rule with a single condition and a single action: every time a Mobile Device visits, the visitor sees the Mobile Version.
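As a sketch, this fixed rule is nothing more than a one-branch function. The function and version names below are illustrative, not from any real framework:

```python
# A minimal sketch of the fixed decision rule above: one condition, one action.
def choose_version(is_mobile_device: bool) -> str:
    """Map an observation (device type) to an action (site version)."""
    if is_mobile_device:
        return "Mobile Version"
    return "Desktop Version"
```

Every call with the same observation returns the same action, which is what makes the rule deterministic.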
We can make this gradually more complex in two ways: by adding conditions, or adding actions.
A/B Testing is really a colloquial term for a fairly basic form of comparison experiment. Consider this new rule:
If ‘*’ then (‘Show A’ XOR ‘Show B’), where ‘*’ means match everyone, don’t filter out any users.
Now, every user will see a version of the site: either A or B, but not both. Having more than one possible outcome for the same condition means this rule is Non-Deterministic, and can be imagined as adding a coin toss to the rule. Since the users are assigned to A or B based on a random coin-toss, we can attribute any behavioral differences between the users exposed to A and the users exposed to B back to our rule’s selection.
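A minimal sketch of that coin toss, assuming Python and a plain 50/50 split:

```python
import random

# Non-deterministic rule: for every user ('*'), a coin toss picks exactly
# one of the two variants. A 50/50 split is assumed here for illustration.
def assign_variant(rng: random.Random) -> str:
    return "A" if rng.random() < 0.5 else "B"

rng = random.Random(42)  # fixed seed so the example is reproducible
assignments = [assign_variant(rng) for _ in range(1000)]
counts = {v: assignments.count(v) for v in ("A", "B")}
```

In production systems the assignment is usually made by hashing a stable user ID rather than drawing a fresh random number, so that a returning user keeps seeing the same variant.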
For standard online A/B tests, we 1) collect the data on each option; 2) compare the resulting user behavior; and 3) based on a basic statistical analysis, determine if there is a significant difference in behavior. Assuming there is a meaningful effect, and there is enough evidence to act on it, we use the higher-performing option moving forward.
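Step 3 can be sketched with a standard two-proportion z-test; the conversion numbers below are invented for illustration:

```python
from math import erf, sqrt

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic and two-sided p-value for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null hypothesis
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal CDF tail, both sides
    return z, p_value

z, p = two_proportion_z(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
# With p below the usual 0.05 threshold, we would keep option B going forward.
```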
We can extend this idea to targeting by changing the condition of our rule, and inserting specific attributes of the user. In a simple A/B targeting case, we can have one A/B test decision rule for each targeting variable. For example, maybe we want to see if visitors from Social Media sites (referred from Twitter, Facebook, etc.) will respond differently than visitors who arrive from search engines (Google, Bing, etc.). We will then have two rules:
- If (referrer=Social) Then (‘A’ xor ‘B’)
- If (referrer=Search) Then (‘A’ xor ‘B’).
We would then keep track of which type of visitor received each option so that we could determine the option that works best for each user type. This type of testing is usually called a segmented A/B test, because we have segmented our users before placing them into each test. Essentially, each segment is treated as a separate test.
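A sketch of the bookkeeping for a segmented A/B test, assuming two referrer segments (all names here are illustrative):

```python
import random
from collections import defaultdict

rng = random.Random(0)
# Per-segment, per-variant tallies: [conversions, visits]
results = defaultdict(lambda: {"A": [0, 0], "B": [0, 0]})

def record_visit(referrer: str, converted: bool) -> str:
    variant = "A" if rng.random() < 0.5 else "B"  # coin toss within the segment
    tally = results[referrer][variant]
    tally[0] += int(converted)
    tally[1] += 1
    return variant

record_visit("Social", converted=True)
record_visit("Search", converted=False)
```

Each segment accumulates its own tallies, so each can be analyzed as an independent experiment.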
The effect of running the A/B tests is to convert our non-deterministic rules into a set of deterministic ones. So in our case above, after the tests we will be left with one of the following sets of fixed rules:
- If (*) Then ‘A’; if A performs best for everyone
- If (*) Then ‘B’; if B performs best for everyone
- If(Social) Then ‘A’ and If(Search) Then ‘B’; if A is best for Social and B is best for Search
- If(Social) Then ‘B’ and If(Search) Then ‘A’; if B is best for Social and A is best for Search
So we see that when we are trying to optimize, we are really trying to pick the most valuable bundle of decision rules from a long (possibly very, very long) list.
From Segmentation to Personalization
So far so good. But what if we had lots of targeting data we wanted to use to really hone a personalized experience for our users? Now we have many individual bits of information, such that there could be millions or even billions of possible unique segment combinations.
That makes it difficult to run a standard segmented A/B test here. There are too many possible micro-segments to just enumerate them all, and even if you did, you wouldn’t have enough data to learn, since most of the combinations would have few, if any, users in them (with just 30 binary user features, e.g. mobile vs. not mobile, we wind up with over a billion unique micro-segments).
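The arithmetic behind that billion is just the number of on/off combinations:

```python
# Each binary feature doubles the number of distinct micro-segments,
# so 30 binary features give 2**30 combinations.
n_features = 30
n_segments = 2 ** n_features
print(n_segments)  # 1073741824, just over a billion
```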
So if A/B tests aren’t going to work, how can we learn which option is best for all of the combinations of targeting features we might encounter?
One way around having to write out all of the rules explicitly and run all those separate tests is to use mathematical functions to represent, or model, the relationship between the targeting features and the value of each possible option. How we go about selecting and calculating these relationships is the domain of Predictive Analytics.
As we add more targeting features, the number of conceptual rules will grow. A model, however, can take the place of all of these targeting rules. The model, then, is a compact way to represent all of the If/Then rules that we will need. So we might pass in the targeting features based on referrer, device type, and whether or not the customer is new. The general rule would look like:
If (Social | Mobile | New Customer) Then (‘A’ xor ‘B’)
And we would collect data just as we did in the previous situations. However, this time, rather than perform separate A/B tests, we would use the results for ‘A’ and ‘B’ to learn a model.
And this simple model might look like:
Value(A)= Base value of A + value of Social on A + value of Mobile on A + value of new customer on A
Value(B)= Base value of B + value of Social on B + value of Mobile on B + value of new customer on B
We still need to collect data on how well both A and B are performing, but rather than separating the results by each unique segment, we use some simplifying assumptions to blend, or share, the learning across the features.
Once the model has been learned, to get the best results in our application we just need to select the option that has the highest value for whatever input we pass to the model. So if ‘B’ is the better choice when both Social and Mobile are present, our rule would look like:
If Value(B| Social and Mobile) > Value(A| Social and Mobile) Then ‘B’
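A sketch of this additive value model and the selection rule, with invented weights (in practice the weights are learned from the collected data):

```python
# One base value per option plus one additive weight per feature,
# mirroring the Value(A)/Value(B) formulas above. Numbers are made up.
weights = {
    "A": {"base": 0.030, "social": 0.010, "mobile": -0.005, "new_customer": 0.002},
    "B": {"base": 0.025, "social": 0.004, "mobile": 0.012, "new_customer": 0.001},
}

def value(option: str, features: set) -> float:
    w = weights[option]
    return w["base"] + sum(w[f] for f in features)

def best_option(features: set) -> str:
    return max(weights, key=lambda opt: value(opt, features))

# With Social and Mobile present, B's value (0.041) beats A's (0.035).
chosen = best_option({"social", "mobile"})
```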
Putting it all together with Optimization
As we mentioned earlier, Optimization is the process of selecting the most valuable rules from a list of possible rules. The optimal solution is the one that both discovers the best rules and does so in the least amount of time, so that we can apply them as soon as possible.
The graphic above attempts to put it all together. The optimization process runs as follows:
- Observe data about the user.
- Ask the model which option it thinks has the highest value. Early on, the model may not have much information to answer this, so it may just suggest randomly selecting from our options so that it can learn a bit more.
- Based on the model’s suggestion, select from the possible options.
- After the selected option has been used for the user’s experience, observe user behavior.
- Send back the results (sales, downloads, etc.) to the model so that it can update its internal value of selecting that option when we observe the targeting data.
Over time, as the model builds up its learning, we can stop randomly selecting options, and instead select the option that our predictive model indicates has the highest value for the observed targeting data.
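The whole loop can be sketched as an epsilon-greedy policy, assuming for simplicity a model that just tracks a running mean per option (a real system would use the targeted value model described earlier):

```python
import random

rng = random.Random(7)
counts = {"A": 0, "B": 0}      # times each option was shown
totals = {"A": 0.0, "B": 0.0}  # accumulated reward (e.g. conversions)

def select(epsilon: float) -> str:
    if 0 in counts.values() or rng.random() < epsilon:
        return rng.choice(["A", "B"])  # explore: random coin toss
    return max(counts, key=lambda o: totals[o] / counts[o])  # exploit best so far

def update(option: str, reward: float) -> None:
    counts[option] += 1
    totals[option] += reward

for step in range(1000):
    epsilon = max(0.05, 1.0 - step / 500)  # explore less as the model learns
    option = select(epsilon)
    # Simulated user behavior: B converts slightly more often than A.
    reward = 1.0 if rng.random() < (0.08 if option == "B" else 0.05) else 0.0
    update(option, reward)
```

The decaying epsilon mirrors the process described above: heavy random exploration early on, then increasingly picking the option the model believes is most valuable.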
So for targeted optimization, we start off with all of the possible targeting rules that can be represented by our model – a huge number. As we learn the impact of each of the options, we begin to tune the model. This tuning is actually weeding out the poorly performing rules. Hopefully, over time, our model will converge (finish learning), and what we will be left with is the set of the most valuable targeting rules that we can use to improve our users’ experiences.