
Statistical significance in marketing – not only data experts should know this

Gut feeling or data analysis? If you want to trust your own data, there is no getting around the topic of statistical significance. Our guest author explains why this is so.

Let’s not kid ourselves: many decisions in companies, and especially in marketing, are still made on gut instinct. Phrases like “data is the new gold” have been preached for years, yet series like “Mad Men” still glorify bold decisions that happen to pay off handsomely. At first glance, today’s reality in digital marketing looks quite different: there is a clear performance orientation, much is quantifiable, and what gets done is what has been proven to work.

And yet, in the end it is often still a gut decision what gets tested in the first place. And even when A/B tests are used to evaluate different variants of a landing page or different CTAs, there are still a number of pitfalls in interpreting the results.

This text is therefore a plea that everyone responsible for marketing should engage with the principle of statistical significance, and with how to use it when experimenting with marketing measures.



What is statistical significance and why is it important?

“If a statistical result is described as significant, this expresses that the probability of error that an assumed hypothesis also holds for the population does not exceed a fixed level,” as Statista puts it. A little more simply: if a relationship observed in a measurement is statistically significant, it does not merely occur by chance in the measured data (the sample), but can be generalized to the population.

This also makes clear what statistical significance contributes to decision-making: it can reveal relationships and thus show how pulling certain levers will have an effect. It is basically the opposite of deciding on gut instinct.




How do you calculate statistical significance?

The chi-square test, first described in 1900 by the British mathematician Karl Pearson, plays an important role in calculating statistical significance. Without going into too much detail here: as the name suggests, the test works with the squared differences between observed and expected frequencies in order to track down the effect of possible variables, that is, the levers just mentioned.

Caution, now comes a formula, namely the decision rule by which the chi-square test is used in our context:

p < a

In practice this means: the result of a test or an experiment is statistically significant if the probability p (for probability) of seeing such a result by pure chance is lower than the threshold value a (also known as the alpha value). Put simply: statistically significant means that the probability of the result being mere coincidence is very low; instead, it was triggered by the lever under examination.
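Expressed as code, the decision rule is nothing more than a comparison. A minimal Python sketch, using the common (but by no means mandatory) 5 % threshold as the default:

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """A result counts as statistically significant when the probability
    of seeing it by pure chance (p) is below the chosen threshold (alpha)."""
    return p_value < alpha

# 5 % is a widespread convention for alpha, not a law of nature:
print(is_significant(0.03))  # True: p is below the 5 % threshold
print(is_significant(0.20))  # False: far too likely to be pure chance
```

The whole art lies in obtaining a trustworthy p value and choosing a sensible alpha, which is exactly what the following sections are about.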



Why is statistical significance fundamental in digital marketing?

It has probably already become clear why checking (test) results for statistical significance is so important, especially in marketing. A practical example helps:

The marketing team wants to know whether one message (Messaging A) works better on a landing page than another (Messaging B). An obvious test is to pull the “messaging” lever and see how this affects the conversion rate. If the conversion rate actually changes during this test, the marketing team wants to know whether the change was mere coincidence or whether pulling the lever was decisive, in other words: whether the result is statistically significant.



How is statistical significance used in A/B testing?

How can this example be applied in practice? In many cases an A/B test is the tool of choice. If the marketing team simply switched to Messaging B and then watched how the conversion rate develops, it would also be comparing two different time periods: before and after the messaging change. It is therefore better to split the traffic arriving at the landing page into group A, which sees the original Messaging A, and group B, which sees the Messaging B to be tested. This basic principle of the A/B test can be applied to many areas and questions in modern marketing. For example, one could examine the effect of pulling levers on the following:

  • Email clicks, open rates, and engagement
  • Responses to notifications
  • Conversion rates of push notifications
  • Customer reactions and browsing behavior
  • Reactions to product launches
  • Call-to-action (CTA) interactions on the website
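In practice, the traffic split described above is often done deterministically, for example by hashing a visitor id. A minimal sketch (the experiment name and visitor ids here are made up for illustration):

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "landing-messaging") -> str:
    """Deterministically split traffic 50/50 between variants A and B.

    Hashing the visitor id together with an experiment name keeps each
    visitor in the same group on every visit, while different experiments
    get independent, uncorrelated splits.
    """
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # an evenly distributed bucket from 0 to 99
    return "A" if bucket < 50 else "B"

# The same visitor always lands in the same group:
print(assign_variant("visitor-42"), assign_variant("visitor-42"))
```

The key property is stability: a visitor who returns mid-test must not hop between variants, or the measured conversion rates become meaningless.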



The 6 steps to applying statistical significance to A/B testing

How does this work in practice if you want to use A/B tests to find out which levers matter? Basically, you should always follow these six steps:

1. Formulate the null hypothesis

Put simply, the null hypothesis states that pulling the lever under examination will not affect the desired result. In our example: switching to Messaging B will not affect the landing page’s conversion rate. In this respect, the null hypothesis describes the baseline.



2. Formulate an alternative hypothesis

The logical counterpart to the null hypothesis is the alternative hypothesis, which describes the hoped-for effect. In the example above, this would be the assumption that Messaging B significantly increases the conversion rate.



3. Set the test threshold

Next, the threshold a from the decision rule comes into play. The lower you set this threshold, the “stricter” the test: the clearer the relationship between the lever and the desired result must be in order to count as statistically significant. A useful rule of thumb: the more far-reaching the change (if, for example, the effects of a completely rebuilt landing page are to be examined), the lower the threshold should be set. If, on the other hand, it concerns a minor, easily reversed change (such as a different confirmation button), a somewhat higher threshold is acceptable.

4. Run the test

Then it’s time for the actual A/B test. At this point you split the traffic and observe both variants over a defined period. In the example described above, you would compare at the end of the test period whether the group with Messaging A or the one with Messaging B achieved the better conversion rate. If B shows the better results, the alternative hypothesis is provisionally supported.



5. Use the chi-square method

And now it’s getting really serious, because now the chi-square test comes into play. If you want to find out in detail how this works, you can take a look at how Scott Klemmer of the University of California, San Diego explains it.

In any case, the chi-square test clarifies whether the results are statistically significant, that is, whether the probability p is actually below the threshold value a.
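This step can be sketched in a few lines of Python. The helper below is a minimal, standard-library-only version of Pearson’s chi-square test for a 2×2 conversion table; the visitor numbers are made up for illustration, and in practice a library routine such as scipy.stats.chi2_contingency (which by default also applies a continuity correction) would typically be used instead:

```python
import math

def chi_square_2x2(conv_a: int, total_a: int, conv_b: int, total_b: int):
    """Pearson chi-square test for a 2x2 table.

    Rows: variant A / variant B; columns: converted / not converted.
    Returns (chi2 statistic, p-value). With 1 degree of freedom the
    p-value is erfc(sqrt(chi2 / 2)), so no stats library is needed.
    """
    observed = [
        [conv_a, total_a - conv_a],
        [conv_b, total_b - conv_b],
    ]
    grand = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [conv_a + conv_b, grand - conv_a - conv_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed[i][j] - expected) ** 2 / expected

    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function, df = 1
    return chi2, p_value

# Hypothetical test: 200 of 5,000 visitors convert with Messaging A,
# 260 of 5,000 with Messaging B.
chi2, p = chi_square_2x2(200, 5000, 260, 5000)
alpha = 0.05
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, significant: {p < alpha}")
```

In this made-up example the p value lands well below 0.05, so the lift from A to B would count as statistically significant at the usual threshold.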



6. Translate results into (meaningful) measures

Let us assume that in our example Messaging B actually delivered a statistically significant and better conversion rate than Messaging A. In that case, everything speaks in favor of showing Messaging B to all of the traffic arriving at the landing page.
If the result is not statistically significant, that does not immediately take Messaging B out of the running. In this case (especially if the result is close) you should first run a further, more extensive A/B test with a larger sample, i.e. more traffic.
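How much larger should the sample be? A rough answer comes from the standard normal-approximation formula for comparing two proportions. The sketch below (the conversion rates are made-up examples, and this is an estimate, not a substitute for a proper power calculator) shows how quickly the required traffic grows when the expected lift is small:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(p_a: float, p_b: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Rough per-group sample size to detect a lift from rate p_a to p_b,
    using the normal approximation for comparing two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided test at threshold alpha
    z_beta = z.inv_cdf(power)           # desired statistical power
    p_bar = (p_a + p_b) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p_a * (1 - p_a) + p_b * (1 - p_b))) ** 2
    return math.ceil(numerator / (p_b - p_a) ** 2)

# Detecting a lift from 4 % to 5 % conversion needs far more visitors
# per variant than detecting a lift from 4 % to 8 %:
print(sample_size_per_variant(0.04, 0.05))
print(sample_size_per_variant(0.04, 0.08))
```

The practical lesson: the smaller the effect you hope to confirm, the more traffic the follow-up test needs, so plan the test duration accordingly.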



Conclusion: What you should also pay attention to

Finally, it should not go unmentioned that some typical mistakes are made over and over again with A/B tests, and they are better avoided.

  • Running A/B tests without need. It is sometimes forgotten that an A/B test, even when set up optimally, takes time. For changes that are inexpensive or easily reversed, a test can therefore sometimes be skipped. For irreversible changes, however, you should definitely check for statistical significance.
  • Too little variation or too few comparisons. Sad but true: reality is usually far more complicated than the example described above, and other levers may interfere with the result. This should be investigated through further tests.
  • Building in bias. When setting up an A/B test, it can easily happen that the later results are unintentionally distorted, for example because the test reaches certain regions of the world or specific socio-demographic groups disproportionately. If you then generalize the results regardless, you end up with completely wrong assumptions about your own target group.

If you keep these points in mind and follow the steps outlined above, A/B tests checked for statistical significance are an extremely helpful tool. They can provide insights that help companies in virtually any industry. Your gut feeling may be right now and then, but statistical significance, used correctly, tells you exactly how far you can trust your data.
