Testing Tuesday: A weekly series on how to get the most out of direct-response fundraising tests
It seemed like snake oil. The Sales Dude was selling us a new email technique.
But one of the lessons you quickly learn in fundraising is not to trust your instincts too much. You'd be amazed how often gut instinct turns out to be dramatically wrong. So when you see a new idea -- even if your instincts are screaming "give me a break!" -- you learn to listen for facts before you draw conclusions.
That's even more true in the fast-changing world of digital fundraising.
Sales Dude knew that, so he quickly trotted out test results for his new email technique.
"In a recent test, our new technique beat the control by 50%," he crowed.
Around the table, several jaws dropped. You don't get an improvement that big very often.
Fortunately, there was a Math Nerd in the room who asked to see the numbers. Sales Dude smiled and passed a thick sheaf of printed-out spreadsheets.
Math Nerd leafed through the papers for a minute, then zeroed in on what they were looking for: the control had brought in four responses. The test of the new technique brought in six.
Exactly 50%.
But completely useless.
It didn't prove that the new technique was snake oil, but it did prove that Sales Dude was a snake oil salesman.
Because while 6 really is 50% bigger than 4, it's also a difference so small that it's statistically insignificant.
It's a number you can't trust. There's more noise than signal. If you ran the same test again, you'd be almost as likely to see the result flip as to see it repeat.
Because the results were so small, you haven't learned a thing.
Worse, it looks like you've learned something -- and that's the most dangerous outcome of all.
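Here's what that looks like in the math. Below is a minimal sketch of the standard two-proportion z-test, run on Sales Dude's numbers. The story never says how big the panels were, so the 5,000-emails-per-panel figure is an assumption -- the point survives either way.

```python
# A minimal sketch of the significance math, in plain Python.
# ASSUMPTION: 5,000 emails per panel (the story never gives panel sizes).
import math

def two_proportion_confidence(resp_a, size_a, resp_b, size_b):
    """Confidence (as a percentage) that two response rates really
    differ, via a two-tailed z-test on pooled proportions."""
    p_a, p_b = resp_a / size_a, resp_b / size_b
    pooled = (resp_a + resp_b) / (size_a + size_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / size_a + 1 / size_b))
    z = abs(p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))  # two-tailed
    return (1 - p_value) * 100

# Sales Dude's test: 4 responses vs. 6.
print(f"{two_proportion_confidence(4, 5000, 6, 5000):.0f}% confidence")
# -> ~47% confidence: more noise than signal
```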
Statistical significance is expressed as a percentage. Roughly speaking, it's how confident you can be that the difference you measured is real -- that it would hold up if you ran the test again instead of dissolving into random noise.
People who understand testing require a significance of 90%, 95%, sometimes even more -- meaning you can be at least 90% confident the difference is real. At that level, you can pretty confidently say you've learned something from your test.
That percentage is driven by three things: the quantity you test, the response rate, and the size of the difference between the test panels.
If you test a very small number, you almost guarantee you won't reach significance. If you get a very small response rate, same thing.
If your panels give results that are very close -- even if you started with a huge quantity -- sorry, not significant. Some people call it a "tie," though it's not exactly that. It's just untrustworthy information.
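To see how much quantity matters, take a hypothetical 50% lift -- a 0.08% response rate against 0.12%, both invented numbers -- and run it through the two_proportion_confidence() sketch above at three panel sizes:

```python
# Same hypothetical 50% lift, three different quantities.
# Reuses two_proportion_confidence() from the sketch above.
for size in (5_000, 50_000, 500_000):
    conf = two_proportion_confidence(round(size * 0.0008), size,
                                     round(size * 0.0012), size)
    print(f"{size:>7,} per panel -> {conf:.0f}% confidence")
# ->   5,000 per panel:  ~47%  (noise)
# ->  50,000 per panel:  ~95%  (barely readable)
# -> 500,000 per panel: ~100%  (clearly significant)
```

The lift never changed. Only the quantity did.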
The math you do to calculate significance is complicated. Fortunately, there's an easy way: You can use a free online calculator.
My favorite significance calculator is the Mal Warwick DonorDigital Statistical Calculator, but google "statistical significance calculator," and you'll find many great options.
Every time you run a test, use one of these calculators to check the significance.
Always project significance up front to see if it even makes sense to test. Then run the calculation again with your actual numbers to see whether the results are readable.
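Projecting up front means asking, before you mail anything: how many names does each panel need for a readable result? Here's a back-of-the-envelope sketch using the standard sample-size formula for comparing two proportions. The 0.08% baseline and the hoped-for 50% lift are assumptions you'd swap for your own numbers.

```python
# Sketch: names needed per panel before a test is worth running.
# Standard sample-size formula for comparing two proportions.
import math
from statistics import NormalDist

def panel_size_needed(base_rate, lift, confidence=0.95, power=0.80):
    """Names per panel to detect `lift` (0.5 = +50%) over `base_rate`
    at the given confidence level and statistical power."""
    p1 = base_rate
    p2 = base_rate * (1 + lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# ASSUMED: 0.08% baseline response, hoping to confirm a 50% lift.
print(panel_size_needed(0.0008, 0.5))  # -> roughly 98,000 names per panel
```

If your file can't support panels that big, the test can't give you a readable answer -- better to know that before you mail.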