Testing Tuesday: A weekly series on how to get the most out of direct-response fundraising tests
Everyone tells you that you should be testing your fundraising messages.
But should you?
Possibly not. Most fundraisers shouldn't be testing.
Not because testing isn't worthwhile, but because the large majority of organizations don't have enough donors to give trustworthy test results.
For most nonprofits, the chance that test results will be untrustworthy is high. Almost certain. It would be like going to the doctor to have your blood tested for some condition, and finding out that a false positive or false negative is more likely than an accurate result.
You don't want that for your health or for your fundraising!
Here's a quick way to estimate whether you have enough quantity to do a valid test:
- Determine your testing quantity, that is, the total number of pieces you are going to mail or email.
- Split that number in half if it's a two-panel test -- into thirds if it's a three-panel test, and so on.
- Project the likely response rate of the mailing. Make your best educated guess, erring on the side of pessimism.
- Calculate how many responses that rate works out to for each panel.
If that final number is under 100, don't test.
Because you don't have the quantity to give you valid results.
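If you like seeing the arithmetic spelled out, here's that rule of thumb as a few lines of Python. It's just a sketch: the function names are made up, and the only "rule" in it is the 100-responses-per-panel floor described above.

```python
# A minimal sketch of the rule of thumb above (names are illustrative only).
def projected_responses_per_panel(total_quantity, panels, projected_rate):
    """Projected responses per panel: panel size times projected response rate."""
    return (total_quantity / panels) * projected_rate

def enough_to_test(total_quantity, panels, projected_rate, minimum=100):
    """True if each panel is projected to clear the 100-response floor."""
    return projected_responses_per_panel(total_quantity, panels, projected_rate) >= minimum

# A 5,000-piece mailing split into two panels at a projected 6% response
# (the same numbers as the example worked through below):
print(enough_to_test(5000, panels=2, projected_rate=0.06))  # True -- 150 per panel
```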
Example:
Say you want to test whether a blue envelope or a green envelope will work better in a mailing to your donors. It's your March appeal, generally an average performer.
- Your mailing list is 5,000.
- Each test panel will be 2,500.
- You project 6% response.
- That's 150 responses per panel.
Go ahead and test! You have a chance of getting valid, statistically significant results that you can believe.
But let's say you want to do the same test for a donor acquisition mailing to a list of 5,000.
Because it's donor acquisition, you project 1% response. That's 25 responses per panel.
Don't test!
Projecting 100+ responses per panel doesn't guarantee you'll get statistically significant results. But if you're below 100, you're pretty much guaranteed not to get them.
The whole issue here is statistical significance, which is how you separate "signal" from "noise" in testing.
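For the curious, here's a rough sketch of the math that phrase stands for: a two-proportion z-test comparing the response rates of two panels. The response counts below (170 for blue vs. 150 for green, out of 2,500 pieces each) are hypothetical numbers, purely for illustration.

```python
# A minimal sketch of the two-proportion z-test behind "statistically significant."
import math

def two_proportion_p_value(responses_a, mailed_a, responses_b, mailed_b):
    """Two-tailed p-value for the difference between two panels' response rates."""
    rate_a = responses_a / mailed_a
    rate_b = responses_b / mailed_b
    pooled = (responses_a + responses_b) / (mailed_a + mailed_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / mailed_a + 1 / mailed_b))
    z = (rate_a - rate_b) / std_err
    # Normal CDF via erf, then double it for a two-tailed test.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical result: blue envelope gets 170 responses, green gets 150,
# with 2,500 pieces mailed per panel.
p = two_proportion_p_value(170, 2500, 150, 2500)
print(f"p-value: {p:.2f}")  # about 0.25 -- far too noisy to declare a winner
```

A p-value that high means a difference of that size could easily be random noise, which is exactly why small panels so rarely produce results you can act on.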