Sampling for NPS surveys

Published in Envoy Engineering, Nov 21, 2017

Net Promoter Score (NPS) has been widely adopted as a measure of customer satisfaction and a predictor of growth. NPS works well as an industry benchmark because, in most cases, it is determined by asking a single survey question:

How likely is it that you would recommend our company/brand/product to a friend or colleague?

Here’s how it works: respondents to NPS surveys answer on a scale of 0 to 10. Score a 9 or 10, and you’re a Promoter. Score a 7 or 8, and you’re a Passive. Anything lower makes you a Detractor.

Promoters get assigned a +1, Passives a 0 and Detractors a -1. These values are averaged across all respondents and multiplied by 100 to give an NPS, which can fall anywhere between -100 (worst) and 100 (best).
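As a quick illustration of the arithmetic, here is a minimal sketch in R. The function name `nps` and the sample answers are made up for this example.

```r
# Compute an NPS from a vector of 0-10 survey answers.
nps <- function(scores) {
  stopifnot(all(scores >= 0 & scores <= 10))
  # Promoters (9-10) count +1, Passives (7-8) count 0, Detractors (0-6) count -1.
  points <- ifelse(scores >= 9, 1, ifelse(scores >= 7, 0, -1))
  100 * mean(points)
}

# Made-up answers: 5 Promoters, 2 Passives, 3 Detractors.
scores <- c(10, 9, 9, 8, 7, 6, 3, 10, 9, 2)
nps(scores)  # (5 - 3) / 10 * 100 = 20
```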

Many of the world’s top companies regularly conduct NPS surveys and benchmark themselves against their industry average, which runs a wide gamut from 31 for telecommunications to 74 for healthcare. Tesla, for example, is doing great at 97, while Pacific Gas and Electric Corp not so much at -6.

The sampling problem

At Envoy, we run NPS surveys daily, both in-app and through email. In particular, we care about 30-day NPS — calculated by taking the mean NPS over the last 30 days.
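To make the 30-day figure concrete, here is a rough sketch in R. It assumes one reading of the metric, namely pooling every response received in the trailing 30 days and scoring them together. The `responses` table, its column names and the function names are all hypothetical.

```r
nps <- function(scores) {
  points <- ifelse(scores >= 9, 1, ifelse(scores >= 7, 0, -1))
  100 * mean(points)
}

# Hypothetical responses: one row per completed survey.
responses <- data.frame(
  date  = as.Date("2017-11-21") - c(0, 3, 10, 29, 30, 45),
  score = c(10, 9, 6, 8, 2, 10)
)

# NPS over the 30 days ending on `as_of` (inclusive).
nps_30_day <- function(responses, as_of) {
  in_window <- responses$date > as_of - 30 & responses$date <= as_of
  if (!any(in_window)) return(NA_real_)  # nobody responded in the window
  nps(responses$score[in_window])
}

nps_30_day(responses, as.Date("2017-11-21"))  # scores 10, 9, 6, 8 give 25
```

Note the `NA` branch: a window with no respondents has no meaningful score, which is why the timing of surveys matters.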

Because we don’t want to spam our customers with surveys all the time, we throttle survey-sending to once every 90 days. This immediately presents a sampling problem: How can we determine the number of customers to survey each day, if we want this number x to draw a smooth curve over time?

For starters, we obviously can’t survey 100% of eligible customers at the start of every quarter, because then we’d have no respondents in the second and third months and our 30-day NPS would be meaningless during those periods.

Sampling 100% of customers results in months with no data

Intuitively, we ought to sample 1/90 of our customer base each day to send surveys out to. This is easier said than done. An approach we employed was to pull lists of eligible customers each day, pick 1.11% of them at random, and send them surveys.

This worked great on the first day.

On the second day, what needs to happen is: people surveyed on the first day have to be taken out of the pool. New eligible customers from the day before have to be added to the pool. When you include such complexities as only sending surveys on weekdays during work hours, and taking into account global timezones, this gets messy real fast.

On day 91, we have to do all that, plus add the people that were surveyed on day 1 back into the pool before sampling. This also means that 90 days is the least amount of time between surveys, when what we really want is for 90 days to be the exact amount of time between surveys.

# Sample roughly 1/90 of the eligible customers at random
set.seed(66)
df_sample <- df[sample(nrow(df), floor(nrow(df) / 90)), ]

Looking back, this was a terrible idea.

Smarter sampling

After some iterating, it turns out that the better solution is simply to assign each customer a random number between 1 and 90. This way, day 1’s cohort would comprise all the 1’s, day 2’s all the 2’s, and day 91’s all the 1’s again. We can write it this way:

# Assign each eligible customer a random cohort between 1 and 90
library(dplyr)
set.seed(66)
df <- mutate(df, cohort = sample(1:90, nrow(df), replace = TRUE))

Then, pick an arbitrary start date and determine today’s cohort:

# Determine current cohort
freq <- 90
start <- as.Date("1970-01-01")
cohort <- as.integer(Sys.Date() - start) %% freq + 1
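To check that the modulo arithmetic cycles as intended, here is a small sketch that wraps the same calculation in a function and feeds it fixed dates instead of `Sys.Date()`. The function name `cohort_for_date` is ours, purely for illustration.

```r
# Same calculation as above, with the date passed in explicitly.
cohort_for_date <- function(date, start = as.Date("1970-01-01"), freq = 90) {
  as.integer(as.Date(date) - start) %% freq + 1
}

start <- as.Date("1970-01-01")
days  <- start + c(0, 1, 89, 90, 180)  # day 1, day 2, day 90, day 91, day 181
cohort_for_date(days)                  # 1 2 90 1 1
```

Day 91 lands back on cohort 1, so a given customer comes up exactly once every 90 days.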

Finally, put it all together by identifying the correct cohort to survey:

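# Sample data frame: keep only customers in today's cohort.
# `!!cohort` forces the variable computed above; a bare `cohort == cohort`
# would compare the column with itself and keep every row.
df_sample <- filter(df, cohort == !!cohort)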

Remember how we were talking about drawing a smooth curve of customers surveyed over time? If you model out what we’ve put together, you’ll find a curve with progressively smaller steps around day 91 and every 90 days thereafter. The slope of the curve also closely mimics that of your customer growth, which is exactly what we’re looking for.
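If you want to model this yourself, one possible setup looks like the sketch below. Everything in it is an assumption made for illustration: 5,000 customers signing up uniformly over 360 days, each eligible from their signup day onward, with none of the other eligibility rules applied.

```r
set.seed(66)
n_days      <- 360
n_customers <- 5000

# Hypothetical customer base: random signup day and random cohort.
customers <- data.frame(
  signup_day = sample(1:n_days, n_customers, replace = TRUE),
  cohort     = sample(1:90, n_customers, replace = TRUE)
)

# On each day, survey every signed-up customer in that day's cohort.
surveyed_per_day <- sapply(1:n_days, function(day) {
  todays_cohort <- (day - 1) %% 90 + 1
  sum(customers$cohort == todays_cohort & customers$signup_day <= day)
})

# Running total of surveys sent, and totals per 90-day cycle.
cumulative <- cumsum(surveyed_per_day)
per_block  <- as.vector(tapply(surveyed_per_day, (1:n_days - 1) %/% 90, sum))
per_block
```

Plotting `cumulative` against the day number gives the curve described above. In this setup each 90-day cycle sends more surveys than the previous one, because the customer base keeps growing.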

Why sample manually?

Several NPS survey providers (Wootric and Delighted, to name a couple we’ve used) offer solutions to this very problem. They do a great job throttling surveys — all you really need to do is upload a CSV file of your customers, and they’ll take care of the rest.

So why would you roll your own sampling solution?

The simple answer is control. At Envoy, we always want to provide the best possible customer experience, so our criteria for selecting customers to survey include:

Have been on the platform for at least a month;
Have used the product at least once over the last 7 days;
Can only receive surveys between 8:00 AM and 5:00 PM on weekdays;
Belong to a particular set of users defined by permission level;
Work at companies that are active paying customers.
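Criteria like these can be expressed as a single filter. The sketch below is only an illustration in base R: the table, every column name, the `"admin"` permission level and the choice of 30 days for "a month" are assumptions, not our actual schema.

```r
# Hypothetical customer table; all column names are assumptions.
customers <- data.frame(
  id               = 1:4,
  signup_date      = as.Date(c("2017-01-10", "2017-11-10", "2017-03-01", "2017-02-15")),
  last_active_date = as.Date(c("2017-11-20", "2017-11-20", "2017-10-01", "2017-11-19")),
  permission_level = c("admin", "admin", "admin", "employee"),
  company_paying   = c(TRUE, TRUE, TRUE, TRUE),
  tz               = c("America/Los_Angeles", "America/Los_Angeles",
                       "Europe/London", "Asia/Tokyo"),
  stringsAsFactors = FALSE
)

# TRUE if `now` falls on a weekday between 8:00 and 17:00 in time zone `tz`.
in_work_hours <- function(now, tz) {
  local <- as.POSIXlt(now, tz = tz)
  local$wday %in% 1:5 && local$hour >= 8 && local$hour < 17
}

is_eligible <- function(customers, now, levels = c("admin")) {
  today <- as.Date(now, tz = "UTC")
  work_hours <- mapply(in_work_hours, tz = customers$tz,
                       MoreArgs = list(now = now))
  customers$signup_date <= today - 30 &        # on the platform for 30+ days
    customers$last_active_date >= today - 7 &  # active in the last 7 days
    customers$permission_level %in% levels &   # allowed permission levels
    customers$company_paying &                 # company is a paying customer
    unname(work_hours)                         # local weekday working hours
}

now <- as.POSIXct("2017-11-21 18:00:00", tz = "UTC")  # a Tuesday
customers$id[is_eligible(customers, now)]             # only customer 1
```

The cohort filter from earlier would then be applied on top of this, so that only eligible customers in today's cohort receive a survey.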

What we’ve discovered is that putting more thought into your sample results in a more accurate and authentic NPS. This process, which we run on Heroku off an R buildpack, helps us to continue building great products around customer satisfaction and feedback.

Written by Kai Chan