MaxDiff is a solid technique. I’ve used it, taught it, built variations of it—and I still recommend it all the time. It forces people to make real choices, it gets past the mushy middle of rating scales, and it gives you a clean, ranked list of priorities. What’s not to like?
Well… maybe the way it’s usually done.
Most MaxDiff exercises ask people to pick both a “most” and a “least” from a set of items. It’s just the standard. Over time, that standard has become something people—researchers, clients, everyone—accept without really thinking. We do it that way because that’s how it’s always been done.
But I started wondering whether that second pick, the “least,” was actually pulling its weight. Was it really giving us useful information? Or were we just going through the motions because the template said so?
We put that curiosity to the test. A lot. We took typical MaxDiffs (15 items, five per screen, basic stuff) and analyzed the results two ways: once using just the “most” selections and once using the standard full “most + least” approach. What we found was kind of ridiculous: the aggregate correlation between the two sets of utilities was 0.99. In other words, asking just “most” told us essentially the same thing as asking both “most” and “least.”
Even when we looked at the individual level, where noise usually creeps in, we saw an average correlation of 0.89. When we focused on the top half of the items in importance—the part that informs most real-world decisions—it was even tighter. Sure, there was a little drop in precision at the less important end, but let’s be honest—no one’s building strategy around their 13th or 14th ranked feature. Losing a bit of clarity down there isn’t a big deal.
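If you want to kick the tires on this yourself, here’s a minimal sketch of the comparison in Python. To be clear, this is not our analysis pipeline: the respondents are simulated, and simple best/worst counts stand in for however you normally estimate utilities, so the exact numbers it prints are illustrative only.

```python
# A rough sketch, not a production analysis: simulated respondents, and simple
# count-based scores in place of modeled utilities. Illustrative only.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(7)
n_items, n_resp, n_screens, per_screen = 15, 300, 9, 5

# Give each respondent noisy personal utilities around a shared set of means.
true_util = rng.normal(0, 1, n_items)
ind_util = true_util + rng.normal(0, 0.5, (n_resp, n_items))

most_counts = np.zeros((n_resp, n_items))   # "most" picks only
bw_counts = np.zeros((n_resp, n_items))     # "most" minus "least" (best-worst)

for r in range(n_resp):
    for _ in range(n_screens):
        shown = rng.choice(n_items, per_screen, replace=False)
        noisy = ind_util[r, shown] + rng.gumbel(size=per_screen)  # choice noise
        best, worst = shown[np.argmax(noisy)], shown[np.argmin(noisy)]
        most_counts[r, best] += 1
        bw_counts[r, best] += 1
        bw_counts[r, worst] -= 1

# Aggregate correlation between the "most only" and "most + least" scores.
agg_r, _ = pearsonr(most_counts.sum(axis=0), bw_counts.sum(axis=0))

# Average correlation at the individual respondent level.
ind_r = np.mean([pearsonr(most_counts[r], bw_counts[r])[0] for r in range(n_resp)])

print(f"aggregate r = {agg_r:.2f}, mean individual r = {ind_r:.2f}")
```

The point isn’t the specific numbers the sketch spits out; it’s that the check is easy to run on your own MaxDiff data by swapping the simulated responses for real ones.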
So, we stopped asking about “least” and started using that space for something better.
Double the fun
Instead of asking “least,” we decided to ask another “most”: a second dimension. Same screen. Same number of items. Same cognitive load on the respondent. Twice the value.
The first time we did this was for a client in oral care. They were testing a dozen messages and had two goals: figure out which ones would motivate people to act (i.e., go to the dentist) and which ones would actually differentiate the brand from the others crowding the space.
They thought they’d need two separate studies, or at least a long, bloated survey. We gave them a more practical solution—one nine-screen Duo MaxDiff.
We asked:
- “Which of these is most motivating?”
- “Which is most differentiating?”
That’s it. No fancy build-up. As the results came in, we plotted the messages on a perceptual map, with motivation on one axis and differentiation on the other. The story basically told itself: three messages stood out, a few were strong on one dimension but not the other, and some were just…there. The client made decisions faster than I’d ever seen them move. They were sold, and they’ve asked for Duo MaxDiffs in plenty of studies since, as have other clients who’ve seen what it can do.
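For the analytically inclined, here’s roughly what that map looks like in code. The message labels and scores below are placeholders rather than the client’s data, and the layout choices (quadrant lines at each dimension’s mean, for instance) are just one sensible way to draw it.

```python
# A sketch of a Duo MaxDiff perceptual map. Placeholder messages and scores;
# in practice you would plug in one score per message on each dimension.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
messages = [f"Message {c}" for c in "ABCDEFGHIJKL"]   # a dozen placeholder messages
motivation = rng.normal(0, 1, len(messages))          # "most motivating" scores
differentiation = rng.normal(0, 1, len(messages))     # "most differentiating" scores

fig, ax = plt.subplots(figsize=(7, 7))
ax.scatter(motivation, differentiation)
for label, x, y in zip(messages, motivation, differentiation):
    ax.annotate(label, (x, y), textcoords="offset points", xytext=(5, 5))

# Quadrant lines at each dimension's mean: the upper right is the sweet spot,
# messages that both motivate and differentiate.
ax.axvline(motivation.mean(), linestyle="--", linewidth=1)
ax.axhline(differentiation.mean(), linestyle="--", linewidth=1)
ax.set_xlabel("Motivation")
ax.set_ylabel("Differentiation")
ax.set_title("Duo MaxDiff perceptual map")
plt.tight_layout()
plt.show()
```

One scatter plot, two questions’ worth of insight, and the quadrants do the storytelling.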
Why it works (and where it doesn’t)
We’ve used this across countless spaces—QSR, CPG, tech, retail—and it holds up. Respondents like it better, the data is cleaner, and the outputs are easier for clients to actually do something with. We’ve also learned what kinds of dimensions pair well (spoiler: not every combo plays nice), how to write instructions that don’t cause confusion, and—this one’s important—why we stop at two.
Because yeah, we tried three once. Just to see. It didn’t work. People either picked the same item three times, randomly tapped something, or just gave up. The data was garbage, and the client hated it, so we pulled the plug. Lesson learned: two’s your limit. After that, you’re just punishing people.
What I like most is that this didn’t come from theory. It came from asking real questions, pushing against the defaults, and building something better. That’s the kind of work we try to do all the time, not just with MaxDiff.
The wrap-up (not the polished kind)
MaxDiff still works, but that “least” pick you’re clinging to? It might be the least helpful thing in the whole survey.
You don’t need to blow up your research plan. You just need to stop assuming the standard version is the best version. If you’ve got a message test or prioritization problem, and you want something smarter than the copy-paste template, give us a call. We’ve already broken the method—might as well let us build the better version for you.
Connect with Rob Kaiser, Chief Methodologist, at robkphd@psbinsights.com to learn more.