An aspiring entrepreneur could be forgiven for thinking that dropping out of college to start a company is the key to success. After all, it worked beautifully for Steve Jobs, Bill Gates and Mark Zuckerberg. These business moguls’ well-known stories give the impression that to triumph in business, all you need is a big idea in college and the will to quit school to pursue it. The problem is that college dropouts do not usually become billionaires—there are many more budding entrepreneurs who dropped out of college to start companies and failed than those who succeeded. When you focus on the people who left school and made it big and ignore the far larger set of dropouts who never got anywhere, you are succumbing to what is known as “survivorship bias.”

Sendhil Mullainathan, a professor of computation and behavioral science at the University of Chicago Booth School of Business, has thought a lot about how to avoid such logical errors. Recently, Katy Milkman, a professor at the Wharton School at the University of Pennsylvania, got to chat with Mullainathan about survivorship bias and the poor decisions it can produce in an interview for the podcast Choiceology.

[An edited excerpt of the interview follows.]

Can you explain what survivorship bias is?

Imagine you got a letter in the mail that says, “Hey Katy, I have a new stock-picking trick. And since I know you don’t trust me yet, I want to tell you to look at Whatever Incorporated tomorrow, and it’s going to go up.”

You say, “Well, I don’t know.” But you look at it, and it went up. But anyone can get lucky. So the next week, you get another letter saying, “Tomorrow, I want you to look at Johnson Incorporated, and it’s going to go down.” Now you’re intrigued. You look at Johnson Incorporated, and it does go down. Now you’re waiting for the third letter. It does come, and it’s exactly right.

Now the person says, “If you’re interested in having me as your advisor, you should call me.” Do you see where all this is headed?

This is actually a scam that was run in the 1940s or maybe the 1930s. They sent a bunch of random guesses to 10,000 people. Half the time, they were right. So then, to that half of the people, they sent another bunch of random guesses, which, half the time, were right.

When you start with 10,000, after four guesses, you’ve divided the pool by 16, which is still a pretty big population who now thinks you’re amazing. And what the population is suffering from here is survivorship bias. They’re the set of people who happened to survive, and so now they have this entirely false belief.
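The arithmetic of the scam can be sketched in a few lines. The 10,000-person starting pool comes from the anecdote; the exact halving each round is the scam’s design, since half the recipients are told “up” and half “down”:

```python
# Sketch of the stock-letter scam: each round, half the remaining
# recipients get an "up" prediction and half get "down," so exactly
# one half sees a correct call and stays in the pool.
pool = 10_000  # starting recipients, per the anecdote

for prediction in range(1, 5):
    pool //= 2  # only the half that saw a correct guess survives
    print(f"after prediction {prediction}: {pool} recipients have seen only correct calls")

# After four predictions, 10,000 / 2**4 = 625 people have watched the
# sender go four for four purely by construction.
```

From the inside, each of those 625 people has seen four perfect predictions and no misses; the misses were mailed to someone else.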

Survivorship bias is an error that arises because we look at the data we have but ignore the selection process that led us to have those data. That principle applies in so many places, especially to people like you and me.

Say more about that principle. How does it apply to us, specifically?

Anybody who’s had a set of positive, lucky events that led them to be successful in life—they don’t think of themselves as people who happened to get Whatever Incorporated right and then happened to get Johnson Incorporated right. They think of themselves as talented people.

I think of myself as very, very, very lucky, actually.

Yeah, but it’s easy to say we think of ourselves as lucky. I suspect we still think of ourselves as more talented than our equivalent self who didn’t catch the same breaks.

I’m sure that’s true. Yet we give advice as if we know exactly how to succeed.

Look at billionaires. No one says, “That’s a person who won a lottery ticket.” People say, “I would love advice from that person.”

So I think survivorship bias really colors how we look at the world, because it leads us to look at these highly selected events and then make inferences and say, “Oh, that manager and that person must be good.”

Are there other ways we can be biased by seeing only a select subsample of the data?

My colleagues and I, we’ve been spending a lot of time looking at medical decision-making. Say you walk into an emergency room, and you might or might not be having a heart attack. If I test you, I learn whether I’m making a good decision or not. But if I say, “It’s unlikely, so I’ll just send her home,” it’s almost the opposite of survivorship bias. I never get to learn if I made a good decision. And this is supercommon, not just in medicine but in every profession.

Similarly, there was a study showing that people who had car accidents were also more likely to have cancer. It was kind of a puzzle until you think, “Wait, who do we measure cancer in?” We don’t measure cancer in everybody. We measure cancer in people who have been tested. And who do we test? We test people who are in hospitals. So someone goes to the hospital for a car accident, and then I do an MRI and find a tumor. And that makes car accidents appear to elevate the rate of tumors. So anything that gets you into hospitals raises your “cancer rate,” but that’s not your real cancer rate.
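The hospital-testing effect described above can be reproduced with a toy simulation. All the rates below are made-up numbers chosen for illustration; the only structural assumption, taken from the anecdote, is that a tumor is recorded solely when a person happens to be scanned in a hospital:

```python
import random

random.seed(1)

N = 200_000
TRUE_CANCER_RATE = 0.02        # identical for everyone, by construction
ACCIDENT_RATE = 0.05
BASELINE_HOSPITAL_RATE = 0.10  # chance of being scanned for non-accident reasons

recorded = {True: 0, False: 0}  # detected cancers, keyed by accident status
count = {True: 0, False: 0}     # group sizes

for _ in range(N):
    has_cancer = random.random() < TRUE_CANCER_RATE
    had_accident = random.random() < ACCIDENT_RATE
    count[had_accident] += 1
    # A tumor only enters the data if the person happens to be scanned,
    # and an accident guarantees a hospital visit.
    scanned = had_accident or random.random() < BASELINE_HOSPITAL_RATE
    if scanned and has_cancer:
        recorded[had_accident] += 1

for group, label in [(True, "accident"), (False, "no accident")]:
    print(f"{label}: recorded cancer rate = {recorded[group] / count[group]:.3%}")
```

Even though every simulated person has the same 2 percent true cancer rate, the recorded rate for accident victims comes out roughly an order of magnitude higher than for everyone else, because everyone else is rarely scanned.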

That’s one of my favorite examples, because it really illustrates how even with something like cancer, we’re not actually measuring it without selection bias, because we only measure it in a subset of the population.

How can people avoid falling prey to these kinds of biases?

Look at your life and where you get feedback and ask, “Is that feedback selected, or am I getting unvarnished feedback?”

Whatever the claim—it could be “I’m good at blank” or “Wow, we have a high hit rate” or any sort of assessment—think about where the data comes from. Maybe it’s your past successes. And this is the key: Think about the process that generated the data. What are all the other things that could have happened that might have led me to not measure it? In other words, if I say, “I’m great at interviewing,” you say, “Okay. Well, what data are you basing that on?” “Well, my hires are great.” You can counter with, “Have you considered the people you have not hired?”

It’s a very simple thing, where you just need to ask the question: What’s the data that’s not present?