Survivorship Bias

meret118:

white-aster:

dieselpunkisdad:

I have posted about survivorship bias and how it affects your career choices: when a Hollywood actor gives the classic “follow your dreams and never give up” line, it is bad advice and pure survivorship bias at work.

When I read the Wikipedia page on it, I came across an interesting story:

During WWII the US Air Force wanted to minimize bomber losses to enemy fire. The Center for Naval Analyses ran a study of where bombers tend to get hit, with the explicit aim of reinforcing the parts of the airframe that are most likely to receive incoming fire. This is what they came up with:

[Image: outline of a bomber with red dots marking where returning aircraft were hit]

So, they said: the red dots are where bombers are most likely to be hit, so put some more armor on those parts to make the bombers more resilient. That looked like a logical conclusion, until Abraham Wald – a mathematician – started asking questions: 

– How did you obtain that data?
– Well, we looked at every bomber returning from a raid, marked the damage to the airframe on a sheet, and collected the sheets from all Allied air bases over months. What you see is the result of hundreds of those sheets.
– And your conclusion?
– Well, the red dots are where the bombers were hit. So let’s reinforce those parts, because they are most exposed to enemy fire.
– No. The red dots are where a bomber can take a hit and still return. The bombers that took a hit to the ailerons, the engines or the cockpit never made it home. That’s why they are absent from your data. The blank spots are exactly where you have to reinforce the airframe, so those bombers can return.

This is survivorship bias. You only see a subset of the outcomes: the ones that made it far enough to be visible. Look out for the absence of data. Sometimes it tells a story of its own.
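If you want to see the effect for yourself, here is a small toy simulation in Python. It is only a sketch under made-up assumptions, not a reconstruction of the actual wartime analysis: the section names, the three hits per bomber, and the 80% chance that a hit to the engines or cockpit downs the plane are all invented for illustration. Every section is hit equally often, but only the bombers that return get counted.

```python
import random
from collections import Counter

# Hypothetical airframe sections; enemy fire hits each one equally often.
SECTIONS = ["wings", "fuselage", "tail", "engines", "cockpit"]
# Assumption for this sketch: a hit here usually downs the bomber.
CRITICAL = {"engines", "cockpit"}


def simulate(n_bombers=1000, hits_per_bomber=3, p_fatal=0.8, seed=0):
    """Return (all_hits, observed_hits) as Counters keyed by section.

    all_hits counts every hit on every bomber.
    observed_hits counts only the hits on bombers that made it home.
    """
    rng = random.Random(seed)
    all_hits = Counter()
    observed_hits = Counter()
    for _ in range(n_bombers):
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_bomber)]
        all_hits.update(hits)
        # Each hit to a critical section downs the bomber with probability p_fatal.
        lost = any(s in CRITICAL and rng.random() < p_fatal for s in hits)
        if not lost:
            observed_hits.update(hits)  # only survivors end up on the sheets
    return all_hits, observed_hits


all_hits, observed_hits = simulate()
for section in SECTIONS:
    print(f"{section:9s} all hits: {all_hits[section]:4d}   "
          f"seen on returners: {observed_hits[section]:4d}")
```

In the "all hits" column the sections come out roughly equal, because that is how the toy world was built. In the "seen on returners" column the engines and cockpit look almost untouched, which is exactly the pattern that tempted the analysts to armor the wrong parts.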

BTW: You can see the result of this research today. This is the exact reason the A-10 has the pilot sitting in a titanium armor bathtub and has its engines placed high and shielded.

If you want to think scientifically, ALWAYS ask what data was included in a conclusion. And ALWAYS ask what data was EXCLUDED when making a conclusion.

If they have excluded information because “it doesn’t exist” or “it was too hard to get” or “it was good data but was provided by people we don’t like”, then that is a BIG RED FLAG that the analysis was flawed.

Another example of this: doctors originally thought smoking protected people from developing dementia, until someone pointed out that smokers usually didn’t live long enough to get the most common forms.
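The same trap can be sketched in a few lines of Python. All the numbers below (ages, spreads, sample size) are invented for illustration and are not epidemiological data. In this toy model smoking has no effect at all on when dementia would start; it only lowers the average age of death. A person is counted as a dementia case only if the onset comes before death.

```python
import random


def dementia_rate(mean_death_age, n=20000, seed=0):
    """Fraction of a simulated group that develops dementia before dying.

    Hypothetical model: the age at which dementia would begin is drawn
    from the same distribution for everyone, so any difference between
    groups comes purely from how long people live.
    """
    rng = random.Random(seed)
    diagnosed = 0
    for _ in range(n):
        death_age = rng.gauss(mean_death_age, 8)
        onset_age = rng.gauss(85, 8)  # identical for smokers and non-smokers
        if onset_age < death_age:
            diagnosed += 1
    return diagnosed / n


# Assumed mean ages at death, chosen only to make the effect visible.
non_smokers = dementia_rate(mean_death_age=80)
smokers = dementia_rate(mean_death_age=70)
print(f"non-smokers with dementia: {non_smokers:.1%}")
print(f"smokers with dementia:     {smokers:.1%}")
```

The smokers come out with a lower dementia rate even though smoking protects nobody in this model. The missing cases are the smokers who died before dementia had a chance to show up.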

About C.A. Jacobs

Just another crazy person, masquerading as a writer.