Beyond the White Coat

The type II error and black holes


An international group of scientists has announced that it has an image of a black hole. This feat of scientific achievement and teamwork is another giant step in humankind’s understanding of the universe. It isn’t easy to find something that isn’t there. Black holes exist and this one is about 6.5 billion times more massive than Earth’s sun. That is a lot of “there.”

Dr. Kevin T. Powell

In medical research, most articles are about discovering something new. Lately, it is also common to publish studies that claim that something doesn’t exist. No difference is found between treatment A and treatment B. Two decades ago those negative studies rarely were published, but there was merit in the idea that more of them should be published. However, that merit presupposed that the negative studies worthy of publication would be well designed, robust, and, most importantly, contain a power calculation showing that the methodology would have detected the phenomenon if the phenomenon were large enough to be clinically important. Alas, the literature has been flooded with negative studies finding no effect because the studies were hopelessly underpowered and never had a realistic chance of detecting anything. This fake news pollutes our medical knowledge.

To clarify, let me provide a simple example. With my myopia, at 100 yards and without my glasses, I can’t detect the difference between LeBron James and Megan Rapinoe, although I know Megan is better at corner kicks.

Now let me give a second, more complex example that obfuscates the same detection issue. Are there moons circling Jupiter? I go out each night, find Jupiter, take a picture with my trusty cell phone, and examine the picture for any evidence of an object(s) circling the planet. I do this many times. How many? Well, if I only do it three times, people will doubt my science, but doing it 1,000 times would take too long. In my experience, most negative studies seem to involve about 30-50 patients. So one picture a week for a year will produce 52 observations. That is a lot of cold nights under the stars. I will use my scientific knowledge and ability to read sky charts to locate Jupiter. (There is an app for that.) I will use my experience to distinguish Jupiter from Venus and Mars. There will be cloudy days, so maybe only 30 clear pictures will be obtained. I will have a second observer examine the photos. We will calculate a kappa statistic for inter-rater agreement. There will be pictures and tables of numbers. When I’m done, I will publish an article saying that Jupiter doesn’t have moons because I didn’t find any. Trust me, I’m a doctor.

Science doesn’t work that way. Science doesn’t care how smart I am, how dedicated I am, how expensive my cell phone is, or how much work I put into the project. Science wants empirical proof. My failure to find moons does not refute their existence. A claim that something does NOT exist cannot be correctly made by simply showing that the P value is greater than .05. A statistically nonsignificant P value might also mean that my experiment, despite all my time, effort, commitment, and data collection, is simply inadequate to detect the phenomenon. My cell phone has enough pixels to see Jupiter but not its moons. The phone isn’t powerful enough. My claim is a type II error.

Proving zero difference with statistics is impossible. Instead, you need to specify the threshold size of a clinically important effect and then show that your methods and results were powerful enough to have detected something that small. Only then may you correctly publish a conclusion that there is nothing there, a donut hole in the black void of space.
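To make the idea of a power calculation concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method taken from any particular study: it assumes two equal-sized groups, a comparison of means, a two-sided test at alpha = .05, and a normal approximation. The function names and the use of a standardized effect size (difference in means divided by the common standard deviation) are choices made for this example.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def power_two_groups(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in means divided by the common
    standard deviation (an assumption of this sketch).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic lands in either rejection tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def smallest_detectable_effect(n_per_group, power=0.80, alpha=0.05):
    """Smallest standardized effect the study would detect with the
    requested power (ignores the negligible opposite rejection tail)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_crit + z_power) * sqrt(2 / n_per_group)


if __name__ == "__main__":
    # 15 and 25 per group correspond to studies of 30 to 50 patients in total.
    for n in (15, 25, 64):
        print(
            f"n per group = {n}: "
            f"power for effect 0.5 = {power_two_groups(0.5, n):.2f}, "
            f"smallest detectable effect = {smallest_detectable_effect(n):.2f}"
        )
```

Under these assumptions, a study of 30 to 50 patients split into two arms has well under a 50% chance of detecting a moderate standardized effect of 0.5, and it takes roughly 64 patients per arm to reach the conventional 80% power. A negative result from the smaller study says little about whether an effect of that size exists.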

I invite you to do your own survey. As you read journal articles, identify the next 10 times you read a conclusion that claims no effect was found. Scour that article carefully for any indication of the size of effect that those methods and results would have been able to detect. Look for a power calculation. Grade the article with a simple pass/fail on that point. Did the authors provide that information in a way you can understand, or do you just have to trust them? Take President Reagan’s advice: “Trust, but verify.” I predict that most of the 10 articles will lack the calculation, and that many of their negative claims are type II errors.

Dr. Powell is a pediatric hospitalist and clinical ethics consultant living in St. Louis. Email him at