Something I ran across a few years ago: http://www.buckeyefirearms.org/alternate-look-handgun-stopping-power
This study has appeared in several places including Ellifritz's own blog,
www.activeresponsetraining.net. It generated considerable controversy and met with considerable criticism, some of which I think was excessive and unfair.
The biggest problem with the study is that it doesn't quantify the statistical uncertainty in the results. It's not enough to say that a given caliber produces a one-shot stop X% of the time. If Ellifritz repeated the study with a different set of shootings, he wouldn't get exactly X% the second time. After many repetitions, he would be able to say that it was X% ± Y%. With enough data, Y would be small enough that you could see meaningful differences between calibers. Ellifritz says that he doesn't have enough data to get reliable results for Y. I believe him.
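To make the X% ± Y% idea concrete, here is a minimal Python sketch of the usual normal-approximation margin for a proportion. The counts are made up for illustration and are not taken from Ellifritz's data, and the function name is my own.

```python
import math

def proportion_with_uncertainty(successes, trials, z=1.96):
    """Return (p, margin): the observed proportion and the approximate
    95% half-width from the normal approximation to the binomial."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    std_err = math.sqrt(p * (1.0 - p) / trials)
    return p, z * std_err

# Hypothetical counts, NOT from the study: 60 one-shot stops in 100 shootings.
p, margin = proportion_with_uncertainty(60, 100)
print(f"{100 * p:.0f}% +/- {100 * margin:.1f}%")
```

The normal approximation gets rough when the count is small or the percentage is near 0 or 100, which is exactly the regime of the rarely used calibers, so treat the margin as a ballpark figure.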
The second problem is that the study doesn't separate out several important independent variables. The ones I can think of are bullet design (brand and model, not just weight and expanding or not), where the target was hit (central nervous system, upper torso, peripheral areas) and whether drugs rendered the target insensitive to pain and shock. Even if Ellifritz limited bullet variation to expanding and non-expanding, he would need 2 x 3 x 2 = 12 times as much data (2 bullet types x 3 hit locations x 2 drug states). It took him several years to accumulate 1,800 data points. I wouldn't want to be the one to say, "Come back when you have 21,600 points."
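The 12x figure is just the number of combinations of those three variables. A quick Python sketch of the count follows; the category labels are my own shorthand, and it assumes each combination would need as many points as the whole original study.

```python
from itertools import product

# Shorthand labels for the three variables discussed above.
bullet_types = ["expanding", "non-expanding"]
hit_locations = ["central nervous system", "upper torso", "peripheral"]
drug_states = ["drugs involved", "no drugs"]

# Every combination of the three variables is one cell to be filled with data.
cells = list(product(bullet_types, hit_locations, drug_states))

original_points = 1800  # size of the data set quoted above

# Assumption: each cell needs as many points as the original study had in total.
required_points = original_points * len(cells)
print(len(cells), required_points)
```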
It may be possible to get a hint at the internal consistency of the study by ranking calibers and looking for violations of those rankings. All other factors being equal, a faster bullet should be at least as effective as a slower one, and a bigger one should be at least as effective as a smaller one. With those assumptions, .44 mag >= .357 mag, .357 mag >= .38 special, and 9 mm >= .380 ACP >= .32 >= .25 ACP. (>= means that the caliber on the left is at least as effective as the caliber on the right.) The parameter of most interest to me is failure to incapacitate, since that's the situation in which you most desperately need the bullet to do its job. The violations in the rankings are .44 mag < .357 mag by 4% and .32 < .25 ACP by 5%. Everything else makes sense. Statistical uncertainty in a result is inversely proportional to the square root of the number of samples. Since .44 mag and .32 were used least often, by more than a factor of two, small sample size is a plausible explanation for these two exceptions.
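Here is a rough Python sketch of that square-root argument. The failure rates and sample sizes are invented for illustration, not Ellifritz's numbers, and the helper names are mine. It checks that quadrupling the sample halves the standard error, then asks how big a 4-point gap is compared with the combined uncertainty of a small and a larger sample.

```python
import math

def std_err(p, n):
    """Standard error of an observed proportion p from n samples."""
    return math.sqrt(p * (1.0 - p) / n)

def difference_in_std_errs(p1, n1, p2, n2):
    """Gap between two proportions, in units of their combined standard error."""
    combined = math.sqrt(std_err(p1, n1) ** 2 + std_err(p2, n2) ** 2)
    return abs(p1 - p2) / combined

# Uncertainty scales as 1/sqrt(n): four times the data halves the standard error.
assert math.isclose(std_err(0.1, 400), std_err(0.1, 100) / 2)

# Hypothetical example: 14% failures in 25 shootings vs. 10% in 100 shootings.
ratio = difference_in_std_errs(0.14, 25, 0.10, 100)
print(f"difference is {ratio:.2f} combined standard errors")
```

With numbers like these the gap comes out well under one combined standard error, which is the sense in which a 4% or 5% ranking violation in a rarely used caliber could easily be noise.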
Most stops are psychological. A combination of fear, shock and pain persuades the bad guy to cease his attack. If that's not enough, stopping him requires physical incapacitation through damage to the central nervous system or through blood loss resulting in loss of consciousness. For me, the one thing in Ellifritz's study that stood out, though it may still not be statistically significant, is that failure to incapacitate was noticeably lower with .357 (mag and Sig) than with any of the other handgun calibers. The most popular self-defense calibers, 9 mm, .40 S&W and .45 ACP, were identical on that measure. .380 ACP and .38 special aren't far behind. I wonder if .38 special isn't more like .380 than 9 mm.