Vertebroplasty, cognitive dissonance, and evidence-based medicine: What do we do when the ‘evidence’ says we are wrong?

In the study by Buchbinder et al, 2 the real treatment had no benefit in any primary or secondary end point. This study did not allow crossovers.

In the study by Kallmes et al, 1 a greater proportion of patients who received the real treatment reported clinically meaningful improvement in pain (a secondary end point), but the difference fell just short of statistical significance (64% vs 48%, P = .06). In this trial, patients were allowed to cross over to the other study group after 1 month, and significantly more patients crossed over from the sham surgery group to the active treatment group than the other way around (43% vs 12%, P < .001).

My first instinct was to pick through the papers for flaws that would invalidate the results, and there were some problems. Both studies were initially planned to include more patients and therefore to have greater statistical power, but the enrollment targets were reassessed because of slow enrollment. In the study by Kallmes et al, 1 the difference in clinically meaningful improvement might have reached statistical significance if the trial had been larger. The study by Buchbinder et al 2 was a multicenter trial, but one center accounted for 53 (69%) of the 78 patients. Could this have biased the results?

The surgeon in me also seized for a while on the idea that, since all of the interventions in both studies were done by interventional radiologists, the problem may have been in patient selection, and that radiologists are not as astute as we are. However, even a surgeon’s ego cannot support this interpretation.

As I looked in more detail at the response I had written to these trials, I realized these criticisms were hardly fatal flaws, and the fact that two separate well-designed studies reached the same conclusion enhances their validity.

One concern that does bear some scrutiny is that the trials were too small to identify subgroups that may benefit from the procedure. In my experience, vertebral augmentation seems to have better results with certain types of fractures. Patients with a mobile pseudarthrotic cleft pattern of fracture seem to do much better than those with the more common nonmobile fracture.


Many commentaries on these two trials have discussed a famous study of a different procedure for a different condition. In this study, Moseley et al 26 evaluated the use of arthroscopy to treat osteoarthritis of the knee and found that sham arthroscopy was as effective as real arthroscopy and that both were better than conventional treatment.

I was not long out of my orthopedic residency when this trial was published, and I was very aware of the controversy that preceded it, as I once had to prepare a talk about it for resident rounds. I remember a lively debate in the orthopedic community over the efficacy of the procedure before the results of this trial were released.

In contrast, the vertebral augmentation controversy had become a debate about the relative efficacy and the economics of specific techniques, not about the effectiveness of the entire concept. The mainstream had accepted the validity of the procedure, which was not the case for knee arthroscopy at the time of that trial.

