Analysis of outcomes
Pain. Twelve ESs from 7 studies16,18-20,22,23,25 were computed for pain, ranging from -.501 to +.794. Because of borderline heterogeneity of the results for SAMe versus placebo (Q = 5.41; P = .067), a more conservative random effects model was used to compute the mean ES of .223 (P = .352; 95% CI, -.247 to .693). Homogeneity was present for SAMe versus NSAIDs (Q = 9.31; P = .317), and on the basis of a fixed effects model, the weighted mean ES was .122 (P = .057; 95% CI, -.029 to .273). Among the studies of SAMe versus NSAIDs, ES was not related to study quality (P = .32), length of intervention (P = .31), or dosage of SAMe (P = .97). Finally, there was no evidence of publication bias according to the funnel plot (Figure W1)* or the rank order correlation (P = .297) for studies of SAMe versus NSAIDs.
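The choice between fixed and random effects pooling described in this paragraph can be sketched in a few lines. This is a minimal illustration, assuming inverse-variance weights and the DerSimonian-Laird estimator of between-study variance; the source does not state which random effects estimator was used. The function names and the input values are hypothetical and are not data from the included studies.

```python
import math


def pool_effect_sizes(effect_sizes, variances):
    """Pool study effect sizes under fixed and random effects models.

    Assumption (not stated in the source): inverse-variance weights and
    the DerSimonian-Laird estimate of between-study variance (tau^2).
    """
    k = len(effect_sizes)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed effects: weighted mean and its standard error.
    fixed_mean = sum(w * es for w, es in zip(weights, effect_sizes)) / sum_w
    fixed_se = math.sqrt(1.0 / sum_w)

    # Cochran's Q: weighted squared deviations from the fixed effects mean.
    # Under homogeneity, Q follows a chi-square distribution with k - 1 df.
    q = sum(w * (es - fixed_mean) ** 2 for w, es in zip(weights, effect_sizes))
    df = k - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects: add tau^2 to each study's variance before weighting.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    random_mean = sum(w * es for w, es in zip(re_weights, effect_sizes)) / sum_re
    random_se = math.sqrt(1.0 / sum_re)

    return {
        "fixed_mean": fixed_mean, "fixed_se": fixed_se,
        "q": q, "df": df, "tau2": tau2,
        "random_mean": random_mean, "random_se": random_se,
    }


def ci95(mean, se):
    """Normal-approximation 95% confidence interval."""
    return (mean - 1.96 * se, mean + 1.96 * se)


# Hypothetical inputs: two studies that disagree strongly.
hetero = pool_effect_sizes([0.0, 1.0], [0.01, 0.01])
# Hypothetical inputs: three studies that agree closely.
homog = pool_effect_sizes([0.1, 0.3, 0.2], [0.04, 0.04, 0.04])
print(hetero["q"], hetero["tau2"], ci95(hetero["random_mean"], hetero["random_se"]))
print(homog["q"], homog["tau2"], ci95(homog["fixed_mean"], homog["fixed_se"]))
```

When the studies agree, Q stays below its degrees of freedom, tau^2 is truncated to zero, and the two models coincide. When they disagree, the random effects interval widens, which is why the random effects model is the more conservative choice.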
Functional limitation. Six studies17-20,24,26 contributed 10 ESs for functional limitation. The length of the intervention phase was 28 to 42 days for all 6 studies. Only one study19 compared SAMe with placebo (ES = .309; P = .002; 95% CI, .098 to .519). Among the studies comparing SAMe with NSAIDs, there was homogeneity of results (Q = 2.53; P = .96) with a weighted mean ES of .025 (95% CI, -.127 to .176), indicating no difference between SAMe and NSAIDs with respect to functional limitation. There was no relationship of ES to study quality (P = .30), length of treatment (P = .71), or dosage of SAMe (P = .48). Both the funnel plot (Figure W2)** and the rank correlation of standardized ES and variance (P = .097) suggested no evidence of publication bias with respect to the functional limitation outcome for SAMe versus NSAIDs.
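The rank correlation of standardized ES and variance mentioned here can be illustrated as follows. This sketch assumes Kendall's tau, as in the rank correlation test of Begg and Mazumdar; the source does not name the coefficient. It standardizes each ES against the fixed effects pooled estimate. The function name and the input values are hypothetical.

```python
import math


def rank_correlation_bias(effect_sizes, variances):
    """Kendall's tau between standardized effect sizes and their variances.

    Assumption (not stated in the source): the coefficient is Kendall's
    tau. Ties are counted as neither concordant nor discordant (tau-a).
    Requires at least 2 studies.
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * es for w, es in zip(weights, effect_sizes)) / sum_w

    # Variance of (ES_i - pooled ES) is v_i - 1 / sum(weights).
    standardized = [
        (es - pooled) / math.sqrt(v - 1.0 / sum_w)
        for es, v in zip(effect_sizes, variances)
    ]

    n = len(standardized)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (standardized[i] - standardized[j]) * (variances[i] - variances[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical inputs: larger effects in the less precise studies,
# the pattern that an asymmetric funnel plot would show.
tau_asymmetric = rank_correlation_bias([0.1, 0.2, 0.3, 0.4], [0.01, 0.02, 0.03, 0.04])
# Hypothetical inputs with the opposite pattern.
tau_reversed = rank_correlation_bias([0.4, 0.3, 0.2, 0.1], [0.01, 0.02, 0.03, 0.04])
print(tau_asymmetric, tau_reversed)
```

A tau near zero, with a nonsignificant P value, is what "no evidence of publication bias" refers to. A strong correlation in either direction would indicate that effect size depends on study precision.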
Adverse effects. Two studies16,19 reported adverse effects when comparing SAMe with placebo. Results were homogeneous (Q = 2.035; P = .362), with a pooled OR of 1.37 (95% CI, .81 to 2.32). Among the studies comparing SAMe with NSAIDs, results also were homogeneous (Q = 4.41; P = .622), with a pooled OR of .424 (95% CI, .294 to .611). That is, the odds of experiencing side effects were 58% lower among those treated with SAMe than among those treated with NSAIDs. This effect was not related to quality of study (P = .409), length of treatment (P = .367), or dosage of SAMe (P = .341).
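The 58% figure follows directly from the pooled OR of .424, and the reported interval can be checked for consistency on the log scale. The short sketch below reproduces that arithmetic from the values reported above, assuming the interval was built as a normal approximation on the log odds ratio; the helper names are hypothetical.

```python
import math


def percent_reduction_in_odds(odds_ratio):
    """Percentage by which the odds are lower in the treated group."""
    return (1.0 - odds_ratio) * 100.0


def log_or_standard_error(ci_lower, ci_upper, z=1.96):
    """Standard error of ln(OR) implied by a 95% CI (assumed log-normal)."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z)


def geometric_midpoint(ci_lower, ci_upper):
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.exp((math.log(ci_lower) + math.log(ci_upper)) / 2.0)


# Values reported in the text for SAMe versus NSAIDs.
pooled_or, lower, upper = 0.424, 0.294, 0.611

reduction = percent_reduction_in_odds(pooled_or)
se_log_or = log_or_standard_error(lower, upper)
midpoint = geometric_midpoint(lower, upper)
print(round(reduction), round(se_log_or, 3), round(midpoint, 3))
```

The geometric midpoint of the interval matches the reported OR to within rounding, which is consistent with an interval computed on the log odds scale.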
As an additional indication of tolerability, we compared the overall dropout rates due to side effects. The dropout rate was highest (6.9%) among those treated with NSAIDs, followed by those receiving placebo (5.0%), and lowest (2.6%) among SAMe users. The only significant difference was between those treated with SAMe and those treated with NSAIDs (P = .001).
Results of this meta-analysis indicate that SAMe has an effect comparable to that of NSAIDs in reducing pain and functional limitation. In addition, patients were significantly less likely to report adverse effects with SAMe than with NSAIDs. When SAMe is compared with placebo, however, there is no differential effect on pain according to 2 studies, although there is a small improvement in functional limitation according to one study. This improvement corresponds to a 15% decrease in functional limitation in the SAMe group as compared with placebo. The likelihood of adverse effects was similar in the 2 groups. Given the combined sample sizes in this meta-analysis, there was more than 90% power to detect a moderate difference between groups at a .05 level of significance.
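The text does not say how the 15% figure was derived from the ES of .309. One conversion that gives a value close to it, offered here only as an assumption, is to convert the standardized mean difference d to a correlation r (for equal group sizes) and read r as a difference in rates, in the style of a binomial effect size display. The function name is hypothetical.

```python
import math


def d_to_r(d):
    """Convert a standardized mean difference to a correlation.

    Uses r = d / sqrt(d^2 + 4), which assumes equal group sizes.
    Whether the authors used this conversion is an assumption.
    """
    return d / math.sqrt(d ** 2 + 4.0)


# ES reported in the text for SAMe versus placebo on functional limitation.
r = d_to_r(0.309)
print(round(r, 3), round(r * 100))
```

Under this reading, an ES of .309 corresponds to r of about .15, that is, a difference of about 15 percentage points between groups.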
Several reporting issues were noted during the extraction of study data. Some researchers did not adequately describe study dropouts and how they were handled. Sample characteristics may have been reported for the initial sample, but there was no mention of the characteristics of the final sample, so that bias in subject loss could not be assessed in any studies that did not use intention-to-treat analysis. Some authors reported intervention results on the basis of the location of the OA, but only reported characteristics (age, sex, duration of disease) for the full sample. This precluded examining the relationship of intervention effect size to demographic characteristics. Finally, because not all authors provided complete descriptive statistics, we based the computation of the ES for one study on post-test scores only, rather than on the change from baseline, a strategy that could underestimate the ES. This potential underestimation occurred in a study with one of the larger sample sizes that, in turn, would carry more weight in the analysis.