Simplifying the language of evidence to improve patient care

The Journal of Family Practice. 2004 February;53(2):111-120

Strength of Recommendation Taxonomy (SORT): A patient-centered approach to grading evidence in the medical literature

Research evidence. This evidence is presented in publications of original research, involving collection of original data or the systematic review of other original research publications. It does not include editorials, opinion pieces, or review articles (other than systematic reviews or meta-analyses).

Review article. A nonsystematic overview of a topic is a review article. In most cases, it is not based on an exhaustive, structured review of the literature and does not evaluate the quality of included studies systematically.

Systematic reviews and meta-analyses. A systematic review is a critical assessment of existing evidence that addresses a focused clinical question, includes a comprehensive literature search, appraises the quality of studies, and reports results in a systematic manner. If the studies report comparable quantitative data and have a low degree of variation in their findings, a meta-analysis can be performed to derive a summary estimate of effect.

Most strength-of-evidence scales lack key elements

In March 2002, the Agency for Healthcare Research and Quality (AHRQ) published a report that summarized the state of the art in methods of rating the strength of evidence.5 The report identified a large number of systems for rating the quality of individual studies: 20 for systematic reviews, 49 for randomized controlled trials, 19 for observational studies, and 18 for diagnostic test studies. It also identified 40 scales that graded the strength of a body of evidence consisting of 1 or more studies.

The authors of the AHRQ report proposed that any system for grading the strength of evidence should consider 3 key elements: quality, quantity, and consistency. Quality is the extent to which the identified studies minimize the opportunity for bias and is synonymous with the concept of validity. Quantity is the number of studies and subjects included in those studies. Consistency is the extent to which findings are similar between different studies on the same topic. Only 7 of the 40 systems identified and addressed all 3 elements.6-11

Strength of Recommendation Taxonomy (SORT) contains the key elements

The authors of this article represent the major family medicine journals in the United States and a large family practice academic consortium. Our process began with a series of electronic mail exchanges, was developed during a meeting of the editors, and continued through another series of electronic mail exchanges.

We decided our taxonomy for rating the strength of a recommendation should address the 3 key elements identified in the AHRQ report: quality, quantity, and consistency of evidence. We also were committed to creating a grading scale that could be applied by authors with varying degrees of expertise in evidence-based medicine and clinical epidemiology, and interpreted by physicians with little or no formal training in these areas. We believed that the taxonomy should address the issue of patient-oriented evidence versus disease-oriented evidence explicitly and be consistent with the information mastery framework proposed by Slawson and Shaughnessy.2

After considering these criteria and reviewing the existing taxonomies for grading the strength of a recommendation, we decided that a new taxonomy was needed to reflect the needs of our specialty. Existing grading scales were focused on a particular kind of study (ie, prevention or treatment), were too complex, or did not take into account the type of outcome.

Our proposed taxonomy is called the Strength of Recommendation Taxonomy (SORT), and it is shown in Table 1. The taxonomy includes ratings of A, B, or C for the strength of recommendation for a body of evidence. The taxonomy also explains whether a body of evidence represents good-quality or limited-quality evidence, and whether evidence is consistent or inconsistent. The quality of individual studies is rated 1, 2, or 3; numbers are used to distinguish ratings of individual studies from the letters A, B, and C used to evaluate the strength of a recommendation based on a body of evidence. Figure 1 provides information about how to determine the strength of recommendation for management recommendations, and Figure 2 explains how to determine the level of evidence for an individual study. These 2 algorithms should be helpful to authors preparing papers for submission to family medicine journals. The algorithms are to be considered general guidelines, and special circumstances may dictate assignment of a different strength of recommendation (eg, a single, large, well-designed study in a diverse population may warrant an A-level recommendation).

Recommendations based only on improvements in surrogate or disease-oriented outcomes are always categorized as level C, because improvements in disease-oriented outcomes are not always associated with improvements in patient-oriented outcomes, as exemplified by several well-known findings from the medical literature. For example, doxazosin lowers blood pressure in African American patients, a seemingly beneficial outcome, but it also increases mortality.12 Similarly, encainide and flecainide reduce the incidence of arrhythmias after acute myocardial infarction, but they also increase mortality.13 Finasteride improves urinary flow rates, but it does not significantly improve urinary tract symptoms in patients with benign prostatic hypertrophy,14 while arthroscopic surgery for osteoarthritis of the knee improves the appearance of cartilage but does not reduce pain or improve joint function.15 Additional examples of clinical situations where disease-oriented evidence disagrees with patient-oriented evidence are shown in Table 2.12-24 Examples of how to apply the taxonomy are given in Table 3.
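
For readers who find it easier to follow the grading logic as a procedure, the rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the algorithm in Figure 1: the names `Outcome`, `BodyOfEvidence`, and `strength_of_recommendation` are hypothetical, and the mapping of consistent, good-quality patient-oriented evidence to A and of inconsistent or limited-quality patient-oriented evidence to B is an assumption about Table 1, which is not reproduced here. The sketch also ignores the special circumstances that may justify a different rating.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """Type of outcome the evidence measures (names are illustrative)."""
    PATIENT_ORIENTED = "patient-oriented"   # e.g. mortality, symptoms, function
    DISEASE_ORIENTED = "disease-oriented"   # e.g. blood pressure, urinary flow rate


@dataclass(frozen=True)
class BodyOfEvidence:
    """Hypothetical summary of the evidence behind one recommendation."""
    outcome: Outcome
    good_quality: bool   # False means limited-quality evidence
    consistent: bool     # False means findings differ between studies


def strength_of_recommendation(evidence: BodyOfEvidence) -> str:
    """Return an assumed SORT letter grade for a body of evidence."""
    # Stated in the text: recommendations resting only on surrogate or
    # disease-oriented outcomes are always level C.
    if evidence.outcome is Outcome.DISEASE_ORIENTED:
        return "C"
    # Assumption about Table 1: consistent, good-quality
    # patient-oriented evidence earns an A.
    if evidence.good_quality and evidence.consistent:
        return "A"
    # Assumption about Table 1: inconsistent or limited-quality
    # patient-oriented evidence earns a B.
    return "B"


# Blood pressure lowering is a disease-oriented outcome, so even
# consistent, good-quality trials of that outcome alone yield a C.
blood_pressure_only = BodyOfEvidence(Outcome.DISEASE_ORIENTED, True, True)
print(strength_of_recommendation(blood_pressure_only))  # C
```

Checking the outcome type first mirrors the rule stated in the text: study quality and consistency cannot raise a recommendation above C when only disease-oriented outcomes have been measured.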