Oklahoma State University
Center for Health Sciences
Medical Physiology - Evidence-Based Medicine

Evaluating the Validity of a Diagnostic Study

1. Was there an independent, blind comparison with a reference ("gold") standard?

Patients in the study should have undergone both the diagnostic test in question and the reference or "gold" standard.

The "gold" standard refers to the commonly accepted "proof" that they do or do not have the target disorder; the "gold" standard might be an autopsy or biopsy. The "gold" standard provides objective criteria (e.g., laboratory test not requiring interpretation) OR a current clinical standard (e.g., a venogram for deep venous thrombosis) for diagnosis. Sometimes there may not be a widely accepted "gold" standard; the author will then need to clearly justify their selection of the reference test.

Clinicians evaluating the tests should be blinded. The results of one test should not be known to those who are conducting or evaluating the other test.

2. Did the patient sample include an appropriate spectrum of patients to whom the diagnostic test will be applied in clinical practice?

For the information to be truly useful, the test should be applied to a broad spectrum of patients: those with mild and severe cases, as well as early and late cases, and with patients both treated and untreated for the target disease. The test should also be applied to patients with disorders that are commonly confused with the target disease.

3. Did the results of the test being evaluated influence the decision to perform the reference standard?

Researchers should conduct both tests regardless of the results of the test in question. In particular, they should not forgo the "gold" standard test when the outcome of the test in question is negative.

4. Were the methods for performing the test described in sufficient detail to permit replication?

The methodology for conducting the test should be presented in enough detail so that it can be conducted again within the appropriate setting. This may include dosage levels, patient preparations, timing, etc.

Key issues for Diagnostic Studies:

  • blinding 

  • identified gold standard test 

  • patient sample

  • each patient gets both tests


Are the results valid?

Likelihood ratios (LRs) compare the likelihood that a given test result would be expected in a patient with the target disorder to the likelihood that the same result would be expected in a patient without that disorder.




              Disease present    Disease absent
Test +               a                  b
Test -               c                  d


  • LR+ = [a / (a + c)] / [b / (b + d)], i.e., sensitivity / (1 - specificity)

  • LR- = [c / (a + c)] / [d / (b + d)], i.e., (1 - sensitivity) / specificity

  • Sensitivity measures the proportion of patients with the disease who also test positive for the disease. [ a / (a + c)]

  • Specificity measures the proportion of patients without the disease who also test negative for the disease. [d / (b + d)]

  • A good test is both highly sensitive and highly specific.
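As a rough illustration of the formulas above, the following Python sketch computes sensitivity, specificity, LR+ and LR- from the four cells of the table. The function name `diagnostic_measures` and the example counts are hypothetical and are not taken from any study cited here.

```python
def diagnostic_measures(a, b, c, d):
    """Compute test characteristics from a 2x2 table.

    a = test positive, disease present (true positives)
    b = test positive, disease absent  (false positives)
    c = test negative, disease present (false negatives)
    d = test negative, disease absent  (true negatives)
    """
    if (a + c) == 0 or (b + d) == 0:
        raise ValueError("need at least one patient with and one without the disease")

    sensitivity = a / (a + c)
    specificity = d / (b + d)
    false_positive_rate = b / (b + d)   # equals 1 - specificity
    false_negative_rate = c / (a + c)   # equals 1 - sensitivity

    # The ratios are undefined when their denominators are zero.
    lr_pos = sensitivity / false_positive_rate if false_positive_rate > 0 else float("inf")
    lr_neg = false_negative_rate / specificity if specificity > 0 else float("inf")

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
measures = diagnostic_measures(a=90, b=20, c=10, d=80)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the test is 90% sensitive and 80% specific, which gives an LR+ of about 4.5 and an LR- of about 0.125.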

Pretest Probabilities are estimated from published studies of prevalence, data from your practice setting, and your clinical intuition.
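A pretest probability and a likelihood ratio are usually combined through odds: post-test odds = pretest odds x LR. This conversion is standard but is not spelled out on this page, so the sketch below is an assumption about how the two quantities would be used together. The function name `post_test_probability` and the example values are hypothetical.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability using an LR.

    Steps (standard odds form of Bayes' theorem):
      1. probability -> odds
      2. post-test odds = pretest odds * LR
      3. odds -> probability
    """
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")

    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical example: 25% pretest probability and a positive result with LR+ = 4.5.
print(round(post_test_probability(0.25, 4.5), 3))
```

In this made-up example a pretest probability of 25% rises to about 60% after a positive result.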

How much do LRs change disease likelihood?

LRs greater than 10 or less than 0.1      cause large changes
LRs 5 - 10 or 0.1 - 0.2                   cause moderate changes
LRs 2 - 5 or 0.2 - 0.5                    cause small changes
LRs 1 - 2 or 0.5 - 1                      cause tiny changes
LR = 1.0                                  causes no change at all
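The rules of thumb above can be written as a small lookup, sketched here in Python. The function name `lr_effect_size` is hypothetical. Because the published ranges share their endpoints (5 and 0.2, 2 and 0.5), this sketch assumes that a boundary value belongs to the stronger category.

```python
def lr_effect_size(lr):
    """Classify how much a likelihood ratio shifts disease likelihood.

    Categories follow the rules of thumb in the text. Boundary values
    (e.g. exactly 5 or exactly 0.2) are assigned to the stronger
    category, which is an assumption of this sketch.
    """
    if lr < 0:
        raise ValueError("likelihood ratio cannot be negative")
    if lr > 10 or lr < 0.1:
        return "large"
    if lr >= 5 or lr <= 0.2:
        return "moderate"
    if lr >= 2 or lr <= 0.5:
        return "small"
    if lr == 1:
        return "none"
    return "tiny"


for value in (12, 7, 3, 1.5, 1.0, 0.8, 0.3, 0.15, 0.05):
    print(value, lr_effect_size(value))
```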

5. Aside from the experimental intervention, were the groups treated equally?

Both groups must be treated the same except for administration of the experimental treatment. If "cointerventions" (interventions other than the study treatment that are applied differently to the two groups) exist, they must be described in the methods section of the study.

Article: It appears that both groups were treated the same. There are no reported differences in co-interventions, follow-up or outcome measures.

6. Are the results of this study valid?

Article: This study methodology appears to be sound and the results are valid.

7. What are the results of the study?

Main results: Analysis was by intention to treat. Mortality did not differ between groups: 1181 deaths occurred in the digoxin group compared with 1194 deaths in the placebo group (34.8% vs. 35.1%, P=0.8). The digoxin group, however, had lower rates of hospitalization overall compared with the placebo group (64.3% vs. 67.1%, P=0.006).

Conclusions: Digoxin did not affect mortality but reduced hospitalizations in patients with heart failure and normal sinus rhythm.











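[Table: study outcomes for the digoxin and placebo groups. Surviving row labels: total hospitalization, hospitalization for CHF, hospitalization for CV causes; one entry is marked nonsignificant (p=0.08).]
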

Reprinted with permission from the American College of Physicians. (ACP Journal Club 1997 Sep/Oct;127(2):34)

Key issues for studies of Therapy:

  • randomization 

  • follow-up (80% or better) 

  • blinding (the more blinding the better) 

  • baseline similarities (established at the start of the trial)


Key terminology for estimating the size of the treatment effect



                Outcome +    Outcome -    Risk of outcome
Treated (Y)         a            b        Y = a / (a + b)
Control (X)         c            d        X = c / (c + d)

  • Relative Risk (RR) is the risk of the outcome in the treated group (Y) compared to the risk in the control group (X).
    RR = Y / X

  • Relative Risk Reduction (RRR) is the percent reduction in risk in the treated group (Y) compared to the control group (X).
    RRR = (1 - Y / X) x 100%

  • Absolute Risk Reduction (ARR) is the difference in risk between the control group (X) and the treatment group (Y).
    ARR = X - Y

  • Number Needed to Treat (NNT) is the number of patients that must be treated over a given period of time to prevent one adverse outcome.
    NNT = 1 / (X - Y)
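The four measures can be computed together, as in the Python sketch below. The function name `treatment_effect` is hypothetical. For illustration only, the example plugs in the overall hospitalization rates quoted earlier for the digoxin study (64.3% treated, 67.1% placebo) as Y and X; treating those percentages as the risks is an assumption of this sketch.

```python
def treatment_effect(y, x):
    """Estimate the size of a treatment effect.

    y = risk of the outcome in the treated group, Y = a / (a + b)
    x = risk of the outcome in the control group, X = c / (c + d)
    """
    if not (0 <= y <= 1 and 0 < x <= 1):
        raise ValueError("risks must be proportions, and control risk must be above 0")

    rr = y / x                    # Relative Risk
    rrr_percent = (1 - rr) * 100  # Relative Risk Reduction, in percent
    arr = x - y                   # Absolute Risk Reduction
    # NNT is undefined when the risks are equal; a negative value means
    # the outcome was more common in the treated group.
    nnt = 1 / arr if arr != 0 else float("inf")

    return {"rr": rr, "rrr_percent": rrr_percent, "arr": arr, "nnt": nnt}


# Overall hospitalization rates from the digoxin example in the text.
effect = treatment_effect(y=0.643, x=0.671)
for name, value in effect.items():
    print(f"{name}: {value:.3f}")
```

With these rates the absolute risk reduction is about 2.8 percentage points, which corresponds to an NNT of roughly 36 once rounded up to a whole patient.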

Source: Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA 1994 Feb 2;271(5):389-91.

Note: For criteria for other types of studies, see the following supplements: Therapy | Prognosis | Etiology/Harm

We now go to the next section, Knowledge Test: Case #1