Efficacy-effectiveness ratings: feasibility, reliability, and association with treatment effects.

Abstract text
Background: Clinicians and policymakers prioritize understanding intervention effects under highly controlled (efficacy) and “real world” (effectiveness) conditions.

Objectives: To determine the inter-rater reliability of a validated tool for rating trial efficacy-effectiveness (EE) and to evaluate associations between EE ratings and treatment effects.

Methods: As part of a systematic review evaluating noninvasive positive pressure ventilation (NPPV) for adults with acute respiratory failure, investigator pairs independently rated the EE of 69 randomized trials. We adapted a previously validated, 7-item instrument (Table) addressing setting, stringency of eligibility criteria, clinically important health outcomes, intervention flexibility and follow-up duration, assessment of adverse effects, adequate sample size, and intent-to-treat analysis. Each item was scored 0 or 1, giving total scores from 0 to 7. Studies were categorized as efficacy (0-2), mixed (3-5), or effectiveness (6-7). We measured reliability with simple agreement and kappa statistics. Using consensus ratings, we performed subgroup analyses to determine whether treatment effects varied by EE rating.
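The scoring rule and reliability measures described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the kappa shown is the standard unweighted Cohen's kappa for two raters, and it assumes agreement is computed on the three-level category (the abstract does not state whether agreement was assessed on items, totals, or categories).

```python
from collections import Counter


def ee_category(item_scores):
    """Sum seven 0/1 item scores and map the total to an EE category."""
    if len(item_scores) != 7 or any(s not in (0, 1) for s in item_scores):
        raise ValueError("expected seven item scores, each 0 or 1")
    total = sum(item_scores)
    if total <= 2:
        return "efficacy"       # total 0-2
    if total <= 5:
        return "mixed"          # total 3-5
    return "effectiveness"      # total 6-7


def simple_agreement(rater_a, rater_b):
    """Proportion of studies on which the two raters give the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must rate the same non-empty set of studies")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = simple_agreement(rater_a, rater_b)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings for four studies (illustrative only, not study data).
rater_1 = ["efficacy", "mixed", "mixed", "effectiveness"]
rater_2 = ["efficacy", "mixed", "efficacy", "effectiveness"]
print(ee_category([1, 0, 1, 1, 0, 0, 1]))   # total of 4
print(simple_agreement(rater_1, rater_2))
print(cohen_kappa(rater_1, rater_2))
```

Kappa discounts the agreement expected by chance, which is why it is lower than simple agreement whenever the raters' category frequencies overlap.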

Results: Three experienced methodologists trained with 3 trial sets to develop operational definitions for each EE domain. Of the 69 studies, 17 were classified as efficacy, 50 as mixed, and 2 as effectiveness. Simple agreement was 79% and unweighted kappa was 0.60. Pooled odds ratios (ORs) for NPPV effects on mortality varied by EE category: efficacy=0.56 (95% CI, 0.31-1.02), mixed=0.52 (95% CI, 0.41-0.66), and effectiveness=0.99 (95% CI, 0.66-1.49); p=0.02 for between-group differences. Analysis of risk for intubation by EE category yielded a similar pattern of ORs, although between-group differences were not statistically significant: efficacy=0.29 (95% CI, 0.19-0.46), mixed=0.29 (95% CI, 0.21-0.41), and effectiveness=0.58 (95% CI, 0.16-2.13); p=0.61 for between-group differences.
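One common way to test whether pooled estimates differ between subgroups is a between-subgroup Q statistic computed on the log odds ratio scale. The sketch below applies it to the mortality ORs and confidence intervals reported above. It is an approximation under stated assumptions, not the authors' analysis: the abstract does not say which test or meta-analytic model produced the reported p values, standard errors are back-calculated from the rounded 95% CIs, and the function names are hypothetical. The result therefore need not match the reported p value exactly.

```python
import math


def log_or_and_se(or_point, ci_low, ci_high, z=1.96):
    """Convert an OR and its 95% CI to a log OR and an approximate SE."""
    log_or = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_or, se


def q_between(subgroups):
    """Inverse-variance weighted Q statistic across subgroup estimates.

    Returns (Q, degrees of freedom), where df = number of subgroups - 1.
    """
    estimates = [log_or_and_se(*g) for g in subgroups]
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    return q, len(subgroups) - 1


def chi2_sf_df2(q):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-q / 2)


# Mortality ORs by EE category as reported: (OR, CI lower, CI upper).
mortality = [
    (0.56, 0.31, 1.02),  # efficacy
    (0.52, 0.41, 0.66),  # mixed
    (0.99, 0.66, 1.49),  # effectiveness
]
q, df = q_between(mortality)
# chi2_sf_df2 is valid only because three subgroups give df == 2.
print(f"Q = {q:.2f} on {df} df, p = {chi2_sf_df2(q):.3f}")
```

The closed-form tail probability is used only to keep the sketch dependency-free; for any other number of subgroups a general chi-square survival function would be needed.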

Conclusions: EE ratings are feasible and can be made reliably, but raters require calibration practice. In one test set consisting mostly of mixed EE studies, some treatment effects varied by EE rating. We are using this approach with an additional dataset and will present updated results.
Williams Jr JW1, Coeytaux RR2, McCrory DC1, Sanders GD2, Gierisch JM1
1 Duke Evidence-based Practice Center and Durham VAMC, USA
2 Duke Evidence-based Practice Center, USA
Presenting author and contact person
Presenting author: John Williams
Contact person: John Williams, Duke University Medical Center, USA
Date and Location
Oral session C15O4
Wednesday 3 October 2012, 12:00-12:20