4 Evidence for Relative Clinical Effect


This chapter outlines the preferred sources of evidence when estimating relative clinical effect (treatment efficacy and adverse effects) for inclusion in an economic model. It does not cover sources of evidence for estimating baseline risk of disease, health-related quality of life, or resource use.

All appropriate evidence relating to the pharmaceutical(s) and population under assessment should be identified, described and quality-assessed. The level of clinical evidence may vary depending on the level of analysis and time available to systematically review the evidence – for less detailed analyses, more opportunistic data may need to be used and less comprehensive critical appraisal undertaken.

For further details on how relevant clinical inputs are systematically identified and synthesised, please refer to the Guidelines for Funding Applications to PHARMAC, available on the PHARMAC website under 'Make a funding application'.

4.1   Data Sources

Key Recommendations: All appropriate levels of evidence should be identified; however, well-conducted randomised controlled trials (RCTs) and meta-analyses are the preferred data sources when estimating relative treatment effects. In the absence of valid RCTs, evidence from the highest available level of study design should be considered with reference to the limitations of the study design.

4.1.1   Key Data Sources

Key clinical data sources to be used when estimating relative treatment effects include published randomised controlled trials (RCTs), meta-analyses, and observational studies. Other possible sources include unpublished trial data, expert opinion, post-surveillance studies, and case reports (8, 13-15).

Details of the advantages and disadvantages of these data sources, including their recommended use, are outlined in Table 4.

Table 4: Data Sources

Randomised controlled trials (RCTs)
  • Recommended use: All else being equal, published trials are preferred to unpublished trials, as the latter have not been formally peer reviewed. If the use of unpublished trials or abstracts/posters is necessary, these should be subject to the same quality assessment as published studies; if there is insufficient information to assess quality, such data should be used with caution. If published trials are available, data from unpublished trials should only be included as supplementary information, which could include clinical study reports (CSRs) from the pivotal trials.
  • Advantages: External influences are minimised through randomisation, patient selection and double-blinding, so that the observed effect can be attributed to the intervention alone.
  • Disadvantages: Selected patients, investigators and comparator treatments may result in poor external validity. Trials often cover short time spans and may be subject to publication bias.

Meta-analysis [3]
  • Recommended use: Useful when there is more than one key study, or when results conflict between studies. For more detailed analyses, it may be necessary to undertake a meta-analysis if no published meta-analysis is available.
  • Advantages: Pooling addresses the limited statistical power of a single study to detect treatment effects. Useful when results conflict between studies, when inappropriate comparators are used, or when a study consists of only one treatment arm.
  • Disadvantages: Publication and inclusion biases (ie the choice of studies included). Validity may be difficult to assess, and incompatible studies may be included.

Observational studies [4]
  • Recommended use: Used to compare with the results of a clinical trial. Most useful when estimating baseline risk and modelling non-compliance. More than one independent source should be examined in order to gain confidence in the validity of the conclusions.
  • Advantages: High real-world relevance. Allow observation of the effect of a new treatment on compliance and treatment-switching patterns.
  • Disadvantages: Lack of control over confounding factors; underlying biases (selection bias, measurement bias, etc); lack of control groups.

Expert opinion
  • Recommended use: Not recommended as the primary source for assessment of effectiveness. PHARMAC mainly uses expert opinion to review an economic model, in particular any clinical assumptions/extrapolations.
  • Advantages: Can clarify unreliable, conflicting or insufficient clinical information in the literature.
  • Disadvantages: Subject to selection bias.

Case reports
  • Recommended use: Generally not recommended for inclusion in CUAs.
  • Advantages: High real-world relevance.
  • Disadvantages: High risk of bias; small numbers of patients.

Post-surveillance studies
  • Recommended use: May provide useful information on the incidence and descriptions of adverse drug reactions.
  • Advantages: High real-world relevance.
  • Disadvantages: Lack of control groups; underlying biases.
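The pooling that a fixed-effect meta-analysis (Table 4) performs can be sketched with standard inverse-variance weighting. The trial results below are hypothetical, invented purely for illustration:

```python
import math

# Hypothetical trial results: log relative risk and its standard error.
# These numbers are illustrative only, not drawn from any real trials.
trials = [
    {"log_rr": math.log(0.80), "se": 0.10},
    {"log_rr": math.log(0.90), "se": 0.15},
    {"log_rr": math.log(0.70), "se": 0.20},
]

# Fixed-effect (inverse-variance) pooling: weight each trial by 1/SE^2,
# so more precise trials contribute more to the pooled estimate.
weights = [1.0 / t["se"] ** 2 for t in trials]
pooled_log_rr = sum(w * t["log_rr"] for w, t in zip(weights, trials)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

# Back-transform to a pooled relative risk with a 95% confidence interval.
rr = math.exp(pooled_log_rr)
ci_low = math.exp(pooled_log_rr - 1.96 * pooled_se)
ci_high = math.exp(pooled_log_rr + 1.96 * pooled_se)
print(f"Pooled RR = {rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Note that this fixed-effect sketch assumes the trials estimate a common treatment effect; where studies are heterogeneous, a random-effects model and formal heterogeneity assessment would be needed.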

4.2   Obtaining Data

4.2.1   Data Sources

Potentially useful information sources on clinical efficacy and event rates include:

Database searches should be supplemented by scanning references in articles and hand searching key journals.

Information on drug safety and international regulatory authorities can be found at:

It may also be useful to check the reviews of clinical evidence undertaken by international health technology assessment organisations. These include (but are not limited to):

4.2.2   Search Strategy

All evidence should be obtained systematically. Details of the search strategy used to retrieve clinical studies should be described, including the:

  • medium used to conduct the search and who conducted it
  • databases searched
  • when the search was undertaken
  • search strategy, keywords or MeSH headings used.

Published errata, corrections, retractions, editorials, commentaries, and journal correspondence relating to individual trials should be included in the search strategy.

The pre-defined inclusion and exclusion criteria used for selecting relevant studies should be clearly specified. The report should clearly state the reasons for excluding any studies.

4.3   Presentation of Evidence

For key trials, the following details should be included in the report:

(i)      Objective of trial.

(ii)      Study design, including eligibility criteria, sample size, interventions (including dose and treatment duration), methods for randomisation and blinding, duration of follow-up, and outcome measures and methods.

(iii)     Results including number of withdrawals and dropouts; and results for prospectively defined primary outcomes, secondary outcomes and adverse effects for the Intent To Treat (ITT) population.

Further details on analysing clinical trial data are included in section 5.1 (Transformation of Clinical Evidence).

4.4   Assessing Data Quality

Key Recommendation: Trials should be critically appraised using the Graphic Appraisal Tool for Epidemiology (GATE) framework (or other similar frameworks), with consideration given to the internal and external validity of the trials. Grades of evidence should be assigned, and assessment undertaken on the applicability of the trials to the New Zealand health sector. PHARMAC recommends that when high-quality studies are available, these should be the preferred data source when estimating relative treatment effects.

4.4.1   Critical Appraisal of Trials

PHARMAC recommends that clinical trials be critically appraised using the Graphic Appraisal Tool for Epidemiology (GATE) framework (16) or other similar frameworks.

The GATE framework involves the following five steps:

  1. Asking focused questions based on PECOT (Population, Exposure, Comparison, Outcome, Time) and RAMMbo (fair Recruitment, fair Allocation, fair Maintenance, fair Measurement of Outcomes).
  2. Searching the literature for best available evidence.
  3. Appraising the study by ‘hanging’ it on the GATE frame.
  4. Assessing study quality.
  5. Applying the evidence in practice.

Details on the GATE framework, including critical appraisal spreadsheets, are available at: https://www.fmhs.auckland.ac.nz/en/soph/about/our-departments/epidemiology-and-biostatistics/research/epiq/evidence-based-practice-and-cats.html

The following table outlines key factors to consider when critically appraising a clinical trial.

Table 5: Key Factors to Consider in Critical Appraisal of Trials

Internal validity – How reliable are the trial results?

Availability of data
  • Were all available trial data used?
  • Were there quality controls (eg was the trial published in a peer-reviewed journal)?

Number of patients
  • Was the sample size large enough to rule out effects due to chance (ie false negatives and false positives)?
  • Alternatively, was the effect large enough to be statistically significant even with a small sample size?

Method of randomisation, including adequate concealment
  • Was there likely to be any selection bias or confounding?
  • Was appropriate randomisation adequately reported, including how allocation was concealed?
  • Were patients, clinicians and assessors blinded?

Length and completeness of follow-up
  • Were patients followed for an adequate time period?
  • How often were patients assessed?
  • Was the analysis by intention-to-treat (including drop-outs and deaths)?

Selection of endpoints
  • Were the endpoint/outcome measures relevant?

External validity – How relevant are the trial results?

Patient population
  • Was the patient population in the trial similar to those considered for funding?

Comparator
  • Was the comparator consistent with current clinical practice in New Zealand?

Dose, formulation and administration regimen
  • Were these consistent with recommended treatment regimens in New Zealand?
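The sample-size question under "Number of patients" above can be checked against a standard power calculation. A minimal sketch for comparing two event proportions, using the normal approximation (the 20% vs 15% event rates, 5% significance level and 80% power are illustrative assumptions, not recommended values):

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-arm sample size to detect a difference between two
    event proportions (normal approximation, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for target power
    p_bar = (p1 + p2) / 2
    num = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

# Illustrative: detecting a reduction in event rate from 20% to 15%.
n = sample_size_two_proportions(0.20, 0.15)
print(f"Approximately {n} patients per arm")
```

A trial well short of such a figure for its stated effect size may be underpowered, which is one reason a non-significant result cannot simply be read as evidence of no effect.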

The quality of studies tends to vary between therapeutic groups. For example, for cardiovascular drugs, a large number of RCTs are often undertaken involving large numbers of patients. However, for mental health drugs, in some cases it is more difficult to conduct good-quality RCTs due to poorer compliance rates and difficulties with recruitment. PHARMAC therefore recommends that the quality of the clinical evidence should be assessed relative to the ability to conduct good-quality RCTs within the therapeutic group. This recommendation will reduce biases against pharmaceuticals where it may be difficult to conduct high-quality RCTs.

It is also recommended that poor-quality data be explicitly highlighted, especially for therapeutic groups where high-quality, double-blinded trials are able to (and should) be conducted.

4.4.2   Grading the Evidence

Assigning levels of evidence to studies is useful for determining the weighting that should be placed on the results of an analysis when making a decision. Although the final scores are only guides, if a study rates poorly it is likely that the study is subject to significant biases; hence caution should be taken when interpreting the results.

There are many different methods of assigning levels of evidence, and there has been considerable debate about which method is best.

A commonly used checklist is that developed by the Scottish Intercollegiate Guidelines Network (SIGN), outlined below:

Table 6: SIGN Checklist

Level 1++: High-quality meta-analyses, systematic reviews of RCTs, or RCTs with a very low risk of bias.
Level 1+: Well-conducted meta-analyses, systematic reviews, or RCTs with a low risk of bias.
Level 1-: Meta-analyses, systematic reviews, or RCTs with a high risk of bias.
Level 2++: High-quality systematic reviews of case-control or cohort studies, or high-quality case-control or cohort studies with a very low risk of confounding, bias or chance and a high probability that the relationship is causal.
Level 2+: Well-conducted case-control or cohort studies with a low risk of confounding or bias and a moderate probability that the relationship is causal.
Level 2-: Case-control or cohort studies with a high risk of confounding or bias and a significant risk that the relationship is not causal.
Level 3: Non-analytic uncontrolled observational studies (cross-sectional studies, prospective longitudinal follow-up studies, retrospective follow-up case series, case reports).
Level 4: Expert opinion and/or modelling in the absence of empirical data.

PHARMAC recommends that where well-conducted RCTs, systematic reviews and meta-analyses are available (ie grade of evidence 1+ or 1++), these should be the preferred data source when estimating relative treatment effects. In such cases, studies with a grade of evidence below 1+ should not be used to estimate relative treatment effects; they should, however, be included in the evidence tables of the report for discussion.

In cases where the clinical evidence on relative treatment effect is limited to RCTs with a high risk of bias (ie grade of evidence of 1-), good-quality observational studies (cohort studies and case-control studies) should also be considered. PHARMAC acknowledges that in some cases it may be necessary to use lower levels of evidence if this is all that is available. For example, trials on vaccines and medical devices may be of insufficient duration to evaluate long-term efficacy, and may only report intermediate endpoints. As lower-quality evidence increases the level of uncertainty in the analysis, conservative assumptions should be applied and extensive sensitivity analysis undertaken. See Chapter 5 for further details.
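One simple form of the sensitivity analysis recommended above is to re-run a model calculation at each end of a treatment effect's 95% confidence interval. A hypothetical sketch, with all inputs invented for illustration:

```python
# Hypothetical inputs (illustrative only): baseline annual event risk,
# treated cohort size, and a pooled relative risk with its 95% CI.
baseline_risk = 0.10
cohort_size = 1000
rr_point, rr_low, rr_high = 0.80, 0.65, 0.98

def events_averted(rr):
    """Events avoided per year when treatment scales baseline risk by rr."""
    return cohort_size * baseline_risk * (1 - rr)

# One-way sensitivity analysis: repeat the calculation at the point
# estimate and at each end of the confidence interval to show how much
# the model output depends on the uncertain treatment effect.
for label, rr in [("point", rr_point), ("CI low", rr_low), ("CI high", rr_high)]:
    print(f"{label}: RR = {rr:.2f} -> {events_averted(rr):.0f} events averted")
```

Where the model conclusion changes across this range (as it plausibly would here, with averted events varying roughly tenfold), the uncertainty should be highlighted and conservative values carried into the base case.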

The SIGN checklist relates to the internal validity of a study and is used for assessing the quality of evidence and the risk of study bias. However, in assessing the effectiveness of a pharmaceutical, real-world relevance and clinical practice are also important: the patient population and treatment regimen used in the trial should be consistent with how the treatment will be used in New Zealand clinical practice.

4.4.3   Application of evidence to the New Zealand context

The following questions should be considered when assessing the applicability of the studies to the New Zealand health sector:

  1. Are there any known biological factors that may alter the effect of the pharmaceutical?
  2. What effects does the time of taking the pharmaceutical have?
  3. What effects do variations in the nature and severity of the disease have?
  4. Does the effectiveness of the pharmaceutical depend on the way it is administered and/or by whom (eg by a nurse rather than by the patient)?
  5. Is the giving or taking of the pharmaceutical part of a complex procedure with many components?
  6. Is any infrastructure required/available, such as monitoring with regular blood tests?
  7. Are there any other factors that may affect transferability of study results to the New Zealand setting?

