

Clinical trials and tribulations—lessons from pulmonary fibrosis

A.L. Olson, J.J. Swigris, K.K. Brown
DOI: http://dx.doi.org/10.1093/qjmed/hcs066 1043-1047 First published online: 29 May 2012


Idiopathic pulmonary fibrosis (IPF) is a dreadful disease that lacks adequate therapy. A number of treatment trials have been performed and have utilized a variety of primary efficacy endpoints. Endpoints that provide the most useful efficacy information are clinical endpoints that are directly related to how a patient feels, functions or survives. Unfortunately, there are no properly established patient-reported outcome measures or measures of functional status in IPF, making survival the most robust primary efficacy endpoint. Clinically meaningful events such as hospitalization can also provide important efficacy information. The use of non-validated surrogate endpoints as primary outcome measures often leads to uncertainty when interpreting trial results.







Idiopathic pulmonary fibrosis (IPF) is a dreadful disease characterized by progressive impairment in quality of life, increasingly limited physical function and an early death from respiratory failure. With this to look forward to, it is no surprise that patients and their physicians are desperate for treatment options that might change the outcome. However, convincing evidence of clinically meaningful benefit requires a well-designed treatment trial, a process that requires considerable time and enormous resources. As with any deadly disease, the combination of desperation, the costly and time-consuming process necessary to confirm clinical benefit and limited resources has led to significant challenges in the development, performance and interpretation of results of IPF treatment trials.

Design of a treatment trial

A well-designed treatment trial provides a therapy the opportunity to show definitive evidence of clinical efficacy with an acceptable safety profile. In order to create this design, a full knowledge of the natural history of the disease and the biologic mechanisms responsible for its clinical effects, as well as the proposed intervention’s intended and unintended mechanisms of action is particularly useful. Unfortunately, we do not have this level of understanding in IPF or for any of its proposed therapies. Nevertheless, while none of the previously performed trials in IPF has been perfect, each has given us additional insight into both the clinical and biologic features of the disease that was otherwise unobtainable in cohort or natural history studies.1

While all of the efficacy and safety endpoints utilized in a treatment trial provide important information, whether a trial is considered to be positive or negative is determined by the effect of the intervention on the prospectively defined primary endpoint. For this reason, the choice of the primary endpoint in a trial designed to provide definitive evidence of efficacy is critical. One is most confident of the benefits of a therapy when a positive trial’s primary efficacy endpoint is a ‘clinical endpoint’. Clinical endpoints are ‘characteristics or variables that reflect how a patient feels, functions, or survives’2; that is, the endpoints measure something unquestionably relevant to a patient.

Potential endpoints

There has been no consensus on the appropriate primary endpoint for IPF treatment trials, and a variety of primary endpoints have been utilized over the last decade of research (Table 1). The following sections review the utility of a number of these.

Table 1: The primary endpoints used in IPF treatment trials over the past decade

Study | Drug | Primary endpoint
GIPF-001 (ref. 21) | IFN-γ | Median time to death or disease progression (decrease of at least 10% in the predicted FVC or an increase of at least 5 mm Hg in P(A-a)O2 at rest)
INSPIRE (ref. 4) | IFN-γ | Overall survival time
IFIGENIA (ref. 22) | N-acetylcysteine | Absolute changes in vital capacity and DLCO between baseline and Month 12
BUILD-1 (ref. 8) | Bosentan | Change in 6-min walk distance from baseline up to Month 12
BUILD-3 (ref. 23) | Bosentan | Time to IPF worsening (a confirmed decrease from baseline in FVC ≥10% and DLCO ≥15%, or acute exacerbation of IPF)
Shionogi (ref. 12) | Pirfenidone | The difference in the change in the lowest oxygen saturation by pulse oximetry (SpO2) during a 6-min exercise test
Shionogi (ref. 24) | Pirfenidone | Change in vital capacity from baseline to Week 52
CAPACITY 1/2 (ref. 25) | Pirfenidone | Change in percent predicted FVC at Week 72
Gleevec (ref. 26) | Gleevec | Time to disease progression (10% decline in percent predicted FVC from baseline) or death
STEP (ref. 9) | Sildenafil | Proportion of patients with an increase in the 6-min walk distance of ≥20%
BIBF-1120 (ref. 10) | Tyrosine kinase inhibitor | Annual rate of decline in FVC

Clinical endpoints


As IPF is a life-shortening disease, survival is the most robust primary endpoint, and it is difficult to imagine a situation where a treatment could be considered conclusively efficacious without demonstrating a survival benefit.3 A single trial with overall survival time as the primary outcome measure has been performed in IPF.4 Confirming the robustness of this study design, the trial ended years of argument about the potential benefit of interferon-γ (IFN-γ) when its results provided conclusive evidence of lack of efficacy.


An increase in dyspnea, a progressive decline in quality of life and a loss of functional capacity occur in almost all IPF patients. Additional clinically meaningful events including respiratory failure, hospitalization and/or lung transplantation will also occur in some patients.

Patient-reported outcomes

Patient-reported outcomes (PROs) are ‘any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else’.5 PROs can focus on specific symptoms or on more general concepts such as quality of life. Two IPF-specific PROs (designed to assess health-related quality of life) have been developed, ATAQ-IPF (A Tool to Assess Quality of Life in IPF) and SGRQi (St George’s Respiratory Questionnaire IPF-specific Version).6,7 However, data to support their validity in longitudinal settings are not yet available. A number of more generic instruments have been used as secondary endpoints in trials of IPF. Although the lack of disease specificity creates limitations, positive signals in symptoms and quality of life have been seen in a number of trials8–10 suggesting that more specific instruments may be useful if properly established.

Functional status

Currently, there is no consensus on a single tool that fully captures the limitations IPF places on a patient’s ability to perform daily activities. Maximal exercise capacity as measured by formal cardiopulmonary exercise testing is reduced, and while this has been used as an endpoint in scleroderma-related lung disease,11 it has not routinely been utilized in trials of therapy for IPF. An abbreviated measure of sub-maximal exercise capacity, the distance walked during a 6-min walk test, has been used as a primary endpoint in a number of trials. However, the test is difficult to interpret due to differences in performance technique and arguments over how to handle the oxygen desaturation that routinely occurs. The severity of this oxygen desaturation has been used as an endpoint,12 but this variable appears to be poorly reproducible.13

Meaningful clinical events

A number of clinically meaningful events can occur over the natural course of an IPF patient’s life. Some of these can be captured and potentially used as clinical endpoints. In any one year, 5–10% of patients may develop an episode of acute respiratory deterioration or acute exacerbation (AEX-IPF). Although these events are relatively rare, they are clinically important and often associated with an early death.14 Both respiratory and all-cause hospitalizations are also clinically meaningful events; they are correlated with mortality and can be captured during a trial. Lung transplantation is more problematic; when or even whether it occurs depends heavily on factors completely separate from the patient or investigational treatment. Given this and its relative rarity during the course of a treatment trial, the utility of lung transplantation as an endpoint is uncertain.

Surrogate endpoints

As obtaining an adequate number of clinical endpoints in a definitive treatment trial often requires large numbers of subjects to be followed over a long period of time, indirect measures of true clinical endpoints, or surrogate endpoints, are often used as substitutes in order to shorten the duration, decrease the number of study subjects and minimize the resources required. A ‘surrogate endpoint’ is a biomarker that is intended to substitute for a clinical endpoint, while a ‘biomarker’ is an objectively measured indicator of a normal biological process, pathologic process or pharmacologic response to a therapeutic agent.2 A common approach to the use of surrogate endpoints is to identify a biomarker that correlates with a clinical endpoint of interest and to design the study around the treatment’s expected effect on the biomarker, anticipating that this will consistently predict a similar effect on the relevant clinical endpoint.15 Unfortunately, there are multiple ways to be misled about both the efficacy and the safety of an intervention when relying on the response of a surrogate.16,17 For example, recognition that ventricular premature complexes (VPCs) that occur after acute myocardial infarction (AMI) are correlated with an increased death rate led to the Cardiac Arrhythmia Suppression Trial (CAST).18 This trial tested the hypothesis that suppression of post-AMI VPCs would reduce arrhythmic death. Although the intervention was associated with suppression of VPCs, it was also associated with a >3-fold increased risk of death. Validation of the surrogate as an appropriate substitute or replacement endpoint requires ‘substantial evidence’ that the effect of an intervention on a clinical endpoint is reliably predicted by its effect on the surrogate.2 To date, there are no validated surrogate endpoints in IPF.
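The discordance described above can be illustrated with a small numerical sketch. The counts below are entirely hypothetical and are not taken from CAST or from any IPF trial; the `Arm` class and the `relative_risk` helper are names introduced only for this illustration. The sketch simply shows that a treatment arm can look strongly favorable on a surrogate (more biomarker responders) while looking unfavorable on the clinical endpoint (more deaths).

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """Summary counts for one trial arm (all values used below are hypothetical)."""
    n: int                     # patients randomized
    surrogate_responders: int  # patients whose biomarker improved
    deaths: int                # patients who died during follow-up


def relative_risk(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Risk in arm A divided by risk in arm B, computed by cross-multiplication."""
    return (events_a * n_b) / (events_b * n_a)


# Hypothetical trial: the drug improves the biomarker in most treated patients...
treated = Arm(n=1000, surrogate_responders=800, deaths=60)
control = Arm(n=1000, surrogate_responders=200, deaths=20)

# ...so the surrogate analysis favors treatment (ratio above 1 means more responders),
rr_surrogate = relative_risk(
    treated.surrogate_responders, treated.n,
    control.surrogate_responders, control.n,
)
# ...while the clinical endpoint does not (ratio above 1 means more deaths).
rr_death = relative_risk(treated.deaths, treated.n, control.deaths, control.n)

surrogate_favors_treatment = rr_surrogate > 1.0
clinical_endpoint_favors_treatment = rr_death < 1.0

print(f"Relative rate of surrogate response: {rr_surrogate:.1f}")
print(f"Relative risk of death: {rr_death:.1f}")
```

A trial powered only on the surrogate would report this hypothetical drug as effective; only the clinical endpoint reveals the harm.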

Pulmonary physiologic measures, particularly forced vital capacity (FVC), have frequently been used as surrogate endpoints (Table 1). However, although measurable declines in FVC are correlated with mortality,19,20 such correlations do not validate FVC as an acceptable surrogate for mortality. More specifically, the considerable data necessary to establish that therapy-induced changes in FVC reliably predict similar consistent changes in mortality (or other clinical endpoints) are missing. Other physiologic measures such as the diffusing capacity for carbon monoxide (DLCO) suffer from the same or greater limitations. Until physiologic measures are validated as surrogate endpoints, uncertainty will surround their use and interpretation.

Composite endpoints

Composite endpoints are the combination of two or more individual endpoints. For example, progression-free survival (the combination of death or decline in FVC) has been used as a primary endpoint in IPF trials. Composite endpoints can be useful in estimating efficacy across more than one clinically important outcome, especially when clinical endpoints are used. However, when a composite endpoint includes an unvalidated surrogate, uncertainty around its clinical relevance increases for a variety of reasons; for example, a loss of data regarding the true clinical endpoint or potential discordant effects of the intervention on the surrogate and clinical endpoints can occur.
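As a rough sketch of how a composite such as progression-free survival is assembled, the example below counts an event when a patient either dies or shows a decline in percent predicted FVC. The patient records are hypothetical, the `Patient` class and function names are introduced only for illustration, and the 10% threshold is assumed here to mean a relative decline from baseline (trial protocols may define it differently).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """One hypothetical trial participant."""
    baseline_fvc_pct: float            # percent predicted FVC at baseline
    followup_fvc_pct: Optional[float]  # None if no follow-up measurement exists
    died: bool


def fvc_progression(p: Patient, threshold: float = 0.10) -> bool:
    """True if FVC fell by at least `threshold` relative to baseline (assumed definition)."""
    if p.followup_fvc_pct is None:
        return False
    decline = (p.baseline_fvc_pct - p.followup_fvc_pct) / p.baseline_fvc_pct
    return decline >= threshold


def composite_event(p: Patient) -> bool:
    """Progression-free survival event: death OR FVC progression."""
    return p.died or fvc_progression(p)


# Hypothetical cohort
patients = [
    Patient(70.0, 60.0, died=False),  # FVC progression only
    Patient(80.0, 78.0, died=True),   # death only
    Patient(65.0, 64.0, died=False),  # no event
    Patient(75.0, None, died=True),   # death before follow-up measurement
    Patient(90.0, 80.0, died=False),  # FVC progression only
]

composite_events = sum(composite_event(p) for p in patients)
deaths = sum(p.died for p in patients)
surrogate_only_events = sum(fvc_progression(p) and not p.died for p in patients)

print(f"Composite events: {composite_events}")
print(f"Deaths: {deaths}")
print(f"Events driven by FVC decline alone: {surrogate_only_events}")
```

In this hypothetical cohort, half of the composite events come from the unvalidated FVC component alone, which is one way the clinical relevance of a composite result can become uncertain.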


A treatment trial designed to definitively determine whether a therapy is beneficial in IPF demands an enormous contribution on the part of each of its participants, and the choice of the primary efficacy endpoint is a critical step in its development. The endpoints that provide the most useful efficacy information are clinical endpoints; i.e. endpoints that are directly related to how a patient feels, functions or survives. Since there are, as yet, no properly established PROs or measures of functional status in IPF, survival is clearly the most robust primary efficacy endpoint. Clinically meaningful events such as hospitalization can also provide important efficacy information. Unfortunately, in the absence of validation, dependence upon surrogate endpoints will lead to uncertainty about trial results.


Dr. Swigris is supported in part by an NIH grant (K23 HL092227).

Conflict of interest: None declared.

