## Life expectancy of patients with newly-diagnosed HIV infection in the era of highly active antiretroviral therapy

## Abstract

**Background:** Limited data are available on the life expectancy of patients with newly-diagnosed HIV infection in the era of highly active antiretroviral therapy (HAART).

**Aim:** To provide such an estimate using a semi-parametric projection.

**Design:** Statistical analysis.

**Methods:** Follow-up data for patients newly diagnosed with HIV infection in Taiwan (HIV/AIDS Cohort) from 1 May 1997 to 30 April 2003 (*n* = 3351, only 1% are injecting drug users) were analysed using the Kaplan-Meier method. The survival function for an age- and gender-matched reference population was generated by the Monte Carlo method from the life-table of the general population. A constant excess hazard model was used to project long-term survival of HIV-infected patients, with linear extrapolation of a logit-transformed curve of survival ratio between HIV-infected patients and the reference population.

**Results:** The 5-year survival rate was 58% in patients who had already developed AIDS at diagnosis (AIDS group), and 89% in those who had not (non-AIDS group). Extrapolation yielded an expected mean survival time of 10.6 years after diagnosis for the AIDS group, and 21.5 years after diagnosis for the non-AIDS group.

**Discussion:** Our results support the expansion of HIV screening programs to minimize delay in diagnosis. With continuing advances in HAART, this estimate of survival in initially asymptomatic patients may be conservative. Their long life expectancy raises questions about what kind of preventive health services should be offered. These should be addressed through further analysis of overall benefit and cost-effectiveness.

## Introduction

The introduction of highly active antiretroviral therapy (HAART) has dramatically improved the short-term survival of patients with human immunodeficiency virus (HIV) infection.^{1–3} However, there has been a lack of empirical data on long-term survival. A valid estimation of life expectancy after diagnosis would be of great value, not only for public health policy-making but also for formulation of clinical guidelines.^{4}

Most literature reports regarding the estimated survival time of HIV-infected patients in the HAART era are based on Markov models with Monte Carlo simulation.^{5–12} However, survival time estimated by Markov modelling was highly dependent on assumptions about the efficacy of antiretroviral therapy.^{7} For practical purposes, a more robust approach is needed.

Parametric survival modeling^{13,14} is not a suitable alternative in this scenario, because of a high rate of right-censoring in HIV cohorts. As the follow-up time increases, the age factor and various comorbid diseases unrelated to HIV could influence the expected survival time.^{12,15} It is therefore very difficult, if not impossible, to take all these factors into account mathematically and produce a specific functional formula.^{16} To consider both HIV-related excess hazard and non-HIV-related background hazard, we have developed a semi-parametric method to incorporate the life expectancy information of the background general population into the estimation process.^{17–21} If the HIV-related excess hazard remains constant in the chronic stage of a stable disease, HIV-infected patients’ long-term survival can be projected from the available follow-up data, using the lifetime survival function of an age- and gender-matched reference population as the reference. The survival function of this reference population can be generated by Monte Carlo methods from the life table of the general population.^{17}

In this study, we first evaluated the validity of this method using follow-up data from the Taiwan HIV/AIDS Cohort, which included all identified HIV-positive patients in Taiwan. We then applied this method to estimate the life expectancy of patients with newly-diagnosed HIV infection in the HAART era. The estimates were compared with the results obtained from standard parametric survival modelling.

## Methods

### Taiwan HIV/AIDS Cohort

The Taiwan HIV/AIDS Cohort includes all identified HIV-positive citizens in Taiwan. Both HIV infection and acquired immunodeficiency syndrome (AIDS) are reportable diseases in Taiwan.^{22} All identified HIV cases must be confirmed by Western blot and reported to the Center for Disease Control (Taipei, Taiwan). AIDS is defined according to the Centers for Disease Control and Prevention (Atlanta, USA) 1993 revised criteria,^{23} but with three minor modifications: (i) Disseminated or extrapulmonary infection due to *Penicillium marneffei*,^{24} an opportunistic pathogen endemic in southeast Asia, is considered an AIDS-defining condition. (ii) Because of a high incidence (50–75 cases per 100 000 population per year) of pulmonary tuberculosis in the non-HIV general population,^{25} pulmonary tuberculosis is an AIDS-defining condition only if the patient's CD4 count is <200/μl. (iii) Patients who have a CD4 count <200/μl are considered to have AIDS only after they develop at least one AIDS-defining opportunistic infection. The Center for Disease Control (Taipei, Taiwan) maintains a periodically updated data profile, including the date of diagnosis, age, gender, date of development of AIDS, and date of death for each Western-blot-confirmed case.

### HAART

HAART was introduced in Taiwan in 1997, and was freely offered to all identified HIV-positive citizens through the National Health Insurance system.^{26} The timing of initiating HAART, and the regimens, both followed the evolving guidelines recommended in the US.^{27–29} Initially, early treatment was encouraged, except for those with blood HIV-RNA levels <5000 copies/ml and a CD4 cell count ⩾500/μl. In 2002, the practice of initiating HAART in asymptomatic patients was gradually changed to the new criteria of a CD4 count <350/μl or a peripheral blood HIV-RNA level >55 000 copies/ml.^{29,30} In 1997, only unboosted protease inhibitor (PI)-based regimens were available. Non-nucleoside reverse transcriptase inhibitor-based and boosted PI-based combinations became the first-line regimens in 2000 and 2001, respectively.

### Survival in patients with newly diagnosed HIV infection in the HAART era

We included all cases diagnosed in the period from 1 May 1997 to 30 April 2003. Being newly diagnosed in the HAART era, these patients were all treatment-naive when HAART was started. The survival status of each patient was further verified by cross-checking with the national death certification database maintained by the Ministry of the Interior, Taiwan.^{31} We used the Kaplan-Meier method to estimate the survival function, stratified by whether patients had already developed AIDS-defining conditions at diagnosis of HIV infection, using the follow-up data from 1 May 1997 to 30 April 2003.
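As an illustration of the Kaplan-Meier (product-limit) estimate used here, the following is a minimal sketch on made-up follow-up times. The function name `kaplan_meier` and the toy data are assumptions for illustration only; they are not the cohort data or the software used in the study.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate.

    times  : follow-up time of each subject
    events : 1 if the subject died at that time, 0 if censored
    Returns the distinct death times and the survival estimate just after each.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    death_times = np.unique(times[events])
    survival = []
    s = 1.0
    for t in death_times:
        at_risk = np.sum(times >= t)             # still under observation at t
        deaths = np.sum((times == t) & events)   # deaths occurring exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return death_times, np.array(survival)

# Hypothetical follow-up in months (1 = death, 0 = censored)
toy_times = [2, 5, 5, 8, 12, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0, 1, 0, 0]
km_t, km_s = kaplan_meier(toy_times, toy_events)
```

Stratification by AIDS status at diagnosis would correspond to calling the function separately on each group.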

### Survival in the reference population

Life tables for the general population were obtained from vital statistics published by the Department of Statistics, Ministry of the Interior, Executive Yuan, Taiwan. Life expectancy at birth in Taiwan gradually increased from 75.04 years in 1999 to 75.87 years in 2002. Because the individual survival times of subjects in a hypothetical cohort cannot be directly derived from the life table of the general population, we used a Monte Carlo method to generate the simulated remaining survival time of age- and gender-matched hypothetical subjects for each patient in the Taiwan HIV/AIDS Cohort. For example, the remaining survival time of a hypothetical subject corresponding to a male patient of age *x* may be generated as follows.

From the life table of the general population, we first find $q_{x+k}$, which is the proportion of male persons alive at the beginning of the age interval (*x* + *k*, *x* + *k* + 1) but dead during the interval, for *k* ⩾ 0. The conditional survival function of the male general population who have survived to age *x* is given by

$$S(t \mid x) = \prod_{k=0}^{t-1} \left(1 - q_{x+k}\right)$$

for *t* > 0, and *S*(0|*x*) = 1. Secondly, a uniform random number *u* between 0 and 1 is generated. The time *t*_{x} such that *S*(*t*_{x}|*x*) = *u* is a survival time for the hypothetical subject. The total collection of hypothetical subjects was used as the reference population. The ratio between the numbers of hypothetical subjects and each real patient was set to bring the final size of the reference population up to 100 000. The survival curve of the reference population is then obtained by applying the Kaplan-Meier method to the simulated survival times.
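The sampling step described above is inverse-transform sampling from the conditional survival function. A minimal sketch follows, assuming a made-up Gompertz-like table of one-year death probabilities (`q_male`) and whole-year resolution with no interpolation within a year; the function names are hypothetical and this is not the MC-QAS implementation.

```python
import numpy as np

def conditional_survival(q, x, horizon):
    """S(t|x) for t = 0..horizon, where q[a] is the probability of dying
    in the age interval (a, a + 1) given survival to age a."""
    probs = np.asarray(q[x:x + horizon], dtype=float)
    return np.concatenate(([1.0], np.cumprod(1.0 - probs)))

def simulate_remaining_years(q, x, n, rng):
    """Draw n remaining survival times (whole years) for subjects aged x."""
    s = conditional_survival(q, x, len(q) - x)
    u = rng.uniform(0.0, 1.0, size=n)
    # s is non-increasing, so -s is ascending; this finds, for each u,
    # the first year t at which S(t|x) has fallen to u or below.
    return np.searchsorted(-s, -u, side="left")

# Hypothetical life table for ages 0..110 (NOT the Taiwanese life table)
ages = np.arange(111)
q_male = np.minimum(1.0, 1e-4 * np.exp(0.085 * ages))
q_male[-1] = 1.0  # close the table so that survival eventually reaches 0

rng = np.random.default_rng(0)
draws = simulate_remaining_years(q_male, x=35, n=100_000, rng=rng)
```

Because the draw exceeds *t* exactly when *u* < *S*(*t*|*x*), the mean of the simulated times should be close to the sum of *S*(*t*|*x*) over the table, which gives a quick sanity check on the simulation.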

### Logit survival ratio extrapolation

The survival ratio between the survival functions of two populations is defined by the formula: *W*(*t*) = *S*(*t* | patient population)/*S*(*t* | reference population). Because the patient population has a worse survival than the reference population, the value of *W*(*t*) initially equals 1 at time point *t* = 0, then gradually decreases due to disease-associated excess mortality. Because the value of *W*(*t*) is limited to the range from 0 to 1, linear regression for temporal trend is not applicable. We therefore used the logit transformation of *W*(*t*), or log[*W*(*t*)/(1 − *W*(*t*))].^{17} A higher logit *W*(*t*) value corresponds to a higher *W*(*t*) value, but the range of values was transformed from 0 ∼ 1 to that of −∞ ∼ +∞. Furthermore, if the HIV-associated excess hazard remains constant over time, the curve of the logit of *W*(*t*) will converge to a straight line, as shown below.

Let *H*(*t* | patient) = *H*(*t* | reference) + HIV-associated excess hazard C_{1}, where C_{1} is a positive constant. Using the definition of hazard function *H*(*t*) = −[d*S*(*t*)/d*t*]/*S*(*t*), this equation can be rewritten as d*S*(*t* | patient)/*S*(*t* | patient) = d*S*(*t* | reference)/*S*(*t* | reference) − C_{1} × d*t*. We integrate both sides of this equation to obtain ln [*S*(*t* | patient)] = ln [*S*(*t* | reference)] + C_{0} − C_{1} × *t*, so that *W*(*t*) = exp (C_{0} − C_{1} × *t*). Therefore, we obtain the following equation:

$$\operatorname{logit} W(t) = C_0 - C_1 \times t - \ln\left[1 - \exp\left(C_0 - C_1 \times t\right)\right]$$

Because C_{1} > 0, the residual term ln [1 − exp (C_{0} − C_{1} × *t*)] will converge to 0 when *t* → ∞. As a result, when *t* → ∞, logit *W*(*t*) will approximate to C_{0} − C_{1} × *t*, a straight line with slope −C_{1}.
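The convergence of logit *W*(*t*) to the line C_{0} − C_{1} × *t* can be checked numerically. The sketch below assumes arbitrary illustrative constants (`c0 = -0.5`, `c1 = 0.01` per month); these are not estimates from the study.

```python
import numpy as np

def logit(w):
    """Log-odds transform of a value in (0, 1)."""
    return np.log(w / (1.0 - w))

# Hypothetical constants; c0 < 0 keeps W(t) below 1 for all t >= 0
c0, c1 = -0.5, 0.01
t = np.array([12.0, 60.0, 120.0, 360.0, 600.0])  # months

w = np.exp(c0 - c1 * t)            # W(t) under a constant excess hazard
line = c0 - c1 * t                 # the limiting straight line
gap = logit(w) - line              # equals -ln[1 - exp(c0 - c1 * t)]
```

Under these assumed values the gap between the curve and the line shrinks steadily as *t* grows, which is the behaviour the derivation predicts.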

The extrapolation process consisted of three phases. First, a plot of logit *W*(*t*) over time was created. The time point *T*_{s} after which the logit *W*(*t*) curve became a nearly straight line was then identified. Second, we fitted a simple linear regression for logit *W*(*t*) from *T*_{s} to the end of follow-up *T*_{f}, that is:

$$\operatorname{logit} W(t) = \beta_0 + \beta_1 t + N_t,$$

where the noise term *N*_{t} is independently and normally distributed with mean 0 and variance σ^{2}. Finally, given the least squares estimates of the two parameters, $\hat{\beta}_0$ and $\hat{\beta}_1$, we project the long-term survival curve of the patient population beyond the follow-up limits as:

$$\hat{S}(t \mid \text{patient}) = S(t \mid \text{reference}) \times \frac{\exp(\hat{\beta}_0 + \hat{\beta}_1 t)}{1 + \exp(\hat{\beta}_0 + \hat{\beta}_1 t)}$$

for *t* > *T*_{f}. The standard error of survival estimates was obtained through a bootstrap method by implementing the extrapolation process with data simulated by repeatedly sampling with replacement from the real data set 1000 times.

To facilitate the computation, we developed a software program MC-QAS, which was built in the R and S-PLUS 2000 (MathSoft) environment and can be freely downloaded from <http://www.stat.sinica.edu.tw/jshwang> (released in December 2004).
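The three extrapolation phases can be sketched in a few lines. This is an illustrative outline only, not the MC-QAS code: the function name `extrapolate`, the synthetic curves, and the chosen window (12 to 72 months) are all assumptions, and the toy patient curve is constructed so that logit *W*(*t*) is exactly linear, which real data only approach. The bootstrap step for standard errors is omitted.

```python
import numpy as np

def expit(z):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + np.exp(-z))

def extrapolate(t, s_patient, s_reference, t_s, t_f):
    """Fit logit W(t) by least squares on [t_s, t_f], then project
    patient survival for t > t_f from the reference survival curve."""
    t = np.asarray(t, dtype=float)
    s_patient = np.asarray(s_patient, dtype=float)
    s_reference = np.asarray(s_reference, dtype=float)

    window = (t >= t_s) & (t <= t_f)
    w = s_patient[window] / s_reference[window]
    slope, intercept = np.polyfit(t[window], np.log(w / (1.0 - w)), 1)

    projected = s_patient.copy()
    future = t > t_f
    projected[future] = s_reference[future] * expit(intercept + slope * t[future])
    return intercept, slope, projected

# Synthetic example in months (hypothetical values throughout)
months = np.arange(0, 601)
s_ref = np.exp(-0.0008 * months)              # made-up reference survival
true_b0, true_b1 = 2.0, -0.01
s_pat = s_ref * expit(true_b0 + true_b1 * months)
s_pat[months > 72] = np.nan                   # nothing observed after follow-up

b0, b1, s_proj = extrapolate(months, s_pat, s_ref, t_s=12, t_f=72)
```

In this constructed example the fitted slope and intercept recover the values used to generate the data, and the projected curve fills in the unobserved months beyond the end of follow-up.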

### Parametric survival modelling

For comparison, a standard parametric survival regression and extrapolation was also applied to the same follow-up data. Models based on the versatile three-parameter extended generalized gamma distribution were chosen, of which the popular two-parameter Weibull distribution is a special case. S-PLUS 2000 (MathSoft) and SAS Proc Lifereg (SAS Institute) (distribution = gamma or Weibull) were used for computation.

## Results

### Characteristics of patients

A total of 3351 HIV-positive patients, of whom 718 (21%) had already developed AIDS-defining conditions at diagnosis (the AIDS group) and 2633 had not (the non-AIDS group), were diagnosed between 1 May 1997 and 30 April 2003. The great majority of the 3351 were men (93%). The most common age at diagnosis was 20–29 years (37%), followed by 30–39 years (35%), and 40–49 years (13%). Sexual contacts (98%) were the predominant risk factor, followed by intravenous drug use (IVDU) (1%). The mean age at diagnosis in the AIDS group was significantly higher than that in the non-AIDS group (40.6 vs. 33.1 years, *p* < 0.001).

### Observed 6-year survival curves

The Kaplan-Meier survival curves for the AIDS group (*n* = 718) and the non-AIDS group (*n* = 2633) are shown in Figures 1a and 1b, respectively. The longest follow-up period was 6 years. The 5-year survival rate in the non-AIDS group was 89%. For the AIDS group, the survival rate dropped rapidly to 66% at the end of the first year, and then gradually decreased to 58% at the end of the fifth year.

### Temporal trend of logit *W*(*t*)

The plots of logit *W*(*t*) in the AIDS group and the non-AIDS group are shown in Figures 2a and 2b, respectively. In both groups, the logit *W*(*t*) curve underwent an initial rapid decline, then levelled off after the first year, and eventually converged to a straight line with a negative slope of −0.008/month and −0.010/month, respectively. If stratified according to gender or age at diagnosis of HIV infection, the logit *W*(*t*) curves still followed the same temporal trend, and eventually converged to a straight line with a negative slope of −0.007/month (women), −0.009/month (men), −0.007/month (age <40 years) and −0.010/month (age >40 years). This indicates that the observed survival data of HIV-positive patients met the assumption of a constant HIV-associated excess hazard.

### Validity of extrapolation

The first 3-year follow-up data (1 May 1997–30 April 2000) from the 1264 patients diagnosed in that period were analysed to extrapolate the survival curve to 3 years beyond 30 April 2000. The 1999 life-table of the general population was used as the reference. Predicted 6-year survival curves were then compared with those actually observed from 1 May 1997 to 30 April 2003. The validity of the parametric survival models was examined using the same data.

The logit survival ratio extrapolation method predicted that the AIDS group (*n* = 326) would have a mean survival time (±SE) of 43.3 ± 2.2 months at the end of the 6-year follow-up. There was no significant difference (95% CI −4.9 to 6.9 months) from the actual value (42.3 ± 2.0 months). The predicted survival curve closely matched the actual curve (Figure 3a). In comparison, the Weibull survival model and the extended generalized gamma model predicted mean survival times of 38.3 months and 41.6 months, respectively. The Weibull survival model failed to capture the abrupt decrease in hazard after the initial months in the AIDS group, and did not achieve a good match between the predicted and the actual survival curves, while the more versatile extended generalized gamma model yielded a better match (Figure 3a).

Similarly, the logit survival ratio extrapolation method predicted that the non-AIDS group (*n* = 938) would have a mean survival time (±SE) of 65.6 ± 1.9 months at the end of the 6-year follow-up. There was no significant difference (95% CI −4.7 to 3.1 months) from the actual value (66.4 ± 0.6 months). Both the Weibull and the extended generalized gamma survival models predicted a mean survival time of 67.7 months. The logit survival ratio extrapolation method, as well as the Weibull and the extended generalized gamma models, achieved a reasonable match between the predicted and the actual 6-year survival curves in the non-AIDS group (Figure 3b).

### Long-term survival after diagnosis

The 6-year follow-up data (follow-up from 1 May 1997 to 30 April 2003) of the 3351 patients diagnosed in that period were used to extrapolate the survival curve to the 50th year after diagnosis for estimation of the life expectancy. The 2002 life-table of the general population was used as the reference.

The 50-year survival curves predicted via the logit survival ratio extrapolation method for patients initially with or without AIDS are shown in Figures 4a and 4b, respectively. For the AIDS group (*n* = 718), the predicted survival probability was 0.58 at the end of the fifth year, 0.43 at the end of the 10th year, 0.31 at the end of the 15th year, and 0.07 at the end of the 30th year (Figure 4a). The estimated mean ± SE lifetime survival was 10.6 ± 3.2 years after diagnosis. For the non-AIDS group (*n* = 2633), the predicted survival probability was 0.89 at the end of the fifth year, 0.80 at the end of the 10th year, 0.69 at the end of the 15th year, and 0.25 at the end of the 30th year (Figure 4b). The estimated mean ± SE lifetime survival was 21.5 ± 5.7 years after diagnosis.
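A mean survival time over a fixed horizon corresponds to the area under the survival curve up to that horizon. The sketch below shows this calculation with the trapezoidal rule on a made-up exponential curve; the integration rule, the function name `mean_survival`, and the curve itself are assumptions for illustration, since the text does not state how the area was computed numerically.

```python
import numpy as np

def mean_survival(t, s):
    """Area under a survival curve s(t) by the trapezoidal rule."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(s, dtype=float)
    return float(np.sum(np.diff(t) * (s[:-1] + s[1:]) / 2.0))

# Hypothetical curve: constant hazard of 0.05 per year over a 50-year horizon
years = np.linspace(0.0, 50.0, 501)
toy_curve = np.exp(-0.05 * years)

area = mean_survival(years, toy_curve)
exact = (1.0 - np.exp(-0.05 * 50.0)) / 0.05   # closed form for this toy curve
```

For this toy curve the numerical area agrees closely with the closed-form value, which is a convenient check before applying the same function to a projected survival curve.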

In comparison, the standard parametric survival model based on the extended generalized gamma distribution yielded erroneous predictions of long-term survival curves (Figures 4a and 4b). HIV-positive patients are expected to have a worse long-term survival than the reference population. However, the extended generalized gamma model predicted a better survival in the HIV-positive patients at 30–40 years after diagnosis than in the age- and gender-matched reference population. This indicates that standard parametric models may yield grossly deviated results in the prediction of long-term survival for HIV patients.

## Discussion

In our analysis, the semi-parametric logit survival ratio extrapolation method performed better than parametric survival models. Our results are however limited by uncertainty regarding the stability of excess hazard in the extrapolation period. For example, the efficacy of HAART may gradually decrease over time due to accumulation of resistance mutations. Although this may be balanced by the introduction of a more potent salvage therapy, the impaired immune functions and the adverse effects of some HAART regimens on metabolic profiles may also take their toll in the late course of the illness, and may interact synergistically rather than additively with underlying diabetes mellitus or coronary artery disease.^{32} Since the HIV-related excess hazard is unlikely to be exactly constant throughout the extrapolation period, a certain degree of prediction error is unavoidable.^{17} Despite this, our semi-parametric method avoids the gross deviations in long-term projections seen with the extended generalized gamma survival models, with the advantage of an input of information from the life table of the background general population.

Our results should be interpreted with another important limitation in mind. With the introduction of more potent new-generation first-line HAART regimens and advances in medical care for opportunistic infections and malignancies,^{29,33,34} survival in HIV-positive patients is likely to see improvements in the future beyond those in our 1997–2003 data. This trend is supported by recent epidemiological evidence.^{15} Despite the concern of widespread transmission of drug-resistant virus among the population,^{35} a cohort study in the US showed a continuing decrease from 1997 to 2003 in the annual death rate of non-IVDU HIV-positive patients.^{15} As a result, our prediction based on the 1997–2003 follow-up data is likely to be a conservative estimate.

Despite these limitations, our method yielded similar estimates to those reported in recent studies using calibrated Markov models.^{8,10–12} Before 2001, when HAART was still in its infancy, researchers conducting Markov modelling used a conservative assumption of an initial round of HAART effective for a maximum of only 2 years followed by a single round of less effective salvage therapy.^{7} This led to the conclusion that patients with newly-diagnosed HIV infection had a mean expected survival of only 2.84–9.13 years after diagnosis.^{5,7} In recent years, observational cohort data with increasing follow-up length became available and allowed calibration of the probability parameters. Using these updated models, Braithwaite *et al*. estimated a median survival time of 20.4 years for newly diagnosed HIV-infected patients, and 12.2 years for those with a CD4 count of 200/μl and a viral load of 1 000 000 copies/ml.^{12} Similarly, King *et al*. estimated a median survival time ranging from 15.4 to 26.6 years for HIV patients with an initial CD4 count >200/μl, and 8.5 years for those with a CD4 count ⩽200/μl.^{8} Paltiel *et al*. also estimated that newly-diagnosed HIV patients would have a mean survival of 19.0–19.6 years.^{10} These results, based on approaches entirely different from ours, support the robustness of our estimates.

Our results show that, even under a National Health Insurance system which provides HIV-positive patients free access to HAART and medical care, there was still a significant portion (21%) of patients who did not receive HIV testing until they developed AIDS-defining conditions. This delay was associated with a high risk of mortality within 1 year, and a much worse life expectancy. HIV screening programs, which have been shown to be cost-effective in two simulation studies,^{10,11} should be expanded to minimize such a delay in diagnosis and unnecessary premature mortality. On the other hand, newly diagnosed asymptomatic patients have an expected mean survival time of at least 21.5 years, which will probably continue to improve in the coming years. One probabilistic model predicted that 36–72% of them would die from causes not directly attributable to HIV.^{12} For clinicians providing medical care for HIV-positive persons, a compelling question is: what kind of preventive health services should be offered to them? Should they be exactly the same as for individuals without HIV? Should they be less? Should they depend upon prognostic markers? Formulation of clinical guidelines needs to consider the overall benefit and the cost-effectiveness, with remaining life expectancy as an important determinant.^{4,12}

Several important features of our estimates should be recognized. First, derived from the national cohort data, our estimates account for effects such as unsatisfactory drug adherence among less motivated patients and even interruption of treatment for various reasons. Highly motivated individuals with stricter drug adherence, such as those seen in clinical trials, may have a much better long-term survival than our estimates. But patients with unfavourable risk profiles such as those with HBV or HCV coinfections^{36} may have a life expectancy significantly worse than our estimates. Second, only 1% of patients in our cohort were IVDUs. Therefore, our results may not be generalizable to this population, which may have a worse long-term survival than other non-IVDU HIV-infected patients.^{36} Third, our estimates were based on the scenario that HIV-positive patients have free access to HAART. Because access to HAART is an important determinant of long-term survival for people living with HIV/AIDS,^{37–39} our estimates do not account for those without such access, for economic or other reasons. Fourth, in addition to the above clinical factors, there are other important determinants of life expectancy in newly diagnosed HIV patients. These include socioeconomic factors such as average income, as well as the performance of public health systems and the quality of medical care.^{40} Our estimates are therefore not directly applicable to developing countries that have less favourable socioeconomic conditions and a much shorter life expectancy at birth. To calculate the corresponding estimate for a specific country, local data should be used regarding life-tables of the general population and the survival of patients with HIV/AIDS.

In conclusion, by incorporating information from the life-table of the general population, estimation using the logit survival ratio extrapolation method is a robust approach to calculating the life expectancy of HIV-positive patients. Patients who have already developed AIDS at presentation have a high risk of mortality within 1 year, and a much worse life expectancy. HIV screening programs should be expanded to minimize delay in diagnosis. Patients with newly diagnosed asymptomatic HIV infection in the HAART era are expected to survive for a mean of 21.5 years after diagnosis according to current projections. Because of continuing advances in HAART, our current estimate is likely to be conservative. This long life expectancy raises questions about what kind of preventive health services should be offered; these need to be addressed through further analysis of overall benefit and cost-effectiveness.

## Acknowledgments

Jing-Shiang Hwang and Jung-Der Wang contributed equally to this work. This study was supported by grants DOH91-DC-1056, DOH 92-DC-1032 and DOH95-DC-1104 from the Department of Health, Executive Yuan, Taiwan, and integrated grants NHRI-EX92-9204PP and NHRI-EX93-9204PP from the National Health Research Institutes, Taiwan. The preliminary version of this paper was presented as a poster (abstract 8432) at the XV International AIDS Conference, Bangkok, July 11–16, 2004.

© 2007 The Author(s)

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (<http://creativecommons.org/licenses/by-nc/2.0/uk/>) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.