Q J Med 2001; 94: 695-698
© 2001 Association of Physicians
Improving the measurement of quality of life in older people: the York SF-12
From the Department of Health Studies, University of York, York, UK
Received 3 August 2001 and in revised form 9 October 2001
| Summary |
|---|
To assess whether changing the layout of the SF-12 affected item response rates, we tested two SF-12 formats in a quasi-randomized trial of women aged 70 years and over in two general practices in North Yorkshire. The modified version of the SF-12 (York SF-12) converted the stem and leaf format of some questions to individual items. We assessed the effect of the two types of questionnaire on item response rates. The difference in overall response rates to the two questionnaires (York SF-12 26.8%; SF-12 29.5%) was not statistically significant (95%CI -1.88% to 7.22%). However, the modified SF-12 had a statistically significantly lower item non-response rate of 8.5%, compared with 26.6% for the SF-12 (95%CI for the difference 11.1% to 25.1%). Cronbach's alpha reliability scores for the York SF-12 were also slightly better than for the older version. The York version of the SF-12 is an improvement on the original questionnaire. We recommend that the York SF-12 be used in preference to the SF-12 when surveying an older population.
| Introduction |
|---|
The SF-36 is a widely used quality-of-life instrument. However, its length could affect response rates, particularly in older people. For instance, we previously found an 8% reduction in response rates in a population survey when a questionnaire was extended from 4 to 7 pages of A4 [1], while Dorman and colleagues, using a population from a randomized trial, noted a 5% reduction in response rates when the SF-36 was compared against the much shorter EuroQol questionnaire [2]. Because of possible problems with response rates and data entry burden, a shortened version of the SF-36 (the SF-12 [3]) is popular. Recently, both of these instruments have been amended. For instance, some questions that had a binary response have been replaced with a Likert response scale. In addition, the layout for some questions has been changed from a vertical format to a horizontal layout.
We are using the latest version of the SF-12 in a number of ongoing randomized trials. However, the SF-12 (like the SF-36) has a confusing stem and leaf layout for some of its items. Some questions are preceded by a general phrase such as "How much during the last month:", which is then followed by up to three specific questions (Figure 1). We noticed in a pilot study that this layout is confusing to older respondents. In the context of a pilot study for a large randomized controlled trial of patients with chronic leg ulcers, eleven elderly patients were asked to complete four different quality-of-life questionnaires, presented in the following order: EuroQol, SF-12, pain, and ulcer-specific questionnaires.
Even though patients were asked to respond to the questionnaires on their own, they often asked for assistance with questions in the SF-12. They had difficulty working out how they should respond to the stem and leaf questions. When it was explained to patients that the researcher could not help, since this could bias their responses, patients tended to miss items (questions) from the questionnaire, tick the same item twice, or tick the descriptions of the response categories rather than the appropriate box (Iglesias, unpublished data). This results in loss of data and also leads to data entry problems. When a respondent ticks two categories, not only is it impossible to ascertain which is the true response, but when data entry is through a scanner an error message is produced, requiring the questionnaire to be checked manually, which increases the cost and complexity of data entry. Similarly, if the respondent ticks the description of the response category rather than the box, this again leads to problems with automated data entry. Because of these problems, we decided to amend the layout of the SF-12 without altering the existing questionnaire's length (two sides of an A4 page) and to undertake a randomized comparison of the two versions in terms of response rates and validity.
| Methods |
|---|
We amended the SF-12 questionnaire by converting all the stem and leaf questions into individual items. This version was piloted among a small convenience sample of patients at a leg ulcer clinic. After some slight revisions we decided to test the York SF-12 in a randomized trial.
We undertook an opportunistic study in the context of recruiting women aged 70 years and over for a randomized trial of hip protectors for fracture prevention. We mailed out 1500 questionnaires, 750 of each version (York SF-12 and SF-12). This sample size was chosen to detect at least an 8% difference in response rate with 80% power (2p = 0.05). The EuroQol without the analytic scale was also included in both questionnaires. Questionnaires were placed in an opaque sealed envelope with a prepaid reply envelope. To produce equivalent groups, we used alternation, which, if strictly adhered to, results in comparable groups [4]. The sealed envelopes were then sent to participating general practices. Staff at the general practices put address labels on the envelopes and posted them to the participants. No reminders were or could be sent. Completed questionnaires were returned to the Department of Health Studies.
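The sample-size reasoning above can be illustrated with a minimal sketch of the usual normal-approximation formula for comparing two proportions. The paper does not state the baseline response rate or the formula used, so the 50% baseline below is an assumption made purely for illustration (it is the value at which the variance of a proportion is greatest), and the function name `n_per_group` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Questionnaires needed per arm to detect p1 vs p2 with a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    p_bar = (p1 + p2) / 2                          # pooled proportion under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# ASSUMPTION: a 50% baseline response rate, which the paper does not report.
required = n_per_group(0.50, 0.42)
print(required)
```

Under this assumed baseline the requirement comes out at roughly 600 per arm, so 750 questionnaires per version would be more than sufficient; a different baseline would give a different figure.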
Statistics
Comparisons of return and item completion rates, and of the general health status of respondents, between the two types of questionnaire were performed using χ² tests. We also compared item-response rates between the EuroQol and the two versions of the SF-12 within questionnaires using the McNemar test for paired data. To assess the validity of the new version, we explored any possible psychometric differences between the York SF-12 and the SF-12 questionnaires by principal components analysis, using a varimax orthogonal rotation with an exclusion criterion of 0.5. The adequacy of the data for factor analysis was assessed using the Kaiser-Meyer-Olkin (KMO) test and Bartlett's test of sphericity [5]. Cronbach's alpha [6] was used to explore the questionnaires' internal reliability.
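As an illustration of the internal reliability measure, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores). It is not the authors' analysis code (the cited reference suggests SPSS was used); the function name and the example scores are hypothetical, and the sketch assumes complete cases only.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """items[i][j] is respondent j's score on item i (no missing values)."""
    k = len(items)
    # Total score per respondent, summing across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: two items answered by four respondents.
example = [[1, 2, 3, 4], [2, 1, 4, 3]]
alpha = cronbach_alpha(example)
print(alpha)
```

Values close to 1 indicate that the items in a domain vary together; identical items give exactly 1.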
| Results |
|---|
Of the 1500 questionnaires mailed out, 422 were returned: 221 (29.5%) for the old version and 201 (26.8%) for the new version. No statistically significant difference (p=0.779) was found in a comparison of general health status between the respondents to the two versions of the SF-12. Differences in overall response, missing item, and single item response rates per questionnaire are described in Table 1.
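The confidence intervals quoted in the Summary for these differences can be checked with a simple normal-approximation (Wald) interval for the difference between two independent proportions. The paper does not say which interval method was used, so this is a sketch under that assumption; the function name `diff_ci` is hypothetical, and using the numbers of returned questionnaires (221 and 201) as denominators for item non-response is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(p1: float, n1: int, p2: float, n2: int, level: float = 0.95):
    """Wald confidence interval for p1 - p2 from two independent samples."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se


# Overall response: 221/750 (SF-12) versus 201/750 (York SF-12).
overall = diff_ci(221 / 750, 750, 201 / 750, 750)

# Item non-response: 26.6% (SF-12) versus 8.5% (York SF-12).
# ASSUMPTION: denominators are the returned questionnaires (221 and 201).
items = diff_ci(0.266, 221, 0.085, 201)

for low, high in (overall, items):
    print(f"{low * 100:.2f}% to {high * 100:.2f}%")
```

Under these assumptions the first interval agrees with the reported -1.88% to 7.22%, and the second with the reported 11.1% to 25.1% to one decimal place.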
As Table 1
We also compared the item response rates between the two versions of the SF-12 and the EuroQol. As expected, there was no statistically significant difference in item non-response rates for the EuroQol when combined with either version of the SF-12: 3.1% and 2% for the SF-12 and York SF-12, respectively (95%CI -1.85% to 4.14%). However, the difference (6.5%) between the modified SF-12 item non-response and that of the EuroQol, although statistically significant (95%CI 2% to 8%, p<0.004), was substantially lower than the 23.8% difference between the old SF-12 and the EuroQol (95%CI 17.8% to 29.8%, p<0.001).
The KMO and Bartlett's tests were satisfactory for both versions of the questionnaire (0.93049 and 2060.2787, p<0.0001 for the York SF-12; 0.90573 and 1455.5486, p<0.0001 for the old SF-12). Subjecting both questionnaires to factor analysis yielded the expected two-factor solution. The internal reliability of the York SF-12, measured using Cronbach's alpha, was 0.94 and 0.91 for the physical health and mental health domains, respectively. This was slightly better than the standard SF-12, which gave reliability estimates of 0.90 and 0.88 for the same two factors.
| Discussion |
|---|
The SF-12 is an increasingly popular quality-of-life questionnaire. However, we found that its layout caused confusion among the older population in our studies. This appears to be due to the stem and leaf format of some of the SF-12 items. Converting these into individual items led to a statistically significant decrease in the number of missing items. The new layout of the York SF-12 appears to have an unambiguous factor structure with good reliability. Importantly, it improves item response by clarifying the format, resulting in fewer missing values, while showing a similar reliability.
We accept that our overall response rates for both studies were relatively low. This is probably explained by the fact that the main aim of our survey was to recruit high-risk women to a randomized trial of hip protectors. However, there was only a small difference in overall response rates between the two questionnaires, which was not statistically significant. Further, if our study group was relatively highly motivated to complete the questionnaire, this would suggest that the York SF-12 could perform even better, compared with the standard SF-12, in a less-motivated population. It is interesting to note that in a trial comparing the standard version of the SF-36 with the EuroQol, 28% of the returned SF-36 responses had missing data, which is similar to the 27% we found using the standard SF-12 [2]. It may be helpful to modify the SF-36 in a similar manner.
In conclusion, we have modified the SF-12 and undertaken a trial to test its item response rate and its reliability. We are currently exploring the equivalence of the Physical Component Score and Mental Component Score between the two versions of the SF-12 [7]. However, on the basis of the present analysis, we would recommend using the York modified SF-12, particularly among older people.
| Acknowledgments |
|---|
This research was undertaken as part of a project funded by Procter & Gamble Pharmaceuticals and Aventis Pharma.
| Notes |
|---|
Address correspondence to Ms C.P. Iglesias, Department of Health Studies, University of York, York YO10 5DQ. e-mail: cpiu1{at}york.ac.uk
| References |
|---|
1. Iglesias C, Torgerson D. Does length of questionnaire matter? A randomized trial of response rates to a mailed questionnaire. J Hlth Serv Res Policy 2000; 5:219-21.
2. Dorman PJ, Slattery J, Farrell B, Dennis MS, Sandercock PAG. A randomised comparison of the EuroQol and Short Form-36 after stroke. Br Med J 1997; 315:461-63.
3. Jenkinson C, Layte R, Jenkinson D, Lawrence K, Petersen S, Paice C, Stradling J. A shorter form health survey: can the SF-12 replicate results from the SF-36 in longitudinal studies? J Pub Hlth Med 1997; 19:179-86.
4. Chalmers I. Assembling comparison groups to assess the effects of health care. J Roy Soc Med 1997; 90(7):379-86.
5. Norusis MJ. SPSS-X Advanced Statistics Guide. New York, McGraw Hill, 1985.
6. Bland JM, Altman DG. Statistics notes: Cronbach's alpha. Br Med J 1997; 314:572.
7. Ware JE, Kosinski M, Keller SD. How to score the SF-12 Physical & Mental Health Summary Scales, 3rd edn. Lincoln RI, QualityMetric Inc, 1998.