
Validation and comparative evaluation of the osteoporosis self-assessment tool (OST) in a Caucasian population from Belgium

F. Richy, M. Gourlay, P.D. Ross, S.S. Sen, L. Radican, F. De Ceulaer, W. Ben Sedrine, O. Ethgen, O. Bruyere, J.-Y. Reginster
DOI: http://dx.doi.org/10.1093/qjmed/hch002. Pages 39–46. First published online: 31 December 2003


Background: Risk indices have been developed to identify women at risk of low bone mineral density (BMD) who should undergo BMD testing.

Aim: To compare the performance of four risk indices in White ambulatory women in Belgium.

Design: Epidemiological cross-sectional study.

Methods: Records were analysed for 4035 postmenopausal White women without Paget's disease or advanced osteoarthritis, seen at an out-patient osteoporosis centre between January 1996 and September 1999. Osteoporosis risk index scores were compared to bone density T-scores. The ability of each risk index to identify women with low BMD (T-score < −2.0) or osteoporosis (T < −2.5) was evaluated.

Results: Using an Osteoporosis Self-Assessment Tool (OST) score <2 to recommend DXA referral, sensitivity ranged from 85% at the lumbar spine to 97% at the total hip to detect BMD T-scores of ≤ −2.5, and specificity ranged from 34% at the total hip to 37% at the femoral neck and lumbar spine. The negative predictive value was high at all skeletal sites (89–99%), demonstrating the usefulness of the OST to identify patients who have normal BMD and should not receive DXA testing. All risk indices performed similarly, although the OST had somewhat better sensitivity and somewhat lower specificity than the other indices at the cut-offs evaluated. Among the 11–12% of women who were classified as highest risk using OST or the Osteoporosis Index of Risk (OSIRIS), 81–85% had low bone mass and 68–74% had osteoporosis.

Discussion: The performance of these risk indices among women in Belgium was similar to that reported earlier for other samples in Asian countries, the US, and the Netherlands. The OST and other risk indices are effective and efficient tools to help target high-risk women for DXA testing.


Osteoporosis is a major public health concern worldwide. The medical, social and psychological consequences can severely impact the health-related quality of life of patients with fractures, and are life-threatening for some patients.1–3 Osteoporosis is ‘a systemic skeletal disorder characterized by low bone mass and micro-architectural deterioration of bone tissue, with a consequent increase in bone fragility and susceptibility to fracture.’4 This definition clearly introduces the association between low bone mass and increased fracture risk, allowing for an operational definition of osteoporosis based on bone mineral density (BMD).5

Osteoporosis is common among postmenopausal women, but is often asymptomatic. It is widely accepted that osteoporosis can be diagnosed using BMD measurements, generally made at the hip or spine using dual X-ray absorptiometry (DXA). Some researchers have recommended that BMD measurements be targeted to subjects with risk factors for osteoporosis, because of limited availability of BMD technology in some communities, and cost considerations.5,6

Comprehensive epidemiological studies have identified clinical risk factors for osteoporosis, and these risk factors have been used to develop risk assessment indices.7–15 The purpose of the risk assessment indices is not to diagnose osteoporosis or low BMD, but to identify women who are more likely to have low BMD. These patients can then be referred for BMD measurements. Such indices, while not identifying all cases of osteoporosis, increase the efficiency of BMD measurement by focusing on subjects who are at increased risk.

The Osteoporosis Self-assessment Tool (OST)14 is based simply on age and weight. It was developed and validated in several studies in Asian and White populations,14 and was compared to other risk indices in large samples of postmenopausal women.16 The authors reported that OST predicted low BMD as well as other indices did, and they considered OST to be the easiest to use in clinical practice. Other risk tools are also based on age and weight, in combination with up to four additional risk factors; these include the Osteoporosis Risk Assessment Instrument (ORAI),15 the Simple Calculated Osteoporosis Risk Estimation (SCORE),10 and the Osteoporosis Index of Risk (OSIRIS).17 These four risk assessment tools have been proposed for increasing awareness of osteoporosis and for encouraging more efficient use of BMD measurements in patients who are likely to have low bone mass, especially in asymptomatic postmenopausal women.

Our goals were to assess the validity of OST in a population of 4035 White women from Belgium, and to compare the performance of the four tools—OST, ORAI, SCORE and OSIRIS—in identifying women at risk of low BMD and who could benefit from definitive osteoporosis evaluation using DXA.


We analysed a database that was previously used to assess the discriminatory performance of SCORE.18 It included medical data on patients either consulting spontaneously or referred for a BMD measurement between January 1996 and September 1999 to an out-patient osteoporosis centre located at the University of Liège in Belgium. Referral was based on the diagnostic judgment of the referring physician. Informed consent was obtained from all eligible study participants. Patients with Paget's disease or advanced osteoarthritis were excluded. The research protocol was reviewed and approved by the institutional review board of the University of Liège.

The following outcomes were recorded for 4035 postmenopausal women: BMD, age, weight, history of rheumatoid arthritis, non-traumatic fracture history after age 45 years, and history of oestrogen use. BMD measurements, using DXA technology (Hologic QDR2000), were obtained from the hip (both total and femoral neck) and lumbar spine (L2–L4). BMD values, expressed in g/cm², were converted into T scores, expressed in standard deviations (SDs), using QDR reference values specifically established for the population of Liège.19,20 We used the WHO classification range to categorize subjects as normal (T > −1), osteopenic (−2.5 < T ≤ −1), or osteoporotic (T ≤ −2.5). A subcategory was defined as ‘low BMD’, for all subjects with T < −2.0, to allow comparison to published results for some risk indices that were based on this cutoff.10,11 Further, this cutoff is widely used in many communities to detect pre-osteoporotic patients.24
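As an illustration of the classification rules just described, a minimal sketch is given below. The T-score formula is the standard definition (difference from a young-adult reference mean, in reference SDs); the reference mean and SD are passed in as arguments because the Liège reference values themselves are not reproduced here. The function names are illustrative only.

```python
def t_score(bmd: float, reference_mean: float, reference_sd: float) -> float:
    """Convert a BMD value (g/cm²) into a T-score, in reference SD units.

    reference_mean and reference_sd are assumptions supplied by the caller;
    the study used local reference values that are not listed in the text.
    """
    return (bmd - reference_mean) / reference_sd


def classify_t_score(t: float) -> str:
    """WHO category for a T-score, using the thresholds quoted in the text."""
    if t > -1.0:
        return "normal"
    if t > -2.5:
        return "osteopenic"      # -2.5 < T <= -1
    return "osteoporotic"        # T <= -2.5


def is_low_bmd(t: float) -> bool:
    """Study-specific 'low BMD' subcategory: T-score below -2.0."""
    return t < -2.0
```

Under these rules a T-score of exactly −2.5 is classed as osteoporotic and a T-score of exactly −1 as osteopenic.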

The BMD cut-off values for osteopenia and osteoporosis at the lumbar spine, total hip, and femoral neck sites in our sample were 1.065 and 0.840; 0.790 and 0.640; and 0.750 and 0.600 g/cm², respectively. Weight was recorded in kg in our medical records, so this factor was converted into pounds by using a multiplier of 2.205 for use in calculating risk indices such as SCORE.

The OST, ORAI, SCORE, and OSIRIS indices were then derived according to the algorithms suggested by their developers (Table 1). The following dichotomous cutoffs for DXA referral were used: < 2 for OST, > 7 for SCORE, > 8 for ORAI and < 1 for OSIRIS. Also, three risk categories were used for each index, according to their developers' recommendations and the validation of some indices in American and European populations.17 The prevalence of osteoporosis in each of these three categories was determined using the WHO criteria. Prevalence is easier for many clinicians to understand and use than statistical measures of performance such as sensitivity and specificity.
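The dichotomous referral rule can be written down compactly. The sketch below encodes only the four cut-offs listed in this paragraph; the dictionary layout, the "lt"/"gt" labels and the function name are illustrative choices, and boundary handling (strict inequalities) follows the wording of the paragraph.

```python
# DXA-referral cut-offs as listed in the text.
# "lt": refer when score < threshold; "gt": refer when score > threshold.
REFERRAL_CUTOFFS = {
    "OST": ("lt", 2),
    "SCORE": ("gt", 7),
    "ORAI": ("gt", 8),
    "OSIRIS": ("lt", 1),
}


def refer_for_dxa(index: str, score: float) -> bool:
    """Return True when the index score falls on the 'refer' side of its cut-off."""
    direction, threshold = REFERRAL_CUTOFFS[index]
    if direction == "lt":
        return score < threshold
    return score > threshold
```

For OST and OSIRIS a lower score indicates higher risk, whereas for SCORE and ORAI a higher score does, which is why the comparison direction differs between indices.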

Table 1

Calculation of the evaluated indices

Index | Risk factor | Score
------ | ----------- | -----
SCORE | Race other than Black | + 5
SCORE | Rheumatoid arthritis | + 4
SCORE | Non-traumatic fracture after age 45 years | + 4 per fracture, up to a maximum of 12
SCORE | Age | + 3 for each decade
SCORE | Oestrogen therapy | + 1 if never
SCORE | Weight | − 1 for each 10 lb (4.5 kg)
ORAI | Age > 75 years | + 15
ORAI | Age 65–74 years | + 9
ORAI | Age 55–64 years | + 5
ORAI | Body weight < 60 kg | + 9
ORAI | Body weight 60–70 kg | + 3
ORAI | Oestrogen therapy | + 2 if not currently using oestrogen
OSIRIS | Body weight (kg) | + 0.2 × body weight
OSIRIS | Age (years) | − 0.2 × age
OSIRIS | History of low impact fracture(s) | − 2
OSIRIS | Oestrogen therapy | + 2
OST | Body weight (kg) and age (years) | 0.2 × (body weight − age)
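A minimal sketch of how the four indices in Table 1 might be computed is given below. It follows the table as printed, with several assumptions that are not stated in the table: "each decade" is taken as the number of completed decades of age, "each 10 lb" as the number of completed 10 lb units, age 75 is placed in the highest ORAI age band, and no rounding is applied to OST or OSIRIS. The function and parameter names are illustrative, not the developers' own.

```python
import math

KG_TO_LB = 2.205  # conversion multiplier quoted in the Methods


def ost(weight_kg: float, age: float) -> float:
    """OST: 0.2 x (body weight - age). No rounding applied (assumption)."""
    return 0.2 * (weight_kg - age)


def osiris(weight_kg: float, age: float,
           low_impact_fracture: bool, oestrogen_therapy: bool) -> float:
    """OSIRIS: weight and age terms, -2 for fracture history, +2 for oestrogen."""
    score = 0.2 * weight_kg - 0.2 * age
    if low_impact_fracture:
        score -= 2
    if oestrogen_therapy:
        score += 2
    return score


def orai(weight_kg: float, age: float, currently_on_oestrogen: bool) -> int:
    """ORAI: points for age band, weight band and current oestrogen use."""
    score = 0
    if age >= 75:        # table prints "> 75"; age 75 assumed to belong here
        score += 15
    elif age >= 65:
        score += 9
    elif age >= 55:
        score += 5
    if weight_kg < 60:
        score += 9
    elif weight_kg <= 70:
        score += 3
    if not currently_on_oestrogen:
        score += 2
    return score


def score_index(weight_kg: float, age: float, non_black: bool,
                rheumatoid_arthritis: bool, n_fractures: int,
                ever_used_oestrogen: bool) -> int:
    """SCORE: weight is converted from kg to lb before the weight term."""
    weight_lb = weight_kg * KG_TO_LB
    total = 0
    if non_black:
        total += 5
    if rheumatoid_arthritis:
        total += 4
    total += min(4 * n_fractures, 12)      # +4 per fracture, capped at 12
    total += 3 * int(age // 10)            # completed decades (assumption)
    if not ever_used_oestrogen:
        total += 1
    total -= math.floor(weight_lb / 10)    # completed 10 lb units (assumption)
    return total
```

For a hypothetical 70-year-old White woman weighing 55 kg, with no rheumatoid arthritis, no fractures and no oestrogen use, these functions give OST = −3, OSIRIS = −3, ORAI = 20 and SCORE = 15.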

Basic demographic data were tabulated to allow comparison of our study population to that from the other recent studies by Koh et al.14 and Geusens et al.16 involving OST. Receiver operating characteristic (ROC) analyses were performed to evaluate the discriminatory performances of OST, ORAI, SCORE, and OSIRIS, and the area under the curve (AUC) was computed for each. To assess the internal validity of the indices, sensitivity was defined as the proportion of the population with low BMD correctly classified by the risk index (true positive fraction) and specificity was defined as the proportion with normal BMD correctly identified by the risk index (true negative fraction). ROC curves provided a graphical representation of the overall accuracy of a test by plotting sensitivity against (1–specificity) for all thresholds, while the AUC quantified the accuracy of the test.
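To make the AUC concrete, the sketch below computes it with the rank-based (Mann-Whitney) formulation, which equals the area under the empirical ROC curve. This is an illustration only and not the Statistica procedure used in the study; the function name and the toy data are hypothetical. Scores must be oriented so that higher values mean higher risk, so OST or OSIRIS scores would be negated first.

```python
def roc_auc(risk_scores, has_condition):
    """Area under the ROC curve via pairwise comparison.

    AUC = P(score of an affected subject > score of an unaffected subject),
    with ties counted as one half. Runs in O(n_pos * n_neg) time.
    """
    positives = [s for s, y in zip(risk_scores, has_condition) if y]
    negatives = [s for s, y in zip(risk_scores, has_condition) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one affected and one unaffected subject")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical data): low OST scores indicate high risk,
# so the sign is flipped before computing the AUC.
ost_scores = [-4, -1, 0, 2, 3, 5]
osteoporotic = [True, True, False, True, False, False]
auc = roc_auc([-s for s in ost_scores], osteoporotic)
```

In the toy example, 8 of the 9 affected/unaffected pairs are ranked correctly, so the AUC is 8/9.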

We also calculated the positive predictive value (PPV) and negative predictive value (NPV) to evaluate the external validity of each tool. The PPV and NPV represent the proportion of women who tested positive or negative (as classified by the four tools) and who truly had, or did not have, BMD below the T-score threshold being tested, respectively.
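The four measures defined above follow directly from a 2 × 2 cross-tabulation of referral decision against DXA result. The sketch below is illustrative; the function names are not from the study and the counts used in the example are hypothetical.

```python
def confusion_counts(referred, has_low_bmd):
    """Tally true/false positives and negatives from paired boolean lists."""
    tp = fp = fn = tn = 0
    for r, d in zip(referred, has_low_bmd):
        if r and d:
            tp += 1          # referred, low BMD confirmed
        elif r and not d:
            fp += 1          # referred, BMD normal
        elif d:
            fn += 1          # not referred, low BMD missed
        else:
            tn += 1          # not referred, BMD normal
    return tp, fp, fn, tn


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts: 100 women with low BMD, 100 without.
example = diagnostic_metrics(tp=90, fp=60, fn=10, tn=40)
```

With these hypothetical counts, sensitivity is 0.90, specificity 0.40, PPV 0.60 and NPV 0.80. Unlike sensitivity and specificity, PPV and NPV change with the prevalence of low BMD in the sample.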

We evaluated OST, ORAI, SCORE, and OSIRIS at the BMD T-score thresholds of −2.5 and −2.0, to assess the performance of those indices in predicting osteoporosis and low bone mass, respectively. The ability of the tools to detect different thresholds of low BMD was also evaluated for various anatomical sites (total hip, femoral neck, L2–L4) of densitometry measurement.

Statistical analysis used Statistica 6.0 software (Statsoft, France).


The mean age of the women in our sample was 61.5 (± 8.8) years, ranging from 45 to 96 years. Table 2 shows their basic demographic data as compared to cohorts studied by Koh et al.14 and Geusens et al.16 The patient population in this study was slightly younger (61.5 vs. 62.3) than in the study by Koh et al., while rheumatoid arthritis was less prevalent in our study sample (< 2%). Only 2.6% reported a non-traumatic fracture after age 45 at the wrist, rib, or hip; this was much lower than in the study of Koh et al.14 The prevalence of osteoporosis at all sites increased progressively with age (Figure 1). Of the women in our study, 32% were osteoporotic (T ≤ −2.5) at one or more skeletal sites, 47% had a low BMD (T ≤ −2.0), and 73% were classified as osteopenic according to the WHO operational definition. In comparison to the other two cohorts, the prevalence of low BMD in our population was higher at the femoral neck.

Figure 1

Prevalence of osteoporosis (T score < −2.5) by age and measurement site.

Table 2

Characteristics of the participants

CharacteristicThis studyDevelopment cohort14Validation cohort16
Ethnicity (%)Black 0Black 0Black 4.9
White 100White 0White 81.8
Asian 0Asian 88Asian/Other 13.3
Other 0Other 12
Mean (SD) age (years)61.5 (8.8)62.3 (6.2)61.3 (9.6)
Mean (SD) weight (kg)65 (11.9)57.1 (8.2)70.5 (15.7)
BMD T Score < −2.5 (%)FN 18.8FN 14FN 13.7
Hip 9.4(Netherlands sample)
L2–L4 24.2
Oestrogen use (%)
Prior fractures (%)
After 45 years2.612
Rheumatoid arthritis (%)1.295
  • FN, femoral neck; L2–L4, lumbar spine L2 to L4.

Table 3 shows the performance of the four risk indices in identifying patients at various BMD measurement sites and thresholds (T score values of −2 and −2.5). Increasing prevalence of osteoporosis (T ≤ −2.5) with ascending risk category (low, medium, high) was apparent for all four risk tools. For example, the prevalence of osteoporosis based on femoral neck BMD was approximately 6%, 22%, and 60% at the low (42% of women), medium (47% of women), and high (11% of women) OST risk levels. Compared to measuring BMD on all women to identify all cases of osteoporosis (or measuring 90% of women at random to identify 90% of osteoporosis cases), by using OST or OSIRIS, BMD testing can be targeted to the 11% of women at highest risk. Of the women in our study classified as high risk, 85% had low bone mass and 74% had osteoporosis.

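Table 3

Prevalence of low BMD and osteoporosis by BMD measurement site and risk category

Index | Risk category | Total | T ≤ −2.5: Hip (9.5%) | T ≤ −2.5: FN (18.8%) | T ≤ −2.5: L2–L4 (24.3%) | T ≤ −2.5: Any (32.6%) | T ≤ −2: Hip (16.8%) | T ≤ −2: FN (29.1%) | T ≤ −2: L2–L4 (37.9%) | T ≤ −2: Any (47.3%)
------ | ------------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | -----
OST | > 1 (low risk) | 42% | 1.4% | 6% | 12.6% | 16.1% | 4.1% | 11.9% | 24.3% | 28.7%
OST | −3 to 1 (moderate risk) | 47% | 10.6% | 22.2% | 28.6% | 31% | 20% | 35.1% | 43.5% | 55.2%
OST | < −3 (high risk) | 11% | 43.7% | 59.9% | 51.6% | 74.2% | 59.5% | 73.1% | 66% | 85%
SCORE | < 7 (low risk) | 25% | 1.6% | 5.1% | 11.2% | 13.9% | 3.6% | 9.4% | 22.2% | 23.6%
SCORE | 7–15 (moderate risk) | 70% | 10.1% | 21.2% | 27.2% | 36.6% | 18.9% | 33.5% | 41.7% | 52.8%
SCORE | > 15 (high risk) | 5% | 42.9% | 57.1% | 51.3% | 73.8% | 57.6% | 70.2% | 66.5% | 83.3%
ORAI | < 9 (low risk) | 40% | 2.4% | 8.3% | 14.5% | 19.1% | 6.1% | 15.3% | 26.5% | 31.7%
ORAI | 9–17 (moderate risk) | 47% | 10.4% | 20.1% | 27.6% | 36.5% | 18.8% | 32.4% | 42.3% | 52.8%
ORAI | > 17 (high risk) | 13% | 27.4% | 46.1% | 42.3% | 60% | 42.3% | 59.2% | 57% | 75.4%
OSIRIS | > 1 (low risk) | 46% | 1.9% | 7% | 14.2% | 17.7% | 5.1% | 13.5% | 25.9% | 30.9%
OSIRIS | −3 to 1 (moderate risk) | 42% | 10.2% | 22.1% | 28.9% | 38.7% | 19.9% | 35.1% | 44.5% | 55.3%
OSIRIS | < −3 (high risk) | 12% | 35.6% | 52.5% | 48.3% | 67.6% | 51.3% | 67.2% | 61.3% | 81.2%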
  • FN, Femoral neck; L2–L4, lumbar spine L2 to L4.

At the thresholds considered, OST, SCORE, ORAI, and OSIRIS identified, respectively, 88%, 86%, 90%, and 80% of the patients with normal BMD who subsequently should not have been recommended for densitometry, since according to their score they were ‘low risk’.

Using the dichotomous cut-off value of < 2, the sensitivity of OST in identifying individuals at increased risk of osteoporosis ranged from 85% for lumbar spine to 97% at the total hip region, and was higher across BMD sites than that of the other indices. The corresponding specificity of OST ranged from 34% at the total hip to 40% at any given site and was lower across sites than the other indices (Table 4). At the OST cut-off point of 2, and using a BMD T-score threshold of −2.5 for any site, 45% of the subjects were misclassified (most of these were false positives); the proportion of misclassified patients for single BMD sites was 60% at the total hip, 52% at the femoral neck, and 51% at the lumbar spine site.
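The misclassification proportions quoted above can be approximately reproduced from sensitivity, specificity and prevalence. The sketch below uses the rounded values reported for OST at the T ≤ −2.5 threshold (sensitivity and specificity from Table 4, site prevalence from Table 3); the function name is illustrative, and small differences from the published figures are expected because the inputs are rounded.

```python
def misclassified_fraction(sensitivity: float, specificity: float,
                           prevalence: float) -> float:
    """Share of all subjects who are false negatives or false positives."""
    false_negatives = (1 - sensitivity) * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return false_negatives + false_positives


# OST < 2, T <= -2.5, rounded inputs as reported in the tables.
total_hip = misclassified_fraction(0.97, 0.34, 0.095)
any_site = misclassified_fraction(0.86, 0.40, 0.326)
```

These give about 60% for the total hip and about 45% for any site. In both cases almost all of the misclassification comes from the false-positive term, consistent with the remark that most misclassified subjects were false positives.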

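Table 4

Performance of the risk indices by BMD measurement site and T-score cut-off (%)

Each cell lists Sens / Spec / PPV / NPV.

T-score cut-off | Index (cut-off) | Total hip | Femoral neck | L2–L4 spine | Any site* | All sites**
--------------- | --------------- | --------- | ------------ | ----------- | --------- | -----------
T ≤ −2.5 | OST (< 2 vs. ≥ 2) | 97 / 34 / 13 / 99 | 92 / 37 / 25 / 95 | 85 / 37 / 31 / 89 | 86 / 40 / 41 / 86 | 97 / 33 / 8 / 99
T ≤ −2.5 | SCORE (≥ 7 vs. < 7) | 94 / 37 / 14 / 98 | 88 / 40 / 25 / 94 | 81 / 39 / 30 / 87 | 86 / 40 / 41 / 86 | 97 / 33 / 8 / 99
T ≤ −2.5 | ORAI (≥ 8 vs. < 8) | 90 / 43 / 14 / 98 | 82 / 45 / 26 / 92 | 76 / 45 / 31 / 85 | 76 / 48 / 41 / 80 | 90 / 42 / 9 / 98
T ≤ −2.5 | OSIRIS (< 1 vs. ≥ 1) | 84 / 63 / 19 / 97 | 75 / 66 / 34 / 92 | 63 / 65 / 37 / 84 | 64 / 69 / 50 / 80 | 85 / 61 / 12 / 98
T ≤ −2 | OST (< 2 vs. ≥ 2) | 95 / 36 / 23 / 97 | 89 / 40 / 38 / 90 | 81 / 39 / 45 / 77 | 82 / 44 / 57 / 73 | 96 / 35 / 17 / 98
T ≤ −2 | SCORE (≥ 7 vs. < 7) | 90 / 39 / 23 / 95 | 86 / 43 / 38 / 88 | 77 / 42 / 45 / 75 | 78 / 46 / 56 / 71 | 91 / 37 / 16 / 97
T ≤ −2 | ORAI (≥ 8 vs. < 8) | 85 / 45 / 24 / 93 | 78 / 47 / 38 / 84 | 72 / 47 / 45 / 73 | 73 / 51 / 57 / 68 | 85 / 43 / 17 / 95
T ≤ −2 | OSIRIS (< 1 vs. ≥ 1) | 78 / 65 / 31 / 93 | 68 / 69 / 47 / 85 | 56 / 67 / 51 / 72 | 58 / 73 / 65 / 66 | 80 / 63 / 23 / 95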
  • Sens, sensitivity; Spec, specificity; PPV, positive predictive value; NPV, negative predictive value. *Any site, osteoporosis at one or more sites (hip or femoral neck or L2–4 spine). **All sites, osteoporosis at all sites (hip and femoral neck and L2–4 spine).

The AUC was consistently high (0.7) for the two hip sites, and somewhat lower for the spine (Table 5), indicating good test performance. For each combination of BMD measurement and T-score cutoff, the AUC results were similar for all four risk tools.

Table 5

Areas under the ROC curves for the four assessment tools by BMD (%) measurement site and T-score cut-off

ToolTotal hipFemoral neckL2–4 spineAny site
T ≤ −2.5
T ≤ −2


Although most physicians and patients are aware of osteoporosis, it is being diagnosed and appropriately treated in only a small proportion of patients; this is true even for patients who have already had fractures. The availability of new pharmacological treatments for osteoporosis has increased pressure on public health policy makers to support BMD measurements for patients considered at high risk.21 Effective treatments reduce fracture risk by 33–50%.22 Several guidelines have been developed based upon expert opinion, cost-effectiveness criteria, systematic reviews, and/or predictive models. The US Preventive Services Task Force,23 the National Osteoporosis Foundation,24 and the North American Menopause Society25 proposed that all women aged 65 and older should have BMD measurements. On the other hand, the National Institutes of Health,26 the WHO Task Force for Osteoporosis,27 the Canadian Multicentre Osteoporosis Study Group,15 and the International Osteoporosis Foundation10 recommended selecting patients for BMD measurements based on particular risk factors.

In our study involving White women aged 45 years and more, the OST successfully identified most women with osteoporosis and low BMD who should undergo DXA testing. The OST, based only on age and weight, performed as well as the more complex risk assessment indices (SCORE, ORAI, and OSIRIS) in identifying women at low risk of osteoporosis who would not need DXA testing. Avoiding unnecessary testing among low risk patients can substantially reduce cost for the community and the patient (DXA is not reimbursed in some countries). For example, in this sample of Belgian women, 42% of the women were classified as low risk using OST, and thus would not need to be referred to DXA testing. Of these, only 6% actually had osteoporosis based upon femoral neck BMD. At the same time, using OST to select the 58% of women at risk for BMD measurements, 90.4% (17/18.8) of all women with osteoporosis at the femoral neck level would be identified, and only 9.6% would be missed. In contrast, if a random 58% sample of women had BMD measurements without using OST, 58% of all low BMD and OP cases would be identified and 42% would be missed. On the basis of DXA measurements, of the 4035 subjects who were scanned, 758 (18.8%) had low BMD at the femoral neck level. Using OST, only 2340 scans (58% of 4035) would have been performed to detect 685 subjects (30%) with low BMD at that level, or 1426 subjects (61%) with low BMD at any site.
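The comparison between OST-targeted and random testing can be checked with a short calculation. The sketch below uses the rounded OST group shares and femoral-neck osteoporosis prevalences from Table 3 and the overall femoral-neck prevalence of 18.8%; the function and variable names are illustrative, and the result differs slightly from the quoted 90.4% because the inputs are rounded.

```python
def targeted_testing_yield(groups, overall_prevalence):
    """Return (share of women tested, share of all cases found).

    groups: iterable of (share_of_women, prevalence_in_group, referred) tuples.
    """
    tested = sum(share for share, _, referred in groups if referred)
    cases_found = sum(share * prev for share, prev, referred in groups if referred)
    return tested, cases_found / overall_prevalence


# OST risk categories, femoral neck, T <= -2.5 (rounded values from Table 3).
ost_groups = [
    (0.42, 0.060, False),   # low risk: not referred
    (0.47, 0.222, True),    # moderate risk: referred
    (0.11, 0.599, True),    # high risk: referred
]
tested, captured = targeted_testing_yield(ost_groups, overall_prevalence=0.188)
```

Here 58% of women are tested and roughly 90% of femoral-neck osteoporosis cases are found, whereas testing a random 58% of women would be expected to find only 58% of cases.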

Although some women who do not have low BMD were classified as increased risk (false positives) and would be referred for testing, some of these women would have undergone testing anyway if OST were not used. Among these women, treatment for low BMD would only be initiated upon confirmation by DXA—a safe, non-invasive diagnostic procedure. Thus, a risk assessment tool such as OST that is free and has no associated harm does not need to have both high sensitivity and high specificity. There is no risk of harm to the patient from unnecessary treatment or invasive diagnostic testing in case of a false-positive result from OST.

We found the performance of OST in this sample to be similar to that reported among Asian,14 American and European women of various ethnic backgrounds,16 despite differences in the reference databases used for T score calculations.28 We used the same risk tool categories as Geusens et al.16 to compare the performance of the various tools in a different population, and found very similar results for OST at the T-score ≤ −2.0 and ≤ −2.5 cut-offs. We were unable to assess the SOFSURF index, developed by Black et al.,13 based on age, weight, history of fracture, and smoking status, since smoking status was not recorded in our database.

The SCORE, ORAI, OSIRIS, and OST are validated risk indices that can help physicians and public health representatives to focus DXA testing on individuals at increased risk of osteoporosis. All four risk tools performed similarly, and identified a significant proportion of all women at low risk who would not benefit from BMD measurements. These indices have been studied and validated in several different large populations, and different risk index cut-offs have been determined for White versus Asian populations. Each of these four risk tools was developed using comprehensive surveys of numerous potential risk factors together with statistical analyses to identify the most important predictors of low BMD. Good performance was achieved using only a few risk factors, and additional risk factors did not further improve performance. Given the consistency of results across studies, it seems likely that existing risk tools represent the best achievable performance using self-reported risk factors, and efforts should now be directed at using these tools in clinical practice.

As with most studies, our study has limitations. For example, the subjects in our sample were either referred or came in spontaneously for osteoporosis evaluations, and may differ in some ways from the general population, such as in socioeconomic and education levels, or in the prevalence of some conditions associated with osteoporosis (e.g. the prevalence of smoking and alcohol consumption,29 vitamin D or calcium deficiency30,31 or long-term corticosteroid use32). A local Belgian BMD reference range was used for calculating T-scores, because it was felt to be representative of the Belgian population. Compared to the US reference data from NHANES III,34 differences for the BMD cutoffs using the Belgian reference data were about 5% for the T-score = −2.5 cutoff. This may explain the somewhat higher prevalence of osteoporosis in this sample of Belgian women, compared to the report of Geusens et al.16

The identification of low bone mass in postmenopausal women should be of higher priority if we are to curtail the growing tide of fractures that are already a substantial socioeconomic problem in developed countries. Although most physicians and patients are aware of osteoporosis, it is only diagnosed and appropriately treated in a small proportion of patients, even those with prior fractures.34,35 Measuring BMD is the best method of identifying patients with osteoporosis to consider for treatment, but measuring BMD in all postmenopausal women is not feasible in most countries. The OST tool performed as well in this White population as it did in earlier studies to help target BMD measurements to women at risk. Its high negative predictive value allows for the safe exclusion of healthy women, in order to allocate BMD test resources to those most likely to benefit. ORAI, SCORE, and OSIRIS provided similar performance in this setting. The main advantage in using the OST index is that it is the simplest and quickest to calculate, and thus to use as a systematic first-line prescreening tool in post-menopausal women.

