Detectable Changes in Physical Performance Measures in Elderly African Americans

Kathleen Kline Mangione, Rebecca L. Craik, Alyson A. McCormick, Heather L. Blevins, Meaghan B. White, Eileen M. Sullivan-Marx, James D. Tomlinson


Background African American older adults have higher rates of self-reported disability and lower physical performance scores compared with white older adults. Measures of physical performance are used to predict future morbidity and to determine the effect of exercise. Characteristics of performance measures are not known for African American older adults.

Objective The purpose of this study was to estimate the standard error of measurement (SEM) and minimal detectable change (MDC) for the Short Physical Performance Battery (SPPB), Timed “Up & Go” Test (TUG) time, free gait speed, fast gait speed, and Six-Minute Walk Test (6MWT) distance in frail African American adults.

Design This observational measurement study used a test-retest design.

Methods Individuals were tested 2 times over a 1-week period. Demographic data collected included height, weight, number of medications, assistive device use, and Mini-Mental Status Examination (MMSE) scores. Participants then completed the 5 physical performance tests.

Results Fifty-two participants (mean age=78 years) completed the study. The average MMSE score was 25 points, and the average body mass index was 29.4 kg/m². On average, participants took 7 medications, and the majority used assistive devices. Intraclass correlation coefficients (ICC [2,1]) were greater than .90, except for the SPPB score (ICC=.81). The SEMs were 1.2 points for the SPPB, 1.7 seconds for the TUG, 0.08 m/s for free gait speed, 0.09 m/s for fast gait speed, and 28 m for 6MWT distance. The MDC values were 2.9 points for the SPPB, 4 seconds for the TUG, 0.19 m/s for free gait speed, 0.21 m/s for fast gait speed, and 65 m for 6MWT distance.

Limitations The entire sample was from an urban area.

Conclusions The SEMs were similar to previously reported values and can be used when working with African American and white older adults. Estimates of MDC were calculated to assist in clinical interpretation.

African American older adults have higher rates of self-reported disability and lower physical performance scores compared with white older adults. Racial differences in physical performance increase with age and are greater in women than in men.1 The prevalence of heart disease, diabetes, obesity, and arthritis also is greater in African Americans than in the white population.2,3 Although the management of each of these diseases includes recommendations for exercise,4–6 exercise studies targeting older African Americans are few in number and tend to focus on weight management7,8 and increasing physical activity.9,10 Elderly African Americans have not been studied in exercise trials.

Measures of physical function are used, for example, to determine the effects of an intervention or to assign individuals to groups to predict future events such as hospitalization or mortality. Physical performance tests such as the Short Physical Performance Battery (SPPB) and the Six-Minute Walk Test (6MWT) have been shown to be sensitive to change in older adult populations11–13 and to be predictors of morbidity.14–17

Perera et al13 have reported the psychometric properties, including the standard error of measurement (SEM) and meaningful change, for the SPPB, gait speed, and 6MWT for several samples of predominantly white older adults. It is not known, however, whether psychometric properties are similar for older African Americans. Wolinsky and colleagues18 assessed reliability of the SPPB in a subsample of the African American Health Project, but the participants were all younger than 65 years of age. Older African Americans were included in the Women's Health and Aging Study, but SPPB results were not differentiated by race.11,19 Therefore, the purpose of this study was to estimate the SEM (absolute reliability) and minimal detectable change (MDC) for the SPPB, free gait speed, fast gait speed, 6MWT distance, and Timed “Up & Go” Test (TUG) time in an understudied sample of elderly African Americans.


Study Design

This was an observational measurement study that used a test-retest design to estimate the SEM for the physical performance tests. Written consent was obtained from all participants.

Study Setting

Individuals were tested 2 times over a 1-week period from January 2009 to May 2009. Participants were recruited from 2 sites: the West Oak Lane Senior Center and the University of Pennsylvania School of Nursing Living Independently for Elders Program. The sites are located in urban Philadelphia and provide social and recreational activities, health screenings and information, and hot meals. The University of Pennsylvania site is a Program of All-Inclusive Care for the Elderly (PACE) participant and additionally offers primary care services, as well as rehabilitative and social services if needed.


This was a sample of convenience. Participants were recruited at the sites by word of mouth, by invitation of the activity director, and by an advanced practice nurse. Sixty-two individuals volunteered. To be included, participants had to be 65 years of age or older, be ambulatory with or without an assistive device but not requiring human assistance, and identify themselves as African American. The exclusion criterion was a Mini-Mental Status Examination (MMSE) score less than 15.

Demographic Measures

At the first session, each participant reported his or her medical history, current medications, and assistive device use, which were verified with medication and medical records when available (73%). Study investigators (A.A.M., H.L.B., and M.B.W.) administered the MMSE and measured height and weight to calculate body mass index (BMI).

Performance Measures

Participants completed 5 performance measures in the following order during each testing session: SPPB, TUG, free gait speed, fast gait speed, and 6MWT. The SPPB is a composite of 3 timed tests: chair rise for 5 repetitions without the use of arms; standing balance in positions of side-by-side stance, semi-tandem stance, and full tandem stance; and walking speed over a 2.44-m (8-ft) course. Each test is scored on a scale of 0 to 4, for a maximum total score of 12; higher scores indicate better function. The SPPB has been found to be predictive of disability and mortality.14,15
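The composite scoring described above can be sketched in Python as follows. This is only an illustration: the function name `sppb_total` and its argument names are hypothetical, and the timing cut points that convert each timed test into a 0 to 4 component score are not given in this report, so the sketch starts from component scores that have already been assigned.

```python
def sppb_total(chair_rise, balance, gait):
    """Sum three SPPB component scores (each 0 to 4) into a 0 to 12 total.

    The component scores are assumed to have been assigned already; the
    cut points used to derive them are not part of this sketch.
    """
    components = {"chair_rise": chair_rise, "balance": balance, "gait": gait}
    for name, score in components.items():
        # Each component must be a whole number between 0 and 4.
        if not isinstance(score, int) or not 0 <= score <= 4:
            raise ValueError(f"{name} score must be an integer from 0 to 4")
    return sum(components.values())


# Hypothetical participant with component scores of 2, 3, and 2.
example_total = sppb_total(chair_rise=2, balance=3, gait=2)
print(example_total)
```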

The TUG was performed by having the participants stand from a chair with armrests, walk 3 m (10 ft) away from the chair, then return and sit back down while being timed. Participants completed this measure twice, and the average of the 2 trials was used for data analysis. The TUG is predictive of falls and decline in activities of daily living in older people and has been found to have adequate reliability for clinical use.20,21

Free gait speed and fast gait speed were measured by the Gait Mat II,* which measures spatial and temporal components of gait. Each participant was asked to walk at his or her “normal” speed across the mat for 2 trials and then as “fast as possible” for 2 trials. The 2 trials of free speed were averaged each session, as were the 2 trials of fast speed. The average speeds were used in data analysis. Gait speed can be used to predict subsequent disability and has adequate reliability for clinical use.22,23

The last performance measure was the 6MWT. Each participant was instructed to cover as much distance as possible in 6 minutes. The paths for the walk were each greater than 30 m (100 ft) long. Participants were given standardized encouragement after each minute and told how much time they had left to complete the test. The 6MWT has been found to be a measure of mobility in older adults, as well as a measure that helps to describe the impact of comorbidities on exercise capacity and endurance.24,25

Individually determined rest periods were provided for all participants. Testing time varied between 15 and 30 minutes. Blood pressure and heart rate were assessed for safety prior to each data collection session.


The examiners were not blinded to the purpose of the study or to the data collected. However, the examiners were trained on how to properly collect the measurements and were instructed not to familiarize themselves with the previous session data.

Statistical Methods

We estimated sample size based on a desired reliability coefficient of .90, as demonstrated by the Women's Health and Aging Study sample,11 and an acceptable coefficient of .80. With a one-sided 95% confidence interval and 2 testing sessions and with α=.05, a minimum sample size of 21 was required.26,27 We oversampled to account for potential loss of participants during follow-up testing and to decrease the width of the resulting confidence intervals. Data were analyzed with Excel and SPSS version 15.

Descriptive statistics were used to characterize the sample. We compared demographic and performance characteristics using unpaired t tests for participants who did and did not complete the testing. Means and standard deviations of the performance measures from time 1 and time 2 for all the tests were calculated. Relative reliability was determined using intraclass correlation coefficient (ICC [2,1]) for the SPPB and 6MWT distance and ICC (2,k) for averaged TUG time, averaged free gait speed, and averaged fast gait speed.28 The SEM was used to determine absolute reliability and was calculated with the following formula: $\mathrm{SEM} = sd \times \sqrt{1 - \mathrm{ICC}}$, where sd is the pooled standard deviation of the 2 testing trials. The 95% confidence intervals for SEM were estimated using the method reported by Stratford and Goldsmith.29 The MDC was computed from the SEM to indicate the amount of change needed to be confident that a true change occurred: $\mathrm{MDC}_{90} = \mathrm{SEM} \times 1.65 \times \sqrt{2}$. The MDC90 is associated with a 90% confidence level, meaning that typical variability in patients with stable performance could result in changes up to the MDC90. Any changes greater than the MDC90 value would be considered “real” change.
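The reliability statistics described above can be illustrated with a short Python sketch. It is not the authors' analysis, which was run in Excel and SPSS. The function names are hypothetical, the ICC (2,1) follows the standard two-way random-effects, absolute-agreement, single-measurement form, the SEM is assumed to take the common form sd × √(1 − ICC), the MDC90 multiplier is assumed to be √2, and the pooled standard deviation assumes the same number of observations at both sessions.

```python
import math
from statistics import mean


def icc_2_1(scores):
    """ICC (2,1) from a table with one row per participant and one
    column per testing session (2 sessions in a test-retest design)."""
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of sessions
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def pooled_sd(sd_1, sd_2):
    """Pooled SD of two sessions, assuming equal numbers of observations."""
    return math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)


def sem(sd, icc):
    """Standard error of measurement, assumed form: sd * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def mdc90(sem_value):
    """Minimal detectable change at 90% confidence: SEM * 1.65 * sqrt(2)."""
    return sem_value * 1.65 * math.sqrt(2)
```

In this sketch, the factor of 1.65 is the z value for a 90% confidence level, and the assumed √2 reflects that a change score combines measurement error from 2 testing sessions.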

Role of the Funding Source

This study was funded by the Hartford Center of Geriatric Nursing Excellence–Jones Fund and by the Ellington Beavers Fund for Intellectual Inquiry. The funding sources were not involved in recruitment, data collection or analysis, or manuscript writing.


Sixty-two individuals met the inclusion criteria and gave their consent to participate in the study. Ten individuals (16%) failed to return for a second trial due to not being present at the center (n=6) or refusal to be retested (n=4). The 10 participants who did not complete the study were not statistically different (P>.05) from the other 52 participants in age, BMI, MMSE score, number of medications, or any of the performance measures.

There was an average (SD) of 2.3 (0.5) days between testing sessions. Descriptive statistics for the sample are shown in Table 1. The total sample consisted of 52 participants (45 female, 7 male). The average age of the participants was 78 years. They had an average score of 25 points on the MMSE and an average BMI of 29.4 kg/m², and they took an average of 7 medications. We recorded their medical conditions and reported the conditions that were present in at least 10 of the 52 participants. These conditions were hypertension, osteoarthritis, diabetes, renal disease, gastroesophageal reflux disease, visual impairment or blindness, peripheral vascular disease, stroke, dementia, anemia, congestive heart failure, and chronic obstructive pulmonary disease. From this list, the number of comorbid conditions an individual had ranged from 0 to 9, with the average number of conditions being 4.5. Sixty percent of the participants had an additional 1 to 2 medical conditions not reported here. Fifty-six percent of the participants used either a single-point cane or a rolling walker for assistance with ambulation.

Table 1.

Description of Sample of African American Older Adults (n=52)

Mean values of the whole group for day 1 and day 2 of testing are presented in Table 2, along with the ICC, SEM, and MDC90 value for each test. Data from 48 participants were used for the analysis of free gait speed and fast gait speed due to technical malfunction of the Gait Mat II. Intraclass correlation coefficients were .90 or higher for all tests except for the SPPB score (ICC=.81). The SEM was 1.2 points for the SPPB; 1.7 seconds for the TUG; 0.08 m/s and 0.09 m/s for free and fast gait speed, respectively; and 28 m for 6MWT distance. The MDC90 values were 2.9 points for the SPPB, 4.0 seconds for the TUG, 0.19 m/s for free gait speed, 0.21 m/s for fast gait speed, and 65 m for 6MWT distance.

Table 2.

Physical Performance Measure Results
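As a quick consistency check on the values reported in the text, the Python sketch below recomputes each MDC90 from the rounded SEM. It assumes the multiplier in the MDC90 formula is 1.65 × √2; the function name `mdc90_from_sem` is hypothetical. Because the published SEMs are rounded, small differences from the reported MDC90 values are to be expected.

```python
import math

# (SEM, MDC90) pairs exactly as reported in the text, after rounding.
reported = {
    "SPPB (points)": (1.2, 2.9),
    "TUG (s)": (1.7, 4.0),
    "Free gait speed (m/s)": (0.08, 0.19),
    "Fast gait speed (m/s)": (0.09, 0.21),
    "6MWT distance (m)": (28, 65),
}


def mdc90_from_sem(sem):
    """MDC90 assumed to equal SEM * 1.65 * sqrt(2)."""
    return sem * 1.65 * math.sqrt(2)


recomputed = {name: mdc90_from_sem(sem) for name, (sem, _) in reported.items()}

for name, (sem, mdc) in reported.items():
    print(f"{name}: SEM={sem}, reported MDC90={mdc}, "
          f"recomputed MDC90={recomputed[name]:.2f}")
```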


We have provided estimates of variability for commonly used physical performance measures for a group of older African Americans. Physical performance measures have not been reported on a sample similar to ours. Taaffe and colleagues30 reported gait speed on a subset of individuals who were well-functioning from the Health, Aging, and Body Composition Study. The sample had a mean age of 73 years, and mean gait speed values for the African Americans were 1.0 m/s for the women and 1.1 m/s for the men.30 Everson-Rose and colleagues31 reported SPPB scores from participants in the Chicago Health and Aging Project. On average, the sample was 74 years of age and scored 9.86 on the SPPB.31 Like the individuals reported in these studies, many of our participants were obese, based on an average BMI of 29.4 kg/m². Our group also showed mild cognitive impairment, with the average MMSE score being 25 points. More than half of the sample used assistive devices, had an average free gait speed of 0.7 m/s, had multiple chronic conditions, and had an average SPPB score of 7. These values are associated with physical disability in walking 0.8 km (0.5 mile) and climbing stairs and needing help in activities of daily living.14,15

The results of this study are consistent with the values reported by other authors. Perera and colleagues13 reported SEMs of 1.42 points for the SPPB, 0.10 m/s for gait speed, and 22 m for 6MWT distance. Ries and colleagues32 studied a sample of institutionalized older adults with dementia and reported SEMs of 2.48 seconds for the TUG, 20 m for 6MWT distance, and 0.06 m/s for gait speed. In people with advanced hip or knee osteoarthritis, the SEM for 6MWT distance was 26 m.33 Despite the different samples of older adults, the SEMs were very similar and suggest that African American and white older adults show similar variability in physical performance.

Clinicians should be able to use the reported SEM values in clinical practice because the SEM represents how much a patient's performance measurements vary if the test is repeated without any underlying change in the patient; that is, it represents measurement error. For example, a clinician administers the SPPB during a health screening event at a senior center, and an older adult scores an 8 on the test. If the individual is retested later that day, the SPPB score may be a 7, 8, or 9 (typical variability in performance, based on the rounded-off estimate of SEM=1). The score of 8 has been associated with an odds ratio of 7.6 for developing mobility disability within 3 years.34 This score would lead the clinician to advise the older adult to alter or begin an exercise program.
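The retest example above can be written as a small Python sketch. The function name `retest_band` is hypothetical, and the band of one SEM on either side of the observed score, clipped to the scale limits, is simply one way to express the "7, 8, or 9" range described in the text.

```python
def retest_band(observed, sem, minimum, maximum):
    """Range of scores expected on retest from measurement error alone,
    taken here as observed score +/- 1 SEM and clipped to the scale."""
    low = max(minimum, observed - sem)
    high = min(maximum, observed + sem)
    return low, high


# SPPB score of 8 with the rounded-off SEM of 1 point on the 0 to 12 scale.
band = retest_band(observed=8, sem=1, minimum=0, maximum=12)
print(band)
```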

The translation of measurement error (SEM) into numbers that can be used to evaluate the effectiveness of an exercise program, for example, is an area with little consensus. Some authors have advocated that the SEM be considered as the MDC or minimally important change after an exercise intervention.35,36 However, additional calculations and patient input have been used to define terms such as “minimal detectable change,”37 “minimally important change,”38 “small meaningful change,”13 “smallest detectable difference,”39 “minimal clinically important difference,”40 and “substantial meaningful change.”13 These terms connote that the change is real change and, in some cases, suggest that the change is meaningful for the person.

Applying the MDC data to a patient example can highlight the complexity of interpreting clinical change. For example, did an older adult patient whose 6MWT distance increased by 42 m after completing a 1-month outpatient exercise program make a real change in endurance? According to the MDC90 of 65 m that we reported, the clinician could not be 90% confident that a real change occurred in 6MWT distance. Kennedy et al33 reported a similar value of 61 m as the MDC90 for 6MWT distance. Other clinical findings will likely support the impression that endurance has or has not improved. Researchers, however, more than clinicians, rely on a single primary outcome measure to determine the effectiveness of a program. For researchers, between-group statistical significance may be found even when within-group intervention changes are smaller than detectable change (65 m). Perera et al13 reported that distances of 50 m or more were “substantial meaningful change” and were calculated to help predict sample size for future exercise trials. The many ways to interpret clinical change warrant further study and discussion.
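The patient example above reduces to a simple comparison, sketched here in Python. The function name `exceeds_mdc` and the constant name are hypothetical; the 65-m threshold and the 42-m change are the values given in the text. The sketch addresses only whether a change is detectable, not whether it is meaningful to the patient.

```python
MDC90_6MWT_M = 65  # MDC90 for 6MWT distance reported in this study, in meters


def exceeds_mdc(change, mdc):
    """Return True only when the absolute change is greater than the MDC."""
    return abs(change) > mdc


# Patient whose 6MWT distance increased by 42 m after a 1-month program.
is_real_change = exceeds_mdc(change=42, mdc=MDC90_6MWT_M)
print(is_real_change)
```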

Of the tests that we have studied, we are least confident that the SPPB scores are suitable for detecting individual-level change. The reliability coefficient was .81, which was the lowest value for the physical performance measures we tested. Additionally, a change of 3 points on the SPPB would entail an enormous increase in one area (gait speed, chair rise ability, or balance ability) or large increases in several areas. This magnitude of change might be evident in older patients acutely recovering from orthopedic surgery, but less likely in older adults who begin exercise or rehabilitation programs because of gradual declines in function. The SPPB was devised as a performance measure to be used in epidemiologic studies and appears better suited for group data.

Gait speed, the TUG, and the 6MWT can be used for individual-level decisions in a wide sample of older adults. The slight variability in interpretations of detectable change in the literature may have to do with sampling and methods used. In the study by Ries and colleagues,32 consistent and frequent verbal cues were provided throughout the testing procedures to ensure that the patients with dementia were following the task. These commands may have decreased some of the test-retest variability and thus explain their smaller MDC90 values.

There are several limitations to this study. The sample was one of convenience, and although it represented a wide range in function and cognition, all of the participants were from an urban area and may not be representative of all older African Americans. We did not have information on social history or on educational level, which may have affected performance. The walking portion of the SPPB was conducted only once as was reported in the original article,14 but additional trials and longer lengths were recommended later. It is possible that repeated walking trials may have increased the stability of the SPPB scores; however, we used the same technique in each session. We also had an additional measure of gait speed and believe that our gait speed values accurately reflect performance. The testing sessions were not conducted in private, so the influence of other older adults or staff in the room may have affected performance on any particular day.


The SEMs for the SPPB, TUG, free and fast gait speed, and 6MWT distance in elderly African Americans were similar to reported values in samples of white older adults and can be used to estimate typical variability in performance. Estimates of MDC at the 90% confidence level were calculated to assist in clinical interpretation of change after exercise interventions.


  • Dr Mangione, Dr Craik, Dr McCormick, Dr Blevins, and Dr Sullivan-Marx provided concept/idea/research report and writing. Dr Mangione, Dr Craik, Dr McCormick, Dr Blevins, and Ms White provided data collection. Dr Mangione, Dr Craik, Dr McCormick, Dr Blevins, Ms White, and Mr Tomlinson provided analysis. Dr Mangione and Dr Sullivan-Marx provided project management and fund procurement. Dr Mangione, Ms White, and Dr Sullivan-Marx provided participants. Dr Sullivan-Marx provided facilities/equipment and institutional liaisons. Mr Tomlinson provided consultation (including review of manuscript before submission).

  • The study was approved by the institutional review boards of Arcadia University and the University of Pennsylvania.

  • A subset of the sample data was presented orally at a local geriatrics meeting in Philadelphia. The data were presented orally at the Annual Meeting of the Gerontological Society of America; November 18–22, 2009; Atlanta, Georgia, and as a poster at the Combined Sections Meeting of the American Physical Therapy Association; February 17–20, 2010; San Diego, California.

  • * EQ Inc, PO Box 16, Chalfont, PA 18914-0016.

  • Microsoft Corp, One Microsoft Way, Redmond, WA 98052-6399.

  • SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.

  • Received November 5, 2009.
  • Accepted February 25, 2010.

