Research Reports
PW Stratford, PT, MSc, is Professor, School of Rehabilitation Science, and Associate Member, Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, and a Scientific Affiliate in the Department of Surgery, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
DM Kennedy, BScPT, MSc, is the Manager of Program Development for Hip and Knee Replacement, Holland Orthopaedic & Arthritic Centre of Sunnybrook Health Sciences Centre, and Part-time Assistant Clinical Professor, School of Rehabilitation Science, McMaster University
LJ Woodhouse, PT, PhD, is Assistant Professor, School of Rehabilitation Science, McMaster University, and a Scientific Affiliate in the Department of Surgery, Sunnybrook Health Sciences Centre
Address all correspondence to Mr Stratford at: stratfor{at}mcmaster.ca
Submitted January 4, 2006;
Accepted July 5, 2006
| Abstract |
|---|
Key Words: Factorial validity, Osteoarthritis, Outcome assessment
| Introduction |
|---|
Although many self-report measures profess to assess physical function, few provide an operational definition of its intended meaning. A noted exception is the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) physical function subscale, which provides the following statement: "By this we mean your ability to move around and to look after yourself."6 We suspect that this statement captures the intended meaning of lower-extremity physical functional status left undeclared by many researchers, and it is representative of our view of lower-extremity functional status.
A body of work refutes the belief that self-report measures and performance measures of physical function provide comparable information.2,3,7 Parent and Moffet,2 in a study of patients after total knee arthroplasty, noted improvement in self-reported physical function as measured by the WOMAC and the Medical Outcomes Study 36-Item Health Survey Questionnaire (SF-36) physical function subscales but a significant reduction in the 6-minute walking distance when assessed at 2 months after arthroplasty. Maly et al,7 in an investigation of patients with OA of the knee, reported higher correlations between pain and WOMAC and SF-36 physical function scores than between pain and 3 performance measures (Six-Minute Walk Test, Timed "Up & Go" Test, and a stair test). Using a stepwise linear regression analysis that included pain and thigh muscle strength as independent variables and WOMAC and SF-36 physical function subscales as dependent variables, these investigators also found that pain was more predictive of self-reported function than muscle strength.7 Stratford and Kennedy,3 in a study of patients after hip or knee arthroplasty, reported higher standardized regression coefficients between pain and WOMAC physical function scores and change scores than between pain and the time or distance associated with several performance tasks. That article also reported that self-reported Lower-Extremity Functional Scale (LEFS) scores were most strongly associated with pain preoperatively, exertion when assessed within 2 weeks of arthroplasty, and the time or distance associated with performance measures when evaluated approximately 2 months after arthroplasty.3
A further insight is provided in a study that examined the relationship between performance-rated components of pain, exertion, and function (time or distance) and LEFS scores.8 Patients with end-stage OA of the hip or knee and awaiting arthroplasty performed 3 performance tasks—40-m self-paced walk, stair test, and Timed "Up & Go" Test—and completed the LEFS. Immediately following each performance task, patients reported the amount of pain and exertion that they experienced.8 An exploratory factor analysis identified 3 factors, with pain responses loading on 1 factor, exertion loading on the second factor, and time loading on the third factor. The LEFS loaded on all 3 factors (pain=.44; exertion=.41; and time=.35).8 Recently, Terwee et al9 examined the relationship between the WOMAC and SF-36 pain and function subscales with the performance-based DynaPort Knee Test* for patients with OA of the knee before and after arthroplasty. Applying an exploratory factor analysis, these investigators found that the self-report measures of pain and function loaded on 1 factor and that the performance measure loaded on a second factor.9 The SF-36 function score loaded on both factors, with the higher loading on the factor composed of the self-report measures (.78 and .69).9 Collectively, these findings support the premise that self-report measures of physical function assess more than a patient's ability to move around.2,3,7–9 It appears that, in addition to providing patients' perceptions of their ability to move around, self-report measures of physical function also are influenced by what patients experience when moving around (eg, pain and exertion).
A further understanding of the relationship between self-report assessments of pain and physical function is offered by a number of studies that examined the factorial validity of the WOMAC. Factorial validity exists to the extent that items cluster in accordance with the specified domains to which they have been assigned by the measure's developer. The WOMAC was conceived to assess 3 domains: pain, stiffness, and physical function.10 Accordingly, factorial validity would exist if the 5 pain items loaded on 1 factor, the 2 stiffness items loaded on a second factor, and the 17 physical function items loaded on a third factor. However, there is consistent evidence demonstrating that the WOMAC pain and physical function items group more by activity than by the hypothesized domains of pain and physical function.11–14
There is no doubt that pain and physical function are related health concepts. Yet clinicians routinely inquire about pain and physical function separately during assessments, outcome measures have separate scales to assess pain and function, and authoritative groups such as OMERACT III have identified pain and physical function as 2 core outcome measures rather than 1. Investigators are therefore challenged to develop assessment methods that maximize valid information concerning the attributes of interest. It was with these challenges in mind that we undertook the present study.
Our intent was to determine whether performance test assessments of pain and physical function provided responses consistent with these 2 domains. Specifically, our goal was to evaluate the factorial validity of performance assessments of pain and physical function. Our specific hypotheses were as follows: (1) responses to the performance assessments could be explained by 2 factors, 1 consisting of pain items and the other consisting of time (distance) items; (2) each pain or performance item would be related only to the health concept that it was perceived to be assessing (each item would have a nonzero loading on the factor that it was conceived to measure and a zero loading on the other factor); (3) the factors pain and physical function would be correlated; and (4) the measurement error terms associated with the items would be uncorrelated.
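The 4 hypotheses can be read against the standard confirmatory factor analysis measurement model. The following is a sketch only, in notation that is not the article's own: x denotes the 8 observed scores (4 pain ratings, then 4 time or distance scores), ξ the 2 factors, and the factor variances are fixed at 1 as an assumed identification choice.

```latex
\begin{aligned}
x &= \Lambda \xi + \delta, \qquad
\Sigma = \Lambda \Phi \Lambda^{\top} + \Theta, \\[4pt]
\Lambda &=
\begin{pmatrix}
\lambda_{1} & 0 \\
\lambda_{2} & 0 \\
\lambda_{3} & 0 \\
\lambda_{4} & 0 \\
0 & \lambda_{5} \\
0 & \lambda_{6} \\
0 & \lambda_{7} \\
0 & \lambda_{8}
\end{pmatrix},
\qquad
\Phi =
\begin{pmatrix}
1 & \phi \\
\phi & 1
\end{pmatrix},
\qquad
\Theta = \operatorname{diag}(\theta_{1}, \dots, \theta_{8}).
\end{aligned}
```

In this notation, hypothesis 1 corresponds to Λ having 2 columns, hypothesis 2 to the zero entries in Λ, hypothesis 3 to φ ≠ 0, and hypothesis 4 to Θ being diagonal. With 8 indicators there are 8(9)/2 = 36 observed variances and covariances and 17 free parameters (8 loadings, 8 error variances, and 1 factor correlation), which would leave 19 degrees of freedom under these assumptions.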
| Method |
|---|
Measures
Participants completed 4 performance measures in the following order: self-paced walk, Timed "Up & Go" Test, stair test, and Six-Minute Walk Test. Several minutes were provided between the self-paced walk, Timed "Up & Go" Test, and stair test. A 10-minute rest interval was provided between the stair test and the Six-Minute Walk Test. With the exception of the Six-Minute Walk Test, the outcome was the time to complete the task. Time was measured on a stopwatch to the nearest one-hundredth of a second, and distance was measured to the nearest meter.
Self-paced walk
Participants walked 2 lengths of a 20-m indoor course in response to the instructions, "Walk as quickly as you can without overexerting yourself."15 The turnaround time was excluded. An intraclass correlation coefficient (ICC) for test-retest reliability of .91 and a standard error of measurement of 1.73 seconds have been reported for this measure for patients similar to the participants in the present study.15
Timed "Up & Go" Test
Participants were instructed to rise from a standard arm chair, walk at a safe and comfortable pace to a line 3 m away, cross the line, turn, and return to a sitting position in the chair.16 An ICC for test-retest reliability of .75 and a standard error of measurement of 1.07 seconds have been reported for this measure for patients with OA and those undergoing arthroplasty of the hip or knee.15
Stair test
Participants ascended and descended 9 stairs (step height, 20 cm; step depth, 27 cm) in their usual manner at a safe and comfortable pace.15 A handrail was available. An ICC for test-retest reliability of .90 and a standard error of measurement of 2.35 seconds have been reported for this measure for patients similar to the participants in the present study.15
Six-Minute Walk Test
Participants were instructed to cover as much distance as possible during the 6-minute time frame. Standardized encouragement—"You are doing well, keep up the good work"—was provided at 60-second intervals. The test was conducted on a premeasured, 46-m, unobstructed, uncarpeted, rectangular circuit. The outcome was the distance walked in 6 minutes.15,17 An ICC for test-retest reliability of .94 and a standard error of measurement of 26.29 m have been reported for this measure for patients similar to the participants in the present study.15
Activity-specific pain rating
Participants marked the pain that they experienced on an 11-point (0–10) numeric rating scale immediately following each performance test.15 We are not aware of test-retest reliability values for patients similar to the participants in the present study; however, a reliability estimate (ICC) of .86 and a standard error of measurement of 1.04 have been reported for people with a spectrum of lower-extremity problems.18
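The reliability coefficients and standard errors of measurement quoted above are linked, in classical test theory, by SEM = SD × √(1 − ICC). The sketch below uses that relation to back out the between-subject standard deviation implied by each reported pair. It is illustrative only: the cited reliability studies may have derived their SEM values differently (for example, from an analysis-of-variance error term), and the function and variable names are hypothetical.

```python
import math


def sem_from_sd(sd: float, icc: float) -> float:
    """Standard error of measurement under the classical relation SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def implied_sd(sem: float, icc: float) -> float:
    """Between-subject SD implied by a reported SEM and ICC (inverse of sem_from_sd)."""
    if not 0.0 <= icc < 1.0:
        raise ValueError("icc must lie in [0, 1)")
    return sem / math.sqrt(1.0 - icc)


# (ICC, SEM) pairs as quoted in the Measures section; units are noted in the keys.
reported = {
    "self-paced walk (s)": (0.91, 1.73),
    "Timed Up & Go (s)": (0.75, 1.07),
    "stair test (s)": (0.90, 2.35),
    "Six-Minute Walk Test (m)": (0.94, 26.29),
    "activity-specific pain (0-10)": (0.86, 1.04),
}

# Implied SDs are an inference from the assumed formula, not values from the source.
implied = {name: implied_sd(sem, icc) for name, (icc, sem) in reported.items()}

if __name__ == "__main__":
    for name, sd in implied.items():
        print(f"{name}: implied SD = {sd:.2f}")
```

Read this way, a lower ICC paired with a small SEM (as for the Timed "Up & Go" Test) would point to a sample with relatively little between-subject spread rather than to a noisier measurement.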
Data Analysis
We applied confirmatory factor analysis with a maximum-likelihood estimation method (AMOS 4.0†) to assess the factorial validity of the performance tests.19–22 Unlike exploratory factor analysis, which provides all possible factor loadings, confirmatory factor analysis provides factor loadings for the specified model only. We conceptualized a measurement model with 2 factors, which we labeled pain and physical function (Fig. 1).22 We applied the following indexes to assess model fit: comparative fit index (CFI), relative fit (RF), Tucker-Lewis Index (TLI), root-mean-square error of approximation (RMSEA), and the model fit chi-square test and associated P value.22 Although no single standard exists for defining acceptable model fit, the following values are generally accepted: CFI, RF, and TLI values exceeding .95 indicate good fit; RMSEA values of less than .05 indicate good fit; and RMSEA values of less than .08 indicate reasonable fit.22,23 A significant chi-square value (eg, P<.05) indicates that the data do not fit the model. Prior to conducting the analyses, we assessed the data and found several of the underlying distributions to be nonnormal. Accordingly, we applied the bootstrap feature of AMOS 4.0 for 1,000 samples with replacement to estimate the parameter values and model fit indexes.22
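To make the fit criteria concrete, the sketch below computes a chi-square P value, RMSEA, CFI, and TLI from model and baseline chi-square statistics using commonly cited textbook definitions. These are assumptions for illustration, not a reproduction of the AMOS 4.0 computations or of the bootstrap procedure. The chi-square values in the usage lines are hypothetical, the function names are invented here, and the P-value helper only handles even degrees of freedom, where a closed form exists.

```python
import math


def chi2_p_value_even_df(chi2: float, df: int) -> float:
    """Upper-tail P value of a chi-square statistic (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive even df")
    if chi2 < 0:
        raise ValueError("chi2 must be non-negative")
    half = chi2 / 2.0
    term = 1.0
    total = 1.0
    # Sum of (chi2/2)^i / i! for i = 0 .. df/2 - 1
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA with the (n - 1) convention for a single-group model."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Comparative fit index from model (m) and baseline (b) chi-squares."""
    d_m = max(chi2_m - df_m, 0.0)
    denom = max(d_m, chi2_b - df_b, 0.0)
    return 1.0 if denom == 0.0 else 1.0 - d_m / denom


def tli(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Tucker-Lewis Index from model (m) and baseline (b) chi-squares."""
    ratio_b = chi2_b / df_b
    ratio_m = chi2_m / df_m
    return (ratio_b - ratio_m) / (ratio_b - 1.0)


def rmsea_label(value: float) -> str:
    """Apply the RMSEA cutoffs quoted in the text (<.05 good, <.08 reasonable)."""
    if value < 0.05:
        return "good"
    if value < 0.08:
        return "reasonable"
    return "less than desirable"


if __name__ == "__main__":
    # Hypothetical statistics, chosen only to exercise the functions.
    chi2_m, df_m, chi2_b, df_b, n = 20.0, 8, 500.0, 15, 177
    print(f"P     = {chi2_p_value_even_df(chi2_m, df_m):.3f}")
    print(f"RMSEA = {rmsea(chi2_m, df_m, n):.3f} ({rmsea_label(rmsea(chi2_m, df_m, n))})")
    print(f"CFI   = {cfi(chi2_m, df_m, chi2_b, df_b):.3f}")
    print(f"TLI   = {tli(chi2_m, df_m, chi2_b, df_b):.3f}")
```

With these hypothetical inputs the incremental indexes (CFI and TLI) exceed .95 while the RMSEA does not meet the .08 cutoff, which shows how the indexes can disagree and why several were examined together.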
| Results |
|---|
$\chi^2_{8}=7.7$, P=.473; cross-validation sample: $\chi^2_{8}=9.6$, P=.269; simultaneous test for a difference between model structures: $\chi^2_{16}=17.6$, P=.351). Similar results also were obtained for the second cross-validation analysis (knee sample: $\chi^2_{8}=12.1$, P=.148; hip sample: $\chi^2_{8}=13.3$, P=.108; simultaneous test for a difference between model structures: $\chi^2_{16}=25.3$, P=.064). Given that the cross-validation analyses supported the model for various independent subgroups of participants, we present the results for the entire sample of 177 participants.
Descriptive statistics for the performance measures are shown in Table 2, Figure 1 shows the standardized factor loadings for the initial measurement model (model 1), and Table 3 shows the fit statistics. The observed or measured variables in Figure 1 are shown in rectangles, and the latent variables are shown in circles. The larger circles labeled "pain" and "function" designate the factors, and the smaller circles with numbered "e" values signify the measurement error terms associated with each observed variable. The numbers between the factors and observed variables connected by single-headed arrows represent the standardized factor loadings. The negative value associated with the function component of the Six-Minute Walk Test occurs because higher functional levels are associated with greater distances, whereas shorter times reflect higher functional levels for the other 3 performance tests. The curved double-headed arrow showing a value of .48 represents the correlation between the factors pain and function.
Although the CFI, RF, and TLI exceeded .90 (Tab. 3), the RMSEA indicated a less-than-desirable fit. The modification index for this model (not shown) suggested that the model could be improved by adding a correlation between the stair pain and time error terms, and we elected to address this association with 2 revised models. To ascertain the magnitude of the correlated error terms, the first revised model (model 2a) specified a correlation between the stair pain and time error terms (Fig. 2a: curved double-headed arrow showing a correlation of .41). The second revised model (model 2b) removed the stair pain and time terms (Fig. 2b). The fit statistics for both models are shown in Table 3.
Both modified models improved the fit over that of the initial model. However, of the 2 modified models, only the 1 that removed the stair terms achieved a good fit for all indexes and was consistent with all of our initial hypotheses.
| Discussion |
|---|
Rather than adhering to the notion that self-report measures represent the preferred method of assessing physical function, we examined whether performance-specific evaluations of pain and physical function provide a viable method for obtaining a more distinct assessment of these 2 related health concepts than has been reported for self-report measures, such as the WOMAC. Our initial model yielded a correlation of .48 between pain and function, providing support for hypotheses 1 and 3, which conceptualized 2 correlated health concepts. Moreover, our second hypothesis was sustained in that significant loadings were obtained for the specified health concepts, and no loading on the nonspecified health concept was evident. However, our fourth hypothesis was not supported in that the error terms for stair pain and stair time were correlated. This finding led to the exploration of 2 revised models: 1 allowed a correlation between stair pain and time error terms, and the other removed the stair terms from the model. The intent of the model that allowed a correlation between the stair pain and time terms was to examine the extent to which these components were correlated beyond the correlation between the factors pain and function. The second revised model excluded the stair test from the analysis, and this model provided results consistent with our 4 hypotheses. The correlation between the factors pain and function was .43 for the final model. This correlation is lower than that typically reported between the pain and function subscales of the WOMAC (.74–.84)7,9,25 and SF-36 (.57)9 for patients reasonably similar to the participants in the present study.
Inclusion of the stair test makes the distinction between the health concepts of pain and function less discernible. This finding is reflected in the lower correlation between pain and function noted in model 2b (r=.43) than in models 1 and 2a (r=.48). Accordingly, when the clinical goal is to obtain as distinct an assessment as possible between the health concepts of pain and function, our results suggest that the stair test not be included in a composite score. However, we are not suggesting that the stair test be excluded from a patient's assessment. It is clear that 1 of the physical therapist's responsibilities for patients similar to the participants in the present study is to ascertain their ability to safely ascend and descend stairs and to intervene when appropriate. We simply stress that if the results from the stair test are combined with the results from the other performance measures, then the impressions of pain and function will be less distinct.
Assessments of pain and function are important both to identify patients' problems at a point in time and to assess change over time. Information from these assessments is applied by clinicians to guide decisions concerning individual patients, by researchers to ascertain the relative effectiveness of competing interventions in clinical trials, and by health care policy makers to set benchmarks regarding the maximum number of patient visits and corresponding payment plans. Previous work demonstrated that self-reports of physical function after arthroplasty are strongly influenced by pain and change in pain.3 The consequences are that patients report their physical function to be higher than is demonstrated by performance tests and that health care professionals who rely on self-reports alone overestimate patients' functional status levels.2,3 The results of the confirmatory factor analysis of the present study indicate that performance-rated pain and function represent 2 factors that have not emerged in previous factor analyses of self-report measures. Accordingly, complementing existing self-report assessments of physical function with performance-rated pain and function tests may provide clinicians with a more valid assessment of these health concepts.
There are several potential limitations of the present study. First, the study sample was patients awaiting hip or knee arthroplasty. Presumably, these patients have more severe OA than the typical patient seen in general physical therapist practice. However, in considering this point, it should be remembered that the study participants were able to complete all of the performance tests. A second limitation relates to the sample size for the cross-validation portion of the present study. Although there is no standard method for estimating sample size, it is generally agreed that the sample size should be at least 10 subjects per observed variable or a minimum of 100 subjects.21,26 Although our overall sample size of 177 participants exceeded the recommended minimum sample size, the number of participants in each of the cross-validation samples was slightly smaller than the recommended sample size.
| Conclusion |
|---|
| Footnotes |
|---|
The Research and Ethics Committee of the Holland Orthopaedic & Arthritic Centre of Sunnybrook Health Sciences Centre approved this study.
* McRoberts BV, The Hague, the Netherlands.
† SmallWaters Corp, 1507 E 53rd St, Suite 452, Chicago, IL 60615.
| References |
|---|