Letters and Responses
A recent issue of Physical Therapy contained a 2-part series investigating the Roland-Morris Back Pain Disability Scale (RMQ).1,2 In these articles, the authors provided further evidence regarding the ability of the RMQ to detect meaningful change over time in patients with low back pain. I would like to comment on the authors' use of the term "sensitivity to change" to describe this measure.
Sensitivity to change is defined in the first article as "the capacity of a measure to detect change in patients' functional status over time"1(p1188) and in the second article as dealing with "the clinical meaningfulness of changes in scores."2(p1198) The term typically applied to this measurement property is "responsiveness," not "sensitivity to change." "Responsiveness" is, indeed, the term used by Kirshner and Guyatt,3 whom the authors of the first article1 reference when describing the measurement property. Although the terms "responsiveness" and "sensitivity to change" are interrelated, they convey somewhat different meanings, and the proper term to describe the property investigated in the studies by Riddle et al2 and Stratford et al1 is "responsiveness," not "sensitivity to change."
The capacity of a scale to detect meaningful change over time, classically described as responsiveness, involves 2 issues: first, the measure must detect meaningful change when it has occurred, and second, it must remain stable when no change has occurred. The analogy is often made to a diagnostic test, in which the "disease" that the investigator is attempting to detect is meaningful change and the "test" is the change score of the scale.4 To continue with this analogy, the ability of a scale to detect change when it has occurred describes the scale's sensitivity to change, whereas the stability of a scale in patients who have not changed represents its specificity to change.
In order to investigate the sensitivity of a diagnostic test, the investigator looks only at the patients diagnosed with the disease based on a gold standard and calculates the percentage of patients properly diagnosed by the test. Likewise, specificity is calculated by looking only at patients without the disease and calculating the percentage with a negative test result. Sensitivity and specificity may then be combined into a statistic such as a positive likelihood ratio (sensitivity/[1 − specificity]) that combines both patients with and without the disease and describes the overall diagnostic ability of the test.5
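As a minimal sketch of the arithmetic described above, the three quantities can be computed from the cells of a 2 × 2 table. The function name and the counts in the usage lines are hypothetical and are not drawn from any of the studies discussed.

```python
def diagnostic_accuracy(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, positive likelihood ratio).

    Sensitivity uses only patients WITH the disease per the gold standard;
    specificity uses only patients WITHOUT it. The likelihood ratio
    combines both groups.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    # LR+ = sensitivity / (1 - specificity); undefined when there are
    # no false positives, reported here as infinity.
    if false_pos == 0:
        lr_positive = float("inf")
    else:
        lr_positive = sensitivity / (1 - specificity)
    return sensitivity, specificity, lr_positive


# Hypothetical counts: 50 diseased patients (40 test positive) and
# 50 non-diseased patients (45 test negative).
sens, spec, lr_pos = diagnostic_accuracy(40, 10, 45, 5)
print(sens, spec, round(lr_pos, 2))
```

Note that the first two values each depend on a single group of patients, whereas the likelihood ratio cannot be computed without both groups.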
Carrying this analogy to the investigation of self-report measures such as the RMQ, sensitivity to change can be evaluated by looking at only those patients identified as having undergone meaningful change based on a gold standard (such as global rating or some other external criteria) and calculating a statistic (such as an effect size or standardized response mean) describing the ability of the RMQ to identify change in this group of patients. Similarly, specificity to change can be investigated only by calculating such a statistic using those patients deemed to be stable based on the gold standard. Statistics calculated using both groups of patients (those who have changed and those who are stable) cannot properly be said to assess sensitivity or specificity to change. Calculations that involve both patients who have and have not changed describe a measure's overall responsiveness.
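The distinction can be illustrated with a short sketch that computes an effect size and a standardized response mean separately for changed and stable patients. It assumes two commonly used definitions (effect size = mean change divided by the standard deviation of baseline scores; standardized response mean = mean change divided by the standard deviation of change scores) and assumes that change is scored as baseline minus follow-up, so that improvement on a disability scale is positive. The scores are invented for illustration and are not data from any study cited here.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    # Positive values indicate improvement (a lower disability score
    # at follow-up), under the convention assumed in this sketch.
    return [b - f for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    # Mean change divided by the SD of the baseline scores.
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    # Mean change divided by the SD of the change scores.
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)


# Hypothetical scores for patients the gold standard labels "changed".
changed_base, changed_follow = [14, 16, 12, 18, 10], [8, 9, 7, 10, 6]
# Hypothetical scores for patients the gold standard labels "stable".
stable_base, stable_follow = [12, 15, 11, 14, 13], [13, 14, 11, 15, 12]

# Sensitivity to change: statistics from the changed group only.
print(effect_size(changed_base, changed_follow))
print(standardized_response_mean(changed_base, changed_follow))
# Specificity to change: statistics from the stable group only.
print(effect_size(stable_base, stable_follow))
print(standardized_response_mean(stable_base, stable_follow))
```

In this invented example the statistics are large in the changed group and zero in the stable group, which is the pattern expected of a scale with both sensitivity and specificity to change.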
Beurskens et al6 provide an example of these distinctions. The authors calculated effect sizes for the RMQ and the Oswestry Questionnaire (OSW) separately for patients judged by a gold standard to have changed and those judged to be stable. Effect sizes should be relatively high in patients who have changed, attesting to a measure's sensitivity to change, and should be near zero in stable patients if a measure possesses specificity to change. Because patients who underwent meaningful change based on the gold standard were evaluated as a separate group, these authors could correctly state that they evaluated the sensitivity to change of the 2 scales. These authors also calculated the area under a receiver operating characteristic (ROC) curve that combines patients who have and have not changed to compute a composite statistic describing the overall responsiveness of each scale. The ROC curve was also used to determine the minimum clinically important difference (MCID), or the smallest difference in scores that can be confidently judged to represent meaningful change and not measurement error.7 Because the area under the ROC curve and MCID calculations involved both changed and stable patients, these statistics were said by the authors to describe the responsiveness of the RMQ and the OSW.6
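To show why the area under the ROC curve necessarily draws on both groups, the sketch below computes it as the proportion of changed-stable patient pairs in which the changed patient has the larger change score, with ties counted as one half. This pairwise formulation is a standard equivalent of the area under the empirical ROC curve; it is offered as an illustration and is not necessarily the computation used by Beurskens et al. The function name and the change scores are hypothetical.

```python
def roc_area(changed_scores, stable_scores):
    """Area under the empirical ROC curve for change scores.

    Equals the probability that a randomly chosen changed patient has a
    larger change score than a randomly chosen stable patient
    (ties count as one half).
    """
    if not changed_scores or not stable_scores:
        raise ValueError("both groups must contain at least one patient")
    favorable = 0.0
    for c in changed_scores:
        for s in stable_scores:
            if c > s:
                favorable += 1.0
            elif c == s:
                favorable += 0.5
    return favorable / (len(changed_scores) * len(stable_scores))


# Hypothetical change scores (baseline minus follow-up).
print(roc_area([6, 7, 5, 8, 4], [-1, 1, 0, -1, 1]))  # complete separation
print(roc_area([3, 5, 2], [2, 1, 4]))                # overlapping groups
```

An area of 1.0 indicates that change scores separate the two groups completely, whereas 0.5 indicates separation no better than chance. Removing either group leaves the statistic undefined, which is why it describes overall responsiveness and not sensitivity to change alone.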
Stratford et al1 and Riddle et al2 used ROC curve analyses and calculated MCID values for the RMQ in their studies. These measures combine both changed and stable patients and, therefore, should be termed measures of responsiveness. The phrase "sensitivity to change" should be reserved for statistics that strictly describe the magnitude of change occurring in patients who have changed based on a gold standard. This type of analysis was not performed in these 2 studies.
Investigation into the measurement properties of outcome scales such as the RMQ is a relatively new and rapidly expanding field of inquiry. As interest in this area continues to grow, the maintenance of precise terminology will be important in facilitating future debate and discussion. The term "sensitivity" has a precise definition in the epidemiological vernacular, which should be maintained in studies of this nature.
Assistant Professor
Department of Physical Therapy
University of Pittsburgh
6035 Forbes Tower
Pittsburgh, PA 15260
(jfritz{at}pitt.edu)
References
We believe that Fritz makes 2 major points: (1) "responsiveness involves 2 issues: first, the measure must detect meaningful change when it has occurred, and second, it must remain stable when no change has occurred," and (2) the term "sensitivity," as used in a diagnostic test context, is what the term "sensitivity to change" really refers to when used in the literature. Before proceeding, we would like to preface our response by stating that, based on current usage, we believe the terms "responsiveness" and "sensitivity to change" have essentially equivalent meanings. However, we will offer an argument concerning our use of the term "sensitivity to change."
As Fritz has done, authors applying the term "responsiveness" frequently cite articles by Kirshner and Guyatt.2,3 The term "responsiveness," as defined by Kirshner and Guyatt,2 does not include the concepts of valid change (ie, distinguishing those patients who change by different amounts) or stability. Instead, it is the term "evaluative measure" that conveys the concept of accurately detecting meaningful change over time. Kirshner and Guyatt defined an evaluative index as follows: "An evaluative index is used to measure the magnitude of longitudinal change in an individual or group on the dimension of interest."2 They stated that it is composed of 3 properties: reliability, validity, and responsiveness.2 They defined reliability as "stable intrasubject variation: insignificant variation between replicate measures," validity as "longitudinal construct validity: relationship between changes in index and external measures over time," and responsiveness as the "power of the test to detect a clinically important difference."2 Therefore, according to Kirshner and Guyatt's taxonomy, it is possible for a measure to be responsive but not valid.2,3 Other authors4,5 have suggested that responsiveness is an aspect of validity.
As with the term "responsiveness," there also is variation in the implied meaning of the term "sensitivity to change." To illustrate the conundrum facing readers, we apply an example from the writings of Deyo and colleagues.6–8 In 1986, Deyo used the term "sensitivity to change" and applied a correlational analysis (patients were ranked as "better," "unchanged," or "worse").6 Also in 1986, Deyo and Centor stated, "The issue is not merely sensitivity to change, but ability to discriminate between those who improve and those who do not."7 We believe these 2 applications of the term "sensitivity to change" are contradictory and illustrate why so much confusion exists in the usage of the terms "responsiveness" and "sensitivity to change." Finally, in 1991, Deyo et al8 stated, "Responsiveness is the ability of an instrument to detect small but important clinical changes....Sensitivity to change has sometimes been used to denote this property, but we avoid the term sensitivity because of its other clinical and epidemiologic meanings." We believe the works of Deyo and colleagues illustrate 2 points. First, Deyo appeared to apply different meanings to the term at about the same point in time (ie, the two 1986 publications), and second, "responsiveness" and "sensitivity to change" have been used to denote the same property.
The second point made by Fritz was that the term "sensitivity to change" should be avoided because it has a meaning that is unique to diagnostic test methodology. The term "sensitivity to clinical change,"9 however, predates the application of diagnostic test methodology as one analytic method for assessing change.7 Moreover, a review of the literature shows that the word "sensitivity" is not restricted to articles by researchers who applied diagnostic test methodology to "diagnose change"; the word "sensitivity" is also found in articles by researchers who applied group contrast analysis9–11 and correlational analysis.6 To our knowledge, only Beurskens et al12 have coupled the phrases "sensitivity to change" and "specificity to change." Deyo et al8 acknowledged that they chose the term "responsiveness" to avoid confusion with the multiple meanings associated with the word "sensitivity." In summary, we do not believe the word "sensitivity," used in a diagnostic test context, and the phrase "sensitivity to change" are equivalent, and we believe the literature discussed in our response supports this notion.
Conventions of grammar and definition are often dependent on contemporary usage. Applying this principle, we suggest that, in most cases, contemporary usage supports the notion that the terms "responsiveness" and "sensitivity to change" have equivalent meanings. Support for this statement is found in a frequently cited textbook on clinical measurement (eg, "This effect has been variously labeled as sensitivity to change and responsiveness in the literature.")11(p168) and in journal articles.8,13–15 However, when the term "responsiveness" is used in conjunction with the Kirshner and Guyatt citation2 and in the absence of an additional clarifying statement, this term takes on a restrictive meaning that does not include the concept of validity. We believe that readers need to look beyond the jargon when forming an opinion about a measure's ability to detect valid and meaningful change. A critical review of the study design and analysis is essential.
References