PTJ
PHYS THER
Vol. 82, No. 12, December 2002, pp. 1265-1268


Letters and Responses

"Skilled Mentors" and Reliability


To the Editor:

I am writing in response to the article by Riddle et al titled "Evaluation of the Presence of Sacroiliac Joint Region Dysfunction Using a Combination of Tests: A Multicenter Intertester Reliability Study" in the August 2002 issue of the Journal. The research design and statistical analysis used in this multicenter reliability study on examination of the sacroiliac region were impressive. The authors attempted to replicate an earlier study by Cibulka and colleagues,1 but they greatly increased the number of clinical sites and therapists involved, in an attempt to make the results more generalizable to therapists who might use the examination scheme advocated by Cibulka and colleagues. However, the results of the study by Riddle et al demonstrated poor reliability, whereas Cibulka et al had reported an excellent level of reliability.

In comparing the 2 studies, it seems that one likely explanation for the disparity in results is the training process used in the studies. Cibulka et al1 developed the examination scheme and likely mentored their coworkers to perform the procedures exactly as they advocated in an attempt to maximize the consistency and reliability. In contrast, Riddle et al sent a written description of the examination procedures with illustrations to the participating physical therapists and asked them to practice the techniques on each other until they felt ready to participate in the study.

Manual examination procedures are motor skills that must be practiced under the guidance of a mentor in order to be learned properly. The motor learning literature, I believe, suggests that practice of a motor skill is necessary for the skill to be learned. Petty2 contended that physical therapists in an advanced manual therapy residency program provided inaccurate and unreliable feedback to classmates when practicing manual therapy techniques. Watson and Radwan3 showed that concurrent qualitative feedback provided by an instructor enhanced the retention and future performance of a manual therapy technique. Other researchers4,5 have demonstrated that the use of concurrent quantitative feedback during practice sessions will assist in more accurate application of forces in manual examination and manual therapy techniques.

For Riddle et al to assume that the physical therapists participating in their study would learn the manual examination procedure effectively from a written instruction sheet is analogous to a physician assuming that a patient with low back pain will learn proper performance of an exercise program from a written instruction sheet. The unfortunate result often seen in clinical practice, in my opinion, is that after the patient with low back pain is given the physician's written exercise program, the patient does not improve, the exercise program is determined to be ineffective, and procedures that are more invasive are often prescribed. Most physical therapists, I believe, would argue that if the patient had received proper training in performance of the exercise program with practice and feedback, the exercise program would prove to be effective. Similarly, Riddle et al have concluded that the Cibulka examination scheme for sacroiliac region dysfunctions lacks reliability and should not be included in the examination of patients with lumbosacral complaints. I would argue that with proper training of the therapists in these manual examination procedures—which can best be done through practice sessions under the watchful eye of a mentor to provide feedback—the consistency of performance and reliability of data obtained with these examination procedures would likely improve.

I would encourage future researchers who want to engage in studies that test the measurement reliability of manual examination procedures to use motor learning principles when training the physical therapists who participate in their studies. The physical therapists' training should include practice sessions with concurrent qualitative and quantitative feedback from a skilled mentor before the reliability of data obtained with an examination procedure is tested.

Kenneth A Olson, PT, MSc, OCS, FAAOMPT, Instructor

Physical Therapy Program
College of Health and Human Sciences
School of Allied Health Professions
Northern Illinois University
DeKalb, IL 60178
(kolson{at}niu.edu)

References

  1. Cibulka MT, Delitto A, Koldehoff RM. Changes in innominate tilt after manipulation of the sacroiliac joint in patients with low back pain: an experimental study. Phys Ther. 1988;68:1359–1363.
  2. Petty NJ, Bach TM, Cheek L. Accuracy of feedback during training of passive accessory intervertebral movements. Journal of Manual and Manipulative Therapy. 2001;9:99–108.
  3. Watson TA, Radwan H. Comparison of three teaching methods for learning spinal manipulation skill: a pilot study. Journal of Manual and Manipulative Therapy. 2001;9:48–52.
  4. Lee M, Moseley A, Refshauge K. Effect of feedback on learning a vertebral joint mobilization skill. Phys Ther. 1990;70:97–104.
  5. Keating J, Matyas TA, Bach TM. The effect of training on physical therapists' ability to apply specified forces of palpation. Phys Ther. 1993;73:45–53.

 

To the Editor:

We would like to comment on the article by Riddle, Freburger, and the North American Orthopaedic Rehabilitation Research Network titled "Evaluation of the Presence of Sacroiliac Joint Region Dysfunction Using a Combination of Tests: A Multicenter Intertester Reliability Study."

We read with interest their report of using a combination of sacroiliac joint tests to determine the presence of sacroiliac joint dysfunction. After reading the "Method" section, we came away with the conclusion that if little time is spent in training therapists on how to properly perform sacroiliac joint tests, the chance of physical therapists agreeing on the results of the tests probably will be low. We do not think that physical therapists' proficiency can be tested merely by asking them how many continuing education courses they have attended or whether they felt comfortable conducting the tests after reading a written description of the tests. Furthermore, we find the statement, "It is likely that most therapists in our study had seen or had used the tests examined in the study," hardly evidence of competency.

It also was unclear to us after reading the "Method" section who specifically did what. There were 34 examiners working in 11 clinics, seeing a total of only 65 patients. The authors did not fully describe who did what or how many tests each pair of therapists performed. Thus, we wonder whether some pairs of therapists did more than others and whether some pairs of therapists did better than others.

We think that the methods usually used by therapists to reduce disagreement among observers (including repeating the examination, discussing areas of disagreement between therapists, and using feedback loops, among others1–4) are often omitted during the relatively artificial process of determining observer variability. Although we agree that having good observer consistency is important, we think what is even more important is the clinical usefulness of the index or test and whether or not the patient is better off. We believe that studies that examine only the observer variability of an index or test run the risk of being clinically irrelevant. Thus, we think that observer variability studies can be more useful if they also are devoted to demonstrating their clinical worth. An example of this is a study in which Erhard et al5 demonstrated the success of using this same cluster of tests in the management of patients with low back pain who were suspected of having sacroiliac joint pain.

Finally, we agree with the authors' conclusion that the role of physical therapist training in the use of this procedure is unclear.

Michael T Cibulka, PT, PT/MHS, OCS

Jefferson County Rehabilitation and Sports Clinic
1330 YMCA Dr, Ste 1200
Festus, MO 63028
(Jcrehab{at}jcn.net)

Rhonda M Koldehoff, PT

Metro East Rehabilitation and Sports Clinic
817 South Belt W
Belleville, IL 62220

References

  1. Sackett DL, Haynes RB, Tugwell P. Clinical Epidemiology: A Basic Science for Clinical Medicine. Boston, Mass: Little, Brown & Co; 1985.
  2. Feinstein AR. Clinimetrics. New Haven, Conn: Yale University Press; 1987.
  3. Sackett DL, Richardson WS, Rosenberg W, Haynes RB. Evidence-Based Medicine: How to Practice and Teach EBM. Edinburgh, Scotland: Churchill Livingstone; 1998.
  4. Hayes KW. Making tests with low reliability work for you. Orthopaedic Physical Therapy Practice. 1999;11(1):28–29.
  5. Erhard RE, Delitto A, Cibulka MT. Relative effectiveness of an extension program and a combined program of manipulation and flexion and extension exercises in patients with acute low back syndrome. Phys Ther. 1994;74:1093–1100.

 

Author Response:


Mr Olson makes 2 major points in his letter. First, he contends that one likely explanation for the differences in the magnitude of reliability reported in our article, as compared with the article by Cibulka and colleagues,1 is differences in the type and extent of physical therapist training. Second, Mr Olson argues that researchers planning to conduct reliability studies should use "skilled mentors" to train the therapists participating in the studies.

As we indicated in our article, one potential explanation for the differences between the reliability estimates obtained in our study and those obtained in the study by Cibulka et al1 was that the therapists in the 2 studies appeared to receive different training in the procedures that were examined. In the study by Cibulka et al, Cibulka and a second therapist colleague in the clinic collected data. The extent and type of training that these 2 therapists completed prior to the study were not described. In our study, 34 therapists from 11 clinics participated. We gave the therapists a handout that explained each measurement and that included photographs of both the beginning and ending positions for each procedure. Therapists were instructed to practice the measurements and begin data collection after they believed they were ready to conduct the tests on patients.

We believe that the effect training might have on the reliability of measurements of sacroiliac joint alignment or movement is not known. However, when it comes to the assessment of reliability, we contend that the literature shows some consistent patterns. We believe that the evidence clearly indicates that for some measurements, additional training and practice prior to a study may improve reliability beyond that expected of a therapist with little or no additional training in the taking of a measurement. Diamond and colleagues,2 for example, found that goniometric measurements of foot alignment can be reliable (intraclass correlation coefficients [ICC (2,1)] ranging from .58 to .89, with most at .75 or higher) when therapists undergo a lengthy training period prior to taking the measurements. Contrast this to the poor reliability (ICCs [1,1] ranging from 0 to .58, with most at .30 or lower) reported for therapists reporting limited experience with the measures.3
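For readers less familiar with the two coefficient forms named above, the following is a minimal sketch of how ICC(1,1) and ICC(2,1) can be computed from a subjects-by-raters table using the standard one-way and two-way random-effects mean squares. The function name `icc_single_rater` and the `RATINGS` table are hypothetical and purely illustrative; the numbers are not data from any study discussed here.

```python
def icc_single_rater(scores):
    """Return (ICC(1,1), ICC(2,1)) for a table of scores.

    scores: list of rows, one row per subject, one column per rater.
    Uses the usual ANOVA mean squares (between subjects, within
    subjects, between raters, residual error).
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a full table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_within = ss_total - ss_subjects
    ss_error = ss_within - ss_raters

    bms = ss_subjects / (n - 1)             # between-subjects mean square
    wms = ss_within / (n * (k - 1))         # within-subjects mean square
    jms = ss_raters / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square

    icc_1_1 = (bms - wms) / (bms + (k - 1) * wms)
    icc_2_1 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    return icc_1_1, icc_2_1


# Hypothetical ratings: 6 subjects (rows) scored by 4 raters (columns).
RATINGS = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

icc11, icc21 = icc_single_rater(RATINGS)
print(f"ICC(1,1) = {icc11:.2f}, ICC(2,1) = {icc21:.2f}")
```

With this illustrative table the two forms come out at about .17 and .29, which shows that the choice of model matters: ICC(1,1) treats all within-subject variation as error, whereas ICC(2,1) separates out systematic differences between raters.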

For other measurements, it appears that additional training, even by experts, has little impact on reliability.4,5 For example, Binkley and colleagues4 found that the reliability was poor (Kappa=.3) for measurement of accessory motion of the lumbar spine. In the study by Binkley and colleagues, the examiners had what appeared to be extensive training in the procedures, because the therapists earned advanced credentials and averaged approximately 10 years of clinical experience. Maher and Adams6 found similarly low reliability estimates for stiffness judgments of lumbar spine accessory motions by physical therapists with graduate credentials in manipulative therapy and extensive clinical experience.

For other measures, reliability is high even without specialized training. The literature that has examined the reliability of passive goniometric measurements of the shoulder, elbow, and knee, for example, clearly demonstrates that reliability is high for some of these measures even when additional training is not provided and the techniques for taking the measurements are not specified.7,8 Manual assessments of the inflammatory status of the knee joint (Kappa=.76) and palpation for increased temperature of the knee (Kappa=.66) also have been shown to have reasonably high reliability for therapists who did not have training by an expert prior to the study.9

Olson contends, "Manual examination procedures are motor skills that must be practiced under the guidance of a mentor in order to be learned properly." It is not clear what Olson means by "skilled mentor," but this implies to us that Olson believes that training from an expert is required to properly conduct manual examination procedures. Goniometric measurements of passive range of motion and manual assessments of the presence of joint inflammation are common examination procedures done by physical therapists, and we believe that the evidence supports the contention that training by an expert is not required to obtain reliable data for these types of measurements.

We believe the logical conclusion from the large body of literature on reliability is that some measurements have what we would consider to be acceptable reliability in the hands of therapists with minimal or no specialized training or experience, some measurements are reliable only after specialized training, and some are unreliable no matter the amount of training. We suspect that the tests advocated by Cibulka and colleagues1 fall into the last category, because of the evidence we provided in our article and because of the evidence we reviewed in an earlier article,10 which indicated that there are numerous studies of sacroiliac joint tests of position or movement that showed poor reliability of measurements even among experienced examiners.

Olson also suggests that studies of the reliability of measurements should be designed so that the participating therapists are trained by an expert prior to the study. We believe that this suggestion is ill-conceived. First, we believe there is evidence to indicate that, at least for some measurements, specialized training is not required to achieve adequate reliability. Second, by providing training and mentorship that is unique to the individual providing the training, the researcher is introducing an artificial control in the study that immediately places restrictions on generalizability and, therefore, usefulness of the results for most therapists.

We believe the following general approach should be taken when conducting reliability studies of clinical measurements. If evidence indicates that the reliability of a measurement is unacceptable for clinical practice, we believe the most appropriate approach, in most cases, is to design a training program that is well defined and easy to replicate so that others can undergo similar training. If reliability is shown to be adequate with this training program, then therapists could undergo similar training in order to achieve similar levels of reliability. If reliability for a measure is shown to be inadequate for clinical decisions even with additional training, we believe therapists should look for an alternative measure.

Cibulka and Koldehoff also discuss the issue of therapist training but make other points. Cibulka and Koldehoff contend that therapists use a variety of methods to reduce the extent of measurement error and that these methods should be included in reliability studies. They state that therapists repeat examinations, discuss areas of disagreement, and use feedback loops during clinical practice. We agree that therapists often discuss measurements, speculate on sources of measurement error, and practice examination procedures. Therapists in our study were given an opportunity to practice and discuss the techniques as well. But when it comes to the design of reliability studies, we believe Cibulka and Koldehoff miss the point. When therapists conduct examinations, almost all measurements are routinely taken in clinical practice by an individual therapist on an individual patient, without the input of others. It is the error associated with this one-to-one process that most reliability studies are designed to assess. We believe the error we found is what exists when a therapist obtains a measurement following a typical amount of training.1

Cibulka and Koldehoff also argue that "observer variability studies can be more useful if they are also devoted to demonstrating their clinical worth." We found in our study that when therapists interpret the 4 tests in a composite analysis as described by Cibulka et al,1 therapists agree on which patients have a positive composite score about 50% of the time. We argued in our article that the agreement on positive tests that was essentially equal to chance agreement provided reasonably strong evidence of a lack of clinical utility.
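The point about raw agreement being indistinguishable from chance can be illustrated with a chance-corrected index such as Cohen's kappa. The sketch below is a generic illustration only: the function name `cohens_kappa` and the two rating lists are hypothetical, the data are invented, and the calculation is overall two-rater kappa rather than a reproduction of the analysis in the original article.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each classified the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set of patients")
    n = len(rater_a)

    # Observed proportion of patients on whom the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Invented example: 20 patients, each therapist calls half of them positive.
therapist_a = ["pos"] * 10 + ["neg"] * 10
therapist_b = ["pos"] * 5 + ["neg"] * 5 + ["pos"] * 5 + ["neg"] * 5

kappa = cohens_kappa(therapist_a, therapist_b)
print(f"kappa = {kappa:.2f}")
```

In this invented example the therapists agree on 10 of 20 patients (50%), but because each rates half of the patients as positive, 50% agreement is also what chance alone would produce, so kappa is 0.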

Daniel L Riddle, PT, PhD, Associate Professor


Department of Physical Therapy
Medical College of Virginia Campus
Virginia Commonwealth University
1200 E Broad St
Richmond, VA 23298-0224
(driddle{at}hsc.vcu.edu)

Janet K Freburger, PT, PhD

NRSA Postdoctoral Research Fellow
Cecil G Sheps Center for Health Services Research
Assistant Professor
Division of Physical Therapy
University of North Carolina at Chapel Hill
Chapel Hill, NC

North American Orthopaedic Rehabilitation Research Network

References

  1. Cibulka MT, Delitto A, Koldehoff RM. Changes in innominate tilt after manipulation of the sacroiliac joint in patients with low back pain: an experimental study. Phys Ther. 1988;68:1359–1363.
  2. Diamond JE, Mueller MJ, Delitto A, Sinacore DR. Reliability of a diabetic foot evaluation. Phys Ther. 1989;69:797–802.
  3. Elveru RA, Rothstein JM, Lamb RL. Goniometric reliability in a clinical setting: subtalar and ankle joint measurements. Phys Ther. 1988;68:672–677.
  4. Binkley J, Stratford PW, Gill C. Interrater reliability of lumbar accessory motion mobility testing. Phys Ther. 1995;75:786–795.
  5. Potter NA, Rothstein JM. Intertester reliability for selected clinical tests of the sacroiliac joint. Phys Ther. 1985;65:1671–1675.
  6. Maher C, Adams R. Reliability of pain and stiffness assessments in clinical manual lumbar spine examination. Phys Ther. 1994;74:801–811.
  7. Riddle DL, Rothstein JM, Lamb RL. Goniometric reliability in a clinical setting: shoulder measurements. Phys Ther. 1987;67:668–673.
  8. Rothstein JM, Miller PJ, Roettger RF. Goniometric reliability in a clinical setting: elbow and knee measurements. Phys Ther. 1983;63:1611–1615.
  9. Fritz JM, Delitto A, Erhard RE. An examination of the selective tissue tension scheme, with evidence for the concept of a capsular pattern of the knee. Phys Ther. 1998;78:1046–1056.
  10. Freburger JK, Riddle DL. Using published evidence to guide the examination of the sacroiliac joint region. Phys Ther. 2001;81:1135–1143.


Related Article

Evaluation of the Presence of Sacroiliac Joint Region Dysfunction Using a Combination of Tests: A Multicenter Intertester Reliability Study
Daniel L Riddle, Janet K Freburger, and North American Orthopaedic Rehabilitation Research Network
Physical Therapy 2002 82: 772-781.




Copyright © 2002 by the American Physical Therapy Association.