
Four Clinical Tests of Sacroiliac Joint Dysfunction: The Association of Test Results With Innominate Torsion Among Patients With and Without Low Back Pain

Pamela K Levangie

Abstract

Background and Purpose. The purpose of this study was to assess the association between innominate torsion (asymmetric anteroposterior positioning of the pelvic innominates) and the Gillet, standing forward flexion, sitting forward flexion, and supine-to-sit tests. Subjects. A sample of 21- to 50-year-old patients with low back pain (n=150) and a comparison group of patients with upper-extremity impairments (n=138) were recruited from outpatient physical therapy facilities. Methods. The association of single and combined test results with innominate torsion (calculated from pelvic landmark data) and with presence or absence of low back pain were estimated via odds ratios, sensitivities, specificities, and predictive values. Results. Individual test sensitivities were low (8%-44%), as were negative predictive values (28%-38%), for identifying the presence of innominate torsion. Combining tests and controlling for sex, age group, leg-length difference, or iliac crest level did not improve performance characteristics. The associations of test results with low back pain were weak, with the exception of the Gillet test (odds ratio=4.57). Conclusion and Discussion. The data do not support the value of these tests in identifying innominate torsion, although the use of these tests for identifying other phenomena (eg, sacroiliac joint hypomobility) cannot be ruled out. Further exploration of the association of Gillet test results with low back pain is warranted.

Sacroiliac joint dysfunction is one of a variety of labels that have evolved since the turn of the century to describe a fairly broad and poorly defined group of signs and symptoms that are usually thought to arise from the pelvic ring and surrounding structures. Although recent studies1–3 have provided evidence that the sacroiliac joint may be a source of low back pain (LBP) by demonstrating symptom reduction after intra-articular injection of local anesthetic, the source of pain or the tissues involved remain unsubstantiated. One hypothesis is that pain arises from tissues in the pelvis or the low back area that are being stressed by asymmetry within the pelvic ring. Anterior or posterior displacement (torsion) of one of the innominates may cause a positional change within one or both sacroiliac joints. This change may potentially stress the structures attached to the innominates or within the sacroiliac joints. Another theory of sacroiliac joint pain is that sacroiliac joint hypomobility, with or without concomitant innominate torsional asymmetry, may cause LBP. This theory appears to assume that a hypomobile sacroiliac joint may stress surrounding or intervening tissues if one or both sacroiliac joints fail in their presumed function of dissipating force from the head and trunk above or from the ground below.

The 2 hypotheses as to what causes sacroiliac pain appear to be the bases for the classification of LBP as being due to iliosacral dysfunction,4 sacroiliac joint dysfunction,5–8 lumbosacral dysfunction,9 sacroiliac joint malalignment,10 sacroiliac hypermobility or hypomobility,11 or sacroiliac regional pain.12 Each of these classification or diagnostic schemes is based on the assumption that sacroiliac joint dysfunction can be identified by use of tests to assess either innominate torsional asymmetry or sacroiliac joint hypomobility. The common tests include determination of posterior superior iliac spine (PSIS) level in a standing or sitting position, the Gillet test (also known as the march or stork test), the standing flexion test, the sitting flexion test (or Piedallu's sign), and the supine-to-sit test. These tests are also widely promoted as part of a LBP examination in orthopedic, osteopathic, physical therapy, and chiropractic educational texts.13–21 Yet, there is neither consensus on nor evidence to support the underlying hypotheses on which these tests are based.

Although mechanisms to assess sacroiliac joint motion do not currently exist, investigators using these tests and those promoting test use in texts often suggest using one or more of what they call “dynamic tests” (ie, standing flexion test, sitting flexion test, Gillet test, and supine-to-sit test) to detect hypomobility or motion asymmetry of the sacroiliac joints.9,17–19,21 Some authors4,14,15,22 have argued that the sitting flexion test detects hypomobility of the sacrum on the ilium, whereas the standing flexion test detects hypomobility of the ilium on the sacrum. Other authors10,15 have argued that one or more of these tests can be used to detect the side of anterior or posterior innominate torsion. Bemis and Daniel4 found the supine-to-sit test result to be related to a diagnosis of iliosacral dysfunction (innominate torsion). They diagnosed iliosacral dysfunction using a composite finding of positive standing PSIS asymmetry, a positive standing flexion test, and a negative sitting flexion test. Delitto et al8(p478) and Cibulka and colleagues5,23(p920),24 used a combination of 4 tests (3 of which must have positive findings) to determine whether a person has sacroiliac joint dysfunction. Three of these tests were determination of PSIS asymmetry in a sitting position, the standing flexion test, and the supine-to-sit test. Cibulka and Koldehoff24 proposed that a positive standing flexion test indicated hypomobility, whereas a positive supine-to-sit test indicated both abnormal movement and malalignment (innominate torsion). Sitting PSIS asymmetry was used to detect malalignment.24 Cibulka, in a published case study, reported that the composite of 4 tests was used to “determine whether innominate bone rotation was present.”23 Delitto and colleagues8 used the tests as part of their LBP classification system, but they did not discuss the conceptual basis for the tests. They stated that the tests are “purportedly directed toward dysfunction of the sacroiliac joints”8 and that they “prefer to state that a positive composite is indicative of need for a specific manipulation technique.”8

The Standards for Tests and Measurements in Physical Therapy Practice25 specify that research reports or scholarly articles should address the theoretical basis for tests that are used and should include a discussion of the evidence relating to the construct validity and content validity of the tests. The Standards further note that tests proposed to classify people into diagnostic groups should include essential elements to allow for interpretation, including sensitivity, specificity, and predictive value. It is clear that judgments of PSIS asymmetry, the Gillet test, the standing flexion test, and the supine-to-sit test have not met these standards and that the specified information is generally unavailable. One obvious reason for this dearth of information is the lack of a gold standard against which altered static or dynamic 3-dimensional relationships within and around the sacroiliac joints and pelvic ring can be assessed. Given the bicompartmental anatomy and complex spatial relationships of the sacroiliac joint,26 it is not surprising that traditional imaging procedures to date have been unable to provide a noninvasive gold standard against which innominate torsion or sacroiliac motion can be assessed. No studies could be found that proposed a noninvasive external standard of sacroiliac hypomobility that did not rely on clinical judgments of positive-negative findings using unvalidated test outcomes. In contrast, an acceptable standard for assessing innominate torsion may be available.

Pitkin and Pheasant27 first proposed a mechanism for measuring unilateral innominate inclination by assessing pelvic landmarks. Their method or slightly modified forms of the method were subsequently utilized and accepted by other researchers5,28–32 as appropriate for assessing either unilateral innominate inclination or side-to-side innominate differences (innominate torsion). Although there is no external standard against which to validate this technique, assessing the inclination of anterior superior iliac spine (ASIS) and PSIS landmarks unilaterally or bilaterally would appear conceptually to be valid for assessing innominate inclination. The technique also is a more reliable way of assessing innominate inclination than typically found through palpation and clinical judgment alone. Using this measurement technique, Walker and colleagues33 found good intratester reliability of .84; other researchers,31,34 however, found stronger intertester intraclass correlation coefficients (ICCs) of .94 to .96. If the reliability and validity of data obtained with this technique are accepted, we would appear to have a standard against which the Gillet, standing flexion, sitting flexion, and supine-to-sit tests can be assessed as measures of static or positional innominate torsion. Although such an assessment ignores the issue of sacroiliac joint hypomobility, there does not appear to be any viable method for addressing this problem. Given the equivocal basis for these tests, we gain information even if we are only able to rule in or rule out one aspect of their performance. As noted by Rothstein, “All evidence has limitations, but whatever those limitations may be, data are far better than debates that are more about theology than they are about health care.”35(p1044) The intent of this study, therefore, was to explore whether the construct of innominate torsional asymmetry was related to the outcome of 4 common clinical tests of sacroiliac dysfunction.

In my study, I used a cross-sectional approach with a sample of adult patients seeking physical therapy services: (1) to assess the magnitude of the association between innominate torsion and the results of 4 clinical tests of sacroiliac joint dysfunction, (2) to estimate the performance characteristics (sensitivity, specificity, positive predictive value, and negative predictive value) of these tests in identifying patients with innominate torsion, and (3) to assess the magnitude of association between the results of the clinical tests and nonspecific LBP of less than 1 year's duration.

Method

Choice of Clinical Tests

I identified 4 commonly used clinical tests of sacroiliac joint dysfunction as the focus of this study: (1) the Gillet test, (2) the standing flexion test, (3) the sitting flexion test, and (4) the supine-to-sit test. This study was part of a larger study I conducted to investigate the association between estimated innominate torsional asymmetry and LBP.36 In the larger study, as well as in this study, PSIS levels (or asymmetry) in standing and sitting positions, unlike the other 4 tests, were measured rather than assessed using clinical judgment. There also does not appear to be any argument that PSIS asymmetry estimates static (positional) changes in the innominates rather than joint mobility. I also did not include clinical tests designed to provoke symptoms, because provocation tests are used to determine whether the sacroiliac region is a source of pain rather than to identify innominate torsion (or hypomobility).

Subjects

I recruited a sample of adult patients seeking physical therapy services for this study. I chose a clinic-based sample of patients with and without LBP because population-based subjects may differ in unknown ways from those who actually seek medical attention, expend health care dollars, and are managed by health care practitioners. I recruited all subjects from the same facilities so that they would be as alike as possible on uncontrolled variables such as geographical distribution, socioeconomic group, health care access, and willingness to seek medical care. I set a lower age limit of 21 years to target subjects who had reached skeletal maturity. I set an upper age limit of 50 years in an attempt to reduce the prevalence of sacroiliac degenerative changes that are thought to reduce sacroiliac mobility (and, perhaps, torsion) in subjects after age 50 years.37–39

The subjects with LBP were patients who had been referred for physical therapy for LBP of less than 1 year's duration. I excluded patients with LBP of greater than 1 year's duration because some experts contend that the pain and disability experienced by people with LBP becomes dissociated over time from the original physical basis of the problem.40 A comparison group of subjects consisted of patients referred for physical therapy for an upper-extremity problem whose diagnosis did not appear to me to indicate that the problem was neck- or back-related (eg, thoracic outlet syndrome). I excluded people with upper-extremity problems who had been treated for LBP within the previous year or had experienced activity limitation due to LBP for more than a few days in the previous year. This was done to avoid including patients with low back dysfunction among the comparison group. Exclusionary criteria for both groups are presented in Table 1.

Table 1.

Exclusionary Criteria for Subjects With and Without Low Back Pain (LBP)

A sample size of 150 subjects with LBP and 150 subjects without LBP was targeted. The number of subjects was estimated for the larger study to obtain a power of at least 80%.36 All subjects were recruited through outpatient physical therapy facilities in 7 hospitals and 32 private practices serving a range of inner city and suburban communities in the metropolitan Boston area. Appointments for data collection were made at the participating facility most convenient to the subject. Subjects received $25 for participation. Recruitment and enrollment were continued until the target sample of 150 subjects with LBP and 150 subjects without LBP was reached.

Data Collection

During the data collection session, I or a research assistant obtained informed consent and had each subject complete a self-administered questionnaire from which descriptive data were obtained. I conducted a physical examination in a fixed order for all subjects. The examination consisted of measurements of the height of the pelvic landmarks, iliac crest level, and leg lengths and the 4 clinical tests of sacroiliac joint dysfunction. I served as the only examiner to avoid interrater reliability issues. At the beginning of data collection, I was usually unaware of whether subjects had LBP. To reduce the possibility that patients' behavior would indicate whether they had LBP, I treated all subjects as if they had LBP. All subjects were unknown to me, and I performed only the measurements described in this article. Consequently, my judgments of positive or negative test results were not influenced by patient history or other evaluative findings.

Measured Innominate Torsion

I began the determination of innominate torsion by first measuring the heights of anterior (ASIS) and posterior (PSIS) pelvic landmarks on each side. To obtain height measurements, I hooked my thumb beneath each bony prominence while the horizontal arm of a pedestal-mounted post was brought to a line marking the midpoint of my thumbnail (Fig. 1 and Fig. 2). These height values were then used to calculate the difference between PSIS and ASIS heights on the right innominate (right PSIS − right ASIS) and the difference between PSIS and ASIS landmarks on the left innominate (left PSIS − left ASIS). To determine innominate asymmetry, I calculated the absolute difference between the right PSIS/ASIS difference and the left PSIS/ASIS difference. That is, I estimated innominate torsional asymmetry using the following formula: Estimated Innominate Torsion = Absolute [(Right PSIS − Right ASIS) − (Left PSIS − Left ASIS)]. The method of landmark measurement and the innominate torsion calculation I chose were based on principles and techniques used and accepted by other researchers29–31,33,34 as reliable and as representative of innominate inclination. To obtain estimates of reliability and standard error of the measurement for locating anatomical landmarks and calculated innominate torsion, I measured each landmark height twice. The height markings on the post were out of my line of sight and were not observed until the position of the horizontal arm had been set. The arm was dropped down after recording the first measurement and repositioned for the second measurement, again without being able to see the height markings until after the arm was fixed in position. In addition to obtaining repeated landmark measurements, I calculated innominate torsion first using the first set of landmark heights and then again using the second set of landmark heights. For the remaining analyses, the average of the 2 measurements was used for calculations. I chose to consider the standard error of the measurement to be the cutoff value for identifying the presence or absence of innominate torsion. That is, any calculated torsion greater than the standard error would be positive torsion (asymmetric innominates) and any torsion at or below the standard error would be negative torsion (symmetric innominates). The standard error of the innominate torsion measurement was estimated as the square root of the mean square error from the repeated-measures analysis of variance.41
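The torsion formula and the cutoff rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the class and function names are hypothetical, the heights are invented values in millimeters rather than study data, and averaging the two sets of landmark heights before applying the formula is an assumption about how "the average of the 2 measurements" was used.

```python
from dataclasses import dataclass


@dataclass
class LandmarkHeights:
    """Heights of the 4 pelvic landmarks in millimeters (hypothetical container)."""
    right_psis: float
    right_asis: float
    left_psis: float
    left_asis: float


def average_heights(first: LandmarkHeights, second: LandmarkHeights) -> LandmarkHeights:
    """Average 2 repeated sets of landmark heights (assumed averaging step)."""
    return LandmarkHeights(
        right_psis=(first.right_psis + second.right_psis) / 2,
        right_asis=(first.right_asis + second.right_asis) / 2,
        left_psis=(first.left_psis + second.left_psis) / 2,
        left_asis=(first.left_asis + second.left_asis) / 2,
    )


def estimated_torsion(h: LandmarkHeights) -> float:
    """Absolute [(Right PSIS - Right ASIS) - (Left PSIS - Left ASIS)]."""
    right_diff = h.right_psis - h.right_asis
    left_diff = h.left_psis - h.left_asis
    return abs(right_diff - left_diff)


def is_positive_torsion(torsion_mm: float, cutoff_mm: float) -> bool:
    """Positive only when torsion exceeds the cutoff; at or below it is negative."""
    return torsion_mm > cutoff_mm


# Invented heights (mm) for one subject, measured twice; not study data.
first = LandmarkHeights(1000.0, 1010.0, 1002.0, 1004.0)
second = LandmarkHeights(1002.0, 1010.0, 1002.0, 1004.0)
torsion = estimated_torsion(average_heights(first, second))
```

The strict inequality in `is_positive_torsion` mirrors the stated rule that torsion at or below the standard error counts as negative.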

Figure 1.

A pedestal-mounted post with an adjustable horizontal arm was used to measure heights of palpated landmarks (in millimeters).

Figure 2.

As the examiner actively hooked her thumb beneath the landmark, the horizontal arm was brought to the height of a mark bisecting the examiner's thumbnail longitudinally. Once the arm position was secured, the height reading was obtained off the graduated post.

In addition to measuring PSIS and ASIS heights and calculating innominate torsion, I also measured leg lengths and determined iliac crest levels while subjects were standing and iliac crest levels while subjects were sitting. These variables allowed me to explore whether asymmetry of leg lengths or iliac crests might affect the results of the tests. I measured leg lengths with the subjects positioned supine, using a cloth tape measure to measure the distance from the ASIS to the lateral malleolus.42 Leg lengths were each measured twice to permit estimates of reliability. The markings on the tape measure were not observed for either the first or second measurement until my hand positions on the tape measure were set and the tape measure was removed from the subject. Iliac crest level was assessed using a crest level tester (Med-level Model M2000*).43,44 The crest level tester was placed around the subject's waist from behind and firmly seated on top of the iliac crests by feel alone (Fig. 3). Once the crest level tester was firmly seated on the iliac crests, I visually assessed the position of the bubble in the level. If more than half of the bubble was within the central marks, I considered the crests to be symmetric (negative); if the bubble was more than halfway outside the markings to the left or right, I considered the crests to be asymmetric (positive). I repeated the procedure from the beginning once or twice and recorded the determination that was reached twice. I used the same procedure for assessing posterior iliac crest level while the subjects were sitting. Repeated assessments were not used for reliability estimates because I could not be blinded to the initial measurement.

Figure 3.

Posterior iliac crest symmetry was assessed using a crest level tester. The arms of the tester were securely seated on top of the iliac crests, and then the bubble level was observed to make a determination of iliac crest asymmetry.

Clinical Test Performance

I performed the 4 clinical tests and, based on the measurements, judged each test to be positive or negative. Repeated test assessments were not recorded because I could not be blinded to the first outcome result when obtaining a second measurement during the same (and only) data collection session. I repeated a test only when I believed that the first result was equivocal and, as I believe is done in the clinical setting, made a single judgment of the result. I performed the right and left Gillet tests using a commonly advocated technique and decision criteria.15,45 If the measurement was positive on either the right or the left side, the Gillet test was considered positive. The standing flexion test and the sitting flexion test were conducted using a common set of techniques and decision criteria,5,15 as was the supine-to-sit test.4,5,8

Data Analysis

I computed estimates of the association between clinical test results and measured innominate torsion and between clinical test results and LBP using an odds ratio (ÔR) calculated from a 2 × 2 contingency table. An odds ratio of 1.0 indicates no association (similar odds of LBP or innominate torsion among those subjects with and without a positive test). An odds ratio of greater than 1.0 connotes a direct association of test results with innominate torsion or LBP, and an odds ratio of less than 1.0 connotes an inverse association. All odds ratios were calculated with 95% confidence intervals (CIs). The 95% CI is an indication of the precision of the estimated odds ratio. A wide interval indicates a relatively imprecise estimate. The 95% CI can also be used to judge the statistical significance of the estimated odds ratio. When the null value for the odds ratio (1.0) lies within the 95% CI, the corresponding probability (P) value for the odds ratio will be greater than .05. The more centrally the null value lies in the interval, the larger the corresponding probability value.
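As an illustration of the odds ratio and its interval, the sketch below computes both from a 2 × 2 table. The article does not state which interval method was used; the log-based (Woolf) interval here is an assumption chosen for simplicity, the function name is hypothetical, and the counts are invented rather than taken from the study.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2 x 2 table.

    a = test positive, condition present
    b = test positive, condition absent
    c = test negative, condition present
    d = test negative, condition absent
    The interval uses the Woolf (log) method, which is an assumption here.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Invented counts, not study data.
estimate, lower, upper = odds_ratio_with_ci(20, 10, 10, 20)
null_inside_interval = lower <= 1.0 <= upper
```

When `null_inside_interval` is true, the interval includes the null value of 1.0, which corresponds to a probability value greater than .05 for the association.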

I assessed the effects of several variables as possible effect modifiers or confounders of the association between test results and either measured innominate torsion or LBP. Effect modification exists when subgroups or strata differ on the association under study. For example, the association between test results and innominate torsion may be different for younger subjects than it is for older subjects. That is, age group may “modify” the effect of the association. To determine effect modification, the test result/innominate torsion association is computed separately for younger subjects and for older subjects. If the 2 estimated odds ratios differ substantively, then age group is modifying the relation.46 Confounding exists when an uncontrolled variable (eg, age group) is independently related to each of the 2 primary variables under study and, through those relations, distorts the association between the primary variables.

Using a hypothetical example, age group would produce confounding of the test result/innominate torsion association if it were associated with test results (eg, older people are less flexible and have more positive test results) and independently associated with innominate torsion (eg, older people have more torsion).46 The association between age group and test results and the association between age group and torsion would result in an apparent, but inaccurate, association between test results and innominate torsion. That is, age group would confound the test result/innominate torsion association. Standardized odds ratios (SÔRs) for the association between test results and LBP and for the association between test results and innominate torsion were computed as summary measures across strata of a possible confounding variable.46 Continuing the hypothetical example, odds ratios for the association between innominate torsion and test results would be determined for younger subjects and separately for older subjects. The stratification first removes the effect of age group. A standardized odds ratio is calculated from the 2 age group-specific odds ratios and gives a summary (or weighted) estimate of the association between innominate torsion and test results across the 2 age groups, that is, controlling for age group. If the standardized odds ratio differs substantively from the original (crude) odds ratio for the association of innominate torsion and test results (before younger and older subjects were separated), then age group should be considered as a possible confounder. That is, age group should be considered as potentially biasing the estimate of the association between innominate torsion and test results. To compute the standardized odds ratios, I chose the distribution of the potential confounding variable among those subjects with negative innominate torsion as the standard.
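The stratification logic in the hypothetical age-group example can be sketched as follows. The counts are invented to mimic that example and are not study data. The summary estimate shown is the Mantel-Haenszel odds ratio, which is a different, widely used summary measure substituted here for illustration; it is not the standardized odds ratio used in this study, and the function names are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2 x 2 table (a, b = test positive; c, d = test negative)."""
    return (a * d) / (b * c)


def crude_table(strata):
    """Collapse the strata into a single table by summing each cell."""
    return tuple(sum(cells) for cells in zip(*strata))


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata (swapped-in technique)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Invented counts: (test+/torsion+, test+/torsion-, test-/torsion+, test-/torsion-).
# Older subjects have both more positive tests and more torsion.
strata = {
    "21-34": (4, 16, 16, 64),
    "35-50": (64, 16, 16, 4),
}
tables = list(strata.values())

stratum_ors = {group: odds_ratio(*cells) for group, cells in strata.items()}
crude_or = odds_ratio(*crude_table(tables))
summary_or = mantel_haenszel_or(tables)
```

In this invented data set the odds ratio is 1.0 within each age group, yet the crude odds ratio is about 4.5. The gap between the crude and the summary estimate is the kind of difference that would flag age group as a possible confounder, whereas unequal stratum-specific odds ratios would instead suggest effect modification.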

Sex and age group were studied for possible effect modification or confounding of the primary associations being studied. Differences between the pelvises of male and female subjects or changes in the sacroiliac joints with age may affect test results or innominate torsion. I also examined leg-length differences and either standing or sitting iliac crest level as possible effect modifiers or confounding variables because the standing or sitting asymmetry of these variables may affect the results of standing or sitting tests and may affect innominate torsion. I computed sensitivity, specificity, a positive predictive value, and a negative predictive value for each measure to determine whether the tests were useful in identifying those subjects with and without innominate torsion. I did not examine the validity characteristics of the tests in identifying those subjects with LBP (as opposed to innominate torsion) because it is not the intent of any of these tests to differentiate between subjects with and without back pain. I conducted computations and statistical analyses using Statistix Analytic Software for Windows47 and Microsoft Excel 7.0.48
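The 4 performance characteristics follow directly from the cells of a 2 × 2 table that cross-classifies test result against measured torsion. The sketch below uses the standard definitions; the function name is hypothetical and the counts are invented, not study data.

```python
def performance_characteristics(tp, fp, fn, tn):
    """Sensitivity, specificity, and predictive values from a 2 x 2 table.

    tp = test positive, torsion present    fp = test positive, torsion absent
    fn = test negative, torsion present    tn = test negative, torsion absent
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("a row or column total is zero")
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with torsion
        "specificity": tn / (tn + fp),  # negatives among those without torsion
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
    }


# Invented counts, not study data.
result = performance_characteristics(tp=20, fp=10, fn=80, tn=40)
```

With these invented counts the test is fairly specific (0.80) but insensitive (0.20), and a negative result says little about the absence of torsion (negative predictive value of about 0.33).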

Results

Sample

I enrolled a total sample of 150 subjects with LBP and 150 subjects without LBP (comparison group) over a 27-month period. Data from 4 subjects in the comparison group were discarded because of data recording omissions or incomplete entries. I also dropped from the data set 8 subjects from the comparison group who reported experiencing LBP on the day of testing. Thus, for the study, there were 150 subjects with LBP and 138 subjects without LBP. I was not able to obtain innominate torsion measurements for 14 subjects, predominantly due to their obesity. I therefore used a total of 141 subjects with LBP and 133 subjects without LBP for the analyses involving innominate torsion measurements. Descriptive data for the subjects are presented in Table 2.

Table 2.

Descriptive Statistics for Subject Demographic Characteristics

Reliability and Standard Errors of the Measurements

I did not measure reliability for the clinical tests because the second set of measurements could not be obtained blindly. Standard errors of the measurement and ICCs (3,1) for repeated measurements of landmarks, innominate torsion, PSIS asymmetry and ASIS asymmetry in a standing position, and leg-length differences are reported in Table 3. The form of ICC that I used assumes a single rater (fixed-effects model) and use of single values rather than mean values.49 The ICCs for innominate and landmark asymmetry data were .61 to .75. Although some people might consider these values as indicating moderate reliability,50 the estimated standard error of measurement for each of these measurements is small. The relatively low ICCs, in my opinion, appear to be attributable not to large error, but rather to the small variability in these values that resulted from calculation of side-to-side difference scores. Unlike the standard error of the measurement, ICCs are sensitive to variability among subjects. Values for ICCs decrease as intersubject variability decreases. The side-to-side difference scores in this study were based on measured landmark heights and leg lengths. When ICCs for these more variable values were calculated, the ICCs exceeded .99. Yet, the standard errors of the measurement were approximately the same for the landmark height and leg-length measurements, as they were for the side-to-side difference calculations (2.7–3.2 mm). Similarities in standard errors of the measurement and differences in ICC values appear to indicate bias in ICC values for side-to-side difference scores resulting from reduced data variability.
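The point that ICCs fall as intersubject variability shrinks, while the standard error of the measurement does not, can be illustrated with a small sketch. It assumes the usual two-way analysis of variance formulation of ICC (3,1), namely (BMS − EMS) / (BMS + (k − 1) EMS), and takes the standard error of the measurement as the square root of the error mean square. The function name is hypothetical and the scores are invented, not study data.

```python
import math


def icc_3_1_and_sem(scores):
    """ICC (3,1) and standard error of measurement from repeated measurements.

    scores: one row per subject, one column per repeated measurement.
    """
    n = len(scores)        # subjects
    k = len(scores[0])     # repeated measurements per subject
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 measurements")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_error = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    bms = ss_subjects / (n - 1)              # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))     # error mean square
    icc = (bms - ems) / (bms + (k - 1) * ems)
    sem = math.sqrt(ems)
    return icc, sem


# Invented data: the same measurement error, with wide and narrow spread among subjects.
wide = [[10, 12], [20, 18], [30, 31]]
narrow = [[10, 12], [12, 10], [12, 13]]
icc_wide, sem_wide = icc_3_1_and_sem(wide)
icc_narrow, sem_narrow = icc_3_1_and_sem(narrow)
```

The `narrow` rows are the `wide` rows with a constant subtracted from each subject, so the error term is identical and only the spread among subjects changes. The standard error of the measurement is the same for both data sets, while the ICC drops sharply for the narrow one.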

Table 3.

Standard Error of the Measurements and Intraclass Correlation Coefficients (ICC [3,1]) for Repeated Landmark, Innominate Torsion, and Asymmetry Data

The standard error of the measurement for calculated innominate torsion was rounded up, and 6 mm was used as the cutoff point to differentiate between subjects with and without innominate torsion. Approximately 35% of the subjects were considered negative for calculated innominate torsion (torsion ≤6 mm). The standard errors of the measurement for standing and sitting PSIS asymmetry, standing ASIS asymmetry, and leg-length discrepancy were likewise rounded up, and 4 mm was used as the cutoff point for each of these. For standing PSIS asymmetry, sitting PSIS asymmetry, standing ASIS asymmetry, and leg-length difference, negative results (≤4 mm) were obtained for 38%, 39%, 35%, and 40% of the subjects, respectively.

Associations of Test Results With Innominate Torsion

I determined the frequencies of positive and negative test results for each test among those subjects with positive and negative innominate torsion, as well as the odds ratios and 95% CIs (Tab. 4). Missing values resulted from the subjects' refusal or inability to be tested or from inability to palpate landmarks due to obesity. Odds ratios and 95% CIs for the association between test results and innominate torsion for the Gillet, standing flexion, sitting flexion, and supine-to-sit tests were 1.07 (95% CI=0.42, 2.74), 0.81 (95% CI=0.43, 1.54), 1.01 (95% CI=0.41, 2.47), and 1.37 (95% CI=0.80, 2.33), respectively. Only the odds ratio of the supine-to-sit test was much above 1.0, and the CI for this slightly higher odds ratio was wide (indicating imprecision of the estimate).

Table 4.

Association of Test Results With Static Innominate Torsion: Frequencies, Odds Ratio (ÔRs) and 95% Confidence Intervals (CIs), Sensitivity, Specificity, Positive Predictive Values (PVs), and Negative Predictive Values

The sensitivity, specificity, positive predictive value, and negative predictive value for the ability of each test to identify subjects with positive innominate torsion and subjects with negative innominate torsion are also presented in Table 4. As might be expected from the estimated odds ratios, test sensitivity and negative predictive values were uniformly low (8%-44%).

I examined the data for possible effect modification and confounding of the innominate torsion/test result association by sex, age group, supine leg-length difference, and both standing and sitting iliac crest asymmetry. For age group, subjects were considered 21 to 34 years of age or 35 to 50 years of age (mean age was 35 years). There was no evidence of important effect modification or confounding for any of the examined variables. That is, none of the examined variables appeared to be either inflating or masking a relationship between innominate torsion and test results.

Because detection of innominate torsion is only really an issue for people with LBP, I examined the ability to use the tests to differentiate between subjects with LBP with positive innominate torsion and those with negative innominate torsion (dropping the comparison group). The association between test results and innominate torsion and the sensitivity, specificity, positive predictive values, and negative predictive values for this subgroup of subjects with LBP were not substantively different from those in the full data set.

Alternative Analyses

I explored the association of innominate torsion with positive findings on 2 or more tests. According to some authors,51–53 the results of 2 or more dichotomous tests used in parallel will be more sensitive than the results of a single test, assuming that the measures reflect complementary phenomena rather than redundant data. Of those subjects with positive innominate torsion, 13.4% (n=21) had positive results on 2 or more tests, whereas 11.5% (n=10) of those subjects without torsion had positive results on 2 or more tests. Two subjects, one with positive innominate torsion and one with negative innominate torsion, had positive results on 3 of the 4 tests. No one had positive results on all 4 tests. The odds ratio for the association between innominate torsion and 2 or more positive tests was 1.40 (95% CI=0.72, 2.71). The cumulative test performance was no stronger than the performance of the supine-to-sit test alone.
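The "2 or more positive tests" criterion amounts to counting positive results per subject and applying a threshold. A minimal sketch follows; the dictionary keys, the function names, and the handling of a missing result (counted as not positive) are assumptions made for illustration, and the subject record is invented.

```python
TESTS = ("gillet", "standing_flexion", "sitting_flexion", "supine_to_sit")


def count_positive(results):
    """Count tests recorded as positive; a missing result is not counted."""
    return sum(1 for name in TESTS if results.get(name) is True)


def composite_positive(results, minimum=2):
    """True when at least `minimum` of the 4 tests are positive."""
    return count_positive(results) >= minimum


# Invented record for one subject; the sitting flexion result is missing.
subject = {"gillet": False, "standing_flexion": True, "supine_to_sit": True}
flag = composite_positive(subject)
```

The resulting flag can then be cross-classified against measured torsion in a 2 × 2 table in the same way as a single test result.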

I also examined the criterion test results that Bemis and Daniel4 claimed could be used to diagnose iliosacral dysfunction (innominate torsion). These investigators used a combination of a positive standing PSIS asymmetry finding, a positive standing flexion test finding, and a negative sitting flexion test finding to diagnose individuals with innominate torsion; those subjects with negative findings on all 3 tests were assigned to a comparison group. The resulting odds ratio applying these same criteria to data from this study was 1.19 (95% CI=0.45, 3.18). Although I measured PSIS asymmetry rather than using personal judgment alone as did Bemis and Daniel, the combination of test results recommended by Bemis and Daniel did not strengthen the association between test results and innominate torsion.

As an additional set of alternative analyses, I assessed the association of test results with alternative measures of innominate asymmetry other than estimates of innominate torsion. This was done to explore whether other variables reflecting asymmetry might be more strongly associated with test results than innominate torsion as I estimated it. The alternative measures of innominate asymmetry included PSIS and iliac crest asymmetry in standing and sitting positions, ASIS asymmetry in a standing position, and supine leg-length difference. These alternative measures include most, if not all, of the landmarks commonly used in the clinic to assess innominate torsion. I measured these alternative asymmetry variables and used fixed cutoff points to determine positive or negative findings, as opposed to using personal judgment as is often done in the clinic. Findings for PSIS asymmetry, ASIS asymmetry, and leg-length difference were considered negative if asymmetry was 4 mm or less and positive if asymmetry exceeded 4 mm. Only the Gillet test showed a potentially important increase in association with one of these variables, but this finding was not statistically significant. There were no substantive changes in the relationship of the remaining test results with any of the explored alternative asymmetry variables. The Gillet test results showed a stronger positive association with standing iliac crest level (ÔR=2.22 [95% CI=0.90, 5.47]) than with innominate torsion (ÔR=1.07 [95% CI=0.42, 2.74]), although the stronger association is still relatively imprecise and not statistically significant at P ≤.05. Given this stronger estimate of association, I recalculated sensitivity, specificity, and predictive values for the Gillet test with standing iliac crest asymmetry. Gillet test performance characteristics did not change in any substantive way from those for calculated innominate torsion.

Association of Test Results With LBP

In the final set of analyses, I determined the frequencies of positive and negative test results for each test among the subjects with and without LBP, as well as the estimated crude odds ratios and 95% CIs for the associations (Tab. 5). The data indicate a strong positive association between Gillet test results and LBP (ÔR=4.57 [95% CI=1.51, 13.86]). Confidence intervals, however, were wide due, in part, to the small number of positive test results. The sitting flexion test and supine-to-sit test also showed a positive association with LBP (ÔR=1.52 [95% CI=0.63, 3.64] and 1.23 [95% CI=0.75, 2.02], respectively), but the odds ratios were substantively weaker than for the Gillet test and CIs revealed considerable imprecision. The standing flexion test was not positively associated with LBP (ÔR=0.77 [95% CI=0.42, 1.42]).

Table 5.

Association of Test Results With Low Back Pain: Frequencies, Odds Ratios (ÔRs), and 95% Confidence Intervals (CIs)

I performed a separate set of analyses for subjects with LBP reporting symptoms 3 months or less in duration and for subjects reporting symptoms more than 3 months to 12 months in duration. The duration of symptoms made a difference in the relationship between test results and LBP only for the sitting flexion test. The odds ratio comparing subjects with LBP whose symptoms had lasted more than 3 months to 12 months with the full comparison group was 0.82 (95% CI=0.21, 3.14). The odds ratio of 2.01 (95% CI=0.80, 5.07) comparing subjects with LBP whose symptoms had lasted 3 months or less with the full comparison group was substantively higher, but the 95% CI indicates a fairly imprecise estimate. There is, therefore, some indication of an association between the sitting flexion test and acute LBP, but the data are inconclusive.

I examined the data for possible effect modification and confounding of the association between test results and LBP by sex, age group, leg-length difference, and iliac crest asymmetry. The differences between crude, stratum-specific, and standardized odds ratios were never of sufficient magnitude to argue that these variables were effect modifiers or confounders of the association between test results and LBP, with 2 exceptions. For the Gillet test, there was evidence of effect modification by sex and by age group. There was no positive Gillet test result among male subjects without LBP, resulting in an infinite value for the odds ratio (due to a zero cell). For female subjects, the estimated odds ratio was 2.78 (95% CI=0.85, 9.11). The odds ratio for younger subjects (<35 years of age) was 2.05 (95% CI=0.51, 8.28), whereas the estimate for older subjects (35–50 years of age) was 12.25 (95% CI=1.57, 95.55). Although the odds ratio for older subjects was high, the substantial width of the CI reflects the single positive Gillet test finding among older subjects without LBP. The sitting flexion test also showed some evidence of effect modification by sitting iliac crest level. The association between LBP and sitting flexion among those subjects with level iliac crests in a sitting position was 0.78 (95% CI=0.25, 2.46), but this association was 4.44 (95% CI=0.90, 21.83) among those subjects with asymmetric iliac crests in a sitting position. The estimated association between sitting flexion test results and LBP among subjects with asymmetric iliac crests in a sitting position had an extremely wide CI because there were only 2 positive sitting flexion test findings among the subjects without LBP.

I explored the association between LBP and positive results on 2 or more tests as I did for innominate torsion. Subjects with missing data were dropped from the analyses. Of the 115 subjects with LBP and complete test data, 20.0% (n=23) had positive results on 2 or more tests. Of the 132 subjects without LBP and complete test data, 15.2% (n=20) had positive findings on 2 or more tests. Two subjects with LBP and 1 subject from the comparison group had positive findings on 3 of the 4 tests. No one had a positive finding on all 4 tests. The association between LBP and 2 or more positive test results was ÔR=1.40 (95% CI=0.72, 2.71). This value is similar to those found for single test results.
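The odds ratio above can be checked directly from the reported frequencies. The sketch below assumes a crude odds ratio with a Woolf (log-based) confidence interval; this section does not state which interval method was used, so that choice is an assumption, as is the function name `odds_ratio_ci`. With the counts given in the text, it yields 1.40 (0.72, 2.71), in agreement with the reported estimate.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Crude odds ratio and Woolf (log-based) CI for a 2x2 table.

    a: outcome present, test positive   b: outcome present, test negative
    c: outcome absent,  test positive   d: outcome absent,  test negative
    """
    if min(a, b, c, d) == 0:
        # A zero cell makes the odds ratio 0 or infinite and the
        # log-based interval undefined.
        raise ValueError("zero cell: odds ratio or its interval is undefined")
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


# Counts reported in the text: 23 of 115 subjects with LBP and
# 20 of 132 subjects without LBP had 2 or more positive tests.
or_hat, lower, upper = odds_ratio_ci(a=23, b=115 - 23, c=20, d=132 - 20)
print(f"OR={or_hat:.2f} (95% CI={lower:.2f}, {upper:.2f})")
# OR=1.40 (95% CI=0.72, 2.71)
```

The standard error term depends on the reciprocal of each cell count, which is why a cell with only 1 or 2 observations produces the very wide intervals seen in the stratified analyses, and why a zero cell leaves the estimate undefined.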

I wanted to explore whether treatment for the subjects with LBP may have reduced the number of positive test findings, thereby reducing the association between test results and LBP. Although I was unable to determine whether treatment had changed test results, I removed from the analysis all data obtained from subjects with LBP who identified themselves as better than when they decided to seek treatment. Although removing these subjects' data did not necessarily eliminate all data obtained for subjects who might have had treatment, it did eliminate the data of those subjects for whom treatment (or time) provided pain relief. In this subgroup, the association between the Gillet test and LBP became stronger (ÔR=6.85 [95% CI=2.01, 23.33]), although the CI widened due to the smaller numbers. The association of 2 or more positive test findings increased to an odds ratio of 2.07 (95% CI=0.87, 4.94).

Association of Tests With Each Other

As a final set of analyses, I explored the relationship among test results to determine whether 2 or more of the tests appeared to be assessing a similar phenomenon. The odds ratios and 95% CIs for these associations are presented in Table 6. Only the association between the results of the sitting flexion test and the results of the supine-to-sit test appeared to be potentially important (ÔR=2.04 [95% CI=0.82, 5.03]).

Table 6.

Association of Test Results With Each Other: Odds Ratios and 95% Confidence Intervals

Discussion

Association of Test Results and Innominate Torsion

My data suggest that the Gillet test, standing flexion test, sitting flexion test, and supine-to-sit test do not appear useful in identifying people with estimated innominate torsion. Although some of the associations with innominate torsion were weakly positive, the sensitivity and predictive values indicate that the tests are not useful for identifying patients with torsion. Controlling variables such as leg-length differences, standing iliac crest level, or sitting iliac crest level did not substantively alter the relation between test results and innominate torsion. Although the use of alternate asymmetry variables improved the association between test results and asymmetry for some of the tests, the test performance characteristics (and therefore their usefulness) were not substantively improved. Using composite test results (positive if 2 or more tests had positive findings) also did not substantively improve the association with innominate torsion.

Association of Test Results With LBP

The tests used in this study were not designed to differentiate between people with and without LBP. My goal in estimating the association between test results and LBP was to explore whether these tests might be related to LBP through some mechanism other than innominate torsion. Although sacroiliac joint hypomobility was not assessed, this might be one mechanism through which positive test results could be related to LBP if hypomobility were demonstrated to be related to LBP.

In spite of a relatively small number of positive results on the Gillet test, the odds of LBP were approximately 4.5 times greater among subjects with positive test results than among subjects with negative test results. The data yielded a specificity of 97%, but a sensitivity of only 12%. The association was stronger when only subjects with LBP who had not experienced pain relief were included. Although the association might be stronger among males and among people 35 to 50 years of age, the data must be considered inconclusive given the imprecision of the estimated associations. The Gillet test demonstrated virtually no association with static innominate torsion (ÔR=1.07 [95% CI=0.42, 2.74]) and only a very weak association with PSIS asymmetry in a standing position (ÔR=1.55 [95% CI=0.58, 4.13]). The stronger association between the Gillet test results and LBP as opposed to measurements of innominate torsion might support the contention that the Gillet test assesses sacroiliac joint hypomobility and that it is the hypomobility rather than innominate torsion that leads to LBP. This hypothesis, however, may be contradicted by the possible increase in strength of the relationship among males and among people between 35 and 50 years of age. Males and older individuals may have less mobile sacroiliac joints.38,54 Decreases in mobility would be expected to increase the number of positive tests among males and older individuals (regardless of the association with LBP). However, the number (and proportion) of positive tests was similar for both age groups; male subjects actually had fewer positive test findings (about one half as many) than did female subjects. These data suggest that there may be no reduction in mobility among males and older individuals, or the data may indicate the existence of some other mechanism for positive Gillet findings than joint hypomobility.
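The combination of a high odds ratio, a high specificity, and a very low sensitivity for the Gillet test may seem counterintuitive, so a short sketch may help. The counts below are illustrative only: they are not taken from Table 5 but were chosen to be consistent with the group sizes stated in the abstract (150 and 138) and with the reported sensitivity, specificity, and odds ratio. The `performance` helper is a hypothetical name.

```python
def performance(tp, fn, fp, tn):
    """Sensitivity, specificity, predictive values, and odds ratio
    from the 4 cells of a 2x2 table (condition by test result)."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with LBP
        "specificity": tn / (tn + fp),  # negatives among those without LBP
        "ppv": tp / (tp + fp),          # LBP among test positives
        "npv": tn / (tn + fn),          # no LBP among test negatives
        "odds_ratio": (tp * tn) / (fn * fp),
    }


# Illustrative counts (assumption, NOT the study's tabulated data):
# 18 of 150 subjects with LBP test positive; 4 of 138 without LBP do.
stats = performance(tp=18, fn=132, fp=4, tn=134)
print(round(stats["sensitivity"], 2))  # 0.12
print(round(stats["specificity"], 2))  # 0.97
print(round(stats["odds_ratio"], 2))   # 4.57
```

Under these illustrative counts, positive results are rare in both groups, so most subjects with LBP test negative (low sensitivity) even though the few positive results are concentrated among subjects with LBP (high odds ratio). Predictive values computed this way also depend on the proportion of subjects with LBP in the sample and would not carry over to a setting with a different proportion.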

The sitting flexion test was only weakly associated with LBP (ÔR=1.52 [95% CI=0.63, 3.64]), except among subjects with sitting iliac crest asymmetry where the association increased to 4.44 (95% CI=0.90, 21.83). As for the Gillet test, however, the small number of positive test results among the comparison group makes it difficult to draw conclusions from the data. The sitting flexion test is viewed by some authors4,14,15 as a reflection of asymmetrical positioning (torsion) of the sacrum rather than torsion of the innominates, but data to support this hypothesis are lacking. Because the sacrum was not evaluated in this study, asymmetry of the sacrum might still be a factor and might explain the increase in association between the sitting flexion test results and LBP over the test's association with innominate torsion. If the important factor was sacral torsion, however, the test should be detecting it regardless of duration of symptoms. The stronger association among those subjects with more acute LBP (≤3 months' duration) may suggest another as yet unknown mechanism for the association between the sitting flexion test and LBP. The sitting flexion test was most strongly associated with the supine-to-sit test. Although the supine-to-sit test is apparently used to test sacroiliac joint mobility, I did not find any authors who proposed that the supine-to-sit test is sensitive to sacral motion rather than innominate motion.

The standing flexion test was not directly associated with either LBP or innominate torsion. The standing flexion test is supposed to detect abnormal movement or asymmetry of the innominates,4,5,15,21 as is the Gillet test. Data from my study indicate that the Gillet test and the standing flexion test are not responsive to the same phenomena.

Alternative Explanations for and Limitations to Study Findings

The measurements of the heights of landmarks (where the largest source of potential error in these measurements was the palpation of the landmarks themselves) showed excellent reliability (>.99 for all single measurements). Although the asymmetry variables showed lower ICC values due to reduction in variability of the data (ICC=.61–.75), standard errors of the measurements were consistently low for both landmark data and asymmetry data (2.7–3.8 mm, with innominate torsion having the largest error of only 5.3 mm). The reliability of data obtained with the tests could not be assessed. The tests, however, are based on judgments of symmetry of PSIS excursion (Gillet, standing flexion, and sitting flexion tests) or shift of the medial malleoli (supine-to-sit test) and, thus, rely heavily on palpation of landmarks. There is no way to assess how transitioning from palpation with the subjects stationary to judgments of excursion of landmarks affected reliability. The crest level tester used to assess iliac crest asymmetry was found to yield reliable measurements between testers in 2 previous studies (Kappa coefficient >0.75).43,44 My ability to measure innominate torsion or use the tests may not be generalizable to other clinicians.

Calculation of innominate torsion, as used in this study, was an attempt to estimate sagittal-plane asymmetry between the innominates (ie, an estimate of torsion only). Landmarks on which the calculation is based may be asymmetric without torsion within the pelvic ring. Torsion may have been considered present due to underdevelopment or overdevelopment of one innominate (or of a landmark on the innominate) rather than any disruption in alignment. Such right-left asymmetries in the body are considered normal, in general, and have been documented within the pelvic ring.55,56 Distortions in other planes may also mask or exaggerate landmark asymmetry. Although true torsion was not necessarily assessed, this same limitation applies universally to clinical evaluation of innominate torsion in the absence of an accepted standard. The arbitrary cutoff points for dividing subjects into those with positive innominate torsion and those with negative innominate torsion should also be considered. I assessed the effect of using different cutoff values. I calculated odds ratios for the association between test results and innominate torsion using cutoff points of 4 mm (the standard error of the measurement for one landmark) and 8 mm (doubling the standard error of the measurement to allow for introduction of additional error from using multiple landmarks in the calculation). The estimated odds ratios for the association of test results with innominate torsion using these alternative cutoff points either decreased (were less positive) or remained essentially the same. That is, the choice of cutoff for innominate torsion did not appear to mask a direct association between test results and innominate torsion.

Comparison of Findings With Those of Other Studies

Cibulka and Koldehoff24 recently published a study with a similar design but with substantively different findings. Using a criterion of at least 3 positive findings on 4 tests (PSIS asymmetry in a sitting position, standing flexion, supine-to-sit, and prone knee flexion), they reported a sensitivity of 0.82 and a specificity of 0.88 in identifying subjects with LBP of no more than 6 weeks' duration. To be positive on any one of the tests, subjects had to be judged to have asymmetry or positional change of at least 2.54 cm (1 in). Their criterion for a positive test, therefore, was substantially more rigorous than in my study, and their composite outcome data were potentially more discriminating. Cibulka and Koldehoff found 86 of 105 subjects with acute LBP (82%) to have positive findings on 3 of the 4 tests. Although frequency data for individual test results were not reported, it would appear that they had a much larger number of positive tests than I found in spite of their more rigorous criteria. Among the comparable group of subjects with LBP of 2 months' duration or less in my study, I found 9 of 58 subjects (15.5%) to have a positive standing flexion test result and 21 of 54 subjects (38.9%) to have a positive supine-to-sit test result. I used as the positive criterion for these 2 tests any observable positional asymmetry or change as recommended in the literature. In the same group, 36 of 62 subjects with acute LBP (58.1%) had measured sitting PSIS asymmetry greater than the standard error of the measurement (4 mm); only 2 out of all 198 subjects (1 with LBP and 1 without LBP) had asymmetry of 25 mm (1 in) or more. Applying the criterion of positive findings on 2 of 3 tests in my data (sitting PSIS asymmetry >4 mm, standing flexion and supine-to-sit tests), I had 14 positive findings among 51 patients with LBP of less than 2 months' duration (yielding a sensitivity of 27.5% and a specificity of 64.6%).

Accounting for such a large number of positive findings among any group of people with nonspecific LBP is difficult, especially when the rigor of the criteria is considered. One possible explanation is that the 2 examiners may have been inadvertently biased in their judgment of test results. The authors do not address whether the therapists performing the judgments also treated the subjects in their study or whether the examiners were otherwise familiar with the subjects. If the examiners did a comprehensive evaluation on the subjects or knew the results of such evaluations, that information may have created a bias in their test results. In my study, I collected only the data for the study, and I was unfamiliar with the clinical status of any of the subjects. Although clinical tests of sacroiliac joint dysfunction are generally conducted as part of a larger examination, I believe that the performance characteristics of these tests should be assessed independently of other findings. Cibulka and Koldehoff24 also argued that their composite of 4 tests showed success in a previous study57 in predicting patients likely to respond to a treatment designed to correct innominate rotation asymmetries. Data from my study do not support the association of 3 of the 4 tests with estimated innominate rotation. I did not assess the performance characteristics of the prone knee flexion test because I was unable to find support in the literature for this test other than by those authors who first proposed the test5,12,57; no data on individual performance characteristics of this test have been published. Given the discrepancy between my findings and those of Cibulka and Koldehoff,24 it would appear appropriate to withhold conclusions about the association of combined tests of sacroiliac dysfunction with LBP until further data from independent investigators are available, including data on the performance characteristics of the prone knee flexion test.

Conclusion

The data in my study did not support the use of the Gillet test, standing flexion test, sitting flexion test, or supine-to-sit test to differentiate between subjects with and without static innominate torsion in a patient sample. Using 2 or more tests in parallel or using alternative measures of innominate torsion did not substantially improve the usefulness of the measurements. Subgroup and covariate analyses did not suggest that explanatory variables may have masked or distorted a positive relationship. This study does not argue against use of the 4 tests for assessing sacroiliac joint hypomobility or positional problems with the sacrum. Data to support that use, however, have not been reported. The data from my study indicate that only the Gillet test showed a substantive association with LBP, although the basis for the association cannot be determined from these data. My findings do not, in my opinion, argue against continued use of the 4 studied tests of sacroiliac joint dysfunction. Because this study has potentially ruled out one possible basis for these tests (innominate torsion) and has raised a question about the relationship of test results to LBP, however, clinicians should be cautious in their use of the tests until more data are available.

Acknowledgments

Thanks go to the clinicians in participating physical therapy facilities throughout the Boston area and to my research assistants, without whom this study could not have been done. Special thanks go to the personnel of Beth Israel Hospital and Brigham and Women's Hospital. I also acknowledge the support of Dr Ken Rothman and Dr Nancy Watts, whose advocacy through the long dissertation process helped bring this study to a successful conclusion.

Footnotes

  • In addition to writing the article, Dr Levangie provided concept and research design, data collection and analysis, project management, and fund procurement. Dr Levangie's student research assistants contributed to data collection and clerical/secretarial support, and Dr Kenneth Rothman supported data analysis. Subjects, facilities, and institutional liaisons were provided by staff of the participating physical therapy facilities throughout the Boston area. Beth Israel Hospital and Brigham and Women's Hospital provided key and long-term support. Dr Rothman and Dr Nancy Watts provided consultation (including review of the manuscript prior to submission).

    This study received funding from the Foundation for Physical Therapy and was supported, in part, by Sargent College of Allied Health Professions, Boston University, where Dr Levangie worked during part of the study period.

    Manufacturer Information

  • * Ballert International Inc, 3645 Woodhead Dr, Ste 17, Northbrook, IL 60062.

  • Analytic Software Co, PO Box 12185, Tallahassee, FL 32317.

  • Microsoft Corp, One Microsoft Way, Redmond, WA 98052.

  • Received April 8, 1999.
  • Accepted June 6, 1999.

References
