Background: The Gross Motor Function Measure (GMFM) is the instrument most commonly used to measure gross motor function in children with cerebral palsy (CP). Different scoring options have been developed, and their measurement properties have been assessed. Limited information is available regarding longitudinal construct validity.
Objective: The objective of this research was to study the longitudinal construct validity of 3 scoring options: the 88-item GMFM (GMFM-88) total, the GMFM-88 goal total, and the 66-item GMFM (GMFM-66).
Design: A clinical measurement design was used in this study.
Methods: Forty-one children with CP diplegia who were undergoing selective dorsal rhizotomy (SDR) were monitored with the GMFM for 5 years. The mean age at SDR was 4.4 years (range=2.5–6.6). Two subgroups for gross motor function before surgery were created according to the Gross Motor Function Classification System (GMFCS): GMFCS levels I to III and GMFCS levels IV and V. This study included results obtained before SDR and at 6, 12, and 18 months and 3 and 5 years after SDR. The effect size (ES) and the standardized response mean (SRM) were calculated.
Results: At 6 months postoperatively, ES and SRM values were small (≤0.5) for all GMFM scoring options. The GMFM-88 total and goal total scores showed large changes in ES values (range=0.8–0.9) and SRM values (range=0.9–1.3) at 12 months postoperatively, whereas the GMFM-66 scores showed lower ES values (range=0.3–0.4) and SRM values (range=0.7–0.8) for both subgroups. Later postoperatively, larger values for longitudinal construct validity were found. The ES and SRM values generally were lower for the GMFM-66 scores than for the GMFM-88 total and goal total scores.
Limitations: All children underwent an extensive intervention, and changes in gross motor function were expected.
Conclusion: All 3 scoring options showed large longitudinal construct validity in the long-term follow-up. The GMFM-88 total and goal total scores revealed large changes in gross motor function earlier postoperatively than the GMFM-66 scores.
Physical therapists and other health care professionals are being urged to evaluate the effectiveness of their treatment methods. Evidence-based practice requires relevant, reliable, valid, and responsive outcome measures to evaluate treatment effects.1 There is a need for research to better define the psychometric properties of measures.2
Longitudinal construct validity is the extent to which an instrument can detect purposive change longitudinally within the construct it is intended to measure.3 Different terms, such as “responsiveness,” “sensitivity to change,” and “longitudinal construct validity,” have been used, and there is a lack of consensus on terminology.3,4 Liang3 suggested that longitudinal construct validity, or responsiveness, be considered an important component of outcome assessment and a distinct criterion for psychometric evaluation in clinical research. Different approaches can be used for the assessment. When a gold standard is available, the magnitude of change detected by a new instrument should be compared with that detected by the gold standard. Another option is to use an external criterion (eg, asking a child's parents or health care provider to rate whether change has occurred).5
The effect size (ES) and the standardized response mean (SRM) are statistics that can be used to describe the size of the change detected and can be calculated to describe longitudinal construct validity. A large ES or SRM indicates that a studied measure detects a large change occurring within a group.6
The original Gross Motor Function Measure (GMFM), an 88-item measure also known as the GMFM-88, is a criterion-referenced observational measure specifically developed to evaluate changes in gross motor function over time in children across the wide spectrum of ability levels in cerebral palsy (CP).7 The GMFM-88 has 5 dimensions: A–lying and rolling; B–sitting; C–kneeling and crawling; D–standing; and E–walking, running, and jumping. The items are scored from 0 to 3. Item scores are summed and expressed as a total for each dimension of the GMFM-88. The GMFM-88 total score is calculated as the mean score of all 5 dimensions, and the goal total score is the mean of individually selected dimensions (ranging between 1 and 4) constructed to increase responsiveness in the GMFM-88 (Tab. 1).7 The GMFM-88 total score has been the measure most often chosen to detect changes in gross motor function in evaluations of various interventions.8–13 It is considered the gold standard for measuring gross motor function in children with CP. Few results have been published for the GMFM-88 goal total score.13–15 Studies have presented mainly clinical results and have explored the psychometric properties of the instrument less often.
In the original GMFM validation study, 111 children with CP, 25 children with acquired brain injury, and 34 children who were developing typically were tested twice over 5 to 7 months.16 Correlations between scores for changes in motor function measured with the GMFM-88 and the judgments of changes by parents, therapists, and masked evaluators supported the hypothesis that the instrument would be responsive to both negative and positive changes.16 Bjornson et al17 studied 21 children with diplegia and quadriplegia and provided additional validation evidence of the responsiveness of the GMFM-88. Kolobe et al18 found that the GMFM-88 was able to detect changes in motor function in 24 infants with CP over 6 months; the mean change was 4.2 points. Russell et al19 studied validity and responsiveness in 206 children with CP and found that the mean change in motor function detected by the GMFM-88 over 6 months was 3.5 points. Nordmark et al13 studied responsiveness in 18 children undergoing selective dorsal rhizotomy (SDR). The total and goal total scores were found to respond to changes in motor function over 6 and 12 months, especially for children with milder impairment. Vos-Vromans et al20 studied responsiveness in GMFM-88 total scores over 18 months in a population of children who had CP and who were 2 to 7 years of age; for the total score, the ES was 0.6 and the SRM was 0.9.
Both interrater and intrarater reliability of GMFM-88 scores have been reported to be good. Nordmark et al21 found the interrater reliability (Kendall coefficient of concordance) to be .77 and .88 at the first and second assessments, respectively, and the intrarater reliability to be .68 at the second assessment (Kendall rank correlation).13 Bjornson et al17 found the GMFM to be consistent for the measurement of gross motor skills. Children with CP exhibited stable gross motor skills during repeated measurements. The intraclass correlation coefficients (ICCs) ranged from .76 to 1.00.
The GMFM-88 has been used in different settings by therapists worldwide; however, there were limitations in the measure and how it was used.7 There was evidence that the reliability and the validity of the separate dimension scores were generally not as strong as for the measure as a whole. The interpretation of the total score also had its limitations, as children with different levels of motor function theoretically could have received the same score. Children with results in the middle of the scale had a greater potential to change than those with results that were either lower or higher on the scale, as the difficulty scale contained more items in the middle than at the ends.7 The designers used the Rasch method to improve scoring, interpretation, and overall clinical and research utility.1
The most recent version is known as the GMFM-66, as it contains 66 of the original 88 items. The goal was to develop the GMFM-66 to be less vulnerable than the GMFM-88 to missing items and to be more responsive for children with major functional limitations as well as children with minor functional limitations. The Rasch analysis revealed that 66 items contributed the most to the underlying construct of gross motor function. To improve reliability and validity, 22 items were deleted, and an interval scale was created. Of the 22 items, 13 were from the lying and rolling dimension, 5 were from the sitting dimension, and 4 were from the kneeling and crawling dimension.
Russell et al22 published data on the psychometric properties of the GMFM-66. The GMFM-88 measurements obtained for 537 children with CP by 110 physical therapists were converted to GMFM-66 scores. Children were excluded if they had received intrathecal baclofen or botulinum toxin injections or had undergone SDR. The gross motor function in the 228 children who were reassessed after 12 months depended significantly on time, age, and severity of impairments. Other findings were that children younger than 5 years of age changed more than those older than 5 years of age and that children with less-severe motor impairments improved more than those with more-severe impairments. Russell et al22 found high test-retest reliability, with an ICC of .9933, essentially the same as for the GMFM-88 (ICC=.9944). Wei et al23 found high levels of test-retest reliability and interrater reliability (ICC=.97 and .98, respectively) for a sample of 171 children.
Wei et al23 explored the clinical consequences of deleting the 22 items from the GMFM-88 for children younger than 3 years of age. They found that the GMFM-66 was responsive even for these young children, who mainly had functional abilities assessed in lying, rolling, sitting, crawling, and kneeling positions. Wang and Yang24 evaluated the responsiveness of the GMFM-88 total score and the GMFM-66 score. They compared the scoring options with the external criterion of a therapist's judgments of meaningful motor improvements at a follow-up at 3.5 months. They found the 2 scoring options to be equally responsive; however, the GMFM-66 was found to have better specificity than the GMFM-88 for the therapist's judgments of meaningful motor improvements.
We have not found any published study on the longitudinal construct validity of the 3 scoring options. The purpose of this study was to examine the longitudinal construct validity of the GMFM-88 total and goal total scores and the GMFM-66 score over 5 years in children with CP.
Data were included from the first 41 children (28 boys and 13 girls) with spastic diplegia who underwent SDR combined with intensive physical therapy. The purposes of the intervention were to reduce spasticity and to yield gains in gross motor abilities.25 Statistically significant changes in gross motor function after SDR in children with CP have been reported previously.11 The mean age at surgery was 4.4 years (SD=1.1, range=2.5–6.6). Gross motor function before surgery was classified according to the Gross Motor Function Classification System (GMFCS). This classification system is used to describe and classify functional abilities in children with CP in 1 of 5 levels (level I representing the fewest functional limitations and level V representing the most functional limitations) in 4 age groups: less than 2 years, 2 to 4 years, greater than 4 to 6 years, and greater than 6 to 12 years.26 The gross motor function of the children was classified as GMFCS levels I (n=1), II (n=9), III (n=13), IV (n=17), and V (n=1).
According to the degree of severity of gross motor function preoperatively, the children were separated into 2 subgroups to assess changes within each subgroup. The subgroup including GMFCS levels I to III (n=23) included those children who walked, with or without walking aids, and needed only some assistance in everyday gross motor activities. The mean age at surgery was 4.6 years (SD=1.1, range=3.1–6.6). The subgroup including GMFCS levels IV and V (n=18) included those children who had no or limited walking ability, even with the help of walking aids, and needed extensive assistance in everyday gross motor activities. The mean age at surgery was 4.1 years (SD=1.1, range=2.5–5.9).
All children were assessed before surgery and at 6, 12, and 18 months and 3 and 5 years after surgery. GMFM testing was performed by 1 physical therapist (EN) during play, both to motivate the children to obtain optimal scores and to minimize position changes. The assessments were observed, videotaped, and scored by another physical therapist (ALJ). If necessary, the videotape was referred to afterward to verify scoring. Before starting to use the instrument, both physical therapists were trained and examined by the test developers in administering the GMFM-88. The children were always tested without shoes, orthoses, or walking aids.
The GMFM results were calculated as GMFM-88 total and goal total scores as described above and in Table 1. The goal dimensions in the GMFM-88 goal total score were individually selected by the physical therapists (EN and ALJ) for each child according to the clinically relevant goal areas. For example, for a child with gross motor function classified in GMFCS level IV, the individual goals may be to independently come to a sitting position from lying on the floor, crawl short distances, and be able to bear weight on both legs during standing transfers. In that case, the 3 dimensions sitting, kneeling and crawling, and standing would be selected as the most clinically relevant dimensions and goal areas. Of the 5 dimensions, a median of 3 (range=1–4) were selected as goal areas. The goal total score was calculated as the mean of the scores for the selected dimensions.
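The two GMFM-88 scoring options described above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure used in the study: it assumes that each dimension score is expressed as a percentage of that dimension's maximum, and it takes the item counts per dimension (17, 20, 14, 13, and 24) from the GMFM manual rather than from this text. The function names and the example raw scores are hypothetical.

```python
from statistics import mean

# Assumed maximum raw points per dimension: number of items x 3 points.
# Item counts (A=17, B=20, C=14, D=13, E=24) are an assumption taken from
# the GMFM manual; they are not stated in this article.
MAX_POINTS = {"A": 51, "B": 60, "C": 42, "D": 39, "E": 72}


def dimension_percentages(raw_points):
    """Convert summed item scores per dimension to percentages of the maximum."""
    return {dim: 100.0 * raw_points[dim] / MAX_POINTS[dim] for dim in MAX_POINTS}


def total_score(raw_points):
    """GMFM-88 total score: mean of all 5 dimension scores."""
    return mean(dimension_percentages(raw_points).values())


def goal_total_score(raw_points, goal_dimensions):
    """GMFM-88 goal total score: mean of the individually selected dimensions."""
    if not goal_dimensions:
        raise ValueError("at least one goal dimension must be selected")
    percentages = dimension_percentages(raw_points)
    return mean(percentages[dim] for dim in goal_dimensions)


# Hypothetical child: goal areas are sitting (B), kneeling and crawling (C),
# and standing (D), as in the GMFCS level IV example in the text.
example_raw = {"A": 51, "B": 45, "C": 21, "D": 13, "E": 18}
print(round(total_score(example_raw), 1))
print(round(goal_total_score(example_raw, ["B", "C", "D"]), 1))
```

Because the goal total score averages only the dimensions in which change is expected, dimensions that are already at ceiling or far out of reach do not dilute the change score, which is the stated rationale for constructing it.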
The GMFM-66 score was obtained from the GMFM-88 total score with Gross Motor Ability Estimator software.7 The characteristics of the GMFM-88 and the GMFM-66 are shown in Table 1. Complete GMFM results were obtained for all except 2 children (Tabs. 2 and 3). None of the children had scores close to the minimum or the maximum; they all had the potential to show changes in motor function in the 3 different GMFM scoring options. The GMFM scores were normally distributed.
According to the Swedish National Board of Health and Welfare, clinicians are obliged to secure quality of care by performing and reporting the results of clinical studies in everyday practice. Approval from an internal review board is not required for this type of research; subjects and all data were treated in accordance with the guidelines of the Declaration of Helsinki.
For evaluation of the longitudinal construct validity of the GMFM scoring options, the ES and the SRM were used. The ES was calculated as the mean difference between the baseline score and the follow-up score divided by the standard deviation of the baseline score. The SRM was calculated as the mean change score divided by the standard deviation of the change scores.5 Effect sizes of 0.2 to 0.5 were classified as small, values of 0.5 to 0.8 were classified as medium, and values of greater than 0.8 were classified as large.27 The ES and SRM were calculated for the change from the measurement obtained before surgery to each of the measurements obtained at 6, 12, and 18 months and 3 and 5 years after surgery to study the longitudinal construct validity of the GMFM scoring options (as opposed to treatment effectiveness). Results were analyzed for the group as a whole and for the 2 subgroups. Comparing ES and SRM among the GMFM scoring options at the different time intervals makes it possible to determine the preferable instrument for a given follow-up time and GMFCS level.
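The two statistics defined above can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the analysis code used in the study: the sample standard deviation (n − 1 denominator) is assumed, the boundary values 0.5 and 0.8 are assigned to the lower category (consistent with "small (≤0.5)" in the abstract), the label for values below 0.2 is hypothetical, and the scores are invented for demonstration.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Paired change for each child: follow-up score minus baseline score."""
    return [after - before for before, after in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """ES: mean change divided by the SD of the baseline scores."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)


def classify(value):
    """Label a value using the thresholds quoted in the text.

    Assumption: 0.5 counts as small and 0.8 as medium; the label for
    values below 0.2 is not defined in the text and is hypothetical.
    """
    magnitude = abs(value)
    if magnitude > 0.8:
        return "large"
    if magnitude > 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "below small"


# Invented scores for 4 children (not study data).
before_surgery = [40.0, 50.0, 60.0, 70.0]
at_follow_up = [46.0, 58.0, 64.0, 80.0]

es = effect_size(before_surgery, at_follow_up)
srm = standardized_response_mean(before_surgery, at_follow_up)
print(f"ES={es:.2f} ({classify(es)}), SRM={srm:.2f} ({classify(srm)})")
```

The two functions share the same numerator and differ only in the denominator, so the same set of change scores can yield a medium ES and a large SRM when the children differ widely at baseline but change by similar amounts.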
Role of the Funding Sources
The Linnéa and Josef Carlsson Foundation and the Stiftelsen för Bistånd åt Rörelsehindrade i Skåne funded the first author (ALJ) to analyze data and prepare a manuscript for publication.
Presurgery mean and median values for GMFM-88 total score, GMFM-88 goal total score, and GMFM-66 score are presented in Table 2 for the group as a whole and for the 2 GMFCS subgroups. Longitudinal construct validity, in terms of ES and SRM, is shown in Table 3 and in Figures 1, 2, and 3. At 6 months after surgery, the ES and SRM were small (≤0.5) for all 3 GMFM scoring options for the subgroups and for the group as a whole (Tab. 3, Figs. 1, 2, and 3).
At 12 months after surgery, children in the subgroup including GMFCS levels I to III showed large changes in GMFM-88 total scores (ES=0.8 and SRM=1.3) and GMFM-88 goal total scores (ES=0.9 and SRM=1.2). Less change was seen in GMFM-66 scores at 12 months after surgery (ES=0.3 and SRM=0.8) (Tab. 3, Fig. 2). For children in the subgroup including GMFCS levels IV and V, both GMFM-88 total and goal total scores showed large changes at 12 months after surgery (ES=0.8 and SRM=0.9), and GMFM-66 scores showed less change (ES=0.4 and SRM=0.7) (Tab. 3, Fig. 3).
At 18 months after surgery, children in the subgroup including GMFCS levels I to III showed large changes in GMFM-88 total scores (ES=0.8 and SRM=1.1) and goal total scores (ES=0.8 and SRM=0.9), and their GMFM-66 scores showed less change (ES=0.5 and SRM=0.8) (Tab. 3, Fig. 2). Children in the subgroup including GMFCS levels IV and V showed large changes in GMFM-88 total scores (ES=1.0 and SRM=1.1) and goal total scores (ES=1.1 and SRM=1.2), and their GMFM-66 scores showed less change in ES (0.6) and a large change in SRM (1.0) (Tab. 3, Fig. 3).
At 3 and 5 years after surgery, all 3 GMFM scoring options showed large changes for both the subgroup involving GMFCS levels I to III (ES=1.0–1.6 and SRM=1.0–1.2) and the subgroup involving GMFCS levels IV and V (ES=1.0–1.6 and SRM=1.0–1.7) (Tab. 3, Figs. 2 and 3).
We found that the 3 scoring options indicated progressive changes at short- and long-term follow-up after an extensive intervention for the group as a whole. Doubts have been raised about the longitudinal construct validity of GMFM-66 scores for children with more-severe disabilities, as many items in the lying and rolling, sitting, and kneeling and crawling dimensions were deleted.7 Our results indicated that the patterns of longitudinal construct validity between the 2 subgroups (GMFCS levels I to III and GMFCS levels IV and V) for the 3 scoring options were the same but that there were some differences among the scoring options.
Both GMFM-88 total and GMFM-88 goal total scores showed larger changes earlier after surgery compared with GMFM-66 scores (Figs. 1, 2, and 3). The 2 GMFM-88 scoring options showed almost the same pattern for detecting changes in function during follow-up. At the 6-month follow-up, only small changes were seen in all 3 GMFM scoring options, and the low ES and SRM at 6 months confirmed the expected delay in gross motor progress.
The goal total score has seldom been used in research reports. In clinical practice, we have found the goal total score to be useful for monitoring gross motor function changes. By identifying goal areas and deciding which dimensions to include in the goal total score, clinicians, the family, and the child can discuss the long-term expectations of the intervention. Specific items from these goal dimensions can be selected and used in short-term goal setting during rehabilitation after surgery; for example, the goal attainment scaling is well suited for this purpose.28
The GMFM-66 scores indicated similar longitudinal construct validity for the 2 subgroups, which suggests that change in gross motor function was equally possible to detect with the GMFM-66 for the subgroup including GMFCS levels I to III and the subgroup including GMFCS levels IV and V despite the 22 deleted items. Large longitudinal construct validity for the GMFM-66 was first seen at 18 months and 3 years after surgery. The early changes appear to be detected with the GMFM-88 goal total to a greater extent than with the GMFM-66.
Wei et al23 found that the GMFM-66 could be used to detect changes in gross motor function in children in the lying and rolling, sitting, and kneeling and crawling dimensions. For the children in the subgroup involving GMFCS levels IV and V in the present study, however, changes were detected by the GMFM-66 later during follow-up compared with the GMFM-88 total and goal total scores. These children were older than the children in the study of Shi et al24 and, therefore, were developing in gross motor function more slowly, in accordance with the gross motor function curves presented by Rosenbaum et al.29
Russell et al7 recommended that the GMFM-88 should be used when comparison of changes among children is not needed or when a young child or a child with gross motor function classified as GMFCS level V is being tested. In addition, the GMFM-88 should be used when a child with orthoses and aids is being tested or when there is no access to a computer to obtain GMFM-66 scores. The GMFM-66 is recommended for research purposes, for comparing changes among children, and for monitoring the development of a single child over time, as the interval-scaled instrument is considered to provide a more reliable estimate of gross motor function.7 Compared with GMFM-66 scores, the earlier detection of large changes in GMFM-88 total and goal total scores among children with gross motor function classified as GMFCS levels IV and V is in agreement with our clinical impression.
Gross motor function has been shown to depend on age and severity of functional impairments. The most rapid changes in GMFM results occur during the first 4 years of life, and a plateau phase is reached between 5 and 6 years, depending on the severity of CP, as indicated by GMFCS levels.29 The mean age of children at SDR in the present study was 4.4 years, and most of them were likely to continue to improve in gross motor function development for another 1 to 2 years before reaching their probable maximum scores at 6 to 7 years of age. Changes in function during follow-up were expected because of the natural development of gross motor function and the effects of spasticity reduction.
Different statistical methods have been used to evaluate longitudinal construct validity; ES and SRM appear to be the 2 most commonly used methods.6 For the children in the present study, the relatively large standard deviation of the scores obtained before surgery and the small changes that were found during the first 6 months after surgery resulted in low ES and SRM during the first 6 months compared with later after surgery. The change scores were more homogeneous than the preoperative scores; that is, the standard deviation of the change scores (used to calculate the SRM) was smaller than the standard deviation of the preoperative scores (used to calculate the ES). This was reflected in SRM values that were relatively larger than the ES values, at least at all follow-up intervals up to 18 months after surgery (Tab. 3 and Figs. 1, 2, and 3). As expected, none of the children in the present study showed large changes in gross motor function at 6 months after surgery.
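The relation between the two statistics follows directly from their definitions in the Methods. As a worked sketch, let \bar{d} denote the mean change score, s_0 the standard deviation of the preoperative scores, and s_d the standard deviation of the change scores (these symbols are introduced here for illustration and do not appear in the original analysis):

```latex
\mathrm{ES} = \frac{\bar{d}}{s_{0}}, \qquad
\mathrm{SRM} = \frac{\bar{d}}{s_{d}}, \qquad
\frac{\mathrm{SRM}}{\mathrm{ES}} = \frac{s_{0}}{s_{d}}
```

For a nonzero mean change, the SRM therefore exceeds the ES exactly when s_d < s_0. For example, the 12-month values for GMFM-88 total scores in the subgroup including GMFCS levels I to III (ES=0.8 and SRM=1.3) imply that the preoperative standard deviation was roughly 1.6 times the standard deviation of the change scores, allowing for rounding of the reported values.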
In the present study, the ES and SRM did not differ significantly from each other. The small differences indicate that the group scores studied were similarly distributed at baseline and at follow-ups.
The interpretation of ES and SRM usually follows the guidelines of Cohen.27 Analyses and interpretation guidelines are based on statistical arguments rather than on the patient's or therapist's opinions of what constitutes an important change (eg, when using external criteria).6
In this study, we chose to investigate longitudinal construct validity of the 3 GMFM scoring options within 2 subgroups to provide readers with information on the measures’ performance in relation to the severity of functional limitations.
All children underwent an extensive intervention initially, and changes in function were expected. A larger sample including more children with gross motor function classified at each GMFCS level may reveal more severity-dependent differences in longitudinal construct validity for the 3 GMFM scoring options.
All scoring options showed large longitudinal construct validity in a long-term follow-up. The GMFM-88 total and goal total scores detected large changes in motor function earlier after surgery than the GMFM-66 scores in children with gross motor function classified as GMFCS levels I to III and GMFCS levels IV and V.
All authors provided concept/idea/research design. Mrs Lundkvist Josenby provided writing and data analysis. Mrs Lundkvist Josenby and Dr Nordmark provided data collection, participants, and facilities/equipment. Dr Jarnlo provided project management. Dr Jarnlo, Dr Gummesson, and Dr Nordmark provided consultation (including review of manuscript before submission).
The Linnéa and Josef Carlsson Foundation and the Stiftelsen för Bistånd åt Rörelsehindrade i Skåne provided funding to Ms Lundkvist Josenby to analyze data and prepare a manuscript for publication.
- Received January 31, 2008.
- Accepted December 29, 2008.
- American Physical Therapy Association