A Systematic Review of the Effectiveness of Exercise, Manual Therapy, Electrotherapy, Relaxation Training, and Biofeedback in the Management of Temporomandibular Disorder

Marega S Medlicott, Susan R Harris


Background and Purpose. This systematic review analyzed studies examining the effectiveness of various physical therapy interventions for temporomandibular disorder. Methods. Studies met 4 criteria: (1) subjects were from 1 of 3 groups identified in the first axis of the Research Diagnostic Criteria for Temporomandibular Disorders, (2) the intervention was within the realm of physical therapist practice, (3) an experimental design was used, and (4) outcome measures assessed one or more primary presenting symptoms. Thirty studies were evaluated using Sackett’s rules of evidence and 10 scientific rigor criteria. Four randomly selected articles were classified independently by 2 raters (interrater agreement of 100% for levels of evidence and 73.5% for methodological rigor). Results. The following recommendations arose from the 30 studies: (1) active exercises and manual mobilizations may be effective; (2) postural training may be used in combination with other interventions, as independent effects of postural training are unknown; (3) mid-laser therapy may be more effective than other electrotherapy modalities; (4) programs involving relaxation techniques and biofeedback, electromyography training, and proprioceptive re-education may be more effective than placebo treatment or occlusal splints; and (5) combinations of active exercises, manual therapy, postural correction, and relaxation techniques may be effective. Discussion and Conclusion. These recommendations should be viewed cautiously. Consensus on defining temporomandibular joint disorder, inclusion and exclusion criteria, and use of reliable and valid outcome measures would yield more rigorous research. [Medlicott MS, Harris SR. A systematic review of the effectiveness of exercise, manual therapy, electrotherapy, relaxation training, and biofeedback in the management of temporomandibular disorder. Phys Ther. 2006;86:955–973.]

Temporomandibular disorder (TMD) includes a variety of conditions associated with pain and dysfunction of the temporomandibular joint (TMJ) and the masticatory muscles.1 An estimated 20% of the population is affected, with 10% to 20% of those seeking treatment.2–5 These disorders also are referred to as “temporomandibular dysfunction,” “craniomandibular disorders,” and “mandibular dysfunction.”5

The presenting symptoms of TMD are: (1) intermittent or persistent pain in the masticatory muscles or the TMJ, and less frequently in adjacent structures; (2) limitations or deviations of mandibular movement; and (3) TMJ sounds.6 A variety of other symptoms, such as tinnitus, abnormal swallowing, and hyoid bone tenderness, also may occur.7 Quality of life may be affected, with a negative effect on social function, emotional health, and energy level.6

Currently, there is a lack of consensus among researchers regarding the etiology, diagnosis, and management of this disorder. The diagnosis of TMD is commonly based on the presenting signs and symptoms.8 The Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) applies a dual-axis system to diagnose and classify patients with TMD.6,8–10 The first axis is divided into 3 groups of commonly occurring TMDs:

  1. Muscle disorders, including myofascial pain with and without limited mandibular opening.

  2. Disk displacement with or without reduction or limited mandibular opening.

  3. Arthralgia, arthritis, and arthrosis.

The second axis includes a 31-item questionnaire, used to evaluate relevant behavioral, psychological, and psychosocial factors (eg, pain status variables, depression, nonspecific physical symptoms, disability levels).6,8,10

Noninvasive, conservative treatments generally provide improvement or relief of symptoms and are recommended in the initial management of TMD.11 Physical therapists are frequently involved in the management of TMD, often in collaboration with dental professionals. In a survey of members of the American Dental Association, physical therapy was listed among the 10 most common treatments used, involving 10% to 17% of patients.12 A wide variety of physical therapy techniques, including joint mobilization, exercise prescription, electrotherapy, education, biofeedback and relaxation, and postural correction, have been used in the management of this disorder.1,6,13

Research evaluating the effects of physical therapy in the management of TMD has been criticized for its lack of methodological rigor.14,15 However, recent studies have attempted to address some previously identified limitations. Because much of the research examining the effects of physical therapy on TMD has not been published in physical therapy journals, developing an evidence base for managing TMD is not easy.

This systematic review of randomized controlled trials (RCTs) and nonrandomized controlled trials assessed the physical therapy management of acute and chronic TMD on clinically relevant outcomes such as pain, range of motion (ROM), disability and function, joint noise, tenderness, and psychological factors. Based on duration of the disorder, TMD was defined as acute (<6 months) or chronic (>6 months). Sackett’s levels of evidence facilitate the categorization of studies according to the strength of the research design and the degree of control for potential threats to internal validity.16,17 Based on 5 hierarchical levels of evidence, which have been used in previous systematic reviews of physical therapist practice, recommendations can be made regarding treatment options.17,18


The literature search was restricted to English-language publications from 1966 through January 2005. Index Medicus (MEDLINE), the Cumulative Index to Nursing and Allied Health Literature (CINAHL), and the Cochrane Central Register of Controlled Trials were searched using the text words “facial pain,” “physical therapy,” “rehabilitation,” “temporomandibular disorder (TMD),” “temporomandibular joint (TMJ),” “temporomandibular joint syndrome,” and “therapy.”

Study Selection Criteria

To be included in the systematic review, studies had to meet the following criteria: (1) subjects were from 1 of the 3 groups identified in the first axis of the RDC/TMD,6 (2) the intervention was within the realm of physical therapist practice, (3) an experimental design was used (eg, an RCT or nonrandomized controlled trial), and (4) the outcome measures assessed one or more of the primary presenting symptoms (eg, pain, ROM, disability or function).

Studies with any of the following exclusion criteria were not included in the review: (1) interventions post–TMJ surgery, (2) physical therapy interventions in combination with other non–physical therapy interventions, (3) acupuncture as an intervention, and (4) interventions involving passive ROM devices. Studies that assessed only electromyographic (EMG) results also were not included.

Review Criteria

Studies were evaluated according to Sackett’s initial rules of evidence,17 as described by Barry.16 These levels (I–V) are hierarchical and represent the confidence generated by the results produced in the studies.

Level I: (a) systematic review (with homogeneity) of RCTs (b) individual RCT (with narrow confidence interval) (c) all or none

Level II: (a) systematic review (with homogeneity) of cohort studies (b) individual cohort study, including low-quality RCTs (eg, <80% follow-up) (c) “outcomes” research

Level III: (a) systematic review (with homogeneity) of case-control studies (b) individual case-control studies

Level IV: case series (and poor-quality cohort and case-control studies)

Level V: expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles”

Methodological Quality of Reviewed Studies

Methodological rigor of the studies was evaluated using the following criteria, adapted from Megens and Harris18,19 and the McMaster Occupational Therapy Evidence-Based Practice Research Group20:

  1. randomization,

  2. inclusion and exclusion criteria were listed for the subjects (and were subsequently grouped, by the primary author of this review, into 1 of the categories on the first axis of the RDC/TMD),

  3. similarity of groups at baseline (if the study design used 2 or more groups),

  4. the treatment protocol was sufficiently described to be replicable,

  5. reliability of data obtained with the outcome measures was investigated,

  6. validity of data obtained with the outcome measures was addressed,

  7. blinding of patient, treatment provider, and assessor,

  8. dropouts were reported,

  9. long-term (6 months or greater) results were assessed via follow-up, and

  10. adherence to home programs was investigated (if included in the intervention).

We rated the methodological rigor of the study as “strong” (“yes” score of 8–10), “moderate” (“yes” score of 6 or 7), or “weak” (“yes” score of ≤5). To assess the reliability of different raters’ judgments in classifying studies, 4 randomly selected articles were independently reviewed and classified according to Sackett’s levels of evidence17 and methodological rigor criteria by 2 different raters.
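The rating rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' instrument: the criterion keys and function names are hypothetical labels for the 10 criteria listed, and the sketch handles whole-number "yes" counts only. The review also reports fractional scores, and how those map onto the categories is not stated in the text.

```python
from typing import Dict

# Hypothetical short labels for the 10 rigor criteria listed in the text.
CRITERIA = (
    "randomization",
    "inclusion_exclusion_criteria",
    "baseline_similarity",
    "replicable_protocol",
    "outcome_reliability",
    "outcome_validity",
    "blinding",
    "dropouts_reported",
    "long_term_follow_up",
    "home_program_adherence",
)


def score_study(ratings: Dict[str, bool]) -> int:
    """Count the criteria rated "yes" for one study."""
    missing = set(CRITERIA) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(bool(ratings[name]) for name in CRITERIA)


def rigor_category(yes_score: int) -> str:
    """Map a whole-number "yes" score (0-10) to the review's rigor label."""
    if not 0 <= yes_score <= len(CRITERIA):
        raise ValueError("score must be between 0 and 10")
    if yes_score >= 8:
        return "strong"
    if yes_score >= 6:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    # Illustrative ratings only; these are not data from any reviewed study.
    example = {name: False for name in CRITERIA}
    example["randomization"] = True
    example["replicable_protocol"] = True
    print(rigor_category(score_study(example)))
```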


A large number of articles were identified that included physical therapy management of TMD. Many articles were general reviews or were descriptive in nature. Of the 108 articles that reported experimental studies, 30 articles met the inclusion criteria. No studies could be located that solely assessed disability related to TMD. The primary reason for the exclusion of all except 30 studies was the incorporation of non–physical therapy management, such as medication or surgery. One reviewer completed the study literature search and the study selection and data abstraction.

Interrater agreement (percentage of agreement) on the levels of evidence for each of the 4 studies independently reviewed was 100%. Interrater agreement, using the McMaster University Critical Review Form for Quantitative Studies20 to assess methodological rigor, was 73.5%.
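Percentage of agreement is the share of rated items on which the 2 raters gave the same classification. The sketch below assumes one paired rating per item; the function name and the example ratings are hypothetical and are not the data from this review.

```python
from typing import Hashable, Sequence


def percent_agreement(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Return the percentage of items on which two raters agree exactly."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same number of items")
    if len(rater_a) == 0:
        raise ValueError("at least one rated item is required")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


if __name__ == "__main__":
    # Hypothetical levels of evidence assigned to 4 articles by each rater.
    levels_rater_1 = ["IIb", "IIb", "IV", "IIb"]
    levels_rater_2 = ["IIb", "IIb", "IV", "IIb"]
    print(percent_agreement(levels_rater_1, levels_rater_2))
```

Percentage of agreement does not correct for agreement expected by chance, which is one reason chance-corrected statistics are sometimes reported alongside it.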

The 30 studies included in this review were divided into groups based on the primary intervention used. Fourteen studies4,9,21–34 investigated the use of exercise or manual therapy, 8 studies5,35–41 investigated the use of electrotherapy, 7 studies42–49 investigated the use of relaxation training or biofeedback, and 1 study50 investigated the use of exercise and electrotherapy. The study characteristics are summarized in Tables 1 through 3 (see pages 962–970), organized according to primary type of intervention.

Table 1. Studies on Exercise and Manual Therapy

Table 2. Studies on Electrotherapy

Table 3. Studies on Relaxation Training and Education

Effect Size

Effect size r was calculated using Meta-Analysis Programs by Schwarzer.51 If means and standard deviations were available, these data were used to calculate effect size r. In some cases, other statistics were reported, such as F values or chi-square values, which were transformed into an effect size r. A 95% confidence interval was subsequently calculated.51 Effect size measurements can indicate the relative magnitude of the experimental treatment and can allow comparison of the magnitude of experimental treatments between experiments. The suggestion by Cohen52 that effect sizes of 0.20 are small, 0.50 are medium, and 0.80 are large facilitates the comparison of the effect size results of an experiment with known benchmarks. Effect size was calculated for 24 studies, although a lack of data meant that effect sizes could not always be calculated for all of the outcome measures used. The remaining 6 studies lacked raw data, although their results were reported in terms of statistical significance with P<.05.
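The conversions described above can be sketched with standard textbook formulas. This is an illustration only: the text does not state which formulas the Meta-Analysis Programs apply, so the function names and the specific conversions used here are assumptions. They are a pooled-SD standardized mean difference converted to r, an F value with 1 numerator degree of freedom, a chi-square value with 1 degree of freedom, and a confidence interval based on the Fisher z transformation.

```python
import math
from typing import Tuple


def r_from_means(mean_1: float, sd_1: float, n_1: int,
                 mean_2: float, sd_2: float, n_2: int) -> float:
    """Effect size r from two group means and SDs (via a pooled-SD d)."""
    if n_1 < 2 or n_2 < 2:
        raise ValueError("each group needs at least 2 subjects")
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    d = (mean_1 - mean_2) / math.sqrt(pooled_var)
    # Correction term for group sizes; equals 4 when the groups are equal.
    a = (n_1 + n_2) ** 2 / (n_1 * n_2)
    return d / math.sqrt(d ** 2 + a)


def r_from_f(f_value: float, df_error: int) -> float:
    """Effect size r from an F value with 1 numerator degree of freedom."""
    return math.sqrt(f_value / (f_value + df_error))


def r_from_chi_square(chi_square: float, n_total: int) -> float:
    """Effect size r from a chi-square value with 1 degree of freedom."""
    return math.sqrt(chi_square / n_total)


def confidence_interval_95(r: float, n_total: int) -> Tuple[float, float]:
    """Approximate 95% confidence interval for r using Fisher's z."""
    if n_total <= 3:
        raise ValueError("n_total must exceed 3")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)
    standard_error = 1 / math.sqrt(n_total - 3)
    return (math.tanh(z - 1.96 * standard_error),
            math.tanh(z + 1.96 * standard_error))


if __name__ == "__main__":
    # Made-up numbers for illustration; not taken from any reviewed study.
    r = r_from_means(10.0, 2.0, 10, 8.0, 2.0, 10)
    print(round(r, 3), confidence_interval_95(r, 20))
```

With small samples, as in many of the studies reviewed, the interval from `confidence_interval_95` will be wide, which is worth keeping in mind when comparing an r value with benchmark sizes.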

Levels of Evidence

Of the 30 studies reviewed, 22 were RCTs and were identified as level IIb due to low study quality. Four studies27,28,30,31 had a single-group pretest-posttest design with a nontreatment control period, 2 studies23,26 had a case series design, 1 study4 had a single-group randomized (treatment or placebo) crossover design, and 1 study40 involved 1 group with a randomized order of treatments (treatment or placebo) within sessions (with session 1 before session 2); these 8 studies were identified as level IV due to the lack of a control group.

Scientific Rigor of the Studies

The methodological rigor of the studies was evaluated using the 10 criteria shown in Table 4 (see page 971). The studies were organized in Table 4 according to score on the methodological criteria. The study quality scores ranged from 1 to 7.3, with a median score of 4.0 and a mean score of 4.15. None of the studies could be judged as “strong” (“yes” score of 8–10), 5 studies22,24,25,34,49 could be judged as “moderate” (“yes” score of 6 or 7), and the remaining 25 studies4,5,9,21,23,26–28,30–32,35–43,45–48,50 would be considered “weak” (“yes” score of ≤5).

Table 4.

Evaluative Criteria for Studies Reviewed


Randomization

Subjects were randomly assigned to 2 or more groups in 24 studies,4,5,9,21,22,24,25,32,34–43,45–50 including the 2 studies that involved cross-over designs. The 6 studies in which subjects were not randomly assigned to groups were all single-group designs.23,26–28,30,31

Subject Inclusion and Exclusion Criteria

Inclusion and exclusion criteria varied among the studies and in relation to the subgroup of TMD diagnosis of the sample studied. Subjects were classified into subgroups identified in the RDC/TMD. Seventeen studies4,21,22,24,25,27,34,38,41–50 involved subjects with myofascial TMD, and 6 studies9,23,26,30,31,39 involved subjects with disk displacement (1 study with subjects with reduction,31 3 studies with subjects without reduction,23,26,30 and 2 studies with subjects with unspecified status as to reduction9,39). One other study37 involved subjects with myofascial TMD (50%) and subjects with arthritis (50%). Six studies5,28,29,32,33,35,36,40 involved people with arthritis (2 studies with subjects with disk displacement without reduction, 1 study with 89% of the subjects having rheumatoid arthritis, 1 study with 56% of the subjects having rheumatoid arthritis, 1 study with 64% of the subjects having ankylosing spondylitis, and 1 study unspecified).

Studies involving subjects from all subgroups of TMD were included in the systematic review, despite differences among subgroups. Inclusion criteria were not identified in 7 of the 30 studies. In 3 studies,21,32,46 a reference source was provided, but criteria were not otherwise defined. In the other 4 studies,9,26,43,48 inclusion criteria were unclear.

For the 23 studies that described inclusion (and exclusion) criteria, 12 required self-reported symptoms, most commonly pain (ranging from 1 month to 1 year in duration).22,24,25,27–29,31,34,41,42,47,50 The other 11 studies4,5,23,30,35–39,45,49 required self-reported symptoms of an unspecified length of time. Five of the studies involving subjects with arthritic TMD23,28,29,36,40 required radiological evidence of osteoarthritis among the inclusion criteria. One study involving disk displacement30 required magnetic resonance imaging (MRI) evidence. Six studies5,30,36,39,49,50 required that subjects have limited mandibular movement. Evidence of “postural dysfunction” was required in 3 studies,27,30,31 although postural dysfunction was not defined in detail. Five of the studies involving subjects with myofascial TMD4,22,39,42,50 required the presence of tenderness on palpation of masticatory muscles. Four studies25,27,31,42 also directly referenced the source of the inclusion criteria. Exclusion criteria tended to rule out a history of trauma or malocclusion, prior or concurrent treatment for TMD, and specific contraindications relating to electrotherapy modalities.

Similarity of Groups at Baseline

Fourteen studies21,22,24,25,35,37–39,43,45–47,49,50 reported on the similarity of groups at baseline.

Repeatability of the Treatment Protocol

Of the 14 studies involving exercise or manual therapy, 9 studies4,9,21–23,25,26,32,34 provided sufficient description to allow replication of the intervention. In the remaining 6 studies,24,27–31 5 of which were by Nicolakis and colleagues, exercises were not described in detail sufficient to replicate the treatments.

All studies involving electrotherapy as the primary intervention described the intervention in sufficient detail to allow for replication.5,36–42 Of the 8 studies involving biofeedback or education, 6 studies43,45–49 provided adequate information to allow replication of the intervention. Two studies42,43 failed to provide sufficient detail on the interventions utilized, preventing replication, although 1 study42 referred to a manual for the description of the intervention involved.

Outcome Measure Reliability

Reliability of data obtained with the outcome measures was reported in only 8 studies. Carmeli and colleagues9 reported intrarater reliability for the measurement of active ROM of the TMJ, whereas Taylor et al4 reported interrater reliability for maximal mandibular opening and lateral movement. Carlson and colleagues42 reported the internal consistency and intrarater reliability for subscales from the Multidimensional Pain Inventory (MPI) measuring pain severity, life interference from pain, and perception of life control. This group of researchers also reported the internal consistency and intrarater reliability for the somatization, depression, anxiety, and obsessive-compulsive scales of the Revised Symptom Checklist (SCL-90-R).42 Internal consistency and intrarater reliability for the affective distress scale from the MPI, as well as the internal consistency and the intrarater reliability for the sleep dysfunction scale, also were reported.42

Internal consistency and interrater reliability for the muscle palpation pain index (PPI) and internal consistency for credibility ratings were reported by Turk and colleagues.49 Okeson and colleagues48 reported on the internal consistency for muscle and TMJ palpation. One of the studies by Nicolakis and colleagues27 referenced the reliability of scores for the visual analog scale (VAS).53 Wright et al34 referenced previously reported intrarater and interrater reliability of data for the modified symptom severity index (SSI-5 VAS), maximum pain-free opening, and muscle pain threshold.46,54,55 De Laat and colleagues22 referenced the reliability of data for the VAS, pressure pain threshold (PPT), and the Mandibular Functional Impairment Questionnaire (MFIQ).56,57 Of the 8 studies that reported reliability of data for outcome measures, only 2 studies22,34 reported reliability for all of the outcome measures used.

Outcome Measure Validity

Validity of data for outcome measures was reported in 3 studies.22,34,35 Wright and colleagues34 indicated that the validity of data for their outcome measures had been reported previously.48,53,54 Al-Badawi and colleagues35 indicated that the 10-point Numerical Pain Scale had been reported to be statistically sensitive when measuring pain and discomfort.53 De Laat and colleagues22 referenced the smallest detectable difference on a VAS to be considered clinically relevant in TMD secondary to disk displacement without reduction58 in subjects with myofascial TMD. None of the other studies presented any information on the validity for outcome measures used.

In the 30 studies reviewed, over 75 different outcome measures were utilized. The outcomes of interest were self-reported pain, pain on palpation, active ROM, EMG levels, questionnaires regarding self-reported symptom severity and frequency, dysfunction indexes related to impairment, and psychological status scales. A large variety of tools and other assessment methods were used to measure the outcomes of interest with different studies using different tools or methods to evaluate the same outcome.

Blind Assessment

Blinded treatment providers and outcome measure assessors were used in 11 of the 30 studies.9,22,25,34–38,40–42

Account for Attrition

Subject attrition was reported in 15 of the 30 studies.5,22,24,25,27,28,30,31,34,36,39,41,42,49,50 In the study by Moystad et al,40 6 subjects were inexplicably unaccounted for during the second phase of treatment. In the remaining 15 studies, subject attrition was not explicitly described.

Long-Term Follow-up

Long-term follow-up (6 months or greater) was reported in 10 of the 30 studies reviewed,24,27–33,42,45,46,49 with the “long-term” assessment occurring from 6 months to 4 years after treatment.

Adherence to Home Programs

Although home intervention programs were explicitly identified in 20 of the 30 studies reviewed, the rate of adherence was not reported in 17 of those studies.9,21,22,27,28,30–32,39,42,43,45–50 Only 3 studies identified the rate of adherence (via self-report). Magnusson and Syren24 reported adherence at long-term follow-up as less than 50%, Wright and colleagues34 reported a mean adherence of 75% after treatment, and Michelotti and colleagues25 reported adherence to the home physical therapy regimen as poor (27%) or medium (46%).

Discussion and Conclusions

The 22 RCTs included in the systematic review were ranked level II, using Sackett’s rules of evidence,17 due to low study quality. The remaining 8 studies were ranked level IV due to decreased rigor of the research designs.

Feine and Lund15 performed an analysis of review articles and controlled clinical trials to assess the efficacy of physical therapy and physical modalities for the control of chronic musculoskeletal pain disorders, which included TMD; they reported that symptoms improved during treatment with most forms of physical therapy, including placebo. Physical therapy was reported as almost always better than no treatment, with efficacy increasing in direct proportion to the amount of treatment received. In addition, those subjects who received more treatment modalities seemed to do better than those who received fewer modalities.15

With respect to specific interventions, 4 systematic reviews were located, none of which were included in the analysis performed by Feine and Lund.15 A 1996 systematic review59 stated that there was insufficient evidence to refute or support either manipulation or mobilization in treatment of the TMJ. A more recent systematic review of low-level laser therapy60 showed a reduction in pain and improvement in health status in chronic joint disorders. However, a systematic review of ultrasound in the management of chronic musculoskeletal disorders61 showed little evidence to support its use. A meta-analysis62 concluded that, although limited in extent, the available data support the efficacy of EMG biofeedback treatments for TMD.

Inclusion criteria varied among the studies we reviewed, likely due to the lack of consensus regarding the diagnosis of TMD. The lack of standardized inclusion criteria is a limitation when comparing studies, as well as with respect to the recommendations made. Subjects with myofascial TMD were included in 60% of the studies selected. The majority of patients who sought treatment for TMD and were subsequently involved in the studies were women.63 This finding may relate to a difference in treatment-seeking behavior between men and women, as well as the greater likelihood for women to have somatization disorders.63 The external validity of the recommendations is limited, due, in part, to the differences in the groups studied. There also may be differences between those who agree to participate in an RCT and those who do not. For example, one study64 showed that the patients who refused to participate had more pain and more condition-related interference in daily life when compared with those who participated.

Temporomandibular disorder-related pain of ≥6 months may represent a shift from acute to chronic TMD. Five of the studies in this review required a duration of pain for ≥6 months.4,24,34,49,50 The second axis of the RDC/TMD includes the more psychosocial aspects of TMD.6,8 Women and men who develop chronic TMD display more psychosocial distress than those whose acute TMD resolves. Other predictors of chronicity are TMD of the myofascial type and being female.64,65

Within our systematic review, a variety of interventions were used to treat the 3 TMD subgroups in the first axis. Interventions were grouped into 1 of 3 areas: exercise, electrotherapy, and biofeedback. Within the 3 areas, the interventions were often heterogeneous, making comparisons difficult. The use of multiple interventions in a number of studies resulted in recommendations based on a multi-intervention program because the effectiveness of a single intervention alone was not examined.

A spectrum of different outcome measures was used in the studies reviewed. Most of the studies included between 2 and 5 outcome measures. Although there was some continuity in the outcome areas assessed, the actual measures differed among the studies, with over 75 different methods used to assess the outcomes. Reliability was reported in only 8 studies,4,9,22,27,34,42,48,49 with only 2 studies22,34 reporting reliability on all of the outcome measures involved. Validity was reported in 3 studies,22,34,35 with only 1 study34 reporting on all of the outcome measures involved. Only 3 studies22,25,42 reported whether outcomes were clinically important. The lack of demonstrated reliability or validity for the outcome measures used limits the confidence with which the results may be interpreted.

Five studies22,24,25,34,49 fulfilled 6 or more (of 10) criteria for methodological rigor (Tab. 4). The majority of the remaining studies failed to report either reliability or validity for the outcome measures used, creating less confidence in the study results. Long-term follow-up to assess the retention of short-term treatment effects is critical to examining the efficacy of the interventions involved.

This review has several limitations. Because only English-language articles were included, it is possible that this review is not a complete representation of the available evidence. The review was limited to published articles and thus may have missed those that were not submitted or accepted for publication, presenting a possible publication bias. As only the first author performed the literature search and the subsequent selection of the studies to be considered in this review, a selection bias may be present. Additionally, the first author performed the data abstraction, as well as a significant proportion of the rating and classification of the studies, which may present a data abstraction and evaluation bias.

Implications for Clinical Practice

Despite reported limitations of this systematic review of the scientific evidence for physical therapy interventions for TMD, the following clinical recommendations are suggested:

  1. Active exercises and manual mobilizations, alone or in combination, may be effective in the short term in increasing total vertical opening (TVO) in people with TMD resulting from acute disk displacement, acute arthritis, or acute or chronic myofascial TMD. A home exercise program was often included in the treatment protocol.

  2. Postural training may be used in combination with other treatment techniques because the effects, independent of other treatments, are not known (eg, postural training combined with a home exercise program may decrease pain and increase TVO in people with myofascial TMD).

  3. Mid-laser therapy may decrease pain and improve TVO and lateral excursion in people with TMD secondary to acute disk displacement and may be more effective than other electrotherapy modalities in the short term, although comparison is difficult.

  4. Programs involving relaxation techniques and biofeedback, EMG training, and proprioceptive re-education may be more effective than placebo treatment or occlusal splints in decreasing pain and increasing TVO in people with acute or chronic myofascial or muscular TMD in the short term and the long term.

  5. Programs involving combinations of active exercises, manual therapy, postural correction, and relaxation techniques may decrease pain and impairment and increase TVO in the short term in people with TMD resulting from acute disk displacement, acute arthritis, or acute myofascial TMD. However, it is impossible to discern whether a combination program is more effective than providing the separate elements of the program as individual treatment techniques.

Implications for Future Research

The foregoing clinical implications should be considered with caution because none were supported by numerous, decisive studies. Consensus on the definition of TMD, and subsequent inclusion and exclusion criteria, would allow further comparison across groups studied. In addition, agreement on use of valid and reliable outcome measures would yield more rigorous research.


  • Ms Medlicott provided concept/idea/research design. Both authors provided writing and data collection and analysis. Dr Harris provided consultation (including review of manuscript before submission).

  • Received June 6, 2005.
  • Accepted January 31, 2006.

