Background The College of Physiotherapists of Ontario implemented an Onsite Assessment to evaluate the continuing competence of physical therapists.

Objective This study was undertaken to examine the reliability of the various tools used in the Onsite Assessment and to consider the relationship between the final decision and demographic factors.

Design This was a psychometric study.

Methods Trained peer assessors (n=63) visited randomly selected physical therapists (n=106) in their workplace. Fifty-three physical therapists were examined by 2 assessors simultaneously. The assessment included a review of practice issues, record keeping, billing practices, the physical therapist's professional portfolio, and a chart-stimulated recall process. The Quality Management Committee made the final decision regarding the physical therapist's performance using the assessor's summary report. Generalizability theory was used to examine the interrater reliability of the tools. Correlation coefficients and regression analyses were used to examine the relationships between demographic factors and performance.

Results The majority of the physical therapists (88%) completed the program successfully, 11% required remediation, and 1% required further assessment. The interrater reliability of the components was above .70 for 2 raters’ evaluations, with the exception of billing practices. There was no relationship between the final decision and age or years since graduation (r<.05).

Limitations Limitations include a small sample and a lack of data on system-related factors that might influence performance.

Conclusions The vast majority of the physical therapists met the College of Physiotherapists of Ontario's professional standards. Reliability analysis indicated that the number of charts reviewed could be reduced. Strategies to improve the reliability of the various components must take into account feasibility issues related to financial and human resources. Further research to examine factors associated with failure to adhere to professional standards should be considered. These results can provide valuable information to regulatory agencies or managers considering similar continuing competence assessment programs.

Professional competence has been defined as “the habitual and judicious use of communication, knowledge, technical skills, clinical reasoning, emotions, values, and reflection in daily practice for the benefit of the individual and community being served.”1(p226) Competence includes the ability to practice safely throughout a person's career, the ability to assimilate changes in the profession into one's practice, and the ability to deal with complex situations.2 Continuing competence is defined by the Federation of State Boards of Physical Therapy (FSBPT) as the “ongoing application of professional knowledge, skills and abilities which relate to occupational performance objectives in the range of possible encounters that is defined by that individual's scope of practice and practice setting.”3 As autonomous health care professionals, physical therapists are expected to demonstrate competence within the context of their practice environment and their role description.4 The efforts to maintain competence require “a commitment to a lifelong process of education and skill development to meet the ever-changing needs of health care.”5(p145)

Although it is suggested that health care professionals engage in self-assessment to monitor their own competence and guide professional development activities, there is little evidence to show that this practice is effective.6–8 Indeed, the literature on the accuracy of self-assessment among physical therapists suggests that the self-assessment skills of a physical therapist are not completely accurate, with those individuals whose performance is the poorest being most likely to have the least accurate self-assessments.6 Thus, more rigorous measures of competence, such as standardized written tests or peer feedback, are necessary to assist physical therapists in ascertaining their professional strengths and weaknesses and guiding their professional development activities.6 Furthermore, evidence from various continuing competency assessment programs undertaken by regulatory agencies for physicians indicates that as many as 10% to 12% of physicians require interventions to address performance difficulties.9–11 Similarly, up to 14% of pharmacists in a Canadian provincial jurisdiction have been found to not meet professional standards.12 For these reasons, formal programs to assess the continuing competence of health care professionals, including physical therapists, should be implemented. These programs are occasionally implemented in the workplace by management, but more commonly they are implemented by state or provincial authorities. This is the case in Ontario, Canada, where the assessment of competence is mandated by the government through the Regulated Health Professions Act (1991).13

There are currently no research reports regarding the psychometric properties of any tool or process to examine continuing competence in physical therapy. There are, however, a limited number of research reports that describe the psychometric properties of tools used to assess the continuing competence of other health care professionals. For example, the chart-stimulated recall (CSR) process, involving evaluation by peers, has been found to be a reliable and valid method to assess competence in several professional disciplines.9,14,15 In the CSR process, an interviewer uses an interview script to gain information about the care provided to patients, addressing areas such as the assessment of the patient, intervention planning and implementation, and evaluation of outcomes. In a study undertaken in occupational therapy, 2 occupational therapy faculty members assessed 12 occupational therapists on 2 occasions using a CSR process. The faculty members reviewed the care provided to patients using 10 charts on each occasion. Salvatori and colleagues14 reported high interrater reliability (intraclass correlation coefficient [ICC]=.97) and low inter-case reliability (ICC=.44). A CSR process also was part of the practice assessments used to examine the continuing competence of family physicians practicing in Ontario. Norman and colleagues9 examined the reliability of the various tools used by the College of Physicians and Surgeons of Ontario, which included a CSR process, an oral examination, standardized patients, a multiple-choice examination, and 5 objective, structured clinical examination stations. Intraclass correlation coefficients for interrater reliability ranged from .75 to .90 for the CSR process, from .68 to .79 for the oral examination, and from .72 to .79 for the standardized patients.9

Building on the lessons learned from these studies within both occupational therapy and medicine, a program to assess the continuing competence of physical therapists was introduced by the College of Physiotherapists of Ontario (CPO) through its Quality Management Program.16 It was developed in response to government legislation that required all agencies of regulated health care professions in Ontario to develop a program to assess the continuing competence of their licensed registrants. The CPO is the largest physical therapist regulatory agency in Canada (akin to a state board in the United States), with approximately 6,100 registrants. In the CPO's Onsite Assessment, trained peer assessors meet with randomly selected physical therapists in their workplace and use discussion and document review to evaluate the registrants’ adherence to the CPO's Standards for Practice for Physiotherapists, as described below and, in greater detail, on the CPO's Web site.17 Adherence to expectations regarding ethical practice outlined in the CPO's Code of Ethics18 also was included, as was knowledge about certain sections of relevant legislation and regulations that govern the practice of physical therapists in the province. The Quality Management (QM) Committee, a statutory committee defined in provincial regulation, oversaw the Onsite Assessment process. A description of the entire Quality Management Program and a copy of the precise evaluation forms used can be accessed from the CPO Web site.19,20

Van der Vleuten21 has provided a conceptual model with which to describe the utility of assessment methods. Utility is a multiplicative function of 5 variables with differential weights: reliability, validity, educational impact, acceptability, and feasibility.21,22 Indeed, developers of assessment programs need to identify the elements of the assessment that are most important for the context and purpose of the assessment and recognize that compromises will always need to be made for practical purposes. For example, if the assessment is to be formative (eg, in-training assessment), the assessment developers might consider compromising on reliability in favor of the educational impact of the measure. If the purpose of the assessment is to be summative, the reliability and validity aspects of the assessment will be most important. The essence of competence assessment is to attempt “to approximate the real professional or educational world as closely as possible, while maintaining standardized test-taking conditions,”21(p62) and these 5 qualities must be balanced in the process. Although it is clear that it is difficult to quantify some of the variables in the model, the relationship among the variables is conceived as multiplicative because if the value of any variable were zero, the utility of the assessment method would be zero.21
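The multiplicative relationship can be written schematically as follows. This is only an illustrative notation: the symbols U, R, V, E, A, and F and the placement of the weights as exponents are assumptions made here for clarity, since the text states only that the 5 variables carry differential weights.

```latex
\[
  U \;=\; R^{\,w_R} \times V^{\,w_V} \times E^{\,w_E} \times A^{\,w_A} \times F^{\,w_F},
  \qquad w_R, w_V, w_E, w_A, w_F > 0
\]
% U = utility, R = reliability, V = validity, E = educational impact,
% A = acceptability, F = feasibility (symbols are illustrative assumptions).
% If any one factor equals 0, the product U equals 0 regardless of the others.
```

Under this form, a formative assessment would correspond to a relatively large weight on educational impact, and a summative assessment to relatively large weights on reliability and validity.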

This study was undertaken to assess the extent to which peer assessment can provide valuable information for measuring the continuing competency of physical therapists through the Onsite Assessment. Specifically, the objectives of the study were: (1) to determine the reliability of the various tools comprising the Onsite Assessment and (2) to examine the relationships among the final decision of the QM Committee, demographic factors, and the results of the various components of the Onsite Assessment (including the assessors’ summary report) to determine which information most influenced the committee's decision making.


Development of the Onsite Assessment

The development of the Onsite Assessment was led by the Director, Quality Management, in collaboration with a small group of physical therapists interested in the Quality Management Program, staff at the CPO, the QM Committee, and the Council of the CPO. Although it is beyond the scope of this article to describe the process that took place over an approximately 6-month period, a brief summary will be provided. Further information can be obtained by contacting the coauthor (M.N.). The Onsite Assessment was developed during a 2-day workshop. A small group of physical therapists (n=8–10) were strategically selected to represent various clinical practice areas (musculoskeletal, neurology, cardiorespiratory), geographic areas of the province (urban and rural), and professional roles (clinicians, professional practice leaders, educators). The first day was facilitated by the Director, Quality Management, and was devoted to development of the Practice/Facility Evaluation, the Record Keeping Evaluation, the Billing Evaluation, and the Portfolio Review. The second day was facilitated by a consultant with experience in developing a CSR process for occupational therapists and was devoted to development of the CSR tool.

Various CPO documents (eg, Standards of Practice for Physiotherapists,17 Essential Competency Profile for Physiotherapists in Canada,23 and various CPO standards documents) were used extensively on both days to develop the questions and areas of focus for the assessment. The group discussed standards of practice that were amenable to a peer-review process and could be used in the assessment of continuing competence. For example, physical therapists are expected to: (1) maintain a body of specialized technical knowledge, (2) exercise reasoned judgment in the application of that knowledge, and (3) engage in lifelong learning activities. The various tools or processes by which these standards could be assessed were identified and discussed. The following processes and tools were selected as ways of enabling peer reviewers to judge whether these standards were met: a CSR process to address patient care, a review of the physical therapist's professional portfolio to address his or her learning needs and professional development activities, and the completion of a self-administered questionnaire that tested knowledge of various standards of practice and focused discussions to address record-keeping and billing practices. It should be noted that the provincial regulation (ie, Physiotherapy Act, 1991, Ontario Regulation 532/98)24 permits CPO staff or its delegates (ie, assessors) to access any patient charts necessary to conduct CPO business (eg, examining continuing competence), and assessors are bound by the confidentiality provisions in the Regulated Health Professions Act (1991).13

Following these meetings, a summary document was compiled by the Director, Quality Management, and circulated to the working group for review. Feedback was incorporated by the Director, and the proposed program was reviewed by the QM Committee and subsequently by the Council of the CPO, with recommended revisions incorporated. This iterative and collaborative process established the content validity of the Onsite Assessment. Subsequently, there was a pilot test of the Onsite Assessment to examine the feasibility of the process, train assessors, and finalize the scoring scheme. A detailed document outlining the results of the pilot test is available on the CPO Web site.25

The Onsite Assessment

The Onsite Assessment is intended to foster an environment that supports physical therapists in self-directed goals toward lifelong learning, while providing the opportunity to demonstrate continuing competence.16 As part of the Onsite Assessment, the assessor reviews the performance of the physical therapist in terms of the expected standards of practice described by the CPO in its effort to protect the public. When a physical therapist is randomly selected from the CPO database to undergo the Onsite Assessment, a peer assessor makes a half-day visit to his or her workplace. The peer assessor undertakes a review of the registrant's professional portfolio, which is to include 3 parts: (1) a curriculum vitae (CV), including professional development activities, (2) a learning plan, and (3) a completed Professional Issues Self-Assessment (PISA), a self-assessment questionnaire that assists registrants in identifying whether they are up-to-date on the CPO's standards of practice relevant to their work. The assessor conducts a walkabout of the workplace with the physical therapist and reviews a variety of issues related to the facility and the registrants’ practice. The assessor also examines the record-keeping practices of the registrant using 6 to 8 charts selected by the registrant and subsequently conducts a CSR process involving those same charts to discuss the provision of patient care. When relevant, the assessor reviews the billing practices (eg, for registrants in private practice or in long-term care facilities). A description of the components of the Onsite Assessment is provided in Appendix 1. For each of the components, the assessors used a scoring form that outlined the specific questions to review with the physical therapist in order to cover all of the required material in a standardized manner. These forms can be downloaded from the CPO Web site.20

At the time of this study, the QM Committee had proposed that 50 physical therapists be randomly identified for assessment each month using a computer-generated algorithm. This number would ensure that all physical therapists would be assessed within a 10-year period. To be eligible, physical therapists had to be working in clinical practice, not assessed in the previous 5 years, and registered for at least 3 years. Selected physical therapists received notification by mail that they had 3 months to complete the process, along with a detailed outline of the Onsite Assessment process. The therapists were asked to contact the CPO to confirm their participation and to discuss any questions about the process. It was possible for the therapists to defer their participation for valid personal or professional reasons (eg, personal illness, maternity leave).
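The selection step can be illustrated with a minimal sketch. The CPO's actual algorithm is not described here, so the record layout (`Registrant` and its field names), the treatment of "not assessed in the previous 5 years" as "5 or more years since the last assessment," and the function names are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registrant:
    # Hypothetical record layout; field names are illustrative assumptions.
    reg_id: int
    in_clinical_practice: bool
    years_registered: float
    years_since_last_assessment: Optional[float]  # None means never assessed


def is_eligible(r: Registrant) -> bool:
    """Apply the three eligibility criteria stated in the text."""
    not_recently_assessed = (
        r.years_since_last_assessment is None
        or r.years_since_last_assessment >= 5  # boundary handling is an assumption
    )
    return r.in_clinical_practice and r.years_registered >= 3 and not_recently_assessed


def monthly_draw(registrants, n=50, seed=None):
    """Draw up to n eligible registrants at random, without replacement."""
    rng = random.Random(seed)
    pool = [r for r in registrants if is_eligible(r)]
    return rng.sample(pool, min(n, len(pool)))


def years_to_cover(total_registrants, per_month=50):
    """Years needed to assess everyone once at a fixed monthly rate."""
    return total_registrants / (per_month * 12)
```

With approximately 6,100 registrants and 50 assessments per month, `years_to_cover(6100)` gives about 10.2 years, which is in line with the stated 10-year cycle.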

Selected physical therapists were expected to prepare for their assessment by reviewing all of the material received from the CPO. The scoring forms used by the assessor were provided to the registrants to ensure they were knowledgeable about what the Onsite Assessment would involve and what questions would be asked.20 They were expected to prepare their portfolios according to specified requirements and to identify 6 to 8 charts for the record-keeping review and CSR process. Guidelines for chart selection were provided. They included the requirements that 1 to 2 charts be related to recently discharged patients. Preferably, patients should have been seen over a period of time, not seen once and discharged, and the charts were to reflect a variety of diagnoses, so that not all patients had the same condition. If the therapist performed any controlled acts, one chart should include a patient for whom the controlled act was performed.26 There are 13 controlled acts in Ontario. These are acts restricted to those professions that have been granted the authority to perform them. The controlled acts that physical therapists are authorized to perform are suctioning (also granted to respiratory therapists, physicians, and nurses) and spinal manipulation (also granted to chiropractors and physicians).26

Each randomly selected physical therapist was matched with an assessor who practiced in a similar practice area (eg, if the therapist was working primarily with neurological clients, the assessor would be someone with similar clientele). All peer assessors had at least 5 years in practice and were selected by the CPO for the purpose of peer assessment. All peer assessors had attended a 2-day assessor training workshop to prepare for this role. Included in the training was a session led by the Director, Quality Management, which was devoted to the discussion of scoring the various components using the response options, based on the principles outlined by the CPO documents (eg, Standards of Practice for Physiotherapists,17 Essential Competency Profile for Physiotherapists in Canada23) and the CPO mandate of protection of the public. A “present” or “absent” rating was used for the 3 portfolio sections, and a 4-point response format was used for the other 4 components of the Onsite Assessment. The 4 potential responses were “satisfactory,” “needs minor improvements,” “needs major improvements,” and “unacceptable practice” and were determined by the judgment of the trained peer assessor. During the training workshop, considerable time was spent reviewing the various CPO documents and discussing the application of the standards of practice. Assessors were told that the expectation was for physical therapists to be practicing at a level acknowledged by the community as reasonable practice. This approach was similar to the intent of the American Physical Therapy Association's definition of continued competence as “a component of professional development that addresses the minimum requirements of contemporary practice.”27 Time in the training workshop was allotted to a discussion among assessors and CPO staff regarding the various standards and acceptable performance, based on that expectation.

The peer assessor visited the physical therapist's workplace for approximately 4 hours and carried out the Onsite Assessment. The assessor provided formative feedback to the therapist throughout the process. For example, the assessor would offer suggestions about how to ensure the record keeping better met CPO standards, additional management strategies to consider for the therapist's patients, or how to improve the organization of the therapist's professional portfolio and its contents. The assessor completed a summary report and forwarded the completed materials (summary report and all rating sheets) to the QM Committee. The summary report included comments and a rating regarding the overall performance of the physical therapist using the same 4-point response format described above. The QM Committee made the final decision about the therapist's performance based on its review of the documentation and ratings from the peer assessor. There were 5 possible outcomes the QM Committee could assign: completed successfully, completed with recommendations, self-directed remediation required, QM Committee remediation required, and reassessment required. Appendix 2 provides a full description of the various outcomes. Upon completion of the assessment, the Director forwarded a letter to the therapist outlining the QM Committee's decision.

A subset of physical therapists (n=53) comprised a sample of convenience that was assessed by 2 assessors simultaneously to enable examination of the interrater reliability of the different components within the Onsite Assessment. The allocation of the 2 assessors was influenced by feasibility factors (ie, identifying 2 assessors with similar clinical practice areas and practice settings who lived in reasonable proximity to the registrant). For each assessment, one physical therapist was designated as the “primary assessor” and the other as the “observer.” All novice assessors were observers for the assessment of at least one physical therapist prior to assuming the role of primary assessor. In the case where there were 2 assessors, the primary assessor's evaluation and summary report were forwarded to the QM Committee.

Data Analysis

An “anonymized” database was received from the CPO with the results of the first 3 months of the Onsite Assessment (January–March 2005). It included each registrant's demographic information of age, graduation year, educational credentials, and area of practice (eg, orthopedics, neurology). The variable “years since graduation” was determined by subtracting the year of graduation from 2005. The database also included the ratings of the various components from the primary assessor and the observer-assessor (when present), the summary report, and the QM Committee's final decision. For the purpose of the analyses, the categories of responses for the peer assessor and QM Committee decisions listed in the previous section were coded ordinally as 1 through 4 and 1 through 5, respectively. Statistical analyses were conducted using SPSS 13.0 for Windows, Graduate Student Version, and Excel 2004 for Macintosh, version 11.0.
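The derived variable and the ordinal coding can be sketched as follows. The category labels are those listed in the previous section; the direction of the coding (1 for the most favorable category) and all names in the code are assumptions, because the text states only the ranges 1 through 4 and 1 through 5.

```python
# Ordinal codes for the peer assessor's 4-point response format.
# Assumption: 1 is the most favorable category.
ASSESSOR_CODES = {
    "satisfactory": 1,
    "needs minor improvements": 2,
    "needs major improvements": 3,
    "unacceptable practice": 4,
}

# Ordinal codes for the QM Committee's 5 possible outcomes (same assumption).
QM_CODES = {
    "completed successfully": 1,
    "completed with recommendations": 2,
    "self-directed remediation required": 3,
    "QM Committee remediation required": 4,
    "reassessment required": 5,
}


def years_since_graduation(graduation_year, reference_year=2005):
    """Years since graduation, computed against the 2005 data-collection year."""
    return reference_year - graduation_year


def code_rating(label, codebook):
    """Look up the ordinal code for a rating label, ignoring case and outer spaces."""
    normalized = {k.lower(): v for k, v in codebook.items()}
    return normalized[label.strip().lower()]
```

For example, a therapist who graduated in 1987 would have `years_since_graduation(1987)` equal to 18, which happens to match the sample mean reported for the registrants.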

Descriptive analyses were used to identify the frequency of the results of the Onsite Assessment and the demographic data regarding the registrants and assessors. Generalizability theory was used to calculate the reliability of each of the 7 components (including the portfolio sections) of the Onsite Assessment, as well as of the rating assigned to the summary report.28–30

Mathematically, reliability is a proportion (ranging from 0 to 1) indicating how much variance in scores can be attributed to the participants, which in our case would be the physical therapists, as opposed to that attributable to measurement error induced by differences between raters and the interaction between raters and participants, for example. Conceptually, interrater reliability can be thought of as the strength of the association between the ratings assigned to the physical therapists by one rater and those assigned by another rater. Inter-case reliability, by analogy, can be thought of as the strength of association between the ratings assigned to physical therapists on one patient case and those assigned on another case.
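The proportion described above can be written compactly. The notation is an assumption introduced here for illustration: the subscript p denotes the participants (physical therapists), and the error term collects the variance attributable to raters and to the rater-by-participant interaction.

```latex
\[
  \text{reliability}
  \;=\;
  \frac{\sigma^2_{p}}{\sigma^2_{p} + \sigma^2_{\text{error}}},
  \qquad 0 \le \text{reliability} \le 1
\]
% sigma^2_p     : variance in scores attributable to the physical therapists
% sigma^2_error : variance attributable to measurement error
%                 (eg, rater differences, rater-by-therapist interaction)
```

The ratio approaches 1 when nearly all of the observed variance reflects differences among therapists and approaches 0 when measurement error dominates.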

Generalizability theory enables us to consider the impact of multiple sources of error (eg, raters and patient case differences) at the same time, thereby allowing the dual advantages of generating estimates of reliability that better account for potential sources of error than ICCs, while allowing the analyst to determine how reliability could be optimized by varying the number of observations that are averaged.28–30 This latter process is known as a “decision study” (D-study) and, in effect, builds on the general notion that an average score tends to be more trustworthy than a single observation, as error becomes diluted when averaging across multiple observations. G coefficients range between 0 and 1, with 0.75 or higher typically being treated as a threshold for high-stakes decision making.28–30 Although the results of a D-study are a prediction of the reliability expressed across potential scenarios (eg, involving different numbers of raters and cases), they provide invaluable information to program developers about how widely different sources of variance should be sampled, which, in turn, can guide future program modifications.30

Interrater reliability was assessed for all components of the Onsite Assessment using a subset of 53 physical therapists whose performance was reviewed by 2 peers, one as an assessor and the other as an observer. For the CSR process, both rater and case (ie, patient charts) were variables in the data collected, thus allowing both an interrater and inter-case assessment of reliability.28 A minimum of 6 cases per physical therapist were expected to be available for the 53 physical therapists. However, 7 of the 318 scores (6 × 53) were missing (2%) and were subsequently replaced with the mean score of all of the documented scores (using both assessors’ ratings) on that therapist's record. The variance components derived from the generalizability study were submitted to a D-study to estimate the reliability in different hypothetical scenarios where the number of charts examined by 1 or 2 assessors increased from 1 to 6 charts.
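The handling of missing chart scores can be illustrated with a short sketch. It assumes that each therapist's record is held as one list of chart scores per assessor, with `None` marking a missing score; this layout and the function name are hypothetical and are not taken from the study database.

```python
from statistics import mean


def impute_with_therapist_mean(scores):
    """Replace missing chart scores with the therapist's own mean score.

    scores: list of rows, one per assessor, each a list of chart scores
            in which None marks a missing value (hypothetical layout).
    The fill value is the mean of all documented scores on the record,
    pooled across both assessors, as described in the text.
    """
    observed = [s for row in scores for s in row if s is not None]
    if not observed:
        raise ValueError("no documented scores to impute from")
    fill = mean(observed)
    return [[fill if s is None else s for s in row] for row in scores]


# Illustrative record (made-up scores): 2 assessors x 6 charts, 1 missing score.
record = [
    [1, 2, None, 1, 1, 1],  # primary assessor
    [1, 2, 2, 1, 1, 1],     # observer
]
completed = impute_with_therapist_mean(record)
```

Because only 7 of 318 scores were missing, this substitution should have had little influence on the variance estimates, although mean substitution in general tends to reduce variability slightly.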

Spearman correlation analyses and multiple regression analyses were used to consider the relationships between the demographic factors and the QM Committee's final decision, between the assessment instruments, and between individual instruments and the QM Committee's final decision. Summarizing the ratings generated from Likert scales and submitting them to parametric statistical analysis can be done without concern.31 Mean scores were computed for the various components of the Onsite Assessment when a pair of assessors had undertaken the assessment, and the single assessor's score was used if there was not a pair of ratings. For the regression analysis, the mean scores for the summary report, record keeping, CSR process, facility and practice report, and billing were used as independent variables. The actual scores (present or absent) for PISA, CV, and professional development ratings were used, but the data from 2 physical therapists were removed from the regression analysis because the assessor and observer provided different scores. Interactions between variables were left out due to sample size constraints.
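For readers unfamiliar with the Spearman coefficient, the following minimal sketch shows how it can be computed: each variable is converted to ranks (tied values share the average rank), and a Pearson correlation is then computed on the ranks. The study itself used SPSS; the function names below are illustrative and the code is not a reproduction of the authors' analysis.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Ordinal codes such as the 1-through-5 final decision produce many ties, which is why the average-rank treatment matters for data of this kind.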

Role of the Funding Source

This research was funded by a Strategic Training Fellowship in Rehabilitation Research to Dr Miller from the Institute of Musculoskeletal Health and Arthritis of the Canadian Institutes of Health Research.


The Registrants

In the initial 3 months of the Onsite Assessment, 106 physical therapists were assessed. Some demographic information was missing for 4 of the 106 registrants. The mean age of the therapists (n=103) was 42 years (SD=9.5), with a mean time since graduation of 18 years (SD=10.7). The professional (entry-level) qualification was a baccalaureate degree for the majority of the therapists (73%) and a diploma in physical therapy for the others (27%). The modal area of practice, as described by the physical therapists (n=102), was orthopedics (46%). The remaining therapists identified their area of practice as general (all areas) (15%), neurology (7%), and rehabilitation (7%), with other areas (eg, respirology, sports medicine, critical care, rheumatology) comprising the remaining 25%. The majority of the physical therapists (63%) were in private practice.

The majority of the physical therapists (88%, n=93) completed the program successfully, 11% (n=12) were deemed to require a remediation program, and 1 therapist (1%) was thought to need reassessment. Table 1 shows the final decisions of the QM Committee. Scores of the therapist who required reassessment were removed from subsequent analyses due to concerns regarding the validity of assessment findings. Table 2 shows the ratings of the primary assessor on the various components of the Onsite Assessment.

Table 1. Final Decisions of the Quality Management Committee (N=106)

Table 2. Summary of the Primary Assessors’ Ratings for the Components of the Onsite Assessment

The Assessors

The assessments were undertaken by a total of 63 different assessors, 21 of whom served as both assessor and observer when the interrater reliability study was conducted. The mean age of the assessors was 44 years (SD=8.2), with a mean time since graduation of 19 years (SD=7.6). For the majority of assessors (75%), the professional (entry-level) qualification was a baccalaureate degree, with the remaining assessors (25%) having received a diploma. The largest number of assessors were working in private practice (n=24, 38%), with general hospitals (n=12, 19%) and home visiting agencies (including the Community Care Access Centre) (n=10, 16%) as the next most frequent work settings. The remaining work settings (27%) included long-term care and continuing care facilities, rehabilitation centers, pediatric facilities, and government-funded clinics. There was 1 individual for whom data regarding primary area of practice were missing. The modal primary area of practice for the assessors (n=62) was identified to be orthopedics (52%). The remaining assessors identified their primary area of practice as general (all areas) (21%) or neurology (10%), with other areas (eg, respirology, cardiology, critical care, administration, rehabilitation, sports medicine, rheumatology) comprising the remaining 17%.

Objective 1: Reliability of the Onsite Assessment

The interrater reliability of the various components was moderate to good in all instances, with the exception of the assessment of billing practices, as illustrated in Table 3. As described above, the reliability of the average of multiple observations tends to be higher than the reliability of a single observation. As such, G1 in Table 3 indicates the extent to which a single rater's ratings are predictive of those of a second rater, whereas G2 indicates the extent to which the average score provided by 2 raters could be used to predict the average scores provided by another pair of raters.
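The link between the two coefficients can be made explicit under a simplifying assumption: if raters are the only facet and the error variance is divided by the number of raters averaged, the coefficient for n raters follows from the single-rater coefficient. The symbols below are illustrative, and the numerical example is hypothetical rather than taken from Table 3.

```latex
\[
  G_n \;=\; \frac{n\,G_1}{1 + (n-1)\,G_1},
  \qquad\text{so that}\qquad
  G_2 \;=\; \frac{2\,G_1}{1 + G_1}
\]
% Hypothetical example: G_1 = 0.60 gives G_2 = 1.20 / 1.60 = 0.75.
```

This relationship shows why G2 always equals or exceeds G1: averaging 2 raters' scores halves the error variance while leaving the variance among therapists unchanged.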

Table 3. Interrater Reliability Coefficients for the Various Components of the Onsite Assessment

The reliability of the CSR process can be examined in a more elaborate fashion due to the availability of a second variable (case). The inter-case reliability for the ratings assigned to a single case (ie, the extent to which the ratings assigned to one case can be expected to correlate with those assigned to another case) was found to be .65. This reliability rose to .92 when the average rating, taken across 6 cases, was considered. The D-study revealed that the number of charts peer assessors were asked to review could be dropped to 2 or 3 while still achieving an acceptable inter-case reliability (G>.75) (Tab. 4).
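The projection from 1 chart to several charts can be approximated with the following sketch. It assumes that, with case as the only facet considered, the D-study reduces to dividing the error variance by the number of charts averaged (algebraically the same as the Spearman-Brown formula); the function name is hypothetical, and the projected values may differ in the second decimal place from those in Table 4.

```python
def projected_reliability(g_single, n_obs):
    """Project the reliability of an average of n_obs observations.

    g_single: reliability of a single observation (between 0 and 1).
    n_obs:    number of observations (eg, charts) that are averaged.
    Assumes one source of error whose variance is divided by n_obs.
    """
    if not 0 <= g_single <= 1:
        raise ValueError("g_single must lie between 0 and 1")
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    return n_obs * g_single / (1 + (n_obs - 1) * g_single)


# Reported single-case inter-case reliability for the CSR process.
SINGLE_CASE_G = 0.65

for n_charts in range(1, 7):
    g = projected_reliability(SINGLE_CASE_G, n_charts)
    print(f"{n_charts} chart(s): G = {g:.2f}")
```

Starting from the reported single-case value of .65, the projection for 6 charts is about .92, which agrees with the reported figure, and the projection for 2 charts is about .79, which already exceeds the .75 threshold.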

Table 4. Inter-case Reliability for Varying Numbers of Charts Reviewed in the Chart-Stimulated Recall Process

The overall test reliability (ie, generalizing across both raters and cases at the same time rather than considering each in isolation) was .48 for a single observation (one rater and one chart). This statistic indicates the extent to which the scores assigned by one rater to one chart are predictive of the scores assigned by another rater to a different chart. Table 5 illustrates the results of the D-study, which indicate the impact on overall test reliability of averaging across variable numbers of cases and raters.
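A general form for this kind of two-facet projection is sketched below. It assumes, for simplicity, a fully crossed design with therapists (p), raters (r), and cases (c); because each therapist supplied his or her own charts, the actual analysis may have involved nested terms, and the variance components themselves are not reported here. The notation is illustrative.

```latex
\[
  G(n_r, n_c)
  \;=\;
  \frac{\sigma^2_{p}}
       {\sigma^2_{p}
        + \dfrac{\sigma^2_{pr}}{n_r}
        + \dfrac{\sigma^2_{pc}}{n_c}
        + \dfrac{\sigma^2_{prc,e}}{n_r\,n_c}}
\]
% n_r = number of raters averaged, n_c = number of charts averaged.
% Setting n_r = n_c = 1 gives the single-observation coefficient,
% which is the quantity reported as .48 in the text.
```

Increasing either the number of raters or the number of charts shrinks the corresponding error terms, which is the pattern a table of this kind is expected to show.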

Table 5.

Overall Test Reliability for the Chart-Stimulated Recall Process Expressed as a G Coefficient as a Function of the Number of Charts and Raters (n=53)
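A two-facet D-study of this kind can be sketched as follows. The example assumes a fully crossed person x rater x case design and a relative (norm-referenced) G coefficient; the variance components are hypothetical values chosen only to show the calculation and are not estimates from this study, whose variance components are not reported in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarianceComponents:
    """Variance components for a crossed person x rater x case design."""
    person: float        # true differences among physical therapists
    person_rater: float  # person-by-rater interaction
    person_case: float   # person-by-case interaction
    residual: float      # three-way interaction confounded with random error


def g_coefficient(vc: VarianceComponents, n_raters: int, n_cases: int) -> float:
    """Relative G coefficient for a score averaged over raters and cases."""
    relative_error = (
        vc.person_rater / n_raters
        + vc.person_case / n_cases
        + vc.residual / (n_raters * n_cases)
    )
    return vc.person / (vc.person + relative_error)


# Hypothetical values for illustration only; NOT estimates from this study.
example = VarianceComponents(person=1.0, person_rater=0.3,
                             person_case=0.4, residual=0.5)

# Rows are numbers of raters; columns are numbers of charts.
for n_raters in (1, 2):
    row = [g_coefficient(example, n_raters, n_cases) for n_cases in (1, 3, 6)]
    print(f"{n_raters} rater(s):", [f"{g:.2f}" for g in row])
```

Each error term shrinks only with the facet it involves, so adding charts cannot compensate for disagreement between raters, and adding raters cannot compensate for case-to-case variation.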

Objective 2: Relationships Between the QM Committee's Final Decision and Other Components of the Onsite Assessment

No significant relationships were identified between the final decision and the physical therapists’ age, years since graduation, or professional educational program (r<.05 in all instances) (Tab. 6). The correlation between the final decision of the QM Committee and the mean summary report score was r=.50 (P<.01) (n=105). The univariate correlation between each pair of components is shown in Table 7. Using multiple regression analysis, record keeping was found to be the most significant contributor to both the QM Committee's final decision and the assessors’ mean summary report. Table 8 presents the results of the regression analysis.

Table 6.

Relationships Among Demographic Factors and the Final Decision Expressed Using Spearman Correlation Coefficients

Table 7.

Univariate Correlations Among the Assessment Components, the Summary Report Mean, and the Quality Management Committee's Final Decision

Table 8.

Relationships Among Onsite Assessment Components Explored Using Multiple Linear Regression


The reliability of an assessment tool addresses the amount of error, both random and systematic, inherent in any measurement, and it also reflects the ability of that tool to consistently discriminate among the individuals assessed.28 The interrater reliability of the various components of the Onsite Assessment ranged from fair to good, with the exception of the assessment of billing practices. These findings were similar to those reported for the peer-assessment program undertaken by the College of Physicians and Surgeons of Ontario, where a CSR process also was used.9 Salvatori and colleagues14 had reported a higher interrater reliability of the CSR process in a sample of occupational therapists (ICC=.97). The lack of reliability for the billing practices component most likely arose because there was so little variability in performance (as shown in Tab. 2), which ultimately affects the ability of the tool to reliably discriminate among physical therapists. Although the reliability of the components is improved with additional raters, using one rater and a triangulation of methods across the Onsite Assessment in order to make a decision about competency (ie, considering the results across the various components) may be the most feasible manner in which to account for the moderate reliability of the individual components.
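The link between limited variability and low reliability follows from defining a reliability coefficient as the share of observed score variance that reflects true differences among the people assessed. A minimal sketch, using hypothetical variances that are not estimates from this study:

```python
def reliability(between_person_variance: float, error_variance: float) -> float:
    """Share of observed score variance due to true differences among people."""
    return between_person_variance / (between_person_variance + error_variance)


# Hypothetical variances for illustration only; not estimates from this study.
ERROR_VARIANCE = 0.20

for label, person_variance in [("wide spread in performance", 0.80),
                               ("nearly uniform performance", 0.05)]:
    value = reliability(person_variance, ERROR_VARIANCE)
    print(f"{label}: reliability = {value:.2f}")
```

With the same amount of rater error, the coefficient falls from .80 to .20 when nearly everyone performs alike, even though the raters are no less accurate.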

The results of this study indicate that the assessors can reduce the number of charts reviewed with the physical therapist from 6 to 3 and still achieve acceptable reliability (inter-case ICC=.85). The high inter-case reliability in this study was likely due to a number of factors. The candidates were free to choose their own charts in this study; therefore, they may have selected those that addressed patients with similar conditions, leading to little variation. Furthermore, the “halo effect” bias may come into play here, with the assessors being biased in the assessment of the charts encountered later in the CSR process by their perception of the charts presented earlier in the process.32 The inter-case reliability estimated in this study was higher than that identified in the study by Salvatori and colleagues,14 who found that a random sample of 11 charts was required to yield an overall reliability of .90. The charts for the CSR process were selected randomly in the study by Salvatori and colleagues, which may have given rise to a more heterogeneous sample of cases, thereby producing greater variation in each occupational therapist's performance.14 Asking the physical therapist to select the 6 to 8 charts for the Onsite Assessment (in contrast to having hospital personnel or the assessor select the charts) could raise the concern that he or she might select only the charts representing the provision of care meeting all the required standards. However, this does not appear to be the case, because in both the CSR process and the record-keeping evaluations, the assessors were able to reliably identify a range of performances within the sample. Therefore, the selection of charts by the physical therapist seems to be acceptable. Subsequent to this study, the number of charts required for review in the Onsite Assessment was reduced to 6, rather than 6 to 8. The number of charts reviewed was not reduced to a number less than 6 because the QM Committee felt it was important to review that number in order to ensure that the physical therapist was meeting the documentation standards.

None of the demographic factors examined in this study (ie, registrants’ age, time since graduation, or professional education) were found to be associated with registrants’ performance. Similar findings (ie, no significant correlation between performance and time since graduation) were reported in samples of both Canadian and internationally educated physical therapists beyond their graduation year who completed the written component of the Canadian Physiotherapy Competency Examination.33 These findings stand in contrast to the significant inverse relationship between competence and age or time since graduation identified among physicians9-11 and pharmacists.12 Age and time since graduation were highly correlated for obvious reasons. The education factor is highly correlated with both of these factors as a result of the historical context, where the entry-level diploma program was replaced by the baccalaureate program in Canada. Further study to investigate the factors associated with poor performance on the Onsite Assessment, which may be indicative of failure to comply with certain standards of practice, is warranted.

There was a moderate correlation between the QM Committee's final decision and the summary report from the assessor (r=.50, P<.01). Although a stronger correlation between these 2 ratings might have been anticipated, this finding could be attributed to several factors. It could be that the assessors were not comfortable in assigning scores lower than “satisfactory.” Norcini34 has suggested that when the stakes are high, as they would be in the Onsite Assessment, peers can be reluctant to provide lower ratings. As well, it could be that the scoring guidelines were unclear or misinterpreted by the assessors, which could be the case because this study was undertaken using the results from the first months of the new program.

The majority of physical therapists in this sample (88%) were deemed to be providing safe and competent care, as determined through the document reviews and discussions with the physical therapists. Although this finding speaks well of the physical therapists involved, of greater concern are those who do not meet the required standards. Eleven percent of therapists (n=12) were required to engage in a remediation process. This finding is similar to those in peer-assessment reviews of physicians, where 10% to 12% have been identified as demonstrating serious difficulties,9-11 and of pharmacists in the same Canadian jurisdiction, where the figure was 14%.12 These findings alone indicate the importance of instituting a formal program such as the Onsite Assessment to identify those physical therapists whose practice falls below acceptable standards in order to protect the public from harm. The majority of remediation processes required of the therapists were self-directed, suggesting that the issues were not major. In only a few cases (n=3) was the remediation directed by the QM Committee, indicating a more serious concern. All remediations were followed by a reassessment to ensure the appropriate standards of practice had been met.

The results of the regression analyses suggest that the physical therapists’ performance in record keeping predominantly informed the ratings of both the peer assessors (ie, summary report) and the QM Committee (ie, final decision). This was the component of the Onsite Assessment with the poorest performance, with only 71% of the physical therapists viewed by the peer assessor as demonstrating “acceptable” practices. It is not surprising, therefore, that this component appeared as the most predictive independent variable in both regression analyses, as it was the component demonstrating the greatest variability. The incidence of successful charting procedures is similar to that in a sample of Quebec physicians surveyed through the Collège des Médecins du Québec, where 75% were judged to have good record-keeping practices.35 Reported record-keeping deficiencies have been related to the omission of important details, poor organization, and illegibility.36 There is evidence that remediation directed by a regulatory agency can result in improved clinical practice, including improved record keeping.37 Little is known about the relationship between submitting a PISA and adherence to professional standards, but it seems likely that the same characteristics that lead a person to keep careful records also may result in diligence with respect to completing this sort of administrative task. The predictive nature of the CSR process adds to the promise inherent in previously published research into the psychometric properties of this activity from other professions.14

Van der Vleuten's model of utility21,22 has informed our review of the CPO Onsite Assessment. Because this is a summative assessment, with consequences related to therapists’ ability to maintain their license to practice, great weight was placed upon the reliability of the assessment. The results of this study indicate that defensible summative decisions can be made using the current process and that the number of charts required for review can be reduced. The content validity of the process was defined through the comprehensive development process, which included pilot testing. Feedback is requested from physical therapists following the Onsite Assessment, which addresses some aspects of the acceptability and educational impact of the process. Although the costs of conducting such a program are quite high, averaging $525.00 (Canadian) per physical therapist assessment undertaken by one assessor (which includes the assessor's travel costs),38 the feasibility of this process has been determined to be acceptable by the CPO. This model may be useful to others (eg, other regulatory agencies, physical therapy managers or educators) who develop and use competence assessment tools or programs.

There are several limitations to this study. Although the sample size for this study was small, it was a random sample from the largest jurisdiction in Canada and, as such, offers us the first evidence regarding the continuing competence of Canadian physical therapists. Additionally, the Onsite Assessment does not include observation of the therapist while he or she is providing patient care or communicating with others (eg, patients, families, other health care professionals in a variety of contexts). Rethans and colleagues39 have identified the importance of addressing system-related factors (eg, policies and guidelines in the facility, access to other health care professionals) and individual characteristics (eg, mental health, relationships with others) when assessing clinical competence. The Onsite Assessment offers the assessor an opportunity to directly interact with the physical therapist and to evaluate his or her practice in the context of the work environment through document review and discussion. The Onsite Assessment, with the personal interaction between the assessors and the examinees in their workplace, appears to be a feasible process that can offer a comprehensive view of the physical therapist's continuing competence. Further validity testing, however, is needed before these conclusions can be stated with absolute confidence.

Subsequent research related to the Onsite Assessment could further address the validity of the process, potentially using a global rating scale completed by peers14 or multi-source feedback questionnaires.40,41 A study is under way to re-examine the reliability and validity of the Onsite Assessment (now called the Practice Assessment) using methods similar to those reported here, as it has been in operation for several years. Furthermore, the educational impact of the Onsite Assessment could be interesting to study because the registrants receive both formative and summative feedback through the process. An exploration of how the registrants use the feedback (ie, whether it alters their practice or influences the selection of continuing education events) might be interesting to undertake.


This is the first publication to report the results of peer assessment of continuing competence in a sample of physical therapists. The majority of the physical therapists assessed in the Onsite Assessment were considered to be providing competent care and adhering to professional standards of practice. This study provides important information about the psychometric properties of the components of the Onsite Assessment. When only one assessor was used, the reliability of the various components of the Onsite Assessment ranged from fair to good. However, using triangulation of results from various tests and the summary report that comprise the Onsite Assessment to inform the final decision should enhance the validity of the process. The findings indicate that the number of charts reviewed in the CSR process can potentially be reduced. Unlike similar studies with physicians and pharmacists, no evidence of a decline in competency associated with increasing age or time since graduation was demonstrated in this sample. Further research to identify factors associated with physical therapists’ failure to comply with the required standards of practice is indicated. The results of this study can provide valuable information to other physical therapy regulatory agencies or managers considering similar continuing competency assessment programs.

Appendix 1.

Components of the Onsite Assessment

Appendix 2.

Description of the Possible Outcomes of the Onsite Assessment


  • All authors provided concept/idea/research design. Dr Miller and Dr Eva provided writing and data analysis. Dr Nayer provided data collection, participants, and institutional liaisons. Dr Miller and Dr Nayer provided project management. Dr Nayer and Dr Eva provided consultation (including review of manuscript before submission).

  • Ethical approval for this study, which was conducted as part of Dr Miller's PhD thesis, was secured from the Faculty of Health Sciences' Research Ethics Board at McMaster University.

  • A podium presentation of this research was given at the 15th International Congress of the World Confederation for Physical Therapy; June 6, 2007; Vancouver, British Columbia, Canada.

  • This research was funded by a Strategic Training Fellowship in Rehabilitation Research to Dr Miller from the Institute of Musculoskeletal Health and Arthritis of the Canadian Institutes of Health Research.

  • * SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.

  • Microsoft Corp, One Microsoft Way, Redmond, WA 98052-6399.

  • Received May 6, 2008.
  • Accepted March 18, 2010.

