Do we know how to improve teaching? I don’t mean tinkering around the edges—making a particular history lesson better or getting an individual teacher to alter his or her instructional strategies—but a lasting, substantive change, one that reshapes the profession. Do we know how to transform bad teachers into adequate teachers? Can we take teachers who are merely adequate and make them good—even outstanding?
Those questions are especially relevant right now. The burden of answering them affirmatively falls on professional development (PD). All levels of government spend a huge amount of money on teachers’ professional development; it’s a mainstay of federal education policy. Expenditures on Title II of the Elementary and Secondary Education Act (The Eisenhower Program), mostly devoted to PD, are budgeted at about $2.3 billion in 2014. More than $450 million of Investing in Innovation (i3) grant money spent from 2010 to 2012 went to PD. Advocates of school reforms that affect teaching and learning inevitably rely on PD to implement their preferred changes. The prominent contemporary example is the Common Core, whose advocates are counting on PD to equip teachers with the instructional capacity to put the standards into practice.
Evaluations of Professional Development
As reported by Stephen Sawchuk in Education Week, a study conducted by Instructional Research Group and released last week reviewed the research on professional development in K-12 mathematics. Good research reviews whittle down an initial pool of studies based on quality of design. This review found that of 910 PD studies identified in a search of the relevant literature, only thirty-two employed a research design for assessing the effectiveness of PD programs. Of those, only five met the evidence standards set by What Works Clearinghouse. Of the five studies, two had positive results, one showed limited effects, and two detected no discernible effects.
Such dismal findings aren’t confined to PD in math. This past summer, the Institute of Education Sciences (IES) released a summary of randomized controlled trials (RCTs) funded over the past decade. Randomized evaluations of educational interventions are important because they produce more reliable findings about causality than studies employing other methods. Two of the IES studies evaluated professional development programs, one in early reading and the other in middle school mathematics.[i]
The early reading study randomly assigned teachers to one of two PD programs or to a control group. Both PD programs included a content-focused institute that began in the summer and continued through the school year; the institute alone constituted Treatment A. Treatment B added in-school coaching. The measured outcomes were teacher knowledge of scientifically based reading instruction, teacher instruction, and student achievement. Outcomes were measured immediately at the end of the PD programs and again one year later.
The good news is that at the end of the first year, statistically significant positive effects were found for teacher knowledge and for one of three instructional practices. Teachers knew more about teaching reading and had altered their instruction, at least in terms of one dimension. The bad news is that these effects didn’t last; they faded to statistical insignificance one year after the treatment. Even worse news is that coaching—a popular and costly PD strategy found in many districts—produced no statistically significant effects, not for students or teachers, either immediately after the PD or one year later. And the really bad news is that neither PD program had a significant impact on student achievement, whether measured immediately at the conclusion of the PD or a year later.
The math intervention was a two-year program. It contained many of the elements believed to be associated with high-quality PD, including a focus on specific content (rational numbers), and sought to improve instruction involving mathematical concepts, procedures, and problem solving. Teachers attended a summer institute and a series of one-day seminars during the school year. They also received in-school coaching, for a total of more than 100 hours of PD. At the end of the two years, no statistically significant impact was detected on teacher knowledge (effect size = .05). Nor was there a statistically significant effect on student achievement in rational numbers (effect size = -.01). Teacher turnover may have affected the outcomes, but the PD teachers nevertheless received a substantial amount of the treatment.
The latest disappointing findings join a large but methodologically weak corpus of research on PD. A comprehensive review of the literature published in 2007 identified 1,343 studies of professional development. Only nine had either an RCT or a quasi-experimental design that allows for causal inference. All nine of those studies used small samples (fewer than 50 teachers), examined PD programs that were delivered by their developers, and included only interventions restricted to first through fifth grades. The review found a net positive effect on student achievement, which is encouraging, but the researchers could not identify common characteristics from such a small and varied sample. Now along come high-quality IES studies of programs that feature most of the elements currently thought to characterize top-notch professional development programs. And they find next to nothing.
So where does this put us? In a nutshell, the scientific basis for PD is extremely weak. I say that not only as a researcher, but as one who has spent most of his adult life either teaching or studying schools. I began my own teaching career more than 35 years ago, in the summer of 1978, when I entered a teacher training program. I was an elementary school teacher for most of the 1980s, and after that, a college professor for most of the 1990s. I still hold California credentials for elementary grade teaching and secondary mathematics and history. I have been professionally developed up one side and down the other. I have also been a professional developer myself.
When I hear people say that we know what good PD is, or that we know how to improve teaching but lack the will to do so, my initial reaction is that people who say such things are engaged in wishful thinking. We are flying by the seat of our pants. Teachers who seek to improve their own practice are primarily guided by common sense, intuition, word of mouth, personal experience, ideologically laden ideas about progressive or traditional instruction, the guidance of mentors, and folk wisdom—not a body of knowledge and practice that has been rigorously tested for its efficacy.
I do think we have improved the knowledge base since I began teaching, but only a little. If I had to summarize the current state of affairs, I’d say this: We know that teachers matter and that some teachers are better than others, but we don’t know the specific attributes that make some teachers effective and others ineffective. Until we can define those qualities and amass a scientifically sound body of research on how to develop them, significantly improving teaching will remain an elusive goal.
[i] An ESL study was also included; it is not discussed here because of limitations on generalizing the findings.