Student surveys are ubiquitous in higher education as a means of evaluating teaching. (In fact, they are often the only source of feedback on classroom instruction for college professors.) But, until recently, they were quite rare in K-12 education. As state and district leaders redesign their teacher evaluation systems, they should consider adding student surveys to the measures those systems include. As we learned in the Gates Foundation’s Measures of Effective Teaching (MET) project, student surveys have a number of advantages:
- Relationship to student achievement gains: We tested the predictive power of student surveys by comparing a teacher’s score on the Tripod Survey (developed by Ron Ferguson at the Harvard Kennedy School of Government) to that teacher’s effectiveness in raising test scores with a different group of students or in a different academic year. After adjusting for measurement error, the correlation was between 0.3 and 0.4 in mathematics and between 0.1 and 0.3 in English Language Arts. In other words, the teachers who scored higher on the student surveys tended to see higher achievement gains.
- Reliability: The student surveys were the most reliable of the measures we tested (that is, least volatile from year to year), especially in middle school. The reliability of student surveys derives from the power of averaging. An adult may be a more discerning evaluator of a teacher’s practice than the typical elementary or middle school student, but classroom observations typically average over only one or two observers. By contrast, the typical elementary classroom has roughly 20 students, and the typical middle school teacher works with 75 to 100 students spread across multiple sections. In addition, whereas observers typically see only 2 or 3 lessons, students are present for 180 days.
- Improving Practice: Although student achievement gains or “value-added” measures provide predictive power (that is, they help identify teachers likely to see similar student achievement gains with future students), they offer little diagnostic power for identifying specific aspects of a teacher’s practice which deserve attention. In contrast, student surveys, like formal classroom observations, offer the chance to identify areas where a teacher could improve. The power of student surveys and formal classroom observations to drive changes in practice could be enhanced by aligning the language of the surveys with the language of the teaching standards.
- Cost and Coverage: Compared with observations by trained adults, or with adding new assessments in untested grades and subjects, student surveys are a low-cost way of providing additional sources of data for individual teachers. In the MET study, the youngest students we surveyed were in fourth grade and the oldest were in 10th grade. In these grades, the student surveys could be used to provide additional coverage in subjects such as social science, science, history, art, etc., where student assessments are often unavailable. Future work should investigate the predictive validity and reliability of student surveys in younger grades.
- Emotional Salience: One potential strength of student surveys is that they are measured in a currency that teachers inherently value: the perspective of students. A merit pay system attaches financial incentives to other measures (such as classroom observations or student achievement gains) in order to give those measures artificial value. However, to the extent that teachers inherently value what their students have to say, and care about how their students rate them relative to their peers on statements such as “We use time well and we don’t waste time” or “When I turn in homework, I get useful feedback which helps me improve,” it may not be necessary to attach financial incentives to provoke the desired responses from teachers.
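Two of the quantitative ideas in the list above, adjusting a correlation for measurement error and gaining reliability by averaging over many raters, can be sketched with textbook psychometric formulas. The sketch below uses Spearman's correction for attenuation and the Spearman-Brown formula. These are standard results, not necessarily the exact procedures used in the MET analyses, and every number in the example is a hypothetical input chosen only to show the mechanics. The Spearman-Brown formula also assumes that raters are interchangeable and that their errors are independent, which real students in the same classroom may not satisfy.

```python
from math import sqrt


def disattenuate(observed_r, reliability_x, reliability_y):
    """Spearman's correction for attenuation.

    Estimates the correlation between two underlying quantities from the
    correlation observed between their noisy measurements.
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must be in (0, 1]")
    return observed_r / sqrt(reliability_x * reliability_y)


def spearman_brown(single_rater_reliability, n_raters):
    """Reliability of the average of n_raters parallel ratings."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


if __name__ == "__main__":
    # Hypothetical single-rater reliabilities: one trained adult observer
    # is assumed to be far more discerning than one student.
    adult, student = 0.35, 0.10

    print(f"2 adult observers: {spearman_brown(adult, 2):.2f}")
    print(f"20 students:       {spearman_brown(student, 20):.2f}")
    print(f"75 students:       {spearman_brown(student, 75):.2f}")

    # Hypothetical observed correlation and reliabilities.
    print(f"adjusted r:        {disattenuate(0.20, 0.60, 0.50):.2f}")
```

Under these assumed inputs, the average of many individually noisy student ratings ends up more reliable than the average of two more discerning adult ratings, which is the point the Reliability bullet makes.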
There are only a few places to look for independent sources of feedback on a teacher’s practice. Student achievement gains or “value-added” measures are valuable when they are available, but fewer than a quarter of teachers work in tested grades and subjects. Classroom observations by principals are another source, but it is costly to add observations by other observers from outside the school. Student surveys are a natural place to turn for an additional source of feedback for teachers. Outside the tested grades and subjects, student surveys may be the only source besides the teacher’s principal. As such, student surveys would be a valuable source for balancing or confirming the principal’s judgments.
Of course, we must be mindful that attaching high stakes for teachers to information from student surveys may introduce pressures to distort those measures. After all, some college professors have been known to chase higher student evaluation scores by being easy graders. One of the best ways to reduce this tendency is to use multiple sources of information, and not just one metric, for making important decisions about teachers. Meanwhile, through the MET project, we’ve learned what types of relationships to expect between student survey measures, student achievement gains and observations. States and districts should monitor the relationships among the various measures. If students or teachers begin abusing the student surveys (or another one of the measures), an early warning sign would be the breakdown of those relationships.
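The monitoring idea in the paragraph above can be made concrete with a simple check. This is a minimal sketch, assuming a district has, for each year, a survey score and an achievement-gain estimate for the same set of teachers. The function names, the data, and the 0.1 threshold are all hypothetical; a real system would choose its threshold from the relationships it has actually observed and would account for sampling noise.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def flag_breakdown(yearly, floor):
    """Return the years whose survey/gain correlation falls below floor.

    yearly maps a year to a (survey_scores, achievement_gains) pair,
    with one entry per teacher in each sequence.
    """
    return [
        year
        for year, (surveys, gains) in sorted(yearly.items())
        if pearson(surveys, gains) < floor
    ]


# Hypothetical data for five teachers in two years. In the second year
# survey scores bunch near the top and no longer track achievement gains.
example = {
    2012: ([3.1, 3.4, 3.8, 4.0, 4.3], [-0.10, -0.02, 0.03, 0.08, 0.12]),
    2013: ([4.6, 4.7, 4.6, 4.8, 4.7], [0.05, -0.08, 0.10, -0.03, 0.01]),
}

if __name__ == "__main__":
    print(flag_breakdown(example, floor=0.1))
```

A flagged year is only a prompt to investigate. A weakened correlation could reflect gaming of the surveys, but it could also come from a change in the test, the survey, or the student population.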
The following relevant reports can be found at www.metproject.org:
The Bill & Melinda Gates Foundation, Learning about Teaching: Research Report (Seattle, WA: The Bill & Melinda Gates Foundation, 2010)
Thomas J. Kane and Douglas O. Staiger, Gathering Feedback for Teaching: Research Paper (Seattle, WA: The Bill & Melinda Gates Foundation, 2012)
Thomas J. Kane, Daniel F. McCaffrey, Trey Miller, Douglas O. Staiger, Have We Identified Effective Teachers?: Validating Measures of Effective Teaching Using Random Assignment (Seattle, WA: The Bill & Melinda Gates Foundation, 2013)
Kata Mihaly, Daniel F. McCaffrey, Douglas O. Staiger and J.R. Lockwood, “A Composite Estimator of Effective Teaching,” RAND Working Paper, January 8, 2013.
Walter H. Gale Professor of Education and Economics - Harvard University