Education assessment in the 21st century: New technologies

Editor's note:

This blog is part of a four-part series on shifting educational measurement to match 21st century skills, covering traditional assessments, new technologies, new skillsets, and pathways to the future. These topics were discussed at the Center for Universal Education’s Annual Research and Policy Symposium on April 5, 2017. You can watch video of the event or listen to audio here.

Moving beyond correct-incorrect

In virtually every large-scale assessment, answers to test items are eventually scored as either correct or incorrect (with some provision for a “degree” of correctness). These answers, whether the option selected in a multiple choice test or the written text supplied for an open-ended item, are then captured and used to indicate the amount of a competence that is presumed to exist: the construct (sometimes referred to as a latent trait).

The fact that we sometimes design tests to allow for degree of correctness tells us that the correct-incorrect dichotomy is not always useful or sufficient. It may be useful when we are addressing fact-based issues, but not when we are interested in capturing nuances or processes. Accordingly, there have been attempts to move beyond this absolutist approach. Frederiksen was among the first to pave the way toward measurements that capture more information than simply correct or incorrect. In 1984, Frederiksen argued that multiple choice question (MCQ) formats measure only a subset of relevant skillsets and voiced concerns that overuse of the format can result in the teaching of a narrow set of skills.

Several decades later, the vast majority of large-scale testing is still delivered through the MCQ format. Unless a very large number of well-written items are available, this medium of assessment does not allow for fine-tuned evaluation of student learning. A large number of items, however, translates into a large amount of time taken away from student learning.

Capturing processes—not just responses

Technology can be used to augment and enhance the measures of conventional core skills (e.g., numeracy and literacy). These enhancements focus on making proxy measures more closely linked with the latent trait by increasing the amount of information that is captured. Technology can also augment the data to understand students’ problem-solving processes by capturing task actions in addition to the usual test responses. Actions that represent process, often called “indicators,” are captured digitally and can range from simple to complex as shown in the examples in Figure 1.

Figure 1: Examples of process indicators by complexity


Link: Vista, Awwal, and Care. 2015

Figure 1 is far from exhaustive, but it provides a picture of how processes can be captured that are relevant even in the measurement of conventional skills such as literacy and numeracy. Note that these examples strongly reflect many of the observations a teacher would make as part of normal teaching practice in the classroom. For example, teachers may have expectations of how long tasks should take, and may informally evaluate students by time-on-task in the classroom. Similarly, teachers may observe the sequence of processes that students follow as they think through tasks in order to evaluate the degree to which they can apply their learning as part of classroom practice.
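To make this concrete, the following is a minimal sketch, in Python, of how a digital assessment platform might log process indicators such as time-on-task and action sequence alongside a student's final response. The `ProcessLog` class and the action labels are hypothetical illustrations, not part of any cited system.

```python
# Hypothetical sketch: capturing process indicators during a digital task.
from dataclasses import dataclass, field
import time


@dataclass
class ProcessLog:
    """Collects process indicators (actions and timings) for one task attempt."""
    actions: list = field(default_factory=list)
    started: float = field(default_factory=time.monotonic)

    def record(self, action: str) -> None:
        # Store each action with its elapsed time, preserving the sequence.
        self.actions.append((action, round(time.monotonic() - self.started, 3)))

    def indicators(self) -> dict:
        # Simple indicators: total time-on-task, action count, action sequence.
        return {
            "time_on_task": self.actions[-1][1] if self.actions else 0.0,
            "action_count": len(self.actions),
            "sequence": [name for name, _ in self.actions],
        }


# Example: a student drags two items into place, then submits an answer.
log = ProcessLog()
log.record("drag:item_a")
log.record("drag:item_b")
log.record("submit:answer_c")
print(log.indicators()["sequence"])
# ['drag:item_a', 'drag:item_b', 'submit:answer_c']
```

In a real web-based assessment these events would typically be sent to a server for later analysis; the point here is only that the sequence and timing of actions, not just the final response, become part of the evidence about the student.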

The potential of web-based assessments

Transitioning to a digital testing environment opens up a wealth of opportunity in those countries where internet access is almost universal. Putting aside the advantage of economies of scale, the capacity of web-based platforms to capture process data provides access to a more holistic assessment. Importantly, such applications provide a medium for teaching, learning, and assessment alike, as illustrated by initiatives such as SimScientist developed by WestEd. Although primarily a teaching and learning tool, the capacity of such applications to capture process data is enormous. Such capture can inform not only individual student assessment but also provide rich data for analysis of student approaches to problem solving within traditional domains.