Assessment is a central process in education. If students learned what they were taught, we would never need to assess; we could instead just keep records of what we had taught. But as every teacher knows, many students do not learn what they are taught. Indeed, when we look at their work, it is sometimes hard to believe that they were in the classroom. In fact, it is impossible to predict with any certainty what students will learn as the result of a particular sequence of classroom activities. And because we cannot teach well without finding out where our students are starting from, we have to assess. Even if all our students started out at the same point (a highly unlikely situation!), each of them will have reached different understandings of the material being studied within a very short period of time. That is why assessment is the bridge between teaching and learning—it is only through assessment that we can find out whether what has happened in the classroom has produced the learning we intended.
Of course, assessment is also used for other purposes in education, which makes the picture much more complicated. In all countries, assessments of the performance of individual students are used to determine which students are, and which students are not, qualified for subsequent phases of education, and also to decide which kinds of education students should receive. In many countries assessment is also used to hold teachers, schools, districts and provinces accountable to parents, taxpayers and other stakeholders. Few would argue that those who provide education should not be required to give some sort of an account to those who pay for education, but often the arrangements we make to hold schools accountable actually get in the way of improving education.
In some ways, this desire to use assessment results achieved by individual students to create a high-stakes accountability system is understandable. There is now quite strong evidence that the presence of a high-stakes accountability system raises student achievement by the equivalent of as much as an extra two months’ learning each year.1 However, in every single instance in which high-stakes accountability systems have been implemented, adverse unintended consequences have significantly reduced, and in many cases have completely negated, the benefits of such a system.
There are many reasons for these unintended consequences, but two are particularly important. The first is that accountability systems are rarely fair to teachers and schools. In every single country where this has been studied, the scores that students get at school depend far more on their individual achievement before they went to that school, the influence of socio-economic factors, and the support given by parents and other family members than on the school itself. For example, in Canada, only 11 per cent of the variation in students’ science scores in PISA in 2006 was attributable to the school; the rest was attributable to factors over which the school had no control.2 Holding schools and teachers accountable for something over which they have little control seems contrary to natural justice, and this is why many teachers and other education professionals find the idea of high-stakes accountability testing so repugnant. However, it is possible to design systems of “intelligent accountability” that control for the factors over which schools and teachers have no influence, for example by taking into account prior achievement, the socio-economic status of the students, their ethnic background and so on.3 When this is done, the ranking of schools is generally very different from the traditional ranking based on raw results; schools that appear to be getting good results are shown to be schools that are fortunate to be drawing students from affluent communities, while others, with modest results, are shown to be making extraordinary progress with students from disadvantaged backgrounds.
The second main reason for the unintended consequences is that because assessment results can serve a number of functions, there is a tendency to use the same results for all of them, ostensibly to save time and money, and to reduce the burden of testing on students. While this is a laudable aim, using the same assessment information to serve different functions brings these functions into conflict, and frequently the result is that the assessment system serves none of the functions well.
As I see it, the challenge is to create an assessment system that is externally referenced, distributed and cumulative. The assessment system needs to be externally referenced, so that the teacher can honestly say to the student, “These are not my standards.” When the authority for the standard of achievement that students have to reach does not come from the teacher, the teacher is free to be a coach, rather than judge and jury. When the assessment is distributed across the whole course, the negative effects of “teaching to the test” are minimized, because the only way to maximize the students’ achievement is to teach all the students everything. When the assessment is cumulative, there is no incentive for students (and teachers) to adopt a shallow approach: material that is forgotten has to be learned again, because it is going to be assessed again.
There is no single best way to achieve this ideal of an externally referenced, distributed and cumulative assessment system, because any assessment system has to take account of the culture in which it will be used. Where great trust is placed in the professionalism of teachers, there will be political support for systems that would be unpalatable in communities where such trust is lacking. The important thing is that the assessment system, as far as possible, creates positive incentives for teachers to teach well, and for students to study well. Once this kind of assessment system is in place, it should fade into the background and be unnoticeable and unremarkable, because it would be so well aligned to the rest of the system. It would also support teachers and learners in focusing their time on the most important function of assessment: using assessment to improve what happens in classrooms.
Assessment for learning
The idea that assessment should be used to improve learning is not new, but recently, many research studies have shown that using assessment during teaching, rather than at the end of teaching—what is sometimes called “formative assessment” or “assessment for learning”—has a bigger impact on how quickly students learn than almost anything else.4
However, there have been many misinterpretations of these research findings, which prevent widespread adoption of effective practices. Perhaps the most widespread misconception is that any assessment that is intended to help learning will, in fact, do so. Many schools think that collecting data on their students’ progress and putting it all into a spreadsheet will help learning. There is little or no evidence that this kind of monitoring has any impact on students’ learning. At the other extreme, another widespread misconception is the idea that because a school has adopted formative assessment, there is no need to provide students with any indication of where they are in their learning. To be sure, giving grades and scores too frequently will certainly slow down learning, but not giving students any indication of whether they are making progress is just as misguided.
Although the terms formative assessment and assessment for learning are defined slightly differently by different people, there is increasing agreement that assessment improves learning when it is used to support five key strategies in learning:5
- Clarifying, sharing and understanding learning intentions and criteria for success
- Engineering classroom discussions, activities and tasks that elicit evidence of student achievement
- Providing feedback that moves learning forward
- Activating students as learning resources for one another
- Activating students as owners of their own learning
Each of these five strategies has a considerable research basis individually; together they provide a structure for ensuring that students and teachers work together to harness the power of assessment to improve the learning of mathematics.
Of course, how teachers and students do this will vary according to the age of the students, the particular curricula being followed, and a range of other factors. Some techniques for implementing these strategies will work for some teachers and not for others. They will work with some students, and not others. That is why “what works?” is rarely the right question in education; what is much more important is “under what circumstances does this work?” And because the contexts of classrooms are so different, only the teacher is in a position to judge whether a particular technique is likely to be effective in a given situation. Nevertheless, below I offer one technique for each of the strategies that may provide a useful starting point for teachers.
1. Clarifying, sharing and understanding learning intentions and criteria for success
A middle school math class was embarking on an open-ended investigative activity in which students were to take a number of coins and divide them into two unequal stacks. They were then to move enough coins off the taller stack to double the height of the shorter stack, and repeat the process. Once they could predict what would happen, they were to investigate what happened with different starting combinations. Because the students did not have much experience with such open-ended mathematics tasks, as a preparatory activity the teacher gave the students four anonymized pieces of student work on a different task (the number of integer-sided triangles that can be made with a given longest side). She chose the four pieces of work to represent different levels of quality and asked the students, working in groups, to decide whether some of the responses were better than others, what was good about the good ones, and what was lacking in the less good ones.
This strategy is particularly effective for two reasons. First, actual samples of work are far more effective than rubrics or scoring guides in communicating standards to students because descriptions of quality often do not mean to students what they mean to teachers. Second, students are often much better at spotting weaknesses in the work of other students than in their own, and when they see errors in the work of others, they are less likely to make the same errors in their own work.
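Readers who want to try the coin-stacking investigation for themselves can explore it with a few lines of Python. This is only an illustrative sketch and not part of the classroom activity described above. The function name `simulate`, the `max_steps` cap, and the decision to stop when the stacks become equal or when an earlier arrangement reappears are all assumptions made for the illustration.

```python
def simulate(start, max_steps=100):
    """Follow the coin-stacking rule from a starting pair of stack heights.

    Rule (as described in the task): move enough coins off the taller
    stack to double the height of the shorter stack, then repeat.
    Assumptions for this sketch: stacks are treated as unordered, and we
    stop when the stacks are equal or an earlier arrangement reappears.
    """
    state = tuple(sorted(start))          # (shorter, taller)
    history = [state]
    for _ in range(max_steps):
        short, tall = state
        if short == tall:                 # no taller stack left to take from
            break
        # Doubling the shorter stack costs the taller stack `short` coins.
        state = tuple(sorted((2 * short, tall - short)))
        history.append(state)
        if state in history[:-1]:         # arrangement seen before: a cycle
            break
    return history


if __name__ == "__main__":
    for start in [(1, 6), (1, 7), (3, 5)]:
        print(start, "->", simulate(start))
```

Trying different starting combinations this way shows that some totals settle into two equal stacks while others cycle through the same arrangements, which is the kind of pattern the investigation invites students to predict.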
2. Eliciting evidence of student achievement
A teacher had been working with a middle-school class on measurement and observation in science. To check that the students had understood the main ideas, about halfway through the lesson she asked the students the following question:6
Janet was asked to do an experiment to find how long it takes for some sugar to dissolve in water. What advice would you give Janet to tell her how many repeated measurements to take?
A. Two or three measurements are always enough.
B. She should take five measurements.
C. If she is accurate she only needs to measure once.
D. She should go on taking measurements until she knows how much they vary.
E. She should go on taking measurements until she gets two or more the same.
Students responded by holding up one finger if they thought “A” was correct, two for “B” and so on. Because almost all the students gave the correct answer, she decided to move on, but made a point of sitting down and providing individual help to the three students who had given incorrect responses. The important thing here is that it is the quality of the incorrect options—each of which is related to well-known student misconceptions—that allowed the teacher to conclude that a correct answer probably indicated a good understanding of the issue.
3. Providing feedback that moves learning forward
A high school English teacher had set a class a writing task in which they were asked to respond to a question about a Shakespearean play they had read as a class. Although the teacher had been giving comments rather than grades, she still wasn’t happy with the amount of time her students were spending on the comments, so she tried a new approach. Rather than writing her comments on the students’ essays, she wrote them on strips of paper. Each group of four students received back their four essays and the four strips of paper, and their task was to “match the comments to the essays.” Her guiding principles are that feedback should cause thinking and should be more work for the recipient than the donor.
4. Activating students as learning resources for one another
Students in a physical education class had been learning how to throw a javelin. The teacher gave each pair of students a flipcam, and students recorded each other throwing the javelin five times. Each student then reviewed their five attempts and decided which one was the best. They then swapped cameras and decided which of their partner’s five throws was the best. Where they disagreed about the best, they discussed the reasons for their differences.
5. Activating students as owners of their own learning
A third grade teacher had given a class a homework assignment on subtraction with re-grouping. As they came into the classroom the following day, students were asked to make a contribution to at least one of three flip charts the teacher had put up around the classroom. One flip chart bore a plus sign (“+”), one bore a minus sign (“-”) and the third bore the word “interesting.”
Students had to indicate something they had found easy about the homework task, something they had found hard or something they had found interesting. On the minus chart one student had written “I don’t understand when you borrow which column you borrow from when both are zero.” Teachers routinely report that such perceptive self-assessments are not unusual, even with young students. When teachers regularly engage students in reflecting on their work, students become better at helping the teacher help them.
As I said at the outset, assessment is the central process in teaching. Without assessment there is no interaction—the teacher might as well be speaking to a video camera that is being relayed to students in a different city. Assessment has a role in informing key transitions in education, and from education to work, and can play a role in assuring society that the money it spends on education is being used wisely (actually, it almost always is). But the most important assessment happens minute-by-minute and day-by-day in every classroom, and that is where an investment of time and resources will have the greatest impact on student learning.
1. Wiliam, D. 2010. “Standardized testing and school accountability.” Educational Psychologist, 45(2): 107–122.
2. Programme for International Student Assessment (PISA). 2007. PISA 2006: Science Competences for Tomorrow’s World. Vol. 1. Paris, France: Organisation for Economic Co-operation and Development.
3. Ray, A. 2006. School Value Added Measures in England: A Paper for the OECD Project on the Development of Value-added Models in Education Systems. London, UK: Department for Education and Skills.
4. For a summary of this research, see Wiliam, D. 2011. “What is Assessment for Learning?” Studies in Educational Evaluation, 37(1): 2–14.
5. Wiliam, D. 2011. Embedded Formative Assessment. Bloomington, IN: Solution Tree.
6. Osborne, J. 2011. “Evidence-based Practice in Science Education (EPSE). Teaching Pupils ‘Ideas-about-Science’: Clarifying Learning Goals and Improving Pupil Performance.” Science and Technology Education Unit seminar, London, UK, King’s College London School of Education.
Dr. Dylan Wiliam is an emeritus professor of educational assessment at University College London.