Evaluation Criteria for Cooperative Learning
Elizabeth G. Cohen
Program for Complex Instruction, Stanford University
The practice of cooperative learning should ideally integrate assessment
with instruction. Both teachers and students need to know the objectives
of the lesson that employs cooperative learning and how well the students
are meeting those objectives. However, the proper evaluation of cooperative
learning has produced some confusion. If students are collaborating on
a group product, how can the teacher assess individuals? Isn't this incompatible
in some fundamental way with the ethos of cooperative learning? And if
the assessment is of the group product, how can the teacher (or anyone
else) tell how well a single member of the group understood and grasped
the objective for the lesson?
Some of the confusion arises because there are at least two concerns underlying
these questions. One concern is the need to provide groups and individuals
with feedback during cooperative learning. Feedback, a type of formative
evaluation, can help the group see what they have done well, where they
have failed to understand, and where they need to do better. The teacher
and/or the students can evaluate the group product and provide feedback
to the students that should prove helpful in promoting conceptual understanding.
Individuals can similarly receive formative evaluation through assessment.
If students create an individual report in connection with their group
task, then feedback on that report will signal the students whether or
not they are grasping central concepts and where they need to put in more
effort. Failure of many students to create an adequate individual report
could signal the teacher that the class requires more background to attain
the objective of the lesson.
The second concern underlying these questions has to do with what the
individual learns or fails to learn as a result of work with the group.
If the group experience has been successful, that individual should get
the help she or he needs to understand the task and to grasp the underlying
teaching objective of the group product. Then he or she should be able
to demonstrate that understanding on an individual assessment. If the
group experience has resulted in creative problem-solving, then everyone
in the group should grasp the solution that none of them could have understood
without the help of their classmates. If this is truly the case, then
individual performance assessment or test-taking should reveal that each
individual has understood the group's solution.
In contrast to the formative evaluation of feedback, this is a summative
evaluation for which teachers have the final responsibility. Students,
teachers, and parents want some measure of student learning whether or
not this measure leads to a formal grade. A summative evaluation might
show how many criteria individual students met. At the class level, a
teacher could calculate the percentage of the students mastering particular
criteria. Alternatively, teachers can assign grades according to a test
or individual performance assessment given at the end of the unit.
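The two summaries just described can be tallied as in the following minimal sketch. It assumes the teacher has recorded, for each student, whether each criterion was met. The student labels, criterion names, and marks are hypothetical and are not data from our work.

```python
# Hypothetical criterion labels; a real lesson would use its own criteria.
CRITERIA = ["criterion_a", "criterion_b", "criterion_c", "criterion_d"]

# Hypothetical marks: True means the student met that criterion.
results = {
    "Student 1": {"criterion_a": True, "criterion_b": True, "criterion_c": False, "criterion_d": True},
    "Student 2": {"criterion_a": True, "criterion_b": False, "criterion_c": False, "criterion_d": False},
    "Student 3": {"criterion_a": True, "criterion_b": True, "criterion_c": True, "criterion_d": True},
    "Student 4": {"criterion_a": False, "criterion_b": True, "criterion_c": False, "criterion_d": True},
}


def criteria_met(results, criteria):
    """Individual level: count how many criteria each student met."""
    return {
        student: sum(1 for c in criteria if marks[c])
        for student, marks in results.items()
    }


def percent_mastering(results, criteria):
    """Class level: percentage of students who met each criterion."""
    n_students = len(results)
    if n_students == 0:
        # No students recorded, so report zero rather than divide by zero.
        return {c: 0.0 for c in criteria}
    return {
        c: 100.0 * sum(1 for marks in results.values() if marks[c]) / n_students
        for c in criteria
    }


per_student = criteria_met(results, CRITERIA)
per_criterion = percent_mastering(results, CRITERIA)
```

A criterion that few students meet, such as criterion_c in this made-up class, could signal that the class needs more background before the next rotation.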
Student Knowledge of Evaluation Criteria
The conversations in small groups of students often disappoint teachers
and staff developers. Students are so busy with creating a group product
that they neglect to talk about the intellectual basis for what they are
doing. Academic content is often absent in the discussion. In the case of
group tasks that have an objective of conceptual understanding, research
has shown that talking and working together is closely related to learning
outcomes (Cohen, Lotan, & Holthuis, 1997). Also, the quality of the conversation
predicts learning gains (Cossey, 1996). Given these research findings, skimpy
and concrete conversations are particularly worrisome.
But is it very surprising that student talk is disappointing given that
teachers have not provided students with any information about the character
of the discussion they would like to hear? Nor do students know in advance
what criteria teachers will use to give them feedback when the group presents
their product to the class.
Student conversations would improve if students had clear criteria for
the group product that specified an exemplary presentation as including
the analytical and intellectual basis of that group product. Group conversations
would also improve if teachers asked students to discuss and evaluate
how well the group product met these criteria. This, in essence, is the
hypothesis that my colleagues and I have developed for our current research
on assessment. Using this reasoning, we have developed evaluation criteria
for cooperative learning in middle schools. The evaluation criteria are
specific to challenging open-ended tasks that make up multiple ability
curricula developed for complex instruction (Lotan, 1997).
Although we created tasks specifically for complex instruction, the basic
principles concerning the importance of evaluation criteria are generalizable
to other cooperative tasks. Evaluation criteria should work well where
objectives include conceptual learning and where the tasks do not prescribe
exactly how to reach an answer.

The hypothesis derives from a sociological theory on authority and evaluation
(Dornbusch & Scott, 1975). According to this theory, to the extent that
people view the evaluations they receive as soundly based, they will put
out more effort to attain improved evaluations. Specifically, the dimensions
of a soundly based evaluation as applied to the students are the following:
- Criteria are clear.
- Students see criteria as fair and valid.
- There is an adequate sampling of student work.
- There is specific feedback to the students on how well they have met the criteria.
In other words, if students
have criteria that they see as fair and valid, if they apply the criteria
to repeated samples of work, and if they receive specific feedback from
peers and from the teacher, then they should theoretically work harder.
This work will be directed toward improving the group product through group
discussion. Thus we predict that the use of evaluation criteria will improve
group discussion; and improved group discussion will produce measurable
improvement in learning outcomes.
How Would Evaluation Criteria Work?
Students in Helen Hagemann's seventh grade class are working on The Silk
Road, a multiple ability curriculum unit designed to allow students to
explore forms of exchange that influenced the spread of ideas and technologies
along the Silk Road. (Hagemann and Beth Scarloss developed the unit.)
There are six activities, each of which exemplifies a different type of
exchange. Students experience these group activities after an initial
study of the Silk Road in their social studies textbook. Groups of students
rotate through activities, so that each student experiences multiple activities
designed to produce understanding of the central concepts. Moreover, each
rotation tends to produce better group products because students have
learned by listening to the presentations of other groups.
We look in on a group of students making a presentation based on an
activity called "Personal Exchange." This activity examines the political
and cultural aspects of arranged marriages, used to cement political alliances
between China and the Uighur Empire. Students explore possibilities for
cultural diffusion along the Silk Road. The Task Card for the activity
asks the students to review the history of the Uighurs, an Asian culture
speaking a Turkic language in the area between the Tian Shan and Dunhuang.
The Resource Cards include a translated history of the giving in marriage
of a Chinese princess to the chieftain of the Uighurs. The history includes
the story of her journey to the "barbarian court" and her reception there.
Also on a resource card are three time lines of the period: one of Chinese
history, one of Uighur history, and a third of the marriages of Chinese
princesses to the Uighur chieftains. The instructions for the group product
are as follows:
- Imagine that you are a historian presenting a biography of the Princess T'ai-ho to children of her time.
- Create a puppet show or pantomime that explains the significance of intermarriage between Chinese royalty and Uighur leaders. Present your pantomime or puppet show to the class.
As background for this
product, the group is asked to discuss the story of Uighur/Chinese history
between 775 and 840. They are also asked to discuss cultural experiences
of the princess's journey and to think about how women such as these Chinese
princesses contributed to exchange along the Silk Road. After creating the
group product, each member of the group must write an individual report
in response to this prompt: Tell the story of T'ai-ho as if you were one
of the Uighur princesses sent to teach her Uighur custom.

The students also have evaluation criteria for the group product and a second
set of criteria for the individual report. The evaluation criteria for the
group product are as follows:
- Puppet Show/Pantomime communicates emotions that were involved in the events.
- Presentation emphasizes physical and cultural differences between Chinese and Uighurs.
- Group accurately incorporates historical events in telling T'ai-ho's story.
- Presentation shows women's role in exchange.
Helen's class contains
many English learners and comparatively few students who can read and write
at grade level. Reading the resource cards was a major challenge for the
group. The four students in the group are Aaron, Mona, Melany, and Ana.
Aaron introduces the skit: "We're supposed to do the marriage thing on how--" Mona
interrupts: "The Eagers and the Chinese--" Aaron reclaims his role, "And I'm
the father and Melany is my daughter. And then Mona's the father and Ana
is her son."
Mona then gets down to business, and addresses Aaron in his role as Chinese
emperor: "You're the boy, right? OK, my son wants to marry your daughter.
I think they should get married."
Aaron asks, "Well, why should they get married?"
Mona responds, "Because. She's not married and he's not married and it sounds
pretty good."
Aaron objects, "Well, they don't even speak the same language," but Mona
responds, "She can learn!"
Aaron, the Chinese ruler, is not convinced: "But what does that do for me?"
Mona answers, "Your daughter gets to go away to other cultural places and
she gets to learn more stuff and she gets a husband." She questions the
emperor, "Would you like to meet him (the young chieftain, her son)?"
"No," answers Aaron. Mona is undaunted: "Well I still think they should
get married."
Aaron relents and gives his permission, but Melany the princess says, "Nooo!"
Her father scolds her, "You have no say in this. So stay out of this. This
is our choice. Go with them."
"Nooo," wails the princess.
"GO!" orders her stern father.
"We won't blame you," offers Ana, the young Uighur chieftain.
Melany, the princess, gives in: "OK, guess we're married now." The class
laughs and one student pipes up, "You may kiss the bride."
In starting the discussion of this presentation, Helen asks the group
to describe how their presentation used the criteria for an exemplary
group product. Mona says, "Well I guess we were supposed to make you feel
like you were there. Well, we tried to, but we didn't do very well with
the setting. But we were supposed to show emotions. She showed emotions,
man, she didn't want to go! He showed emotions because he told her to
keep her mouth shut." Aaron agrees, "I got mad at her." Helen inserts
a comment, "I especially liked the way that the two of you bargained back
and forth, because the Chinese father was very reluctant to send his daughter
off into the wilderness. The way you said, 'It's too bad. You're going.
You don't have a choice,' was very much the way it would have been."
Mona continues, "It was hard to put in the history from the resource
card. There were like long sentences and I had to translate them into,
like, English."
Helen agrees that this was difficult and asks the next group to do this
task to build in some actual history to the presentation. Mona goes on
to point out how they met the requirement of illustrating the differences
in physical features. She says that the students who played the Uighurs
were blond and the Chinese players had dark hair. Mona also felt that
they had shown women's role in the exchange. She pointed out that if the
princess learned the Uighur language, it would be easier for the chieftain
to make deals with other Chinese traders. Helen remarks, "In your discussion,
I hear how you tried to illustrate physical differences and the role of
women in exchange, but it is necessary to show this as part of your presentation
or your introduction to the presentation. Let's see if some of the next
groups to present can build these features into their presentation. And
there was one thing that you guys didn't do at all, as far as I can see.
You didn't do it in the form of a puppet show or pantomime. Tomorrow when
the next group does this activity, we'll be looking forward to seeing
this part." Mona has the last word as she turns to Aaron and asks, "What's
a pantomime?"
Here is a group for whom the task is a major challenge. Moreover, they
have had little previous experience in working with the criteria. Yet,
the teacher used the criteria to encourage them to study the resources
and to make use of them. The students used some criteria to construct
their product. Without help from the teacher, students were not yet ready
to use the criteria to criticize their own performance. However, with
further experience, they should be able to do so. The places where the
group has not managed very well became clear in the discussion, so that
the next group, having learned from what they have heard and seen, will
meet more of the criteria. The criteria have helped Helen to encourage
the group to be self-critical. Using the criteria, she focussed her comments
tightly on the skills and content she wanted the students to acquire.
Improving Group Conversations and Presentations
There is potential in the use of evaluation criteria to improve the character
of student conversations and presentations. However, pilot work in Helen's
classes has already taught us that students did not know how to use evaluation
criteria in their group discussion. We will have to find out how to teach
these students how to hold an analytical discussion. We are currently at
work on skillbuilders for this purpose.
Helen used the criteria in three of her classes and omitted them in
the other two. Before and after the unit, Helen administered a unit test
that we had developed. The test contains multiple choice items on the
factual content of the unit and questions using analogies that tap into
higher-order thinking. An example of a factual question is:
Which of the following did not influence exchange along the Silk Road?
a. sharing of technology;
b. mistrust of strangers;
c. available water routes;
d. lack of common coinage;
e. seasonal weather.
An example of a higher-order thinking question is:
Which of the following is most like exchange along the Silk Road?
a. the Home Shopping Network;
b. major department store;
c. door to door salesperson;
d. mail order / catalog;
e. flea market / swap meet.
The content of this test reflected the same criteria that the students
had used in evaluating their group products. In theory, the practice they
receive in meeting these criteria should prepare them for a much-improved
performance on the test.
Students in the classes where Helen used criteria did significantly
better on the factual questions than students in classes without the criteria.
According to a systematic study by Beth Scarloss and Susan Schultz (1997),
the most dramatic difference between the two sets of classes was in Helen's
feedback to the students following the presentations. The use of the criteria
led her to focus much more on academic content and allowed her to provide
more specific feedback than was possible in her other classes. Even with
classes containing very few students who were reading at grade level,
she demanded more rigorous intellectual performance with the use of these
criteria.
We are in the midst of further pilot work with a group of teachers who
have worked with us in development of these assessment techniques. We
are seeking practical ways to implement the criteria for students' written
individual reports. The completion of individual reports and the receiving
of feedback on those reports contribute significantly to learning gains
in middle school social studies (Cohen et al., 1997). The criteria that
we have created for individual reports blend content with strategies for
expository writing. Students can use these criteria to give themselves
a score based on the number of criteria they think they have fulfilled.
Cooperative learning can provide an excellent opportunity to improve writing
skills by requiring individual reports. A larger study will follow next
year. We will test the effects of the use of evaluation criteria on the
quality of group conversations, the quality of student reports, and on
student learning.
In the Meantime
Of what use is this preliminary work on assessment to you who work with
cooperative learning? There is considerable merit to questioning what
students know of the criteria the teacher uses to evaluate group and individual
products. Creating and teaching criteria to the students should make a
significant difference in their ability to turn out more impressive group
and individual products.

Creating evaluation criteria for the projected lesson in cooperative learning
is very helpful. You can very quickly find out what you are assuming students
know before they begin and what they will be able to learn from the materials
for the task. As we developed these criteria, we found weaknesses in the
linkage between discussion questions, group products, and individual reports.
Once we grasped the weaknesses, it was possible to make revisions to curriculum
units that greatly strengthened them. There is a tendency to create criteria
for process alone such as "A good group product will reflect the different
opinions of the members of the group." This can be a good starting point
for training students to improve their products and to criticize themselves.
However, you will also want to experiment with content criteria that do
not give a right answer but push the student to put the group product
on a stronger academic and intellectual base.
Basing your final assessment on the same criteria students have learned
to work with means that the test will match the curriculum as implemented.
Very often, the tests given after work in cooperative learning yield disappointing
results because the test items do not match the content and skills the
students practiced. If students work with the criteria, receive feedback
according to criteria, and are then tested using the same criteria, I
believe that cooperative learning can yield more powerful results than
previous research has shown.
References
Cohen, E.G., Bianchini, J.A., Cossey, R., Holthuis, N.C., Morphew, C.C.,
& Whitcomb, J.A. (1997). What did students learn?: 1982-1994. In E.G. Cohen
& R.A. Lotan (Eds.), Working for equity in heterogeneous classrooms:
Sociological theory in practice (pp. 137-165). New York: Teachers College Press.
Cohen, E.G., Lotan, R.A., & Holthuis, N.C. (1997). Organizing the classroom
for learning. In E.G. Cohen & R.A. Lotan (Eds.), Working for equity
in heterogeneous classrooms: Sociological theory in practice (pp.
31-43). New York: Teachers College Press.
Cossey, R. (1996). Mathematics communication: Issues of Access and Equity.
Unpublished doctoral dissertation. Stanford, CA: Stanford University.
Dornbusch, S.M., & Scott, W.R. (1975). Evaluation and the exercise
of authority. San Francisco: Jossey Bass.
Lotan, R.A. (1997). Principles of a principled curriculum. In E.G. Cohen
& R.A. Lotan (Eds.), Working for equity in heterogeneous classrooms:
Sociological theory in practice (pp. 105-116). New York: Teachers
College Press.
Scarloss, B.A., & Schultz, S.E. (November, 1997). Evaluation criteria
in assessing group products. Paper presented at the meeting of the California
Educational Research Association, Santa Barbara, CA.