Students' perceptions of course difficulty and their ratings of the instructor.
Subject:
College students (Beliefs, opinions and attitudes)
College students (Educational aspects)
Student evaluation of teachers (Analysis)
Authors:
Addison, William E.
Best, John
Warrington, John D.
Pub Date:
06/01/2006
Publication:
Name: College Student Journal Publisher: Project Innovation (Alabama) Audience: Academic Format: Magazine/Journal Subject: Education Copyright: COPYRIGHT 2006 Project Innovation (Alabama) ISSN: 0146-3934
Issue:
Date: June, 2006 Source Volume: 40 Source Issue: 2
Product:
Product Code: E197500 Students, College
Geographic:
Geographic Scope: United States Geographic Code: 1USA United States

Accession Number:
147389147
Full Text:
Research dealing with the possible relationship between high grades and favorable student evaluations of instruction has focused on several possible explanations, including reciprocal leniency. Although plausible, such explanations leave out the possible role of perceived difficulty. We hypothesized that student evaluations would be negatively affected when the course was harder than originally thought, regardless of grade earned, and that evaluations would be higher when the course was viewed as easier than initially expected, again regardless of grade earned. Students in their first psychological statistics class and students in introductory psychology responded to global summative items regarding their teacher's effectiveness. In addition, the students indicated how difficult the class was relative to their expectations. Finally, each student's letter grade was recorded. We found that students who earned higher grades evaluated their teachers more favorably than did students who earned lower grades. However, after controlling for the grade earned, we found that students who thought the class was easier than expected evaluated the professor more favorably, and students who thought the course was harder than expected evaluated the professor less favorably, regardless of grade earned. Although these results are correlational, they suggest that faculty may not be able to influence student evaluations through leniency, especially if the students already believe the course will be easy.

**********

Over the last several decades, research on student evaluation of teaching effectiveness has considered literally hundreds of variables that may play a role in influencing the evaluation process. In general, these variables can be grouped into one of three categories: structural course variables, such as class size or required/elective status (e.g., Kulik & Kulik, 1974; McKeachie, 1997); instructor variables, such as expertise (Marsh, 1980; Marsh & Roche, 1997), "personality" characteristics (Best & Addison, 2000), or nonverbal behavior (Babad, Avni-Babad, & Rosenthal, 2004); and, finally, student variables.

Several student variables have been shown to influence the evaluation process, including preexisting or emerging interest in the course, the students' understanding of the purpose for which evaluations are to be used, and the grade earned in the course. For example, Marsh and colleagues (Marsh, 1980, 1984; Marsh & Roche, 1997) have suggested that piquing student interest in the course content may set in motion positive attributional processes favoring the teacher. That is, students may attribute particular teaching abilities to the teacher rather than to the favorable learning environment that a teacher can create when surrounded by students who are obviously interested in the material.

Regarding the student's understanding of the evaluation process as a possible student variable that could influence evaluations, Chen and Hoshower (2003) found that students view formative uses of evaluation (to help the teacher improve) as being far more important than summative uses of evaluation (such as to make tenure or promotion decisions). Furthermore, Young, Delli, and Johnson (1999) found that students adopt different cognitive schemas, with resulting differences in evaluations, when these different purposes are used as the basis for evaluation.

However, of all the student variables that might affect evaluations of teaching, perhaps none has generated as much controversy as the role of expected or actual grades, with a number of studies finding positive correlations between grades or expected grades and evaluations of faculty. For example, Aleamoni (1999) documented 37 studies conducted over the last several decades that found small but persistent positive correlations between expected or received grades and the favorability of student evaluations. However, the complexity of this basic phenomenon is also documented by Aleamoni, who cited 24 correlational studies from the same period that did not find any relationship between grades and student evaluations. In attempting to explain the factors that could be involved in this relationship, if it exists, Wachtel (1998) laid out three possible explanations: (1) students reciprocate the leniency their teachers show them (the leniency hypothesis), (2) students evaluate good teachers favorably because quality teaching enables the students to perform up to their full potential (the validity hypothesis), and (3) pre-existing differences among students affect both teaching effectiveness and student evaluations (the student characteristic hypothesis).

Chambers and Schmitt (2002) proposed a model to explain how the leniency and validity hypotheses might operate. They suggest that students develop a mental scheme about how they expect to be graded in a particular class, based on how they have been graded in other classes that seem to involve similar amounts of effort. If this comparison process produces a positive discrepancy (that is, the same amount of effort that led to a B in other classes earns an A in the class in question), the result will be a favorable attribution of teacher skill and higher evaluations ("The teacher did a good job, and that's why my effort enabled me to earn a higher grade."). If, on the other hand, the comparison is negative (the same amount of effort that led to a B in other classes produces a C in the class in question), the result will be an unfavorable attribution of teacher skill and lower evaluations ("I did more poorly than expected because my teacher did not do a good job."). Their evidence supports this model: they computed a difference score comparing each student's grade in a particular class with his or her grades in other classes and, as expected, positive discrepancies were significantly positively correlated with several dimensions of teaching effectiveness, including class organization and instructor involvement. At the individual level, the greater the positive discrepancy, the higher the student's evaluations of teaching.
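
To make this comparison process concrete, the following sketch (in Python, our illustration rather than Chambers and Schmitt's actual procedure) computes a discrepancy score of the kind their model describes; the grade-point mapping and function name are hypothetical.

# Illustrative sketch only: compute a grade discrepancy score for one
# target class by comparing the grade earned there with the mean grade
# earned in other classes assumed to require similar effort.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

def grade_discrepancy(target_grade, other_grades):
    # Positive: graded better than in comparable classes (favorable
    # attribution); negative: graded worse (unfavorable attribution).
    baseline = sum(GRADE_POINTS[g] for g in other_grades) / len(other_grades)
    return GRADE_POINTS[target_grade] - baseline

print(grade_discrepancy("A", ["B", "B", "B", "B"]))  # +1.0, favorable
print(grade_discrepancy("C", ["B", "B", "B", "B"]))  # -1.0, unfavorable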

The research of Chambers and Schmitt (2002) raises several questions about the comparison process. For example, is the comparison always an even-handed one, in which different courses are weighed against a constant level of effort needed to achieve a certain grade? We argue that this is not necessarily the case: Students entering a particular course often have expectations about their grade based on commonly held stereotypes about the course being "easy" or "hard," and this expectation may produce a perceived leniency (or, conversely, a perceived difficulty) that may influence their evaluations of the instructor. For example, a hypothetical student who usually earns Bs but who is struggling to earn a B in a course perceived as "easy" may attribute this unexpected difficulty to the instructor's poor teaching methods, and thus evaluate the teacher less favorably. On the other hand, the same hypothetical student, believing that his or her typical effort will not be sufficient to earn a B in a class perceived as "hard," may attribute some of his or her success to the teacher's skill, and elevate the evaluations accordingly, if he or she actually earns a B in that class.

In the current study, we examined the relationships among students' perceptions of course difficulty, their grades, and their evaluations of their instructors. Based on the work of Chambers and Schmitt (2002) and others, we hypothesized that a mental process involving comparison between a current course and other courses is carried out, based in part on the perceived difficulty of the current course. Specifically, we expected to find that student evaluations of a professor would be negatively affected when the course turned out to be harder than was originally thought, and we expected this effect to be above and beyond the suppressive effects of low grades themselves on student evaluation of instruction. We also hypothesized the reverse effect--that ratings of the professor would be higher for students who viewed the course as easier than initially expected. Additionally, we anticipated that these evaluations would be elevated above and beyond whatever effects there were from the grades alone.

Method

Participants

The participants included 157 students (129 women, 28 men) who were enrolled in either an introductory psychology class (N = 86) or one of four sections of an introductory psychological statistics course (N = 71) at Eastern Illinois University. Ninety-five percent of the participants were traditionally-aged college students.

Materials

For the purpose of this study, we developed a brief evaluation instrument we called the Course Experience Questionnaire (CEQ). The CEQ consists of 10 six-point Likert-scale items (1 = strongly disagree, 6 = strongly agree) designed to evaluate typical instructor behaviors. Included among the 10 items are two summative items: item 6, "The instructor was generally an effective teacher, compared to other instructors at this university," and item 7, "I would recommend this professor to a friend." In addition, participants responded to the following item: This course was:

--More difficult than I expected.

--Less difficult than I expected.

--About as difficult as I expected.

Design and Procedure

The introductory psychology course was chosen as a course that many students expect to be fairly easy, but that turns out to be more difficult than anticipated. On the other hand, many students expect the statistics course to be fairly difficult, and it turns out to be more manageable than initially thought. By surveying these two courses, we expected the data to yield differential numbers of students responding that the course was more or less difficult than they anticipated.

The CEQ was administered during the same class period as the regularly-used student evaluation instrument, and this administration occurred during the last two weeks of class in a 15-week semester. The instructor was not present during the administration of these instruments.

Results

We used the participants' responses to the perceived difficulty of the course to place each participant into one of three groups: Those who found the course more difficult than expected, those who found it less difficult than expected, and those whose perceptions of difficulty were about equal to their expectations.

To evaluate the success of this sampling strategy (i.e., surveying both introductory psychology and statistics students), we conducted a 2 (introductory vs. statistics) x 3 ("more difficult," "less difficult," "about as difficult") chi-square test of independence. As expected, the two factors were not independent, χ²(2, N = 157) = 24.17, p < .001. Although the percentage of students who reported that the course was "about as difficult" as they expected (45%) was essentially identical for students in the introductory psychology course (N = 39) and those in the statistics course (N = 32), 47% (N = 40) of the introductory psychology students reported that the course was "more difficult" than expected, compared with just 18% (N = 13) of the statistics students. Concomitantly, only 8% (N = 7) of the introductory psychology students reported that the course was "less difficult" than expected, compared with 37% (N = 26) of the statistics students. These results clearly support our decision to sample these two courses.
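
Because the cell counts are reported in full above, the chi-square test can be checked directly; the following sketch (in Python, using scipy) is our reconstruction from those counts, not the authors' analysis script.

# Reconstruction of the reported chi-square test of independence.
from scipy.stats import chi2_contingency

# Rows: introductory psychology (N = 86), statistics (N = 71).
# Columns: "more difficult," "less difficult," "about as difficult."
table = [[40, 7, 39],
         [13, 26, 32]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square({dof}, N = 157) = {chi2:.2f}, p = {p:.6f}")
# Prints chi-square(2, N = 157) = 24.17, p = .000006, matching the
# value reported in the text.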

Our main hypothesis was that differences in perceived difficulty would explain a significant proportion of the variation in student responses to the summative items on the evaluation instrument, above and beyond any variance that could be explained solely on the basis of the grade earned in the class. To test this hypothesis, we conducted a one-way analysis of covariance (ANCOVA) on each of the two summative items on the CEQ, using the three levels of perceived difficulty as the independent variable and the grade earned in the class as the covariate. In conformity with current guidelines concerning best practices in the statistical reporting of analysis of variance (ANOVA) and ANCOVA results, we report the mean square error (MSE) for each analysis (American Psychological Association, 2001).
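
The following sketch shows what an ANCOVA of this form looks like in practice; it is our reconstruction (in Python, using pandas and statsmodels), with a small made-up data frame and hypothetical column names standing in for the actual data, which are not available.

# Sketch of a one-way ANCOVA: perceived difficulty as the factor,
# grade earned as the covariate (illustrative data, not the study's).
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({
    # rating: response to a summative CEQ item (1-6)
    "rating": [6, 5, 5, 4, 3, 5, 6, 4, 2, 5, 4, 6],
    # difficulty: perceived difficulty relative to expectations
    "difficulty": ["less", "less", "same", "more", "more", "same",
                   "less", "same", "more", "less", "more", "same"],
    # grade: grade earned, coded numerically (A = 4 ... F = 0)
    "grade": [4, 3, 3, 2, 2, 3, 4, 3, 1, 3, 2, 4],
})

model = smf.ols("rating ~ C(difficulty) + grade", data=df).fit()
# Type II sums of squares give the F test for perceived difficulty
# after adjusting for the grade covariate.
print(anova_lm(model, typ=2))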

For item 6 ("The instructor was generally an effective teacher, compared to other instructors at this university"), the ANCOVA confirmed that grade had a significant effect on evaluations, F(1, 153) = 19.91, MSE = .69, p < .01. In general, students who earned higher grades in the course tended to evaluate the professor more favorably on item 6 than did students who earned lower grades. However, after adjusting for the covariate, grade earned, the perception of course difficulty also had a significant effect on evaluations, F(2, 153) = 8.51, MSE = .69, p < .01. Specifically, students who reported that the course was less difficult than expected evaluated the professor significantly more favorably (M = 5.58) than did students who reported that the course was more difficult than expected (M = 4.79), F(1, 153) = 16.97, MSE = .69, p < .001. Students who indicated that the course was about as difficult as expected reported an intermediate mean evaluation on item 6 (M = 5.23), which was significantly lower than the evaluation given by the students who indicated that the course was less difficult than expected [F(1, 153) = 5.62, p = .005], but not significantly greater than the mean evaluation of students who indicated that the course was more difficult than expected [F(1, 153) = 2.46, p = .06]. In sum, students who indicated that the course was less difficult than expected evaluated the instructor more favorably than did students who indicated that the course was harder than expected, or than those who indicated that it was about as difficult as expected.

For item 7 ("I would recommend this professor to a friend"), the ANCOVA confirmed that the grade earned in the class had a significant effect on the evaluation of the professor, F(1, 153) = 7.47, MSE = .94, p = .007. As on item 6, students who earned higher grades in the class tended to evaluate the professor more highly on item 7 than did students who received lower grades. However, after adjusting for the covariate, grade earned, the perception of course difficulty also had a significant effect on evaluations on item 7, F(2, 153) = 19.25, MSE = .94, p < .001. Specifically, students who indicated that the course was less difficult than expected evaluated the professor significantly more favorably (M = 5.91) than did students who indicated that the course was more difficult than expected (M = 4.34), F(1, 153) = 37.76, MSE = .94, p < .001. Once again, students who indicated that the course was about as difficult as expected reported an intermediate mean evaluation on item 7 (M = 5.13), which was significantly lower than the evaluation given by students who indicated that the course was less difficult than expected [F(1, 153) = 14.57, p < .001], and it was also significantly greater than the evaluation of students who indicated that the course was more difficult than expected [F(1, 153) = 11.45, p < .001]. For both items 6 and 7, the pattern of findings is strikingly similar: In both cases, students who reported that the course was more difficult than expected evaluated the instructor less favorably than did students who reported that the course was easier than anticipated, regardless of the grade earned in the class.

Conclusions and Discussion

We hypothesized that students' evaluations of instruction would be associated with their perceptions of the difficulty of the class, such that on summative evaluation items, students who found the class more difficult than they expected would tend to assign lower evaluations, while those who found the class easier than they expected would assign somewhat higher evaluations. Moreover, we hypothesized that this effect of perceived difficulty would be independent of the grade that students earned in the class. These hypotheses were confirmed: On two summative evaluation items, undergraduate students evaluated different professors more or less favorably based, in part, on their perceptions of the class as more or less difficult than they had anticipated. Moreover, these effects are not explainable simply in terms of the grade earned in the class. Although grades were clearly somewhat predictive of student evaluations in the expected direction, with higher grades associated with more favorable evaluations, student expectations also played an important role. Our findings suggest that when high grades are earned in a class that is perceived as easy, the high grades per se might not have a particularly strong effect on student evaluations. On the other hand, our findings also suggest that when high grades are earned in a course perceived as difficult, they may play a strong role in helping to produce high student evaluations.

College professors have long debated the possible relationship between the grades they assign and the evaluations they receive from students, with some faculty members perhaps feeling that the relationship is one of simple exchange. For example, Baldwin and Blattner (2003) found that college teachers commonly believe grade expectation to be a very strong influence in student evaluation. In a survey of factors that college professors believe influence student evaluation of instruction, Baldwin and Blattner found that 40% of their respondents believed that a student's grade expectation in the course had a significant influence on his or her evaluation of instruction (compared with, for example, 53% believing that the professor's organizational skills and preparation had a significant influence). Such beliefs might underlie the relatively widespread distrust of student evaluations of faculty by some of their intended users, namely the faculty themselves (Nasser & Fresko, 2002).

Wachtel (1998) discussed three possible factors underlying the putative relationship between student evaluations and grades: leniency, validity, and pre-existing differences among students. Chambers and Schmitt (2002) demonstrated a possible mechanism by which the leniency or validity hypotheses could operate to produce student evaluations that are biased by grades earned. They suggest that students build a mental scheme in which their grade in a given course is weighed against the effort expended in that course and in others. If the grade seems high relative to effort, the student may attribute particular skill to the teacher and evaluate him or her accordingly. If the result of the comparison process is negative (i.e., similar effort is followed by lower grades than usually achieved), the teacher may be evaluated less favorably. Our findings extend this notion and also establish some moderating conditions. We demonstrated that incoming students in two typical courses in the psychology curriculum view the difficulty level of each course very differently. Moreover, this perception of difficulty interacts with grade expectations and is associated with rather substantial effects on student evaluations. Therefore, we may infer that, in some plausible scenarios, the effects we found could reduce some of the effects seen operating in the Chambers and Schmitt study. For example, consider the case of a hypothetical student who is taking five classes, including a first statistics course, in a given semester. If the student believes that his or her typical effort leads to an A, and the student then earns an A in four classes and a B in statistics, Chambers and Schmitt might predict that the student would evaluate the statistics teacher less favorably than the other four professors. Our findings suggest a different prediction: A student who earns a B in a class that was easier than he or she expected may evaluate the instructor more favorably than the instructors of classes in which As were earned, especially if those classes were harder than expected.

As powerful as these effects seem to be, several important caveats should be borne in mind when interpreting our findings. First, the correlational nature of this study means that no causal inference can be drawn about the effect of grading leniency on student evaluations. Although that interpretation cannot be ruled out in our study, there are other interpretations. For example, it may be the case that the role of the teacher is construed differently, and more favorably, by high-performing students (Wachtel, 1998). Even assuming that a direct, reciprocal relationship could in general be established between grades and student evaluations, a professor who sought to raise student evaluations through the simple expedient of grading more leniently could be disappointed if the incoming students already perceived the class as easy. In that case, deliberately grading leniently might not be associated with a dramatic increase in student evaluations, or even any increase at all.

On a more positive note, our research supports the idea that good, clear communication between students and faculty regarding the purpose of evaluation is important if the process is to be valid and useful to all concerned. For example, Spencer and Schmelkin (2002) showed that students, for their part, are not sure that faculty actually use the evaluation information for either formative or summative purposes. Yet it also seems to be the case that despite this lack of confidence, students adopt different cognitive schemas, with resulting differences in evaluations, when these different purposes are used as the basis for evaluation (Young, Delli, & Johnson, 1999).

Finally, we believe our results may be helpful to faculty members who are trying to conduct fair and useful student evaluations of instruction. Specifically, our findings could be interpreted as identifying a potential source of bias in student evaluations of which faculty should be aware, and against whose potential negative effects they need to guard (Baldwin & Blattner, 2003).

Author Note

Correspondence concerning this article may be addressed to William E. Addison, Department of Psychology, 600 Lincoln Ave., Eastern Illinois University, Charleston, IL 61920. (e-mail: cfwea@eiu.edu)

References

Aleamoni, L. M. (1999). Student rating myths versus research facts from 1924 to 1998. Journal of Personnel Evaluation in Education, 13, 153-169.

American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.

Babad, E., Avni-Babad, D., & Rosenthal, R. (2004). Prediction of students' evaluations from brief instances of professors' nonverbal behavior in defined instructional situations. Social Psychology of Education, 7, 3-33.

Baldwin, T., & Blattner, N. (2003). Guarding against potential bias in student evaluations: What every faculty member needs to know. College Teaching, 51(1), 27-32.

Best, J. B., & Addison, W. E. (2000). A preliminary study of perceived warmth of professor and student evaluations. Teaching of Psychology, 27, 60-62.

Chen, Y., & Hoshower, L. B. (2003). Student evaluation of teaching effectiveness: an assessment of student perception and motivation. Assessment & Evaluation in Higher Education, 28, 71-87.

Chambers, B. A., & Schmitt, N. (2002). Inequity in the performance evaluation process: How you rate me affects how I rate you. Journal of Personnel Evaluation in Education, 16, 103-112.

Kulik, J. A., & Kulik, C. C. (1974). Student ratings of instruction. Teaching of Psychology, 1, 51-57.

Marsh, H. W. (1980). The influence of student, course, and instructor characteristics in evaluations of university teaching. American Educational Research Journal, 17, 219-237.

Marsh, H. W. (1984). Students' evaluations of university teaching: Dimensionality, reliability, validity, potential biases, and utility. Journal of Educational Psychology, 76, 707-754.

Marsh, H. W., & Roche, L. A. (1997). Making students' evaluations of teaching effectiveness effective: The critical issues of validity, bias, and utility. American Psychologist, 52, 1187-1197.

McKeachie, W. J. (1997). Student ratings: The validity of use. American Psychologist, 52, 1218-1225.

Nasser, F., & Fresko, B. (2002). Faculty views of student evaluation of college teaching. Assessment and Evaluation in Higher Education, 27, 187-198.

Spencer, K. J., & Schmelkin, L. P. (2002). Student perspectives on teaching and its evaluation. Assessment and Evaluation in Higher Education, 27, 397-409.

Wachtel, H. K. (1998). Student evaluation of college teaching effectiveness: A brief review. Assessment and Evaluation in Higher Education, 23, 191-211.

Young, I. P., Delli, D. A., & Johnson, L. (1999). Student evaluation of faculty: Effects of purpose on pattern. Journal of Personnel Evaluation in Education, 13, 179-190.

WILLIAM E. ADDISON, JOHN BEST, AND JOHN D. WARRINGTON

Eastern Illinois University