Date of Award


Degree Name

EdD Doctor of Education

Dissertation Committee

Fred J. Galloway, EdD, Chair; Port R. Martin, EdD, Member; Theresa M. Monroe, EdD, Member


empirical examination, multiple regression analysis, nonperformance factors, performance rating, raters bias, United States Navy, Wherry’s theory


In a perfect performance rating system, both the recall and rating of an individual's behavior would precisely mirror the performance of that ratee. In reality, however, the rater's recall and subsequent rating often fail to reflect the true performance of the individual. The difference between actual and perceived performance has been attributed in the literature to conscious or unconscious rater bias. In 1952, Wherry developed a rating theory based on a series of mathematical equations that precisely defined the relationship between the performance of the ratee and the rater's recall of the observed performance. Key to his theoretical work was the fundamental rating equation, which stated that a rating score was equal to the actual performance of the ratee plus an observation and recall bias component and random error. The goal of this study was to test the appropriateness of this framework by applying it to an actual performance rating system used by the United States Navy on board a particular ship. Drawing on Wherry's basic theory and on data on rater and ratee nonperformance characteristics (e.g., gender, race, education, height, and smoker/non-smoker status), the study used multiple regression analysis to identify the nonperformance factors that affected the accuracy of the rating process for 423 individuals. The results supported Wherry's theory in that four of the eight variables contained in the study's final regression model strongly indicated the existence of rater bias. Ratees who were white, had personality types that matched those of the first raters, or were of the same race as the second raters generally received higher evaluation scores than ratees who were not, while ratees who smoked received lower evaluation scores.
Even though more research is clearly needed to determine the factors that may have produced these biases, their existence in such a high-stakes performance appraisal system suggests that, at a minimum, the Navy needs to develop a strategy that educates its raters on the possibility that they might be subconsciously discriminating against others based on race, personality match, and smoking status.
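
As a reading aid, the fundamental rating equation described in the abstract, and one way it could be carried into a regression, can be sketched as follows. The notation is assumed for illustration and is not taken from Wherry or from the dissertation: R_{ij} is the score that rater j gives ratee i, T_i is the ratee's actual performance, B_{ij} is the observation and recall bias component, \varepsilon_{ij} is random error, and x_{k,ij} are the nonperformance characteristics (for example race, personality match, or smoking status).

```latex
% Illustrative notation only; the symbols are assumptions, not the author's.
\begin{align}
  % Rating = actual performance + bias component + random error
  R_{ij} &= T_i + B_{ij} + \varepsilon_{ij} \\
  % Hypothetical regression form: K nonperformance characteristics x_k
  R_{ij} &= \beta_0 + \sum_{k=1}^{K} \beta_k \, x_{k,ij} + u_{ij}
\end{align}
```

Under this sketch, and assuming the nonperformance characteristics are unrelated to actual performance, a coefficient \beta_k that differs reliably from zero would point to rater bias associated with characteristic k. The abstract reports eight variables in the final model, so K = 8 would correspond to that model; the exact specification used in the study is not given here.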

Document Type

Dissertation: Open Access