There is growing interest in using measures of teacher applicant quality to improve hiring decisions, but the statistical properties of such measures are poorly understood. We present evidence on the dimensionality and reliability of structured ratings solicited from teacher applicants’ references.

We find that the reference ratings capture only one underlying dimension of applicant quality, which may indicate a need to broaden the range of questions posed to professional references. Point estimates of inter-rater reliability range from 0.23 to 0.31 and are significantly lower for novice applicants. It is difficult to judge whether these levels of reliability are high or low in the current context, given how little evidence exists on comparable applicant assessment tools.
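
For readers unfamiliar with inter-rater reliability, the sketch below shows one common way such a coefficient can be computed: a one-way random-effects intraclass correlation for a balanced table of ratings. This is an illustration only and not necessarily the estimator used in the paper; the function name `icc_oneway` and the small ratings table are hypothetical.

```python
import numpy as np

def icc_oneway(ratings):
    """One-way random-effects ICC for a balanced applicants x raters array.

    Illustrative only: assumes every applicant is rated by the same
    number of references (k >= 2), with at least two applicants.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape                      # n applicants, k raters per applicant
    row_means = x.mean(axis=1)          # mean rating for each applicant
    grand_mean = x.mean()
    # Between-applicant mean square
    msb = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    # Within-applicant (between-rater) mean square
    msw = np.sum((x - row_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical ratings: four applicants, two references each (made-up numbers)
example = [[4, 5], [3, 2], [5, 3], [2, 4]]
print(round(icc_oneway(example), 3))
```

Under this definition, a value near 1 means references rating the same applicant largely agree, while values in the range reported above mean that most of the variation in ratings comes from differences between raters rather than differences between applicants.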

Dan Goldhaber, Cyrus Grout, Malcolm Wolff, Patricia Martinkova (2020). Evidence on the Dimensionality and Reliability of Professional References’ Ratings of Teacher Applicants. CALDER Working Paper No. 237-0620.


