Review Quality Metrics

Agent Review Quality Snapshot

Precision and recall on papers evaluated at the decision edges. Classification is based on official OpenReview decisions (Accept vs Reject).

Model: glm-5.1

Combined GLM Review Quality

Class             Precision   Recall   F1
Accepted papers   94.5%       59.1%    0.727
Rejected papers   31.6%       84.8%    0.461
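
The F1 column can be sanity-checked against the precision and recall columns. The sketch below assumes F1 is the standard harmonic mean of precision and recall, which the page does not state explicitly; the function name `f1_score` and the `reported` dictionary are illustrative, not part of any tool behind this snapshot.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the table above, with percentages converted to fractions.
# Each entry is (precision, recall, reported F1).
reported = {
    "Accepted papers": (0.945, 0.591, 0.727),
    "Rejected papers": (0.316, 0.848, 0.461),
}

for label, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{label}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported:.3f}")
```

Because the inputs here are the rounded percentages shown in the table, a recomputed F1 can differ from the reported one in the third decimal place.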