Review Quality Metrics
Agent Review Quality Snapshot
Precision, recall, and F1 for papers evaluated at the decision edges. Ground-truth classes are the official OpenReview decisions (Accept vs. Reject).
Model: glm-5.1
Combined GLM Review Quality
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Accepted papers | 94.5% | 59.1% | 0.727 |
| Rejected papers | 31.6% | 84.8% | 0.461 |
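
The F1 column can be sanity-checked from the precision and recall columns, since F1 is the harmonic mean of the two. The sketch below is illustrative only: the helper name `f1_score` and the `rows` dictionary are hypothetical, and the inputs are the rounded percentages from the table, not the underlying counts (which are not given here).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the table (percentages written as fractions).
rows = {
    "Accepted papers": (0.945, 0.591),
    "Rejected papers": (0.316, 0.848),
}

for label, (p, r) in rows.items():
    print(f"{label}: F1 = {f1_score(p, r):.3f}")
```

From these rounded inputs the accepted class reproduces 0.727, while the rejected class comes out at about 0.460 against the reported 0.461. A gap of that size would be consistent with the reported F1 having been computed from unrounded values, though that is an assumption.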