| Evidence dimension | What “strong” looks like | What I will extract (from paper) | Evidence weakness flags |
|---|---|---|---|
| Data provenance & leakage | Citable sources; explicit preprocessing; anti-leakage split protocol | Exact dataset names, filters, and split definitions | Train/test overlap through shared entities; temporal leakage |
| Baselines & fairness | Strong comparators; same features; same evaluation protocol | Which baselines; whether tuned; metric parity | Cherry-picked baselines; different protocols |
| Metric validity | Ranking and calibration metrics both reported; confidence intervals or variance shown | Which of AUPR, AUROC, and accuracy are reported, and how uncertainty is reported | Only point estimates; no variance; improper negative sampling |
| Ablations & attribution | Component-wise ablations; interpretable drivers | Ablation list; effect sizes; where performance comes from | No ablations; performance “attributed” without tests |
| Generalization tests | Cross-dataset and/or cold-start settings | Which OOD splits and what “cold” means operationally | Same distribution only; entity overlap hidden |
| Reproducibility package | Code/data availability; hyperparameters; seeds | Any public repository; appendix hyperparameters | Missing seeds; closed source; underspecified preprocessing |
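
The two leakage flags in the first row (entity overlap and time leakage) can be checked mechanically once a paper's splits are available. The following is a minimal sketch, not a method taken from any paper under review: the record layout, the field names `entity_id` and `when`, and the helper names `entity_overlap` and `temporal_leakage` are all assumptions made for illustration.

```python
from datetime import date


def entity_overlap(train, test, key):
    """Return the set of entity IDs that appear in both splits."""
    return {row[key] for row in train} & {row[key] for row in test}


def temporal_leakage(train, test, time_key):
    """Return training rows dated on or after the earliest test row."""
    if not train or not test:
        return []
    cutoff = min(row[time_key] for row in test)
    return [row for row in train if row[time_key] >= cutoff]


# Toy splits (hypothetical data, for illustration only).
train = [
    {"entity_id": "A", "when": date(2020, 1, 5)},
    {"entity_id": "B", "when": date(2020, 3, 1)},
    {"entity_id": "C", "when": date(2021, 2, 1)},
]
test = [
    {"entity_id": "C", "when": date(2021, 1, 1)},
    {"entity_id": "D", "when": date(2021, 6, 1)},
]

shared = entity_overlap(train, test, "entity_id")
late = temporal_leakage(train, test, "when")

print(sorted(shared))                   # entities seen in both splits
print([r["entity_id"] for r in late])   # training rows that postdate the test cutoff
```

A non-empty result from either check does not by itself invalidate a paper's numbers, since some protocols allow entity overlap by design, but it does indicate that the split definition in the paper needs a closer read.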