1) Generalization & leakage risk (ML-for-proteomics)
The reported quality of transformer-based and PTM prediction models often depends heavily on how train/test splits are constructed, whether similar peptides appear on both sides of the split (information leakage), and whether evaluation covers shifts in instrument or acquisition method. Even when results look strong, the deciding question is whether performance holds on truly independent datasets and on unseen modification chemistries.
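To make the leakage concern concrete, here is a minimal sketch of a group-aware split that keeps every modified form of a peptide on the same side of the train/test boundary. It is an illustration only, not the procedure of any cited paper. The bracket notation for modifications (e.g. `AS[+79.966]K`) and the helper names `backbone` and `grouped_split` are assumptions made for this example.

```python
import random
import re
from collections import defaultdict


def backbone(peptide: str) -> str:
    """Strip bracketed modification annotations (assumed notation),
    e.g. 'AS[+79.966]K' -> 'ASK'."""
    return re.sub(r"\[[^\]]*\]", "", peptide)


def grouped_split(peptides, test_fraction=0.2, seed=0):
    """Split peptides so that all forms sharing a backbone land in the
    same partition. Hypothetical helper, for illustration only."""
    groups = defaultdict(list)
    for p in peptides:
        groups[backbone(p)].append(p)

    # Shuffle the backbones (not the individual peptides) reproducibly.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])

    train = [p for k in keys if k not in test_keys for p in groups[k]]
    test = [p for k in keys if k in test_keys for p in groups[k]]
    return train, test


peptides = [
    "ASK", "AS[+79.966]K",
    "PEPTIDE", "PEPT[+79.966]IDE",
    "LMNK", "QRSTK", "M[+15.995]LK",
]
train, test = grouped_split(peptides, test_fraction=0.4, seed=1)
```

Grouping by exact backbone only removes the most direct form of leakage. Near-duplicates (missed-cleavage variants, single-residue substitutions, homologous sequences) would need a similarity-based grouping, and instrument or method shift needs a separate held-out dataset.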
2) Baseline strength & metric choice
"Improved identification" can be sensitive to decision thresholds (precision/recall operating points), rescoring pipeline details, and FDR estimation assumptions. How credible a claimed improvement is depends on whether baselines are tuned fairly and whether the gain persists across multiple settings.
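One way to see the threshold sensitivity is to count accepted target identifications at several FDR levels rather than one. The sketch below uses the simple target-decoy estimate (decoys divided by targets above a score cutoff). It assumes a concatenated target-decoy search with roughly equal database sizes and distinct scores; the function name `accepted_targets_at_fdr` is hypothetical, and real pipelines typically use more refined estimators.

```python
def accepted_targets_at_fdr(psms, max_fdr=0.01):
    """Count target PSMs accepted at the most permissive score cutoff
    whose estimated FDR (decoys / targets) does not exceed max_fdr.

    psms: iterable of (score, is_decoy) pairs; higher score = better.
    Assumes distinct scores (ties are not handled specially).
    """
    ranked = sorted(psms, key=lambda x: x[0], reverse=True)
    targets = decoys = 0
    best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Record the cutoff if the running FDR estimate is acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            best = targets
    return best


# Toy data: compare the same scores at two different FDR levels.
toy_psms = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
strict = accepted_targets_at_fdr(toy_psms, max_fdr=0.0)
loose = accepted_targets_at_fdr(toy_psms, max_fdr=0.5)
```

Running the same function on baseline and rescored outputs at, say, 0.1%, 1%, and 5% FDR shows whether a reported gain is stable or specific to one operating point.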
3) PTMs are a long-tail problem
PTM discovery and prediction models can overfit to the common PTMs seen in training. For zero-shot settings, claims depend critically on whether rare and labile PTMs are represented in the evaluation and whether augmentation reflects real modification chemistry rather than superficial label perturbations.
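A pooled accuracy figure can hide the long tail, because frequent PTMs dominate the average. The following sketch reports macro-averaged recall separately for PTMs that were seen in training and those that were not. The record format `(ptm_name, was_correct)`, the function names, and the example PTM labels are assumptions for illustration, not data from any cited study.

```python
from collections import defaultdict


def per_ptm_recall(records):
    """records: iterable of (ptm_name, was_correct) pairs.
    Returns recall per PTM type."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for ptm, correct in records:
        totals[ptm] += 1
        hits[ptm] += int(correct)
    return {ptm: hits[ptm] / totals[ptm] for ptm in totals}


def macro_recall_by_split(records, seen_ptms):
    """Macro-average recall over PTM types, split into those present in
    training ('seen') and those absent ('unseen'). Returns None for an
    empty group."""
    recall = per_ptm_recall(records)
    out = {}
    for name, want_seen in (("seen", True), ("unseen", False)):
        vals = [r for ptm, r in recall.items() if (ptm in seen_ptms) == want_seen]
        out[name] = sum(vals) / len(vals) if vals else None
    return out


# Toy records with made-up outcomes.
records = [
    ("Phospho", True), ("Phospho", True), ("Phospho", False),
    ("Acetyl", True),
    ("Crotonyl", False), ("Crotonyl", True),
]
summary = macro_recall_by_split(records, seen_ptms={"Phospho", "Acetyl"})
```

Macro averaging weights each PTM type equally, so a rare modification counts as much as a common one. A large gap between the "seen" and "unseen" values would suggest that zero-shot claims rest mostly on common chemistries.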
Note: I cannot verify the leakage controls or the exact baseline configurations from the input alone; doing so requires reading the full methods section of each cited paper.