The paper proposes ModelDetector, a CNN (ResNet-18) that predicts among 9 amino-acid substitution models from summary statistics (pairwise and triplet-derived 20×20 frequency matrices and relative rate features). It reports ~97.45% accuracy vs ModelFinder ~97.88% on simulation, and substantial speedups on long alignments, e.g., minutes for ~1,000,000 sites versus thousands of seconds for ML model selection.
Key skepticism: the model is trained and tested on simulations generated from real alignments, so performance on truly heterogeneous real alignments (including model mixture across sites, non-stationarity, rate heterogeneity beyond what's simulated, and alignment artifacts) remains the main uncertainty.
Goal: Replace computationally expensive maximum-likelihood (ML) model selection for amino-acid substitution models with a fast deep-learning classifier trained on alignment-derived summary statistics.
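To make "alignment-derived summary statistics" concrete, here is a minimal sketch of one way a pairwise 20×20 frequency matrix could be computed. The exact definition used in the paper is not given in this summary, so everything here is an assumption for illustration: the function name `pairwise_frequency_matrix`, the amino-acid ordering, the symmetric counting over all sequence pairs, and the handling of gaps.

```python
from itertools import combinations

import numpy as np

# Assumed ordering of the 20 standard amino acids (illustrative choice).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def pairwise_frequency_matrix(alignment):
    """Return a symmetric 20x20 matrix of amino-acid pair frequencies.

    `alignment` is a list of equal-length aligned sequences (strings).
    For every pair of sequences and every site, the pair of residues
    observed at that site is counted in both orientations. This is an
    assumed definition, not necessarily the one used in the paper.
    """
    counts = np.zeros((20, 20), dtype=float)
    for seq_a, seq_b in combinations(alignment, 2):
        for x, y in zip(seq_a, seq_b):
            i, j = AA_INDEX.get(x), AA_INDEX.get(y)
            if i is None or j is None:
                continue  # skip gaps and ambiguous characters
            counts[i, j] += 1
            counts[j, i] += 1
    total = counts.sum()
    # Normalize to frequencies; leave an all-zero matrix untouched.
    return counts / total if total > 0 else counts


toy_alignment = ["AR", "AR", "AN"]
F = pairwise_frequency_matrix(toy_alignment)
```

This naive double loop is quadratic in the number of sequences; it is meant only to show the shape of the feature, not to reproduce the speedups reported in the paper.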
Scientific distinction (known vs inferred): In this work, the mapping from summary stats → model class is inferred by supervised learning; correctness is demonstrated only on the simulated-data generating family. Generalization to real protein evolution regimes that violate simulation assumptions is still an open question.
Model set (9 classes):
These model families (e.g., JTT, WAG, LG) are classical empirical AA substitution models, while Q.pfam and clade-specific models are estimated via methods like QMaker (as referenced in the paper).
Key implication: the deep model is effectively trained to invert a particular simulator+estimation pipeline, which may or may not match real complexities.
Reported average test accuracies (classification correctness): pModelDetector 96.78%, ModelDetector 97.45%, ModelFinder 97.88%.
Runtime comparison for 50 taxa and varying alignment lengths (includes summary-statistics creation + prediction for DL).
The paper reports high correlations between summary statistics computed from simulations and those computed from the real alignments they used for simulation parameterization. For example, it reports average correlation of F2 matrices ranging from 0.88 (yeast) to 0.94 (mammal), and >95% of alignments have correlation >0.8.
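The comparison described above can be sketched as follows. This summary does not state which correlation measure the paper uses, so the Pearson correlation over flattened matrices is an assumption, as are the helper names `matrix_correlation` and `fraction_above`; the 0.8 threshold is the one quoted above.

```python
import numpy as np


def matrix_correlation(m_sim, m_real):
    """Pearson correlation between two same-shaped matrices, flattened.

    Assumed measure: the source does not say which correlation is used.
    """
    a = np.asarray(m_sim, dtype=float).ravel()
    b = np.asarray(m_real, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("matrices must have the same number of entries")
    return float(np.corrcoef(a, b)[0, 1])


def fraction_above(correlations, threshold=0.8):
    """Fraction of per-alignment correlations strictly above `threshold`."""
    c = np.asarray(correlations, dtype=float)
    return float(np.mean(c > threshold))


# Toy usage with made-up matrices (not data from the paper).
rng = np.random.default_rng(0)
real = rng.random((20, 20))
simulated = real + 0.01 * rng.random((20, 20))
r = matrix_correlation(simulated, real)
share = fraction_above([0.9, 0.7, 0.85, 0.95])
```

A high value of `r` only says that the two matrices vary together entry by entry; it does not by itself show that the features a classifier relies on are matched.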
Skeptical interpretation: High correlations suggest that the specific simulator-derived summary statistics align well with statistics from the alignments used to parameterize simulation trees. But that does not guarantee that the learned mapping will hold under other sources of mismatch (e.g., different among-site rate distributions, non-stationarity, mixture models, alignment errors, compositional heterogeneity). The paper itself highlights simulator-realism concerns and the lack of real ground-truth labels.
Use BGPT to drill into (i) where simulation realism may fail and (ii) how to design stronger evaluation beyond "simulation accuracy".