     Quick Explanation



    ABodyBuilder3 concise verdict

    ABodyBuilder3 is an optimized, scalable reimplementation of ABodyBuilder2 that (1) introduces ProtT5 language-model embeddings for residue representation, (2) adds a pLDDT uncertainty head to replace compute-heavy ensembles, and (3) modestly improves CDR loop RMSD (including a marginal gain for CDRH3) while accelerating training and inference. Code and weights are publicly available on GitHub and Zenodo.

    Primary source: Kenlay et al. Bioinformatics 2024




     Long Explanation



    Detailed critical review of ABodyBuilder3

    Summary of what the paper does

    ABodyBuilder3 is presented as an improved, more scalable antibody variable region structure predictor derived from ABodyBuilder2. Key changes are: a vectorized, OpenFold-style implementation for speed, with bf16/mixed precision training; optional ProtT5 protein language model residue embeddings (concatenated heavy and light chain embeddings); a pLDDT per-residue uncertainty head that replaces the previous ensemble-derived uncertainty; and careful structure relaxation using OpenMM or YASARA to improve stereochemistry and accuracy. The authors release code and model weights publicly via GitHub and Zenodo (data and weights).


    What works well (strengths)

    • Scalability and engineering: the authors reimplemented ABodyBuilder2 with vectorization and mixed/bf16 precision yielding a reported >3x speedup and multi-GPU scaling, which matters for screening large candidate sets.
    • Reproducibility and openness: code and model weights are publicly available on GitHub and Zenodo, which supports reproducibility and reuse.
    • Practical uncertainty estimate: adding a pLDDT head gives a single-model intrinsic confidence estimate that correlates with RMSD and can substitute for an ensemble, lowering compute cost during inference.
    • Targeted CDR improvements: modest but measurable RMSD gains for CDR loops (CDRH3 reduced from ~2.54 to ~2.42 Å in their test), with LM embeddings giving a further marginal gain (CDRH3 ~2.40 Å), which is meaningful because CDRH3 is the hardest region to model.
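    To make the pLDDT head mentioned above more concrete, here is a minimal sketch of how a per-residue confidence score can be read out of binned logits. It assumes an AlphaFold-style head that classifies each residue into equal-width lDDT bins over [0, 100] and reports the expected value; the review does not state the bin layout ABodyBuilder3 uses, so the function name `expected_plddt`, the four-bin example, and the logits are illustrative assumptions only.

```python
import math


def expected_plddt(logits):
    """Convert one residue's bin logits into a 0-100 confidence score.

    Assumption: bins are equal-width and span [0, 100] (AlphaFold-style).
    The number of bins is inferred from len(logits).
    """
    n_bins = len(logits)
    width = 100.0 / n_bins
    # Numerically stable softmax over the bins.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Expected value taken over the bin centres.
    centres = [(i + 0.5) * width for i in range(n_bins)]
    return sum(p * c for p, c in zip(probs, centres))


# Hypothetical logits for a single residue with 4 bins.
uniform = expected_plddt([0.0, 0.0, 0.0, 0.0])           # no preference between bins
confident = expected_plddt([-10.0, -10.0, -10.0, 10.0])  # mass on the top bin
```

    A head of this kind needs only one forward pass, which is the source of the compute saving relative to an ensemble.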

    Concerns, limitations and blindspots

    1. Magnitude of accuracy gains: The improvements in RMSD, while consistent, are modest (CDRH3 improvement ~0.12 Å vs ABodyBuilder2; the language-model advantage is marginal and often statistically non-significant according to the authors). That makes the claim of "state-of-the-art" improvement true but incremental and conditional on evaluation choices.
    2. Evaluation dataset choices and potential selection bias: The authors filter SAbDab heavily (remove nanobodies, apply a resolution cutoff, remove ultra-long CDRH3 >30, remove species occurring >15 times) and limit validation/test to human antibodies. These sensible curation choices reduce noise but also limit generalization to non-human repertoires, ultra-long loops (e.g., bovine antibodies) and low-resolution structures. This must be kept in mind when applying ABodyBuilder3 outside the curated regime. Supporting excerpt from the paper (Dataset filtering, https://dx.doi.org/10.1093/bioinformatics/btae576): "… 3.5 standard deviations from the mean for any of the six summary statistics given by ABangle… We also filter out ultra-long CDRH3 loops by removing any sequence with a CDRH3 of over 30 residues."
    3. Possible dataset contamination via LM pretraining: The authors note that antibody-specific LMs can introduce pretraining contamination; ProtT5 is trained broadly, which reduces that risk but does not eliminate it. An explicit, careful leakage analysis (excluding structures/sequences used during LM pretraining) would strengthen the claim that gains are not due to data leakage.
    4. Metrics are RMSD-centric: RMSD is useful but incomplete: developability, epitope contact accuracy, paratope geometry, and energy landscape properties are not reported. For therapeutic design, downstream metrics (binding interface accuracy, predicted developability features) are critical and not assessed here.
    5. Uncertainty calibration and use: pLDDT correlates with RMSD and the authors propose thresholds (pLDDT>85 retains ~32% of models, with >80% of those having CDRH3 RMSD <2 Å). However, pLDDT calibration (reliability diagrams, Brier scores) and how to use pLDDT in workflows (the trade-off between coverage and accuracy) require more elaboration for practitioners.
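    On the first concern, one way a reader could check whether a gain of roughly 0.1 Å is distinguishable from noise is a paired bootstrap over per-structure RMSD differences. The sketch below is not the authors' statistical procedure; the function name `paired_bootstrap_ci` and the RMSD values are hypothetical and serve only to show the mechanics.

```python
import random


def paired_bootstrap_ci(rmsd_a, rmsd_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for mean(rmsd_a - rmsd_b).

    Both lists must refer to the same test structures in the same order.
    Returns (mean_difference, ci_low, ci_high).
    """
    if len(rmsd_a) != len(rmsd_b) or not rmsd_a:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(rmsd_a, rmsd_b)]
    n = len(diffs)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = []
    for _ in range(n_resamples):
        # Resample structures with replacement, keeping each pair intact.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    ci_low = means[int((alpha / 2) * n_resamples)]
    ci_high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(diffs) / n, ci_low, ci_high


# Hypothetical per-structure CDRH3 RMSDs (in Å) for a baseline and a new model.
baseline = [2.6, 2.4, 3.1, 1.9, 2.8, 2.2, 2.9, 2.0]
candidate = [2.5, 2.5, 2.8, 1.9, 2.6, 2.3, 2.7, 1.9]
mean_diff, ci_low, ci_high = paired_bootstrap_ci(baseline, candidate)
```

    If the interval contains zero, the improvement cannot be separated from resampling noise on that test set, which is the situation described above for the language-model variant.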

    Technical appraisal of methods

    Architecture: ABodyBuilder3 keeps the ABodyBuilder2/AlphaFold-Multimer-inspired pipeline: per-residue node features, relative positional edge encodings, eight sequential structure modules with invariant point attention and backbone updates, and torsion-angle-based full-atom reconstruction. Training losses: FAPE (with clamping), torsion-angle loss, structural violation penalties; optimizer RAdam with cosine annealing restarts and early stopping. These are sensible, modern design choices for antibody-focused structure prediction.
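    For readers unfamiliar with the "cosine annealing restarts" schedule named above, the following sketch shows the standard formula with a fixed restart period. The period and learning-rate bounds are hypothetical placeholders, not the paper's hyperparameters, and `cosine_restart_lr` is an illustrative helper rather than code from the ABodyBuilder3 repository.

```python
import math


def cosine_restart_lr(step, period, lr_max, lr_min=0.0):
    """Learning rate at `step` under cosine annealing with fixed-length restarts."""
    if period <= 0:
        raise ValueError("period must be positive")
    t_cur = step % period  # steps elapsed since the most recent restart
    cosine = math.cos(math.pi * t_cur / period)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


# Hypothetical settings: restart every 10 steps, decay from 1e-3 towards 1e-5.
schedule = [cosine_restart_lr(s, period=10, lr_max=1e-3, lr_min=1e-5) for s in range(25)]
```

    The rate falls smoothly within each period and jumps back to `lr_max` at every restart, which periodically lets the optimizer escape a poor basin before settling again.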

    Performance summary (numerical highlights)

    The authors report mean RMSD per region on their test set. Representative numbers (mean RMSD in Angstroms) are given in Table 1 of the paper; key facts:

    • ABodyBuilder2 CDRH3 RMSD ~2.54 Å; ABodyBuilder3 CDRH3 ~2.42 Å; ABodyBuilder3-LM ~2.40 Å.
    • Framework and several CDRs show modest improvements; some per-region gains are dependent on relaxation method (YASARA vs OpenMM).
    • pLDDT correlates with RMSD; Pearson correlations per region improved for ABodyBuilder3-LM relative to ensemble-derived uncertainty in ABodyBuilder2 (e.g., CDRH3 Pearson up to ~0.73 for ABodyBuilder3-LM).
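    The correlation figures above can be reproduced on one's own predictions with a plain Pearson coefficient between a per-model confidence value and the observed RMSD. This is a minimal sketch with made-up numbers. Note that raw pLDDT (higher is better) against RMSD (lower is better) yields a negative coefficient, so a positive value such as ~0.73 presumably reflects a magnitude or an error-style score; the review does not state the sign convention.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-model CDRH3 pLDDT and CDRH3 RMSD (Å) values.
plddt = [95.0, 90.0, 80.0, 70.0, 60.0]
rmsd = [0.8, 1.1, 1.9, 2.6, 3.5]
r = pearson(plddt, rmsd)  # strongly negative for this toy data
```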

    Reproducibility and resource availability

    Code and weights are public (GitHub, Zenodo). Training and evaluation details (batch size, optimizer, training schedule, clamp values, validation/test sizes) are reported, and common toolchains are used (ANARCI for IMGT numbering, SAbDab dataset), enabling other groups to reproduce results given adequate compute. The paper includes explicit dataset curation steps and training hyperparameters, supporting reproducibility.

    How practitioners should use ABodyBuilder3

    • Use for high-throughput screening of human-variable-region antibody candidates where moderate CDR accuracy and calibrated per-residue uncertainties suffice.
    • Apply pLDDT thresholding to select a subset of high-confidence models for downstream intensive computations (docking, MD, developability predictions) as per the authors' guidance (pLDDT>85 retains ~32% with high accuracy on CDRH3).
    • Do not expect large improvements on edge-case antibodies (non-human, ultra-long CDRH3, heavily glycosylated, or subject to conformational change induced by antibody-antigen complex formation) without further validation.
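    The thresholding advice above amounts to a coverage-versus-accuracy trade-off, which can be measured on a labelled validation set before committing to a cutoff. The sketch below assumes one summary pLDDT value per model (for example, a mean over CDRH3 residues, which is an assumption about how the score is aggregated); `threshold_report` and the data are hypothetical.

```python
def threshold_report(plddt, rmsd, plddt_cutoff=85.0, rmsd_cutoff=2.0):
    """Return (coverage, accuracy) for a confidence cutoff.

    coverage: fraction of models with pLDDT above the cutoff.
    accuracy: fraction of the retained models with RMSD below rmsd_cutoff
              (NaN when nothing is retained).
    """
    if len(plddt) != len(rmsd) or not plddt:
        raise ValueError("inputs must be non-empty and of equal length")
    kept = [r for p, r in zip(plddt, rmsd) if p > plddt_cutoff]
    coverage = len(kept) / len(plddt)
    if not kept:
        return coverage, float("nan")
    accuracy = sum(1 for r in kept if r < rmsd_cutoff) / len(kept)
    return coverage, accuracy


# Hypothetical validation set: per-model pLDDT and CDRH3 RMSD in Å.
scores = [91.0, 88.0, 86.0, 84.0, 79.0, 72.0, 65.0, 90.0]
errors = [1.2, 1.6, 2.4, 1.9, 2.8, 3.1, 3.9, 1.0]
coverage, accuracy = threshold_report(scores, errors)
```

    Sweeping `plddt_cutoff` over a range of values and tabulating both numbers shows how much throughput is given up for each gain in expected accuracy.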

    What would falsify the main claims

    Concrete disproof scenarios the authors themselves acknowledge: (1) show no RMSD improvement over ABodyBuilder2 on independent, withheld datasets; (2) show pLDDT correlates poorly with RMSD when evaluated on broader antibody repertoires; (3) demonstrate that ProtT5 gains come from LM pretraining contamination rather than improved representations. The paper lists similar possible falsifications and suggests self-distillation as a next step.

    Suggested improvements and follow-up experiments (practical)

    1. Evaluate on external non-human antibody sets and ultra-long CDRH3 cases (bovine) to quantify generalization limits.
    2. Provide calibration curves and reliability diagrams for pLDDT and compare to ensemble-derived uncertainties (Brier score, ECE) to quantify calibration.
    3. Report interface/contact metrics for antibody-antigen complexes (even if only predicted antigens) to assess paratope accuracy for therapeutic applications.
    4. Run ablation studies isolating ProtT5 effect vs one-hot across diverse lengths and sequence identities, and include leakage analysis to rule out pretraining contamination.
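    The calibration metrics named in suggestion 2 can be computed in a few lines once each prediction has a confidence expressed as a probability and a binary outcome (for instance, "CDRH3 RMSD below 2 Å"). How to map pLDDT onto a probability is itself a modelling choice that the review leaves open, so the inputs below are hypothetical probabilities, not pLDDT values, and both helper names are illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(probs, outcomes, n_bins=10):
    """ECE with equal-width probability bins over [0, 1]."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, o))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(p for p, _ in members) / len(members)
        hit_rate = sum(o for _, o in members) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of the data.
        ece += (len(members) / total) * abs(mean_conf - hit_rate)
    return ece


# Hypothetical predicted probabilities and observed outcomes (1 = accurate).
probs = [0.9, 0.9, 0.9, 0.9, 0.3, 0.3]
outcomes = [1, 1, 1, 0, 0, 1]
brier = brier_score(probs, outcomes)
ece = expected_calibration_error(probs, outcomes)
```

    The per-bin pairs of mean confidence and hit rate computed inside `expected_calibration_error` are also exactly the points one would plot in a reliability diagram.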

    Overall assessment

    ABodyBuilder3 is a solid, well-engineered incremental advance in antibody structure prediction with good reproducibility and useful practical features (speed, uncertainty head, LM embeddings). It improves CDR modelling modestly and provides tools and data for adoption. The major limitations are the modest size of the RMSD gains, restricted evaluation regime, and the need for better calibration/validation of the uncertainty head across diverse repertoires.


    Updated: October 05, 2025

    BGPT Paper Review



    Study Novelty

    80%

    The paper combines established AlphaFold/ABodyBuilder architectures with modern engineering, LM embeddings, and a pLDDT head; the integration and scaling improvements are novel in the antibody-specific space but conceptually build on recent advances.



    Scientific Quality

    90%

    High methodological rigor: clear dataset curation, explicit training protocol, public code and weights, sensible baselines, and appropriate ablations; modest limitations are acknowledged (dataset filtering, LM contamination risk) but do not undermine technical quality.



    Study Generality

    60%

    The approach generalizes to antibody variable regions within the curated human-like regime and to large-scale screening, but strict filtering and human-only validation limit generality to non-human repertoires and ultra-long loops.



    Study Usefulness

    80%

    Practically useful: faster training/inference, public code/weights, and an intrinsic uncertainty head make it valuable for industrial and academic antibody screening and prioritization workflows.



    Study Reproducibility

    90%

    High reproducibility: public code and weights, explicit preprocessing and training hyperparameters, and use of standard datasets (SAbDab) and tools (ANARCI). Full reproduction requires similar compute resources.



    Explanatory Depth

    70%

    Good technical detail on architecture, loss functions, and training but limited mechanistic analysis on why language-model embeddings help particular loop classes and limited downstream functional validation.


    🎁 Authors: Collect 435 Free Science Tokens (β‰ˆ $43.5 USD)




     Analysis Wizard



    Downloading ABodyBuilder3 model weights and evaluating per-region RMSD and pLDDT calibration on an uploaded test set (SAbDab subset), producing reliability diagrams and per-CDR statistics.



     Hypothesis Graveyard



    LM embeddings directly encode 3D coordinates for CDRH3: falsified because gains are marginal and the model still requires structure modules and torsion-angle reconstruction.


    Ensemble-based uncertainty is always superior to pLDDT: falsified here, since single-model pLDDT showed higher Pearson correlation with RMSD than the previous ensemble in some regions, though calibration caveats remain.

    Paper reviewed: ABodyBuilder3: improved and scalable antibody structure predictions
