
Assess an author's data and outputs

See the raw experimental evidence behind an author's publications and reproducibility signals.











     Quick Explanation



    Concise verdict

    David Baker is a world-leading, highly productive pioneer in computational protein design and structure prediction, with extraordinary citation impact and a sustained record of experimentally validated de novo designs. Strengths and limitations are summarized below with primary evidence.

    • High impact, demonstrated by landmark papers that introduced Rosetta, RFdiffusion and RFdiffusion2, with experimental validation across enzyme, binder and assembly design
    • Deep-learning-based advances (RFdiffusion) that generalize de novo design workflows and yield experimentally confirmed binders and assemblies
    • State-of-the-art atom-level enzyme design (RFdiffusion2) and metallohydrolase successes with X‑ray validation and high kcat/KM values





     Long Explanation



    Author Review — David Baker (evidence-based, critical)

    Core summary visualization

    Three concise visual figures: (1) annual works & citations trend (OpenAlex extracted), (2) distribution of Baker's top-paper citation impact, (3) experimental validation success rates across representative design campaigns (RFdiffusion / RFdiffusion2 / metallohydrolases / enzyme campaigns).


    Evidence-based analysis (visual-first)

    1. Productivity & impact: OpenAlex metadata reports ~1,580 works, a cited_by_count of ≈140,908 and an h-index of ≈189, placing Baker among the most-cited researchers in computational structural biology. These numbers are consistent with a long, continuous output spanning Rosetta-era papers to modern diffusion-based design (RFdiffusion/RFdiffusion2).
    2. Experimental validation rate: Baker-led groups routinely move designs to experiments (nsEM/cryoEM/X-ray, binding assays, kinetics). RFdiffusion reported binder hit rates of ~19% across five targets and multiple structural corroborations. RFdiffusion2 reported atom-level enzyme scaffolding with in vitro activity across multiple reactions, including metallohydrolases with kcat/KM up to 53,000 M⁻¹ s⁻¹ and X-ray structures (PDB 9PYL/9PYJ). Together these are strong evidence that the design-to-function pipeline works at scale.
    3. Methodological breadth: Baker's lab spans classical physical modeling (Rosetta), modern deep generative models (RFdiffusion, RFdiffusion2), sequence-design networks (ProteinMPNN), ligand-aware design (LigandMPNN), and integration with AlphaFold/AF3 for filtering. The result is a pipeline that combines physics, ML, and rigorous structural validation.
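
The h-index quoted in item 1 is the largest number h such that h works each have at least h citations. A minimal sketch of that definition follows. The citation counts are made up for illustration and are not Baker's per-paper data, and the function name `h_index` is a hypothetical helper rather than part of any OpenAlex tooling.

```python
def h_index(citations):
    """Return the largest h such that h works have at least h citations each."""
    # Rank works from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` works all have >= rank citations
        else:
            break
    return h


# Hypothetical per-paper citation counts, for illustration only.
example_counts = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_counts))
```

Because the index is capped by the number of works, a very high value such as ≈189 requires both a large body of work and sustained citation of many individual papers, not just a few landmark publications.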

    Strengths (evidence-first)

    • High translational throughput: many design campaigns progress to expression, structural determination (cryoEM/X-ray) and kinetic/binding assays (see RFdiffusion/RFdiffusion2 papers). Experimental data and code are shared in multiple cases, supporting reproducibility and reuse
    • Method integration: Combines physical modeling and ML; e.g., RFdiffusion uses RoseTTAFold backbone priors and AF2/ESMFold for validation; RFdiffusion2 uses DFT-derived theozymes to link chemistry to scaffold generation, improving catalytic preorganization and experimental yields
    • Open tools & data: Rosetta and multiple Baker-lab tools (RFdiffusion, RFdiffusion2, ProteinMPNN, LigandMPNN) have public code repos or data deposits, facilitating community adoption and independent replication

    Limitations, blindspots & critical caveats

    • Experimental throughput vs depth: many campaigns screen dozens to hundreds of designs but validate a smaller subset experimentally; success rates vary by task and remain far from universal (e.g., binder hit rates of ~19% across five targets), and some campaigns report limited in vivo translation
    • In vitro activity vs native enzymes: kcat/KM values of designed enzymes, while remarkable for de novo designs (e.g., RFdiffusion2 metallohydrolases up to 53,000 M⁻¹ s⁻¹), are generally lower than, or more variable than, those of evolved natural enzymes for many chemistries; in vivo functionality and stability are still open questions in many cases
    • Dependence on in silico filters: work relies heavily on AF2/ESMFold/Chai-1 filtering; these predictors can produce false positives/negatives and are trained on PDB-like folds, potentially biasing selection toward what predictors prefer (PDB-centric bias)
    • Publication/selection bias: high-profile successes are emphasized; negative design campaigns and failed routes are less visible, an understandable but important bias when assessing generality.
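
A reported hit rate such as ~19% carries sampling uncertainty that depends on how many designs were tested, which is one reason success rates are hard to compare across campaigns. The sketch below computes a Wilson score interval for a proportion. The counts (19 hits out of 100 tested designs) are hypothetical placeholders chosen only to match the quoted percentage; the review does not state the actual numerators or denominators.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


# Hypothetical counts for illustration: 19 binders out of 100 tested designs.
low, high = wilson_interval(19, 100)
print(f"hit rate 19%, approx. 95% interval: {low:.3f} to {high:.3f}")
```

With fewer tested designs the interval widens, so a per-target hit rate drawn from a few dozen designs is a much weaker estimate than a pooled rate.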

    Overall scientific judgment (evidence-weighted)

    Given the high experimental validation rates for selected tasks, public code and data, breadth of methods (physics + ML + chemistry), and very strong community uptake, Baker demonstrates exceptional scientific strength in computational protein design. Key concerns remain about generalization beyond tested chemistries and targets and about the selective nature of experimental follow-up, but these are active, explicitly discussed limitations in the cited papers.


    What would change this assessment (falsification tests)

    • Independent replication by multiple labs showing RFdiffusion/RFdiffusion2 designs fail to fold/function at similar rates or that predictor-filtering systematically overestimates success would markedly reduce confidence.
    • Large-scale negative-result repositories (many failed campaigns) becoming public would lower perceived generality; conversely, broad demonstrations of in vivo function/stability and substrate generality would increase confidence.





    Updated: February 09, 2026

    BGPT Author Review



    Scientific Quality

    100%

    Extensive track record of pioneering computational protein design, consistent experimental validation (structures and functions), major methodological innovations (Rosetta → RFdiffusion → RFdiffusion2), high citation impact and broad community tool adoption; potential blindspots are methodological biases and selective experimental follow-up but these do not materially reduce the author's core scientific strength.



    Communication Quality

    90%

    Publishes clear method papers with code/data releases, comprehensive methods sections, structural deposits and reproducible pipelines; writes accessible review articles and provides community tools, enabling adoption; occasional high-level complexity requires domain expertise.



    Author Novelty

    100%

    Repeatedly introduces novel paradigms (Rosetta-based de novo enzymes, de novo fold design, diffusion-based backbone generation, atom-level motif conditioning) that shift field capabilities; sustained innovation across decades.



    Scientific Rigor

    90%

    High experimental rigor with structural validation (X-ray/cryoEM), kinetics and mutational controls; transparent code/data deposition; however, selection bias toward successful designs and reliance on in silico filters are identifiable limitations.




     Analysis Wizard



    Pulls OpenAlex-like yearly counts and generates reproducible plots + per-paper citation tables to quantify productivity and impact for Baker's works from provided counts_by_year data.
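
As a rough illustration of the kind of summary described above, the sketch below turns a list of yearly records into a year-sorted table with running totals. The record layout (`year`, `works_count`, `cited_by_count`) is assumed to follow the OpenAlex author `counts_by_year` field, and the sample values are invented placeholders, not Baker's real counts. The function name `summarize_counts_by_year` is hypothetical, and plotting is left out.

```python
def summarize_counts_by_year(counts_by_year):
    """Return year-sorted rows with cumulative works and citations."""
    rows = []
    total_works = 0
    total_cites = 0
    for record in sorted(counts_by_year, key=lambda r: r["year"]):
        total_works += record.get("works_count", 0)
        total_cites += record.get("cited_by_count", 0)
        rows.append({
            "year": record["year"],
            "works": record.get("works_count", 0),
            "citations": record.get("cited_by_count", 0),
            "cumulative_works": total_works,
            "cumulative_citations": total_cites,
        })
    return rows


# Invented sample records (assumed OpenAlex-like shape), deliberately unsorted.
sample = [
    {"year": 2023, "works_count": 3, "cited_by_count": 40},
    {"year": 2021, "works_count": 2, "cited_by_count": 10},
    {"year": 2022, "works_count": 4, "cited_by_count": 25},
]

for row in summarize_counts_by_year(sample):
    print(row["year"], row["works"], row["citations"],
          row["cumulative_works"], row["cumulative_citations"])
```

The resulting rows can be handed to any plotting or table library to reproduce a works-and-citations trend.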



     Hypothesis Graveyard



    The hypothesis that diffusion-based generation alone can replace physical energy functions for catalysis without quantum-chemistry conditioning is unlikely: RFdiffusion2 shows that atom-level, DFT-derived theozymes remain essential.


    The idea that AF2-based pAE alone is sufficient to guarantee experimental activity is falsified: AF2 improves folding predictions but does not ensure catalytic geometry or kinetics.
