     Quick Explanation



    Quick appraisal: This eLife paper (DOI 10.7554/eLife.92991) trains CNNs on ENCODE histone PTMs to predict gene expression and models dCas9-p300 as local H3K27ac deposition. Predictions show high cross-gene rank concordance with measured dCas9-p300 activations (Spearman ≈ 0.8 across genes) but weaker intra-gene ranking. Important strengths are the large ENCODE training set, MNase-seq in HEK293T, and open code and data. The main limitations are mechanistic assumptions about dCas9-p300 (restricted to H3K27ac), use of HEK293 as a proxy for HEK293T in places, a limited perturbation sample size (8 genes in HEK293T), and imperfect modeling of gRNA efficacy. See the visual analysis below and the suggested experiments to close these gaps.




     Long Explanation



    Visual paper review: Predicting the effect of CRISPR-Cas9-based epigenome editing (DOI: 10.7554/eLife.92991)

    Visual summary (figures first, short captions)
    • Figure A: Model vs baseline performance (CNN vs ridge regression) on cross-cell-type endogenous expression.
    • Figure B: Predicted vs observed fold-change ranking for dCas9-p300 perturbations across genes (rank Spearman ≈ 0.8 reported).
    • Interpretation panels and targeted recommendations follow the figures.

    Concise evidence-driven critique (evidence inline)

    • Training data and baseline performance: The authors train CNNs on ENCODE histone PTM ChIP-seq (-log10 p-value tracks) and RNA-seq across 13 human cell types and show that CNNs outperform ridge regression on held-out chromosomes and held-out cell types (median cross-cell-type Spearman ρ ≈ 0.53 vs 0.39). This supports the view that non-linear spatial patterns around the TSS contribute predictive signal.
    • Normalization & data hygiene: They adapted S3norm to correct batch-like differences across ChIP-seq tracks, which is appropriate and necessary for comparability; the method is grounded in Xiang et al. (S3norm).
    • Perturbation model assumptions (strength and Achilles’ heel): The in silico perturbation models dCas9-p300 as (i) steric hindrance at the guide site via MNase occupancy, (ii) local Gaussian H3K27ac deposition (σ kernel), and (iii) acetylation only at nucleosome-occupied positions (m_i × Gaussian × exp(-5 m_j) × λ). This is a reasonable, testable simplification consistent with prior work showing that dCas9-p300 primarily increases H3K27ac nearby, but it ignores (a) p300 promiscuity and non-histone acetylation, (b) cross-talk with other PTMs (H3K4me3 etc.), and (c) potential recruitment of transcription factors or chromatin remodelers by dCas9 complexes. The authors acknowledge all three as possible causes of lower within-gene ranking accuracy.
    • Validation & generalization: The model ranks cross-gene fold-changes in HEK293T dCas9-p300 experiments with Spearman ≈ 0.8 (strong), but within-gene gRNA ranking is weaker and the Perturb-seq K562 validation shows only moderate correlation (~0.47). Generalization therefore depends on gene- and locus-level context, and gRNA efficacy variance and unmodeled biology remain important.
    • Reproducibility and openness: The paper provides code and data (GitHub repository, SRA/GEO accessions for MNase-seq and Perturb-seq), which greatly helps reproducibility. Computational methods (architectures, hyperparameters, ensemble of 100 replicates) and experimental protocols (CUT&RUN, MNase-seq, qPCR) are described. The reproducibility rating is high, though a full re-run requires downloading ENCODE data and adequate system resources.
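
    The perturbation formula in the third bullet can be sketched numerically. This is one possible reading of m_i × Gaussian × exp(-5 m_j) × λ, assuming m is MNase occupancy scaled to [0, 1], i indexes the bin receiving acetylation, and j is the bin containing the guide. The function name, the binning, and the default σ and λ values are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def simulate_p300_deposit(h3k27ac, mnase, guide_bin, sigma=2.0, lam=1.0):
    """Return a copy of the H3K27ac track with a simulated dCas9-p300 deposit.

    Assumed reading of the formula quoted in the review:
        delta_i = m_i * Gaussian(i - j; sigma) * exp(-5 * m_j) * lam
    """
    h3k27ac = np.asarray(h3k27ac, dtype=float)
    mnase = np.asarray(mnase, dtype=float)
    bins = np.arange(h3k27ac.size)
    # Unnormalised Gaussian kernel centred on the guide bin (peak value 1).
    kernel = np.exp(-0.5 * ((bins - guide_bin) / sigma) ** 2)
    # Steric hindrance: high nucleosome occupancy at the guide site damps the effect.
    accessibility = np.exp(-5.0 * mnase[guide_bin])
    # Acetylation is added only where nucleosomes are present (m_i > 0).
    delta = mnase * kernel * accessibility * lam
    return h3k27ac + delta

# Toy locus: nucleosomes flank an open bin at position 5.
baseline = np.zeros(11)
mnase = np.array([0, 0, 0, 1.0, 1.0, 0.0, 1.0, 1.0, 0, 0, 0])
perturbed = simulate_p300_deposit(baseline, mnase, guide_bin=5)
blocked = simulate_p300_deposit(baseline, mnase, guide_bin=4)
```

    In this toy locus, a guide in the open linker (bin 5) acetylates the flanking nucleosomes, while a guide placed on a nucleosome (bin 4) is strongly damped by the exp(-5 m_j) term.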

    Where the claims are strongest

    1. The models reliably learn established spatial relationships between marks and expression (e.g., promoter H3K4me3/H3K27ac positive, H3K27me3/H3K9me3 negative), consistent with prior literature on histone marks predicting expression.
    2. Cross-gene ranking of dCas9-p300 efficacy is well predicted, which is useful for selecting genes likely to respond to acetyltransferase-based activation.
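
    Cross-gene and within-gene rank concordance can diverge sharply, which a small example makes concrete. The sketch below uses a pure-Python Spearman implementation on made-up fold-changes; the gene names and numbers are hypothetical, and summarising each gene by its best guide is an assumption rather than the paper's stated procedure.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per-guide fold-changes: {gene: (predicted, observed)}.
guides = {
    "GENE_A": ([1.2, 1.5, 1.1], [1.3, 1.1, 1.6]),
    "GENE_B": ([4.0, 5.0, 4.5], [6.0, 5.0, 5.5]),
    "GENE_C": ([20.0, 25.0, 22.0], [30.0, 24.0, 33.0]),
}

# Cross-gene: summarise each gene by its best guide, then rank genes.
pred_max = [max(p) for p, _ in guides.values()]
obs_max = [max(o) for _, o in guides.values()]
cross_gene_rho = spearman(pred_max, obs_max)

# Within-gene: rank guides inside each gene separately.
within_gene_rho = {g: spearman(p, o) for g, (p, o) in guides.items()}
```

    Here the genes are ordered perfectly across the set while the guide order inside each gene is wrong, the same qualitative pattern the review describes.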

    Key limitations, blindspots & risks

    • Mechanistic simplification: Modeling dCas9-p300 as only boosting H3K27ac at nucleosome-occupied positions omits non-histone acetylation, TF recruitment, and downstream chromatin remodeling, all of which can vary across neighboring guides (citing p300 promiscuity).
    • gRNA efficacy variability: Large observed differences between neighboring gRNAs (even within 50 bp) indicate that sequence- and locus-specific factors (PAM context, nucleosome breathing, local TF occupancy) strongly modulate efficacy and are not fully captured by current scoring metrics. The authors note this and call for screens to map local efficacy.
    • Training–perturbation mismatch: Endogenous PTM distributions used for training may not span the extreme acetylation introduced by a strong dCas9-p300 perturbation. This is an extrapolation risk, since neural networks can fail to extrapolate beyond their training ranges.
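
    The extrapolation risk in the last bullet can be screened for before a prediction is trusted. A minimal sketch, assuming the training inputs are available as flat lists of per-bin H3K27ac signal; the helper name and the example values are hypothetical.

```python
def fraction_out_of_range(train_values, perturbed_values):
    """Fraction of perturbed bins whose signal exceeds the training maximum."""
    train_max = max(train_values)
    exceeding = sum(1 for v in perturbed_values if v > train_max)
    return exceeding / len(perturbed_values)

train_h3k27ac = [0.0, 1.2, 3.5, 8.0, 2.1]   # hypothetical training signal
perturbed_h3k27ac = [0.5, 9.5, 12.0, 7.9]   # hypothetical post-perturbation signal
share = fraction_out_of_range(train_h3k27ac, perturbed_h3k27ac)
```

    A large share suggests the simulated perturbation pushes the input outside what the model saw during training, so its output there deserves extra caution.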

    Actionable suggestions (experiments & modeling) to improve predictive power

    1. Generate a systematic paired dataset: For several genes, tile gRNAs in dense windows (e.g., every 5–10 bp across ±500 bp) and, after each guide, measure both local H3K27ac/H3K4me3 (CUT&RUN or ChIP-seq) and RNA output (qPCR or Perturb-seq). This would create direct training data mapping guide → PTM change → expression change and break the current two-step assumption.
      Why: this removes the reliance on modeled perturbations and lets ML learn the true mapping from guide to PTM perturbation patterns.
    2. Measure non-histone acetylation and trans effects after dCas9-p300 (proteomics time-course) to quantify p300 promiscuity and potential trans-acting changes (in the style of Weinert et al.).
    3. Refine gRNA efficacy models for epigenome editors: combine sequence-based features, MNase occupancy, local TF ChIP signal, DNA methylation, and experimentally-measured on-target CUT&RUN H3K27ac to create an epigenome-specific guide scoring function.
    4. Train models to predict PTM change (guide β†’ delta PTM tracks) first (supervised using CUT&RUN), then map PTM-change β†’ expression with a separate expression model; compare to current direct perturbation-through-simulation pipeline.
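
    The tiling design in suggestion 1 is easy to size up in code. A minimal sketch, assuming guides are placed on a regular grid around a TSS coordinate; the function name and the example coordinate are hypothetical, and real designs would additionally be constrained by PAM availability.

```python
def tile_guide_positions(tss, half_window=500, step=10):
    """Return candidate guide centre coordinates tiled across tss ± half_window."""
    return list(range(tss - half_window, tss + half_window + 1, step))

# Hypothetical TSS coordinate; compare the two spacings mentioned in the text.
coarse = tile_guide_positions(tss=100_000, half_window=500, step=10)
dense = tile_guide_positions(tss=100_000, half_window=500, step=5)
n_coarse, n_dense = len(coarse), len(dense)
```

    Under these assumptions the grid yields 101 candidate positions per gene at 10 bp spacing and 201 at 5 bp spacing, which gives a rough sense of the experimental scale before PAM filtering.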

    Short checklist for readers who want to re-run or extend the study

    • Download ENCODE ChIP-seq & RNA-seq tracks (13 cell types listed in paper) and Avocado imputations used for missing marks; apply S3norm parameters as described (IMR-90 as reference)
    • Obtain MNase-seq (PRJNA892960) and Perturb-seq (GSE255610) for exact replication.
    • Start with the 10 kb TSS-centered context and CNN architecture described (5 conv blocks, 32 kernels width 5, pooling, 16-unit dense) and ensemble across replicates for stability.
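
    Ensembling across replicates, mentioned in the last checklist item, amounts to averaging predictions from independently trained models. A minimal sketch in which each "model" is a hypothetical stand-in callable rather than the paper's CNN; the replicate count of 100 follows the ensemble size quoted earlier in this review.

```python
import numpy as np

def ensemble_predict(models, inputs):
    """Average predictions over replicate models and report their spread."""
    preds = np.stack([model(inputs) for model in models])  # (n_models, n_genes)
    return preds.mean(axis=0), preds.std(axis=0)

rng = np.random.default_rng(0)
# Stand-in replicates: the same linear rule with a different random offset each.
offsets = rng.normal(0.0, 0.1, size=100)
models = [lambda x, b=b: x.sum(axis=1) + b for b in offsets]

inputs = np.ones((4, 3))  # 4 hypothetical genes, 3 features each
mean_pred, spread = ensemble_predict(models, inputs)
```

    The per-gene spread is a cheap indicator of how much a single prediction depends on the training seed.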

    Conclusions: balanced and evidence-weighted

    The paper convincingly shows that histone PTM patterns across cell types are predictive of endogenous gene expression and that a model of local H3K27ac deposition plus an expression predictor can rank expected dCas9-p300 fold-changes across genes with strong agreement (Spearman ≈ 0.8). However, predicting within-gene gRNA-level efficacy remains challenging because the mechanistic link between a single-guide-induced histone change and transcription likely involves additional variables (non-histone acetylation, TF recruitment, nucleosome breathing, off-target acetylation) not fully captured in the perturbation model. Overall, this is a high-quality, transparent, and useful advance; the next steps require richer paired perturbation and PTM datasets to close the mechanistic gaps.




    Updated: March 09, 2026

    BGPT Paper Review



    Study Novelty

    90%

    The paper combines genome-scale histone PTM→expression models with an explicit, MNase-weighted in silico model of dCas9-p300 perturbations and validates predictions experimentally (MNase-seq + multi-gRNA qPCR + Perturb-seq), which is a novel, integrative approach beyond prior separate modeling or dCas9-p300 studies.



    Scientific Quality

    80%

    High-quality dataset integration, transparent code/data sharing, appropriate normalization (S3norm), thorough CNN architecture and cross-cell validation, and experimental validation. Minor concerns include using HEK293 data as a proxy for HEK293T in places, limited perturbation sample size (8 genes) for general mechanistic claims, and known extrapolation limits for ML models. No evidence of prompt injection or data tampering was detected.



    Study Generality

    80%

    The modeling framework (PTM→expression) and the perturbation model are broadly applicable to other epigenome editors and cell types, but current conclusions about dCas9-p300 generalize best to promoter-proximal acetyltransferase-based activators and cell lines similar to training set; extrapolation to very different chromatin states or in vivo tissues requires more perturbation data.



    Study Usefulness

    90%

    Provides a practical computational pipeline and experimental design guidance for labs planning epigenome editing; the ability to rank likely gene responders is immediately useful for prioritizing targets and gRNAs for dCas9-p300 activation studies, and availability of MNase-seq and code makes replication and extension feasible.



    Study Reproducibility

    90%

    Methods are described in sufficient detail; code and datasets (GitHub, SRA/GEO accessions) are provided; the CNN architecture, training hyperparameters, normalization strategy, and ensemble averaging are explicit. Reproducibility relies on access to ENCODE tracks and compute (GPU) but overall is strong.



    Explanatory Depth

    80%

    The study interprets learned spatial PTM features mechanistically (promoter vs gene-body effects), formulates a biologically plausible dCas9-p300 perturbation model, and tests alternatives via parameter sweeps. It advances mechanistic understanding, though it does not fully resolve non-histone or trans-acting mechanisms.





     Analysis Wizard



    Building and evaluating a two-stage model: (1) predict guide-induced H3K27ac changes from local sequence+MNase+TF profiles, (2) predict expression from PTM profiles; using HEK293T MNase-seq and ENCODE PTMs for supervised training.



     Hypothesis Graveyard



    dCas9-p300 acts purely through H3K27ac deposition at nucleosomes: falsified, as p300 has broad acetylome effects and trans-acting impacts that can alter activation beyond local H3K27ac patterns.


    Existing gRNA on-target scores (sequence-based) fully explain epigenome editing variability: falsified, because neighboring gRNAs with similar scores produce widely different activation, implicating unmodeled chromatin features and guide-local context.
