Quick Answer
High-level verdict
Using matched germline and tumor genomes from ~6,000 TCGA patients, the paper reports a pervasive landscape of germline loci that bias (1) the tissue site where cancers arise (395 loci) and (2) the probability of somatic mutations in known cancer genes (17 loci), for a total of 412 germline–tumor associations. It is a solid, hypothesis‑generating, genome‑scale mapping of germline–somatic interactions, with appropriate caveats about power and functional validation.
Long Answer
Visual paper review — "Common genetic variation in the germline influences where and how tumors develop" (Carter & Ideker, 2017)
What they did (methods, succinct)
Assembled matched germline and tumor genotypes from ~6,000 TCGA patients and performed cross‑cancer association scans: (a) loci associated with tumor site; (b) loci associated with somatic mutation status of known cancer genes.
Followed selected loci with expression (eQTL-like) analyses and clinical correlates (example: an 8q24.13 allele associated with breast cancer and ~10‑year younger age at diagnosis).
Performed example mechanistic deep dives (e.g., RBFOX1 intronic enhancer locus associated with higher SF3B1 somatic mutation rate; GNA11/STK11 region linked to PTEN mutation rate and mTOR pathway biology).
All summary numbers and methodological steps come from the paper and are reproduced here to allow critical appraisal and replication planning.
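The association scans described above can be illustrated with a per-locus test of allele carrier status against somatic mutation status. The following is a minimal sketch using a two-sided Fisher exact test on a 2x2 table. It is an assumption for illustration, not the statistical model used in the paper; the function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: carriers / non-carriers of the germline allele (hypothetical labels).
    Columns: tumors with / without a somatic mutation in the gene of interest.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutated carriers with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one.
    p_value = sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Illustrative counts only (not taken from the paper).
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p:.4f}")
```

A genome-scale scan would repeat such a test for every locus and gene pair, which is why the multiple-testing concerns raised later in this review matter.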
Critical appraisal — strengths
Scale and matched data: nearly 6,000 matched germline–tumor samples is large for integrated analyses, enabling global screens for germline influence on somatic evolution.
Wide scope: cross‑cancer design finds loci that bias multiple tumor types and nominates both site‑specific and somatic‑mutation associations—generates testable mechanistic hypotheses rather than lone marker lists.
Mechanistic follow-up: the paper does not stop at association — it probes eQTL/expression and pathway connections (e.g., RBFOX1→SF3B1 splicing link; GNA11/STK11→PTEN→mTOR) that make biological sense and are experimentally falsifiable.
Critical appraisal — limitations
Power & context dependence: cross‑cancer aggregation increases power for pan‑cancer signals but can mask tissue‑specific germline→somatic relationships; the authors acknowledge limited power for tumor‑type‑stratified somatic associations (for example, PIK3CA associations in ER+ breast cancer reported elsewhere were not recovered).
Cohort composition and generalizability: TCGA is a specific cohort (demographics, selection for large resections, survival bias) — signals may not generalize across populations or clinical settings; population stratification and ancestry differences require careful control and replication in diverse cohorts.
Confounding & causality: association ≠ causation. Germline allele–somatic mutation correlations can arise from linkage, population structure, environmental exposures correlated with genotype, treatment histories, or tumor microenvironment effects; functional validation is required to claim mechanistic causality.
Multiple testing & replication: genome‑scale screens risk false positives; the paper reports validated associations but independent cohort replication (outside TCGA) and functional perturbations are necessary to upgrade evidence strength.
Annotation gaps: not all loci are near plausible candidate genes; long‑range regulatory effects or noncoding mechanisms complicate mechanistic interpretation and require chromatin/TF binding data to resolve.
Evidence strength & what would falsify key claims
Evidence strength: supportive but provisional. The associations are genome‑scale and, in some cases, biologically plausible, yet broad mechanistic confirmation and external replication are missing. The claims would be falsified if large independent matched germline–tumor cohorts failed to replicate the reported loci, or if locus‑specific functional tests showed no effect on the nominated pathway or mutation susceptibility.
Paper’s own falsifiability statement: the authors note that failed replication in independent cohorts or in tumor‑type‑specific designs would disprove generalization.
Practical recommendations & immediate next steps
Replication: test top loci in independent matched germline–tumor cohorts (e.g., ICGC, prospective tumor sequencing projects with germline consent) stratified by ancestry and tumor type.
Functional follow-up: allele‑specific reporter assays, CRISPR base editing in isogenic cellular models, allele‑resolved ChIP/ATAC and Hi‑C to confirm regulatory mechanisms for noncoding loci (e.g., RBFOX1 enhancer → SF3B1 association).
Context specificity: run tumor‑type‑stratified association scans (ER+/ER− breast, lung adenocarcinoma subtypes) to uncover associations masked by pan‑cancer aggregation.
Biological assays: test whether germline alleles alter mutational processes (e.g., DNA repair efficiency, replication stress, APOBEC activity) that could explain increased somatic mutation rates at specific genes.
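The context-specificity recommendation above rests on the point that pooling tumor types can dilute a signal present in only one of them. The sketch below illustrates this with per-stratum and pooled odds ratios. All counts are hypothetical, and the table layout (carriers mutated, carriers unmutated, non-carriers mutated, non-carriers unmutated) is an assumption made for illustration.

```python
def odds_ratio(table):
    """Odds ratio for a 2x2 table given as (a, b, c, d).

    a, b: carriers with / without the somatic mutation
    c, d: non-carriers with / without the somatic mutation
    Assumes b and c are non-zero (no continuity correction is applied).
    """
    a, b, c, d = table
    return (a * d) / (b * c)


def pool(tables):
    """Collapse several strata into one table by summing cell counts."""
    return tuple(sum(cells) for cells in zip(*tables))


# Hypothetical strata: an effect in ER+ tumors, none in ER- tumors.
strata = {"ER+": (30, 10, 10, 10), "ER-": (20, 20, 20, 20)}

per_stratum = {name: odds_ratio(t) for name, t in strata.items()}
pooled = odds_ratio(pool(strata.values()))

for name, value in per_stratum.items():
    print(f"{name}: OR = {value:.2f}")
print(f"pooled: OR = {pooled:.2f}")
```

With these made-up counts the pooled odds ratio falls between the two stratum-specific values, so an effect confined to one subtype looks weaker after aggregation.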
Quick reproducible check-list for a replication team
Collect matched germline and tumor WGS/WES + uniform somatic calling pipelines (harmonize with TCGA methods and adjust for batch).
Recompute association tests controlling for ancestry PCs, tumor purity, age, sex, and smoking/known exposures where available.
Use permutation/null models to calibrate genome‑wide significance and estimate FDR.
Prioritize top loci with effect sizes and biological plausibility for functional assays.
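The calibration step in the checklist above (permutation null models and FDR estimation) can be sketched as follows. This is a simplified illustration under stated assumptions: binary carrier and mutation labels, an unstratified label shuffle, and the Benjamini-Hochberg procedure for FDR. The function names are hypothetical, and a real replication would permute within ancestry and tumor-type strata and adjust for the covariates listed above.

```python
import random


def permutation_pvalue(carrier, mutated, n_perm=2000, seed=0):
    """Permutation p-value for the carrier vs non-carrier gap in mutation frequency.

    carrier and mutated are equal-length lists of 0/1 values; both carrier
    groups are assumed to be non-empty.
    """
    def gap(labels):
        yes = [m for c, m in zip(carrier, labels) if c]
        no = [m for c, m in zip(carrier, labels) if not c]
        return abs(sum(yes) / len(yes) - sum(no) / len(no))

    rng = random.Random(seed)
    observed = gap(mutated)
    shuffled = list(mutated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        if gap(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (1 + hits) / (1 + n_perm)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Toy data: ten carriers whose tumors all carry the mutation, ten who do not.
carrier = [1] * 10 + [0] * 10
mutated = [1] * 10 + [0] * 10
print(permutation_pvalue(carrier, mutated))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Loci whose q-values fall below a chosen FDR threshold would then move on to the prioritization step.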
Paper metrics (expert critical scores)
Novelty: 9 — large‑scale integrated germline→somatic landscape was novel in 2017 and remains influential for hypothesis generation.
Quality: 8 — sound computational/statistical design and careful caveats, but limited replication and functional validation lower the score.
Generality: 8 — pan‑cancer approach increases generality, but population and cohort limits remain.
Usefulness: 8 — provides a resource and many testable hypotheses; immediate translational impact is limited until functional validation.
Reproducibility: 7 — TCGA data are accessible but replication requires careful phenotype harmonization and independent cohorts.
Explanatory depth: 7 — provides mechanistic vignettes (splicing, mTOR) but not broad mechanistic closure for most loci.
Key insight
Common germline variation shapes both the tissue vulnerabilities (where tumors arise) and the mutational paths (which driver genes are mutated) — implying inherited genotype constrains somatic evolutionary trajectories and could, with robust replication and mechanistic work, inform risk‑stratified surveillance and personalized evolutionary forecasts of tumor development.
What evidence would change my confidence
Stronger: independent replication of the top loci in multiple cohorts and successful functional perturbation linking allele → molecular phenotype → increased somatic mutation susceptibility.
Weaker: failure to replicate top loci in diverse cohorts, or demonstration that associations are artifacts of population structure, batch effects, or tumor sampling bias.
Updated: March 16, 2026
BGPT Paper Review
Study Novelty
90%
Large matched germline–tumor, cross‑cancer screen (n≈6,000 TCGA patients) mapping hundreds of loci linking inherited variation to tumor site and somatic mutation patterns — in 2017 this represented a novel, hypothesis‑generating landscape.
Scientific Quality
80%
Strong use of matched TCGA data and sensible follow-up analyses; appropriate caveats; limitations include lack of independent replication and limited functional validation for many loci; statistical multiple‑testing control and cohort biases must be considered.
Study Generality
80%
Pan‑cancer approach increases generality and nominates loci acting across tissues; however TCGA cohort composition and cross‑cancer aggregation mean tissue‑specific signals may be missed.
Study Usefulness
80%
Provides a rich list of candidate germline loci and mechanistic leads for follow-up experiments; useful for researchers planning functional validation or replication studies, but immediate clinical utility is limited pending replication.
Study Reproducibility
70%
Data are based on TCGA (accessible) and methods are described, but reproduction requires careful harmonization of pipelines, phenotype definitions, and population structure control; many loci need independent cohort tests.
Explanatory Depth
70%
Paper provides mechanistic vignettes (splicing, mTOR) that increase explanatory depth for select loci, but most associations remain descriptive pending functional dissection.
Hypothesis Graveyard
All reported germline–somatic associations are purely population-structure artifacts — unlikely because some loci map to plausible biology (e.g., RBFOX1→SF3B1) and authors performed follow-ups; nevertheless population stratification remains a possible confounder.
Germline effects always act via cis‑eQTLs affecting the same gene as the somatic target — contradicted by distant-locus associations reported (germline loci not near affected cancer genes), indicating trans or pathway-level mechanisms.