Quick Explanation
Concise verdict: Zheng et al. (2005) present a practical, well-argued extension of averaged-power calculations to Cochran–Armitage (CA) trend tests in case–control studies and to affected-sib-pair (ASP) linkage tests. Averaged power comes out modestly but consistently below naive (expected-count) power, and the paper offers a computational shortcut (O(N^2) instead of O(N^4)) together with sensitivity analyses for an unknown allele frequency or genetic model. See the methods, numerical tables, and CardioGene example below for evidence and limitations.
Long Explanation
Visual paper review — "On Averaging Power for Genetic Association and Linkage Studies" (Zheng et al., 2005)
Key claim: averaging power over the randomness in genotype counts or IBD counts gives more realistic (usually lower) power than plugging-in expected counts; using CA trend test reduces computation and yields averaged-power formulas dependent on single population allele frequency; sensitivity to assumed allele frequency and to genetic model should be examined.
Notes: the plotted points are the values reported in the CardioGene example: planned averaged power at the assumed p = 0.3 (82.2%), realized averaged power if the true allele frequency is p_t = 0.33 (82.4%), and realized averaged power if p_t = 0.1 (49.3%). They illustrate how sensitive power is when the true allele frequency differs substantially from the planning assumption.
Interpretation: Table 2 shows that the averaged powers m and p are slightly (<1%) below the naive power computed at expected counts, while additionally averaging over beta priors on p can reduce power further (by several percentage points), depending on the prior variance. This supports the authors' assertion that naive planning may overestimate realized power.
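For readers who want to see the test statistic behind these numbers, the following is a minimal sketch of the standard CA trend statistic with additive scores (0, 1, 2). The function name `ca_trend_z` and the example counts are illustrative assumptions, not taken from the paper, and this is the textbook form of the statistic rather than the authors' averaged-power formula.

```python
import math

def ca_trend_z(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend Z for genotype counts ordered (aa, Aa, AA)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    # Score-weighted contrast between case and control genotype counts.
    t = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    # Null variance of the contrast, conditional on the genotype margins.
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    var = (R * S / N) * (N * sxx - sx * sx)
    if var <= 0:
        return 0.0  # degenerate table: every subject has the same genotype
    return t / math.sqrt(var)

# Hypothetical counts, chosen only to exercise the function.
z = ca_trend_z((20, 50, 30), (40, 45, 15))
print(round(z, 3))
```

Because the statistic depends on the genotype counts only through a single score-weighted contrast, it has 1 degree of freedom, which is the property the review credits for the cheaper averaging.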
Critical appraisal (concise)
Strength — Practical, computationally useful: Replacing the 2-df chi-square averaging with the CA trend test yields a large computational advantage (the authors reduce the enumeration from O(N^4) to O(N^2)), enabling routine sensitivity studies for sample planning
Strength — Clear sensitivity diagnostics: The CardioGene illustration gives a concrete example showing large power loss if real p differs substantially from planning p (e.g., 82%→49%), motivating inclusion of prior uncertainty in design
Strength — Linkage insights: For ASP linkage tests, the paper systematically applies averaging to IBD counts and finds that the maximin efficiency robust test (MERT, T0.355) retains robust averaged power across unknown models, which is useful guidance for designing ASP linkage analyses
Limitation — Asymptotics & finite-sample fidelity: The conclusions rely on normal/chi-square asymptotics for the CA and score tests; finite-sample deviations (low counts, rare alleles, strong HWE departures) could invalidate normal approximations and change averaged-power differences; authors note this but do not fully map boundaries of validity
Limitation — Simplifying assumptions: HWE assumption, single-marker setting, no LD, no genotyping error, and priors (beta) are simple — in modern GWAS/WES contexts (many correlated markers, population stratification, genotype uncertainty) extension is nontrivial and not demonstrated here.
Opportunity — Integration with modern designs: Methods here are directly useful when planning single-marker candidate studies or ASP screens; to scale to GWAS, one needs multi-marker corrections (multiple testing), LD-structure-aware averaging, or resampling-based averaged-power pipelines (authors' O(N^2) tricks still helpful for per-marker approximations).
What would change my evaluation?
Simulation studies showing where asymptotic approximations fail: low MAF, extreme imbalance, or HWE violation and how large the bias becomes.
Extension of averaging formulas to genome-wide multiple-testing contexts (Bonferroni/perm-based) or to multi-marker tests with LD.
Empirical validation on real GWAS arrays or sequencing-derived genotype data (with genotype calling uncertainty) to quantify practical differences in realized study power.
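A first step toward the simulation evidence requested above could look like the following Monte Carlo sketch, which estimates the empirical type I error of the CA trend test under Hardy–Weinberg equilibrium at several minor allele frequencies. The sample sizes, replicate count, seed, and function names are assumptions chosen for illustration; none come from the paper.

```python
import math
import random

def ca_trend_z(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend Z for genotype counts ordered (aa, Aa, AA)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    t = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    var = (R * S / N) * (N * sxx - sx * sx)
    return t / math.sqrt(var) if var > 0 else 0.0

def draw_counts(rng, probs, n):
    """Draw multinomial genotype counts for n subjects."""
    counts = [0, 0, 0]
    for g in rng.choices((0, 1, 2), weights=probs, k=n):
        counts[g] += 1
    return counts

def type1_error(p, n_cases, n_controls, reps=2000, crit=1.96, seed=1):
    """Fraction of null replicates in which |Z| exceeds the critical value."""
    rng = random.Random(seed)
    hwe = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)  # same in cases and controls
    rejections = 0
    for _ in range(reps):
        cases = draw_counts(rng, hwe, n_cases)
        controls = draw_counts(rng, hwe, n_controls)
        if abs(ca_trend_z(cases, controls)) > crit:
            rejections += 1
    return rejections / reps

# Compare the empirical rate with the nominal 5% level as the allele gets rarer.
for maf in (0.30, 0.05, 0.01):
    print(maf, type1_error(maf, 100, 100))
```

Rates that drift away from 0.05 at small allele frequencies would indicate where the normal approximation starts to fail. The same loop can be pointed at an alternative model to check power instead of size.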
Actionable recommendations for practitioners
When planning single-marker case–control studies, compute the averaged power E_n[power(n) | p] (and, when p is itself uncertain, E_p{E_n[power(n) | p]}) using the CA test formulas, to avoid the modest overoptimism of expected-count plug-in power; if p is uncertain, place a beta prior on it and report sensitivity curves (example figure above)
For affected-sib-pair linkage, prefer robust tests (e.g., T0.355) if model is unknown; still compute averaged-power across plausible x priors to evaluate worst-case power loss.
For modern GWAS, adapt averaged-power ideas to per-marker estimates (accounting for local MAF and HWE departures), and combine with multiple-testing corrections or permutation to assess study-level discovery probability.
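The outer average over a beta prior on p in the first recommendation can be sketched numerically as below. The power curve `toy_power` is a hypothetical placeholder with no connection to the paper's formulas; in practice it would be replaced by the averaged-power function E_n[power(n) | p] computed from the CA formulas. The prior parameters are likewise assumptions, chosen so that both priors have mean 0.3.

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def prior_average(power_fn, a, b, grid=2000):
    """Midpoint-rule approximation of E_p[power_fn(p)] under Beta(a, b).

    Assumes a >= 1 and b >= 1 so the density is bounded on (0, 1).
    """
    num = den = 0.0
    for i in range(grid):
        p = (i + 0.5) / grid
        w = beta_pdf(p, a, b)
        num += w * power_fn(p)
        den += w
    return num / den

def toy_power(p):
    # HYPOTHETICAL placeholder curve, not from the paper: it only mimics the
    # qualitative shape of power that peaks at intermediate allele frequency.
    return 1.0 - math.exp(-15.0 * p * (1.0 - p))

plug_in = toy_power(0.3)  # power at the planning value of p
for a, b in ((30, 70), (3, 7)):  # tight and diffuse priors, both with mean 0.3
    print(a, b, round(plug_in, 3), round(prior_average(toy_power, a, b), 3))
```

Because this placeholder curve is concave in p, the prior-averaged value cannot exceed the plug-in value at the prior mean (Jensen's inequality), which matches the direction of the effect the review describes.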
Reproducibility & limitations (brief)
Methods are analytic and numerical; tables give concrete parameter values (sample sizes, prevalences, genotype relative risks, beta priors) so replication of the reported numbers is straightforward with the paper formulas. However, authors did not publish code; implementation requires careful matching of formulas (e.g., CA asymptotic variances) and numerical summation; finite-sample exact enumeration or simulation would strengthen reproducibility for small counts or rare variants
Updated: March 06, 2026
BGPT Paper Review
Study Novelty
70%
Extends a recent (Ambrosius et al. 2004) averaging-power idea to two widely used tests: the CA trend test for case–control association (giving a computational speed-up) and affected-sib-pair linkage (averaging over IBD-count randomness and model uncertainty); the conceptual advance is moderate but practically useful for study planning.
Scientific Quality
80%
Mathematically clear derivations, concrete numerical examples, and relevant sensitivity analyses; reliance on asymptotic approximations and lack of publicly shared code are limitations but do not invalidate core conclusions; tables provide reproducible parameter lists.
Study Generality
70%
Applicable to single-marker association and ASP linkage designs broadly, but not directly to genome-wide correlated-marker settings or to departures from HWE/genotyping error without adaptation.
Study Usefulness
80%
Useful practical guidance for power/sample-size planning in candidate-gene case–control and ASP linkage studies; provides actionable sensitivity analysis procedures and a robust-test recommendation (T0.355) for linkage.
Study Reproducibility
80%
All formulas and numerical parameter values are provided in tables and examples so results can be replicated by re-implementing the summations; however, no code/data archive is supplied, and finite-sample exact simulations are not included.
Explanatory Depth
80%
Derivations for CA averaged power and for score-based linkage tests are thorough and linked to practical implementations; the paper provides both theoretical formulas and illustrative numerical studies, but does not deeply study all finite-sample breakdown cases.
Hypothesis Graveyard
Hypothesis: Averaged-power differences are always negligible (<1%) — falsified by CardioGene example showing 82%→49% for large allele-frequency misspecification.
Hypothesis: CA test always dominates 2-df chi-square for averaged power — only true when a trend model holds; for truly non-monotone genotype risks, 2-df may be preferable.
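The second entry can be illustrated with a small worked example: for a non-monotone genotype pattern, the CA trend statistic can be exactly zero while the 2-df Pearson chi-square is clearly significant. The counts below are hypothetical, constructed for illustration only, and the function names are assumptions.

```python
import math

def ca_trend_z(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend Z for genotype counts ordered (aa, Aa, AA)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    t = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    var = (R * S / N) * (N * sxx - sx * sx)
    return t / math.sqrt(var) if var > 0 else 0.0

def pearson_chi2(cases, controls):
    """Pearson chi-square for the 2 x 3 genotype table (2 df)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    stat = 0.0
    for r, s in zip(cases, controls):
        n = r + s
        if n == 0:
            continue  # an empty genotype column contributes nothing
        exp_r, exp_s = R * n / N, S * n / N
        stat += (r - exp_r) ** 2 / exp_r + (s - exp_s) ** 2 / exp_s
    return stat

# Hypothetical table: heterozygotes are under-represented among cases.
cases, controls = (40, 20, 40), (30, 40, 30)
z = ca_trend_z(cases, controls)
chi2 = pearson_chi2(cases, controls)
p_trend = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
p_2df = math.exp(-chi2 / 2)                 # exact chi-square tail for 2 df
print(z, round(chi2, 3), p_trend, round(p_2df, 4))
```

Here the additive scores cancel the case and control differences at the two homozygous genotypes against the heterozygote difference, so the trend test sees nothing, while the 2-df test picks up the departure.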