Evidence-derived effect size distribution: a case study of caffeine ergogenics


Abstract

In statistical analyses, the magnitudes of observed effects (e.g., the mean difference between treatments) are expressed as effect sizes. To distinguish small, medium, and large effect sizes, the rule-of-thumb thresholds suggested by Cohen (d = 0.2, 0.5, and 0.8, respectively) are frequently used.
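As a rough illustration of how such an effect size is obtained, the sketch below computes Cohen's d as the mean difference divided by the pooled standard deviation and then applies the traditional thresholds. The function names, the sample values, and the "below small" label for |d| < 0.2 are assumptions made for this example; they are not taken from the study.

```python
import math
import statistics


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    v1 = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    v2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


def cohen_label(d):
    """Bin |d| with the traditional 0.2 / 0.5 / 0.8 rule of thumb."""
    size = abs(d)
    if size < 0.2:
        return "below small"  # label chosen for this sketch
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical performance scores, for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
label = cohen_label(d)
```

Hedges' g, listed among the keywords, applies a small-sample correction to the same quantity; that correction is omitted here for brevity.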

However, effect size binning should not be universal; it should reflect the effect size distribution (ESD) within a particular research topic, split at the 25th, 50th, and 75th percentiles. Effect sizes should be compared against this distribution to determine whether they are smaller than, similar to, or larger than those in comparable studies.
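A minimal sketch of how evidence-based thresholds could be derived from a collection of effect sizes is given below. It uses Python's standard `statistics.quantiles` with the "inclusive" method, which is one of several possible percentile definitions and not necessarily the one used in the study. The function name and the input values are made up for illustration.

```python
import statistics


def esd_thresholds(effect_sizes):
    """Return the 25th, 50th, and 75th percentiles of a set of effect sizes."""
    # n=4 yields the three cut points that split the data into quartiles.
    q25, q50, q75 = statistics.quantiles(effect_sizes, n=4, method="inclusive")
    return q25, q50, q75


# Made-up effect sizes, for illustration only; not data from the study.
example_effects = [0.1, 0.2, 0.3, 0.4, 0.5]
thresholds = esd_thresholds(example_effects)
```

The three returned values play the role that 0.2, 0.5, and 0.8 play in the traditional scheme, but they are specific to the body of studies they were computed from.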

Multiple research fields have reported that thresholds derived from evidence-based ESDs are considerably smaller than the traditional thresholds. The effect of caffeine on performance is one of the most studied topics in sports science, a field in which the traditional thresholds are also used.

We extracted 381 effect sizes from 12 meta-analyses across 7 sports domains (e.g., aerobic performance, muscle strength) to investigate the potential discrepancy between traditional and evidence-based thresholds. In the resulting ESD, effect sizes of 0.13, 0.26, and 0.55 corresponded to smaller-than-average, average, and larger-than-average effects, respectively, with marginal variation between sports domains.
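To make the discrepancy concrete, the sketch below places one effect size against both sets of thresholds: the traditional 0.2, 0.5, and 0.8, and the 0.13, 0.26, and 0.55 reported above. The bin labels, the function name, the use of absolute values, and the rule that a value exactly on a threshold falls into the upper bin are assumptions made for this example.

```python
from bisect import bisect_right

COHEN_THRESHOLDS = (0.2, 0.5, 0.8)   # traditional rule of thumb
ESD_THRESHOLDS = (0.13, 0.26, 0.55)  # values reported in this abstract

# Labels are illustrative names chosen for this sketch.
COHEN_LABELS = ("below small", "small", "medium", "large")
ESD_LABELS = (
    "below 25th percentile",
    "25th to 50th percentile",
    "50th to 75th percentile",
    "above 75th percentile",
)


def bin_effect(d, thresholds, labels):
    """Return the label of the bin that |d| falls into."""
    return labels[bisect_right(thresholds, abs(d))]


d = 0.6  # hypothetical effect size
traditional = bin_effect(d, COHEN_THRESHOLDS, COHEN_LABELS)
evidence_based = bin_effect(d, ESD_THRESHOLDS, ESD_LABELS)
```

Under these assumptions, an effect of 0.6 counts only as "medium" on the traditional scale, yet it lies above the 75th percentile of the caffeine ESD.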

As in other research fields, the traditional effect size thresholds misrepresent the ESD observed in caffeine ergogenic studies.

Keywords

Cohen’s d; Hedges’ g; SMD; significance; sample size; reliability