The dual test: Safeguarding p-value combination tests for adaptive designs
Journal article, 2010
Many modern adaptive designs apply an analysis in which p-values from different stages are combined, with weights, into an overall hypothesis test. One merit of this combination approach is that the design can be made very flexible. However, combination tests violate the sufficiency and conditionality principles. As a consequence, combination tests may lead to absurd conclusions, such as 'proving' a positive effect while the average effect is negative. We explore the possibility of modifying the test so that such illogical conclusions are no longer possible. The dual test requires both the weighted combination test and a naive test, which ignores the adaptations, to be statistically significant. The result is that the flexibility and type I error level control of the combination test are preserved, while the naive test adds a safeguard against unconvincing results. The dual test is, by construction, at least as conservative as the combination test. However, many design changes will not lead to any power loss. A typical situation where the combination approach can be used is two-stage sample size reestimation (SSR). For this case, we give a complete specification of all sample size modifications for which the two tests are equally powerful. We also study the overall power loss for some suggested SSR rules. Rules based on conditional power generally lead to negligible power loss, while a decision-analytic approach exhibits clear discrepancies between the two tests.
decision analysis
conditional power
clinical trial
flexible design
sample size reestimation
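
As an illustration of the dual test described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a one-sided, two-stage design with normally distributed stage-wise z-statistics, a weighted inverse-normal combination test whose weights are fixed from the planned stage sizes, and a naive test that pools the stages by their actual sizes. The function name `dual_test`, its parameters, and the numbers in the usage example are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def dual_test(z1, z2, n1, n2_planned, n2_actual, alpha=0.025):
    """Return (z_weighted, z_naive, reject) for a one-sided two-stage test.

    Assumption: stage-wise z-statistics z1 and z2 are independent and
    standard normal under the null hypothesis.
    """
    # Weighted combination statistic: the weights are pre-specified from the
    # planned stage sizes and stay fixed even if stage 2 is resized.
    total_planned = n1 + n2_planned
    w1 = sqrt(n1 / total_planned)
    w2 = sqrt(n2_planned / total_planned)
    z_weighted = w1 * z1 + w2 * z2

    # Naive statistic: pools the stages by their actual sizes, as if the
    # sample size had never been adapted.
    total_actual = n1 + n2_actual
    z_naive = (sqrt(n1) * z1 + sqrt(n2_actual) * z2) / sqrt(total_actual)

    # Dual test: both statistics must exceed the critical value.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    reject = min(z_weighted, z_naive) > z_crit
    return z_weighted, z_naive, reject


# Hypothetical example: stage 2 is enlarged from 100 to 1600 observations
# and then points in the opposite direction from stage 1.
z_w, z_n, reject = dual_test(z1=4.0, z2=-1.2, n1=100,
                             n2_planned=100, n2_actual=1600)
print(f"weighted z = {z_w:.3f}, naive z = {z_n:.3f}, dual test rejects: {reject}")
```

In this made-up case the weighted statistic alone exceeds the one-sided 2.5% critical value although the pooled (naive) statistic is negative, which is the kind of conclusion the dual test is meant to block. When the stage-2 size is left at its planned value, the two statistics coincide and the dual test reduces to the combination test.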