Efficient estimation of the number of false positives in high-throughput screening
Journal article, 2015

This paper develops tail estimation methods to handle false positives in multiple testing problems where testing is done at extreme significance levels and with low degrees of freedom, and where the true null distribution may differ from the theoretical one. We show that the number of false positives, conditional on the total number of positives, has an approximately binomial distribution, and we find estimators of the distribution parameter. We also develop methods for estimation of the true null distribution, as well as techniques to compare it with the theoretical one. Analysis is based on a simple polynomial model for very small p-values. Asymptotics that motivate the model, properties of the estimators, and model-checking tools are provided. The methods are applied to two large genomic studies and an fMRI brain scan experiment.
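As a rough illustration of the two ideas in the abstract, here is a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions that the abstract does not state: that the "simple polynomial model" takes the power-law form F(x) = c * x**(1/gamma) for p-values x at or below a small threshold u, and that the probability that a given positive is false is already known and passed in as `prob_false`. The function names `hill_gamma` and `false_positive_pmf` are hypothetical.

```python
import math
import random


def hill_gamma(pvalues, u):
    """Hill-type maximum-likelihood estimate of gamma.

    Assumption (not taken from the abstract): for 0 < x <= u the p-value
    distribution satisfies F(x) = c * x**(1/gamma). Under that model,
    log(u / p) is exponential with mean gamma for p-values p <= u.
    """
    tail = [p for p in pvalues if 0.0 < p <= u]
    if not tail:
        raise ValueError("no p-values at or below the threshold u")
    return sum(math.log(u / p) for p in tail) / len(tail)


def false_positive_pmf(n_positives, prob_false):
    """Binomial(n_positives, prob_false) probabilities for 0..n_positives.

    prob_false is a hypothetical input: the probability that a single
    positive is a false positive. How to estimate it is the subject of
    the paper and is not attempted here.
    """
    return [
        math.comb(n_positives, k)
        * prob_false**k
        * (1.0 - prob_false) ** (n_positives - k)
        for k in range(n_positives + 1)
    ]


if __name__ == "__main__":
    # Simulated p-values with F(x) = x**(1/gamma): take p = U**gamma, U uniform.
    random.seed(0)
    gamma_true = 0.8
    pvals = [random.random() ** gamma_true for _ in range(200_000)]
    print("estimated gamma:", hill_gamma(pvals, u=0.01))

    # Distribution of the number of false positives among 10 positives,
    # assuming each is false with probability 0.2.
    pmf = false_positive_pmf(10, 0.2)
    print("expected false positives:", sum(k * q for k, q in enumerate(pmf)))
```

A value of gamma equal to 1 corresponds to uniformly distributed p-values in this parametrisation, so under the stated assumptions an estimate far from 1 on null data would suggest that the true null distribution differs from the theoretical one.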

High-throughput screening

SmartTail

Correction of p-values

Positive false discovery rate

Multiple testing

False discovery rate

Extreme value statistics

Author

Holger Rootzén

University of Gothenburg

Chalmers, Mathematical Sciences, Mathematical Statistics

Dmitrii Zholud

University of Gothenburg

Chalmers, Mathematical Sciences, Mathematical Statistics

Biometrika

0006-3444 (ISSN) 1464-3510 (eISSN)

Vol. 102, no. 3, pp. 695-704

Subject Categories (SSIF 2011)

Mathematics

Probability Theory and Statistics

Roots

Basic sciences

Areas of Advance

Life Science Engineering (2010-2018)

DOI

10.1093/biomet/asv015

More information

Created

10/7/2017