ModL: Exploring and restoring regularity when testing for positive selection

Joseph Mingrone, Edward Susko, Joseph P. Bielawski

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Motivation: Likelihood ratio tests are commonly used to test for positive selection acting on proteins. They are usually applied with thresholds for declaring a protein under positive selection determined from a chi-square or mixture of chi-square distributions. Although it is known that such distributions are not strictly justified due to the statistical irregularity of the problem, the hope has been that the resulting tests are conservative and do not lose much power in comparison with the same test using the unknown, correct threshold. We show that commonly used thresholds need not yield conservative tests, but instead give larger than expected Type I error rates. Statistical regularity can be restored by using a modified likelihood ratio test.

Results: We give theoretical results to prove that, if the number of sites is not too small, the modified likelihood ratio test gives approximately correct Type I error probabilities regardless of the parameter settings of the underlying null hypothesis. Simulations show that modification gives Type I error rates closer to those stated without a loss of power. The simulations also show that parameter estimation for mixture models of codon evolution can be challenging in certain data-generation settings with very different mixing distributions giving nearly identical site pattern distributions unless the number of taxa and tree length are large. Because mixture models are widely used for a variety of problems in molecular evolution, the challenges and general approaches to solving them presented here are applicable in a broader context.

Availability and implementation: https://github.com/jehops/codeml-modl

Supplementary information: Supplementary data are available at Bioinformatics online.
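
As a rough illustration of the conventional thresholds discussed under Motivation, the sketch below computes a likelihood ratio statistic and its p-value under a chi-square reference distribution and under a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom. It shows only the standard, unmodified test, not the modified test proposed in the article. The function names and the log-likelihood values are hypothetical, and the choice of mixture weights is an assumption made for the example.

```python
import math


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable, using closed forms for df = 1 or 2."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports only df = 1 or df = 2")


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference, truncated at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def p_value_chi2(stat, df):
    """p-value when the reference distribution is chi-square with df degrees of freedom."""
    return chi2_sf(stat, df)


def p_value_mixture(stat):
    """p-value under an assumed 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if stat <= 0:
        return 1.0
    return 0.5 * chi2_sf(stat, 1)


if __name__ == "__main__":
    # Hypothetical maximized log-likelihoods for a null and an alternative model.
    stat = lrt_statistic(lnl_null=-2345.67, lnl_alt=-2342.10)
    print("LRT statistic:", round(stat, 3))
    print("p (chi-square, df=2):", p_value_chi2(stat, 2))
    print("p (50:50 mixture):", p_value_mixture(stat))
```

The article's point is that p-values computed this way need not be conservative, so the numbers printed by such a sketch should be read as the conventional approximation only.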

Original language: English
Pages (from-to): 2545-2554
Number of pages: 10
Journal: Bioinformatics
Volume: 35
Issue number: 15
Publication status: Published - Aug 1 2019

Bibliographical note

Funding Information:
This work was supported by Discovery grants awarded to J.P.B. and E.S. by the Natural Sciences and Engineering Research Council of Canada.

Publisher Copyright:
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

ASJC Scopus Subject Areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
