RADProc: A computationally efficient de novo locus assembler for population studies using RADseq data

Praveen Nadukkalam Ravindran, Paul Bentzen, Ian R. Bradbury, Robert G. Beiko

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

Restriction site-associated DNA sequencing (RADseq) is a powerful tool for genotyping individuals, but identifying loci and assigning sequence reads to them is a crucial and often challenging step. The optimal parameter settings for a given de novo RADseq assembly vary between data sets and can be difficult and computationally expensive to determine. Here, we introduce RADProc, a software package that uses a graph data structure to represent all sequence reads and their similarity relationships. Storing sequence-comparison results in a graph eliminates unnecessary and redundant sequence similarity calculations. De novo locus formation for a given parameter set can be performed on the precomputed graph, making parameter sweeps far more efficient. RADProc also uses a clustering approach for faster nucleotide-distance calculation. The performance of RADProc compares favourably with that of the widely used Stacks software. Run-time comparisons between RADProc and Stacks for 32 different parameter settings using 20 green crab (Carcinus maenas) samples showed that RADProc took as little as 2 hr 40 min compared with 78 hr for Stacks, while 16 brown trout (Salmo trutta L.) samples were processed by RADProc and Stacks in 23 and 263 hr, respectively. Comparisons of the de novo loci formed and the catalogs built using both methods demonstrate that the speed improvement achieved by RADProc has little effect on the loci formed or on the results of downstream analyses based on those loci.
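
The idea of reusing a precomputed similarity graph across a parameter sweep can be illustrated with a minimal Python sketch. It is not RADProc's implementation: the function names (`hamming`, `build_graph`, `form_loci`), the toy reads, and the use of simple single-linkage grouping are all assumptions made for illustration, and real de novo locus formation involves further steps that are omitted here. The sketch compares every pair of reads once, stores distances up to a maximum in a graph, and then forms groups for several mismatch thresholds without recomputing any distance.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b))


def build_graph(seqs, max_dist):
    """Compare each pair once; keep edges whose distance is <= max_dist.

    Hypothetical helper: the edge dictionary stands in for the graph
    of similarity relationships described in the abstract.
    """
    edges = {}
    for i, j in combinations(range(len(seqs)), 2):
        d = hamming(seqs[i], seqs[j])
        if d <= max_dist:
            edges[(i, j)] = d
    return edges


def form_loci(n, edges, threshold):
    """Group reads by single linkage over stored edges with distance <= threshold.

    Assumption: connected components stand in for loci; no distances
    are recomputed here, only the stored edges are filtered.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), d in edges.items():
        if d <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy reads (made up for illustration, not from any data set).
seqs = ["ACGTAC", "ACGTAA", "ACGGAA", "TTTTTT"]
graph = build_graph(seqs, max_dist=3)  # the only pass that computes distances

# Parameter sweep: each setting reuses the same graph.
for threshold in (0, 1, 2, 3):
    print(threshold, form_loci(len(seqs), graph, threshold))
```

In this sketch the expensive all-against-all comparison happens once in `build_graph`; each additional threshold costs only a pass over the stored edges, which is the kind of saving the abstract attributes to running parameter sweeps on a precomputed graph.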

Original language: English
Pages (from-to): 272-282
Number of pages: 11
Journal: Molecular Ecology Resources
Volume: 19
Issue number: 1
Publication status: Published - Jan 2019

Bibliographical note

Funding Information:
Canada Research Chairs; Canada Foundation for Innovation; Natural Sciences and Engineering Research Council of Canada

Funding Information:
Computational infrastructure used to carry out the analyses was supported by the Canada Foundation for Innovation and Compute Canada. RGB acknowledges the support of the Canada Research Chairs program. This research benefitted from a Canadian Natural Sciences and Engineering Research Council (NSERC) Strategic Grant to PB and RGB.

Publisher Copyright:
© 2018 John Wiley & Sons Ltd

ASJC Scopus Subject Areas

  • Biotechnology
  • Ecology, Evolution, Behavior and Systematics
  • Genetics

PubMed: MeSH publication types

  • Journal Article
