Lost in parameter space: A road map for Stacks
Methods in Ecology and Evolution
1. Restriction site-Associated DNA sequencing (RAD-seq) has become a widely adopted method for genotyping populations of model and non-model organisms. Generating a reliable set of loci for downstream analysis requires appropriate use of bioinformatics software, such as the program Stacks.
2. Using three empirical RAD-seq datasets, we demonstrate a method for optimising a de novo assembly of loci using Stacks. By iterating values of the program's main parameters and plotting resultant core metrics for visualisation, researchers can gain a much better understanding of their dataset and select an optimal set of parameters; we present the 80% rule as a generally effective method to select the core parameters for Stacks.
3. Visualisation of the metrics plotted for the three RAD-seq datasets shows that they differ in the optimal parameters that should be used to maximise the amount of available biological information. We also demonstrate that building loci de novo and then integrating alignment positions is more effective than aligning raw reads directly to a reference genome.
4. Our methods will help the community in honing the analytical skills necessary to accurately assemble a RAD-seq dataset.
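As a rough illustration of the parameter-iteration idea summarised in the abstract, the sketch below assumes one reading of the 80% rule: among several assembly runs that differ in a single parameter value, prefer the run that yields the most loci shared by at least 80% of samples. The function names, the input layout (a mapping from locus ID to the set of samples carrying it) and the toy data are all hypothetical. None of this is Stacks output or part of the Stacks interface, and the paper's own metric may apply further filters not modelled here.

```python
from typing import Dict, Set

# Hypothetical layout: locus ID -> set of sample names in which the locus was found.
LocusTable = Dict[str, Set[str]]


def count_shared_loci(loci: LocusTable, n_samples: int, threshold: float = 0.8) -> int:
    """Count loci present in at least `threshold` (a fraction) of all samples."""
    return sum(1 for samples in loci.values() if len(samples) / n_samples >= threshold)


def pick_parameter(runs: Dict[int, LocusTable], n_samples: int) -> int:
    """Return the parameter value whose run has the most widely shared loci.

    Ties go to the smallest parameter value, an arbitrary choice made for this sketch.
    """
    counts = {value: count_shared_loci(loci, n_samples) for value, loci in runs.items()}
    return max(sorted(counts), key=lambda value: counts[value])


# Toy data, invented for illustration only: three runs over five samples.
ALL = {"s1", "s2", "s3", "s4", "s5"}
toy_runs: Dict[int, LocusTable] = {
    1: {"locus_a": set(ALL), "locus_b": {"s1", "s2"}},
    2: {"locus_a": set(ALL), "locus_b": {"s1", "s2", "s3", "s4"}, "locus_c": {"s1"}},
    3: {"locus_a": set(ALL), "locus_b": {"s1", "s2", "s3"}},
}

if __name__ == "__main__":
    print(pick_parameter(toy_runs, n_samples=len(ALL)))
```

In practice the per-run counts would be plotted against the parameter value, as the abstract describes, rather than reduced to a single automatic choice.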
This work was co-funded by the Environment Agency, Westcountry Rivers Trust and the University of Exeter. Overseas collaboration for the project was made possible by funding from The Genetics Society, Santander and the University of Exeter. We thank the many RAD-seq workshop participants for invaluable insight and new ideas, and Dr Nicolas Rochette for his insights into parameter analysis. Thanks also to Dr Andy King for assistance with the molecular work and analysis of the brown trout data, and to Guy Freeman and Martin Young for the species illustrations. Prof Peter Kille and Dr Luis Cunha, Cardiff School of Biosciences, Cardiff University, kindly provided the reference genome of L. rubellus.
This is the author accepted manuscript. The final version is available from Wiley via the DOI in this record.
First published: 18 April 2017