TY - JOUR
T1 - Future-proofing and maximizing the utility of metadata
T2 - The PHA4GE SARS-CoV-2 contextual data specification package
AU - Griffiths, Emma J.
AU - Timme, Ruth E.
AU - Mendes, Catarina Inês
AU - Page, Andrew J.
AU - Alikhan, Nabil Fareed
AU - Fornika, Dan
AU - Maguire, Finlay
AU - Campos, Josefina
AU - Park, Daniel
AU - Olawoye, Idowu B.
AU - Oluniyi, Paul E.
AU - Anderson, Dominique
AU - Christoffels, Alan
AU - Da Silva, Anders Gonçalves
AU - Cameron, Rhiannon
AU - Dooley, Damion
AU - Katz, Lee S.
AU - Black, Allison
AU - Karsch-Mizrachi, Ilene
AU - Barrett, Tanya
AU - Johnston, Anjanette
AU - Connor, Thomas R.
AU - Nicholls, Samuel M.
AU - Witney, Adam A.
AU - Tyson, Gregory H.
AU - Tausch, Simon H.
AU - Raphenya, Amogelang R.
AU - Alcock, Brian
AU - Aanensen, David M.
AU - Hodcroft, Emma
AU - Hsiao, William W.L.
AU - Vasconcelos, Ana Tereza R.
AU - MacCannell, Duncan R.
N1 - Publisher Copyright:
© The Author(s) 2022. Published by Oxford University Press GigaScience.
PY - 2022
Y1 - 2022
AB - Background: The Public Health Alliance for Genomic Epidemiology (PHA4GE) (https://pha4ge.org) is a global coalition that is actively working to establish consensus standards, document and share best practices, improve the availability of critical bioinformatics tools and resources, and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. In the face of the current pandemic, PHA4GE has identified a need for a fit-for-purpose, open-source SARS-CoV-2 contextual data standard. Results: As such, we have developed a SARS-CoV-2 contextual data specification package based on harmonizable, publicly available community standards. The specification can be implemented via a collection template, as well as an array of protocols and tools to support both the harmonization and submission of sequence data and contextual information to public biorepositories. Conclusions: Well-structured, rich contextual data add value, promote reuse, and enable aggregation and integration of disparate datasets. Adoption of the proposed standard and practices will better enable interoperability between datasets and systems, improve the consistency and utility of generated data, and ultimately facilitate novel insights and discoveries in SARS-CoV-2 and COVID-19. The package is now supported by the NCBI's BioSample database.
UR - http://www.scopus.com/inward/record.url?scp=85124680073&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124680073&partnerID=8YFLogxK
U2 - 10.1093/gigascience/giac003
DO - 10.1093/gigascience/giac003
M3 - Article
C2 - 35169842
AN - SCOPUS:85124680073
SN - 2047-217X
VL - 11
JO - GigaScience
JF - GigaScience
M1 - giac003
ER -