Description

Metagenomic benchmarking dataset for assembly-based AMR detection pipelines, focusing on ESKAPE pathogens in addition to Salmonella. The dataset consists of closed genomes from NCBI for which paired-end Illumina data were available. These genomes were randomly assigned a relative abundance and had additional AMR genes randomly inserted (to cover all AMR genes in CARD v3.1.4); metagenomic Illumina reads were then simulated from them. Metagenomic simulation was performed using https://github.com/fmaguire/AMR_Metagenome_Simulator, and the entire process can be repeated using the metagenome_benchmark.sh script included above.

Files:

`amr_benchmarking_metagenome.csv` contains the metadata (input genome accessions, paths, and simulated copy number) used to create the AMR metagenome.

`AMR_metagenome_labels.tsv` is a two-column table containing the names of all reads derived from an AMR gene and an identifier for the corresponding AMR gene. AMR genes are identified using the CARD Antibiotic Resistance Ontology (ARO), with a suffix listing any SNVs for mutation-related resistance genes.

`simulated_metagenome.fna.gz` contains the full "assembled" true metagenomic contigs (derived directly from the input genome assemblies, amplified to the correct copy number).

`metagenome_unsorted.bed` contains the locations of AMR genes in the full "assembled" true metagenomic contigs.

`simulated_metagenome_{1,2}.fq.gz` contain the simulated paired-end metagenomic reads.

`simulated_metagenome_error_free.bam` contains the error-free mapping locations from which the simulated reads were derived.
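As a rough illustration of how the read labels might be used, the following Python sketch counts labelled reads per AMR gene identifier in a two-column table such as `AMR_metagenome_labels.tsv`. It rests on several assumptions that the description does not confirm: the file is tab-delimited, has no header row, and lists the read name first and the gene identifier second. The function name `count_reads_per_gene` and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def count_reads_per_gene(handle, delimiter="\t"):
    """Count labelled reads per gene identifier in a two-column table.

    Assumes column 1 is the read name and column 2 is the gene identifier,
    with no header row (an assumption, not confirmed by the dataset record).
    """
    counts = Counter()
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gene_id = row[1]
        counts[gene_id] += 1
    return counts


# Made-up rows for illustration only; real read names and identifiers differ.
example = io.StringIO(
    "read_1/1\tARO_A\n"
    "read_1/2\tARO_A\n"
    "read_7/1\tARO_B\n"
)
counts = count_reads_per_gene(example)
print(counts.most_common())
```

For the real file, the same function could be called on an open file handle, for example `open("AMR_metagenome_labels.tsv", newline="")`, passing `delimiter=","` if the table turns out to be comma-separated.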
Data available: 2022
Publisher: Zenodo
