The content of the SARS-CoV-2 databases has been compiled from several publicly available sequence databases accessible through the INSDC (International Nucleotide Sequence Database Collaboration). In addition, we have used publicly available data from the Johns Hopkins University Coronavirus Resource Center and literature databases such as Europe PMC and PubMed for curation.
The sequence data, including assemblies and nucleotide and protein sequences, has been downloaded from ENA (European Nucleotide Archive). The sequences can be downloaded from the “Download” pages.
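As an illustration of how a single record could be retrieved from ENA and read into memory, here is a minimal Python sketch. It is not the pipeline used to build this resource. The endpoint in `ENA_FASTA_URL` is assumed to be the ENA Browser API FASTA service, and the helper names `ena_fasta_url`, `parse_fasta` and `fetch_fasta` are hypothetical.

```python
from typing import Dict
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the ENA Browser API serves FASTA records under this path.
ENA_FASTA_URL = "https://www.ebi.ac.uk/ena/browser/api/fasta/"


def ena_fasta_url(accession: str) -> str:
    """Build the download URL for one accession (hypothetical helper)."""
    return ENA_FASTA_URL + quote(accession)


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}.

    The identifier is the first whitespace-delimited token after '>'.
    """
    chunks = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


def fetch_fasta(accession: str, timeout: float = 30.0) -> Dict[str, str]:
    """Download one record from ENA and parse it (needs network access)."""
    with urlopen(ena_fasta_url(accession), timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

For example, `fetch_fasta("MN908947.3")` would request the Wuhan-Hu-1 reference genome, assuming the endpoint above is reachable. The parser itself works offline on any FASTA text.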
The contextual database has been compiled from available metadata parsed from BioProject (Study), BioSample (Sample), Assembly (Analysis) and Run (Experiment) files. The parsed contextual data has been curated to ensure consistency across the database. Phylogenetic lineage has been predicted using Pangolin 2.0 (https://github.com/cov-lineages/pangolin).
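To show how lineage assignments might be joined to the contextual data, the following Python sketch reads a Pangolin-style CSV report and counts sequences per lineage. The column names `taxon` and `lineage` are assumptions about the report layout and should be checked against the output of the installed Pangolin version. The sample rows are invented for illustration, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter
from typing import Dict


def read_lineages(report_text: str,
                  taxon_col: str = "taxon",
                  lineage_col: str = "lineage") -> Dict[str, str]:
    """Map each sequence name to its assigned lineage.

    Column names are assumptions; adjust them to the actual report header.
    """
    reader = csv.DictReader(io.StringIO(report_text))
    return {row[taxon_col]: row[lineage_col] for row in reader}


def count_lineages(assignments: Dict[str, str]) -> Counter:
    """Count how many sequences fall into each lineage."""
    return Counter(assignments.values())


# Invented sample data, not taken from the database.
sample_report = (
    "taxon,lineage\n"
    "sample_A,B.1\n"
    "sample_B,B.1.1.7\n"
    "sample_C,B.1\n"
)

assignments = read_lineages(sample_report)
print(count_lineages(assignments).most_common())
# [('B.1', 2), ('B.1.1.7', 1)]
```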
The BLAST database has been created in-house and can be downloaded from the “Download” pages.
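A comparable local BLAST database could be built from a downloaded FASTA file with the `makeblastdb` tool from NCBI BLAST+. The Python sketch below only assembles and runs that command. The exact options used for the in-house database are not documented here, so the file name, database name and helper functions are hypothetical.

```python
import subprocess
from typing import List, Optional


def makeblastdb_command(fasta_path: str,
                        db_name: str,
                        dbtype: str = "nucl",
                        title: Optional[str] = None) -> List[str]:
    """Assemble a makeblastdb call as an argument list (hypothetical helper)."""
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    command = ["makeblastdb", "-in", fasta_path,
               "-dbtype", dbtype, "-out", db_name]
    if title:
        command += ["-title", title]
    return command


def build_blast_db(fasta_path: str, db_name: str, dbtype: str = "nucl") -> None:
    """Run makeblastdb; requires BLAST+ to be installed and on PATH."""
    subprocess.run(makeblastdb_command(fasta_path, db_name, dbtype), check=True)
```

Calling `build_blast_db("sequences.fasta", "sars_cov_2_nt")` would write the database files next to the given output name. Passing the arguments as a list avoids shell quoting problems with file paths.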