Documentation

Data 

The content of the SARS-CoV-2 databases has been compiled from several publicly available sequence databases accessible through INSDC (International Nucleotide Sequence Database Collaboration). In addition, we have used publicly available data from the Johns Hopkins University Coronavirus Resource Center and from literature databases such as Europe PMC and PubMed for curation.

Sequence Data

The sequence data, including assembly, nucleotide and protein sequences, has been downloaded from ENA (European Nucleotide Archive). The sequences can be downloaded from the “Download” pages.
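As an illustration of working with a downloaded sequence file, the following minimal sketch reads FASTA-formatted text into a dictionary. It assumes the downloaded files are in standard FASTA format; the function name `read_fasta` and the example records are hypothetical and not part of this resource.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping of record identifier -> sequence from FASTA lines."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The identifier is the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines are concatenated until the next header.
            records[current_id] += line
    return records


# Hypothetical in-memory example; a real file would be opened with open(path).
example = [">seq1 hypothetical record", "ACGT", "TTGA", ">seq2", "GGCC"]
records = read_fasta(example)
```

For a file on disk, the same function can be called as `read_fasta(open("sequences.fasta"))`, where the file name is a placeholder.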

Contextual database

The contextual database has been compiled from available metadata parsed from BioProject (Study), BioSample (Sample), Assembly (Analysis) and Run (Experiment) files. The parsed contextual data has been curated to ensure consistency across the database. Phylogenetic lineages have been predicted using Pangolin 2.0 (https://github.com/cov-lineages/pangolin).
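Pangolin writes its lineage assignments to a CSV report. The sketch below shows one possible way to summarise such a report by counting sequences per lineage. It assumes the report contains `taxon` and `lineage` columns; real reports carry additional columns that vary between Pangolin versions, and the sample identifiers and the function name `count_lineages` are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_lineages(report_file) -> Counter:
    """Count how many sequences were assigned to each lineage."""
    reader = csv.DictReader(report_file)
    # Rows with an empty or missing lineage value are ignored.
    return Counter(row["lineage"] for row in reader if row.get("lineage"))


# Hypothetical report content standing in for a real lineage report file.
sample_report = io.StringIO(
    "taxon,lineage\n"
    "sample_A,B.1.1.7\n"
    "sample_B,B.1\n"
    "sample_C,B.1.1.7\n"
)
lineage_counts = count_lineages(sample_report)
```

With a report on disk, the function can be given an open file handle instead of the in-memory example.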

BLAST databases

The BLAST database has been created in-house and can be downloaded from the “Download” pages.
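For readers who want to build or query a comparable database locally, the sketch below assembles command lines for the NCBI BLAST+ tools `makeblastdb` and `blastn`. This is an assumption about a typical workflow, not a description of how the database offered here was built; the file and database names are placeholders, and BLAST+ must be installed separately before the commands can be executed.

```python
import shlex
from typing import List


def makeblastdb_command(fasta_path: str, db_name: str, dbtype: str = "nucl") -> List[str]:
    """Build the argument list for creating a BLAST database from a FASTA file."""
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    return ["makeblastdb", "-in", fasta_path, "-dbtype", dbtype, "-out", db_name]


def blastn_command(query_path: str, db_name: str, outfmt: int = 6) -> List[str]:
    """Build the argument list for a nucleotide search against a BLAST database."""
    return ["blastn", "-query", query_path, "-db", db_name, "-outfmt", str(outfmt)]


# Placeholder names; replace with the actual downloaded files.
build_cmd = makeblastdb_command("sequences.fasta", "sars_cov_2_db")
search_cmd = blastn_command("query.fasta", "sars_cov_2_db")
print(shlex.join(build_cmd))
print(shlex.join(search_cmd))
```

The lists can be passed to `subprocess.run(build_cmd, check=True)` once BLAST+ is available on the system. Output format 6 is the tabular format of BLAST+.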

Data repository

The sequences can be downloaded from the “Download” page.

REST API

An API gives programmatic access to the data. For more details, see here.