COMPARE data and information platform
WP leader: Guy Cochrane, EMBL (7) cochrane@ebi.ac.uk
co-leaders: Ole Lund, DTU (1) & Istvan Csabai, WIGNER (24)
lund@cbs.dtu.dk, csabai.istvan@wigner.mta.hu
Objectives:
- To create and operate the COMPARE Data Resource, workflow engine and portal.
- To create user spaces for COMPARE workflow development and pilot projects.
- To ensure the long-term sustainability of the tools developed and data generated.
An important barrier to the routine application of NGS/WGS/WCS of pathogens in clinical and public health laboratories is the limited current capacity for bioinformatics and data management. This “big data” challenge has been recognized internationally and calls for new solutions for data storage and rapid sharing that can handle the massive increase in data expected in the coming years. Here, we develop and pilot the core infrastructure for such future routine applications, building on the existing European ICT infrastructure that has served the needs of the wider research and public health community for the past 30 years and linking it to frontline developments in bioinformatics and NGS/WGS/WCS. The system will support the spectrum of sequence-based analyses relevant to pathogen detection and characterization, surveillance, and outbreak detection and investigation, from single-locus approaches through whole-genome methods to metagenomic studies. Data types will include contextual metadata, primary data (sequence reads) and derived data (such as genomic alignments of reads, assemblies and functional annotation). The system will support the public health, clinical, research and tools development communities, and it will be scalable and sustainable beyond the duration of funding for COMPARE. Full technical specifications will be provided so that the system can be replicated on alternative hardware infrastructures according to future need and capacity.