AMR Machine Learning

Pilot project on Machine Learning and Antimicrobial Resistance

Monday 25 September 2017

Contact for AMR Machine Learning pilot

Sébastien Matamoros

Department of Medical Microbiology

Academic Medical Center

University of Amsterdam

This pilot project has the following objectives:

  • Creation of a database of bacterial whole genome sequences (WGS) that includes extensive metadata on the phenotypic AMR patterns of all incorporated isolates
  • Development of a machine-learning algorithm that predicts phenotypic AMR directly from bacterial WGS

The main objective of this project is the creation of a software tool capable of predicting the antimicrobial phenotype (minimum inhibitory concentration for specific antibiotics) of bacterial isolates directly from their whole genome sequence (WGS), based on a big data machine-learning approach.
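To make the idea of predicting a phenotype from sequence concrete, here is a minimal sketch. It is not the project's algorithm: it assumes k-mer content as the genomic representation and uses a 1-nearest-neighbour lookup as a stand-in for a real machine-learning model. The function names, the value of k, the sequences and the log2 MIC values are all invented for illustration.

```python
# Toy illustration only: the k-mer representation and the 1-nearest-neighbour
# rule are assumptions, not the approach chosen by the project teams.

def kmer_set(sequence, k=4):
    """Return the set of k-mers (length-k substrings) present in a DNA sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(a, b):
    """Distance between two k-mer sets: 0.0 if identical, 1.0 if disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def predict_log2_mic(query_sequence, training, k=4):
    """Return the log2 MIC of the training isolate whose k-mer set is closest.

    `training` is a list of (sequence, log2_mic) pairs.
    """
    query = kmer_set(query_sequence, k)
    nearest = min(
        training,
        key=lambda item: jaccard_distance(query, kmer_set(item[0], k)),
    )
    return nearest[1]


# Hypothetical training data: (sequence, log2 of MIC in mg/L).
training = [
    ("ACGTACGTACGTACGT", -2.0),
    ("TTGACCATTGACCATT", 5.0),
]

prediction = predict_log2_mic("ACGTACGTACGA", training)
```

Real genomes are millions of bases long and a real model would be trained on many isolates per antibiotic; the sketch only shows the shape of the problem, namely sequence in and a numeric MIC estimate out.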

As the first milestone in this project, we have created a database in which bacterial WGS can be deposited and linked with their corresponding AMR phenotypes. Thanks to several partners in COMPARE, we are populating the database with the objective of reaching 1000 E. coli genomes. Three teams will then each apply their own machine-learning approach to this dataset to extract the relevant information in an unbiased manner, and thus create the predictive algorithm.

This project was officially started in September 2016 during our first face-to-face meeting in Amsterdam. In September 2017, we doubled our number of attendees (20) for our second meeting. We discussed the present state of the database and made plans for the start of the development phase of the algorithms.

The main points of discussion were the harmonization of the data deposited in the database (standardized measures of AMR phenotype and the choice of antibiotics for primary focus), the type of genomic data to be used for developing the machine-learning algorithms, and finally how to expand the database, both by increasing the number of isolates for the currently studied species (E. coli, Salmonella) and by opening the project to new species.
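As an illustration of what harmonizing phenotype measures could involve, the sketch below parses MIC values reported in different notations into a common form (a qualifier plus a value in mg/L) and converts them to a log2 scale, which is commonly used because MICs are measured in two-fold dilution series. The accepted input formats and the function names are assumptions; the source does not describe how the database actually stores these values.

```python
import math
import re

# Hypothetical input formats such as "<=0.25", "> 32 mg/L" or "4";
# mg/L and µg/mL are equivalent units, so no conversion is applied.
_MIC_PATTERN = re.compile(
    r"^\s*(<=|>=|<|>|=)?\s*(\d+(?:\.\d+)?)\s*(?:mg/L|ug/mL|µg/mL)?\s*$",
    re.IGNORECASE,
)


def parse_mic(raw):
    """Split a raw MIC string into (qualifier, value in mg/L)."""
    normalized = raw.replace("≤", "<=").replace("≥", ">=")
    match = _MIC_PATTERN.match(normalized)
    if match is None:
        raise ValueError(f"unrecognised MIC value: {raw!r}")
    value = float(match.group(2))
    if value <= 0:
        raise ValueError(f"MIC must be positive: {raw!r}")
    # A missing qualifier is treated as an exact measurement.
    return match.group(1) or "=", value


def log2_mic(raw):
    """Return (qualifier, log2 of the MIC) for a raw MIC string."""
    qualifier, value = parse_mic(raw)
    return qualifier, math.log2(value)
```

Keeping the qualifier separate preserves the fact that values such as "<=0.25" or ">32" are censored at the limits of the tested dilution range and are not exact measurements.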

Our efforts will now concentrate on increasing the number of isolates deposited in the database while using the available data to develop a first running version of the predictive algorithm, to be demonstrated at the next COMPARE General Meeting (28 February – 2 March 2018).

This pilot project is part of the COMPARE WP3/6, Analytical workflows and frontline diagnostics.
18 OCTOBER 2017