Probabilistic assignments of formulas to mass peaks in metabolomic experiments
(a) Department of Computing Science, University of Glasgow, Glasgow UK
(b) Groningen Bioinformatics Centre, University of Groningen, The Netherlands
Motivation: High-accuracy mass spectrometry is a popular
technology for high-throughput measurements of cellular metabolites
(metabolomics). One of the major challenges is the correct
identification of the observed mass peaks, including the assignment
of their empirical formula, based on the measured mass.
Results: We propose a novel probabilistic method for the
assignment of empirical formulas to mass peaks in high-throughput
metabolomics mass spectrometry measurements. The method incorporates
information about possible biochemical transformations between the
compounds to assign higher weights to formulas that could be created
from other metabolites in the sample. In a series of experiments we
show that the method performs well and provides greater insight than
assignments based on mass alone. In addition, we extend the model to
incorporate isotope information to achieve even more reliable
formula identification.
Contact: srogers@dcs.gla.ac.uk
Supplementary document
Available for download as a .pdf
Code
Matlab implementation and example scripts will be available soon.
Data
The measured masses from the Trypanosoma dataset reported by Breitling et al. (2006) were matched to KEGG metabolites using a mass window of ±10 ppm. Matching entries were retrieved through the SOAP interface provided at the KEGG website, and all unique molecular formulas were selected and stored together with their masses. The automated matching was performed by the MetabolomeExplorer software (Scheltema et al., in prep.), using a Java implementation based on the KEGG library (keggapi.jar). Two of its functions were used: search_compounds_by_mass, which retrieves the KEGG ids of all compounds within the provided mass range, and bget, which retrieves all information in the KEGG database for a given id. Dedicated software was written to interpret the results returned by bget. The full annotation file is available for download.
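The ±10 ppm matching step can be illustrated with a minimal Python sketch. It is not the MetabolomeExplorer code and does not call the KEGG API. The function names (ppm_window, match_formulas) and the small candidate table are hypothetical, and the sketch assumes the window is taken relative to the measured mass, which the description above does not specify.

```python
def ppm_window(mass, ppm=10.0):
    """Return the (low, high) mass bounds of a +/- ppm window around mass."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def match_formulas(measured_masses, candidates, ppm=10.0):
    """Map each measured mass to the unique candidate formulas inside its window.

    candidates is an iterable of (formula, monoisotopic_mass) pairs; in the
    workflow described above these would come from a database lookup.
    """
    matches = {}
    for measured in measured_masses:
        low, high = ppm_window(measured, ppm)
        # A set removes duplicate formulas shared by several compounds.
        formulas = {f for f, mass in candidates if low <= mass <= high}
        matches[measured] = sorted(formulas)
    return matches


# Illustrative candidate table (approximate monoisotopic masses), not taken
# from the dataset. Glucose and fructose share the same formula.
CANDIDATES = [
    ("C6H12O6", 180.06339),  # e.g. glucose
    ("C6H12O6", 180.06339),  # e.g. fructose
    ("C3H7NO2", 89.04768),   # e.g. alanine
]

if __name__ == "__main__":
    for mass, formulas in match_formulas([180.0640, 89.0500], CANDIDATES).items():
        print(mass, formulas)
```

Because the window scales with the mass, a fixed ppm tolerance is tighter in absolute terms for light compounds than for heavy ones. Here the second measured mass lies about 26 ppm from the alanine entry and is left unmatched.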
The full list of chemical transformations can be downloaded here.