Improving the Similarity Search of Tandem Mass Spectra using Metric Access Methods

Publication at Faculty of Mathematics and Physics | 2010

Abstract

Tandem mass spectrometry is a widely used method for determining protein sequences from an "in vitro" sample. The sequences are not determined directly; they must be interpreted from the mass spectra, which are the output of the mass spectrometer.

This work focuses on a similarity-search approach to mass spectra interpretation, in which the parametrized Hausdorff distance (dHP) serves as the similarity measure. To make similarity search under dHP efficient, metric access methods and the TriGen algorithm are employed.
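
As a rough illustration of similarity search under a Hausdorff-style distance, the following Python sketch treats each spectrum as a list of peak m/z values. The exact definition of dHP is not stated in this abstract, so the form used here (the mean of the n-th root of each peak's gap to its nearest counterpart, made symmetric by taking the maximum of both directions) is an assumption. The names `nearest_gap`, `directed`, `d_hp`, `top_k` and the parameter `n` are hypothetical. A plain linear scan stands in for the metric access methods and TriGen, which are not reproduced.

```python
from bisect import bisect_left


def nearest_gap(value, sorted_peaks):
    """Absolute difference between value and the closest entry of sorted_peaks."""
    i = bisect_left(sorted_peaks, value)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_peaks[max(i - 1, 0):i + 1]
    return min(abs(value - p) for p in candidates)


def directed(x, y, n):
    """Mean over the peaks of x of the n-th root of the gap to the nearest peak of y."""
    return sum(nearest_gap(p, y) ** (1.0 / n) for p in x) / len(x)


def d_hp(x, y, n=2):
    """Assumed Hausdorff-style distance between two non-empty peak lists."""
    x, y = sorted(x), sorted(y)
    return max(directed(x, y, n), directed(y, x, n))


def top_k(query, database, k=1, n=2):
    """Linear-scan stand-in for an indexed search: ids of the k closest spectra."""
    ranked = sorted(database, key=lambda sid: d_hp(query, database[sid], n))
    return ranked[:k]


# Hypothetical toy database of peak lists keyed by spectrum id.
database = {"A": [100.0, 200.0], "B": [150.0, 250.0]}
best = top_k([100.0, 201.0], database, k=1)
```

In this sketch a larger `n` dampens the contribution of large gaps, so a few badly matched peaks do not dominate the distance.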

We show that similarity search using dHP yields more correct interpretations than the cosine similarity commonly mentioned in the mass spectrometry literature. Moreover, the dHP-based search model could be extended to support chemical modifications in the query mass spectra, which is typically a problem when the cosine similarity is used.
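
One way to see why modifications are awkward for cosine similarity is the following Python sketch, which compares spectra as binned peak-count vectors. It is an illustration under stated assumptions, not the evaluation from this work: intensities are ignored, the bin width of 1.0 is arbitrary, the names `bin_peaks` and `cosine_similarity` are hypothetical, and the +16 shift is a made-up stand-in for a modification that moves some fragment peaks.

```python
import math
from collections import Counter


def bin_peaks(peaks, width=1.0):
    """Map each peak m/z to an integer bin index and count peaks per bin."""
    return Counter(int(mz // width) for mz in peaks)


def cosine_similarity(x, y, width=1.0):
    """Cosine of the angle between two binned peak-count vectors."""
    a, b = bin_peaks(x, width), bin_peaks(y, width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


reference = [100.2, 200.4, 300.6]
# Hypothetical modified query: the last two peaks are shifted by +16.
modified = [100.2, 216.4, 316.6]

same = cosine_similarity(reference, reference)
shifted = cosine_similarity(reference, modified)
```

Shifted peaks land in different bins and contribute nothing to the dot product, regardless of how far they moved, so `shifted` is well below `same` even though the two peak lists differ only by a constant offset on two peaks.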

Our approach can be used as a coarse filter by any other database approach to mass spectra interpretation.