BMRB

Biological Magnetic Resonance Data Bank


A Repository for Data from NMR Spectroscopy on Proteins, Peptides, Nucleic Acids, and other Biomolecules
Member of WWPDB

Input Format for Heuristical Formula Query

The page "Heuristically Determined Formulas by Mass" is a tool for the mass spectrometry community. It is meant for generating large lists of potential formulas with different isotopic labelings. The resulting files can then be compared using Mathematica scripts written by Adrian Hegeman, so there is a preferred format for the input files.

The software accepts a file or list with one query per line. Each line contains an index number, followed by a retention time, followed by the mass to be queried.

For example:

    1   38.378  211.1558
    2   19.313  234.1112
    3   52.042  191.1523
    4   38.386  840.5971
    5   54.5    243.1246
    6   2.724   109.0015
    7   4.045   108.1168
    8   8.374   127.0994
    9   8.375   174.1091
    10  32.563  196.1336
        

It can also accept a file or list containing only masses. In this case, an index will be generated and the retention time will be set to '?'. Whichever format you choose, mark the appropriate 'Mass list type' radio button.
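
To check a mass list before submitting it, the two accepted layouts can be read with a short script. The following is a minimal Python sketch, not the code used by the server: the function name `parse_mass_list` is hypothetical, columns are assumed to be separated by any whitespace, and the generated index for masses-only input is assumed to start at 1.

```python
def parse_mass_list(text):
    """Parse a mass list into (index, retention_time, mass) tuples.

    Accepts either three whitespace-separated columns per line
    (index, retention time, mass) or a single mass per line.
    """
    records = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank or whitespace-only lines
        if len(fields) == 3:
            # Full format: keep the retention time as text, exactly as given.
            index, retention_time, mass = int(fields[0]), fields[1], float(fields[2])
        elif len(fields) == 1:
            # Masses-only format: generate an index (assumed to start at 1)
            # and set the retention time to '?'.
            index, retention_time, mass = len(records) + 1, "?", float(fields[0])
        else:
            raise ValueError(f"expected 1 or 3 columns, got {len(fields)}: {line!r}")
        records.append((index, retention_time, mass))
    return records
```

A line with any other number of columns raises an error, which makes it easier to catch a malformed file before it is uploaded.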

The results are returned as a downloadable tab-separated file. Each result line has the form:

index-number retention-time input-mass matching-mass formula #C #H #N #O #P #S

An example of the results:

    1   38.378  211.155800000   211.153693279   C9H19N2O1P1S0   9   19  2   1   1   0
    1   38.378  211.155800000   211.153830135   C8H17N3O3P0S0   8   17  3   3   0   0
    2   19.313  234.111200000   234.110630624   C14H8N2O1P0S0   14  8   2   1   0   0
    2   19.313  234.111200000   234.111200340   C9H13N4O1P0S1   9   13  4   1   0   1
    2   19.313  234.111200000   234.112102130   C5H12N9O0P1S0   5   12  9   0   1   0
    2   19.313  234.111200000   234.112981554   C12H16N0O0P2S0  12  16  0   0   2   0
    2   19.313  234.111200000   234.113118410   C11H14N1O2P1S0  11  14  1   2   1   0
    2   19.313  234.111200000   234.113255266   C10H12N2O4P0S0  10  12  2   4   0   0
    3   52.042  191.152300000   191.150916694   C5H14N8O0P0S0   5   14  8   0   0   0
    3   52.042  191.152300000   191.151932974   C11H16N0O2P0S0  11  16  0   2   0   0
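
Result lines in this form can be read back for further processing. The sketch below is illustrative only: `ResultRow`, `parse_result_line`, and `ppm_error` are hypothetical names, and the parts-per-million difference between the matching mass and the input mass is a common mass spectrometry convenience that is not part of the downloaded file.

```python
from collections import namedtuple

# One parsed result line; 'counts' maps each element symbol to its count.
ResultRow = namedtuple(
    "ResultRow",
    "index retention_time input_mass matching_mass formula counts",
)

# Order of the per-element count columns in a result line.
ELEMENTS = ("C", "H", "N", "O", "P", "S")


def parse_result_line(line):
    """Split one result line (tabs or spaces) into a ResultRow."""
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}: {line!r}")
    counts = dict(zip(ELEMENTS, (int(value) for value in fields[5:11])))
    return ResultRow(
        index=int(fields[0]),
        retention_time=fields[1],  # kept as text; may be '?' for masses-only input
        input_mass=float(fields[2]),
        matching_mass=float(fields[3]),
        formula=fields[4],
        counts=counts,
    )


def ppm_error(row):
    """Difference between matching mass and input mass, in parts per million."""
    return (row.matching_mass - row.input_mass) / row.input_mass * 1e6
```

Sorting the candidates for one index by the absolute value of `ppm_error` is one simple way to rank them by closeness to the measured mass.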
        

The formulas were generated as combinations of C, H, N, S, O, and P with natural-abundance molecular masses of 2000 Da or less. Rules were then applied to filter the billions of possibilities down to around 400 million. These rules were obtained from the paper by Kind and Fiehn, "Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry" (Kind T. and Fiehn O. BMC Bioinformatics. 2007 Mar 27;8:105).