1) How do I let the pattern matching software know my mass accuracy window and my elemental formula constraints?
No parameters need to be set. Rational Numbers Search requires an instrument capable of a mass accuracy of 12 ppm or better, although the accuracy window can be customized to the user's instrument. No elemental formula constraints are needed other than those imposed by the mass spectral data itself.
The database contains 259184 common compounds with an "apparent ESI mass" between 100 and 999 daltons, composed of the elements C, H, N, O, S, P, F, Cl, Br, Li, Na, or K.
Here is an example of the search results obtained on a published spectrum of malathion (Thurman, Ferrer, and Zweigenbaum, Analytical Chemistry, October 1, 2006). The upper portion lists rows of partitions in order of decreasing score. Each partition is a different representation of a molecule and each is tested independently, so the same compound can be listed multiple times. Clicking on a PubChem ID number opens the corresponding PubChem entry. The lower portion of the printout lists, for each compound in the mass accuracy window, its number of synonyms and its theoretical isotope pattern.
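To make the 12 ppm accuracy window from question 1 concrete, here is a minimal Python sketch of the arithmetic. The function names and the 500.0000 Da example mass are hypothetical and chosen only for illustration; this is not the code used by the search engine.

```python
def ppm_window(mass, ppm=12.0):
    """Return the (low, high) bounds of a +/- ppm window around a mass in Da."""
    delta = mass * ppm / 1e6  # 12 ppm of 500 Da is 0.006 Da
    return mass - delta, mass + delta


def within_ppm(observed, theoretical, ppm=12.0):
    """True if the observed mass lies within the ppm window of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= ppm


low, high = ppm_window(500.0)
print(f"{low:.4f} to {high:.4f}")
print(within_ppm(500.004, 500.0))  # 8 ppm error, inside a 12 ppm window
print(within_ppm(500.010, 500.0))  # 20 ppm error, outside a 12 ppm window
```

Because the window scales with mass, the same 12 ppm setting corresponds to a narrower absolute window at 100 Da than at 999 Da.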
2) Will the Rational Numbers pattern matching software work for every compound?
No, the success of the program is compound-dependent. Compounds that give at least one fragment ion will work, but the more fragment ions you have, the better your results will be. Multiple spectra acquired at various collision energies can be combined to increase the number of fragments. Steroids with only one or two oxygen atoms may also be difficult to identify, because of their high carbon/heteroatom ratio and the possibility that a very large number of compounds in the database share the same elemental composition.
3) ChemSpider and PubChem have millions of compounds. Why is the MathSpec database only 259184 compounds?
Increasing the size of any database proportionately increases the likelihood of finding a match by pure chance. The erroneous arrest of Brandon Mayfield based on a fingerprint match from a very large database is a good example. Even with accurate-mass data, the structural information that can be gleaned from MS/MS spectra is limited. We need to have a database large enough that there would be a 98% probability that a known endogenous metabolite or commercially available substance would be present, but not so large that many of the hits would be due to chance alone. The vast majority of literature compounds are not in commercial use and therefore have an extremely low probability of being present in your sample.
4) Can the pattern matching software find quaternary amines and compounds that are salts?
Yes. Besides its speed advantage, using a database allows us to index, or locate, compounds at their "apparent ESI mass", and thus find compounds that would otherwise be missed. For example, hydrochloride and hydrobromide salts are located at the molecular weight of their corresponding free base. Sodium and potassium salts are located at the mass of their conjugate acids. Using the same logic, chlorpheniramine maleate is located at the exact mass of chlorpheniramine. Likewise, you will find quaternary salts such as acetylcholine. These quaternary compounds are located one proton mass below the mass of the M+ ion.
The database contains only the minimal information needed to compare mass spectral data with partitioned chemical structures. For example, the database does not contain enough information to draw any of the structures. This means that the results are ID numbers, and the user must have access to PubChem to view the chemical structures.
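The "one proton down" rule for quaternary compounds can be illustrated with acetylcholine, whose cation is C7H16NO2+. The sketch below is only an illustration of the arithmetic: the function names are hypothetical, the monoisotopic masses are standard values rounded to eight decimals, and nothing here reflects how the database itself is built.

```python
# Monoisotopic masses in Da (standard values, rounded).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858
PROTON = MONOISOTOPIC["H"] - ELECTRON  # hydrogen atom minus one electron


def cation_mz(composition):
    """m/z of a singly charged cation M+ given its element counts."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in composition.items())
    return neutral - ELECTRON  # one electron removed for the +1 charge


def apparent_esi_mass(ion_mz):
    """Assumed lookup mass: one proton mass below the observed M+ ion."""
    return ion_mz - PROTON


acetylcholine = {"C": 7, "H": 16, "N": 1, "O": 2}
mz = cation_mz(acetylcholine)
print(round(mz, 4), round(apparent_esi_mass(mz), 4))  # approximately 146.1176 145.1103
```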
5) I use ammonium salts in my mobile phase and some of the MS/MS spectra may be on ammonium adducts. How does the software distinguish protonated molecules from ammoniated compounds?
The software tests both possibilities automatically, unless the spectral data precludes an ammonium adduct.
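As a rough illustration of what testing both possibilities means, the following sketch computes the neutral mass implied by a precursor m/z under each hypothesis. The function name and the 300.0000 example value are hypothetical, and the sketch does not attempt to reproduce the criteria the software uses to rule out an ammonium adduct.

```python
# Monoisotopic masses in Da (standard values, rounded).
H = 1.00782503
N = 14.00307401
ELECTRON = 0.00054858

PROTON = H - ELECTRON            # mass added in [M+H]+
AMMONIUM = N + 4 * H - ELECTRON  # mass added in [M+NH4]+


def neutral_mass_candidates(precursor_mz):
    """Neutral mass implied by each adduct hypothesis for a singly charged ion."""
    return {
        "[M+H]+": precursor_mz - PROTON,
        "[M+NH4]+": precursor_mz - AMMONIUM,
    }


# Hypothetical precursor m/z, for illustration only.
for adduct, mass in neutral_mass_candidates(300.0).items():
    print(adduct, round(mass, 4))
```

The two candidate masses differ by about 17.0265 Da, the mass of ammonia, so each hypothesis points to a different region of the database.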
6) Is the software suitable for data from all types of high-resolution mass spectrometers?
Yes. Each spectrum is copied, using Excel or the vendor's "Copy Spectrum List" utility, into WordPad or Notepad and saved as a text file with mass and intensity columns.
7) Is it really necessary to acquire accurate-mass fragmentation data? Why isn't the accurate mass of the whole compound sufficient to identify it?
Yes. The idea that the accurate mass of the whole compound is sufficient is a very common misconception. For example, xemilofiban has the formula C18H22N4O4. The PubChem database has about 700 compounds with that same formula and exact mass, and 938 compounds within 10 ppm of its exact mass (PubChem on 11-21-07). Accurate mass alone therefore leaves hundreds of candidates, and fragmentation data is needed to distinguish among them.
8) Can the software use data where a comma is used as the decimal separator (radix point)?
Yes. The pattern matching software can handle data from both "comma countries" and "dot countries".
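For readers preparing their own files, here is a minimal sketch of how a two-column mass and intensity list with either decimal separator could be read. It assumes that the columns are separated by whitespace (tabs or spaces) and that no thousands separators are present; the function names and peak values are hypothetical, and this is not the parser used by the service.

```python
def parse_number(token):
    """Parse a number written with either '.' or ',' as the decimal separator."""
    return float(token.strip().replace(",", "."))


def parse_spectrum(text):
    """Return a list of (mass, intensity) pairs from a two-column text spectrum."""
    peaks = []
    for line in text.splitlines():
        fields = line.split()  # columns assumed to be whitespace-separated
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        try:
            peaks.append((parse_number(fields[0]), parse_number(fields[1])))
        except ValueError:
            continue  # skip header or other non-numeric lines
    return peaks


# Hypothetical peak list mixing both conventions, for illustration only.
sample = "m/z intensity\n127,0393\t100,0\n99.0444 45.2\n"
print(parse_spectrum(sample))
```

A file that uses commas both as the column delimiter and as the decimal separator would be ambiguous under this approach, which is why the whitespace assumption matters.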
9) Do I need to purchase an expensive workstation or install software to use the program?
No. The pattern matching software and database reside on our computers. File uploads and downloads are all handled securely by the server, and no software other than your web browser is needed. This also allows us to update the database very easily. Most searches take only a few seconds.
10) The isotope ratio data from my instrument is sometimes way off. Can Rational Numbers Search work without isotope data?
Yes. The isotope ratios of all compounds in the database within 12 ppm of the unknown compound are printed at the bottom of your report for comparison, but the search engine requires only fragmentation data. The number of synonyms (a very good indication of relevance - see James Little reference) is also printed at the bottom of the report, but like the isotope data, it does not affect the scoring in the upper portion of the report.
11) Has anything been published where I can get more details about the theory behind this search-engine approach?
Yes. See Analytical Chemistry 2003, 75, 5362-5373.