|
Principal component analysis (PCA) is routinely applied to the study of NMR-based
metabolomic data. PCA is primarily used to identify relative changes in
metabolite concentrations, revealing trends or characteristics within the NMR
data that permit discrimination between samples that differ in their source or
treatment. The data are generally presented as a two- or three-dimensional plot
(scores plot) in which the coordinate axes correspond to the principal
components (PCs), representing the directions of the two or three largest
variations in the data set. Effectively, each NMR spectrum is reduced to a
single point in PC space, where similar spectra cluster together and
variations along any of the PC axes highlight experimental differences
between the spectra.
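The reduction of each spectrum to a single point on a scores plot can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: it assumes the spectra have already been binned into a matrix with one row per spectrum, and the function name `pca_scores` and its parameters are hypothetical. PCA is computed here through a singular value decomposition of the mean-centered matrix.

```python
import numpy as np

def pca_scores(spectra, n_components=2):
    """Project each spectrum onto the leading principal components.

    spectra: 2-D array, one row per NMR spectrum, one column per bin
             (assumed layout; binning is not shown here).
    """
    X = np.asarray(spectra, dtype=float)
    X_centered = X - X.mean(axis=0)  # mean-center each bin (column)

    # SVD of the centered matrix: rows of Vt are the PC directions.
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

    scores = U[:, :n_components] * s[:n_components]  # one point per spectrum
    loadings = Vt[:n_components]                     # bin weights per PC
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained
```

Plotting the first column of `scores` against the second gives a two-dimensional scores plot, and the rows of `loadings` show which bins drive each PC.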
A common concern with PCA of NMR data is the potential overemphasis of small
changes in high-concentration metabolites, which can overshadow significant and
large changes in low-concentration components and lead to a skewed or
irrelevant clustering of the NMR data. We have identified an additional
concern: very small and random fluctuations within the noise of the NMR spectrum
can also result in large and irrelevant variations in the PCA clustering. Our
analysis of “ideal” metabolomic data (NMR spectra of ATP, ATP+glucose and
glucose) indicates that including the noise may result in significant and
irrelevant spreading of the PCA scores clusters that may inhibit proper
interpretation of the data. This problem is alleviated by
simply excluding the noise region from the PCA through a judicious choice of a
threshold above the spectral noise.
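One way to exclude the noise region before PCA is sketched below. The text only says that a threshold above the spectral noise is chosen, so the specific rule here is an assumption: the threshold is the mean plus `k` standard deviations of a signal-free region, and bins that never exceed it in any spectrum are dropped. The function name `exclude_noise_bins`, the `noise_region` argument and the default `k` are all hypothetical.

```python
import numpy as np

def exclude_noise_bins(spectra, noise_region, k=3.0):
    """Drop bins whose intensity never rises above a noise threshold.

    spectra:      2-D array, one row per spectrum, one column per bin.
    noise_region: slice or index array selecting signal-free bins
                  (assumed to be known from inspecting the spectra).
    k:            multiples of the noise standard deviation added to the
                  noise mean to set the threshold (assumed rule).
    """
    X = np.asarray(spectra, dtype=float)
    noise = X[:, noise_region]
    threshold = noise.mean() + k * noise.std()

    # Keep a bin if at least one spectrum has signal above the threshold.
    keep = (X > threshold).any(axis=0)
    return X[:, keep], keep, threshold
```

The reduced matrix can then be passed to PCA in place of the full spectra. Whether noise bins are dropped, as here, or set to zero is a design choice; either way they stop contributing variance to the PCs.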
Picture Gallery
From: Journal of Magnetic Resonance, 178(1) 88-95.
PCA score plots of ATP & ATP-glucose (with noise)
NMR Spectrum of "Outlier"
PCA Loading Plots of "Outlier"
Comparison of NMR Noise regions
PCA score plots of ATP & ATP-glucose (no noise)
PCA score plots with and without noise of ATP, ATP-glucose and glucose
|