MetaFIND
Category Metabolomics/Metabonomics>Metabolic Profiling/Analysis Systems/Tools
Abstract MetaFIND (Metabolomics Feature INterrogation and Discovery) is a Java-based feature analysis tool that enables real-time, interactive correlation analysis of feature sets, discovery of additional related features and analysis of related metabolites.
It is designed specifically to aid feature analysis within nuclear magnetic resonance (NMR) metabolomics data.
MetaFIND Functions --
MetaFIND imports the features retrieved by the user’s chosen feature selection technique [e.g. Partial Least Squares Discriminant Analysis (PLS-DA), Principal Component Analysis (PCA), and Random Forest] and the metabolomics observations dataset.
The initial function of MetaFIND is to analyze this set of selected features and to provide support, via an interactive graphical interface, for uncovering the various ‘metabolite signatures’ that may be present within it. This provides a tool to bridge the gap between data-driven class discrimination and domain explanation.
Secondly, MetaFIND also enables the user to examine correlations outside the selected feature set, i.e. between the selected features and the rest of the features in the dataset. This has the potential to discover novel features overlooked by the initial feature selection process.
Any additional features retrieved may then further aid the identification or discovery of metabolites. Lastly, MetaFIND allows the higher-level correlations between the metabolite signatures themselves to be examined. Both positively and negatively correlated metabolite signatures may be uncovered.
The identification of correlated metabolites may in turn contribute to the construction of metabolic networks (pathways). Although some tools that enable analysis of spectral correlations already exist, MetaFIND allows the user to dynamically examine the correlations of individual spectral features retrieved by arbitrary feature selection methods.
MetaFIND Implementation --
The MetaFIND application contains several components which support the user in the:
1) Reconstruction of the class discriminating metabolite signatures;
2) Identification of additional relevant features omitted from the feature selection; and
3) Identification of correlated metabolites which may aid the inference of the metabolic correlations at play in the system under investigation.
The Correlation graph -
Central to the MetaFIND application is the correlation graph, which provides a medium through which feature collinearity may be represented and analyzed. In this graph the y-axis represents correlation. As this is derived from Pearson's correlation, it may render both positive (above the x-axis) and negative (below the x-axis) correlations. The x-axis represents the binned regions or peak names assigned to a feature.
The correlation graph enables the user to examine the correlations between a feature retrieved by feature selection and the rest of the features in the dataset.
The features, or peaks, which are representative of a particular class discriminating metabolite should be highly correlated over samples (having the same relative change in magnitude), although this correlation may in some cases be reduced by the cumulative effect of additional intensities in a region.
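The values plotted on the correlation graph can be illustrated with a minimal sketch that computes Pearson's correlation between one reference feature and every feature in the dataset. This is not MetaFIND source code: the class name, the method names and the data[feature][sample] layout are assumptions made for illustration only.

```java
import java.util.Arrays;

/** Illustrative sketch only; not taken from MetaFIND. */
final class CorrelationProfileSketch {

    /** Pearson's r between two equal-length vectors (NaN if either has zero variance). */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("vectors must have equal length >= 2");
        }
        double meanX = Arrays.stream(x).average().orElse(0.0);
        double meanY = Arrays.stream(y).average().orElse(0.0);
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /**
     * Assumed layout: data[f][s] is the value of feature f in sample s.
     * Returns r between the reference feature and every feature (one y-value per x-axis position).
     */
    static double[] correlationProfile(double[][] data, int reference) {
        double[] profile = new double[data.length];
        for (int f = 0; f < data.length; f++) {
            profile[f] = pearson(data[reference], data[f]);
        }
        return profile;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 2.0, 3.0, 4.0},   // reference feature
            {2.0, 4.0, 6.0, 8.0},   // same relative change: strongly positive r
            {4.0, 3.0, 2.0, 1.0},   // opposite trend: strongly negative r
        };
        System.out.println(Arrays.toString(correlationProfile(data, 0)));
    }
}
```

Features belonging to the same metabolite signature would be expected to appear near the top of such a profile, while negatively correlated features fall below the x-axis.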
The user selects a reference feature to display from the imported list via a drop-down menu above the correlation graph. Apart from displaying the list of imported features, this drop-down menu directs the user to features yet to be examined within the selected set (white font).
The currently displayed set of features (black font) and the previously viewed features (grey font) are also highlighted in this list. This function is particularly useful when a large set (e.g. 100 features) is imported. This function also means that a user may choose quite a liberal cut-off threshold during the initial feature selection stage.
Should the current reference feature represent a peak from a metabolite signature, the user may render this signature at a certain correlation threshold. Ordinarily, identifying this threshold and signature can be tedious, since an investigator has to generate graphs at numerous correlation thresholds before happening upon the optimal rendering.
However, MetaFIND extends this correlation graph concept into an interactive application in which thresholds may be tweaked and assessed in real-time (via the correlation slider). This allows a rapid retrieval of the optimal metabolite signature and potentially aids the identification of class discriminating metabolites.
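The effect of moving the correlation slider can be sketched as a simple filter over the correlations with the current reference feature. The names below are hypothetical, and the use of the absolute value (so that strong negative correlations are retained) is an assumption rather than documented MetaFIND behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not taken from MetaFIND. */
final class ThresholdSketch {

    /**
     * Returns the indices of features whose correlation with the reference
     * feature has magnitude at or above the threshold. NaN entries are skipped.
     */
    static List<Integer> featuresAtThreshold(double[] profile, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int f = 0; f < profile.length; f++) {
            if (Math.abs(profile[f]) >= threshold) {
                kept.add(f);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical correlations of five features with a reference feature.
        double[] profile = {1.0, 0.95, -0.92, 0.40, 0.10};
        // Re-running the filter at each slider position mimics an interactive update.
        for (double threshold : new double[] {0.9, 0.5, 0.3}) {
            System.out.println(threshold + " -> " + featuresAtThreshold(profile, threshold));
        }
    }
}
```

Because the correlations are computed once and only the filter is re-applied, each change of threshold is cheap, which is what makes real-time adjustment practical.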
The interactive correlation graph contains an additional zoom function, which is very useful when displaying a high number of features (e.g. 5,000 - 10,000). The user can then enlarge a region by simply clicking and dragging the cursor over the area of interest.
Having selected the appropriate resolution, the user may then scroll left or right using the arrow buttons above the correlation graph. Once a feature of interest is identified, it may be further assessed by plotting its values over all samples. This is achieved by clicking on the projected feature, which displays the feature plot.
The feature plot -
The feature plot facility supports the user in determining the relevance of a feature of interest by examining how its values change over samples. In this way feature magnitude (intensity), class discrimination and outlying samples may be quickly examined.
The MetaFIND application also allows a feature plot to be retained on the desktop for reference and comparison with other features. In this way the user may also directly compare the feature vectors.
This function is useful for determining the features which are representative of a single metabolite signature. The feature plot also contains a tool-tip function which allows ‘feature identifiers’ to be displayed.
Lastly, the feature plot also provides a p-value for each feature, derived from the non-parametric Kruskal-Wallis test. MetaFIND may therefore also be used to analyze feature significance over multiple sample classes. For simultaneous assessment of all currently displayed features, the Heatmap function may also be used.
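The Kruskal-Wallis statistic mentioned above can be sketched for a single feature as follows. This is a textbook formulation with mid-ranks and the standard tie correction, not MetaFIND's own code; the class name, method name and integer class labels are assumptions. Converting H to a p-value (conventionally via a chi-square distribution with k - 1 degrees of freedom for k classes) is left to a statistics library.

```java
import java.util.Arrays;

/** Illustrative sketch only; not taken from MetaFIND. */
final class KruskalWallisSketch {

    /**
     * Tie-corrected Kruskal-Wallis H for one feature.
     * values[s] is the feature value in sample s; labels[s] is its class in 0..numClasses-1.
     * The result is NaN when every value is identical.
     */
    static double hStatistic(double[] values, int[] labels, int numClasses) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int s = 0; s < n; s++) {
            order[s] = s;
        }
        // Sort sample indices by feature value so that ranks can be assigned.
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        double[] rankSum = new double[numClasses];
        int[] count = new int[numClasses];
        double tieTerm = 0.0;
        int i = 0;
        while (i < n) {
            // Find the run of tied values starting at position i.
            int j = i;
            while (j + 1 < n && values[order[j + 1]] == values[order[i]]) {
                j++;
            }
            double midRank = (i + j) / 2.0 + 1.0;   // ranks are 1-based
            int t = j - i + 1;
            tieTerm += (double) t * t * t - t;
            for (int k = i; k <= j; k++) {
                rankSum[labels[order[k]]] += midRank;
                count[labels[order[k]]]++;
            }
            i = j + 1;
        }

        double sum = 0.0;
        for (int c = 0; c < numClasses; c++) {
            if (count[c] > 0) {
                sum += rankSum[c] * rankSum[c] / count[c];
            }
        }
        double h = 12.0 / ((double) n * (n + 1)) * sum - 3.0 * (n + 1);
        double correction = 1.0 - tieTerm / ((double) n * n * n - n);
        return h / correction;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        int[] labels = {0, 0, 0, 1, 1, 1};
        System.out.println(hStatistic(values, labels, 2));
    }
}
```

Because the test works on ranks rather than raw intensities, it makes no normality assumption and extends naturally from two to several sample classes.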
The Heatmap -
The Heatmap displays a color representation of the values of all currently displayed features over all currently displayed samples. This function provides an alternative way to rapidly assess many features simultaneously. This display is also interactive, returning feature names and values for individual samples.
In practice the Heatmap allows the user to rapidly assess the class separation and correlation of hundreds of features, and directs the user to regions of interest. The user may then carry out a more detailed examination of these regions using the correlation graph and the feature plot functions.
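One way a heatmap of this kind could turn feature values into colors is sketched below. The per-feature min-max scaling and the blue-to-red ramp are assumptions chosen for illustration; the source does not state which scaling or color scheme MetaFIND uses, and all names are hypothetical.

```java
/** Illustrative sketch only; not taken from MetaFIND. */
final class HeatmapSketch {

    /**
     * Assumed layout: data[f][s] is the value of feature f in sample s.
     * Each feature is min-max scaled over its samples and mapped to a packed
     * 0xRRGGBB color running from blue (lowest) to red (highest).
     */
    static int[][] toRgb(double[][] data) {
        int[][] rgb = new int[data.length][];
        for (int f = 0; f < data.length; f++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double v : data[f]) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
            double range = max - min;
            rgb[f] = new int[data[f].length];
            for (int s = 0; s < data[f].length; s++) {
                // A constant feature has no range, so it is drawn at the midpoint color.
                double t = range > 0 ? (data[f][s] - min) / range : 0.5;
                int red = (int) Math.round(255 * t);
                int blue = 255 - red;
                rgb[f][s] = (red << 16) | blue;
            }
        }
        return rgb;
    }

    public static void main(String[] args) {
        int[][] colors = toRgb(new double[][] {{0.0, 5.0, 10.0}});
        for (int c : colors[0]) {
            System.out.println(String.format("#%06X", c));
        }
    }
}
```

Scaling each feature separately keeps low-intensity peaks visible next to dominant ones, at the cost of hiding absolute magnitude, which is what the class spectrum view is for.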
The class spectrum -
The class spectrum function in MetaFIND allows the user to display the mean class values for the currently displayed features. The class spectrum plot allows the user to view the mean magnitude of features over a sample class and can aid preliminary identification of peaks.
It also contains a tool-tip interaction and fulfils a role similar to that of the Heatmap in directing the user to general regions of interest. The class spectrum is activated by clicking on the class spectrum button above the correlation graph.
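The mean class values shown in the class spectrum can be sketched as a per-class average of each feature. As before, this is an illustration under assumed names and an assumed data[feature][sample] layout, not MetaFIND code.

```java
import java.util.Arrays;

/** Illustrative sketch only; not taken from MetaFIND. */
final class ClassSpectrumSketch {

    /**
     * data[f][s] is the value of feature f in sample s; labels[s] is the class of
     * sample s in 0..numClasses-1. Returns means[c][f], the mean of feature f over class c.
     */
    static double[][] classMeans(double[][] data, int[] labels, int numClasses) {
        int numFeatures = data.length;
        double[][] means = new double[numClasses][numFeatures];
        int[] count = new int[numClasses];
        for (int label : labels) {
            count[label]++;
        }
        for (int f = 0; f < numFeatures; f++) {
            // Accumulate per-class sums for this feature, then divide by class size.
            for (int s = 0; s < labels.length; s++) {
                means[labels[s]][f] += data[f][s];
            }
            for (int c = 0; c < numClasses; c++) {
                if (count[c] > 0) {
                    means[c][f] /= count[c];
                }
            }
        }
        return means;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 3.0, 10.0, 20.0},
            {2.0, 2.0, 4.0, 8.0},
        };
        int[] labels = {0, 0, 1, 1};
        System.out.println(Arrays.deepToString(classMeans(data, labels, 2)));
    }
}
```

Plotting one row of the result per class gives a mean spectrum in which class-discriminating regions show up as gaps between the curves.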
Once opened, both the Heatmap and the class spectrum displays are tied to the main correlation graph and dynamically change in response to the currently displayed features.
Note: The MetaFIND feature analysis application is implemented in Java and utilizes the JFreeChart and JCommon packages.
System Requirements
Contact manufacturer.
Manufacturer
- Complex & Adaptive Systems Laboratory (CASL)
- University College Dublin, Ireland
- And
- UCD School of Agriculture, Food Science & Veterinary Medicine
- UCD Conway Institute
- University College Dublin, Ireland
Manufacturer Web Site MetaFIND
Price Contact manufacturer.
G6G Abstract Number 20655
G6G Manufacturer Number 104302