Category Cross-Omics>Data/Text Mining Systems/Tools

Abstract XplorMed is a web tool that summarizes MEDLINE search results according to subjects and allows you to navigate through abstracts in an interactive fashion.

The XplorMed server allows you to explore a set of abstracts derived from a MEDLINE search. The system gives you the ‘main associations’ between the words in ‘groups of abstracts’. Then, you can select a subset of your abstracts based on selected groups of related words and iterate your analysis on them.

XplorMed is recommended for cases in which you do not know exactly what you expect to find. Your interests may be modified by the results obtained, or you may want to ask new questions as the analysis develops.

Also, the results may suggest additional words to you that should be used to expand your query in MEDLINE (e.g., unexpected abbreviations of a ‘protein name’, or synonyms of a disease).

The only input needed is a set of abstracts (in one of the manufacturer's currently accepted input formats) or a definition of how to obtain them.

Generating Input for XplorMed --

1) There are three (3) ways of entering XplorMed: with a query to MEDLINE, with a database entry identifier, or with a file of abstracts.

If you supply a file, it must conform to one of the input formats that the manufacturers currently support.

Note: See the manufacturer's tutorial (located on their web-site) for an example of how to prepare such a file using the NCBI Entrez server.

2) Minimum number of abstracts recommended - The manufacturers recommend a minimum of 20 abstracts for the computation of relations. XplorMed computes statistics on ‘word co-occurrence’, which requires that words appear together a minimum number of times.

The less data you have, the fewer relations can be reliably detected (in the limit, you cannot compute anything from a single abstract). The more structured your query is, the fewer documents you will need in order to get meaningful results.
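The dependence on the number of abstracts can be illustrated with a minimal Python sketch. It is not XplorMed's actual statistic, which is not specified here; it only counts in how many abstracts two words appear together. The function name `cooccurrence_counts` and the `min_count` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_counts(abstracts, min_count=2):
    """Count in how many abstracts each pair of words appears together.

    Pairs seen in fewer than `min_count` abstracts are dropped, since a
    co-occurrence observed only once says little about a real relation.
    (Assumption: simple document-level counting, not XplorMed's method.)
    """
    pair_counts = Counter()
    for text in abstracts:
        # Unique lowercase words, sorted so each pair has one canonical order.
        words = sorted(set(re.findall(r"[a-z]+", text.lower())))
        pair_counts.update(combinations(words, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


abstracts = [
    "insulin regulates glucose uptake",
    "glucose uptake depends on insulin signalling",
    "kinase activity in yeast",
]
print(cooccurrence_counts(abstracts))      # pairs shared by the first two abstracts
print(cooccurrence_counts(abstracts[:1]))  # a single abstract yields no pairs
```

With only one abstract, no pair can reach the minimum count, which mirrors the limiting case described above.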

3) Limitations of use of the server - There is a limit on the number of abstracts that can be submitted to the server (because of computational resource limitations), which is currently set at 500 abstracts.

4) Limiting your search - If your results contain too many abstracts you can either try another more specific query in PubMed (MEDLINE) by adding more words to the query, or use the limit options (e.g., search only papers published in one year). That may also give you a good overview.

If your results contain too few abstracts, you can make your query in PubMed less specific by using fewer terms in the query. You can also add alternative terms by using Boolean operators, e.g., (mobile OR cellular) AND (phone OR telephone).
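As a rough illustration of composing such a Boolean query, the Python sketch below joins synonyms with OR and groups with AND, then builds (without sending) a request URL for the NCBI E-utilities ESearch service, capped at the 500-abstract limit mentioned above. The helper names `build_query` and `esearch_url` are hypothetical, and this is not part of XplorMed itself.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(*term_groups):
    """Join synonyms within a group with OR, and the groups with AND."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in term_groups)


def esearch_url(query, max_results=500):
    """Build (but do not send) an ESearch request URL for PubMed."""
    params = {"db": "pubmed", "term": query, "retmax": max_results}
    return ESEARCH_URL + "?" + urlencode(params)


query = build_query(["mobile", "cellular"], ["phone", "telephone"])
print(query)
print(esearch_url(query))
```

Adding a synonym to a group broadens the search, while adding a new group narrows it.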

Using the XplorMed server --

Access the XplorMed web-site via your web browser. Select your entry point (Yellow/Green/Red) and provide either a query to MEDLINE, a database entry identifier, or a file with abstracts.

Hit the ‘Sort abstracts by MeSH categories’ button. The next window displays the classification of your abstracts based on the ‘top hierarchy’ of the ‘MeSH Terms’ associated with each MEDLINE entry. Select the categories of interest to you (or all of them if you do not know). Then, hit the ‘Compute relationships between words in abstracts’ button.

In the next window you will get a list of ‘important words’ (words that are significantly related to other words), which are therefore likely to form part of ensembles of words repeated across several of your abstracts.

You can already begin to explore the context of those words by clicking on the word itself. This will lead you to a new window showing all sentences containing the word.

Alternatively, you can follow the [R] links that show related words. For a more general ‘contextual analysis’, you can open an exploration window by hitting the ‘Explore context of any word’ button. From there you can find the context and dependencies of any word.

Before performing the next step, you can control two (2) variables that modulate how focused your analysis will be: ‘alpha’ (which controls the cut on the ‘fuzzy relations’ used in the following steps) and ‘Score’ (which is applied as a threshold on the ‘K score’ for word selection).

Low alpha values produce more and longer ‘word chains’. High K scores limit the number of words used to produce the word chains. By clicking the ‘Compute chains of related words’ button, you will be shown ‘chains of related words’ that have been extracted according to the relations and the words selected.
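The effect of the two thresholds can be sketched in Python under strong assumptions. How XplorMed defines its fuzzy relations and K score is not stated here, so the sketch simply treats relation strengths and word scores as given numbers and groups words that remain connected after thresholding. The function `word_chains` and the sample values are hypothetical.

```python
def word_chains(relations, word_scores, alpha, min_score):
    """Group words into chains using relations of strength >= alpha.

    relations:   {(word_a, word_b): strength between 0 and 1}
    word_scores: {word: score}; words scoring below min_score are ignored.
    (Assumption: chains are connected groups; XplorMed's method may differ.)
    """
    kept = {w for w, s in word_scores.items() if s >= min_score}
    neighbours = {w: set() for w in kept}
    for (a, b), strength in relations.items():
        if strength >= alpha and a in kept and b in kept:
            neighbours[a].add(b)
            neighbours[b].add(a)

    chains, seen = [], set()
    for start in sorted(kept):
        if start in seen or not neighbours[start]:
            continue  # already placed, or no relation survived the cut
        chain, stack = [], [start]
        while stack:
            word = stack.pop()
            if word in seen:
                continue
            seen.add(word)
            chain.append(word)
            stack.extend(neighbours[word] - seen)
        chains.append(sorted(chain))
    return chains


relations = {
    ("insulin", "glucose"): 0.9,
    ("glucose", "uptake"): 0.5,
    ("yeast", "kinase"): 0.3,
}
scores = {"insulin": 5, "glucose": 4, "uptake": 2, "yeast": 3, "kinase": 1}

print(word_chains(relations, scores, alpha=0.8, min_score=1))  # one short chain
print(word_chains(relations, scores, alpha=0.2, min_score=1))  # more, longer chains
print(word_chains(relations, scores, alpha=0.2, min_score=3))  # fewer words used
```

In this toy setting, lowering alpha keeps more relations and so yields more and longer chains, while raising the score threshold removes words before any chain is built.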

Select one or more of those chains and choose ‘Rank abstracts by word-chain usage’ in order to rate the abstracts according to the presence of those words in them. Alternatively, you can type a chain of words in the window.

In the next window you will obtain a list of abstracts ordered by their relation to the selected ‘chains of words’. You can select a subset of abstracts and hit the ‘Compute relations between words in abstracts’ button, which leads you to the ‘next iteration’ of the analysis, performed on the selected subset of abstracts.
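One simple way to picture such a ranking is sketched below in Python: each abstract is scored by the fraction of chain words it contains. This scoring rule is an assumption made for illustration; the actual measure used by XplorMed is not described here, and `rank_by_chain` is a hypothetical name.

```python
import re


def rank_by_chain(abstracts, chain):
    """Order abstracts by the fraction of chain words each one contains.

    Returns (score, index) pairs, best first. (Assumed scoring rule.)
    """
    chain_words = {w.lower() for w in chain}
    if not chain_words:
        raise ValueError("chain must contain at least one word")
    scored = []
    for index, text in enumerate(abstracts):
        words = set(re.findall(r"[a-z]+", text.lower()))
        score = len(chain_words & words) / len(chain_words)
        scored.append((score, index))
    # Highest score first; ties keep the original abstract order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored


abstracts = [
    "kinase activity in yeast",
    "insulin regulates glucose uptake",
    "glucose metabolism in muscle",
]
ranking = rank_by_chain(abstracts, ["insulin", "glucose"])
for score, index in ranking:
    print(f"{score:.2f}  {abstracts[index]}")
```

The top-ranked abstracts are the natural candidates for the subset used in the next iteration.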

You can also expand your selection with a number of ‘MEDLINE neighbors’ of the selected abstracts. MEDLINE neighbors are abstracts with similar word content to a given one; two neighboring abstracts would be expected to deal with similar subjects.

XplorMed Additional Information --

The manufacturers have used a list of ‘Stop-words’ to filter out words that are considered non-informative. This list is based on the one used at the National Center for Biotechnology Information (NCBI) with some additions (e.g., units of measurement).
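Stop-word filtering of this kind can be sketched in a few lines of Python. The entries in `STOP_WORDS` below are illustrative only; the actual list used by the manufacturers is not reproduced in this description.

```python
import re

# Illustrative entries only (common words plus a few units of measurement);
# this is not the list used by XplorMed or NCBI.
STOP_WORDS = {"the", "of", "and", "in", "was", "were", "mg", "ml", "kg"}


def informative_words(text, stop_words=STOP_WORDS):
    """Return the lowercase words of `text` that are not stop-words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stop_words]


print(informative_words("The dose of insulin was 5 mg in the control group"))
```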

System Requirements



Manufacturer Web Site XplorMed

Price Contact manufacturer.

G6G Abstract Number 20553

G6G Manufacturer Number 104055