CFinder

Category Cross-Omics>Pathway Analysis/Gene Regulatory Networks/Tools

Abstract CFinder is a fast software system for locating and visualizing overlapping, densely interconnected groups of nodes in undirected graphs, and allowing the user to easily navigate between the original graph and the web of these groups.

It offers a fast and efficient method for clustering data represented by large graphs, such as genetic or social networks and microarray data.

The manufacturer has shown that in gene (protein) association networks, CFinder can be used to predict the function(s) of a single protein and to discover novel modules.

CFinder is also very efficient for locating the cliques of large sparse graphs.

CFinder reads a list of binary interactions, performs a search for dense subgraphs (groups), and it allows for any node to belong to more than one group.

Owing to its algorithm and implementation, CFinder is efficient for networks with millions of nodes and as a byproduct of its search, the full clique overlap matrix of the network, is determined.

CFinder’s results can be used to predict novel modules and novel individual protein functions.

CFinder overview --

The computational core of CFinder was implemented in C++, while the visualization and analysis components were written in Java.

The search algorithm uses the Clique Percolation Method (CPM) - (see below...) to locate the k-clique percolation clusters of the network that the manufacturers interpret as modules.

The user interface of CFinder offers several views of the analyzed network and its module structure.

Alternative views currently available in CFinder are ‘Communities’ (displaying the identified modules), ‘Cliques’, ‘Stats’ (statistics of, e.g. module and overlap sizes) and ‘Graph of communities’.

1) The Clique Percolation Method --

The original Clique Percolation Method (CPM) used by CFinder is designed to locate the k-clique communities of unweighted, undirected networks. (Extended versions of this algorithm, included in CFinder can handle directed and weighted networks as well - see below...)

This community definition is based on the observation that a typical member in a community is linked to many other members, but Not necessarily to all other nodes in the community.

In other words, a community can be interpreted as a union of smaller complete (fully connected) subgraphs that share nodes.

Such complete subgraphs in a network are called k-cliques, where k refers to the number of nodes in the subgraph, and a k-clique-community is defined as the union of all k-cliques that can be reached from each other through a series of adjacent k-cliques.

Two k-cliques are said to be adjacent if they share k-1 nodes.

An illustration of these communities can be given by k-clique template rolling. A k-clique template can be thought of as an object that is isomorphic to a complete graph of k nodes.

Such a template can be placed onto any k-clique of the network, and rolled to an adjacent k-clique by relocating one of its nodes and keeping its other k-1 nodes fixed.

Thus, the k-clique-communities of a graph are all those subgraphs that can be fully explored by rolling a k-clique template in them but cannot be left by this template.

The CPM was inspired by the fact that the k-clique communities also correspond to percolation clusters in the k-clique adjacency graph of the system.

The nodes of the k-clique adjacency graph represent the k-cliques of the original network, and there is an edge between two (2) nodes if the corresponding two k-cliques are adjacent.

The advantages of the above community definition are the following:

Naturally, the communities also constitute a network with these overlaps as their links.

The number of such links of a community can be called its community degree.

2) Outline of the community finding algorithm --

These components can be found by erasing every off-diagonal entry smaller than k-1 and every diagonal element smaller than k in the matrix, replacing the remaining elements by one, and then carrying out a component analysis of this matrix.

The resulting separate components will be equivalent to the different k-clique-communities.

3) Directed and weighted networks --

The original CPM method, as described above, can only handle undirected, unweighted networks.

CFinder, however, also contains algorithms for handling directed (CPMd) and weighted (CPMw) networks. These are natural extensions of the CPM method.

In both cases the key idea is that one can add extra criteria for a clique to be used when building the community.

Note: For details on the criteria and the above algorithms, see the numerous papers on the manufacturer's web-site under Publications.

CFinder documentation --

The manufacturer also provides an extensive FAQ section and User Manual.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site CFinder

Price Contact manufacturer.

G6G Abstract Number 20778

G6G Manufacturer Number 104355