Reactome FI Cytoscape Plugin

From ReactomeWiki

Overview

The Reactome FI Cytoscape Plugin was designed to find network patterns related to cancer and other types of diseases. The plugin accesses the Reactome Functional Interaction (FI) network, a highly reliable, manually curated, pathway-based protein functional interaction network covering close to 50% of human proteins. With it you can construct a FI sub-network based on a set of genes, query the FI data source for the evidence underlying an interaction, build and analyze network modules of highly interacting groups of genes, perform functional enrichment analysis to annotate the modules, expand the network by finding genes related to the experimental data set, display pathway diagrams, and overlay a variety of information sources such as cancer gene index annotations. For an example of how we use Reactome FIs for cancer data analysis, please see our publication: A human functional protein interaction network and its application to cancer data analysis.

Download and Launch the Reactome FI plugin

  • If you have installed Cytoscape already (version 2.7.0 or above), please save this jar file, caBigR3.jar, into your Cytoscape plugins folder, and restart Cytoscape.
  • You can also launch the FI Cytoscape plug-in using Java Web Start by clicking this link: Cytoscape.jnlp. If you see a dialog similar to the following screenshot, please choose "Allow" so that the FI plugin can open your local files and save results:
Reactome FI Plugin Java Web Start
  • If you need the Java source code for this Cytoscape plug-in, you can download it from this link: FICytoscapePlugInSrc.jar

Use the Reactome FI plugin

After starting Cytoscape, you should see a menu item called "Reactome FIs" under the Plugins menu. Clicking this menu shows three sub-menus: Gene Set/Mutation Analysis, Microarray Data Analysis, and User Guide. Gene Set/Mutation Analysis performs FI network-based data analysis for a set of genes or a mutation data file. Microarray Data Analysis performs MCL (Markov Cluster algorithm, http://micans.org/mcl/) based FI network clustering by converting the unweighted FI network to a weighted network, using correlations among genes in the network as weights. User Guide brings you to this user guide.
Reactome FI Plugin Menu

Gene Set/Mutation Analysis

  1. Currently, the FI plug-in supports three file formats for gene set/mutation analysis:
    1. Simple gene set: one line per gene. For example, GWASFuzzyGenes.txt, a list of T2D GWAS genes.
    2. Gene/sample number pair. For example, GeneSampleNumber.txt, which contains two required columns, the gene and the number of samples in which the gene is mutated, and an optional third column listing sample names (delimited by ";").
    3. NCI MAF (mutation annotation file). For example, GlioblastomaMutationTable.txt, the mutation file from the TCGA GBM project.
  2. Choose a file containing the genes you want to use to construct a functional interaction network. In the dialog, select an appropriate file format and parameters for loading the genes and constructing the FI network. Click the "OK" button to start building the FI network.
    Open Gene Set File
  3. The constructed FI network will be displayed in the network view panel. A FI specific visual style will be created automatically for the FI network.
    Reactome FI Sub-Network
  4. The main features of the Reactome FI plug-in are invoked from a popup menu, which is displayed by right-clicking an empty space in the network view panel.
    Popup Menu for Network
    1. Select nodes: select nodes from a list of node ids delimited by ", ".
    2. Fetch FI annotations: query detailed information on selected FIs. Three FI-related edge attributes will be created: FI Annotation, FI Direction, and FI Score. Edges are displayed based on their FI Direction attribute values. In the following screenshot, "->" stands for activating/catalyzing, "-|" for inhibition, "-" for FIs extracted from complexes or inputs, and "---" for predicted FIs. For details, see the Edge Source Arrow Shape and Edge Target Arrow Shape values in the "VizMapper" tab.
      FI Annotations
    3. Analyze network functions: pathway or GO term enrichment analysis for the displayed network. You can filter enrichment results by an FDR cutoff value. You can also display only the nodes for a selected row or rows in the network panel by checking "Hide nodes in not selected rows". The following screenshot shows results from a pathway enrichment analysis.
      Pathways in FI Sub-Network

      Tip: To analyze pathway or GO term enrichment on a set of genes that are not linked together, select the "Show genes not linked to others" option in the "Set Parameters for FI Network" dialog.
    4. Cluster FI network: run a network clustering algorithm (spectral partition based network clustering by Newman, 2006) on the displayed FI network. Nodes in different network modules are shown in different colors (distinct colors are used only for the first 15 modules, ranked by size).
      Network Modules
    5. Analyze module functions: pathway or GO term enrichment analysis for each individual network module. You can select a size cutoff to filter out network modules that are too small, choose an FDR cutoff to view only pathways or GO terms enriched below a certain FDR value, and view only the nodes in a selected row or rows in the network diagram.
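
As a rough illustration of the gene/sample number pair format listed in step 1, the Python sketch below reads such a file into a dictionary. The tab delimiter, the optional header row, and the function name parse_gene_sample_numbers are assumptions made for illustration, and the two input lines are made-up values; check them against GeneSampleNumber.txt before relying on this.

```python
import io


def parse_gene_sample_numbers(handle):
    """Map each gene to (sample_count, sample_names).

    Assumes tab-delimited columns: gene, number of mutated samples,
    and an optional third column of ';'-delimited sample names.
    """
    records = {}
    for line_no, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected at least two columns")
        gene = fields[0].strip()
        try:
            count = int(fields[1])
        except ValueError:
            if line_no == 1:
                continue  # assumption: a non-numeric first line is a header
            raise
        samples = []
        if len(fields) > 2 and fields[2].strip():
            samples = [s.strip() for s in fields[2].split(";") if s.strip()]
        records[gene] = (count, samples)
    return records


# Made-up input, for illustration only.
example = io.StringIO("TP53\t3\tS1;S2;S3\nEGFR\t2\n")
genes = parse_gene_sample_numbers(example)
print(genes["TP53"])
```

For a real file, pass an open file object instead of the io.StringIO wrapper.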

HotNet Mutation Analysis

The Reactome FI Cytoscape plug-in implements "HotNet", an algorithm developed by Raphael's group at Brown University for cancer mutation data analysis. For details about this algorithm, please see Algorithms for detecting significantly mutated pathways in cancer, and Discovery of mutated subnetworks associated with clinical data in cancer.

  1. Select a mutation data file and run the HotNet algorithm: After selecting the sub-menu "HotNet Mutation Analysis" from the menu Plugins/Reactome FIs, you will see the following dialog. Choose a version of the FI network and a mutation file from your local file system, and set the parameters required by the HotNet algorithm. Currently the plug-in supports the NCI MAF mutation file format only; we plan to support more file formats in the future. If you are not sure what delta value to use, you may choose "Auto" in the dialog, but the algorithm then takes much longer to run. Random permutation is used to calculate p-values and FDR values. The maximum number of permutations is 1000. For details about permutation, please see the two papers above. After entering all required parameters, click the "OK" button to start the HotNet analysis, which may take several minutes. For a test run, you may use the TCGA GBM mutation file, GlioblastomaMutationTable.txt, and choose the 2012 version of the FI network with delta 1.0e-4.
    Set Parameters for HotNet Mutation Analysis
  2. Select network modules and build a FI sub-network: The FI network modules generated by the HotNet analysis are listed in the HotNet result dialog (see below). In the dialog, you can choose a size cutoff or an FDR cutoff. The network modules that remain displayed will be used to build a FI sub-network after you click the "OK" button. The dialog also shows the chosen delta value and the number of permutations. You may try different delta values for better results.
    HotNet Mutation Analysis Results

Microarray Data Analysis

The Reactome FI Cytoscape plugin can load a gene expression data file, calculate correlations among genes involved in the same FIs, use the calculated correlations as weights for edges (i.e. FIs) in the whole FI network, apply the MCL graph clustering algorithm to the weighted FI network, and generate a sub-network for a list of network modules selected by module size and average correlation. The generated FI sub-network is displayed in the network panel and can be analyzed as in Gene Set/Mutation Analysis. For details about this method, please see our publication: A network module-based method for identifying cancer prognostic signatures.

An array data file should be a tab-delimited text file with table headers. The first column should contain gene names. All other columns should contain expression values in different samples. The data set in the file should be pre-normalized. For example, see this gene expression file for breast cancer: NejmLogRatioNormGlobalZScore_070111.txt.zip. This data set was downloaded from van de Vijver et al. (2002) and has been normalized.
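
To make the file layout and the edge-weighting idea concrete, here is a minimal Python sketch that reads a table in the format described above and computes a correlation for each pair of interacting genes. It assumes Pearson correlation, which this guide does not specify, and it does not handle missing values. The function names, gene names, and numbers are hypothetical and are not taken from the plugin.

```python
import csv
import io
import math


def read_expression_table(handle):
    """Read a tab-delimited table: header row, then gene name + one value per sample."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    values = {}
    for row in reader:
        if not row:
            continue  # skip empty lines
        values[row[0]] = [float(x) for x in row[1:]]
    return samples, values


def pearson(xs, ys):
    """Pearson correlation; returns None when either vector has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def edge_weights(values, interactions, use_absolute=True):
    """Weight each interaction (gene pair) by the correlation of its two genes."""
    weights = {}
    for a, b in interactions:
        if a not in values or b not in values:
            continue  # no expression data for one of the partners
        r = pearson(values[a], values[b])
        if r is None:
            continue
        weights[(a, b)] = abs(r) if use_absolute else r
    return weights


# Made-up expression values, for illustration only.
table = io.StringIO(
    "Gene\tS1\tS2\tS3\tS4\n"
    "GENE_A\t1\t2\t3\t4\n"
    "GENE_B\t2\t4\t6\t8\n"
    "GENE_C\t4\t3\t2\t1\n"
)
samples, values = read_expression_table(table)
print(edge_weights(values, [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C")]))
```

With use_absolute=True, a strongly anti-correlated pair receives the same weight as a strongly correlated one, which is what the absolute-value option in the dialog selects.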

  1. Select a microarray data file and run MCL network clustering: After selecting the sub-menu "Microarray Data Analysis" from the menu Plugins/Reactome FIs, you should see the following dialog. Choose a microarray data file, check the absolute value option if you want to use the absolute values of correlations as edge weights, and enter an inflation parameter (-I) for the MCL clustering algorithm. The smaller the inflation parameter, the larger the average size of the generated network modules. Based on our own experience, we use 5.0 for the inflation parameter, the highest recommended value, and choose absolute values for edge weights. For more details on how to choose the inflation parameter, please see http://micans.org/mcl/. After you have set these parameters, click the OK button to load the data file, calculate correlations, and apply the MCL clustering algorithm.
    Set Parameters for Microarray Data Analysis
  2. Select network modules and build a FI sub-network: The generated network modules are listed in the MCL clustering results dialog (see below). Only modules having more than 2 genes are listed and can be used to build the FI sub-network. You can choose a module size or an average correlation value (an absolute value if the absolute option was checked earlier) to filter out modules that may not be significant. (Note: after setting these cutoff values, please press the "Enter" key to commit your changes.) In our analysis, we choose modules having 7 or more genes with average correlation values no less than 0.25. These values are used as defaults in the dialog. The dialog shows how many modules and genes will be chosen for building the FI sub-network under your selected filter values. Click the OK button to start building the sub-network. The resulting sub-network will be displayed and can be analyzed in the same way as sub-networks generated from the gene set/mutation analysis.
    Choose MCL Network Modules
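
The module filter in step 2 can be sketched in a few lines of Python. The Module class and select_modules function below are hypothetical names used only for illustration, and the modules are made-up data. The default cutoffs mirror the values quoted above (7 or more genes, average correlation no less than 0.25).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    """A network module: its genes and the average correlation of its edges."""
    index: int
    genes: List[str]
    average_correlation: float


def select_modules(modules, min_size=7, min_average_correlation=0.25):
    """Keep modules that pass both the size and the average correlation cutoff."""
    return [
        m for m in modules
        if len(m.genes) >= min_size
        and m.average_correlation >= min_average_correlation
    ]


# Made-up modules, for illustration only.
modules = [
    Module(0, [f"G{i}" for i in range(12)], 0.31),   # passes both cutoffs
    Module(1, [f"H{i}" for i in range(9)], 0.18),    # correlation too low
    Module(2, [f"K{i}" for i in range(4)], 0.40),    # too small
]
selected = select_modules(modules)
print([m.index for m in selected], sum(len(m.genes) for m in selected))
```

The printed counts correspond to the numbers of modules and genes that the dialog reports for the current filter values.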

Other features in FI plugin

Query FI source

Select an edge and right-click it to get the popup menu for the edge. Select the menu item "Reactome FI/Query FI Source". If a FI is extracted from curated pathways or reactions, a dialog listing the original data source(s) will be displayed. Double-click a row in the displayed table to show a detailed web page for the source of the FI. If the selected FI is a predicted one, the evidence for this FI will be displayed.
Query FI Source
Reactome FI Plugin Menu

Fetch FIs for node

All FIs for a node can be queried. Select a node in the network panel and right-click it to get the popup menu for the node. Select the menu item "Reactome FI/Fetch FIs". FI partners for the selected node will be displayed in two sections: partners already displayed in the network and partners not displayed in the network. You can select partners from the second section to expand the displayed network.
Query Node FIs
Show Node FIs

Show pathway diagram

Pathway diagrams can be shown for pathway hits. Select a pathway in the "Pathways in Network" or "Pathways in Modules" tab, and right-click to get the popup menu for the pathway. Select "Show Pathway Diagram" from the popup menu.
Show Pathway Diagram

If pathways are imported from KEGG, KEGG pathway diagram pages will be shown in a browser, with the genes listed in the "Nodes" column highlighted in red (for text and borders in pathway diagrams). If pathways are from Reactome or other non-KEGG databases, pathway diagrams will be shown in a separate window. If pathways are curated by the Reactome project, manually laid-out diagrams are displayed if available; otherwise, automatically laid-out diagrams are displayed. Genes or proteins from the displayed network are highlighted in blue. Detailed annotations for nodes and reactions displayed in the diagram window can be viewed by using the popup menu "View Instance". Diagrams can be zoomed in and out using the zoom slider at the bottom of the window, and panned using the overview window at the top-right corner.
KEGG Focal Adhesion
Reactome Signaling by PDGF

Load cancer gene index annotations

The Reactome FI plug-in can load cancer gene index annotations for genes/proteins displayed in the network. There are two ways to show these annotations: use the popup menu "Load Cancer Gene Index" when no object is selected (left figure), or use the popup menu "Fetch Cancer Gene Index" for a selected node (right figure).


Load Gene Index
Load Node Cancer Gene Index

Using the first method, the user can load the tree of NCI disease terms and display the tree in the left panel. When the user selects a disease term in the tree, all genes or proteins annotated for the selected disease and its sub-terms are selected.
Cancer Gene Index Overlay

By using the second method, the user can view detailed annotations for the selected gene or protein. The user can sort these annotations based on PubMedID, Cancer type, and annotation status, and also filter annotations based on several criteria.


Cancer Gene Index Annotations for Node

Survival analysis

Survival analysis uses a server-side R script to perform either coxph or Kaplan-Meier survival analysis. To do survival analysis, provide a tab-delimited text file containing at least three columns, named Samples, OSDURATION, and OSEVENT. For example, see this survival information file downloaded from van de Vijver et al. (2002): Nejm_Clin_Simple.txt, which has been simplified for our analysis. To run survival analysis, use the popup menu "Analyze Module Functions/Survival Analysis..." (see below).
Survival Analysis Menu
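
Before uploading a survival information file, it can help to check that it has the three required columns. The Python sketch below does this for a tab-delimited file with the column names given above. The function name read_survival_table is hypothetical, the numeric parsing of OSDURATION and the 0/1 integer coding of OSEVENT are assumptions, and the rows are made-up values.

```python
import csv
import io

REQUIRED_COLUMNS = ("Samples", "OSDURATION", "OSEVENT")


def read_survival_table(handle):
    """Return {sample: (duration, event)} from a tab-delimited survival file."""
    reader = csv.DictReader(handle, delimiter="\t")
    fieldnames = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in fieldnames]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))
    table = {}
    for record in reader:
        sample = record["Samples"]
        duration = float(record["OSDURATION"])  # assumption: numeric duration
        event = int(record["OSEVENT"])          # assumption: 1 = event, 0 = censored
        table[sample] = (duration, event)
    return table


# Made-up rows, for illustration only; extra columns are allowed and ignored.
example = io.StringIO(
    "Samples\tOSDURATION\tOSEVENT\tAge\n"
    "S1\t5.2\t1\t44\n"
    "S2\t8.0\t0\t51\n"
)
survival = read_survival_table(example)
print(survival["S1"])
```

A file that fails this check would need its header row corrected before it can be used in the survival analysis dialog.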

In the survival analysis dialog (below), double-click the text field to select a file containing survival information for the samples used to build the displayed FI sub-network. (Note: you cannot do survival analysis if you used only a gene set file to construct the displayed FI sub-network.) You can choose either the coxph or the Kaplan-Meier model for survival analysis. If you choose the Kaplan-Meier model, you have to select a module for analysis. In the Kaplan-Meier analysis, all samples are divided into two groups: samples having no mutated genes in the selected module (group 1) and samples having mutated genes in the module (group 2). It is recommended to run the coxph model first without selecting any module, in order to see which module is most significantly related to survival times. After that, you can focus on specific modules for survival analysis.


Survival Analysis Dialog
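
The two-group split and the Kaplan-Meier estimate described above can be sketched in Python as follows. This is the textbook product-limit estimator, not the plugin's server-side R script, whose implementation is not shown in this guide. The function names, the event coding (1 = event, 0 = censored), and all data are assumptions made for illustration.

```python
def split_by_module(sample_to_mutated_genes, module_genes):
    """Group 1: samples with no mutated gene in the module; group 2: the rest."""
    module = set(module_genes)
    group1, group2 = [], []
    for sample, genes in sample_to_mutated_genes.items():
        if module & set(genes):
            group2.append(sample)
        else:
            group1.append(sample)
    return group1, group2


def kaplan_meier(durations, events):
    """Product-limit estimate: [(time, survival probability)] at each event time."""
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        leaving = sum(1 for d in durations if d == t)  # events plus censored at t
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up data, for illustration only.
mutations = {"S1": {"TP53"}, "S2": set(), "S3": {"EGFR", "PTEN"}}
group1, group2 = split_by_module(mutations, ["TP53", "PTEN"])
print(group1, group2)

curve = kaplan_meier([5, 8, 12, 12], [1, 0, 1, 1])
print(curve)
```

Running kaplan_meier once per group gives the two curves that a Kaplan-Meier plot compares.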

The results from survival analysis are displayed in the Results Panel on the right, in a tab labeled "Survival Analysis" (below left). You can run multiple survival analyses. All results returned from the server-side R script are displayed in this panel, labeled according to your parameter selections in the survival analysis dialog. The most recent result is selected by default. At most three sections are displayed in the result panel for each analysis: Output, Error, and Plot. If an analysis returns no warning or error, the Error section may not be shown. Rows for modules having p-values less than 0.05 from the coxph (all modules) analysis are displayed in blue with underlined text. You can click these modules to run a quick single-module survival analysis without going through the above steps. A single-module Kaplan-Meier analysis produces a plot file. You can click the file to view the actual plot (below right). You may want to save the plot file for future use.


Survival Analysis Results
Kaplan-Meier Survival Plot