Reactome User's Guide

Introduction

This document is designed to get you (the new user) up and going with Reactome, a knowledgebase of biological processes. This is not a comprehensive guide, but should provide you with enough information to search and begin using the database. We encourage you to read through it and contact us with any comments or questions that you might have. -The Reactome Staff

Getting started

What is Reactome?
What types of information does Reactome contain?
How is Reactome data organized?
What do Reactome stable identifiers allow me to do?
In what other formats may I view/export Reactome events?
How do I search Reactome?
How is electronic inference used to predict events in Reactome?
What is the Reactome Pathfinder?
What is the Reactome Skypainter?
What is Mart?
How can I see who has contributed knowledge to Reactome?
How can I see the topics that will be annotated in future Reactome releases?
How can I download Reactome data?
How do I cite Reactome?
How do I link to Reactome?

Viewing Reactome data

The Reactome Frontpage

The Reactome Frontpage can be accessed by pointing your web browser at www.reactome.org. It contains five sections (from top to bottom):

  • The Navigator bar provides access to the rest of the Reactome web site. E.g. a detailed list of the pathways contained within Reactome can be seen on the Table of Contents (TOC) page.
  • The Search panel allows you to perform text-based searches on Reactome. This functionality is covered in detail below.
  • The Species chooser allows you to choose a species. The Reaction Map will change according to which reactions could be orthology-mapped to that species from human. Human reactions are shown by default; at the time of writing, 21 other species are available, including most of the popular model organisms.
  • The Reaction Map (also known as the "Starry Sky", because of its resemblance to the constellations at night) provides an interactive graphical representation of Reactome pathways. A pathway is depicted as a set of interconnected arrows, each representing a reaction (e.g., "Formation of Cyclin E1:Cdk2 complexes"; "Phosphorylation of Cyclin E1:Cdk2 complexes"). Each of the reaction arrows is linked into pathways as dictated by the order of reactions in that pathway.
  • The Pathway Topic List provides a list of pathway topics currently represented in Reactome (for a more complete list of pathways contained in Reactome, see the TOC). A full description of each pathway (or pathway sub-event) is provided by individual Event Pages. A description of individual molecules and complexes is provided on separate pages that link out from the Event pages.
Reactome Home Page

"Mousing-over" a Topic name, such as Apoptosis, highlights the cluster of reactions included in the Apoptosis pathways on the Reaction Map. The "moused-over" Topic is displayed as a series of colored reaction arrows. A pale halo of background color similar to that of the overlying arrow(s) is added to facilitate visualization of these arrow(s) within the map as shown above.

A color-coded key under the Reaction Map describes the meaning of the different arrow colors. If the reaction has been experimentally confirmed in the selected species, the arrow is blue; if the reaction has not been confirmed directly but has been inferred manually from another reaction (either in the selected species or another species), the arrow is pink; if the reaction has been inferred computationally, the arrow is green. Reactions that are not part of the selected Pathway Topic are displayed as grey arrows. Each Reactome event is represented by only one arrow in the map. When a reaction in one pathway is preceded by a reaction from another pathway, this is indicated by a thin grey line between the reactions in the two pathways. In the example shown below, the red arrow on the right points to the reaction "glutaryl-CoA + FAD => crotonyl-CoA + FADH2 + CO2" (name not shown here), within the pathway Metabolism of amino acids and related nitrogen-containing molecules. This reaction is also the preceding reaction for the reaction "Crotonoyl-CoA + H2O <=> (S)-3-Hydroxybutanoyl-CoA" (red arrow on right) in Lipid Metabolism. Thus, these two reactions are connected by a thin grey line. Mousing-over an individual reaction arrow displays the name of that reaction and highlights, within the Pathway Topic List, the name(s) of the pathway(s) in which this reaction occurs.
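The color key described above amounts to a small lookup from evidence type to arrow color. The Python sketch below is purely illustrative: the names `Evidence`, `ARROW_COLORS` and `arrow_color` are hypothetical and are not part of the Reactome software.

```python
from enum import Enum


class Evidence(Enum):
    """How a reaction is supported in the selected species (hypothetical names)."""
    EXPERIMENTAL = "experimentally confirmed"
    MANUAL_INFERENCE = "inferred manually from another reaction"
    ELECTRONIC_INFERENCE = "inferred computationally"


# Colors as described in the key under the Reaction Map.
ARROW_COLORS = {
    Evidence.EXPERIMENTAL: "blue",
    Evidence.MANUAL_INFERENCE: "pink",
    Evidence.ELECTRONIC_INFERENCE: "green",
}


def arrow_color(evidence, in_selected_topic=True):
    """Return the arrow color for one reaction on the map."""
    if not in_selected_topic:
        # Reactions outside the selected Pathway Topic are drawn in grey.
        return "grey"
    return ARROW_COLORS[evidence]


print(arrow_color(Evidence.MANUAL_INFERENCE))
print(arrow_color(Evidence.EXPERIMENTAL, in_selected_topic=False))
```

The same three colors are reused for the event names in the Event Hierarchy, so a single lookup of this kind would cover both displays.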

Pathways have multiple reactions.

Selecting a pathway in the Pathway List or an individual reaction in the Reaction Map directs the user to the Event Page, on which the reaction map has been zoomed to display the pathway of interest in greater detail. The Event Page provides a detailed description of the selected pathway or reaction.

Pathways, Proteins and More

This section introduces you to some of the more important types of data available in Reactome by means of a "walk" through the web pages for pathways, reactions, proteins and complexes.

Viewing Pathways

On the front page, click on the link marked "Apoptosis" in the Pathway Topic List panel. This will take you to the web page for displaying the pathway.


This pathway comprises four subpathways, which you can see on the left-hand side of the page. This is called the Event Hierarchy. By clicking on one of the "+" symbols, you can expand the subpathway to find out what is inside. Clicking on any of the components of the subpathway takes you to the web page for that component.

In the section on the right-hand side of the page, you will see a text giving an overview of this pathway, plus a descriptive diagram. Literature references are also shown here, with links to PubMed.

Viewing Reactions

Return to the "Event hierarchy" at the top left-hand side of the page. Expand the subpathway "Apoptotic execution phase". Click on "Caspase mediated cleavage of APC".

This will take you to a reaction page. The event hierarchy will remain visible, showing you where you currently are. You will also see a reaction diagram:


Image:caspase_cleavage_APC.jpg

This summarizes the reaction in a graphical way. Boxes on the left-hand side represent "inputs" or reactants. Boxes on the right represent "outputs" or products. The box above represents the "facilitator" or catalyst. All of these boxes are clickable and lead to the corresponding physical entity pages. The text in the middle gives the name of the reaction. Text to the left of the diagram, with arrows pointing to some of the inputs, represents preceding reactions. If there were arrows to the right of the outputs pointing to some text, that text would represent following reactions.

Additionally, you will see a "Details" panel:

Image:caspase_cleavage_APC_details.jpg

This presents the input, output and catalyst information in text form, plus information on the GO classification and references to the literature used as source for the curation of this reaction.

Viewing Proteins

Take a closer look at the "Participating molecules" section near the bottom of the page:


Image:APC_protein.jpg


Notice how the names have a number of colored symbols next to them. Mouse over these symbols slowly and notice how each one has text associated with it. These are links into other databases and allow you to get more information on the entity (in this case, a protein). Click on the last of these symbols (an "R" with a blue background) for APC_1. You will be taken to the appropriate EntrezProtein page:


Image:ncbi_link.jpg

Viewing Complexes

Click on the first subpathway under Apoptosis called "Extrinsic Pathway for Apoptosis" and then "Activation of Pro-Caspase 8". Click on the reaction "Formation of Caspase-8 dimer". In the "Output" slot, you will see a complex named "Caspase-8 dimer". Click on this. You will see the following:


Image:caspase8dimer.jpg


Two visual representations of the complex are available, the diagram at the top of the page and the field "Hierarchical view of the components". The diagram shows the components of the complex stacked up on top of each other. Where two or more boxes appear next to each other, alternatives are indicated (in Reactome-speak, this is a "set"). In the hierarchical view, you will see all of the components of the complex represented as a tree. In this case, there are only two components, namely "p18 subunit of Caspase 8" and "p10 subunit of Caspase 8". More deeply nested hierarchies are possible, if you have a complex which itself is made up of complexes.
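The hierarchical view is essentially a tree in which a complex may contain proteins or further complexes. A minimal Python sketch of that idea follows; the `Protein` and `Complex` classes and the `hierarchical_view` function are hypothetical names chosen for this illustration and do not reflect the actual Reactome data model.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Protein:
    name: str


@dataclass
class Complex:
    name: str
    # A component may itself be a complex, which gives nested hierarchies.
    components: List[Union["Complex", Protein]] = field(default_factory=list)


def hierarchical_view(entity, depth=0):
    """Return the component tree as indented lines, one entity per line."""
    lines = ["  " * depth + entity.name]
    for part in getattr(entity, "components", []):
        lines.extend(hierarchical_view(part, depth + 1))
    return lines


# The two components named in the text for the "Caspase-8 dimer" complex.
dimer = Complex("Caspase-8 dimer", [
    Protein("p18 subunit of Caspase 8"),
    Protein("p10 subunit of Caspase 8"),
])
print("\n".join(hierarchical_view(dimer)))
```

Because the function calls itself for every component, a complex made up of other complexes is printed with one extra level of indentation per nesting level.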

Event page viewing formats

In Reactome, anything that involves a change of state is classed as an event. The most important events from the user's viewpoint are reactions and pathways. Things that participate in reactions, like small compounds, DNA, proteins or complexes, are considered to be "entities". The web page layout for all of these things is very similar, and is created by a program called the "eventbrowser".

The eventbrowser provides three alternative views on the data you are examining:

  1. The Classic format with event hierarchy in sidebar is the default format. The Reaction Map is above the Event diagram at the top of the page, with the Event Hierarchy and Event Details in side-by-side panels underneath.
  2. The Sectioned format displays the same information as the Classic view, but each panel - Reaction Map, Event Hierarchy, Event Diagram and Details - occupies the full width of the page.
  3. In the Instancebrowser format the selected event is displayed as a database instance in a frame format with all of its slot (attribute) values.

If you are viewing entities, then no Event Hierarchy will be shown. In the sections describing the different views that follow, we will use a reaction as an example, but the same principles apply to pathways and entities. Some of the information categories described, e.g. "Input", are specific to reactions, and you would not expect these attributes to be present in, say, a complex.

The "Classic" view format

In the Classic view format, the page is divided into 3 sections: the Reaction Map at the top, the Reaction Diagram panel below that and the Details panel at the bottom. All of these panels can be hidden/displayed by clicking on the -/+ symbol above each panel.

Sections in the "Classic" format

The Reactome Event Page


Reaction Map: The Event Page view of the Reaction Map provides additional options for viewing pathways/reactions within the map. Three buttons at the top of the panel allow the user to zoom in or out on the map or to redefine the center of the map by shifting focus to a different map location. In addition, four scroll arrows allow the user to navigate up, down, left, or right within the map. In this view, it is also possible to see the name of any reaction in the Reaction Map by mousing-over it. However, only the event that is the subject of the current event page is ever highlighted on the map.

Reaction Diagram: Displays the components of the event in a visual way. The usual structure is inputs proceeding to outputs with an optional catalyst. Various coloured boxes indicate different entities: white for small compounds, orange for complexes, blue for proteins and green for sets. Text not surrounded by a box indicates names of events. The middle piece of text is the name of the current event being viewed. Text to the left or right of the event indicates preceding or following events respectively. All the entities and text are clickable and will take you to the corresponding page.

Details: This panel is split into two sections. On the left is the hierarchical relationship between sub-events (i.e. pathways and reactions) within a selected pathway. For example, the pathway "Thrombin-activated activation cascade" is subdivided into two pathways: "Thrombin-mediated activation of PARs" and "G-protein cascades".

Reactome Hierarchy Panel

Notes:

  • The selected pathway in the event hierarchy panel is highlighted in bold text.
  • To expand or collapse the hierarchy view in the left panel, click on the +/- button to the left of the event in the hierarchy.
  • If the selected event is a component of multiple pathways (not shown here), each pathway involving that event will be shown with the event of interest highlighted in bold. Scrolling up and down will allow viewing of all instances of that event.

To the right of the details panel is the event description, which contains a title and often a text description and/or a figure. The text description is associated with references where appropriate. In addition, an event may be described by the following categories of information. Note that a description/definition of each of these categories, and others not listed here, is provided on the live pages by mousing-over the category name.

Event Description
Information categories

Stable identifiers are currently available for reactions, pathways, regulator events and physical entities (molecules, compounds, complexes).

Input: the physical entities (molecules/complexes) that are consumed by a given event.

Output: the physical entities (molecules/complexes) that are produced by a given event.

Catalyst: the physical entity that catalyzes the reaction.

Essential catalyst component: the precise component within a catalyst complex (or domain within a simple catalyst) that enables the reaction to occur.

GO molecular function: the Gene Ontology term that represents the activity of a catalyst within the given reaction. For further description of the Gene Ontology, click here. A description of the GO term can be viewed via the GOID which links out to the QuickGO Gene Ontology browser.

Represents GO biological process: (N/A in above event) the Gene Ontology (GO) Biological Process term that corresponds to the Reactome event (if one exists). A description of the GO term can be viewed via the GOID which links out to the QuickGO Gene Ontology browser.

Preceding event(s): a list of events that occur immediately before the event being viewed.

Following event(s): a list of events that occur immediately after the event being viewed.

Cellular compartment: the location within the cell where the event occurs.

References: a list of supporting references, each of which hyperlinks to its PubMed abstract (when applicable).

Taxon: the name of the species in which the event occurs.

This event is deduced on the basis of event(s) in other organism(s): this indicates that the event has not been experimentally demonstrated in humans, but has been inferred on the basis of data acquired for another species. The species in which the event has been demonstrated is listed here.

Equivalent event(s) in other organism(s): this provides links to events in other species that are either confirmed to occur in a very similar way in both species, or have been inferred by electronic annotation (http://www.reactome.org/electronic_inference.html). Electronically inferred events point back to the original event with the link "This event is deduced on the basis of event(s) in other organism(s)". They also have the 'evidenceCode' slot filled in with 'inferred by electronic annotation' (view in instancebrowser).

Participating molecules: a list of all molecules that participate in the selected event as well as all of its sub-events.

For example:

Participating Molecules

The list of participating molecules for the event "mRNA 3'-end processing" includes those molecules involved in the component events: "Cleavage of mRNA at the 3'-end" and "mRNA polyadenylation". A participating molecule may be either a simple molecule or a complex and it may function in a reaction as input, output, catalyst, or regulator. If the participating molecule is a protein, direct links (if available) are provided to its corresponding UniProt, ENSEMBL, Entrez Gene, KEGG gene and OMIM entries through the Image:U_links.gif, Image:E_links.gif, Image:G_links.gif, Image:K_links.gif and Image:OMIM_M_links.gif hyperlinks, respectively, which are found adjacent to the participating molecule name.

In the case of small molecules, links are provided to ChEBI, KEGG COMPOUND and PubChem and appear as the Image:ChEBI_C_links.gif, Image:C_links.gif and Image:PubChem.gif hyperlinks after the name of the "small molecule".

*Note: A detailed description of each Reactome molecule is displayed on an independent page that is linked to the molecule name.
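Since the list of participating molecules covers the selected event and all of its sub-events, it can be thought of as a recursive union over the event hierarchy. The Python sketch below illustrates this under stated assumptions: the `Event` class, its field names and the molecule names ("molecule A" and so on) are placeholders invented for this example, not Reactome schema attributes or real data.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Event:
    """A pathway or reaction; field names are illustrative only."""
    name: str
    molecules: List[str] = field(default_factory=list)       # direct participants
    sub_events: List["Event"] = field(default_factory=list)  # component events


def participating_molecules(event: Event) -> Set[str]:
    """Collect the molecules of an event and, recursively, of its sub-events."""
    found = set(event.molecules)
    for sub in event.sub_events:
        found |= participating_molecules(sub)
    return found


# Event names are taken from the text; molecule names are placeholders.
cleavage = Event("Cleavage of mRNA at the 3'-end", ["molecule A", "molecule B"])
polyadenylation = Event("mRNA polyadenylation", ["molecule B", "molecule C"])
processing = Event("mRNA 3'-end processing", sub_events=[cleavage, polyadenylation])

print(sorted(participating_molecules(processing)))
```

Using a set means that a molecule taking part in several sub-events is listed only once.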

Sectioned view format

In the sectioned view format, the Event page is divided into four sections, the Reaction Map at the top, followed by the Event Hierarchy, the Event Diagram and finally the Details section.

Top half of sectioned view
Bottom half of sectioned view

The +/- symbol in a box to the left of each section name allows the user to show or hide that particular section of the page. The Event Description, Event Hierarchy, or Reaction Map panels can be hidden/displayed by clicking on the -/+ symbol above each panel.

Sections in "Sectioned" view format

Reaction Map: In the Reaction Map, pathways are depicted as a set of interconnected arrows, each representing a reaction. Each reaction arrow is linked into pathways as dictated by the order of reactions in that pathway. Three buttons at the top of the panel allow the user to zoom in or out on the map or to redefine the center of the map by shifting focus to a different map location. In addition, four scroll arrows allow the user to navigate up, down, left or right within the map. In this view, it is also possible to see the name of any reaction in the Reaction Map by mousing-over it. However, only the event that is the subject of the current event page is ever highlighted on the map.

Event hierarchy panel: This panel shows the hierarchical relationship between sub-events (i.e. pathways and reactions) within a selected pathway. The Image:reactionicon.jpg symbol indicates that the adjacent event is a reaction and the Image:pathicon.jpg symbol indicates that it is a pathway. For example, the Notch signaling pathway above is subdivided into 7 component subpathways.

Notes:

  • The + symbol indicates that a pathway can be "unfurled" further to reveal its component events (or that a generic reaction can be "unfurled" to reveal the specific reaction instances). Likewise the - symbol indicates that all of the existing sub-events for the pathway/reaction are shown. Clicking on the +/- symbol unfurls/hides the sub-events respectively. Absence of a + or - symbol next to an event indicates that this event has no sub-events.
  • The text color for a given event in the hierarchy indicates whether the event has been confirmed experimentally (blue), manually inferred from another given event for which there is direct evidence (pink), or electronically inferred (green).
  • The selected pathway is highlighted in bold text within the hierarchy. A set of 4 buttons at the top of the hierarchy panel allows the user to:
  1. Open the event hierarchy from the top-most level pathway to which the selected event belongs down to the event itself.
  2. Open the hierarchy containing the selected event completely, down to the end of the pathway to which the selected event belongs.
  3. Close the entire hierarchy, to show just the top-most level pathway to which the selected event belongs.
  • If the selected event is a component of multiple pathways (not shown here), each pathway involving that event will be shown with the event of interest highlighted in bold. Scrolling up and down will allow viewing of all instances of that event.

Event Diagram: Displays the components of the event in a visual way. The usual structure is inputs proceeding to outputs with an optional catalyst. Various coloured boxes indicate different entities: white for small compounds, orange for complexes, blue for proteins and green for sets. Text not surrounded by a box indicates names of events. The middle piece of text is the name of the current event being viewed. Text to the left or right of the event indicates preceding or following events respectively. All the entities and text are clickable and will take you to the corresponding page.

Details: The Details section provides a description of the Event that is the subject of the page. This section contains a title and often a text description and/or a figure. The text description is associated with references where appropriate. In addition, an event may be described by the following categories of information. Note that a description/definition of each of these categories, and others not listed here, is provided on the live pages by mousing-over the category name.

Information categories

Input: the physical entities (molecules/complexes) that are consumed by a given event.

Output: the physical entities (molecules/complexes) that are produced by a given event.

Catalyst: the physical entity that catalyzes the reaction.

Essential catalyst component: the precise component within a catalyst complex (or domain within a simple catalyst) that enables the reaction to occur.

GO molecular function: the Gene Ontology term that represents the activity of a catalyst within the given reaction. For further description of the Gene Ontology, click here. A description of the GO term can be viewed via the GOID which links out to the QuickGO Gene Ontology browser.

Represents GO biological process: (N/A in above event) the Gene Ontology (GO) Biological Process term that corresponds to the Reactome event (if one exists). A description of the GO term can be viewed via the GOID which links out to the QuickGO Gene Ontology browser.

Preceding event(s): a list of events that occur immediately before the event being viewed.

Following event(s): a list of events that occur immediately after the event being viewed.

Cellular compartment: the location within the cell where the event occurs.

References: a list of supporting references, each of which hyperlinks to its PubMed abstract (when applicable).

Taxon: the name of the species in which the event occurs.

This event is deduced on the basis of event(s) in other organism(s): this indicates that the event has not been experimentally demonstrated in humans, but has been inferred on the basis of data acquired for another species. The species in which the event has been demonstrated is listed here.

Equivalent event(s) in other organism(s): this provides links to events in other species that are either confirmed to occur in a very similar way in both species, or have been inferred by electronic annotation. Electronically inferred events point back to the original event with the link "This event is deduced on the basis of event(s) in other organism(s)". They also have the 'evidenceCode' slot filled in with 'inferred by electronic annotation' (view in instancebrowser).

Participating molecules: a list of all molecules that participate in the selected event as well as all of its sub-events.

For example:

Participating Molecules

The list of participating molecules for the event "mRNA 3'-end processing" includes those molecules involved in the component events: "Cleavage of mRNA at the 3'-end" and "mRNA polyadenylation". A participating molecule may be either a simple molecule or a complex and it may function in a reaction as input, output, catalyst, or regulator. If the participating molecule is a protein, direct links (if available) are provided to its corresponding UniProt, ENSEMBL, Entrez Gene, KEGG gene, OMIM, UCSC Genome Browser, and REFSEQ entries through the Image:U_links.gif, Image:E_links.gif, Image:G_links.gif, Image:K_links.gif, Image:OMIM_M_links.gif, Image:UCSC_U.gif and Image:Refseq_R.gif hyperlinks, respectively, which are found adjacent to the participating molecule name.

In the case of small molecules, links are provided to ChEBI and KEGG COMPOUND and appear as Image:ChEBI_C_links.gif and Image:C_links.gif hyperlinks after the name of the "small molecule".

*Note: A detailed description of each Reactome molecule is displayed on an independent page that is linked to the molecule name.

The Instancebrowser view format

This is a purely tabular display that describes the selected event or molecule as a database instance and provides the values of each of its attributes. For a description of each attribute, see the Reactome schema. The instancebrowser view of the reaction "Notch 1 heterodimer binds with a Notch ligand in the extracellular space" is shown here:
Instance Browser

Changing the page viewing format

A small menu bar found at the bottom of every page provides the option to [Change default viewing format]. Clicking here displays the following dialog box:

View selector

This allows the user to select the page format of the displayed information.

Viewing/exporting Reactome events in other formats

It is possible to export any Reactome Event in a number of different formats. A small menu bar found at the bottom of every event page lists the formats available and is described below.

Image:download_formats.gif

SBML and BioPAX are exchange formats of interest to bioinformaticians. PDF is the familiar document format, providing you with a convenient "document" of the pathway. The "List" chooser gives you the possibility to dump all of the protein or compound IDs for this pathway.

More comprehensive sets of Reactome data and tools are also available on the download page. This includes the complete Reactome textbook of biological processes in PDF or RTF format, the complete set of human reactions in Reactome (in SBML level 2 or BioPAX level 2 format), and a list of human protein-protein interaction pairs. For more information about SBML, click here. For more information about BioPAX click here.

Searching Reactome

Simple searches

The simple search window for Reactome is located just below the menu bar on each Reactome page.

The query term/phrase is entered in the central (empty) window of this search bar. For more information on search modes, click here. The species can be selected through the second menu.

Simple Search
Simple Search and results

A search for human Cdc6 results in 307 hits in the four different categories (Pathways, Proteins, Reactions and Others). Each result is preceded by the category it belongs to. For Others, this could be literature references, complexes, inhibitions, activations or anything else not covered by the first three categories. Only 10 hits are displayed per page, and at the bottom of the page is a navigation tool that allows you to jump to other pages of hits. You can also display all the results on a single page by clicking Show all results.

Next to each category is a number in brackets to indicate the number of hits of that type found. For example, there were 16 proteins found from the total of 307 hits. Each category is next to a checkbox. Subsets of the results can be displayed if some of these categories are not required. Simply untick the boxes that are not required and click the Show button to the right of these categories. The results will reload to display only those categories you wish to view.

Each result returned is clickable and will take you to the appropriate Reactome page.

For more Simple search examples, see extended searching below.

Extended search

The Extended search form can be accessed via the Extended search button located in the main menu bar on all pages. This search method allows more specific schema-based queries for particular types of Reactome data. Notably, this option allows searching for records (instances) in the database by multiple field (attribute) values. Queries are combined together with AND. For example, a query to retrieve all reactions which consume ADP and produce ATP would be formulated by selecting class Reaction, then selecting field name input and entering ADP into the query box, then selecting field name output on the next row and entering ATP. There are several possible search modes. For more information on search modes, click here.

Extended Search
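The way Extended search combines field queries with AND can be mimicked on a small in-memory list. The Python sketch below is only an illustration of that logic: the `extended_search` function, the record layout and the "toy" records are invented for this example, and it uses simple exact matching rather than any of the real search modes.

```python
def extended_search(records, instance_class, criteria):
    """Return records of the given class that satisfy every (field, value) pair."""
    hits = []
    for record in records:
        if record["class"] != instance_class:
            continue
        # AND semantics: all criteria must match for the record to be a hit.
        if all(value in record.get(field, []) for field, value in criteria):
            hits.append(record)
    return hits


# Made-up records standing in for database instances.
records = [
    {"class": "Reaction", "name": "toy reaction 1", "input": ["ADP"], "output": ["ATP"]},
    {"class": "Reaction", "name": "toy reaction 2", "input": ["ATP"], "output": ["ADP"]},
    {"class": "Pathway", "name": "toy pathway", "input": ["ADP"], "output": ["ATP"]},
]

# Class Reaction, field "input" = ADP AND field "output" = ATP.
hits = extended_search(records, "Reaction", [("input", "ADP"), ("output", "ATP")])
print([hit["name"] for hit in hits])
```

Adding a further row to the form corresponds to appending another (field, value) pair to `criteria`, which can only narrow the result.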

Search examples

How can I find out more about my favorite process in Reactome?

In the simple search bar at the top of the page, select "process" in the first drop-down menu. Then choose "with exact phrase" in the second menu if the name of the process is widely accepted and uniformly used in the literature; otherwise, choose a less stringent search mode such as "with all words". Next, enter the name of the process of interest, such as "mRNA capping", into the text slot, select the species and hit GO!

In this case, a single pathway, "mRNA capping" matched the query phrase. Thus, the Event page for this pathway is displayed as the search result (figure shows only the top part of this page).

If this searching approach fails, try to identify a Gene Ontology term that corresponds to the process that you are interested in and then use this term to search for Reactome events that may be cross-referenced to this GO term as described below.

How can I find out more about my favorite protein in Reactome?

In the Simple search bar at the top of the page, select "molecule" in the first drop-down menu and choose "with exact phrase" in the second menu. This will search through the primary names and synonyms of all molecules and complexes. If this doesn't hit the protein of interest, see search tips below.

How can I use a GO Biological Process term to search Reactome?

In the Extended Search form, select "GO_BiologicalProcess" in the "Restrict search to a class" slot, "display name" in the field name window and the name of the GO term in the value slot. Use the "exact match" mode if you have the exact GO term name. To search GO for the term of interest, use the AmiGO browser at the Gene Ontology website.

What written information does Reactome have involving my topic/protein of interest?

In the Simple search bar at the top of the page, select "summation" in the first drop-down menu. Then choose "with exact phrase" in the second drop-down menu for a very specific search on that term or phrase. Alternatively, choose a less stringent search mode such as "with all words" to find hits that include all the query words in no particular order. Enter the term or phrase in the text entry window, select the species and hit GO!

Reactome Tools

The Reactome Pathfinder

The Pathfinder tool is used to identify or discover pathways that connect a given input and one or more output molecules or events. When multiple output molecules/events are designated, the shortest path is displayed. (See notes on requirements below). A link to the Pathfinder tool can be found in the main menu bar on each page.

For example, to search for a human pathway between the molecule G6PD and xylulose 5-phosphate, G6PD is entered as the input and the two reactions as outputs. Very common molecules are excluded to reduce the number of irrelevant pathways generated. These are listed as "non-connecting compounds". Additional molecules can be excluded by adding their names to the precompiled list. Hitting GO! will search Reactome for the start and end compounds/events entered.

A drop-down list of hits to the entered names is provided. Here G6PD dimer is selected as the input and the entry "D-ribulose 5-phosphate (cytosol)" is selected as an output. The search is initiated by clicking on GO!

If a path is found, a list of the events and molecules connecting the selected inputs is shown at the bottom of the page. The path between the entered input molecules is displayed in the reaction map. Reactions are highlighted in red and the small molecules in green.

*Note: The graphical display of the path requires your browser to have a Java (version 1.3) plugin and does not work with Netscape 4.x.
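The idea of finding the shortest chain of reactions between an input and one of several outputs, while ignoring very common "non-connecting compounds", can be sketched as a breadth-first search. The Python code below is an illustration of that general technique and makes no claim about how Pathfinder is implemented; the function `find_path`, the data layout and the toy reactions r1 to r4 are all assumptions made for this example.

```python
from collections import deque


def find_path(reactions, start, targets, non_connecting=frozenset()):
    """Breadth-first search from a start molecule to the nearest target molecule.

    reactions maps a reaction name to a pair (inputs, outputs) of molecule sets.
    Returns the reaction names along a shortest path, or None if no target
    can be reached.
    """
    targets = set(targets)
    if start in targets:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        molecule, path = queue.popleft()
        for name, (inputs, outputs) in reactions.items():
            if molecule not in inputs:
                continue
            for product in outputs:
                # Skip excluded common compounds and molecules already visited.
                if product in non_connecting or product in seen:
                    continue
                if product in targets:
                    return path + [name]
                seen.add(product)
                queue.append((product, path + [name]))
    return None


# Toy network: A -> B -> C -> D, plus a shortcut through the common compound ADP.
toy = {
    "r1": ({"A", "ATP"}, {"B", "ADP"}),
    "r2": ({"B"}, {"C"}),
    "r3": ({"C"}, {"D"}),
    "r4": ({"ADP"}, {"D"}),
}
print(find_path(toy, "A", {"D"}))
print(find_path(toy, "A", {"D"}, non_connecting={"ADP"}))
```

In the toy network, the shortcut through ADP gives a two-reaction path that says little about how A is converted to D; excluding ADP yields the three-reaction route instead, which is the motivation for the list of non-connecting compounds.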

The Reactome Skypainter

Skypainter is a tool to determine which events (reactions and/or pathways) are statistically overrepresented in a set of genes, as specified by a submitted list of identifiers. In other words, given a list of genes, Skypainter can identify common events for these genes.

Suppose that M genes participate in an event, that Reactome has data for a total of N genes (for the given species), and that X of the K genes in the submitted list participate in the given event. Skypainter calculates (by performing the hypergeometric test) the probability of picking X or more genes involved in the given event from a set of K purely by chance. Hence a low probability suggests that participation in a given event is what the genes in the submitted list have in common. Note, however, that the probabilities as reported by Skypainter are not corrected for multiple testing arising from evaluating the submitted list of genes against every event for the given species.
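For readers who want to check such a probability by hand, the hypergeometric tail can be computed in a few lines. This is a minimal sketch using the symbols N, M, K and X from the paragraph above; the function names are made up here, and the Bonferroni helper is one common, conservative way to adjust for multiple testing yourself, not something Skypainter is stated to do.

```python
from math import comb

def overrepresentation_p(N, M, K, X):
    """Probability that at least X of K randomly drawn genes (out of N)
    fall among the M genes participating in the event."""
    upper = min(K, M)
    total = comb(N, K)
    tail = sum(comb(M, i) * comb(N - M, K - i) for i in range(X, upper + 1))
    return tail / total

def bonferroni(p, n_events):
    """Conservative correction: multiply by the number of events tested, capped at 1."""
    return min(1.0, p * n_events)

# Small made-up numbers: 10 genes known, 3 in the event, 2 submitted, 1 hit.
p = overrepresentation_p(N=10, M=3, K=2, X=1)
print(p)                 # 24/45
print(bonferroni(p, 5))  # capped at 1.0
```

With realistic inputs N is in the thousands, and `math.comb` (Python 3.8 or later) handles the large integers exactly before the final division.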

Identifiers which can be used are UniProt accession numbers and ids, GenBank/EMBL/DDBJ protein ids, RefPep, RefSeq, EntrezGene, MIM, Affymetrix and Ensembl protein, transcript and gene identifiers. All purely numeric identifiers, such as those from MIM and EntrezGene, must have the abbreviated database name and a colon prepended to them, e.g. MIM:602544, EntrezGene:55718.
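If your identifiers come from a spreadsheet as bare numbers, a small helper can add the prefix before you paste the list in. This is a minimal sketch; the function name is made up, and the database abbreviation must be supplied by you, since a bare number does not say which database it came from.

```python
def prefix_numeric(identifier, database):
    """Prepend 'database:' to a purely numeric identifier; leave others unchanged."""
    identifier = identifier.strip()
    if identifier.isascii() and identifier.isdigit():
        return f"{database}:{identifier}"
    return identifier

entrez_ids = ["55718", " 1017 ", "EntrezGene:595"]
print("\n".join(prefix_numeric(i, "EntrezGene") for i in entrez_ids))
```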

Each reaction arrow on the reaction map is coloured according to the number of genes in the submitted list participating in that reaction.

Similar functionality, i.e. colouring reactions according to the number of times the reaction is hit by the identifiers in the submitted list, is also available for small molecule identifiers from ChEBI and KEGG COMPOUND (e.g. ChEBI:2359, C00002), Enzyme Commission (EC) numbers (e.g. 1.1.1.1) and Gene Ontology (GO) accession numbers (e.g. GO:0004672). GO cellular component accession numbers can be used to highlight reactions involving molecules whose compartment has the given accession number, or molecules which themselves correspond to the given GO cellular component term. GO molecular function accession numbers highlight reactions which are catalysed by the activities they specify. GO biological process accession numbers highlight reactions which either correspond to, or are components of pathways corresponding to, the given GO biological processes. Note that the overrepresentation analysis is not performed for those identifiers.

If the identifiers are followed (separated by space or tab) by a numeric value, the colouring will be done according to the average of the numeric values of all identifiers linked to the reaction. A time series can be displayed as an animation by providing multiple values (on the same line, separated by a single space or tab) for each identifier. This feature can be used, for example, to produce a "movie" on the basis of a microarray expression time series. Note that the overrepresentation analysis is not performed in this case.
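The input layout and the averaging described above can be sketched as follows. This is an illustration only: the identifiers, the reaction names and the identifier-to-reaction mapping are invented, and the real mapping lives inside Reactome.

```python
def parse_value_lines(text):
    """Parse 'identifier value [value ...]' lines into {identifier: [floats]}."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and identifiers without values
        values[fields[0]] = [float(v) for v in fields[1:]]
    return values

def average_per_reaction(values, reaction_to_ids):
    """Average, per time point, the values of all identifiers linked to each reaction."""
    averages = {}
    for reaction, ids in reaction_to_ids.items():
        series = [values[i] for i in ids if i in values]
        if series:
            averages[reaction] = [sum(col) / len(col) for col in zip(*series)]
    return averages

# Two hypothetical identifiers, each with two time points.
submitted = "ID_1\t1.0\t2.0\nID_2 3.0 4.0\n"
# Hypothetical mapping from reactions to the identifiers linked to them.
mapping = {"reaction_a": ["ID_1", "ID_2"], "reaction_b": ["ID_2"], "reaction_c": ["ID_3"]}

print(average_per_reaction(parse_value_lines(submitted), mapping))
```

Each position in the resulting lists corresponds to one frame of the animation; a reaction with no submitted identifiers gets no value.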

To see an example of the Reactome reaction map painted using a test set of identifiers, click here. To see an example of the reaction map painted using a test set of identifiers with values (producing a "movie"), click here.

We also have documentation providing detailed technical information on linking from your own web pages to the SkyPainter, e.g. for displaying expression, proteomics or metabolomics data projected onto Reactome pathways.

Reactome Mart

Introduction

Mart is a tool that allows you to perform fast bulk queries against Reactome and a number of other databases (e.g. UniProt, ENSEMBL). You can link queries together, so that the results contain information from more than one database. E.g. you can find the Affymetrix IDs associated with the genes in selected Reactome pathways by linking a Reactome query to an ENSEMBL query.

To access Mart, click on the button labelled "Mart" on the navigation bar.

There are two ways to use Mart:

Canned Queries

Reactome provides a small set of canned queries. You can use these without needing to understand the details of the BioMart query interface.

The canned query selector allows you to choose a canned query. Once you have done so, clicking on the button "Go!" takes you to the page where you enter your data. A detailed description of these queries is available here.

The data entry page will be different, depending on whether only a single data item is allowed or whether multiple data items are allowed. If only a single item is allowed, then you will be presented with a selector to choose the item, e.g. species.

If multiple data items are allowed, you will get a text area, which you can use to enter the items separated by newlines, e.g. a set of UniProt IDs.

In this case you will also see some extra buttons. Above the text field will be the button "Show example". Pressing this causes example values to be loaded into the text area, which you can use for guidance or testing purposes. Below the text field is the "Browse..." button, which allows you to choose a file from your local computer to upload as data. By default, the contents of this file will not be displayed, but you can examine (and edit) it by clicking on the button "Preview file content".

Once you have your data, you can click on "Run query" to get the results. Additionally, there is a button "Reset", which clears the page and allows you to start again, if you wish. If you do not enter any data, then the query will be performed over all known data items.

Once the query has been performed, the results are presented in a regular BioMart results page. This allows you to export the results as tab-separated values or as an Excel file, and additionally, to perform more complex queries. See below for more details.

The following queries are currently available:

  • Find list of pathways for specific species (single data item). You can use this to list all pathways known to Reactome for a species of your choice.
  • Find list of reactions for specific pathways (multiple data items). If you have a list of Reactome stable identifiers for pathways that interest you, you can use this canned query to find all of the reactions involved in the pathways. If you use this query without submitting any data values, all reactions involved in all known pathways will be returned.
  • Find list of proteins for specific pathways (multiple data items). If you have a list of Reactome stable identifiers for pathways that interest you, you can use this canned query to find all of the proteins involved in the pathways. If you use this query without submitting any data values, all proteins involved in all known pathways will be returned. Proteins are characterized by their UniProt IDs.
  • Find list of complexes for specific proteins (multiple data items). If you have a list of protein UniProt IDs, you can use this canned query to find all of the complexes in Reactome involving those proteins. If you use this query without submitting any data values, all complexes and their associated proteins will be returned.
  • Find list of pathways for specific genes (multiple data items). If you have a list of Entrez gene IDs, you can use this canned query to find all of the pathways in Reactome involving those genes. If you use this query without submitting any data values, all pathways and their associated genes will be returned.
  • Find list of genes for specific pathways (multiple data items). This is the converse of the previous query: if you have a list of Reactome stable identifiers for pathways that interest you, you can use this canned query to find all of the genes involved in the pathways. If you use this query without submitting any data values, all genes involved in all known pathways will be returned. Genes are characterized by their Entrez gene IDs.
  • Find list of reactions for specific genes (multiple data items). If you have a list of Entrez gene IDs, you can use this canned query to find all of the reactions in Reactome involving those genes. If you use this query without submitting any data values, all reactions and their associated genes will be returned.

Regular BioMart Query Interface

The regular BioMart query interface is situated directly below the canned query selector. It is powered by the BioMart engine and you can find full documentation at www.biomart.org. Additionally, you will find a very good quick introduction to using the interactive query interface by clicking on the link "Mini Tutoral" that appears when you first go to the Mart page.

On the right hand side of the page, you can select database and dataset. By default, the database is "REACTOME" and the dataset is "complex". There are a number of other databases available, currently UniProt and ENSEMBL homo sapiens genes.

Reactome provides three datasets, "complex", "pathway" and "reaction". Choose the one most appropriate to the kind of query you want to make. E.g. if you would like to find all pathways associated with a given GO accession, start by selecting the "pathway" dataset.

The attributes to be displayed in the results table can be selected on the left hand side of the page. They are split into several different categories, which are largely the same for all datasets. The first category contains attributes that are taken directly from the dataset itself. E.g. for "complex", you will find the stable ID, along with the associated complex name and species name. Subsequent categories may include other Reactome classes you can link to, plus in all cases, DNA, protein and small molecule. This means, for example, that you can show all of the proteins and small molecules associated with the reactions that interest you, assuming you have initially selected the "reaction" dataset.

Filters are also split into categories, but in a different way to the attributes. The first one is labelled "Limit to ... containing these IDs". This allows you to enter a list of IDs that will restrict the results returned by the query. E.g. if you have selected the "complex" dataset, you can supply a list of UniProt IDs. This returns only those complexes containing the proteins corresponding to the UniProt IDs.

The next filter is labelled simply "Species". If you do not use this filter, then the results will contain information from all species known to Reactome. If you select a species, then the results will be restricted to the (single) chosen species.

For the "pathway" and "reaction" datasets, the next filter will be labelled "GO accession". This allows you to enter a GO biological process accession number, and restrict the results to reactions or pathways containing that accession.

Finally, the "Miscellaneous" filter allows you to restrict either by version number or name. Version number means the stable ID version. This is a number that gets incremented every time something gets changed by a curator. E.g. if you are only interested in things that have never been changed, you could set the contents of this slot to "1". If you want to restrict by name, you should note that you need to enter the full name.

You can use the second "Dataset" link in the left hand panel to choose another dataset to link to. E.g. if you want to find the Affymetrix IDs associated with a set of pathways, select "pathway" as your first dataset, then select "[ensembl] Homo sapiens genes (NCBI36)" as your second dataset. In the second dataset, click on "Attributes" in the leftmost panel, and expand the attribute category "GENE" by clicking on the "+" symbol. Scroll down to "Microarry Attributes" and select the Affymetrix one most suitable to your needs, e.g. "AFFY HUGENEFL".

There is a little pitfall that you will need to be aware of when you link from Reactome to other datasets. Let's take linking pathways from Reactome to ENSEMBL transcripts as an example. On the Reactome side, there is gene and protein information associated with a pathway, and, if you wanted, you could select ENSEMBL gene ID or UniProt ID as Reactome attributes. However, if the database you are linking to also provides these attributes (ENSEMBL does) then you are strongly advised to select these attributes only in the linked-to database (ENSEMBL, in this case).

The reason is that you are making the link from the pathway and not from the gene or protein IDs. So, there will be no correspondence between, say, UniProt IDs coming from Reactome and transcript IDs coming from ENSEMBL.
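A toy join makes the pitfall concrete. The sketch below is not how BioMart works internally, and all pathway, protein and transcript names are invented; it only shows that when two tables are linked on the pathway alone, every protein from one side ends up next to every transcript from the other, whether or not they belong to the same gene.

```python
# Hypothetical rows: (pathway, protein) as selected on the Reactome side ...
reactome_rows = [("pathway_1", "UNIPROT_A"), ("pathway_1", "UNIPROT_B")]
# ... and (pathway, transcript) as selected on the ENSEMBL side.
ensembl_rows = [("pathway_1", "TRANSCRIPT_1"), ("pathway_1", "TRANSCRIPT_2")]

def link_on_pathway(left, right):
    """Join two row lists on the pathway column only."""
    return [(p, l, r) for p, l in left for q, r in right if p == q]

linked = link_on_pathway(reactome_rows, ensembl_rows)
for row in linked:
    print(row)
```

The output has four rows, including pairings such as UNIPROT_A with TRANSCRIPT_2, for which nothing in the data says they correspond. Taking both the protein and transcript attributes from the linked-to database avoids this.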

To run a regular BioMart query, click on the "Results" link.

Once you have run your query and have produced a results page, you have a number of options for viewing the information. By default, the format will be "HTML", and you will be presented with the first 10 lines of the results on the web page.

Using the selector labeled "Display maximum", you can increase the number of lines displayed by your browser to a maximum of 200. If you would like to see more lines, then you need to dump to a file, by clicking the "Go" button. Make sure you select the right output format before you do this. The default "HTML" might not be what you want. Other options are "TSV" (tab-separated values, generates a .txt file with columns separated by tabs) and "XLS" (Excel spreadsheet, generates a .xls file that you can display with Excel; it does not necessarily look good in OpenOffice).

If you have used other BioMart sites, you might be wondering why there is no "CSV" (comma-separated value) output format. This is because many of the values that are returned by Reactome, e.g. pathway names, contain commas, which would lead to confusing and unparsable output.
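If you want to process a TSV export in a script, Python's standard csv module can read it directly. The following is a minimal sketch; the column headers and the sample row are invented for illustration, so substitute the headers from your own export.

```python
import csv
import io

def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

# Hypothetical export: the pathway name itself contains a comma.
sample = "Pathway name\tStable ID\nExample pathway, part one\tEXAMPLE_ID_1\n"

rows = read_tsv(sample)
print(rows[0]["Pathway name"])

# Splitting the same data line on commas would cut the name in two.
naive = sample.splitlines()[1].split(",")
print(len(naive))
```

For a downloaded file, pass its contents to `read_tsv`, or open it with `newline=""` and give the file object to `csv.DictReader` with `delimiter="\t"`.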

Download Reactome

To download the Reactome data and code, please follow this link.