New Reactome Curator Guide
From ReactomeWiki
A Guide for Reactome Curators
Introduction
The goal of Reactome is to describe the known biochemical details of human biological pathways. A Reactome curator's job is to work with experts in different fields of biology to identify and curate suitable human pathways (see below) by breaking them down into subpathways and reactions and describing them in a format that is compatible with the Reactome data model. The purpose of this guide is to describe each step of that curation process so that the curator fully understands what is involved. The guide is also a useful reference for the experienced curator, as it is nearly impossible to remember all of the steps and details of the curation process. Curators are actively encouraged to add to sections of the guide, providing useful hints that may be missing, for the benefit of current and future curators. Regular use of the guide, and addition of material to it, will facilitate the process of curation and increase annotation consistency.
If in doubt about any of these steps...do not hesitate to e-mail the internal list for clarification (and then contribute any helpful info that you get to save someone else the trouble of asking later!)
This guide is divided into the following sections.
- FAQ ("How do I ....?") A list of common curation-related questions organized by keyword
- Choosing a topic
- Guidelines for naming entities and events
- How to use essential data model classes (using a curated example)
- Creating a "Reactome friendly" framework of your pathway module
- The essential QA process during curation
- Using the curator tool
- Drawing diagrams
- Preparing for database releases
Here we provide basic guidelines as to how data for relevant pathways should be collected, organized and entered into Reactome via the Curator Tool. The process of converting a biological topic should maintain the referential integrity of the reactions, both newly added as well as those already present in Reactome. It is assumed that you (the reader of the guide) have some familiarity with Reactome, and have read the Reactome papers. They are freely available as PDF files. Though this document is oriented toward the Reactome curator, there is much here for outside groups using Reactome. The entire dataset, website, curator, and author tools can be downloaded from the Reactome download page. Installation and configuration instructions are also provided here. If you have any problems with the install or tools please contact us at help@reactome.org.
FAQ or "How do I ...?"
A keyword-organized FAQ sheet for Reactome curation questions can be found here. Please add to this as you come up with solved examples for your own questions.
Choosing a topic/pathway for curation
Focus first on:
- Areas of biology that you know well (or one in which you have good contacts).
- Pathways that are well characterized at the molecular level AND for which there is considerable HUMAN biochemical data and proteins that are new to Reactome.
Points to consider:
- Give lower priority to topics that will require a large number of inferences from other species.
- Don't try to tackle huge pathways all at once.
- Break large topics into *manageable* pieces. Experts are generally unwilling to commit to a project bigger than a dozen reactions, and it is easy to get lost in the details and lose focus, which is costly in terms of time.
- Work on several (preferably related) small projects simultaneously. Experts often fail to come through at critical times so it is good to have a few projects at different stages to keep work flowing through the pipeline.
- Check the Editorial Calendar to ensure that your plans don't overlap with those of another curator. If in doubt, contact the curator.
- Check with Peter - he may be aware of planned work that is not recorded in the Editorial Calendar.
Understanding and using the data model
A detailed understanding of the Reactome data model is important for accurate and consistent curation. The best way to start learning about the different data classes and how to use them is to browse the data model glossary. Some classes are used infrequently but you still need to be aware of them. Some of the more important/confusing data classes and attributes are listed below along with some relevant use cases. Please add to this if you come across new and useful cases. All the classes have associated data fields, divided into several categories. These are described below, followed by some usage examples from curated Reactome pathways (mostly from Apoptosis right now). We will continue to add to this list (and provide more actual examples) and we encourage you to add any informative examples of your own.
Key Reactome data classes and relevant use cases
Incomplete.......
Data Classes
- ReactionLikeEvents
- Regulation
- Disease
- FragmentModification
Important attributes
Conceptual Approaches To Curating and other hints
Mandatory, Required, and Optional
All the classes of data in Reactome have a number of details that MUST be completed before they can be 'released', i.e. made visible to the public. Fields containing details have a symbol that indicates the type: either a red box with a letter superimposed, or a yellow diamond with n superimposed. The yellow diamond with n indicates fields that are automatically completed. The red boxes indicate, by letter:
- m - Mandatory - you must provide this information.
- r - Required - you must provide this information if it is possible - a good example of this is species, required for defined sets of proteins, but not required for defined sets of small molecules.
- o - Optional - not available for every instance, so cannot be 'Required' but in some circumstances may be a procedural requirement, e.g. inferredFrom must be completed for human reactions that are inferred from a model organism.
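The three field categories can be read as a simple pre-release check. The following is a minimal sketch in Python and is not part of the Curator Tool: the `FIELD_CATEGORY` mapping, the field names in it, and the dictionary representation of an instance are all assumptions made for illustration.

```python
# Hypothetical field categories for one class; the real schema defines these.
FIELD_CATEGORY = {
    "name": "m",          # Mandatory
    "species": "r",       # Required where it applies
    "inferredFrom": "o",  # Optional
}


def check_fields(instance, categories=FIELD_CATEGORY):
    """Return (errors, warnings) listing empty Mandatory and Required fields."""
    errors, warnings = [], []
    for field, category in categories.items():
        if instance.get(field):
            continue  # the field has a value, nothing to report
        if category == "m":
            errors.append(field)
        elif category == "r":
            warnings.append(field)
        # Optional fields are never reported here; procedural rules such as
        # inferredFrom for inferred reactions would need a separate check.
    return errors, warnings


errors, warnings = check_fields({"name": "", "species": None})
print("must fix:", errors)
print("fill in if possible:", warnings)
```

Treating an empty Required field as a warning rather than an error mirrors the "if it is possible" wording above.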
Events
The main concepts in the Reactome data model are Event and PhysicalEntity. Events are of two types: Pathways or Reactions. Pathways in Reactome are multi-step events, whereas Reactions are single-step events (at the molecular or atomic level).
A concrete curated example is the Reactome Apoptosis pathway which is broken down into the subpathways "Extrinsic pathway", "Intrinsic pathway", "Activation of Effector Caspases", "Execution phase" and "Regulation of Apoptosis".
Events may be linked to other Events that precede them, regulate or are regulated by them. Reactions contain PhysicalEntities that take part in the Event. At present there are two subclasses of Event: ReactionlikeEvent and Pathway. The ReactionlikeEvent subclass has 4 subclasses: BlackBoxEvent, Depolymerisation, Polymerisation, and Reaction.
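As a reading aid, the Event hierarchy described above can be mirrored as a small set of Python classes. This is only an illustrative sketch of the class names given in the text; it is not how the data model is implemented, and the helper `subclass_names` is hypothetical.

```python
class Event:
    """Base class for everything that happens in Reactome."""


class Pathway(Event):
    """A multi-step event."""


class ReactionlikeEvent(Event):
    """Parent of the four reaction-like classes."""


class BlackBoxEvent(ReactionlikeEvent): ...
class Depolymerisation(ReactionlikeEvent): ...
class Polymerisation(ReactionlikeEvent): ...
class Reaction(ReactionlikeEvent): ...


def subclass_names(cls):
    """Names of the direct subclasses of cls, sorted alphabetically."""
    return sorted(sub.__name__ for sub in cls.__subclasses__())


print(subclass_names(Event))
print(subclass_names(ReactionlikeEvent))
```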
Reaction
A Reaction is an event that converts inputs to outputs in a single step.
PhysicalEntities can be the inputs, outputs, catalysts, regulators, or requirements in Reactions. PhysicalEntities can be single entities, such as proteins, small molecules, RNA, DNA, carbohydrates, lipids, or sub-atomic particles. They can also be complexes consisting of a combination of any of the single entities, or polymers synthesized from the single entities. Related entities can be grouped into a set. *The use of sets is described below*.
Here are some common examples of different types of reactions and blackbox event use cases. Clicking on the blue highlighted class name will bring you to some additional usage hints and warnings.
Simple binding/complex formation reaction
Simple dissociation reaction
Post-translational modifications
ReactionLikeEvents
BlackBoxEvent
A BlackBoxEvent converts inputs to outputs in multiple steps that are not annotated because:
- We don't know all of the intervening steps.
- The intervening steps are known but we do not want to curate them for the purposes of the module. Two examples are:
- the transactivation of a gene leading to production of a protein....we don't want to include all the steps of translation of that protein.
- The degradation of a protein.
Guidelines for Naming Entities and Events
Curators should use the shortest possible accepted molecule names and modification abbreviations when naming entities. Here are some guidelines for the different classes of entities:
1. EWAS: HGNC name, or UniProt short name
2. Modified entities (e.g. post-translationally modified proteins)
3. Complexes
- Use accepted compact name preferentially
- If no compact name available, list component short names separated by colons
- Do NOT include the words "complex","dimer","monomer","associated with","bound to" in the complex name.
4. Sequence variants: see the Human Genome Variation Society description of sequence changes at the protein level.
5. Large deletions: see Annotation of Large Deletions, Insertions and Protein Fusions.
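The naming rules for complexes lend themselves to a quick automated check. The sketch below is a hypothetical helper, not a Curator Tool feature: it builds a colon-separated name when no compact name is supplied and reports any of the discouraged words found in a name. The component names `GENE1` and `GENE2` are placeholders.

```python
# Words and phrases the guideline says must not appear in a complex name.
FORBIDDEN = ("complex", "dimer", "monomer", "associated with", "bound to")


def complex_name(component_names, compact_name=None):
    """Prefer an accepted compact name; otherwise join short names with colons."""
    return compact_name or ":".join(component_names)


def forbidden_words(name):
    """Return the forbidden words or phrases that occur in name (case-insensitive)."""
    lowered = name.lower()
    return [word for word in FORBIDDEN if word in lowered]


print(complex_name(["GENE1", "GENE2"]))
print(forbidden_words("GENE1 bound to GENE2"))
```

The substring match is deliberately broad, so a name such as "heterodimer" is also flagged; whether that is desirable is a judgment call for the curator.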
Annotating a disease process
Structuring the disease pathway
New pathways that describe disease processes should be placed under the "disease" chapter.
When you are creating a disease pathway that has/will have a "normal" counterpart pathway in Reactome, the pathway should be structured as it is in the following example:
- Signaling by EGFR in cancer
- Signaling by EGFR
- Signaling by constitutively active EGFR
The disease pathway should have two sub-pathways only, one for normal counterpart, and another for grouping all disease related events (pathways or reactions).
Assigning Disease term attributes
To pathways:
Please be sure to label the top-level disease pathway and its sub-level disease pathway with a disease attribute. You can browse the hierarchy here and if you don't find the term that you are looking for in gk_central you can create a new one by opting to create a new instance. Enter the DOID identifier (number only) in the identifier slot and you will be asked if you want to import the entry. Say yes.
To reactions:
A disease attribute should be added to all reactions involving disease-related physical entities, such as proteins of bacterial/viral/fungal pathogens, mutant human proteins and drugs used in disease management. Use the same procedure as the one described for adding disease attributes to pathways. If several related disease tags are applicable to reactions and pathways, using the most general tag is preferable, and even if specific disease attributes are added, always include the most general attribute for a given disease type. For example, for cancer-related reactions and pathways, always include the cancer tag.
To entities:
Disease attributes should be added to mutant proteins associated with disease and may be very specific, referring to the specific disease type(s) in which a particular mutation was found. For example, EGFR L861Q mutant (DB_ID 1177542), in which L-leucine at position 861 is replaced with L-glutamine has been detected in non-small cell lung carcinoma and adult glioblastoma multiforme. Besides these specific disease tags, it may be advisable to also add more general disease tags to an EWAS, in this case lung cancer and cancer, to enable search of mutant EWASs using these general terms. When it comes to entity sets, such as EGFR KD mutants (DB_ID 1182966) which includes all kinase domain mutants of EGFR in cancer, a very general disease tag, such as cancer, is appropriate. This is because each member of the set has its own range of cancer types (while EGFR L861Q is found in lung cancer and glioblastoma, EGFR L858R is found in lung cancer, thymoma, thyroid cancer, breast cancer and ovarian cancer), but their biological behavior is identical/similar. For the same reasons, only the general cancer disease tag should be used for events in which cancer disease entities participate.
When annotating drugs used to treat a particular disease, curators should add an appropriate disease tag to a drug entity. For a cancer drug, the general cancer disease attribute should definitely be added, but when it comes to more specific disease tags, it may be advisable to add only those cancer types for which the treatment by a given drug is approved.
Associating normal and disease pathways with the same diagram
For the disease pathway to share a diagram with the normal pathway, you must add the disease pathway as a value in the representedPathway slot of the normal pathway diagram. For the example above, you would create a diagram for the normal pathway "Signaling by EGFR" (called "Diagram of Signaling by EGFR"). To share this diagram with the disease pathway, you would add "Signaling by constitutively active EGFR" as the second value in the representedPathway slot of the diagram "Diagram of Signaling by EGFR". Disease pathways will show normal events and entities as a shaded background, while disease events and entities should be emphasized by red lines.
Disease Pathways without a corresponding "normal" pathway in Reactome
If the disease pathway will not have a corresponding "normal pathway" in Reactome, the above organization does not apply, but the pathway should still be placed under the disease chapter and a disease term should be applied. If a suitable term cannot be found, please ask about submitting a new term suggestion to the EBI. Red lines should be used to emphasize disease entities and events in other disease pathways, e.g. viral proteins and their reactions with human proteins. Highlight any reaction that has to do with disease progression and any entity that is from another species. Host entities should be left black, but complexes that contain both host and other-species components should be colored red. Coloring host entities red would be misleading, even if the host proteins are hijacked into doing something that has to do with disease progression. In these cases the reaction lines are red, but the host entities are not highlighted.
Using the curator tool
Downloading, Installing, And Maintaining The Curator Tool
Instructions for downloading, installing, and maintaining the Curator Tool can be found here.
Curator tool user interface
The curator tool has three panes to view the content. These different views can be switched using the tabs in the upper left hand corner of the curator tool window.
The Menu Bar
The menu bar contains a number of shortcut buttons. Hovering over a button displays a message describing its function.
The Schema View
The Schema view provides a hierarchical list of the data classes. In this view you can search, open, edit and create instances of these classes. This panel is used most often for creating new items, such as an EWAS derived from a ReferenceGeneProduct.
The Event View
This view is where most curation is done. The event view displays a hierarchical, alphabetical list of the events in the local project on the left, a graphic in the middle that shows the relationship between objects for selected events, and on the right the details of selected events. Unfurling a pathway by clicking on the + symbol to the left of its name in this view reveals all of its component Reaction events. DON'T CONFUSE these symbols with the check box: clicking on an unchecked box marks the event as ready for release onto the public website; clicking on a checked box has the opposite effect. For pathways the check box selects the ENTIRE pathway; doing this by mistake can be time-consuming and tedious to reverse, so be warned!
The Entity Level View (ELV)
The ELV is where pathway diagrams are created and subsequently represented. In this view the pathway is laid out graphically, showing the inputs, outputs, catalysts where appropriate, and regulatory molecules, each in the correct cellular compartment.
Synchronizing local projects with the database
Synchronize at least once a day. It is good practice to synchronize your local project with the gk_central database to ensure that your work is not lost should the local copy become corrupted or lost due to hard drive failure. It is useful for other curators who may be able to reuse EWASes or other instances that you have created. It is perfectly acceptable to 'check-in' incomplete work unless it is marked as 'doRelease'.
To keep the data in the opened project and the database repository
consistent, you need to synchronize the local project with the database
periodically. Several actions are available for synchronizing.
The following is the "Database" menu for doing database-related actions:
- "Match Instance in DB..." finds a matching instance in the database repository for the instance selected in the local repository. A matching instance has the same defining attributes as the local one. If a match is found, you can merge the local instance into the database one.
- "Compare Instance in DB..." compares the selected instance with the corresponding one in the database. Corresponding means having the same DB_ID, so you should not compare an instance checked out from database A with an instance in database B, because the same DB_ID might be assigned to different instances.
- "Update from DB" updates a checked-out instance from the database.
- "Check In" checks newly created or modified instances in to the database.
It is recommended that you use "Compare Instance in DB..." before "Update from DB" or "Check In". After a new or modified instance has been checked in to the database, the marker ">" is removed to indicate that the local and database copies are the same.
The "Synchronize with DB..." menu is used to check all instances in
the selected schema class in the schema view or in the whole opened
project if no class is selected. There are four possible inconsistency
categories between the two repositories: instances different between the
local and the db repositories resulting from instance modification
either locally or in the database, instances created locally, instances
deleted in the database by you or others, and instances deleted locally
by you. You can choose appropriate action for the selected instances in
different categories. Please be aware that if you do a multiple
selection from different categories, the enabled actions are applied to
all selected categories. For example, actions "Update from DB" and
"Commit to DB" can be applied to instance "AIF-mediated response" in the
first category, but only "Commit to DB" can be applied to instance
"TestPathway" in the second category. If you select both "AIF-mediated
response" and "TestPathway", only "Commit to DB" is enabled and "Update
from DB" is disabled. Double-clicking an instance in the first category
pops up a dialog comparing the local and database instances, while
double-clicking an instance in the other categories shows the contents
of the clicked instance. To deselect a single instance, hold the control
key and click the selected instance.
There are three different cases in the first category (instances
different between the local and the db repositories): the instance is
modified in the local project but not in the database, the instance is
modified in the database but not in the local project, or the instance is
modified in both. In the first case, the user can commit the changes to the
database or overwrite them by updating from the database. In the second
and third cases, the user cannot commit changes but can update from the
database.
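The rules about which actions are enabled for a mixed selection amount to taking the actions common to every selected instance. The sketch below restates that logic in Python; it is an illustration of the behaviour described here, not code from the Curator Tool. The state labels (`changed_locally` and so on) are invented for this example, and the two "deleted" categories are left out because the text does not say which actions apply to them.

```python
# States described in the text, mapped to the actions it says are available.
ALLOWED_ACTIONS = {
    "changed_locally": {"Commit to DB", "Update from DB"},
    "changed_in_db": {"Update from DB"},
    "changed_in_both": {"Update from DB"},
    "created_locally": {"Commit to DB"},
}


def enabled_actions(selected_states):
    """Actions enabled for a selection: those allowed for every selected state."""
    action_sets = [ALLOWED_ACTIONS[state] for state in selected_states]
    return set.intersection(*action_sets) if action_sets else set()


# "AIF-mediated response" (changed locally) selected with "TestPathway" (new).
print(enabled_actions(["changed_locally", "created_locally"]))
```

With the two instances from the example above selected together, only "Commit to DB" remains, which matches the behaviour described.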
The following is the list of icons used in the synchronization
dialog:
What do I do if the curator tool flags instances as being duplicated in gk_central?
You will need to evaluate each instance that is flagged as being a duplicate and compare it to the instance that is flagged in gk_central.
This is easiest to do before synchronizing your project with the database, as described below:
Search the local project for all instances whose DB_ID contains ‘-’ (i.e. instances that have not yet been committed to gk_central), then highlight each one in turn, right-click to open a pop-up menu, and choose “Match Instance in DB...”. Any results returned by that query are already-existing instances that, by the rules of gk_central, are duplicated by the new local instance. Mostly, that’s right and you should accept the option offered by the form, to replace the local instance in the local project with the already-existing one from gk_central. That cleanly gets rid of the duplication and preserves all references correctly in the local project and in gk_central. Sometimes (rarely) the form is wrong because, despite identical defining attributes, two instances are genuinely different, e.g., if a person instance has already been created for J(ohn) Doe and you have now created one for J(ane) Doe. In this case, you should refuse the offer made by the form and should make a note to force the new instance into gk_central at synchronize-and-commit time.
The annotation process
Precuration
1. Create a basic outline of pathway: Regulation of Apoptosis as an example. Further description of the annotation process will focus on the subpathway highlighted below in yellow.
2. Flesh out outline with information including: molecules, compartment, species, text summary, references (PMIDs)
3. Create a table of molecules participating in the pathway
- Each modified form of a protein as a separate entry
- Look up/enter corresponding UniProt identifiers
Data entry
Identify existing proteins/molecules
In many cases the proteins on your list will already exist in the database. You should make every effort to reuse existing instances wherever possible to avoid unnecessary and confusing duplications. Use the curator tool to search the ReferenceGeneProduct (RGP) class using Uniprot identifiers as follows:
Searching the database
Choose Class: ReferenceGeneProduct
Choose attribute: identifier
Attribute value: Use REGEXP
Enter your identifier in the search box. You can enter several as a pipe-separated list (e.g. A1A4S6|O43293|P43146|P43146....).
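If your precuration table is long, the pipe-separated search string can be generated rather than typed. The following is a small convenience sketch in Python; the function name `accession_pattern` is hypothetical, and how the Curator Tool interprets the pattern (for example, whether it matches whole identifiers or substrings) is not covered here.

```python
def accession_pattern(accessions):
    """Join accessions into a pipe-separated search string, dropping repeats."""
    cleaned = (a.strip() for a in accessions)
    # dict.fromkeys keeps the first occurrence of each accession, in order.
    unique = list(dict.fromkeys(a for a in cleaned if a))
    return "|".join(unique)


pattern = accession_pattern(["A1A4S6", "O43293", "P43146", "P43146"])
print(pattern)  # A1A4S6|O43293|P43146
```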
Select the ReferenceGeneProduct (RGP) returned (if a list is returned, select them one at a time), right click and opt to "View referrers". The resulting Referrers dialog box lists referrers by property name. At the top of the list there may be isoforms of the protein, if present in UniProt. Do not use isoforms unless you are certain that only specific isoforms have the functionality you intend to represent. Isoforms may have their own referrers and you should check this: if someone took the trouble to create an isoform-specific EWAS, they probably had good reason to do so. Items listed with the property name referenceEntity are EntitiesWithAccessionedSequence (EWASs), a Reactome identifier for specific forms/locations of a protein. Often there will be more than one EWAS for a single RGP, because post-translationally modified forms of proteins and proteins in different cellular compartments each have a separate EWAS. If any of the listed EWASs corresponds to your needs, right click and opt to "Check Out" that referrer. If the correct cellular compartment or post-translationally modified form is not present, you can still check out an EWAS and later use the Curator Tool to clone it and modify it to your needs.
Create new proteins/molecules
Please see the guidelines on naming entities!
If you search for a RGP that does NOT have referrers in the database you will get this message:
In this case, you will need to create the EWAS.
To do this, first check out the RGP of interest into your local project. Then, in the curator tool, select RGP in the class list, then scroll or search for the RGP of interest. Right click and opt to create EWAS from RGP.
It will ask if you want to accept the end coordinates described by Uniprot. Only say yes if you can confirm them to be accurate. If the end coordinates are not certain, the convention is to represent the start as 1, and end as -1.
In the newly created EWAS, the ReferenceEntity will have a name and species entered by default. You can add an alternative name if one was specified by the author; do this by right-clicking on the existing name and selecting Add. You must define the compartment. To do this, right click in the compartment slot and select the correct compartment in your local repository. Select the compartment and hit OK. If you don't see the desired compartment in your local project, click the "Browse database" button to search in gk_central.
If the protein you want to represent with an EWAS is post-translationally modified, this is represented by completing the modified residue slot of the EWAS. For example, to indicate that the protein has a phospho-serine at residue 126, right click on the modified residue slot. You will be prompted to choose a modified residue instance from your local project, browse gk_central or create a new modified residue instance. Almost always you will want to create a new instance, as modified residue instances are specific to the RGP and residue position. Enter the UniProt identifier for the ReferenceGeneProduct in the ReferenceSequence slot, right click in the PsiMod slot to select a modification type from within the local repository or by searching the gk_central database, and hit OK. Finally, enter the residue number in the coordinate slot and hit OK.
The modified residue instance can now be applied to the EWAS by clicking ok.
The modified EWAS is shown below.
If you are annotating a protein fragment, you can define the start and end coordinates of the fragment as shown below. Don't forget to change the default name to indicate that it is a fragment.
Creating a Complex
This is done in one of two ways. Either:
Go to the Schema view, select Complex, right-click and select Create Instance. The Create a New Instance dialog box appears. Enter a name in the Name field. Typically you would also enter the Compartment and Species, and identify the entities (EWASes, sets or complexes) that make up this complex using the hasComponent field. All of these fields are completed either by double-clicking to type, or by right clicking to select Add and identifying the correct item from the local project.
Or, by selecting the appropriate field in the details of an event or entity that contains a complex, right click and select Add. This will produce a 'Select Instance' dialog that preselects the allowed classes that can be added to that field. E.g. if you right click the Output field in a Reaction, the allowed classes include several types of set, complexes, polymers and EWAS. To create a new complex at this point, select Complex in the list of options on the left, and click the New button on the right. The process is then identical to that described above.
Creating a Set
Reactome has several types of set - refer to the Glossary and User Guide for definitions.
The most commonly used sets are Defined Sets and Candidate Sets.
Defined Set members should be proven equivalents, i.e. all of them have been demonstrated to perform the function that is described by the event they participate in.
Candidate Sets have two categories of inclusion: members, which are equivalent to Defined Set members, and candidates, which are not proven to be functionally equivalent but are believed to be equivalent based on phylogeny, domain structure etc.
Creating sets is a similar process for all subtypes: select the appropriate type in the Schema view, right-click and select Create Instance, fill in the Name and Species, then right click to add the set members.
- All members of a set must have the same compartment. The only time a set can have multiple compartment attributes is if its members themselves all have the same multiple compartment attributes, e.g., a set of membrane-spanning complexes with components explicitly located [on this side], in the membrane, and [on that side].
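The compartment rule for sets is easy to verify mechanically before check-in. The sketch below is a hypothetical stand-alone check, not a Curator Tool function: it assumes each member is described simply by its name and a list of compartment names, and the member names are placeholders.

```python
def compartments_consistent(members):
    """True if every member of the set has the same collection of compartments.

    members maps a member name to an iterable of compartment names.
    """
    distinct = {frozenset(compartments) for compartments in members.values()}
    return len(distinct) <= 1


ok_set = {"MEMBER1": ["cytosol"], "MEMBER2": ["cytosol"]}
bad_set = {"MEMBER1": ["cytosol"], "MEMBER2": ["nucleoplasm"]}
print(compartments_consistent(ok_set), compartments_consistent(bad_set))
```

Because whole collections of compartments are compared, a set of membrane-spanning complexes passes as long as all members list the same compartments.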
Creating a Pathway
Please see *note* below if you are adding a pathway that will be a new top level pathway.
Here is a description of how the outlined mini pathway "Regulation of activated PAK-2p34 by proteasome mediated degradation" is built from its component events in the curator tool. The pathway consists of 2 reactions: "Ubiquitination of PAK-2p34" and "Proteasome mediated degradation of PAK-2p34". For simplicity, the reactions have already been created (see the section on creating an inferred reaction for an example).
With the pathway class highlighted in the class hierarchy, right click and select Create instance, or use the create instance button in the menu bar at the top of tool.

After adding the pathway title, associate reactions as component events of the pathway by right clicking on the hasEvent slot and selecting "Add"

You can then select the events that you need one at a time or, as a shortcut, you can search for your newly created events in your project if they have not yet been submitted to gk_central (they will all have negative DB_ID attributes). To do this, search in your project for events with DB_ID containing - . From this list, you can hold the control key and select the events of interest.
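The negative DB_ID convention can be expressed as a one-line filter. This is an illustrative sketch only: the list-of-dictionaries representation and the DB_ID values shown are invented for the example and are not real identifiers.

```python
def uncommitted(instances):
    """Return the instances whose DB_ID is negative, i.e. not yet in gk_central."""
    return [inst for inst in instances if inst["DB_ID"] < 0]


# Hypothetical local project contents; the DB_ID values are made up.
project = [
    {"DB_ID": -3, "name": "Ubiquitination of PAK-2p34"},
    {"DB_ID": -4, "name": "Proteasome mediated degradation of PAK-2p34"},
    {"DB_ID": 12345, "name": "An event already in gk_central"},
]
for inst in uncommitted(project):
    print(inst["name"])
```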
The order of the events in a pathway is described through the use of the "precedingEvent" attribute on the reactions that are components of the pathway. If no preceding event is specified, the order of the events displayed on the webpage reflects the order in which they are listed as components in the pathway instance that you are creating.
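One way to picture how "precedingEvent" and the listed order interact is a stable ordering that respects precedingEvent links and otherwise keeps the hasEvent order. The sketch below is an assumption about how such an ordering could be computed, not a description of the website's actual code; in particular, the precedingEvent link between the two PAK-2p34 reactions is assumed here for illustration.

```python
def order_events(has_event, preceding):
    """Order events so that each one follows its preceding events.

    has_event: event names in the order listed in the pathway.
    preceding: maps an event name to the names of its preceding events.
    Events without constraints keep their listed order.
    """
    remaining = list(has_event)
    ordered = []
    while remaining:
        for event in remaining:
            # Only preceding events still waiting in this pathway matter.
            waiting = [p for p in preceding.get(event, [])
                       if p in remaining and p != event]
            if not waiting:
                break
        else:
            # Circular precedingEvent links: fall back to the listed order.
            event = remaining[0]
        ordered.append(event)
        remaining.remove(event)
    return ordered


listed = ["Proteasome mediated degradation of PAK-2p34",
          "Ubiquitination of PAK-2p34"]
links = {"Proteasome mediated degradation of PAK-2p34":
         ["Ubiquitination of PAK-2p34"]}
print(order_events(listed, links))
```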
Once the component events have been added, the remaining required attributes are added. If the pathway that you are describing corresponds to a GO biological process, right click on the goBiologicalProcess slot and select Set. Select the appropriate GO term from your local repository or gk_central. If you can't find the term of interest, ask for help.
- note: If you are creating a top level pathway (check with Peter if this is appropriate), it must be listed as a frontPage item. To do this, check out the frontPage instance from gk_central and add your pathway in the frontPageItem slot. Also please mark this in the Editorial Calendar as a front page item.
Creating a Reaction
Like all other classes, you can create a new reaction by selecting the Reaction class in the Schema view, right-click and select Create Instance, but it is perhaps better practice and more intuitive to create new reactions inside a pathway. To do this, select a pathway in the Event Hierarchical View, select the hasEvent property name in the details panel on the right, right-click and select Add. This leads to a dialogue for providing details of the reaction.
When creating a reaction, first enter the name:

This may be enough detail for the moment. If you are simply creating a placeholder, click OK.
To set the species of the reaction right click and select add in the species field.
If the species that you happen to be working with is not in your local project, you can opt to search for it in the database using the "Browse Database" button in the dialog box. When you set/change the species or the compartment of a reaction, the tool will ask if you want to propagate the species/compartment to all of the component molecules as well.
Use caution when selecting Yes. ONLY say yes here if you know that the event and its contained molecules have no referrers in the database that would be affected. (In other words, ALL other reactions, complexes or sets in the db that make use of these now "changed" molecules would be affected.)
Next add the compartment to the reaction. Right click in the compartment box and select add.
Again if you don't have the compartment you need in your project, you can Browse Database to find the one you need.
Now add the input and output molecules. Right click in the respective box and select "Add".

Select the molecule from the appropriate class:
Important: After adding your input and output molecules, it is important to verify that your reaction is balanced (all molecules represented as input are also present as output). See the QA section below for a description of how to do this.
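The idea of a balanced reaction can be illustrated by comparing what goes in with what comes out once complexes are broken down into their components. The sketch below is a simplified stand-in and not the procedure from the QA section: it assumes each simple entity is identified by a string (the identifiers `A` and `B` are placeholders) and each complex is a list of entities, and it ignores modifications and compartments.

```python
from collections import Counter


def flatten(entity):
    """Yield the leaf components of an entity.

    A leaf is a string identifier; a complex is a list of entities.
    """
    if isinstance(entity, str):
        yield entity
    else:
        for component in entity:
            yield from flatten(component)


def is_balanced(inputs, outputs):
    """True if inputs and outputs contain the same leaves the same number of times."""
    return Counter(flatten(inputs)) == Counter(flatten(outputs))


# Binding reaction: A + B -> A:B complex.
print(is_balanced(["A", "B"], [["A", "B"]]))
# Unbalanced: B is lost on the output side.
print(is_balanced(["A", "B"], [["A"]]))
```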
Add the literature reference(s). The references associated with a reaction MUST provide direct experimental evidence for the occurrence of that reaction in the species you are annotating (i.e. human for human Reactome). If there is no direct experimental evidence in human then you need to create an inferred human reaction as described in the section below. Enter the PMID for journal articles, and say yes when prompted to have the details filled in automatically. A description of how to add other types of references (books, URLs) will be added soon.
If you don't see the reference in your local repository then opt to Browse Database.
If you can't find the literature reference in the database either, then you need to create a new one:
Enter the PMID (number only). Click out of the PMID box. You will be asked if you want to import the PMID record information. Say yes.
You will now see the full record:
Look back in your local repository and you will see the new literature reference. You can now add this as a reference for your reaction.
Once you have added your references, you can add a text summary for the reaction in the "summation" slot. Right click and select add.
Then input your text and citations.

Attach the references cited in the text summary using the "literatureReference" slot of the summation, as described above for reactions.
Once the mandatory attributes have been filled, enter the remaining required attributes. Right click on edited to choose an instanceEdit from your project.
If you haven't created one previously, opt to create a new one by clicking on the New Instance button.
Then right click on author to select a person. If you don't see the one you want locally, browse the database.
The edited slot holds the name of the curator who should be credited with creating and editing the event, and the date it was edited. This slot is filled with an instanceEdit instance that contains this information.
A date time stamp is created automatically for that instanceEdit and clicking "OK" will add this instanceEdit to the edited slot in your reaction.
The authored and reviewed slots hold the instanceEdits describing the author/reviewer and the dates of authoring/reviewing respectively. These are created as described above for the "edited" slot. When the reaction is ready for release, the do_Release flag should be set to TRUE and the releaseDate slot should be filled with the appropriate release date.
Whenever you can define preceding events for an event, do so. Note that a preceding event for a reaction should always be a reaction or reaction-like event and not a pathway, and the preceding event for a pathway should be another pathway (when relevant) and not a reaction or reaction-like event.
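The preceding-event rule can be expressed as a simple consistency check. The sketch below is illustrative only: the `Event` class, its `kind` field and the function name are hypothetical and do not correspond to the Curator Tool's internals. Every reaction-like event is simply labelled "reaction".

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    name: str
    kind: str  # "reaction" (any reaction-like event) or "pathway"
    preceding: list = field(default_factory=list)

def preceding_event_problems(event):
    """Names of preceding events whose kind differs from the event's own kind."""
    return [p.name for p in event.preceding if p.kind != event.kind]

r1 = Event("reaction 1", "reaction")
p1 = Event("pathway 1", "pathway")
# A reaction that (incorrectly) lists a pathway as a preceding event.
r2 = Event("reaction 2", "reaction", preceding=[r1, p1])
print(preceding_event_problems(r2))
```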
Creating an inferred event
When constructing a human pathway, a curator will come across events that have no direct experimental evidence in humans, but have supporting experimental data from one or more other species. If experts in the field believe that the event can in fact occur in humans, the 'other species event' can be used to infer the human event. In the case illustrated below, a human reaction is inferred from in vitro experimental results using proteins from human and Oryctolagus cuniculus (rabbit).
Here a reaction "Proteosome mediated degradation of PAK-2p34" is created and the species Homo sapiens and Oryctolagus cuniculus are assigned. To avoid having a human and a non-human reaction with identical names, it can be useful to use capitalized forms of object names for the non-human reaction and all-uppercase names for human, e.g. Jak2 and JAK2 for the non-human and human proteins respectively. A text summary and the literature reference providing evidence for this mixed species reaction are added.
The rabbit protein "PAK-2p34" is selected as an input.
![]()
The human set "ubiquitin" is selected as an input...
![]()
...and the mixed species output complex "PAK-2p34" is selected as the output.
Note that this complex is not natural, it only occurs in vitro. To flag this the complex "isChimera" attribute is set to True.
![]()
Since the reaction is also multispecies and not a natural event, the isChimera flag is also set to true for the reaction.

Now that the reaction to be used for inference has been created, create the same reaction for human, using human participating molecules. Note, however, that the literature reference is not associated with this human event! Instead, in the "inferredFrom" field, right click and select "Add" to enter the non-human reaction used for inference.

Selecting the mixed species reaction created previously...
![]()
...now the link to the inferred reaction has been made.
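The relationships set up in this example can be summarised in a small data sketch. It assumes a hypothetical `Reaction` record with only the fields discussed above (species, literature references, isChimera, inferredFrom); the field names, reaction names and the placeholder reference string are illustrative, not the Reactome schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Reaction:
    name: str
    species: list
    literature_refs: list = field(default_factory=list)
    is_chimera: bool = False
    inferred_from: Optional["Reaction"] = None

def inference_problems(human):
    """Check the conventions described above for an inferred human reaction."""
    source = human.inferred_from
    if source is None:
        # A directly evidenced reaction must carry its own reference.
        return [] if human.literature_refs else ["no literature reference and no inferredFrom"]
    problems = []
    if human.literature_refs:
        problems.append("inferred reaction should not carry the literature reference")
    if not source.literature_refs:
        problems.append("source reaction lacks a literature reference")
    if len(source.species) > 1 and not source.is_chimera:
        problems.append("mixed-species source should have isChimera set")
    return problems

source = Reaction(
    "mixed-species in vitro reaction",
    ["Homo sapiens", "Oryctolagus cuniculus"],
    literature_refs=["<PMID of the supporting paper>"],  # placeholder, not a real ID
    is_chimera=True,
)
human = Reaction("human reaction", ["Homo sapiens"], inferred_from=source)
print(inference_problems(human))
```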

Adding a cross reference to a new database (not previously existing in Reactome) to an instance
If you want to add a crossReference attribute, you can do so as long as you create an instance for the linkable database.
Connecting generic and specific reactions
You may come across a situation where it's convenient to create a generic, all-encompassing reaction in which a set of proteins perform the same function. Specific proteins from this set may be used elsewhere in other pathways as specific, single reactions so we want a way to indicate the specific reaction is one reaction derived from the generic reaction. The way to indicate this is to use the "hasMember" property of the generic reaction.
An example is the ABCC family of transporters mediating organic anion transport across the plasma membrane. Three of the proteins from the set of proteins in this reaction are involved in three specific reactions elsewhere. To show there is a connection between them and this reaction, use the hasMember slot to indicate these three reactions are specific examples of this generic reaction (as shown below)
![]()
Creating a Catalyst
Reactions that involve a catalyst should include this information. Within the Reaction Details, the field is called catalystActivity. To complete this you need two things: the physicalEntity (the object that is acting as catalyst) and the activity of that object, defined as a GO molecular function. The physicalEntity will be a molecule, set or complex, probably in your local project. The GO molecular function can be identified by consulting UniProt: look at the ontologies section for GO Molecular Function. If none of the listed terms seems to be appropriate, either Browse Database for terms in gk_central, or use the OLS website at http://www.ebi.ac.uk/ontology-lookup/ to identify the correct term first. Use the most specific term possible.
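The two parts of a catalystActivity can be pictured as a small record. The sketch below is an assumption-laden illustration, not the Reactome schema: the class and field names are hypothetical, and the check only confirms that the activity looks like a GO accession ("GO:" followed by seven digits), not that the term is a molecular function or the most specific one available.

```python
import re
from dataclasses import dataclass

GO_ID = re.compile(r"GO:\d{7}")

@dataclass(frozen=True)
class CatalystActivity:
    physical_entity: str  # the molecule, set or complex acting as catalyst
    activity: str         # GO molecular function accession

    def __post_init__(self):
        if not self.physical_entity:
            raise ValueError("a catalyst needs a physicalEntity")
        if not GO_ID.fullmatch(self.activity):
            raise ValueError(f"not a GO accession: {self.activity!r}")

# GO:0016301 is "kinase activity"; in practice a more specific term is preferred.
# The entity name is a hypothetical label.
kinase = CatalystActivity("JAK2 [cytosol]", "GO:0016301")
print(kinase)
```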
Creating a new ChEBI entry
Using SMILES strings as input for the ChEBI submission tool
To use the submission tool, you must get a user name and password, and log in. Go here to do that. Once you have logged in, click "create a new submission" from the choices at the bottom of the page. That will cause a new line to appear in the table of "your active submissions" on that page. Click the "edit submission" option on that line to open the actual submission form.
Under the ‘Name And Structure’ section of the Submission tool, select ‘Edit Structure’.
Under the ‘Edit’ Menu, select ‘Import Name’.
An input box named ‘The Source – Name’ appears; this is where you paste your SMILES string.
In this example, the following string for 1-PP-IP5 is used: OP(O)(=O)O[C@H]1[C@H](OP(O)(O)=O)[C@@H](OP(O)(O)=O)[C@H](OP(O)(=O)OP(O)(O)=O)[C@H](OP(O)(O)=O)[C@@H]1OP(O)(O)=O
After the SMILES string has been pasted, select the ‘File’ menu in ‘The Source – Name’ input box. Select ‘Import As’.
Make sure ‘Import as Recognized (SMILES)’ Import Mode has been selected and click on ‘Import’.
The chemical structure defined by the SMILES string should now be present in the box, ready for editing. Press ‘Update structure’ to obtain details of the structure.
Structure details are now displayed in the right of the page. The structure on the left hand side can now be edited as you wish.
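Before pasting a long SMILES string into the submission tool, it can help to catch copy-and-paste damage with a quick syntactic check. The sketch below is a minimal, hypothetical helper using only the Python standard library: it checks for stray whitespace, balanced parentheses and brackets, and paired single-digit ring-closure labels. It does not validate the chemistry and does not handle two-digit (`%nn`) ring closures.

```python
def smiles_sanity(smiles):
    """Cheap syntactic checks on a SMILES string; not a chemical validation."""
    if not smiles or any(ch.isspace() for ch in smiles):
        return False
    closers = {")": "(", "]": "["}
    stack = []
    ring_digits = {}
    for ch in smiles:
        if ch in "([":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False
        elif ch.isdigit() and "[" not in stack:
            # Digits outside square brackets are ring-closure labels.
            ring_digits[ch] = ring_digits.get(ch, 0) + 1
    return not stack and all(n % 2 == 0 for n in ring_digits.values())

# The 1-PP-IP5 string used in the example above.
PP_IP5 = ("OP(O)(=O)O[C@H]1[C@H](OP(O)(O)=O)[C@@H](OP(O)(O)=O)"
          "[C@H](OP(O)(=O)OP(O)(O)=O)[C@H](OP(O)(O)=O)[C@@H]1OP(O)(O)=O")

print(smiles_sanity(PP_IP5))
print(PP_IP5.count("P"))  # number of phosphorus atoms in the string
```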
Annotating the regulation of a process
The following organization of regulation events works well for many kinds of processes, and it fits with our view that all parts of a process should be grouped, while respecting GO's view that regulatory events should be distinguishable from the rest of the process.
All about [process] (pathway)
--The steps of [process] (pathway)
    [process] reaction 1
    [process] reaction 2
    etc.
--Regulation of [process] (pathway)
    [process] regulatory reaction 1
    [process] regulatory reaction 2
    etc.
Regulation here can include reactions that are themselves concrete molecular transformations whose effect is to modulate one of the main process reactions by activating an enzyme, or by providing or sequestering an input molecule, but it can also include less concrete statements such as "[this regulatory event], by an unknown molecular mechanism, positively or negatively regulates [process reaction #]".
Modifying and Deleting
It is important to understand that if you locally modify or delete an instance you checked out from gk_central, it will also be modified/deleted in gk_central when you synchronize. You must ALWAYS CHECK FIRST that the instance you intend to modify/delete is not in use elsewhere, outside your local project. The best way to do this is to search for it using the Database Browser Schema View and select referrers. If it has any referrers you didn't know about, do not modify or delete it! There may be circumstances when you think something should be modified or removed, but if it has been used by another curator, check with them first or contact an experienced curator for advice.
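The idea of a "referrer" can be illustrated with a toy instance graph. The sketch below assumes a hypothetical in-memory dictionary mapping each DB_ID to its attribute slots; it is not how the Database Browser works, and the IDs and function names are made up for illustration.

```python
def find_referrers(target_id, instances):
    """List (DB_ID, attribute) pairs whose slot values include target_id.

    instances maps DB_ID -> {attribute name: [referenced DB_IDs]}.
    """
    return sorted(
        (db_id, attr)
        for db_id, attrs in instances.items()
        for attr, values in attrs.items()
        if target_id in values
    )

def safe_to_change(target_id, instances, local_ids):
    """True only if every referrer is part of the local project."""
    return all(db_id in local_ids for db_id, _ in find_referrers(target_id, instances))

# Hypothetical instances: two reactions (101, 103) and a complex (102).
instances = {
    101: {"input": [1], "output": [2]},
    102: {"hasComponent": [1, 3]},
    103: {"input": [3]},
}
print(find_referrers(1, instances))
print(safe_to_change(1, instances, local_ids={101}))
```

Here instance 1 is also used by complex 102, which is outside the local project, so it should not be modified or deleted without checking.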
Diagram checks after deleting entities or reactionlikeEvents
If and when you need to delete an entity from gk_central, you must run the "Deleted objects in diagrams" check over gk_central to find any diagrams that have used those instances. A description of how to run this check is shown here.
More curation examples
Another example of the annotation process can be found here.
Project QA using the Curator Tool QA checks
Within the Tools menu the "QA Check" menu can be found.
This menu has six separate QA script items within it.
- Imbalance Check (checks that the molecules present as input are also present as output)
- Mandatory Attributes Check (checks that the mandatory attributes for a class have been entered)
- Required Attributes Check (checks that the required attributes for a class have been entered)
- Compartment Check For:
- EntitySet (compartment of set members matches compartment of set)
- Complex (compartment of complex matches compartment of individual components)
- Reaction (compartment of reaction matches compartment of individual components)
- Diagram checks
Note: In order for the QA checks to effectively pick up errors, the project that you are working on must be fully extracted from the database. Instructions on how to do a full extraction can be found here.
Imbalance check
You must select Reactions in the hierarchy to perform this check.
Reactions are flagged as cleavage reactions if the output differs from the input only in that the output contains "fragments" of the input molecule. A true imbalance is shown below:
Mandatory attribute check
A list of instances missing mandatory attributes (by class) is shown. To make the missing attributes of the instance easier to see, you can use the "order attribute" button (downward arrow with circle and triangle) in the upper right side of the tool. This orders the attributes by type (mandatory, required, optional, etc.).
Required attribute check
This check works in the same way as the mandatory attribute check.
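Both attribute checks amount to comparing each instance against a per-class list of expected slots. The sketch below shows the idea only: the `MANDATORY` table is a hypothetical example, not the actual Reactome schema definition, and instances are represented as plain dictionaries.

```python
# Hypothetical rule table; the real lists come from the Reactome schema.
MANDATORY = {
    "Reaction": ["name", "species", "compartment", "input", "output"],
}

def missing_attributes(instance, rules=MANDATORY):
    """Return the expected attributes that are absent or empty on an instance."""
    wanted = rules.get(instance["class"], [])
    return [attr for attr in wanted if not instance.get(attr)]

reaction = {
    "class": "Reaction",
    "name": "A binds B",
    "species": ["Homo sapiens"],
    "input": ["A", "B"],
    "output": [],  # present but empty, so it is reported
}
print(missing_attributes(reaction))
```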
Compartment check
Select the class you want to check in the hierarchy panel
Compartment conflicts are indicated in the bottom of the dialog box.
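In outline, a compartment check compares the compartment(s) assigned to a container (a set, complex or reaction) with those of its parts. The sketch below is a simplified illustration with hypothetical names; the Curator Tool's real rules may be more permissive, for example for complexes that span adjacent compartments.

```python
def compartment_conflicts(container_compartments, components):
    """Names of components whose compartment is not among the container's.

    components maps component name -> compartment name.
    """
    allowed = set(container_compartments)
    return sorted(name for name, comp in components.items() if comp not in allowed)

# Hypothetical complex annotated to cytosol with one nuclear component.
print(compartment_conflicts(["cytosol"], {"A": "cytosol", "B": "nucleoplasm"}))
```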

Species checks
These checks work in the same way that compartment checks work with the exception that Pathways are also checked for species conflicts.
Diagram checks
- Deleted objects in diagrams
When an entity or reaction is deleted in the instance view of the curator tool, it must also be removed manually from any diagram that it has been drawn into. This does not happen automatically. This check is run over gk_central and looks for any diagrams that are affected by the deletion of a reaction or reaction-like event. This check MUST be run after any deletions of reaction-like events or entities have been committed to gk_central, so that the affected diagrams can be identified and the appropriate changes made.
Any affected diagrams will be flagged. The DB_ID of the deleted instance will be displayed, but to see the affected "objects" in the diagram, you will need to view the diagram in gk_central.
Select that diagram in gk_central, right click and opt to "Show diagram".
Affected objects will be flagged and the objects in the diagrams highlighted in red.
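Conceptually, this check intersects the set of deleted DB_IDs with the instances drawn in each diagram. The sketch below illustrates that idea with hypothetical data structures and made-up DB_IDs; it is not the Curator Tool's implementation.

```python
def affected_diagrams(diagrams, deleted_ids):
    """Map diagram name -> sorted deleted DB_IDs that are still drawn in it.

    diagrams maps diagram name -> iterable of DB_IDs drawn in that diagram.
    """
    deleted = set(deleted_ids)
    hits = {}
    for name, drawn in diagrams.items():
        stale = sorted(deleted & set(drawn))
        if stale:
            hits[name] = stale
    return hits

# Hypothetical DB_IDs; 12 was deleted but is still drawn in two diagrams.
diagrams = {
    "Apoptosis": [10, 11, 12],
    "Regulation of Apoptosis": [12, 13],
    "Unrelated pathway": [20],
}
print(affected_diagrams(diagrams, deleted_ids=[12, 99]))
```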
QA of projects before release
Because of the nature of the release process and the growing number of curators submitting projects, the QA load has become greater. One solution to this problem is for curators to enter the data listed below correctly right from the beginning and to run the QA checks in the curator tool before finishing their projects.
- Top Six List Of Problems Identified During the Slice
- Species
- Complex Balances
- UniProt IDs
- Complex Compartment Checks
- Entity Compartments
- Balancing Of Reactions
QA Reminders
1. All of the instances must be updated from gk_central in order for these checks to be meaningful
2. QA Checks should be run regularly, once you have created a reaction or even a bunch of EWASs.
Everyday QA includes:
- Complete check-outs (No shell instances)
- Match instance in DB
- QA Tools
Drawing a pathway diagram
Reactome pathway diagrams are drawn and viewed in the curator tool using the ELV pane of the tool.
This example shows how the superpathway "Regulation of Apoptosis" is diagrammed. If you are drawing a new diagram, see below. The pathway Regulation of Apoptosis is part of the supercanonical Apoptosis pathway. Here, Regulation of Apoptosis has been annotated but not yet incorporated in the Apoptosis diagram; you can tell this because the pathway is greyed out in the event hierarchy.
![]()
To incorporate this pathway in the Apoptosis diagram, simply click on, hold, and drag the pathway from the hierarchy to the diagram.
If you right click on any of the pathway boxes, you are offered the option to open the diagram. If one has been created, it will open.
If it has not, as in the case of Regulation of Apoptosis, you get the message below.
Select "No" and an empty diagram will be opened in the pathway editor pane.
To create the cellular compartments that you need for the diagram, click on the shaded square in the menu bar for the Pathway editor. It is best to create all the compartments that you will need before you start to lay out reactions. Here cytosol is created.
To reposition the compartment, click on it and drag.
![]()
To enlarge the compartment, click on it to select it and then grab the compartment at one of its nodes in the corners. Then drag outward.
To begin drawing, select a reaction from the event hierarchy and drag it onto the diagram.
![]()
To see the names of the compartments of the molecules participating in the reaction, right click anywhere on the diagram and opt to show compartment names. This will make it easier to see that the molecules have been positioned in the correct compartment.
To reposition the reaction you can click and drag different components, or you can select the entire reaction by clicking and dragging a selection box over it. Then the reaction and all its component molecules can be moved as a unit.
Additional reactions are dragged out and positioned one at a time.
Reactions are of one of 5 types: Transition, Association, Dissociation, Omitted process, and Uncertain process. A Transition involves molecules changing state, an Association is a binding reaction, and a Dissociation is the dissociation of a complex. Omitted process and Uncertain process are currently not used. To apply a reaction type to a reaction, right click on the reaction and select change type.
![]()
Select type of interest. Here it is an association.
![]()
Continue to drag out, map, and assign a reaction type to the remaining reactions.
Once all of the reactions have been laid out, the compartment names on molecules can be hidden by right clicking anywhere on the diagram and selecting "Hide compartment in names".
Right click and select "Tight Node Bounds". This will reduce the space left by removing the compartment names.
![]()
Here is the completed diagram.
![]()
If you want to include a link to a pathway that is not actually "part of" the pathway that you are diagramming, you can do this by checking out both the pathway you are diagramming (pathway A) and the pathway you'd like to include as an icon (pathway B) using the Event view. Open the diagram of pathway A in the ELV view. Then, drag the icon of pathway B into the ELV. Save and commit the changes. Then, redeploy the pathway A diagram.
Drawing a new pathway diagram
If you are creating a diagram for a new top level pathway (check with Peter/Lisa if this is appropriate), remember that the pathway itself must be listed as a frontPage item in order to see the deployed diagram. To do this, check out the frontPage instance from gk_central and add your pathway in the frontPageItem slot. Also, please mark this in the editorial calendar as a front page item and inform. In order to see the changes, the Pathway hierarchy will need to be updated on the 8084 site. Please contact Peter or Lisa to do this.
If you are creating a new diagram for a pathway that is not a top level pathway, you will have to make sure that the pathway is represented (as an icon) in a diagram that represents (or is part of) a top level pathway.
Preparing for database releases
A full description of the release procedure can be found in the release SOP:
Curation Tools
Remote Attribute Search Tool
http://reactomedev.oicr.on.ca/cgi-bin/remoteattsearch2
or on live site:
http://www.reactome.org/cgi-bin/remoteattsearch2?DB=gk_current
Examples of how to use the remoteattsearch tool can be found here.
Identifying list members that are unique to one of two lists using Microsoft Excel
Here is a procedure that describes how to take two lists and compare entries to identify those that are present in only one of the two lists. This procedure can be useful, for example, to compare the list of proteins in the gk_central vs. live site to find those that are unreleased.
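For curators who prefer a script to a spreadsheet, the same comparison can be sketched with Python set operations. This is an alternative to the Excel procedure, not a description of it; the function name and the identifiers below are placeholders rather than real accessions.

```python
def unique_to_each(list_a, list_b):
    """Return (only_in_a, only_in_b) as sorted lists, ignoring duplicates."""
    a, b = set(list_a), set(list_b)
    return sorted(a - b), sorted(b - a)

# Placeholder identifiers standing in for protein lists from the two sources.
gk_central = ["P1", "P2", "P3", "P4"]
live_site = ["P2", "P4", "P5"]

only_central, only_live = unique_to_each(gk_central, live_site)
print("Only in gk_central:", only_central)  # candidates for unreleased entries
print("Only on live site:", only_live)
```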
Advanced Curation
CuratorTool Tools
Helpful Tips
Create the framework for the pathway(s) you intend to curate before filling in the details. Start by creating a pathway, add new or existing reactions to it in the correct order, complete the summations and literature citations, then identify the EWASs, Complexes, Sets etc. required and complete the details of the reactions consecutively. Cascading signaling processes can involve very complicated Complexes; in these circumstances the Graphic Display in Entity Hierarchical View is very useful as an overview of the order of events.




























