ReleaseSOP

Reactome release in 18 easy steps. Download the release Powerpoint slide here

Release SOP

Reactome curators are constantly feeding new data into the central Reactome repository, gk_central. The data that appears on the official Reactome website, however, is actually a "snapshot" taken from gk_central. This "snapshot" is known as a slice in Reactome-speak. It contains pathways from completed curation projects only; work-in-progress is not made publicly visible. An extracted slice from gk_central that has been put onto the public web site is known as a release. The human Reactome project creates a new release every 3 months.

This document describes the process of creating a new release.

Release Process Cheat Sheet

A very general and brief overview of the release process can be seen here.

General Overview of Procedure

The computers named below, reactomedev.oicr.on.ca and reactome.oicr.on.ca, are used by Reactome Central. If you are following this SOP for a different instance of Reactome, substitute your own server names.

Human Reactome builds new release databases on a separate server, reactomedev.oicr.on.ca, because this is a computationally intensive process and we don't want to slow down the server running the public website, reactome.oicr.on.ca. Other Reactomes may not do this; it depends on the resources available.

If this is your first time as a Slice or Release Co-ordinator, check now that you have the required accounts and permissions.

You will need:

Access to the Unix environment - if you don't have a unix machine, you need terminal emulation software (you can't use the Windows command prompt). PuTTY works well. TeraTerm is an alternative. Both are free.

Some familiarity with a unix text editor. Arguably the best editor to learn is vi, because it is part of every unix and linux distribution. However, it's not the easiest to use. If you have little or no unix experience you may prefer to use a simple GUI text editor such as Nedit; check that it is available.

login access to reactomedev.oicr.on.ca and reactome.oicr.on.ca.

SUDO access on reactomedev.oicr.on.ca

If you don't have these, contact David or Guanming.

It may be worthwhile having a dry run through the process but consult with an expert beforehand, as some of the scripts if run inappropriately will break things!


The release procedure splits into 7 major parts and, for Human Reactome, involves tasks performed by different people, including the website release manager (someone selected from a list of people at the EBI), the editorial release manager (currently Lisa) and the outreach manager (currently Robin).

  • Pre-slice preparation. This part is mainly the responsibility of the curation team, overseen by Lisa.
  • Creating new release databases. This is performed by the website release manager, mostly on reactomedev.oicr.on.ca. Takes 5 or 6 days to complete.
  • Website file updates (statistics/news etc). This part is the responsibility of the Editorial release manager.
  • Updating public fallback server. This is performed by the website release manager, on reactome.oicr.on.ca. Release databases need to be copied over from reactomedev.oicr.on.ca, this takes about half a day.
  • Release database QA. Selected curators manually check the website.
  • Updating and restarting production server. This is performed by the website release manager, on reactome.oicr.on.ca, produces an identical copy of the public fallback server. Requires about half a day.
  • Post release communication: Preparation of announcements and linking/association files for external databases. Several people contribute to this.

Only after the first 6 of the above have been completed does the new release become visible to the outside world.

Terms and Definitions

In this document, strings that are specific to a particular Reactome installation are highlighted in orange, blue, pink or yellow. These indicate the paths to the main Reactome directories. This will depend on where you are in the installation process and whether you are working on human Reactome or one of the 'other' Reactomes.

For human Reactome, the 'main' paths are incorporated into the details below, but just in case any were missed in editing, they are:

  • Creating new release on reactomedev.oicr.on.ca: <reactomedev.oicr.on.ca_development_main> means /usr/local/gkbdev
  • Updating public fallback server on reactome.oicr.on.ca: <reactome.oicr.on.ca_fallback_main> means /usr/local/gkb_prod
  • Updating public production server on reactome.oicr.on.ca: <reactome.oicr.on.ca_production_main> means /usr/local/gkb

Another directory that you will occasionally need to know about is <path_to_caBIG_directory>; on reactomedev.oicr.on.ca, this corresponds to /usr/local/caBIG. You won't need this on reactome.oicr.on.ca.

Other important strings not to be taken literally are highlighted in green:

<name of the slice db> e.g. test_slice_20

<name of the MYISAM db> e.g. test_slice_20_myisam

<name of the release db> e.g. test_reactome_20

Commands given on command line are indicated in a different font

like this

Commands can sometimes be rather long. If you see a command line with a backslash ("\") at the end of it, it means "continue on next line", i.e. the line you type should concatenate the line ending with the backslash and the line following it. DON'T include the backslash in the line you type.

e.g. if you see:

mysqldump --opt database > \

database.dump

...then what you should actually type is:

mysqldump --opt database > database.dump
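If you paste a multi-line command into a POSIX-style shell such as sh or bash instead of retyping it on one line, the backslash can stay, because the shell itself treats a backslash followed directly by a newline as "continue on next line". A minimal illustration, using echo as a harmless stand-in for the real command (the variable name joined is made up for this sketch):

```shell
# The backslash must be the very last character on the line (no trailing space).
joined=$(echo mysqldump --opt database \
    database.dump)

# Both physical lines were read as one command line.
echo "$joined"
```

A space or tab after the backslash breaks the continuation, which is a common cause of "command not found" errors when copying from a web page.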

Keeping Others Informed

The following description of release steps does not contain explicit instructions to communicate with other people involved in the process. So, once you have completed a step, please inform the person in charge of the next step (or even the internal@reactome.org list) that you have finished and they can proceed.

Advice on Running Commands Under Linux

The release process is a set of scripts that are run on Unix servers. This is the absolute minimum you need to know:

Commands that end in "&" run in the "background", i.e. they are still running, but not visible in the terminal window you launched them from. This is a precaution against timeouts; if you use ssh to connect to reactomedev.oicr.on.ca or reactome.oicr.on.ca, the terminal session will be automatically terminated after a certain period of inactivity, e.g. when you don't type.

Output from most commands will be put into a file in a temporary directory, so you can look at it if you need to. You can find out if the process is still running by typing "ps". Alternatively, you can use "tail -f" on the output file if you want to keep an eye on the output of the command.

To exit "tail", type Control-C.

To find out the directory you are in, pwd

To change to a different directory, cd /path_to_the_directory/

To go back to your home directory, cd

To list files in a directory sorted by modification time, oldest first and newest last, ls -lrt
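The background-and-log pattern described above can be sketched as follows. This is an illustration in sh/bash syntax, where "> file 2>&1" plays the role of the csh ">&" used elsewhere in this document; the function names run_logged and is_running and the variable LAST_PID are made up for the example.

```shell
# Start a command in the background, sending stdout and stderr to a log file.
# The process id is left in LAST_PID (a name chosen for this sketch).
run_logged() {
    log_file=$1
    shift
    "$@" > "$log_file" 2>&1 &
    LAST_PID=$!
}

# Succeeds (exit status 0) while the process with the given id is still alive.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Example with a harmless stand-in for a long-running release script.
log=$(mktemp)
run_logged "$log" sh -c 'echo started; echo finished'
wait "$LAST_PID"      # a real run would be left alone and checked later
tail -n 1 "$log"      # show the last line of the log
```

While a real script is running, "is_running $LAST_PID" answers the same question as looking for the process in the output of ps.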

Slice Preparation by content release manager and curators

Pre-slice procedure

The pre-slice process includes sending curators reminders of the upcoming data freeze and carrying out curator pre-slice QA. The procedure is described in detail here.

Data Freeze

gk_central is closed on this date at 5 pm EST (until the date of the final slice) to the submission of any modifications to "released" or "to be released" events/entities.

On the day of the data freeze, all projects must be submitted to gk_central in preparation for database updates. In these updates, referenceEntities are updated (Uniprot, ChEBI) as are GO terms that serve as the identifiers for GO_BP, GO_MF and GO_CC instances. You can still use your local projects (any that you might have committed to gk_central before the data freeze), but when gk_central has been reopened you will need to synchronize (any/all projects that you continue to use) with gk_central. When you do this, you will need to opt to "update from database" instances that are now flagged as different from database. This would include any updated ReferenceEntities from the Uniprot and ChEBI updates and any updated GO terms from the GO updates.

GO/Uniprot/ChEBI updates

The GO/Uniprot updates are run on gk_central and can be done at any time in the release cycle. Currently they are run just after the data freeze to have the most "up-to-date" information available. At the time of the data freeze, send a reminder to curators not to submit to gk_central until further notice. An overview of the update procedure can be found here.

Once these scripts are complete, an email should be sent to the Reactome internal mailing list, so that the person responsible for other parts of the release process knows to proceed.

Prepare Orthopair files

Orthology projection in Reactome is a two-step process: first, files are generated containing pairwise mappings between proteins in human and proteins in the various other species. Then, at a later stage in the release process, Reactome events are projected from human to other species, using the mapping files generated in the first step. This section is for the first step, creating the mapping files.

For detailed information about the Ensembl Compara based procedure, see Electronic_inference_based_on_Ensembl_Compara.

Note: The Reactome release number is only needed here in order to collect the mapping files in a directory named by the release number. Otherwise, the running of the mapping scripts is independent of Reactome data.

To find the relevant Ensembl version, check the EnsemblGenomes website for the Ensembl version they are using. The information is usually located on the left of the page under Releases, with text such as "The eighth release of Ensembl Genomes features updates to version of 61 of the Ensembl software". In this case, we'd use 61 for <Ensembl version number>. This may be the same as the latest Ensembl version, or a slightly older version, as EnsemblGenomes updates their system a few weeks after a new Ensembl version has been released.

cd /usr/local/gkbdev/scripts/compara

Run:

./wrapper_orthopair.sh <new Reactome release number> <Ensembl version number> \

>& /usr/local/gkbdev/scripts/compara/<new Reactome release number>/wrapper_orthopair.out &

If you would like to keep an eye on the progress of this command, type:

tail -f /usr/local/gkbdev/scripts/compara/<new Reactome release number>/wrapper_orthopair.out

"tail" with the -f option shows you the last few lines of the output, and updates continuously as new output arrives. To exit "tail", type control-C.

Total run time is about 10 hours.

After it has finished, and to check progress during the run, you can do a listing of the directory <new Reactome release number> to check that it has worked OK:

ls -la /usr/local/gkbdev/scripts/compara/<new Reactome release number>


You will see the following files for every species - check that none have a zero size as that indicates a problem:

hsap_protein_gene_mapping.txt

hsap_<species>_homol_mapping.txt

<species>_gene_protein_mapping.txt

hsap_<species>_mapping.txt

The mouse files take about 40 minutes to complete. At the end of the run, there should be files for all 19 species in this directory (current species list).

Should the mapping fail for a particular species, e.g. due to connection problems during the run, remove the incomplete mapping files for this species and simply run the script again. The script will check for existing final mapping files first and only rerun any 'missing' species.
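Rather than reading file sizes off a long listing by eye, the zero-size check can be done with find. The following is a minimal sketch; the function name list_empty_files is made up, and the demonstration uses a throwaway directory with invented file contents and an invented species abbreviation (mmus).

```shell
# Print any empty regular files below the given directory.
list_empty_files() {
    find "$1" -type f -size 0
}

# Demonstration on a throwaway directory; on the server you would pass the
# release directory under /usr/local/gkbdev/scripts/compara instead.
demo_dir=$(mktemp -d)
printf 'P12345\tENSG0001\n' > "$demo_dir/hsap_protein_gene_mapping.txt"
: > "$demo_dir/hsap_mmus_mapping.txt"     # deliberately empty
list_empty_files "$demo_dir"
```

No output means that every mapping file has some content; it does not prove that the content is complete.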

Once the orthopair script has run, you should check the results by running the QA script:

./check_orthopair_files.pl -release <new Reactome release number>

If errors are reported, first try the automated fixing procedure:

./check_orthopair_files.pl -release <new Reactome release number> -fix -run_count 0

This may also take several hours. And then run:

./check_orthopair_files.pl -release <new Reactome release number>

again, to see if the fix has worked. You can repeat this process several times till there are files produced for all species in our list.
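The check, fix, check cycle can be wrapped in a small loop with an upper limit on attempts. This is only a sketch: the function name check_with_fixes is made up, and it assumes that the check command signals problems through a non-zero exit status, which this document does not confirm for check_orthopair_files.pl.

```shell
# Run a check command; while it fails, run a fix command and check again,
# giving up after max_attempts fixes. Commands are passed as single strings.
check_with_fixes() {
    max_attempts=$1
    check_cmd=$2
    fix_cmd=$3
    attempt=0
    until sh -c "$check_cmd"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "still failing after $attempt fix attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sh -c "$fix_cmd"
    done
    echo "check passed after $attempt fix attempts"
}

# Hypothetical use, from the compara directory, for release 22:
#   check_with_fixes 3 \
#       "./check_orthopair_files.pl -release 22" \
#       "./check_orthopair_files.pl -release 22 -fix -run_count 0"
```

Because each fix may take several hours, a small limit such as 3 keeps a persistent failure from looping for days before someone looks at wrapper_orthopair.sh.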

If there are still problems, you may need to debug the wrapper_orthopair.sh source code.

All files produced by this script are stored under /usr/local/gkbdev/scripts/compara/<new Reactome release number>.

Slice and Slice QA procedures

A slice is a database extracted from Reactome's central database, gk_central. It is a subset of gk_central, in the sense that not all gk_central instances end up in the slice. Only those event instances that have explicitly been marked by curators as _doRelease are deposited in the slice. This allows curators to selectively release only those projects which are complete. Slicing is performed using a specially designed slicing tool. Multiple test slices are taken and compared to one another to see how the cleanup is progressing. The slicing process is described here.

Keeping release management group informed

At every stage, keep the group informed of progress through the pipeline by emailing the internal list. This may seem unnecessary, but it is common sense and good teamwork practice. The mails can be simple and carry minimal information, such as 'initial slice for QA is available now and no new data will be added for release purposes', or 'final slice is available from ---url---'. In particular, the final slice needs to be announced to the group and to the developers so that they can start processing the slice for release purposes.

Create New Release Databases

N.B. This is NOT the start of the release process for the EBI release manager! See Prepare Orthopair files in the pre-release steps above!

It is recommended that you perform a test run for the pre-release steps as soon as a test slice is available, so that potential problems can be discovered and fixed early. In this case, or indeed in the case of a failed run for other reasons, some steps need to be taken to make sure no test run data interfere with 'final, real' data. These steps are described in section 030.

For the remainder of this section, <reactomedev.oicr.on.ca_development_main> should be replaced with the path to the main Reactome directory, /usr/local/gkbdev.

004 Establish Working Environment

If you log out during the release and then log back in again later to continue, you will need to repeat the steps detailed below.

N.B. You will of course need an account on reactomedev.oicr.on.ca but you almost certainly have that (if you can access the curator tool, you do). You also need mysql access, which you may not have. When logged on to reactomedev.oicr.on.ca, if you type mysql as a command, and it says access denied, you don't have mysql access and need to get it (if you did get in to mysql type quit).


Connecting to the Server

Log in to reactomedev.oicr.on.ca in order to start the release. e.g. from a terminal window on your computer, type:

ssh reactomedev.oicr.on.ca

Setting up Your Shell Environment

To establish the correct shell environment for running these scripts, do the following:

csh

setenv PATH /usr/local/bin:${PATH}

setenv PERL5LIB /usr/local/gkbdev/modules:/usr/local/bioperl-1.0

umask 0002
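If you work in sh or bash rather than csh, the same settings can be written as below. This is an illustrative translation only, since the SOP itself assumes csh; the function name setup_release_env is made up.

```shell
# Bourne-shell equivalent of the csh commands above.
setup_release_env() {
    PATH=/usr/local/bin:$PATH
    PERL5LIB=/usr/local/gkbdev/modules:/usr/local/bioperl-1.0
    export PATH PERL5LIB
    # New files become group-writable, so other release staff can modify them.
    umask 0002
}

setup_release_env
echo "$PERL5LIB"
```

Whichever shell you use, the settings last only for the current session, which is why they have to be repeated after logging back in.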

Making Reactome Perl Libraries Accessible

Additionally, some scripts find Perl libraries through a symbolic link named GKB in your home directory, which points to the Reactome directory. You will need to set up this link, but you only need to do this once. You should first check to see if you already have a file or directory with this name:

ls -l ~/GKB

If the results look something like this:

lrwxrwxrwx 1 bloggs bloggs 17 Sep 12 16:51 /home/bloggs/GKB -> /usr/local/gkbdev

then the link already exists, and you can skip the rest of this subsection. If the results look more like this:

lrwxrwxrwx 1 bloggs bloggs 17 Sep 12 16:51 /home/bloggs/GKB

then it means that you have your own personal copy of GKB checked out into your home directory. You will need to rename it to something else first:

mv ~/GKB ~/GKB_bloggs

Now create the link:

ln -s /usr/local/gkbdev ~/GKB
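The check, rename and link steps of this subsection can be combined into one small function. This is a sketch rather than part of the official procedure: the function name ensure_symlink and the ".bak" suffix are arbitrary choices (the text above renames to GKB_bloggs instead), and it relies on the readlink command, which is available on Linux but is not guaranteed on every unix.

```shell
# Make link_path a symbolic link to target, moving aside anything that is
# already there and is not a symlink.
ensure_symlink() {
    target=$1
    link_path=$2
    if [ -L "$link_path" ]; then
        echo "link already exists: $link_path -> $(readlink "$link_path")"
        return 0
    fi
    if [ -e "$link_path" ]; then
        mv "$link_path" "$link_path.bak"   # keep the personal checkout
    fi
    ln -s "$target" "$link_path"
}

# Hypothetical use on the development server:
#   ensure_symlink /usr/local/gkbdev ~/GKB
```

Running it a second time changes nothing, so it is safe to call even though the link only has to be created once.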

006 Update all Source Code

Before you start the release process, you need to ensure that you have the most up-to-date version of Reactome, by doing CVS update in certain critical directories.

cd /usr/local/gkbdev/scripts/release

./run_cvs.pl -release <RELEASE_NUMBER>

You will be asked for your password several times, because this script is using sudo and cvs behind the scenes. When it finishes (it should run for about 10 minutes), it may report that some files contain conflicts. These files will be listed. You will need to check through these files using a text editor and resolve these conflicts by hand; search for <<< to locate them.

For releases prior to 38, the following manual update instructions were used. If the script just described does not work, then you will need to revert to the manual method. Otherwise, you can skip to the next step in the release at this point.

Use 'cd' to change into each of the directories listed below, and in each one, run BOTH sudo commands and then the cvs update command:

sudo chgrp -R gkb *

sudo chmod -R g+rw *

cvs up -d

You will be asked for the CVS password, which should be the same as your normal login password.

To see the files which may have conflicts from an update, use:

cvs up -d |& grep "^C "

You should keep an eye on the output from CVS: any lines beginning with a "C" indicate that conflicts have occurred during the update. These may need to be fixed by editing the file and resolving them by hand. You will recognize these conflicts in the form of lines beginning with '>>>', '===' or '<<<' in the affected file. These lines should be deleted, as should unwanted alternative lines inserted by CVS. If you do not feel confident to do this, contact one of the Reactome software staff.
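To find leftover conflict markers across a whole directory tree, something like the following may help. It is a sketch: the function name find_conflicts is made up, and it uses the recursive -r option of grep, which GNU and BSD grep provide but which is not part of POSIX.

```shell
# List files under a directory that still contain CVS conflict markers,
# i.e. lines starting with seven '<' or seven '>' characters and a space.
find_conflicts() {
    grep -rlE '^(<<<<<<<|>>>>>>>) ' "$1"
}

# Hypothetical use:
#   find_conflicts /usr/local/gkbdev/scripts
```

The '=======' separator line is deliberately not searched for, because lines of equals signs also occur in ordinary text files.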


You should do this for the following directories:

<path_to_caBIG_directory>/caBIGR3

/usr/local/gkbdev/biopaxexporter

/usr/local/gkbdev/modules/GKB

/usr/local/gkbdev/scripts

/usr/local/gkbdev/orthomcl_project

/usr/local/gkbdev/java

/usr/local/gkbdev/BioMart/reactome

/usr/local/gkbdev/ReactomeGWT

/usr/local/gkbdev/slicingTool


Before updating the website directory, remove all files from the userguide directory to avoid cvs conflicts:

rm -f /usr/local/gkbdev/website/html/userguide/*

/usr/local/gkbdev/website


Note: Ignore cvs conflicts concerning the images directory and previous release directories (like website/html/download/14). This update can take 15 minutes or so. Sometimes, cvs doesn't seem to complete, even if you wait for 30 minutes. A little tip: try hitting the return key again.

If there are files with conflicts, cd to the directory and look at them (more filename). If they look fine, i.e. none of the lines have the conflict markers described above, try moving the problem file to filename.bak and run cvs up again. Then do a diff to compare the newly checked out file with the backup. If they are the same, there never was a problem!

This error:

cvs update: failed to create lock directory for `/usr/local/cvs_repository/caBIG/caBIGR3' (/usr/local/cvs_repository/caBIG/caBIGR3/#cvs.lock): Permission denied

cvs update: failed to obtain dir lock in repository `/usr/local/cvs_repository/caBIG/caBIGR3'

cvs [update aborted]: read lock failed - giving up

May also be ignored (says Guanming)

030 Clear data produced in previous runs

THIS SECTION SHOULD ONLY BE CARRIED OUT IF THERE HAS BEEN A PREVIOUS RUN!!

If you have performed a test run previously, or attempted a previous run that failed for some reason, you need to make sure that no data produced in a previous run can interfere with the final data.

A "run" covers the part of the release procedure starting after step 30 and ending before step 140, i.e. generating the release databases.

Send a warning notice to people likely to be affected by starting a new run. It can have all sorts of side-effects, such as assigning different stable identifiers to new events. An email to Lisa and the internal mailing list would be in order.

030a

You need to drop the following databases if they exist (depending on how far the previous run had proceeded, they may or may not exist):

mysql

drop database test_slice_<RELEASE_NUMBER>_myisam;

drop database test_reactome_<RELEASE_NUMBER>;

drop database test_reactome_<RELEASE_NUMBER>_dn;

quit
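Since some of these databases may not exist, a plain drop will report an error for the missing ones. One way around this is MySQL's "DROP DATABASE IF EXISTS". The sketch below only prints the statements for a given release number so that they can be reviewed first; the function name drop_statements is made up.

```shell
# Print DROP statements for the databases a previous run may have created.
drop_statements() {
    release=$1
    for db in "test_slice_${release}_myisam" \
              "test_reactome_${release}" \
              "test_reactome_${release}_dn"; do
        printf 'DROP DATABASE IF EXISTS %s;\n' "$db"
    done
}

drop_statements 22
```

Once the output looks right, it could be piped into the mysql client, e.g. "drop_statements 22 | mysql".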

030b

This subsection is obsolete - skip

You also need to delete the report file, otherwise the new reports will be appended to the existing file:

rm /usr/local/gkbdev/orthomcl_project/reports/report_orthomcl_test_reactome_<RELEASE_NUMBER>.txt


030c

THIS IS THE MOST CRITICAL STEP: Check whether a dump for the stable identifier database exists for the current release:

/usr/local/gkbdev/tmp/test_reactome_stable_identifiers_<RELEASE_NUMBER>.dump


If yes, perform the following steps, if not, DON'T follow these steps!


mysql

drop database test_reactome_stable_identifiers;

create database test_reactome_stable_identifiers;

quit

cat /usr/local/gkbdev/tmp/test_reactome_stable_identifiers_<RELEASE_NUMBER>.dump | mysql test_reactome_stable_identifiers
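Because restoring from the wrong state is costly here, it may be worth testing for the dump explicitly before touching the identifier database. A minimal sketch follows; the function names are made up, and requiring the dump to be non-empty (rather than merely present) is an extra precaution not stated in this document.

```shell
# Path of the stable identifier dump for a release, under a given tmp directory.
stable_id_dump() {
    printf '%s/test_reactome_stable_identifiers_%s.dump\n' "$1" "$2"
}

# Report whether step 030c applies: only when the dump exists and is non-empty.
restore_needed() {
    if [ -s "$(stable_id_dump "$1" "$2")" ]; then
        echo "dump found: restore test_reactome_stable_identifiers"
    else
        echo "no dump: leave test_reactome_stable_identifiers alone"
        return 1
    fi
}

# Hypothetical use for release 22:
#   restore_needed /usr/local/gkbdev/tmp 22
```

The function only reports; the drop, create and import commands above are still run by hand.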


In most cases you would restart with a new slice database anyway, but if you need to use the same one again, regenerate it in the same way as the stable_identifier database.

035 Assign stable identifiers

Back up the mysql databases that will be altered by the stable ID script, then start working with the new slice. This step must be performed for human Reactome. Releases for other species may skip this step, if they wish.

Running the stable ID script will make irreversible changes to the slice, to the identifier database and to gk_central, so it is important to use mysqldump to make copies of these three databases before you run it. Other databases are not affected.

The host name for human reactome is reactomedev.oicr.on.ca - this option is not usually required

The user name is curator

The port is 3306

To learn the password ask David, Steve or Bijay

035a

cd /usr/local/gkbdev/scripts/release

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

<name of the slice db> \

> /usr/local/gkbdev/tmp/test_slice_<RELEASE_NUMBER>.dump

035b

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

test_reactome_stable_identifiers \

> /usr/local/gkbdev/tmp/test_reactome_stable_identifiers_<RELEASE_NUMBER>.dump

035c

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

gk_central \

> /usr/local/gkbdev/tmp/gk_central_<RELEASE_NUMBER>.dump

Right, that's the databases backed up.
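The three backups differ only in the database name and the dump file name, so they can be generated from the release number. The sketch below prints the commands instead of running them. It assumes the slice database follows the test_slice_<RELEASE_NUMBER> naming used in the examples, uses the user name and port given above, and leaves out the host option. The function name backup_commands is made up.

```shell
# Print the three mysqldump commands of steps 035a-035c for review.
# A bare -p makes mysqldump prompt for the password instead of exposing it
# on the command line.
backup_commands() {
    release=$1
    tmp_dir=/usr/local/gkbdev/tmp
    # Each entry is "<database>:<dump file name without extension>".
    for pair in "test_slice_${release}:test_slice_${release}" \
                "test_reactome_stable_identifiers:test_reactome_stable_identifiers_${release}" \
                "gk_central:gk_central_${release}"; do
        db=${pair%%:*}
        dump=${pair#*:}
        echo "mysqldump --opt -u curator -p -P 3306 $db > $tmp_dir/$dump.dump"
    done
}

backup_commands 22
```

Printing first makes it easy to confirm that the release number is right before any dump file from an earlier run is overwritten.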

035d

A shell script called 'generate_stable_ids.sh' generates Reactome stable identifiers.

The script may require username (-user), password (-pass) or port (-port) options. These are the same as for the previous step. You also need to know the number of the release that you are about to make, i.e. the current release, specified via the -crnum option, and the number of the previous release, generally the current release number minus one, specified with the -prnum option. The identifier database is always named 'test_reactome_stable_identifiers' and is supplied via the -idbname option. Also essential is the name of the current development database, 'gk_central', supplied via the -gdbname option. The slice database, e.g. test_slice_22, is supplied via the -cdbname option, and the release database, named test_reactome_<RELEASE_NUMBER>, e.g. test_reactome_22, via the -crdbname option. An example of the command is given below.

This is an example for release 22 of human Reactome:

./generate_stable_ids.sh -f -user curator -pass <PASSWORD> -port 3306 -prnum 21 -cdbname test_slice_22 -crdbname test_reactome_22 -crnum 22 -idbname test_reactome_stable_identifiers -gdbname gk_central -nullify >& /usr/local/gkbdev/tmp/generate_stable_ids_22.out &

Now let's run the generate_stable_ids.sh script:

./generate_stable_ids.sh -f \

-host <host name> \

-user <database user name> \

-pass <database user password> \

-port <database port> \

-prnum <previous release number> \

-cdbname <name of the slice db> \

-crdbname <name of the release db> \

-crnum <current release number> \

-idbname test_reactome_stable_identifiers \

-gdbname gk_central \

-project <project name> \

-nullify >& /usr/local/gkbdev/tmp/generate_stable_ids_<RELEASE_NUMBER>.out &

If you are working with a non-human Reactome (e.g. Fly, Gallus, etc.), then you must supply a <project name> via the -project argument. You are advised to use the same name as you used for the $PROJECT_NAME variable in the Config.pm file.

If you would like to keep an eye on the progress of this command, type:

tail -f /usr/local/gkbdev/tmp/generate_stable_ids_<RELEASE_NUMBER>.out

"tail" with the -f option shows you the last few lines of the output, and updates continuously as new output arrives. To exit "tail", type control-C.


This script takes approximately 1 hour to run on reactomedev.oicr.on.ca.

When the script has finished, the bottom of the output file will look something like:

Checking...

No stable IDs are duplicated for test_reactome_stable_identifiers

All stable IDs in release 30 are also in test_reactome_stable_identifiers

All versions in release 30 are consistent with test_reactome_stable_identifiers

13 ReleaseId instances in test_reactome_stable_identifiers are orphans

The following 11 stable IDs in release 37 have no correspondence with associated DOIs:

generate_stable_ids.sh has finished its job

The 13 orphans are not a problem unless the number rises significantly. The 11 stable IDs with no correspondence are (probably) not a problem.
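Instead of scrolling to the bottom of the output file, the two most important lines can be checked with grep. This is a sketch that relies on the exact wording shown in the sample output above; the function name stable_ids_ok is made up.

```shell
# Succeeds only if the log shows that the script finished and that no
# duplicated stable IDs were reported.
stable_ids_ok() {
    log_file=$1
    grep -qF 'generate_stable_ids.sh has finished its job' "$log_file" &&
        grep -q '^No stable IDs are duplicated' "$log_file"
}

# Hypothetical use for release 22:
#   stable_ids_ok /usr/local/gkbdev/tmp/generate_stable_ids_22.out && echo OK
```

A failure does not say what went wrong, only that the file needs reading in full.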

038 Prepare GOA submission file (to be run by Joel Weiser)

This step is currently only significant for human Reactome. Releases for other species should skip this step.

Later steps are not dependent on this step so can be run in parallel.

For more background information, see here.

The website release manager should tell Joel when the stable ID script has run so Joel can run this script. This is best done at this position as the script requires stable identifiers to be present, but it also looks at the date of last modification. So running it after the orthology inference script may lead to the true date of last modification being masked.

First, the GO OBO file needs to be updated:

cd /usr/local/gkbdev/GO_submission/go/ontology

Do a cvs update (cvs up). You will be asked for a password - ask Esther or Steve for more precise information.

GOA requires a tab delimited file with 16 columns (15 mandatory and 1 optional). This is produced by the following script:

cd /usr/local/gkbdev/scripts/release

./prepare_goa_submission.sh <name of the slice db> YYYYMMDD <YOUR_DATABASE_PASSWORD>

YYYYMMDD indicates the new release date, e.g. for release 29 it would read 20090624
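A mistyped date is easy to miss, so it may be worth validating the value before passing it to the script. The following is a rough sketch: the function name is_yyyymmdd is made up, and the check is only a plausibility test (it accepts, for instance, 20090231).

```shell
# Accept only eight digits whose month is 01-12 and whose day is 01-31.
is_yyyymmdd() {
    case $1 in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    month=${1#????}        # drop the four year digits
    month=${month%??}      # drop the two day digits
    day=${1#??????}        # keep only the last two digits
    [ "$month" -ge 1 ] && [ "$month" -le 12 ] &&
        [ "$day" -ge 1 ] && [ "$day" -le 31 ]
}

is_yyyymmdd 20090624 && echo "20090624 looks valid"
```

Today's date in the same format can be produced with "date +%Y%m%d".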

This script takes less than 10 minutes to run. Two files will be created: "gene_association.reactome" and "gene_association.reactome.stats". The former will be made available from the download page further down in the release process. Note: The "gene_association.reactome" file is originally created in the release directory but immediately moved to /usr/local/gkbdev/GO_submission/go/gene-associations/submission by the prepare_goa_submission.sh script.

If while running the script it asks if you want to overwrite the files gene_association.reactome or gene_association.reactome.stats, type y.

The output will list lots of taxon errors, which you can ignore. But make sure at the end of the file, you see this:

--------


NUMBER of ERRORS by COLUMN

Column Col# Number of Errors

TOTAL ERRORS = 0

TOTAL ROWS with no issues = 6590

TOTAL ROWS removed after taxon check = 10943

TOTAL ROWS in file = 17533


--------
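In the sample summary above, the rows with no issues and the rows removed by the taxon check add up to the total (6590 + 10943 = 17533). If the script output is saved to a file, both that sum and the zero error count can be checked automatically. This is a sketch: the function name check_goa_summary is made up, treating the sum as an invariant is an assumption, and it expects the summary lines to start at the beginning of the line exactly as shown.

```shell
# Check that no errors were counted and that the row totals add up.
check_goa_summary() {
    awk -F' = ' '
        /^TOTAL ERRORS/                         { errors  = $2 }
        /^TOTAL ROWS with no issues/            { clean   = $2 }
        /^TOTAL ROWS removed after taxon check/ { removed = $2 }
        /^TOTAL ROWS in file/                   { total   = $2 }
        END {
            # Fail on any error, a missing total, or totals that disagree.
            if (errors + 0 != 0 || total + 0 == 0 || clean + removed != total + 0)
                exit 1
        }
    ' "$1"
}

# Hypothetical use, with the script output saved to goa.out:
#   check_goa_summary goa.out && echo "GOA summary looks fine"
```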


If there are no problems reported, continue with the next step in the release SOP. If there are problems, follow this procedure:

As a general rule, these errors should not cause the release to be put on hold, unless there should be a more systematic problem affecting lots of instances. The priority at this point is to continue the release process. But do make sure the errors are followed up as detailed here:

Run the GO submission quality check:

cd /usr/local/gkbdev/GO_submission/go/gene-associations/submission

ls -ltr (Make sure you see the gene_association.reactome file listed under today's date.)

../../software/utilities/filter-gene-association.pl -i gene_association.reactome -d 2>&1 | grep -v 'taxid'


Look out for lines starting with a number followed by ":", e.g.

15717: GOID column=5 This ID "GO:0070251" is not in the OBO file. Is the OBO file up-to-date? Is it the correct GOID?


Try to figure out what the problem is by looking at the Reactome/GO websites as needed. Common mistakes are typos in the GO id, putting GO: in front of the id in the 'accession' slot, using the wrong type, e.g. molecular function instead of biological process. Inform the person affected and ask them to fix this.

(If required, running ../../software/utilities/filter-gene-association.pl -i gene_association.reactome -e gives a more detailed report.)
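To pull out just the problem lines from a long report, the "number followed by a colon" pattern can be matched with grep. A minimal sketch, with a made-up function name, reading the report on standard input:

```shell
# Keep only report lines of the form "<row number>: message".
numbered_problems() {
    grep -E '^[0-9]+:'
}

# Example with two invented lines of input.
printf 'some other output\n15717: GOID column=5 not in the OBO file\n' | numbered_problems
```

The output of filter-gene-association.pl could be piped through it in the same way. grep exits with status 1 when nothing matches, which here means that no numbered problems were found.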


Note 1: After the release has been published, this file is committed to GOA via cvs. It's not done now as we would otherwise make this data public before the release comes out, resulting in a discrepancy and confusion.

Note 2: Should there be the need to run the goa_submission script again at a later date, make sure you run it on a database slice with stable ids, but without added orthologous events: test_slice_<RELEASE_NUMBER> or test_slice_<RELEASE_NUMBER>_myisam

Note 3: Christina's e-mail is currently defined as the address where any reports should be sent. If this is not correct please have the conf file updated (gene_association.reactome.conf in the submission directory). If you have any questions or suggestions, please do not hesitate to contact Mike Cherry E-mail: cherry@genome.stanford.edu .

040 Convert db to MyISAM

The MyISAM 'format' enables full text queries, i.e. 'with ALL of the words', 'with ANY of the words' and 'with the EXACT PHRASE' query types on the website. Hence, if the db is in InnoDB format and you try using any of these searches, you get an error message containing 'Table 'XXX' does not support fulltext queries'.

Create a copy of the database that uses MyISAM tables:

/usr/local/gkbdev/scripts/innodb2myisam.pl \

-dbfrom <name of the slice db> \

-dbto <name of the MYISAM db> \

-host <host name> \

-port <database port> \

-user <database user name> \

-pass <database user password>

For example, on reactomedev.oicr.on.ca the current MySQL permission settings allow local users to create databases whose names start with 'test_' without specifying a user name and password, so the actual command line could look something like:


./innodb2myisam.pl \

-dbfrom test_slice_22 \

-dbto test_slice_22_myisam

This script runs very quickly, under 1 minute.

045 Simplify data model - now obsolete

This step is now obsolete.

050 Do computational orthology-based predictions for other species

This step is currently only significant for human Reactome. Releases for other species should skip this step.

Preconditions

You must prepare the orthopair files as described above in Slice Preparation before you proceed any further. If the following directory exists: "/usr/local/gkbdev/scripts/compara/<new Reactome release number>", then it means that the orthopair files have been generated. Check that all the files have content (are non-zero size). If any are empty, the orthopair process failed; consult an expert.

There must be a database named 'test_slice_<RELEASE NUMBER>_myisam'. This will have been created as described in the section 'Convert db to MyISAM' above. If you created a MyISAM database which does not follow the naming convention (why would you do that?) i.e. 'test_slice_<RELEASE NUMBER>_myisam', you must make a copy of the database into a new database with the correct name, e.g. by using mysqldump to dump the database and then importing it under the correct name.

You must run this script on the machine running the MySQL server.

050a

Given the preconditions above, run the following script:

perl /usr/local/gkbdev/scripts/compara/wrapper_ortho_inference.pl -r <RELEASE NUMBER> \

-host reactomedev.oicr.on.ca \

-user curator \

-pass <database user password> \

>& /usr/local/gkbdev/scripts/compara/<RELEASE NUMBER>/wrapper_ortho_inference.out &

The script creates a copy of the release slice named 'test_reactome_<RELEASE NUMBER>', adjusts the data model needed for inference, and runs the inference for each species. In the end it cleans up unused PhysicalEntities that were created in the process but then not needed after all. It does not change the input database.


On reactomedev.oicr.on.ca, this procedure requires approximately 8 hours to run.

You can keep track of its progress by running the "tail -f " command on the wrapper_ortho_inference.out file.

050b

A report named report_ortho_inference_<db name>.txt is created in the <RELEASE NUMBER> directory. Compare it with the report for the previous release. You should find that it has one line for each target species. The percentages should not differ too much from one release to the next. If they do - in general or for a particular species - one should investigate as to whether something has gone wrong or whether there are good reasons for it.
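A first sanity check, before reading the percentages by eye, is that the new report has the same number of lines as the previous one. This is a minimal sketch assuming a bash shell and that each report holds one line per target species, as described above. Both function names are hypothetical, and the paths to the two reports are left for you to supply.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not Reactome scripts).

# Print the number of lines in a report file.
report_line_count() {
    wc -l < "$1" | tr -d ' '
}

# Succeed if two reports have the same number of lines.
same_line_count() {
    [ "$(report_line_count "$1")" -eq "$(report_line_count "$2")" ]
}

# Example call (paths are placeholders):
#   same_line_count <new report> <previous report> || echo "species count differs"
```

A matching line count says nothing about the percentages themselves; those still need to be compared release to release.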

050c

Make a back-up of the database once this script has finished and all looks well with the data on the website (this is useful after any long-running script). Make the back-up in /usr/local/gkbdev/tmp/ as follows. N.B. There must be no space between -p and the password; this is not a typo.

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

<name of the release db> \

> /usr/local/gkbdev/tmp/test_reactome_<RELEASE_NUMBER>_after_ortho.dump


To check that the script has worked (or is still working) and that species are being populated in the database, you can view this site:

http://reactomedev.oicr.on.ca:8084/cgi-bin/eventbrowser?DB=test_reactome_xx

...where xx is the release number you're working on.

(What you should expect to see at this point has not yet been documented.)


For more information on the orthology inference procedure, see here.


NOTE: The rest of this section describes a redundant procedure based on OrthoMCL. Go to step 055.

(Process description could now be removed as the Ensembl Compara based system is established)


If you have not already done so, then you must run step 011 ('Prepare ortho pairs') before you proceed any further.

It is essential that there is a database with name 'test_slice_<RELEASE NUMBER>_myisam'. This will have been created in the 'Convert db to MyISAM' section above. If you created a MyISAM database which does not follow the naming convention 'test_slice_<RELEASE NUMBER>_myisam', you must make a copy of the database into a new database with the correct name, e.g. by using mysqldump to dump the database, then importing it under the correct name.

Additionally, you must run this script on the same machine as the MySQL server is running.

You also need to check on the first line of Step 011, which Ensembl version has been used for the present OrthoMCL version. Take the corresponding Ensembl Mart release number (at present -mart 42).

Given these preconditions, run:

perl /usr/local/gkbdev/orthomcl_project/scripts/wrapper_ortho_inference.pl \

-r <RELEASE NUMBER> \

-mart <Ensembl Mart release number> \

-orthomcl <OrthoMCL_version> >& /usr/local/gkbdev/tmp/wrapper_ortho_inference.out.<RELEASE NUMBER> &

This wrapper script creates a copy of the release slice named 'test_reactome_<RELEASE NUMBER>', adjusts the data model needed for inference, and runs the inference for each species. In the end it cleans up unused PhysicalEntities that were created in the process but then not needed after all. It does not change the input database.

NOTE: Depending on the outcome of the discussion about splice variants, the db may need to be changed so that variantIdentifier is *not* a defining attribute for RPS. Otherwise existing mouse/rat RPSs will be duplicated. (This has been implemented for now.)

A report named report_orthomcl_<db name>.txt is created in the directory

/usr/local/gkbdev/orthomcl_project/reports

Check this file: you should find that it has one line for each of the ortho pair files generated in step 011, and it should have very similar figures to the report for the previous release.

On reactomedev.oicr.on.ca, this procedure requires approximately 3 hours to run. While the script is running, there is no content to the report file, so to check progress, look at the new files in the reports directory. There are 2 files created for each species, starting with mouse, inferred_mmu_75.txt and eligible_mmu_75.txt. These should grow in size, and eventually more files should appear for the other species. Mouse should take about 15 mins.

It is useful to make a back-up of the database once this script (or any other long-running script) has finished and all looks well with the data on the website. Make the back-up in /usr/local/gkbdev/tmp/:

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

<name of the release db> \

> /usr/local/gkbdev/tmp/test_reactome_<RELEASE_NUMBER>_after_ortho.dump

055 Update Config.pm file for New Release

The Config.pm file informs the Reactome server that it should switch to a new release. Go to the file:

cd /usr/local/gkbdev/modules/GKB

Open "Config.pm" in your favorite editor, e.g. "vi".

Search for "GK_DB_NAME". Change the database name to <name of the release db> (e.g. test_reactome_25).

Search for "LAST_RELEASE_DATE". Change the date to the date of the previous release. The format of the date string is YYYYMMDD. You can determine the last release date as follows:

  • For human Reactome, you can find this date by looking at the URL www.reactome.org/news.html.
  • If the current release is the very first one for your species, set LAST_RELEASE_DATE to 00000000.
  • Otherwise, if you maintain a news file for your species, you will find the date at the top of the file /usr/local/gkbdev/website/html/news.html

Save "Config.pm" and quit the editor.

060 Create ReactionCoordinates for predicted reactions

Only relevant for human Reactome. Releases for other species skip this step.

As of release 16, orthologous Reactions need to have their own coordinates for being displayed in the sky. Run:

/usr/local/gkbdev/scripts/create_ReactionCoordinates_for_orthologues.pl \

-db <name of the release db> \

-host <host name> \

-port <port number> \

-user <user name> \

-pass <user password>

On reactomedev.oicr.on.ca the last 4 command line options are optional, i.e. the following should suffice:

/usr/local/gkbdev/scripts/create_ReactionCoordinates_for_orthologues.pl \

-db test_reactome_<RELEASE NUMBER>

This process takes about 3 minutes on reactomedev.oicr.on.ca.

070 Check computational predictions - Obsolete

Does not apply to the new version of the site. Proceed to the next step.

This step is currently only significant for human Reactome. Releases for other species should skip this step.

This is done by eyeballing that nothing truly weird has happened, i.e. go to the webpage for the new slice at

http://reactomedev.oicr.on.ca/cgi-bin/frontpage?DB=test_reactome_xx where xx is the version of the release. Click on the checkbox for cross-species comparisons, pick a topic, pick a reaction, and check that some have a link under Equivalent event(s) in other organism(s).


Important: Once OrthoDB is done and appears to be satisfactory, inform curation team to prepare files as seen in 1.6.23 (Step 110).

080 Add links to external resources

Reactome contains links to UniProt, OMIM, etc. These links are added by the script add_links.pl.

080a

Make a backup of the release database before running the add_links script. N.B. Don't add a space between the -p and the password!

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

<name of the release db> \

> /usr/local/gkbdev/tmp/test_reactome_<RELEASE_NUMBER>_before_addlinks.dump

080b

Now that the backup is made, run the script:


cd /usr/local/gkbdev/scripts/release

./run add_links.pl -db <name of the release db> >& /usr/local/gkbdev/tmp/add_links_<RELEASE_NUMBER>.out &

On reactomedev.oicr.on.ca this takes about 48 HOURS to run.

If the script runs to completion you should see the following at the end of the output file:


"add_links.pl has finished its job"


If the run fails near the end of the process, or you only notice the failure later on (top tip: it's less hassle if you notice now!), you can save a lot of time by editing the script to start from the point where it broke, provided you know what you are doing or have an expert to consult. This also avoids potentially duplicating links. Comment out the calls to scripts that add links that are already present. Remember to un-comment them when you have completed this stage, so that the script is ready for the next release! If in any doubt, remove the links added: go into mysql, drop the release db, create it again, and repopulate it from the dump file created above, i.e.

mysql

drop database test_reactome_<RELEASE_NUMBER>;

create database test_reactome_<RELEASE_NUMBER>;

quit

cat /usr/local/gkbdev/tmp/test_reactome_<RELEASE_NUMBER>_before_addlinks.dump | mysql test_reactome_<RELEASE_NUMBER>

080c

After the run, check the website (http://reactomedev.oicr.on.ca:8084/cgi-bin/eventbrowser?DB=test_reactome_xx where xx is the release number). If all looks well, i.e. there are working links from entities to external databases, make a copy of the database:


mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

<name of the release db> \

> /usr/local/gkbdev/tmp/test_reactome_<RELEASE_NUMBER>_after_addlinks.dump

081 Assign stable identifiers to ortho-predicted instances

This step must be performed for human Reactome. Releases for other species may skip this step, if they wish.

Running this script will make irreversible changes to the release and to the identifier database, so it is important to use mysqldump to make copies of these databases before you do this. Other databases will not be affected by the script.

081a Backup mysql databases

cd /usr/local/gkbdev/scripts/release

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

<name of the release db> \

> /usr/local/gkbdev/tmp/test_release_<RELEASE_NUMBER>.ortho.dump

mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

test_reactome_stable_identifiers \

> /usr/local/gkbdev/tmp/test_reactome_stable_identifiers_<RELEASE_NUMBER>.ortho.dump

Right, that's the databases backed up.
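Before relying on a dump, it is worth confirming that it was written completely. By default, mysqldump ends its output with a comment line starting "-- Dump completed", so a truncated dump can usually be spotted by inspecting the last line. This is a minimal sketch assuming a bash shell and that comments were not suppressed when dumping. `dump_looks_complete` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a Reactome script): succeed only if the dump
# file is non-empty and its last line is mysqldump's completion comment.
dump_looks_complete() {
    local dump="$1"
    [ -s "$dump" ] && tail -n 1 "$dump" | grep -q '^-- Dump completed'
}

# Example call (substitute the real release number for NN):
#   dump_looks_complete /usr/local/gkbdev/tmp/test_reactome_stable_identifiers_NN.ortho.dump \
#       && echo "dump looks complete"
```

This only shows that mysqldump reached the end of its run; it does not verify the contents of the dump.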

081b Run the generate_stable_ids script

Run the generate_stable_ids.sh script. Leave two spaces after the -f before -host, and after the -o before -project:

./generate_stable_ids.sh -f \

-host <host name> \

-user <database user name> \

-pass <database user password> \

-port <database port> \

-prnum <previous release number> \

-crdbname <name of the release db> \

-crnum <current release number> \

-idbname test_reactome_stable_identifiers \

-o \

-project <project name> \

-nullify >& /usr/local/gkbdev/tmp/generate_stable_ids_<RELEASE_NUMBER>.ortho.out &

If you are working with a non-human Reactome (e.g. Fly, Gallus, etc.), then you must supply a <project name> via the -project argument. You are advised to use the same name as you used for the $PROJECT_NAME variable in the Config.pm file.

If you would like to keep an eye on the progress of this command, type:

tail -f /usr/local/gkbdev/tmp/generate_stable_ids_<RELEASE_NUMBER>.ortho.out

"tail" with the -f option shows you the last few lines of the output, and updates continuously as new output arrives. To exit "tail", type control-C.

This is what an example of the above looks like for release 22 for human Reactome:

./generate_stable_ids.sh -f -user curator -pass <PASSWORD> -port 3306 -prnum 21 -crdbname test_reactome_22 -crnum 22 -idbname test_reactome_stable_identifiers -o -nullify >& /usr/local/gkbdev/tmp/generate_stable_ids_22.ortho.out &

This script takes approximately 60 minutes to run on reactomedev.oicr.on.ca.

When the script has finished, the bottom of the output file will look something like:

Checking...

No stable IDs are duplicated for test_reactome_stable_identifiers

All stable IDs in release 30 are also in test_reactome_stable_identifiers

All versions in release 30 are consistent with test_reactome_stable_identifiers

13 ReleaseId instances in test_reactome_stable_identifiers are orphans

generate_stable_ids.sh has finished its job

The 13 orphans are not a problem unless the number rises significantly.
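To keep an eye on the orphan count from one release to the next, it can be extracted from the output file. This is a minimal sketch assuming a bash shell and that the orphan line has the wording shown in the sample output above. `orphan_count` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a Reactome script): print the number from the
# last "<N> ReleaseId instances in <db> are orphans" line of a log file.
orphan_count() {
    sed -n 's/^\([0-9][0-9]*\) ReleaseId instances in .* are orphans.*/\1/p' "$1" | tail -n 1
}

# Example call (substitute the real release number for NN):
#   orphan_count /usr/local/gkbdev/tmp/generate_stable_ids_NN.ortho.out
```

The function prints nothing if the output file contains no orphan line, for instance because the script has not yet reached its checking phase.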

At this point, i.e. when the stableIDs have been generated and the orthoscript has completed, the release coordinator should send an email to the Editorial release coordinator (Lisa).

081c Insert Auto-Generated Keywords into Release - Obsolete

This step takes the Summations associated with Events, automatically extracts keywords from them, and then inserts these keywords into the Event instances. These keywords then appear in the META tags on Event web pages. They are invisible to normal users, but search engines see them and use them to work out the ranking that should be given to a web page.

cd /usr/local/gkbdev/scripts/release

./insert_event_keywords.pl -db <name of the release db> >& /usr/local/gkbdev/tmp/insert_event_keywords_<RELEASE_NUMBER>.out &

This step will need about three hours to run.

082 Parallelize the BioMart update process

To speed up the release, you can use a spare machine (for example brie8) to start the BioMart generation (section 105) at this point, and then carry on immediately with step 083 without waiting for the BioMart run to finish, since it is quite independent. The BioMart script can take around 36 hours, so it is useful to run it in parallel.

To do this you need to make a copy of test_reactome_XX on your spare machine (brie8 - gkbdev). In the steps below change XX to the current release number.

Log on to brie8 and set the environment variables as described above in Setting the Environment Variables.

Use sudo to change the group of all files to gkb, as described above.

Do a cvs update (cvs up -d) in the following directories:

/usr/local/gkbdev/modules/GKB

/usr/local/gkbdev/scripts

Create the database "test_reactome_XX" in mysql

mysql

create database test_reactome_XX;

quit

Then copy over the database from reactomedev to brie8

mysqldump --opt -h reactomedev.oicr.on.ca -u curator -p<pass> test_reactome_XX | mysql test_reactome_XX


Now perform the BioMart update as described in section 105.

083 Create skypainter db

The SkyPainter needs its own database, which you can generate as follows:

cd /usr/local/gkbdev/scripts/denormalised_db

./create_skypainter_db.pl -db <name of the release db> >& /usr/local/gkbdev/tmp/create_skypainter_db.out.<RELEASE NUMBER> &

The skypainter db will have the name: '<name of the release db>_dn'

On reactomedev.oicr.on.ca, this procedure requires about 5 hours to run. This is without computing the background p-value distribution (below).

In order to estimate the false discovery rate (FDR), the procedure now also calculates and stores background p-value distributions for a list of species. This list is specified in the GKB/scripts/denormalised_db/compute_and_store_background_p_value_distribution_4_some_species.pl script. If you are doing the procedure for some other Reactome, please make sure that your species of interest is on that list. Computing the background p-values takes a long time: on the new (autumn 2008) reactomedev.oicr.on.ca it takes at least 1.5 hours per species (using list lengths of 2..100). If you don't want the background p-value distribution to be calculated, comment out or remove the call to compute_and_store_background_p_value_distribution_4_some_species.pl in create_skypainter_db.pl.

For release 31, we should experiment with GKB/scripts/denormalised_db/compute_and_store_background_p_value_distribution_4_some_species.pl, by commenting out all species except for Homo sapiens, and then running the script to see how long it takes.

Using the default settings that were in place at release 28, the total time needed for this entire step was 4 hours.

The out file may contain errors like these. They occur because some species represented in Reactome are not in the Ensembl Mart; they do not indicate a problem with the script.

create skypainter errors

./indirectIdentifiers_from_mart.pl -db test_reactome_30 -sp 'Chlamydia trachomatis'

No mart tables for 'ctrachomatis'.

./indirectIdentifiers_from_mart.pl -db test_reactome_30 -sp 'Chlamydia trachomatis' failed.

./gene_names_from_mart.pl -db test_reactome_30 -sp 'Chlamydia trachomatis'

No relevant ReferenceGeneProduct instances.

./indirectIdentifiers_from_mart.pl -db test_reactome_30 -sp 'Clostridium botulinum'

No mart tables for 'cbotulinum'.

./indirectIdentifiers_from_mart.pl -db test_reactome_30 -sp 'Clostridium botulinum' failed.

./gene_names_from_mart.pl -db test_reactome_30 -sp 'Clostridium botulinum'

DBD::mysql::st execute failed: Table 'ensembl_mart_47.cbotulinum_gene_ensembl__xref_uniprot_accession__dm' doesn't exist at ./gene_names_from_mart.pl line 86.

DBD::mysql::st execute failed: Table 'ensembl_mart_47.cbotulinum_gene_ensembl__xref_uniprot_accession__dm' doesn't exist at ./gene_names_from_mart.pl line 86.

./gene_names_from_mart.pl -db test_reactome_30 -sp 'Clostridium botulinum' failed.

084 Create and adjust PathwayCoordinates

This step is currently only significant for human Reactome. Releases for other species should skip this step.

To generate the coordinates for the Pathway labels which appear on the reaction map in SkyPainter when the 'Display topic names:' option is checked, run the following:

/usr/local/gkbdev/scripts/create_PathwayCoordinates.pl -db <name of the release db>

On reactomedev.oicr.on.ca, this procedure requires about 10 seconds to run. Please note that this places the labels according to the coordinates of the most "extreme" reactions in the pathway. You may want to adjust them manually.

086 Create Frontpage Images and Cached TOC File

This step is currently only significant for Reactomes with more than one species in their database. Releases for a single species, e.g. only Gallus gallus and nothing else, may skip this step.

In case you've created front page files for a db with this name already, please remove the cached files first as otherwise the old ones will be used:

rm -rf /usr/local/gkbdev/website/html/img-fp/<name of the release db>

Create the cached front page files (this should take place before pointing your browser towards the front page of this db):

cd /usr/local/gkbdev/scripts/release

./create_frontpage_files.pl DB=<name of the release db>

/usr/local/gkbdev/website/cgi-bin/toc DB=test_reactome_<RELEASE NUMBER>

/usr/local/gkbdev/website/cgi-bin/doi_toc DB=test_reactome_<RELEASE NUMBER>

sudo chown -R ${USER}:www-data \

/usr/local/gkbdev/website/html/img-fp/test_reactome_<RELEASE NUMBER>/toc

sudo chown -R ${USER}:www-data \

/usr/local/gkbdev/website/html/img-fp/test_reactome_<RELEASE NUMBER>/doi_toc

If you have sudo permission problems, see David!

090 Create database and static content for entity level view

This step is currently only significant for human Reactome. Releases for other species should skip this step.

090a

Make sure that there is a backup of the previous release's static content:

cd /usr/local/gkbdev/website/html/entitylevelview/pathway_diagram_statics

If the file:

test_reactome_<PREVIOUS RELEASE NUMBER>.tgz

does not exist, pack the previous release's static content into a tarball (.tgz file):

tar -zcvf test_reactome_<PREVIOUS RELEASE NUMBER>.tgz test_reactome_<PREVIOUS RELEASE NUMBER>

This will take about 15 minutes. If you encounter errors like "Out of disk space", you will need to remove the entity level tarball for an older release and repeat the tar.

090b

Once the backup is done, remove the static files for the previous release:

rm -rf test_reactome_<PREVIOUS RELEASE NUMBER>_pathway_diagram

The ELV diagrams are generated using test_reactome_<RELEASE NUMBER>; there is no longer a separate pathway_diagram datafile.


090c

Make a backup of the release db just in case:


mysqldump --opt \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

<name of the release db> \

> /usr/local/gkbdev/tmp/<name of the release db>_before_pathway_diagram.dump

090d

Now run the shell script to generate ELVs.

If you need to rerun the release, remove the contents of this directory:

rm -r /usr/local/gkbdev/website/html/entitylevelview/pathway_diagram_statics/test_reactome_XX/*

This isn't critical but prevents error messages later.

cd /usr/local/gkbdev/WebELVTool

./runWebELVTool.sh \

<host name> \

<name of the release db> \

<user name> \

<user password> \

<port number> \

/usr/local/gkbdev/website/html/entitylevelview/pathway_diagram_statics \

349401 \

>& /usr/local/gkbdev/tmp/runWebELVTool.out.<RELEASE NUMBER> &

This will take about 5 hours on reactomedev.oicr.on.ca.


To see the elv files being created:

ls -l /usr/local/gkbdev/website/html/entitylevelview/pathway_diagram_statics/test_reactome_XX

If the number of files is increasing, the script is working.


090f

Finally you need to pack the static content into a tarball (.tgz file):

cd /usr/local/gkbdev/website/html/entitylevelview/pathway_diagram_statics

tar -zcvf /usr/local/gkbdev/tmp/test_reactome_<RELEASE NUMBER>.tgz test_reactome_<RELEASE NUMBER>

This will take about 15 minutes. If you encounter errors like "Out of disk space", you will need to remove the entity level tarball for a previous release and repeat the tar.

092 Set up GWT

Builds the GWT Javascript, deploys the GWT servlets to Tomcat and puts some cached data into the database.

cd /usr/local/gkbdev/scripts/release

sudo rm -rf /usr/local/reactomes/Reactome/development/apache-tomcat/webapps/ReactomeGWT*

./create_gwt.pl -db <name of the release db> -url http://reactomedev.oicr.on.ca

Don't try to pipe the output of this command to a file.

On reactomedev.oicr.on.ca, this procedure takes about 20 minutes to run.

This is still experimental, so don't worry if errors occur, but please do report them to the dev list.

095 Back Up Documentation

This step is currently only significant for human Reactome. Releases for other species should skip this step.


The Reactome wiki is the central location for keeping documentation up-to-date, but for safety's sake, you should also make a release-time backup from the wiki to CVS. There is a script available for doing this. First, you should change to the scripts directory for releasing:

cd /usr/local/gkbdev/scripts/release

Now run the script for each set of documentation to be backed up:

./run download_html_from_wiki.pl -wiki_url \

"http://wiki.reactome.org/index.php/Usersguide" -target_dir ../../website/html/userguide

./run download_html_from_wiki.pl -wiki_url \

"http://devwiki.reactome.org/index.php/Release_SOP" -target_dir ../../docs/SOP

Since these scripts access CVS, you will be asked to type in a password; please use your CVS password (normally the same as your regular login password). You may need to give your password more than once, if new files have been added by download_html_from_wiki.pl.

100 Update the Download Directory

This section tells you how to create the files accessible to users via the 'Download' button at the top of Reactome web pages.


100a Create download directory

cd /usr/local/gkbdev/website/html/download

/usr/local/gkbdev/scripts/release/create_download_directory.pl \

-host <host name> \

-port <port number> \

-user <user name> \

-pass <user password> \

-r <RELEASE NUMBER> -db <name of the release db> \

>& /usr/local/gkbdev/tmp/create_download_directory.out.<RELEASE NUMBER> &

If you are working on a non-human Reactome, you will need to let create_download_directory.pl know which species you are dealing with. This is done by adding an extra argument after <name of the release db>. Add a -sp flag, followed by the full species name, enclosed in single quotes. E.g. for fly you would probably add:

-sp 'Drosophila melanogaster'

The script will automatically create the directory:

/usr/local/gkbdev/website/html/download/<RELEASE NUMBER>

if it does not already exist, and fill it with various data dumps.

On reactomedev.oicr.on.ca, this procedure requires about 44 hours to run. However, it is not necessary to wait for this script to complete before continuing with other steps in the release process. The remaining tasks in this step should only be done when the script completes.

100b Make download available

The final steps of the dump creation process:

cd <RELEASE NUMBER>

/usr/local/gkbdev/scripts/release/make_release_tarball.pl

You will be prompted for your password when this script starts – this is your CVS password (probably the same as your regular login password).

When this script completes, you need to link the 'current' directory to the new download directory. Normally, there will already be a 'current' directory from your previous release, so you need to remove the old 'current' directory. If this is the first ever release of your Reactome, skip the next step, otherwise, proceed as follows:

cd ..

ls -la current

You will see something like this:

lrwxrwxrwx 1 croft gkb 2 Dec 4 04:49 current -> <PREVIOUS RELEASE NUMBER>

The directory that 'current' points to will be an old release number. Continuing with this example, you will need to remove the old 'current' directory:

mv <PREVIOUS RELEASE NUMBER> foo

rm current

mv foo <PREVIOUS RELEASE NUMBER>

Finally, create a new link:

ln -s <RELEASE NUMBER> current
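As an alternative to removing and recreating the link by hand, the switch can be done in one command. This is a minimal sketch assuming a bash shell and an `ln` that supports the `-n` option (GNU coreutils does). `point_current_at` is a hypothetical helper name, and it refuses to act if 'current' turns out to be a real directory rather than a link.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a Reactome script): make "current" in the
# download directory point at the given release directory.
point_current_at() {
    local download_dir="$1" release="$2"
    (
        cd "$download_dir" || exit 1
        # The release directory must already exist.
        [ -d "$release" ] || exit 1
        # Never touch "current" if it is a real file or directory.
        if [ -e current ] && [ ! -L current ]; then
            exit 1
        fi
        # -f replaces an existing link; -n stops ln from following the
        # old link and creating the new one inside the old release.
        ln -sfn "$release" current
    )
}

# Example call (substitute the real release number for NN):
#   point_current_at /usr/local/gkbdev/website/html/download NN
```

Running `ls -la current` afterwards should show the link pointing at the new release number.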

105 Upgrade BioMart and Restart the Server

Reactome has its own BioMart, running under its own server. In this section, you will generate a BioMart database from the current release, configure BioMart for Reactome, and restart the BioMart server.

105a Install BioMart

If a Reactome BioMart has not yet been installed, you will need to install it yourself. Instructions and downloadables are available from www.biomart.org . You should install into the directory /usr/local/gkbdev/BioMart. You will know that there is already an installation present if this directory contains the subdirectories "apache", "biomart-perl" and "martj-0.6".

105b Backup the old BioMart database

Unless this is the very first release for your species, make a backup of the old BioMart database:

mysqldump --opt --socket=/usr/local/gkbdev/BioMart/mysql/tmp/sock --port=8087 \

-h <host name> \

-u <database user name> \

-p<database user password> \

-P <database port> \

test_reactome_mart \

> /usr/local/gkbdev/tmp/test_reactome_mart.dump.<RELEASE NUMBER>

105c Generate the new BioMart database

Generate the BioMart database for the new release. Note that if you are short of time, you can skip the generation of interactions, by removing the "-interactions" flag from the command line.

cd /usr/local/gkbdev/scripts

perl martify_reactome.pl -db <name of the release db> \

-host <host name> \

-port <port number> \

-user <user name> \

-pass <user password> \

-bdb test_reactome_mart \

-interactions \

>& /usr/local/gkbdev/tmp/martify_reactome.out.<RELEASE NUMBER> &

This will take about 28 hours if you don't create interactions, or 33 hours with interactions, on reactomedev.oicr.on.ca.

Before going any further, check to make sure that the script has completed:

tail /usr/local/gkbdev/tmp/martify_reactome.out.<RELEASE NUMBER>

The last line should read:

martify_reactome.pl has finished its job
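Several long-running scripts in this SOP (add_links.pl, generate_stable_ids.sh, martify_reactome.pl) end their output with a "<script name> has finished its job" line, so the completion check can be wrapped in a small function. This is a minimal sketch assuming a bash shell. `finished_ok` is a hypothetical helper name, and looking only at the last five lines is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a Reactome script): succeed if the completion
# marker for the named script appears in the last few lines of a log.
finished_ok() {
    local log="$1" script="$2"
    # -F treats the marker as a fixed string, so the "." in the script
    # name is not interpreted as a regular-expression wildcard.
    tail -n 5 "$log" | grep -qF "${script} has finished its job"
}

# Example call (substitute the real release number for NN):
#   finished_ok /usr/local/gkbdev/tmp/martify_reactome.out.NN martify_reactome.pl \
#       && echo "martify_reactome.pl completed"
```

If the function fails, either the script is still running or it stopped with an error; inspect the output file before continuing.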

105d Dump the new BioMart database

Make a dump of the BioMart database:

mysqldump --opt \

test_reactome_mart > /usr/local/gkbdev/tmp/test_reactome_mart_<RELEASE NUMBER>_new

For release 37, stop at this point, because subsequent steps will overwrite Nelson's modified BioMart, which needs to be publicly visible for a pending research paper review.

105e Deploy the BioMart db

Deploy the BioMart database to its own MySQL instance:

cat /usr/local/gkbdev/tmp/test_reactome_mart_<RELEASE NUMBER>_new | \

mysql --socket=/usr/local/gkbdev/BioMart/mysql/tmp/sock --port=8087 test_reactome_mart

Once this is done, you will need to import the latest parameter settings into the database:

perl update_reactome_mart.pl -sudo -biomart_version 0.7

This is an interactive script - you can simply give the default answers to all of the questions it asks you, by hitting the return key every time it asks a question. You will see perhaps 20 lines of warnings mentioning "CTable" when you first start the script - you can safely ignore these. You will also be asked for your password.

Once the script is done, check BioMart is working at:

http://reactomedev.oicr.on.ca/cgi-bin/mart

Select database REACTOME, dataset pathway, and click the Results button. You should get a table with 10 rows in it.
If this doesn't work, this step has failed.

110 Website files update (stats, news, about etc.)

This may be a good time to check the test site:

http://reactomedev.oicr.on.ca/cgi-bin/frontpage?DB=test_reactome_<RELEASE NUMBER>


The procedure for updating the website files is described here. This step is done by the editorial release manager.

Update Fallback Server

In this section, <reactome.oicr.on.ca_fallback_main> should be replaced with the path to the fallback directory, /usr/local/gkb_prod.

Background

Now is the time to switch over from reactomedev.oicr.on.ca to reactome.oicr.on.ca.

On reactome.oicr.on.ca there are two Reactome servers, a production ('live') one, that is the main, publicly visible one, and a fallback server, which should be a carbon copy of the production one. Both servers are visible as web sites, but with different port numbers.

For human Reactome, the production server is accessible via the URL www.reactome.org, i.e. the regular Reactome URL. For the fallback server, we use port number 8000 and this is accessible via www.reactome.org:8000.

The two servers utilize two separate directory hierarchies; human Reactome uses /usr/local/gkb for the production server and /usr/local/gkb_prod (aka <reactome.oicr.on.ca_fallback_main>) for the fallback server.

The point of much of the following is to minimize production server downtime. Hence we set up the new release on the fallback server and reroute traffic to it while updating the production server.

140 Connect to the server and set environment

Log in to reactome.oicr.on.ca in order to continue with the release.

If you log out during the release and then log back in again later to continue, you will need to repeat the steps for establishing the correct shell environment, detailed below.

To establish the correct shell environment, do the following:

csh

setenv PATH /usr/local/bin:${PATH}

setenv PERL5LIB /usr/local/gkbdev/modules:/usr/local/bioperl-1.0

umask 0002

Additionally, some scripts find Perl libraries through a symbolic link named GKB in your home directory, which points to the Reactome directory. You will need to set up this link, but you only need to do this once. You should first check to see if you already have a file or directory with this name:

ls -l ~/GKB

If the results look something like this:

lrwxrwxrwx 1 bloggs bloggs 17 Sep 12 16:51 /home/bloggs/GKB -> /usr/local/gkb_prod

then the link already exists, and you can skip the rest of this subsection. If the results look more like this:

lrwxrwxrwx 1 bloggs bloggs 17 Sep 12 16:51 /home/bloggs/GKB

then it means that you have your own personal copy of GKB checked out into your home directory. You will need to rename it to something else first:

mv ~/GKB ~/GKB_bloggs

Now create the link:

ln -s /usr/local/gkb_prod ~/GKB
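The manual check above can also be scripted. The following is a minimal sketch, written for sh/bash rather than the csh session set up earlier; the function name gkb_link_status is hypothetical and not part of the Reactome scripts. It only reports what it finds and changes nothing.

```sh
# Hypothetical helper (not part of the Reactome code base): classify a path
# the same way the manual "ls -l ~/GKB" check does.
# Prints one of: link, directory, missing, other.
gkb_link_status() {
    gkb_path=$1
    if [ -L "$gkb_path" ]; then
        echo "link"        # a symbolic link already exists; nothing to do
    elif [ -d "$gkb_path" ]; then
        echo "directory"   # a personal checkout; rename it before linking
    elif [ ! -e "$gkb_path" ]; then
        echo "missing"     # nothing there; safe to create the link
    else
        echo "other"       # a plain file or something unexpected
    fi
}
```

For example, gkb_link_status "$HOME/GKB" prints "link" when the link is already in place, in which case the mv and ln commands above can be skipped.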

150 Update all Source Code

Before you continue, make sure step 110 is completed.

You need to ensure that you have the most up-to-date version of Reactome, by doing CVS update in certain critical directories. Use 'cd' to change into each of these directories, and in each one, run BOTH sudo commands and BOTH CVS update commands, giving your sudo/CVS passwords when prompted:

sudo chgrp -R gkb *

sudo chmod -R g+rw *

cvs up -d

cvs up

To identify conflicted files, you can add the following to the cvs up command:

cvs up -d|& grep "^C "

Keep an eye on the output from CVS: any lines beginning with a "C" indicate that conflicts have occurred during the update. These may need to be fixed by editing the file and resolving them by hand. You will recognize these conflicts in the form of lines beginning with '>>>', '===' or '<<<' in the affected file. These lines should be deleted, as should unwanted alternative lines inserted by CVS. If you do not feel confident to do this, contact one of the Reactome software staff.


Repeat the above for the following directories:

/usr/local/gkb_prod/biopaxexporter

/usr/local/gkb_prod/modules/GKB

/usr/local/gkb_prod/scripts

/usr/local/gkb_prod/java

/usr/local/gkb_prod/BioMart/reactome

/usr/local/gkb_prod/slicingTool

/usr/local/gkb_prod/ReactomeGWT

The ReactomeGWT directory is still experimental, so don't worry if there are errors, but please report them to the dev list.

Before updating the website directory, remove all files from the userguide directory - to avoid cvs conflicts:

rm -f /usr/local/gkb_prod/website/html/userguide/*

Note: You can ignore cvs conflicts concerning the images directory and previous release directories (like website/html/download/14). Updating this directory is slow; it takes about 15 minutes.

/usr/local/gkb_prod/website
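Once the directories have been updated, it can be worth checking that no conflict markers were left behind in any file. The following is a minimal sketch, written for sh/bash rather than csh; the function name find_conflict_markers is hypothetical. It assumes the usual CVS marker layout, where the opening and closing marker lines consist of seven '<' or '>' characters followed by a space, and it skips the CVS bookkeeping directories.

```sh
# Hypothetical helper: list files under a directory that still contain CVS
# conflict markers. Only the opening (<<<<<<< ) and closing (>>>>>>> ) marker
# lines are matched, because a bare ======= line can also occur in ordinary
# text files.
find_conflict_markers() {
    search_dir=${1:-.}
    find "$search_dir" -type d -name CVS -prune -o -type f \
        -exec grep -l -E '^(<<<<<<<|>>>>>>>) ' {} +
}
```

For example, find_conflict_markers /usr/local/gkb_prod/scripts prints the name of each file that still needs to be resolved by hand, and prints nothing if the directory is clean.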

160 Update the Download Directory

This step should only be done if step '100 Update the Download Directory' has been completed on reactomedev.

Copy the download directory from reactomedev.oicr.on.ca to reactome.oicr.on.ca:

cd /usr/local/gkb_prod/website/html/download

scp -r reactomedev.oicr.on.ca:/usr/local/gkbdev/website/html/download/<RELEASE NUMBER> .

Once this is complete, you will need to link the 'current' directory to the new download directory. If there is already a 'current' directory (normally the case), you will first need to remove the old 'current' directory. Proceed as follows:

ls -la current

You will see something like this:

lrwxrwxrwx 1 croft gkb 2 Dec 4 04:49 current -> <PREVIOUS RELEASE NUMBER>

The directory that 'current' points to will be an old release number. Continuing with this example, you will need to remove the old 'current' directory, which is slightly complicated:

mv <PREVIOUS RELEASE NUMBER> foo

rm current

mv foo <PREVIOUS RELEASE NUMBER>

Finally, create a new link:

ln -s <RELEASE NUMBER> current
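If you prefer to script the relinking, the following is a minimal sketch, written for sh/bash rather than csh. It assumes that 'current' is a plain symbolic link, as the ls output above suggests; the function name repoint_current is hypothetical. It refuses to do anything if 'current' turns out to be a real directory or if the release directory does not exist.

```sh
# Hypothetical helper: point the 'current' link inside a download directory
# at the given release directory.
repoint_current() {
    download_dir=$1
    release=$2
    if [ -z "$release" ] || [ ! -d "$download_dir/$release" ]; then
        echo "no such release directory: $download_dir/$release" >&2
        return 1
    fi
    if [ -e "$download_dir/current" ] && [ ! -L "$download_dir/current" ]; then
        echo "'current' is not a symbolic link; leaving it alone" >&2
        return 1
    fi
    # rm on a symbolic link removes only the link, not the directory it names.
    rm -f "$download_dir/current"
    ln -s "$release" "$download_dir/current"
}
```

For example, repoint_current /usr/local/gkb_prod/website/html/download <RELEASE NUMBER> leaves the previous release directory untouched and only changes where 'current' points.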

170 Copy Over Databases

This step creates and loads these databases:

  • test_reactome_stable_identifiers
  • test_reactome_mart
  • test_reactome_<RELEASE NUMBER>
  • test_reactome_<RELEASE NUMBER>_dn
  • test_reactome_<RELEASE NUMBER>_pathway_diagram


First, make a backup copy of 'test_reactome_stable_identifiers':

mysqldump --opt test_reactome_stable_identifiers \

> /usr/local/gkb_prod/tmp/test_reactome_stable_identifiers.dump.<RELEASE NUMBER>


Then, create/make space for appropriate databases:

mysql

drop database test_reactome_stable_identifiers;

create database test_reactome_stable_identifiers;

create database test_reactome_<RELEASE NUMBER>;

create database test_reactome_<RELEASE NUMBER>_dn;

exit;


Now copy the databases over from the internal server to the public server, if you have two servers:

mysqldump --opt -h reactomedev.oicr.on.ca -u curator -p<password> \

test_reactome_stable_identifiers | mysql test_reactome_stable_identifiers


test_reactome_xx and test_reactome_xx_dn are large databases - trying to pipe a mysqldump of them straight across the network, as above, invariably results in a time-out.

Instead, first make a dump of the databases on reactomedev.oicr.on.ca in the /usr/local/gkbdev/tmp dir:

mysqldump --opt test_reactome_xx > reactome_xx.dump

mysqldump --opt test_reactome_xx_dn > reactome_xx_dn.dump


Then, over on reactome.oicr.on.ca, do a secure copy of the database dumps:

scp reactomedev.oicr.on.ca:/usr/local/gkbdev/tmp/reactome_xx.dump .

scp reactomedev.oicr.on.ca:/usr/local/gkbdev/tmp/reactome_xx_dn.dump .


Then load each dump into its database using cat:


cat reactome_xx.dump | mysql test_reactome_xx

cat reactome_xx_dn.dump | mysql test_reactome_xx_dn

The cat process can take 30 minutes.
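Because time-outs and interrupted copies are a real risk with files this large, it can be worth checking that a dump file is complete before loading it. The following is a minimal sketch, written for sh/bash rather than csh; the function name dump_is_complete is hypothetical. It assumes the dump was made with mysqldump comments enabled (the default), in which case the file ends with a "-- Dump completed" line.

```sh
# Hypothetical helper: succeed only if the dump file is non-empty and its
# final lines include the "-- Dump completed" trailer that mysqldump writes
# when it finishes normally (comments must not have been switched off).
dump_is_complete() {
    dump_file=$1
    [ -s "$dump_file" ] || return 1
    tail -n 5 "$dump_file" | grep -q '^-- Dump completed'
}
```

For example, dump_is_complete reactome_xx.dump && cat reactome_xx.dump | mysql test_reactome_xx loads the dump only if the trailer is present. A missing trailer usually means the dump or the scp was cut short and should be repeated.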


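Finally, recreate and load the BioMart database, which is reached through its own MySQL socket and port:

mysql --socket=/usr/local/gkb_prod/BioMart/mysql/tmp/sock --port=8087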

drop database test_reactome_mart;

create database test_reactome_mart;

exit;

scp reactomedev.oicr.on.ca:/usr/local/gkbdev/tmp/test_reactome_mart_<RELEASE NUMBER>_new /usr/local/gkb_prod/tmp

cat /usr/local/gkb_prod/tmp/test_reactome_mart_<RELEASE NUMBER>_new | \

mysql --socket=/usr/local/gkb_prod/BioMart/mysql/tmp/sock --port=8087 test_reactome_mart

180 Update Config.pm file for New Release

You will need to let the Reactome server know that you have switched to a new release. This is done by editing the Config.pm file:

cd /usr/local/gkb_prod/modules/GKB

Open the file "Config.pm" in your favorite editor, e.g. "vi".


Search for "GK_DB_NAME". Change the database name to <name of the release db> (e.g. test_reactome_25).

Search for "LAST_RELEASE_DATE". Change the date to the date of the previous release. The format of the date string is YYYYMMDD. You can determine the last release date as follows:

  • For human Reactome, you can find this date by looking at the URL www.reactome.org/news.html.
  • If the current release is the very first one for your species, set LAST_RELEASE_DATE to 00000000.
  • Otherwise, if you maintain a news file for your species, you will find the date at the top of the file /usr/local/gkb_prod/website/html/news.html

Save "Config.pm" and quit the editor.
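A mistyped date is easy to miss in Config.pm, so you may want to check the value before saving it. The following is a minimal sketch, written for sh/bash rather than csh; the function name valid_release_date is hypothetical. It checks only the YYYYMMDD shape described above, plus the special value 00000000 for a first release; it does not know how many days each month has.

```sh
# Hypothetical helper: succeed if the argument looks like a LAST_RELEASE_DATE
# value, i.e. eight digits in YYYYMMDD form, or 00000000 for a first release.
valid_release_date() {
    case $1 in
        00000000) return 0 ;;
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    month=${1#????}      # strip YYYY, leaving MMDD
    day=${month#??}      # strip MM, leaving DD
    month=${month%??}    # strip DD, leaving MM
    case $month in
        0[1-9]|1[0-2]) ;;
        *) return 1 ;;
    esac
    case $day in
        0[1-9]|[12][0-9]|3[01]) ;;
        *) return 1 ;;
    esac
    return 0
}
```

For example, valid_release_date 2013-03-12 fails because of the hyphens, whereas valid_release_date 20130312 succeeds.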

190 Create Front Page Images and Cached TOC File

cd /usr/local/gkb_prod/scripts/release

./create_frontpage_files.pl DB=test_reactome_<RELEASE NUMBER>

Although at first glance these next two commands don't look like commands, they are.

/usr/local/gkb_prod/website/cgi-bin/toc DB=test_reactome_<RELEASE NUMBER>

/usr/local/gkb_prod/website/cgi-bin/doi_toc DB=test_reactome_<RELEASE NUMBER>

Note that execution of create_frontpage_files.pl as well as the chown commands below may result in an error message 'Operation not permitted' if you are not a member of the unix user group as specified by $GKB::Config::WWW_USER variable ('nobody' on reactomedev.oicr.on.ca and reactome.oicr.on.ca). However, this should not affect the working of the server.

sudo chown -R ${USER}:www-data \

/usr/local/gkb_prod/website/html/img-fp/test_reactome_<RELEASE NUMBER>/toc

sudo chown -R ${USER}:www-data \

/usr/local/gkb_prod/website/html/img-fp/test_reactome_<RELEASE NUMBER>/doi_toc

If you have sudo permission problems, see David!

200 Copy database and static content for entity level view

This step is currently only significant for human Reactome. Releases for other species should skip this step.

cd /usr/local/gkb_prod/website/html/entitylevelview/pathway_diagram_statics

Copy over the underlying data for the ELV from reactomedev.oicr.on.ca.

scp -r reactomedev.oicr.on.ca:/usr/local/gkbdev/tmp/test_reactome_<RELEASE NUMBER>.tgz .

tar -zxvf test_reactome_<RELEASE NUMBER>.tgz

This will take some time, perhaps 10 minutes. Watch out for errors, particularly "no space left on device". If you get this error, you will need to stop the tar command (control-C), delete an older pathway diagram, and start the tar again.
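To avoid running out of space halfway through, you can check the free space before starting the tar command. The following is a minimal sketch, written for sh/bash rather than csh; the function name has_free_kb is hypothetical, and the amount of space to ask for is your own estimate (the unpacked data will be larger than the .tgz file).

```sh
# Hypothetical helper: succeed if the filesystem that holds the given
# directory has at least the requested number of kilobytes available.
has_free_kb() {
    need_kb=$1
    check_dir=$2
    # "df -Pk" prints one header line and one data line per filesystem;
    # the fourth column of the data line is the available space in KB.
    avail_kb=$(df -Pk "$check_dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}
```

For example, has_free_kb 5000000 . || echo "free up some space first" warns you if less than roughly 5 GB is available in the current directory; the figure 5000000 is only an illustration.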

210 Set up GWT

This step builds the GWT JavaScript, deploys the GWT servlets to Tomcat and puts some cached data into the database.

cd /usr/local/gkb_prod/scripts/release

sudo rm -rf /usr/local/reactomes/Reactome/fallback/apache-tomcat/webapps/ReactomeGWT*

./create_gwt.pl -db <name of the release db> -url http://www.reactome.org:8000

If you have sudo permission problems, see David!

Don't try to pipe the output of this command to a file. On reactome.oicr.on.ca, this procedure takes about 20 minutes to run.

220 Upgrade BioMart and Restart the Server

If a Reactome BioMart has not yet been installed, you will need to install it yourself. Instructions and downloadables are available from www.biomart.org . You should install into the directory /usr/local/gkb_prod/BioMart. You will know that there is already an installation present if this directory contains the subdirectories "apache", "biomart-perl" and "martj-0.6".

Even if there is a pre-existing BioMart installation, you must import the latest parameter settings into the database:

cd /usr/local/gkb_prod/scripts

perl update_reactome_mart.pl -sudo -biomart_version 0.7 -b /usr/local/gkb_prod/BioMart

This is an interactive script - you can simply give the default answers to all of the questions it asks you, by hitting the return key every time it asks a question.

N.B. Because you are running this command under sudo, which gives you temporary administrator privileges, you will also be asked for your password.

Manual ReleaseDB Website Check

Full testing is performed when the production server is available on reactome.oicr.on.ca. Full details on database testing are available here. The list is only for testing the mechanics of links and web pages. It is definitely not a complete check of the content! It is a very good idea to test using multiple browser environments. If you are lucky enough to have a panel of testers, assign each a browser to try out.

Once the new website is up and running, you will be able to find the tests on this page.

Which of these links is the right one???

Update Public Server

Subsequent steps will directly affect the publicly visible Reactome site, so take care!

Log in to reactome.oicr.on.ca for all subsequent steps.

In this section, <reactome.oicr.on.ca_production_main> should be replaced with the path to the production directory, /usr/local/gkb.

230 Reroute requests for www.reactome.org to www.reactome.org:8000

Once you have completed this step, people trying to access Reactome will be redirected to the fallback server while you update the production server. Effectively, the new release starts from this moment, since the fallback server already has the new release.

Edit the file:

/usr/local/gkb/website/conf/httpd.conf

Uncomment the following line:

#Redirect / http://www.reactome.org:8000/

...by deleting the hash ("#") symbol at the beginning of the line.

Save the file and exit the editor.
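If you would rather not edit httpd.conf by hand, the same change can be made with sed. The following is a minimal sketch, written for sh/bash rather than csh; the function names enable_redirect and disable_redirect are hypothetical. It assumes the Redirect line appears exactly as shown above, keeps a backup copy with a .bak suffix, and does not restart the webserver for you.

```sh
# Hypothetical helpers: switch the Redirect line in an Apache config file
# on (uncomment) or off (comment out). The line text is taken from this SOP.
REDIRECT_LINE='Redirect / http://www.reactome.org:8000/'

enable_redirect() {
    conf=$1
    # Keep a backup, then rewrite the original file in place so that its
    # ownership and permissions are preserved.
    cp -p "$conf" "$conf.bak" &&
        sed "s|^#[[:space:]]*\(${REDIRECT_LINE}\)[[:space:]]*\$|\1|" \
            "$conf.bak" > "$conf"
}

disable_redirect() {
    conf=$1
    cp -p "$conf" "$conf.bak" &&
        sed "s|^\(${REDIRECT_LINE}\)[[:space:]]*\$|#\1|" \
            "$conf.bak" > "$conf"
}
```

For example, enable_redirect /usr/local/gkb/website/conf/httpd.conf removes the hash, and disable_redirect with the same argument puts it back when traffic is switched back to the production server.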

240 Restart the webserver

sudo /etc/init.d/apache2 restart

This will ask you for a password – normally the same as your login password. If an error message appears stating you are not in the sudo list, you'll need to contact Peter Van Buren to be put on it.

250 Update all Source Code

Before you continue, you need to ensure that you have the most up-to-date version of Reactome, by doing CVS update in certain critical directories. Use 'cd' to change into each of these directories, and in each one, run BOTH sudo and BOTH cvs update commands:

sudo chgrp -R gkb *

sudo chmod -R g+rw *

cvs up -d

cvs up

You will be asked for the CVS password, which should be the same as your normal login password. You should do this for the following directories:

/usr/local/gkb/biopaxexporter

/usr/local/gkb/modules/GKB

Before updating the website directory, remove all files from the userguide directory - to avoid cvs conflicts arising:

rm -f /usr/local/gkb/website/html/userguide/*

Note: 1) you can ignore cvs conflicts concerning the images directory and previous release directories (like website/html/download/14); 2) this is slow, takes about 15 minutes.

/usr/local/gkb/website

(It appears that cvs up -d on the website directory does NOT update the files in the underlying html directory. Test whether an additional cvs up (without -d) helps; otherwise do a separate cvs update on the html directory - please update the SOP accordingly! Note: I confirmed this: cvs up -d wasn't sufficient, cvs up helped - ees.)

/usr/local/gkb/scripts

/usr/local/gkb/java

/usr/local/gkb/BioMart/reactome

/usr/local/gkb/ReactomeGWT

/usr/local/gkb/slicingTool

The ReactomeGWT directory is still experimental, so don't worry if there are errors, but please report them to the dev list.

You should keep an eye on the output from CVS: any lines beginning with a "C" indicate that conflicts have occurred during the update. These may need to be fixed by editing the file and resolving them by hand. You will recognize these conflicts in the form of lines beginning with '>>>', '===' or '<<<' in the affected file. These lines should be deleted, as should unwanted alternative lines inserted by CVS. If you do not feel confident to do this, contact one of the Reactome software staff.

260 Update the Download Directory

In principle we could just copy the newly added contents over and switch the symlinks. To save space, however, we do not keep two copies: the release directory is moved into the production hierarchy and the fallback hierarchy gets a symlink to it. Hence the following:

cd /usr/local/gkb/website/html/download/

mv /usr/local/gkb_prod/website/html/download/<RELEASE NUMBER> .

ln -s /usr/local/gkb/website/html/download/<RELEASE NUMBER> \

/usr/local/gkb_prod/website/html/download

Once this is complete, you will need to link the 'current' directory to the new download directory. If there is already a 'current' directory (normally the case), you will first need to remove the old 'current' directory. Proceed as follows:

ls -la current

You will see something like this:

lrwxrwxrwx 1 croft gkb 2 Dec 4 04:49 current -> <PREVIOUS RELEASE NUMBER>

The directory that 'current' points to will be an old release number. Continuing with this example, you will need to remove the old 'current' directory, which is slightly complicated:

mv <PREVIOUS RELEASE NUMBER> foo

rm current

mv foo <PREVIOUS RELEASE NUMBER>

Finally, create a new link:

ln -s <RELEASE NUMBER> current

270 Copy over Databases

This step creates and loads the databases:

  • reactome_stable_identifiers
  • gk_current
  • gk_current_dn

First, use mysql to create/make space for appropriate databases:

mysql -u curator -p<PASSWORD>

drop database reactome_stable_identifiers;

create database reactome_stable_identifiers;

drop database gk_current;

create database gk_current;

drop database gk_current_dn;

create database gk_current_dn;

exit;

Next, fill the newly created databases with data:

These databases are now so big that if you try to populate them in one step you almost always get a time-out. If you do, you should drop and re-create the database in mysql as in the preceding step. It makes sense to avoid this problem by first dumping the database into a file, which can then be loaded into your chosen database using cat. For example:


mysqldump --opt test_reactome_26_dn > reactome_26_dn.dump

cat reactome_26_dn.dump | mysql gk_current_dn -u curator -p<PASSWORD>


So do this for test_reactome_<RELEASE NUMBER>, test_reactome_<RELEASE NUMBER>_dn and reactome_stable_identifiers:


mysqldump --opt test_reactome_<RELEASE NUMBER> > reactome_<RELEASE NUMBER>.dump

cat reactome_<RELEASE NUMBER>.dump | mysql gk_current -u curator -p<PASSWORD>


mysqldump --opt test_reactome_<RELEASE NUMBER>_dn > reactome_<RELEASE NUMBER>_dn.dump

cat reactome_<RELEASE NUMBER>_dn.dump | mysql gk_current_dn -u curator -p<PASSWORD>


mysqldump --opt test_reactome_stable_identifiers > test_reactome_stable_identifiers.dump

cat test_reactome_stable_identifiers.dump | mysql reactome_stable_identifiers -u curator -p<PASSWORD>

280 Update Stable Identifier Database

You will also need to let the stable identifier database know that the live server uses gk_current, rather than test_reactome_<RELEASE NUMBER>. Start up mysql:

mysql -u curator -p<PASSWORD>

use reactome_stable_identifiers;

select * from DbParams;

You will see that in one row (probably the last row), the dbName is equal to test_reactome_<RELEASE NUMBER>. You need to change the dbName to gk_current, using an update command:

update DbParams set dbName='gk_current' where dbName='test_reactome_<RELEASE NUMBER>';

exit;

290 Update Config.pm file for New Release

You will need to let the Reactome server know that you have switched to a new release. This is done by editing the Config.pm file:

cd /usr/local/gkb/modules/GKB

Open the file "Config.pm" in your favorite editor, e.g. "vi".

Search for "LAST_RELEASE_DATE". Change the date to the date of the previous release. The format of the date string is YYYYMMDD. You can determine the last release date as follows:

  • For human Reactome, you can find this date by looking at the URL www.reactome.org/news.html.
  • If the current release is the very first one for your species, set LAST_RELEASE_DATE to 00000000.
  • Otherwise, if you maintain a news file for your species, you will find the date at the top of the file /usr/local/gkb/website/html/news.html

Save "Config.pm" and quit the editor.

300 Copy Over Front Page Images, Create Cached TOC File

Get rid of the old gk_current front pages:

cd /usr/local/gkb/website/html/img-fp

rm -rf gk_current

This may produce "Permission denied" errors, in which case you should do:

mv gk_current /tmp

instead. Now create new front pages for gk_current:

cp -r /usr/local/gkb_prod/website/html/img-fp/test_reactome_<RELEASE NUMBER> .

cp -r test_reactome_<RELEASE NUMBER> gk_current

cd gk_current

find . -name \* -exec /usr/local/gkb/scripts/swop.sh test_reactome_<RELEASE NUMBER> \

gk_current {} \; -print

Also create the tables of contents:

/usr/local/gkb/website/cgi-bin/toc DB=test_reactome_<RELEASE NUMBER>

/usr/local/gkb/website/cgi-bin/toc DB=gk_current

/usr/local/gkb/website/cgi-bin/doi_toc DB=test_reactome_<RELEASE NUMBER>

/usr/local/gkb/website/cgi-bin/doi_toc DB=gk_current

sudo chown -R ${USER}:www-data \

/usr/local/gkb/website/html/img-fp/test_reactome_<RELEASE NUMBER>/*toc

sudo chown -R ${USER}:www-data \

/usr/local/gkb/website/html/img-fp/gk_current/*toc

310 Clear search cache

Get rid of the old gk_current stored search results:

cd /usr/local/gkb/website/html/img-tmp

rm -f query_store_gk_current*

320 Link database and static content for entity level view

This step is currently only significant for human Reactome. Releases for other species should skip this step.

cd /usr/local/gkb/website/html/entitylevelview/pathway_diagram_statics

ln -s /usr/local/gkb_prod/website/html/entitylevelview/pathway_diagram_statics/test_reactome_<RELEASE NUMBER> .

rm gk_current

ln -s test_reactome_<RELEASE NUMBER> gk_current

330 Set up GWT

This step builds the GWT JavaScript, deploys the GWT servlets to Tomcat and puts some cached data into the database.

cd /usr/local/gkb/scripts/release

sudo rm -rf /usr/local/reactomes/Reactome/production/apache-tomcat/webapps/ReactomeGWT*

./create_gwt.pl -db gk_current -url http://www.reactome.org

Don't try to pipe the output of this command to a file. On reactome.oicr.on.ca, this procedure takes about 20 minutes to run.

340 Switch Back to Public Server

Edit the file:

/usr/local/gkb/website/conf/httpd.conf

Comment the following line (by adding # in front of it):

Redirect / http://www.reactome.org:8000/

Save the file and exit the editor.

Restart the webserver (you need to have permissions to do it):

sudo /etc/init.d/apache2 restart

Point your browser to http://www.reactome.org and check that everything works technically. (If not, re-route the traffic to http://www.reactome.org:8000 while you sort out the problems).

350 Restart Tomcat for WS SOAP API

After release, Tomcat should be restarted to make the WS SOAP API correct. To restart, please use the following command:

sudo /etc/rc5.d/S90tomcat5.5 restart

To run this, you need to have sudo privilege.

Post-release (EBI release manager/ Joel Weiser and Outreach manager)

Post-release communications

Reactome Announcement

The release manager (on the NYU editorial side) prepares the text of the announcement email, based on the previous release announcement and the current release details. This needs to be sent to Robin and Peter for revision and review before release. From your official email mailbox, send this email to reactome-announce@reactome.org. You must be a reactome-announce list administrator to do this. This is a mailing list with open subscription but restricted posting privileges. Generally, the system administrator (Peter van Buren) will allow the message to be sent to the members of the list. If you have any doubts about the list, contact the system administrator. You can check whether the mailing went out successfully by looking at the archive. In the event that the message gets held for approval before sending, you will receive the approval request message (since you are an administrator for the list).

Post News on Reactome

The Reactome announcement is also posted on the Reactome News webpage. The News system is based upon the Wordpress theme. Approximately 1-2 weeks before release, the news item is uploaded into the Wordpress webpages. Preview and save the unreleased news item but do not publish it. Make sure to check the links to each new/updated content item that has a hyperlink, as these cannot be checked in advance. On the day of release: log into Wordpress, click on the unreleased news item in the summary list, and press the publish button. Check on the News site that the news item is released. To refresh the Reactome home page to show the new News item, reload /ReactomeGWT from the Tomcat Web Application Manager (David, Robin and Guanming can perform this task). Check that the News item on the home page is updated.

EBI Reactome announcement (prepared by outreach manager, forwarded by EBI release manager)

A Reactome announcement is also posted by the EBI. A shortened (55 word) announcement is prepared by the editorial release team in this format and sent (AFTER manual release database QA) to the release manager at the EBI, who forwards it to EBI External Services <es-request@ebi.ac.uk>. This announcement MUST be forwarded by an EBI staff member or the request will not be processed. To ensure that the announcement is published on the day of the release, the announcement must be submitted 2 days ahead of the scheduled release date.

Social network announcements (prepared by outreach manager)

The Reactome announcement is also posted on Facebook, Twitter and LinkedIn. The full release announcement is posted on Facebook and LinkedIn. However, a shortened (140 character) announcement is posted on Twitter.

Submission of Reactome data files to external groups

CVS commit of GOA submission file (Joel Weiser prepares)

(In case of problems contact Guanming Wu.)

This step should be carried out on reactomedev.oicr.on.ca.

Go to the GO submission directory:

cd /usr/local/gkbdev/GO_submission/go/gene-associations/submission/

gzip gene_association.reactome

cvs commit -m "Reactome release <RELEASE NUMBER>" gene_association.reactome.gz

Ask Esther or Steve about the password.

Generate an updated GO term to Reactome term mapping file (Joel Weiser prepares)

(In case of problems, contact Guanming.)

This step should be carried out on reactomedev.oicr.on.ca.

Go to /usr/local/gkbdev/java/GOMapper and run the GOMapper program:

java -Xmx256m -jar Reactome2GoMapper.jar localhost test_reactome_XX authortool T001test Reactome2GO_VXX

where test_reactome_XX is the current test database and the output file is called Reactome2GO_VXX where XX is the current Reactome version.

This file should be sent to Amelia Ireland at the GO consortium (aji@ebi.ac.uk).

If you have a local unix machine, scp the file to it. If you have a Windows machine, it's a bit more complicated. Simple FTP will not work. Best bet is to use Putty pscp - install Putty, open a command prompt window and type

set PATH=%PATH%;C:\path\to\putty

pscp

pscp your_username@reactomedev.oicr.on.ca:/usr/local/gkbdev/java/GOMapper/Reactome2GO_VXX .


That should work - don't forget the final "." (the destination directory).

LinkOut FT files for NCBI (prepared by outreach manager)

Please read this important note before running the scripts.

From the GKB directory on reactomedev:

These scripts must be run in GKB/scripts in order and output files left in place until all three scripts have run.

1. Run the 1geneentrez.pl script first:

perl 1geneentrez.pl

You will be prompted for the version number (plain numerals) and the db on reactomedev.oicr.on.ca (use test_reactome_xx). Once you enter this information, the script will create output files that will be sent to the GKB directory.

2. Run 1proteinentrez.pl:

perl 1proteinentrez.pl

You will be prompted for the version number (plain numerals) and the db on reactomedev.oicr.on.ca (use test_reactome_xx). Once you enter this information, the script will create output files that will be sent to the GKB directory.

3. Run 1omimentrez.pl:

perl 1omimentrez.pl

You will be prompted for the version number (plain numerals) and the db on reactomedev.oicr.on.ca (use test_reactome_xx). Once you enter this information, the script will create output files that will be sent to the GKB directory.


The three files that will be uploaded to NCBI are:

gene_reactomeXX.xml
protein_reactomeXX.ft (flat file)
omim_reactomeXX.ft (flat file)

To FTP the files:

ftp ftp-private.ncbi.nih.gov

username: reactome password: (ask NYU editorial release manager)

Once logged on to the ftp site:

cd holdings
put gene_reactomeXX.xml
put protein_reactomeXX.ft
put omim_reactomeXX.ft


cd holdings
ls -l

Verify that the files are there.

These are the live linkout files, and are reprocessed twice a day. When you add, remove, or update files in your holdings directory, the corresponding changes should appear on the NCBI web pages with the next daily update of Entrez.

Once you upload the new files, you MUST delete the old files from the holdings directory by using "del filename" command.

After deleting the old files, you can list the files and compare the file sizes between the old and new listings (from the same ftp window by scrolling up). The new files *should* be larger IF THE DB IS OK AND THERE HAVE NOT BEEN MAJOR SCHEMA CHANGES.
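The same size comparison can be made locally before uploading. The following is a minimal sketch, written for sh/bash rather than csh, and it assumes that you have kept a local copy of the previous release's files, which this SOP does not require. The function name not_smaller is hypothetical.

```sh
# Hypothetical helper: succeed if new_file is at least as large (in bytes)
# as old_file. Returns 2 if either file is missing.
not_smaller() {
    old_file=$1
    new_file=$2
    [ -f "$old_file" ] && [ -f "$new_file" ] || return 2
    # tr strips the padding that some wc implementations add.
    old_size=$(wc -c < "$old_file" | tr -d ' ')
    new_size=$(wc -c < "$new_file" | tr -d ' ')
    [ "$new_size" -ge "$old_size" ]
}
```

For example, calling not_smaller with the previous and the new protein flat file, followed by || echo "new file is smaller - check the DB", flags a file that has shrunk. A smaller file is not necessarily wrong, for the reasons given above, but it deserves a second look.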

HapMap (prepared by outreach manager)

Please read this note before running scripts for the first time. In the directory GKB, run the '1haprefseq.pl' script. Use the database called 'test_reactomeXX' where XX is the release version number. This will create the file 'hapmap_reactomeXX.txt' where XX is the version number. This file should be forwarded to the HapMap contacts (see the end of this paragraph). For each release, make a directory HapMap_links_VerXX. Archive files here after they are sent. Test a new gene at www.hapmap.org; it should have a Reactome pathway associated with it. The HapMap contacts are Lon Phan (lonphan@ncbi.nlm.nih.gov), Hua Zhang (zhahua@ncbi.nlm.nih.gov) and Marcela Karey Monaco (karey.tello@gmail.com).

UCSC (prepared by outreach manager)

Before running this script for the first time, please read this note. In the directory GKB, run the '1ucscentity.pl' script. Enter the database 'test_reactomeXX' where XX is the version number. This will create the files 'ucsc_entityXX' and 'ucsc_events' where XX is the version number. These files are emailed to Jim Kent (kent@soe.ucsc.edu) and Fan (fanhsu@soe.ucsc.edu) at UCSC. For each release, make a directory UCSC_links_VerXX. Archive files here after they are sent.

GSEA (prepared by outreach manager)

Insert

Post release QA

Create skip lists for QA tool based on previous release QA

To be written

Check for new events that have been pulled from release but have kept releaseDate entry

To be added

Check for proteins in gk_central that have not been released

A procedure to check for proteins in gk_central that have not been released can be found here