DataServices:SNP

From CVRG Wiki

System Requirements

The following technologies are used by the SNP caGrid 1.2 Data Service.

Software Type            Vendor/Name            Minimum Required Version
IDE                      Eclipse                Europa
Development Language     Java                   5.0
UML Modeling             Enterprise Architect   7.0
Build Mechanism          Ant*                   1.7.0
Dependency Management    IVY                    2.0.0-beta2
Application Server       Tomcat                 5.5/6.0
Virtualization (opt)     vmware                 Workstation 6
Database                 MySQL                  5.0.22
Grid Framework           Globus                 ws-enum-4.0.3
Grid Middleware          caGrid                 1.2
Grid Service Generator   Introduce              1.2
caCORE SDK               caCORE                 3.2.1


The following ports are used by the SNP caGrid 1.2 Data Service.

Software Component           Port   Note
Data Service Port            9449   This port needs to be open to the internet.
Data Service Shutdown Port   9009   This port should be closed.
caCORE SDK Port              9081   This port needs to be open only to the server on which the SNP Data Service is running.
caCORE SDK Shutdown Port     9018   This port should be closed.
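
The port policy in the table above could be enforced with host firewall rules. The following is a minimal sketch, assuming an iptables-based firewall; the function name print_snp_firewall_rules and the choice to print the rules rather than apply them are illustrative and not part of the SNP distribution. The Data Service host address defaults to 127.0.0.1, which assumes the caCORE SDK and the Data Service run on the same machine.

```shell
# Print (do not apply) iptables rules that match the port table above.
# Argument 1: address of the server running the SNP Data Service
#             (assumed default: 127.0.0.1, i.e. same machine as the caCORE SDK).
print_snp_firewall_rules() {
    ds_host="${1:-127.0.0.1}"

    # Data Service port: reachable from anywhere.
    echo "iptables -A INPUT -p tcp --dport 9449 -j ACCEPT"

    # caCORE SDK port: reachable only from the Data Service host.
    echo "iptables -A INPUT -p tcp --dport 9081 -s $ds_host -j ACCEPT"
    echo "iptables -A INPUT -p tcp --dport 9081 -j DROP"

    # Shutdown ports: reachable only from the local machine.
    for port in 9009 9018; do
        echo "iptables -A INPUT -p tcp --dport $port -s 127.0.0.1 -j ACCEPT"
        echo "iptables -A INPUT -p tcp --dport $port -j DROP"
    done
}
```

Calling print_snp_firewall_rules 10.0.0.5 (a made-up address) lists the rules for review. Because -A appends to the INPUT chain, the rules only take effect as intended if no earlier rule already accepts the same traffic.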


Source Repository

With $CVRG_LOCATION being the SVN checkout from https://scm.cci.emory.edu/svn/cvrg/trunk, the following are key locations for SNP:

  • SNP caCORE Application
    • $CVRG_LOCATION/dev/applications/JHUSNP/cacoresdk/cacoresdk-3.2.1-mysql.tar
    • $CVRG_LOCATION/dev/applications/JHUSNP/cacoresdk/Tomcat_JHUSNPcaCORE3.2.1
  • SNP DDL Scripts
    • $CVRG_LOCATION/dev/deployment/ddl_scripts/snpdb/snpdb.sql
    • $CVRG_LOCATION/dev/deployment/ddl_scripts/snpdb/snpdb_without_contraints.sql
  • SNP Data Model
    • $CVRG_LOCATION/dev/models/jhu_arking_snpdb/ea/snp.EAP
  • SNP Data Service
    • $CVRG_LOCATION/dev/services/JHUSNPDataService
  • Test Data for Systems Tests
    • $CVRG_LOCATION/dev/integrationtest/databases/JHUSNP


Data Model

SNP Data Model


Installing SNP 

The following software must be installed on the deployment server for SNP to run.

Step 1: Install Prerequisite Software

  1. J2SE 5.0
  2. Tomcat 5.5 for SNP caCORE
  3. Tomcat 5.5 for SNP caGrid
  4. Globus (Ws-core-4.0.3.zip)
  5. Ant 1.7.0


Step 2: Setup environment variables

  1. Create a GLOBUS_LOCATION environment variable and point it at the directory in which you installed Globus (ws-enum-4.0.3).
  2. Create a CATALINA_HOME environment variable and point it at the install directory of one of the Tomcat containers.
    1. This variable changes depending on which Tomcat container you are working with (starting, stopping, or deploying to).
  3. Create a JAVA_HOME environment variable and point it at the directory in which you installed Java.
  4. Create an ANT_HOME environment variable and point it at the directory in which you installed Ant.
  5. Add the following values to your PATH variable:
    1. $CATALINA_HOME/bin
    2. $ANT_HOME/bin
    3. $JAVA_HOME/bin
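
The variables from Step 2 could be collected in one shell profile fragment. This is a sketch only: every path below is a hypothetical install location, and the helper use_tomcat is an illustrative name for switching CATALINA_HOME between the two containers. TOMCAT_SNP_SDK_HOME and TOMCAT_SNP_HOME are the container names used in Steps 3 and 4.

```shell
# Hypothetical install locations; adjust each path to your own system.
# Values already present in the environment are kept.
export GLOBUS_LOCATION="${GLOBUS_LOCATION:-/opt/ws-core-4.0.3}"
export JAVA_HOME="${JAVA_HOME:-/usr/java/jdk1.5.0}"
export ANT_HOME="${ANT_HOME:-/opt/apache-ant-1.7.0}"

# Two Tomcat containers: one for SNP caCORE, one for SNP caGrid.
export TOMCAT_SNP_SDK_HOME="${TOMCAT_SNP_SDK_HOME:-/opt/tomcat-snp-sdk}"
export TOMCAT_SNP_HOME="${TOMCAT_SNP_HOME:-/opt/tomcat-snp-cagrid}"

# Point CATALINA_HOME at one of the two containers: "sdk" or "cagrid".
use_tomcat() {
    case "$1" in
        sdk)    CATALINA_HOME="$TOMCAT_SNP_SDK_HOME" ;;
        cagrid) CATALINA_HOME="$TOMCAT_SNP_HOME" ;;
        *)      echo "usage: use_tomcat sdk|cagrid" >&2; return 1 ;;
    esac
    export CATALINA_HOME
}

use_tomcat cagrid
export PATH="$CATALINA_HOME/bin:$ANT_HOME/bin:$JAVA_HOME/bin:$PATH"
```

PATH keeps the bin directory of whichever container was selected when the fragment was loaded, so after a later use_tomcat sdk it is safer to call the scripts by full path, for example "$CATALINA_HOME/bin/startup.sh".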


Step 3: SNP SDK and Database

The SNP caCORE application is located in the SVN repository as a single compressed file ($CVRG_LOCATION/dev/applications/jhusnpdb/cacoresdk/cacoresdk-3.2.1-mysql.tar). Within this file there is a Web Archive (war) file, /output/package/webapp/jhusnpdb.war, which is the JHUSNP caCORE application.

  1. Install MySQL and create the jhusnp database with the name snpdb.
    1. The section “Installing MySQL 5.0.22 on CentOS 5 VM” will guide you through the steps for installing and configuring MySQL.
    2. NOTE: Populating the jhusnp database with sample data is optional, and is primarily used for system testing and development.
  2. Install a clean Tomcat container.
    1. Modify $TOMCAT_SNP_SDK_HOME/conf/server.xml to have the following:
      • HTTP connector with port 9081.
      • Shutdown port as 9018.
    2. Copy /output/package/webapp/jhusnpdb.war to $TOMCAT_SNP_SDK_HOME/webapps/ and start Tomcat.
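
The server.xml change could be scripted as follows. This sketch assumes the file is still the stock one shipped with Tomcat, with shutdown port 8005 on the Server element and HTTP connector port 8080; the function name set_sdk_ports is illustrative. If the file has already been edited, make the change by hand instead.

```shell
# Rewrite the stock Tomcat ports in a server.xml file.
# Assumes the Tomcat defaults: <Server port="8005" ...> and <Connector port="8080" ...>.
# A backup copy is kept next to the original as server.xml.bak.
set_sdk_ports() {
    server_xml="$1"
    cp "$server_xml" "$server_xml.bak" || return 1
    sed -e 's/<Server port="8005"/<Server port="9018"/' \
        -e 's/<Connector port="8080"/<Connector port="9081"/' \
        "$server_xml.bak" > "$server_xml"
}
```

A typical call would be set_sdk_ports "$TOMCAT_SNP_SDK_HOME/conf/server.xml", run before Tomcat is started for the first time.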


  • The SNP caCORE 3.2.1 application can be loosely tested by opening http://localhost:9081/jhusnpdb/. You should see the standard caCORE browser user interface to query for the Assay, Marker, RawData, SampleGeno, SubjectGeno, and SubjectInfo objects.
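
The same check can be made from a shell on a server without a browser. The sketch below assumes curl is installed; the function names are illustrative. It treats only an HTTP 200 response as success, so if the application answers with a redirect, adding curl's -L option to follow it would be needed.

```shell
# Build the caCORE browser URL for a given host (default: localhost).
snp_sdk_url() {
    echo "http://${1:-localhost}:9081/jhusnpdb/"
}

# Succeed if the URL answers with HTTP 200.
# -s silences progress output, -o discards the body,
# -w prints only the status code.
check_snp_sdk() {
    status="$(curl -s -o /dev/null -w '%{http_code}' "$(snp_sdk_url "$1")")"
    [ "$status" = "200" ]
}
```

For example, check_snp_sdk && echo "caCORE application is up" reports success only when Tomcat is running and the war file has been deployed.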


Step 4: SNP caGrid 1.2 Data Service

Since SNP is deployed to a Tomcat container within the production system, workstation systems typically use Tomcat as a service container also.

When the SNP Data Service is generated, an ant task (deployTomcat) is also created within its build.xml file. With $TOMCAT_SNP_HOME being the install path of the Tomcat container used by SNP caGrid 1.2, make sure that the environment variable $CATALINA_HOME is set to $TOMCAT_SNP_HOME.

  • Running ant deployTomcat from $CVRG_LOCATION/dev/services/JHUSNPDataService will deploy the data service to Tomcat.

After the SNP Data Service is deployed to $TOMCAT_SNP_HOME, that Tomcat container should be restarted.
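
The deployment and restart could be wrapped in a small script. This is a sketch under stated assumptions: the helper names run and deploy_snp_data_service and the DRY_RUN switch are illustrative, and the restart uses the standard shutdown.sh and startup.sh scripts in Tomcat's bin directory.

```shell
# Run a command, or just print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Deploy the SNP Data Service to the caGrid Tomcat container and restart it.
deploy_snp_data_service() {
    : "${CVRG_LOCATION:?CVRG_LOCATION must be set}"
    : "${TOMCAT_SNP_HOME:?TOMCAT_SNP_HOME must be set}"

    # The deployTomcat target deploys to whatever CATALINA_HOME points at.
    export CATALINA_HOME="$TOMCAT_SNP_HOME"

    run cd "$CVRG_LOCATION/dev/services/JHUSNPDataService" || return 1
    run ant deployTomcat || return 1

    # Restart the container; shutdown.sh fails harmlessly if Tomcat is not running.
    run "$CATALINA_HOME/bin/shutdown.sh"
    run "$CATALINA_HOME/bin/startup.sh"
}
```

Setting DRY_RUN=1 before calling deploy_snp_data_service prints the four commands without executing them. On a busy server it may be necessary to wait a few seconds between shutdown and startup so that the ports are released.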


Step 5: Testing SNP Data Service


<TODO: Complete Documentation>


Building SNP 

Creating the SNP caGrid 1.2 Data Service using Introduce 1.2

Creating the SNP data service with caCORE SDK 3.2.1 style (Remote API) with caGrid 1.2 and Introduce 1.2


  • In Introduce, click the button labeled “Create caGrid Service Skeleton”
    • Select a directory for the service
      • Create a new directory under the $CVRG_LOCATION/dev/services directory with the name “SNPDataService”
    • Type a name, “SNPDataService”, for the service
      • By convention, the name should start with an uppercase letter and should match the directory name
    • Make sure the service is in a package “org.cvrgrid.snp.arking”
    • In the “Standard” tab, select “Data Service” radio button
    • Keep default choices under “Advanced”
    • Click “Create”
      • When the dialog for “Data Service Configuration” appears, select “caCORE SDK v3.2.1” from the drop-down list, and click “OK”.
  • Wizard panels
    • Here you select the directory that contains your client libraries. The guidelines for choosing the directory are: 1) if the service is remote, choose the following directory created by caCORE SDK: output/SNPDataService/package/client/lib, or 2) if the service is local (and you want to access it locally), choose the “thick-client” lib: output/SNPDataService/package/thick-client/lib. Then select the matching radio button, “Remote API” or “Local API”. There are no additional dependencies (unless you have others).
    • Next, enter a remote service URL of the following form: http://{web_server_name}:9081/jhusnpdb/http/remoteService.
    • In step 4, select “Domain Model From File” and browse to the location of the XMI file for your caCORE SDK project. Select the “fixed” XMI that was produced by the caCORE SDK. Note: be sure to click the file type drop-down box and select XMI (and NOT XML!). Click OK.
    • Then fill in the values in the pop-up dialog box that follows. The project short name and project version are useful for versioning the domain model. I recommend the project version match the caGrid release (1.2). Do NOT check Fix EA model, since you selected the fixed model. Click OK.
    • Click Next.
    • In step 5, packages from the model will be listed. For each package, click the “Resolve” button and select from file. The guidelines for choosing the directory are: 1) if the service is remote, you want to choose the following directory created by caCORE SDK: output/<project name>/package/client/conf, or 2) if the service is local (and you want to access it locally), choose the “thick-client” conf: output/<project name>/package/thick-client/conf. Choose the xml schema in the directory. Click “Load Schemas”.
  • Optional service property change
    • In the Introduce GUI, you can modify a service property called “dataService_validateCqlFlag” to have the value “true”. If you make this change, the service will validate incoming CQL before executing the query, eliminating one source of errors. Note: as of Introduce 1.1, you actually need to edit the service.properties to make this change before deploying.
  • Deploy the service
    • If you need to modify the caCORE system backend location, simply modify the location in the service’s service.properties file and re-deploy
    • Deploy the service with “ant deployGlobus” from the service’s top level directory. Be sure that GLOBUS_LOCATION environment variable is set properly before deploying. When deploying the service, be sure to use GLOBUS_LOCATION as the destination during development. During CVRG deployment, we want to use CATALINA_HOME (which is tomcat).
    • Be sure the index service URL is cagrid05 so that others can locate your service.
    • Finally, remoteService.xml (in the output/<project name>/conf directory) needs to be on the classpath for data service clients at the moment (SDK 3.2 makes remote calls and needs that file to do so).
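
The optional property change described above can be made directly in service.properties before deploying. The following sketch assumes the property is stored as a plain key=value line named dataService_validateCqlFlag; the function name enable_cql_validation is illustrative. If the key is missing, the sketch appends it.

```shell
# Set dataService_validateCqlFlag=true in a service.properties file.
# Assumes the key appears as "dataService_validateCqlFlag=<value>" on its own line.
enable_cql_validation() {
    props="$1"
    if grep -q '^dataService_validateCqlFlag=' "$props"; then
        # Replace the existing value, leaving every other line untouched.
        sed 's/^dataService_validateCqlFlag=.*/dataService_validateCqlFlag=true/' \
            "$props" > "$props.tmp" && mv "$props.tmp" "$props"
    else
        # Key not present: append it.
        echo 'dataService_validateCqlFlag=true' >> "$props"
    fi
}
```

Run it against the service.properties file in the service's top level directory, then deploy with ant deployGlobus or ant deployTomcat as appropriate.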


Installing MySQL 5.0.22 on CentOS 5 VM 


Installation

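  sudo apt-get install mysql-server (Gridgrouper and GTS require MySQL)
  sudo apt-get install mysql-client (Gridgrouper and GTS require MySQL)

  Note: apt-get is the Debian/Ubuntu package manager. On CentOS 5 the equivalent is yum, for example sudo yum install mysql-server mysql.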

Configuration

(With the mysql-server 5.0.22 package, it appears you can run sudo /etc/init.d/mysql reset-password just after installation to change the password. Start the server with sudo /etc/init.d/mysql start first, and then change the password.)

IMPORTANT: Configure MySQL by dropping the anonymous users and setting the root password as described below.

The grant tables define the initial MySQL user accounts and their access privileges. These accounts are set up as follows:

  1. Accounts with the username root are created. These are superuser accounts that can do anything. The initial root account passwords are empty, so anyone can connect to the MySQL server as root — without a password — and be granted all privileges.
    1. On Windows, one root account is created; this account allows connecting from the local host only. The Windows installer will optionally create an account allowing for connections from any host only if the user selects the Enable root access from remote machines option during installation.
    2. On Unix, both root accounts are for connections from the local host. Connections must be made from the local host by specifying a hostname of localhost for one of the accounts, or the actual hostname or IP number for the other.
  2. Two anonymous-user accounts are created, each with an empty username. The anonymous accounts have no password, so anyone can use them to connect to the MySQL server.
    1. On Windows, one anonymous account is for connections from the local host. It has no global privileges. (Before MySQL 5.1.16, it has all global privileges, just like the root accounts.) The other is for connections from any host and has all privileges for the test database and for other databases with names that start with test.
    2. On Unix, both anonymous accounts are for connections from the local host. Connections must be made from the local host by specifying a hostname of localhost for one of the accounts, or the actual hostname or IP number for the other. These accounts have all privileges for the test database and for other databases with names that start with test_.

As noted, none of the initial accounts have passwords. This means that your MySQL installation is unprotected until you do something about it:

  1. If you want to prevent clients from connecting as anonymous users without a password, you should either assign a password to each anonymous account or else remove the accounts.
  2. You should assign a password to each MySQL root account.


The following instructions describe how to set up passwords for the initial MySQL accounts, first for the anonymous accounts and then for the root accounts. Replace “newpwd” in the examples with the actual password that you want to use. The instructions also cover how to remove the anonymous accounts, should you prefer not to allow anonymous access at all.

You might want to defer setting the passwords until later, so that you don't need to specify them while you perform additional setup or testing. However, be sure to set them before using your installation for production purposes.

Anonymous Account Password Assignment

To assign passwords to the anonymous accounts, connect to the server as root and then use either SET PASSWORD or UPDATE. In either case, be sure to encrypt the password using the PASSWORD() function.

To use SET PASSWORD on Windows, do this:

  shell> mysql -u root
  mysql> SET PASSWORD FOR ''@'localhost' = PASSWORD('newpwd');
  mysql> SET PASSWORD FOR ''@'%' = PASSWORD('newpwd');

To use SET PASSWORD on Unix, do this:

  shell> mysql -u root
  mysql> SET PASSWORD FOR ''@'localhost' = PASSWORD('newpwd');
  mysql> SET PASSWORD FOR ''@'host_name' = PASSWORD('newpwd');

In the second SET PASSWORD statement, replace host_name with the name of the server host. This is the name that is specified in the Host column of the non-localhost record for root in the user table. If you don't know what hostname this is, issue the following statement before using SET PASSWORD:

  mysql> SELECT Host, User FROM mysql.user;

Look for the record that has root in the User column and something other than localhost in the Host column. Then use that Host value in the second SET PASSWORD statement.

Anonymous Account Removal

If you prefer to remove the anonymous accounts instead, do so as follows:

  shell> mysql -u root
  mysql> DROP USER '';

The DROP statement applies both to Windows and to Unix. On Windows, if you want to remove only the anonymous account that has the same privileges as root, do this instead:

  shell> mysql -u root
  mysql> DROP USER ''@'localhost';

That account allows anonymous access but has full privileges, so removing it improves security.

Root Account Password Assignment

You can assign passwords to the root accounts in several ways. The following discussion demonstrates three methods:

  1. Use the SET PASSWORD statement
  2. Use the mysqladmin command-line client program
  3. Use the UPDATE statement

To assign passwords using SET PASSWORD, connect to the server as root and issue two SET PASSWORD statements. Be sure to encrypt the password using the PASSWORD() function.

For Windows, do this:

  shell> mysql -u root
  mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('newpwd');
  mysql> SET PASSWORD FOR 'root'@'%' = PASSWORD('newpwd');

For Unix, do this:

  shell> mysql -u root
  mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('newpwd');
  mysql> SET PASSWORD FOR 'root'@'host_name' = PASSWORD('newpwd');

In the second SET PASSWORD statement, replace host_name with the name of the server host. This is the same hostname that you used when you assigned the anonymous account passwords.

To assign passwords to the root accounts using mysqladmin, execute the following commands:

  shell> mysqladmin -u root password "newpwd"
  shell> mysqladmin -u root -h host_name password "newpwd"

These commands apply both to Windows and to Unix. In the second command, replace host_name with the name of the server host. The double quotes around the password are not always necessary, but you should use them if the password contains spaces or other characters that are special to your command interpreter.

You can also use UPDATE to modify the user table directly. The following UPDATE statement assigns a password to both root accounts at once:

  shell> mysql -u root
  mysql> UPDATE mysql.user SET Password = PASSWORD('newpwd')
    ->     WHERE User = 'root';
  mysql> FLUSH PRIVILEGES;

The UPDATE statement applies both to Windows and to Unix.

After the passwords have been set, you must supply the appropriate password whenever you connect to the server. For example, if you want to use mysqladmin to shut down the server, you can do so using this command:

  shell> mysqladmin -u root -p shutdown
  Enter password: (enter root password here)
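
Once the root password is set, the snpdb database mentioned in Step 3 could be created and loaded from the DDL script in the source repository. This is a sketch, not the project's own procedure: the function name create_snpdb and the MYSQL override variable are illustrative, and it assumes the DDL script can be loaded as root into an empty database. The mysql client prompts for the root password twice, once per command.

```shell
# Create the snpdb database and load the SNP DDL script into it.
# CVRG_LOCATION must point at the SVN checkout.
# MYSQL (illustrative) can name a different mysql client binary.
create_snpdb() {
    : "${CVRG_LOCATION:?CVRG_LOCATION must be set}"
    ddl="$CVRG_LOCATION/dev/deployment/ddl_scripts/snpdb/snpdb.sql"
    mysql_bin="${MYSQL:-mysql}"

    # Stop early if the DDL script is missing or unreadable.
    [ -r "$ddl" ] || { echo "cannot read $ddl" >&2; return 1; }

    # Create the database only if it does not exist yet.
    "$mysql_bin" -u root -p -e "CREATE DATABASE IF NOT EXISTS snpdb" || return 1

    # Feed the DDL script to the client with snpdb as the default database.
    "$mysql_bin" -u root -p snpdb < "$ddl"
}
```

Loading the sample data under $CVRG_LOCATION/dev/integrationtest/databases/JHUSNP remains optional and is not covered by this sketch.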



