DataServices:ProteinDB

From CVRG Wiki

System Requirements

The following technologies are used by the ProteinDB caGrid 1.2 Data Service.

Software Type            Vendor/Name            Minimum Required Version
IDE                      Eclipse                Europa
Development Language     Java                   5.0
UML Modeling             Enterprise Architect   7.0
Build Mechanism          Ant                    1.7.0
Dependency Management    IVY                    2.0.0-beta2
Application Server       Tomcat                 5.5/6.0
Virtualization (opt)     vmware Workstation     6
Database                 PostgreSQL             8.1.11
Grid Framework           Globus                 ws-enum-4.0.3
Grid Middleware          caGrid                 1.2
Grid Service Generator   Introduce              1.2
caCORE SDK               CaCORE                 4.0


The following ports are used by the ProteinDB caGrid 1.2 Data Service.

Software Component          Port   Note
Data Service Port           9449   This port needs to be open to the internet.
Data Service Shutdown Port  9009   This port should be closed.
caCORE SDK Port             9081   This port needs to be open only to the server on which the ProteinDB Data Service is running.
caCORE SDK Shutdown Port    9018   This port should be closed.
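
The port table can be read as a set of firewall rules. The following is only a sketch under stated assumptions: it assumes a Linux host that uses iptables, which this page does not specify, and the helper name print_proteindb_firewall_rules is hypothetical. It prints candidate rules for review rather than applying them.

```shell
# Hypothetical helper: print (not apply) iptables rules that mirror the port table.
# $1 = IP address of the server running the ProteinDB Data Service (an assumption
#      of this sketch; the page does not say how that server is identified).
print_proteindb_firewall_rules() {
    data_service_ip="$1"
    # Data Service port: open to the internet.
    echo "iptables -A INPUT -p tcp --dport 9449 -j ACCEPT"
    # caCORE SDK port: open only to the data service host.
    echo "iptables -A INPUT -p tcp --dport 9081 -s $data_service_ip -j ACCEPT"
    echo "iptables -A INPUT -p tcp --dport 9081 -j DROP"
    # Shutdown ports: drop anything that does not arrive on the loopback interface.
    for port in 9009 9018; do
        echo "iptables -A INPUT ! -i lo -p tcp --dport $port -j DROP"
    done
}
```

For example, print_proteindb_firewall_rules 10.0.0.5 prints five rules; the address is a placeholder. Review the output against your site's firewall policy before applying anything.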


Source Repository

With $CVRG_LOCATION being the SVN checkout from https://project.bmi.ohio-state.edu/svn/cvrg/trunk, the following are key locations for ProteinDB:

  • ProteinDB caCORE Application
    • $CVRG_LOCATION/dev/applications/JHUProteinDB/cacoresdk/cacoresdk-3.2.1-mysql.tar
    • $CVRG_LOCATION/dev/applications/JHUProteinDB/cacoresdk/Tomcat_JHUProteinDBcaCORE3.2.1
  • ProteinDB DDL Scripts
    • $CVRG_LOCATION/dev/deployment/ddl_scripts/ProteinDBdb/ProteinDBdb.sql
    • $CVRG_LOCATION/dev/deployment/ddl_scripts/ProteinDBdb/ProteinDBdb_without_contraints.sql
  • ProteinDB Data Model
    • $CVRG_LOCATION/dev/models/jhu_arking_ProteinDBdb/ea/ProteinDB.EAP
  • ProteinDB Data Service
    • $CVRG_LOCATION/dev/services/JHUProteinDBDataService
  • Test Data for Systems Tests
    • $CVRG_LOCATION/dev/integrationtest/databases/JHUProteinDB
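
A minimal sketch of checking out the repository and confirming that the key ProteinDB directories are present. The default checkout directory ($HOME/cvrg) and the helper name check_proteindb_paths are assumptions for illustration and are not part of the CVRG tooling.

```shell
# Assumed checkout directory; any writable location works.
CVRG_LOCATION="${CVRG_LOCATION:-$HOME/cvrg}"

# One-time checkout (uncomment to run; needs the svn client and repository access):
# svn checkout https://project.bmi.ohio-state.edu/svn/cvrg/trunk "$CVRG_LOCATION"

# Hypothetical helper: report which key ProteinDB directories exist under a checkout.
# Returns 0 when all are present, 1 otherwise.
check_proteindb_paths() {
    root="$1"
    missing=0
    for rel in \
        dev/applications/JHUProteinDB/cacoresdk \
        dev/deployment/ddl_scripts/ProteinDBdb \
        dev/models/jhu_arking_ProteinDBdb/ea \
        dev/services/JHUProteinDBDataService \
        dev/integrationtest/databases/JHUProteinDB
    do
        if [ -e "$root/$rel" ]; then
            echo "found:   $rel"
        else
            echo "missing: $rel"
            missing=1
        fi
    done
    return "$missing"
}
```

After a checkout, check_proteindb_paths "$CVRG_LOCATION" lists each directory as found or missing.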


Installing ProteinDB 

The software listed in Step 1 must be installed on the deployment server before ProteinDB can run.

Step 1: Install Prerequisite Software

  1. J2SE 5.0
  2. Tomcat 5.5 for ProteinDB caCORE
  3. Tomcat 5.5 for ProteinDB caGrid
  4. Globus (Ws-core-4.0.3.zip)
  5. Ant 1.7.0


Step 2: Setup environment variables

  1. Create a GLOBUS_LOCATION environment variable and point it at the directory in which you installed Globus (ws-enum-4.0.3).
  2. Create a CATALINA_HOME environment variable and point it at the install directory of one of the Tomcat containers.
    1. Change this variable to match whichever Tomcat container you are working with (starting, stopping, or deploying to).
  3. Create a JAVA_HOME environment variable and point it at the directory in which you installed Java.
  4. Create an ANT_HOME environment variable and point it at the directory in which you installed Ant.
  5. Add the following values to your PATH variable:
    1. $CATALINA_HOME/bin
    2. $ANT_HOME/bin
    3. $JAVA_HOME/bin
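
The variables above could be set in a shell profile along these lines. This is a sketch only: every install path shown is a hypothetical placeholder and must be replaced with the real location on your server.

```shell
# Hypothetical install locations; replace each with the real path on your server.
export JAVA_HOME="/usr/java/jdk1.5.0"
export ANT_HOME="/opt/apache-ant-1.7.0"
export GLOBUS_LOCATION="/opt/ws-core-4.0.3"
# Points at whichever Tomcat container is being started, stopped, or deployed to.
export CATALINA_HOME="/opt/tomcat-proteindb-cagrid"

# Prepend the three bin directories so their tools are found first.
export PATH="$CATALINA_HOME/bin:$ANT_HOME/bin:$JAVA_HOME/bin:$PATH"
```

Because PATH captures the value of CATALINA_HOME at the moment it is set, switching to the other Tomcat container later means updating both CATALINA_HOME and PATH, or calling that container's scripts by full path.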


Step 3: ProteinDB SDK and Database

The ProteinDB caCORE application is located in the SVN repository as a single compressed file ($CVRG_LOCATION/dev/applications/jhuProteinDBdb/cacoresdk/cacoresdk-3.2.1-mysql.tar). Within this file there is a Web Archive (war) file, /output/package/webapp/jhuProteinDBdb.war, which is the JHUProteinDB caCORE application.

  1. Install MySQL and create the jhuProteinDB database.
    1. The section “Building MySQL 5.x on CentOS 5 VM (JHUProteinDB Installation)” will guide you through the steps for building and installing MySQL in addition to creating the jhuProteinDB database.
    2. NOTE: Populating the jhuProteinDB database with sample data is optional, and is primarily used for system testing and development.
  2. Install a clean Tomcat container.
    1. Modify $TOMCAT_ProteinDB_SDK_HOME/conf/server.xml to have the following:
      • HTTP connector with port 9081.
      • Shutdown port 9018.
    2. Copy /output/package/webapp/jhuProteinDBdb.war to $TOMCAT_ProteinDB_SDK_HOME/webapps/ and start Tomcat.


  • The ProteinDB caCORE 3.2.1 application can be loosely tested by opening http://localhost:9081/jhuProteinDBdb/. You should see the standard caCORE browser user interface to query for the Assay, Marker, RawData, SampleGeno, SubjectGeno, and SubjectInfo objects.
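
The Tomcat configuration and smoke test in this step can be sketched in shell. The helper name check_sdk_ports is hypothetical, and the check is a plain text search for port="9081" and port="9018" in server.xml, not a real XML parse, so treat a passing result as a hint rather than proof.

```shell
# Hypothetical helper: confirm server.xml mentions the expected connector and shutdown ports.
# This is a crude text match and does not verify which element each port belongs to.
check_sdk_ports() {
    server_xml="$1"
    if ! grep -q 'port="9081"' "$server_xml"; then
        echo "HTTP connector port 9081 not found in $server_xml"
        return 1
    fi
    if ! grep -q 'port="9018"' "$server_xml"; then
        echo "shutdown port 9018 not found in $server_xml"
        return 1
    fi
    echo "ports OK"
}

# Typical sequence (commented out; assumes TOMCAT_ProteinDB_SDK_HOME is set and the
# tar file has been unpacked in the current directory):
# check_sdk_ports "$TOMCAT_ProteinDB_SDK_HOME/conf/server.xml"
# cp output/package/webapp/jhuProteinDBdb.war "$TOMCAT_ProteinDB_SDK_HOME/webapps/"
# "$TOMCAT_ProteinDB_SDK_HOME/bin/startup.sh"
# curl -sf http://localhost:9081/jhuProteinDBdb/ >/dev/null && echo "caCORE UI reachable"
```

The curl line only confirms that the URL answers without an HTTP error; opening the page in a browser is still needed to see the query interface.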



Step 4: ProteinDB caGrid 1.2 Data Service

Because ProteinDB is deployed to a Tomcat container in the production system, workstations typically use Tomcat as the service container as well.

When the ProteinDB Data Service is generated, an Ant task (deployTomcat) is also created within its build.xml file. With $TOMCAT_ProteinDB_HOME being the install path of the Tomcat container used by ProteinDB caGrid 1.2, make sure that the environment variable $CATALINA_HOME equals $TOMCAT_ProteinDB_HOME.

  • Running ant deployTomcat from $CVRG_LOCATION/dev/services/JHUProteinDBDataService deploys the data service to Tomcat.

After the ProteinDB Data Service is deployed to $TOMCAT_ProteinDB_HOME, restart that Tomcat container.
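
The deployment and restart described in this step could be scripted roughly as follows. The helper names run_or_print and deploy_proteindb_to_tomcat and the DRY_RUN switch are hypothetical conveniences of this sketch; only the ant deployTomcat target and the environment variable names come from this page. The restart uses Tomcat's standard shutdown.sh and startup.sh scripts.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run_or_print() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: deploy the data service to the caGrid Tomcat, then restart it.
deploy_proteindb_to_tomcat() {
    if [ -z "${CVRG_LOCATION:-}" ] || [ -z "${TOMCAT_ProteinDB_HOME:-}" ]; then
        echo "CVRG_LOCATION and TOMCAT_ProteinDB_HOME must both be set" >&2
        return 1
    fi
    # The deployTomcat target deploys to whatever CATALINA_HOME points at.
    export CATALINA_HOME="$TOMCAT_ProteinDB_HOME"
    # Run ant in a subshell so the caller's working directory is unchanged.
    ( cd "$CVRG_LOCATION/dev/services/JHUProteinDBDataService" && run_or_print ant deployTomcat ) || return 1
    # Restart the container; shutdown.sh may report an error if Tomcat was not running.
    run_or_print "$CATALINA_HOME/bin/shutdown.sh"
    run_or_print "$CATALINA_HOME/bin/startup.sh"
}
```

Running DRY_RUN=1 deploy_proteindb_to_tomcat prints the commands without executing them, which is a cheap way to confirm that the paths resolve as expected. Tomcat may need a few seconds to stop before startup.sh succeeds.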


Step 5: Testing ProteinDB Data Service


<TODO: Complete Documentation>


Building ProteinDB 

Creating the ProteinDB caGrid 1.2 Data Service using Introduce 1.2

The steps below create the ProteinDB data service in the caCORE SDK 3.2.1 style (Remote API) using caGrid 1.2 and Introduce 1.2.


  • In Introduce, click the button labeled “Create caGrid Service Skeleton”
    • Select a directory for the service
      • Create a new directory under the $CVRG_LOCATION/dev/services directory with the name “ProteinDBDataService”
    • Type a name, “ProteinDBDataService”, for the service
      • By convention, the name should start with an uppercase letter and should match the directory name
    • Make sure the service is in a package “org.cvrgrid.ProteinDB.arking”
    • In the “Standard” tab, select “Data Service” radio button
    • Keep default choices under “Advanced”
    • Click “Create”
      • When the “Data Service Configuration” dialog appears, select “caCORE SDK v3.2.1” from the drop-down list, and click “OK”.
  • Wizard panels
    • First, select the directory that contains your client libraries. The guidelines for choosing the directory are: 1) if the service is remote, choose the following directory created by caCORE SDK: output/ProteinDBDataService/package/client/lib, or 2) if the service is local (and you want to access it locally), choose the “thick-client” lib: output/ProteinDBDataService/package/thick-client/lib. Then select the matching radio button, “Remote API” or “Local API”. No additional dependencies are needed unless your project has its own.
    • Next, enter a remote service URL of the following form: http://{web_server_name}:9081/jhuProteinDBdb/http/remoteService.
    • In step 4, select “Domain Model From File” and browse to the location of the XMI file for your caCORE SDK project. Select the “fixed” XMI that was produced by the caCORE SDK. Note: be sure to click the file type drop-down box and select XMI (and NOT XML!). Click OK.
    • Then fill in the values in the pop-up dialog box that follows. The project short name and project version are useful for versioning the domain model. I recommend the project version match the caGrid release (1.2). Do NOT check Fix EA model, since you selected the fixed model. Click OK.
    • Click Next.
    • In step 5, packages from the model will be listed. For each package, click the “Resolve” button and select from file. The guidelines for choosing the directory are: 1) if the service is remote, you want to choose the following directory created by caCORE SDK: output/<project name>/package/client/conf, or 2) if the service is local (and you want to access it locally), choose the “thick-client” conf: output/<project name>/package/thick-client/conf. Choose the xml schema in the directory. Click “Load Schemas”.
  • Optional service property change
    • In the Introduce GUI, you can modify a service property called “dataService_validateCqlFlag” to have the value “true”. If you make this change, the service will validate incoming CQL before executing the query, eliminating one source of errors. Note: as of Introduce 1.1, you actually need to edit the service.properties to make this change before deploying.
  • Deploy the service
    • If you need to modify the caCORE system backend location, change the location in the service’s service.properties file and re-deploy.
    • Deploy the service with “ant deployGlobus” from the service’s top-level directory. Be sure that the GLOBUS_LOCATION environment variable is set properly before deploying. During development, use GLOBUS_LOCATION as the deployment destination. For CVRG deployment, use CATALINA_HOME (Tomcat) instead.
    • Be sure the index service URL is cagrid05 so that others can locate your service.
    • Finally, remoteService.xml (in the output/<project name>/conf directory) needs to be on the classpath for data service clients at the moment (SDK 3.2 makes remote calls and needs that file to do so).
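
The optional property change and the development deployment described above can be sketched in shell. Both helper names are hypothetical. The sketch also assumes that service.properties uses plain key=value lines and that the key is written exactly as dataService_validateCqlFlag; check the generated file, since the actual key may differ.

```shell
# Hypothetical helper: set dataService_validateCqlFlag=true in a service.properties file.
# Assumes plain key=value lines; appends the key if it is not already present.
enable_cql_validation() {
    props="$1"
    if grep -q '^dataService_validateCqlFlag=' "$props"; then
        sed 's/^dataService_validateCqlFlag=.*/dataService_validateCqlFlag=true/' "$props" > "$props.tmp" \
            && mv "$props.tmp" "$props"
    else
        echo 'dataService_validateCqlFlag=true' >> "$props"
    fi
}

# Hypothetical helper: run "ant deployGlobus" only when GLOBUS_LOCATION is a directory.
# $1 = the service's top-level directory.
deploy_to_globus() {
    service_dir="$1"
    if [ ! -d "${GLOBUS_LOCATION:-}" ]; then
        echo "GLOBUS_LOCATION is not set to an existing directory" >&2
        return 1
    fi
    ( cd "$service_dir" && ant deployGlobus )
}
```

A typical development sequence would call enable_cql_validation on the service's service.properties and then deploy_to_globus on the service directory.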

