DataServices:ProteinDB
System Requirements
The following technologies are used by the ProteinDB caGrid 1.2 Data Service.
| Software Type | Vendor/Name | Minimum Required Version |
|---|---|---|
| IDE | Eclipse | Europa |
| Development Language | Java | 5.0 |
| UML Modeling | Enterprise Architect | 7.0 |
| Build Mechanism | Ant | 1.7.0 |
| Dependency Management | Ivy | 2.0.0-beta2 |
| Application Server | Tomcat | 5.5/6.0 |
| Virtualization (optional) | VMware | Workstation 6 |
| Database | PostgreSQL | 8.1.11 |
| Grid Framework | Globus | ws-enum-4.0.3 |
| Grid Middleware | caGrid | 1.2 |
| Grid Service Generator | Introduce | 1.2 |
| caCORE SDK | caCORE | 4.0 |
The following ports are used by the ProteinDB caGrid 1.2 Data Service.
| Software Component | Port | Note |
|---|---|---|
| Data Service Port | 9449 | This port needs to be open to the internet. |
| Data Service Shutdown Port | 9009 | This port should be closed. |
| caCORE SDK Port | 9081 | This port needs to be open only to the server on which the ProteinDB Data Service is running. |
| caCORE SDK Shutdown Port | 9018 | This port should be closed. |
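
The port table above can be checked from another machine with a short script. The following is a minimal sketch, not part of the ProteinDB distribution: the dictionary labels and the function names `port_is_open` and `report` are made up for illustration, and the check only tests whether a TCP connection succeeds, not whether the right service is answering.

```python
import socket

# Ports from the table above; the labels are just dictionary keys for this sketch.
PROTEINDB_PORTS = {
    "data_service": 9449,
    "data_service_shutdown": 9009,
    "cacore_sdk": 9081,
    "cacore_sdk_shutdown": 9018,
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def report(host):
    """Map each ProteinDB port label to whether it accepts connections on host."""
    return {name: port_is_open(host, port) for name, port in PROTEINDB_PORTS.items()}
```

Run from outside the deployment network, `report("your.server.name")` should ideally show only `data_service` as reachable; the two shutdown ports and the caCORE SDK port should report `False`.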
Source Repository
With $CVRG_LOCATION being the SVN checkout from https://project.bmi.ohio-state.edu/svn/cvrg/trunk, the following are key locations for ProteinDB:
- ProteinDB caCORE Application
- $CVRG_LOCATION/dev/applications/JHUProteinDB/cacoresdk/cacoresdk-3.2.1-mysql.tar
- $CVRG_LOCATION/dev/applications/JHUProteinDB/cacoresdk/Tomcat_JHUProteinDBcaCORE3.2.1
- ProteinDB DDL Scripts
- $CVRG_LOCATION/dev/deployment/ddl_scripts/ProteinDBdb/ProteinDBdb.sql
- $CVRG_LOCATION/dev/deployment/ddl_scripts/ProteinDBdb/ProteinDBdb_without_contraints.sql
- ProteinDB Data Model
- $CVRG_LOCATION/dev/models/jhu_arking_ProteinDBdb/ea/ProteinDB.EAP
- ProteinDB Data Service
- $CVRG_LOCATION/dev/services/JHUProteinDBDataService
- Test Data for Systems Tests
- $CVRG_LOCATION/dev/integrationtest/databases/JHUProteinDB
Installing ProteinDB
The following software must be installed on the deployment server for ProteinDB to run.
Step 1: Install Prerequisite Software
- J2SE 5.0
- Tomcat 5.5 for ProteinDB caCORE
- Tomcat 5.5 for ProteinDB caGrid
- Globus ws-core 4.0.3 (Ws-core-4.0.3.zip)
- Ant 1.7.0
Step 2: Setup environment variables
- Create a GLOBUS_LOCATION environment variable and point it at the directory in which you installed Globus (ws-enum-4.0.3).
- Create a CATALINA_HOME environment variable and point it at the install directory of one of the Tomcat containers.
- This variable will be changed depending on which Tomcat container we are working with (starting, stopping, or deploying to).
- Create a JAVA_HOME environment variable and point it at the directory in which you installed Java.
- Create an ANT_HOME environment variable and point it at the directory in which you installed Ant.
- Add the following values to your PATH variable:
- $CATALINA_HOME/bin
- $ANT_HOME/bin
- $JAVA_HOME/bin
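
Before moving on, it can help to confirm that the variables from Step 2 are actually visible to new processes. The sketch below is one way to do that, assuming a helper named `check_environment` (a hypothetical name, not something shipped with ProteinDB). It only checks that the variables are set and that the matching `bin` directories appear on PATH; it does not verify that the directories exist or contain the right versions.

```python
import os

REQUIRED_VARS = ("GLOBUS_LOCATION", "CATALINA_HOME", "JAVA_HOME", "ANT_HOME")
# Each of these should contribute <dir>/bin to PATH, per Step 2.
PATH_HOMES = ("CATALINA_HOME", "ANT_HOME", "JAVA_HOME")


def check_environment(env):
    """Return a list of human-readable problems found in the mapping env."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    path_entries = [
        os.path.normpath(entry)
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry
    ]
    for name in PATH_HOMES:
        home = env.get(name)
        if home and os.path.normpath(os.path.join(home, "bin")) not in path_entries:
            problems.append(f"{name}/bin is not on PATH")
    return problems
```

Calling `check_environment(os.environ)` returns an empty list when everything in Step 2 is in place. Remember that CATALINA_HOME changes depending on which Tomcat container is being worked with, so the check is worth repeating after switching.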
Step 3: ProteinDB SDK and Database
The ProteinDB caCORE application is located in the SVN repository as a single compressed file ($CVRG_LOCATION/dev/applications/jhuProteinDBdb/cacoresdk/cacoresdk-3.2.1-mysql.tar). Within this file there is a Web Archive (war) file, /output/package/webapp/jhuProteinDBdb.war, which is the JHUProteinDB caCORE application.
- Install MySQL and create the jhuProteinDB database.
- The section “Building MySQL 5.x on CentOS 5 VM (JHUProteinDB Installation)” will guide you through the steps for building and installing MySQL in addition to creating the jhuProteinDB database.
- NOTE: Populating the jhuProteinDB database with sample data is optional, and is primarily used for system testing and development.
- Install a clean Tomcat container.
- Modify $TOMCAT_ProteinDB_SDK_HOME/conf/server.xml to have the following:
- http connector with port 9081.
- Shutdown port as 9018.
- Copy /output/package/webapp/jhuProteinDBdb.war to $TOMCAT_ProteinDB_SDK_HOME/webapps/ and start Tomcat.
- The ProteinDB caCORE 3.2.1 application can be loosely tested by opening http://localhost:9081/jhuProteinDBdb/. You should see the standard caCORE browser user interface to query for the Assay, Marker, RawData, SampleGeno, SubjectGeno, and SubjectInfo objects.
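
The server.xml change in Step 3 is normally made by hand, but it can also be scripted. The sketch below is an illustration only: `set_tomcat_ports` is a hypothetical helper, and it assumes the stock Tomcat layout in which the shutdown port is the `port` attribute of the root `<Server>` element and the HTTP connector is the first non-secure `<Connector>` whose `protocol` attribute is either absent or starts with "HTTP". Note that Python's ElementTree discards XML comments when parsing, so back up the original file first.

```python
import xml.etree.ElementTree as ET


def set_tomcat_ports(server_xml_text, http_port, shutdown_port):
    """Return server.xml text with the shutdown and HTTP connector ports replaced."""
    root = ET.fromstring(server_xml_text)
    if root.tag != "Server":
        raise ValueError("expected a <Server> root element")
    # <Server port="..."> is the port Tomcat listens on for the shutdown command.
    root.set("port", str(shutdown_port))

    for connector in root.iter("Connector"):
        protocol = connector.get("protocol", "HTTP/1.1")
        is_http = protocol.upper().startswith("HTTP")
        is_secure = connector.get("secure", "false").lower() == "true"
        if is_http and not is_secure:
            connector.set("port", str(http_port))
            break
    else:
        # The loop finished without finding a plain HTTP connector.
        raise ValueError("no plain HTTP <Connector> found")
    return ET.tostring(root, encoding="unicode")
```

For the caCORE container described above, the call would be `set_tomcat_ports(text, 9081, 9018)`, where `text` is the content of $TOMCAT_ProteinDB_SDK_HOME/conf/server.xml.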
Step 4: ProteinDB caGrid 1.2 Data Service
Since ProteinDB is deployed to a Tomcat container in the production system, workstation systems typically use Tomcat as their service container as well.
When the ProteinDB Data Service is generated, an Ant task (deployTomcat) is also created within its build.xml file. With $TOMCAT_ProteinDB_HOME being the install path of the Tomcat container used by ProteinDB caGrid 1.2, make sure that the environment variable $CATALINA_HOME equals $TOMCAT_ProteinDB_HOME.
- Running ant deployTomcat from $CVRG_LOCATION/dev/services/JHUProteinDBDataService will deploy the data service to Tomcat.
After the ProteinDB Data Service is deployed to $TOMCAT_ProteinDB_HOME, that Tomcat container should be restarted.
- To verify that the Tomcat container and the ProteinDB data service were at least successfully initialized, open https://localhost:9449/wsrf/services/cagrid/JHUProteinDBDataService in a web browser. You should see a web page stating that the AXIS web service JHUProteinDBDataService exists.
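
Because CATALINA_HOME has to point at the caGrid Tomcat container for this one command only, it can be convenient to set it per invocation instead of changing the shell environment. The following is a rough sketch under that assumption; `deploy_command` and `deploy` are hypothetical helper names, and the script assumes `ant` is on PATH on a Unix-like system.

```python
import os
import subprocess


def deploy_command(cvrg_location, tomcat_home, base_env=None):
    """Build the (argv, cwd, env) needed to run `ant deployTomcat` for the data service."""
    env = dict(os.environ if base_env is None else base_env)
    # Per the text above, CATALINA_HOME must equal the caGrid Tomcat install path.
    env["CATALINA_HOME"] = tomcat_home
    cwd = os.path.join(cvrg_location, "dev", "services", "JHUProteinDBDataService")
    return ["ant", "deployTomcat"], cwd, env


def deploy(cvrg_location, tomcat_home):
    """Run the deployment and raise CalledProcessError if Ant reports a failure."""
    argv, cwd, env = deploy_command(cvrg_location, tomcat_home)
    subprocess.run(argv, cwd=cwd, env=env, check=True)
```

Keeping the command construction separate from the call to `subprocess.run` makes it possible to print and inspect what would be run before actually deploying. The Tomcat restart still has to be done afterwards.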
Step 5: Testing ProteinDB Data Service
<TODO: Complete Documentation>
Building ProteinDB
Creating the ProteinDB caGrid 1.2 Data Service using Introduce 1.2
Creating the ProteinDB data service with caCORE SDK 3.2.1 style (Remote API) with caGrid 1.2 and Introduce 1.2
- In Introduce, click the button labeled “Create caGrid Service Skeleton”
- Select a directory for the service
- Create a new directory under the $CVRG_LOCATION/dev/services directory with the name “ProteinDBDataService”
- Type a name, “ProteinDBDataService”, for the service
- By convention, the name should start with an uppercase letter and should match the directory name
- Make sure the service is in a package “org.cvrgrid.ProteinDB.arking”
- In the “Standard” tab, select “Data Service” radio button
- Keep default choices under “Advanced”
- Click “Create”
- When the “Data Service Configuration” dialog appears, select “caCORE SDK v3.2.1” from the drop-down list, and click “OK”.
- Wizard panels
- Here you select the directory that contains your client libraries. The guidelines for choosing the directory are: 1) if the service is remote, choose the following directory created by caCORE SDK: output/ProteinDBDataService/package/client/lib, or 2) if the service is local (and you want to access it locally), choose the “thick-client” lib: output/ProteinDBDataService/package/thick-client/lib. Select the matching radio button, “Remote API” or “Local API”. There are no additional dependencies (unless you have others).
- Next, enter a remote service URL of the following form: http://{web_server_name}:9081/jhuProteinDBdb/http/remoteService.
- In step 4, select “Domain Model From File” and browse to the location of the XMI file for your caCORE SDK project. Select the “fixed” XMI that was produced by the caCORE SDK. Note: be sure to click the file type drop-down box and select XMI (and NOT XML!). Click OK.
- Then fill in the values in the pop-up dialog box that follows. The project short name and project version are useful for versioning the domain model. I recommend the project version match the caGrid release (1.2). Do NOT check Fix EA model, since you selected the fixed model. Click OK.
- Click Next.
- In step 5, packages from the model will be listed. For each package, click the “Resolve” button and select from file. The guidelines for choosing the directory are: 1) if the service is remote, you want to choose the following directory created by caCORE SDK: output/<project name>/package/client/conf, or 2) if the service is local (and you want to access it locally), choose the “thick-client” conf: output/<project name>/package/thick-client/conf. Choose the xml schema in the directory. Click “Load Schemas”.
- Optional service property change
- In the Introduce GUI, you can modify a service property called “dataService_validateCqlFlag” to have the value “true”. If you make this change, the service will validate incoming CQL before executing the query, eliminating one source of errors. Note: as of Introduce 1.1, you actually need to edit the service.properties to make this change before deploying.
- Deploy the service
- If you need to modify the caCORE system backend location, simply modify the location in the service’s service.properties file and re-deploy
- Deploy the service with “ant deployGlobus” from the service’s top-level directory. Be sure that the GLOBUS_LOCATION environment variable is set properly before deploying. During development, use GLOBUS_LOCATION as the deployment destination. During CVRG deployment, use CATALINA_HOME (the Tomcat container) instead.
- Be sure the index service URL is cagrid05 so that others can locate your service.
- Finally, remoteService.xml (in the output/<project name>/conf directory) needs to be on the classpath for data service clients at the moment (SDK 3.2 makes remote calls and needs that file to do so).
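
Several of the steps above come down to changing one entry in the service’s service.properties file before deploying, for example setting “dataService_validateCqlFlag” to “true”. A minimal sketch of that edit follows. The helper `set_property` is hypothetical, and it assumes the file uses plain `key=value` lines; the Java properties format also allows `key: value` and line continuations, which this sketch does not handle.

```python
def set_property(properties_text, key, value):
    """Return properties text with key set to value, appending the key if absent."""
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(("#", "!")) or "=" not in stripped:
            continue  # leave comments, blank lines, and unusual lines untouched
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

For example, `set_property(text, "dataService_validateCqlFlag", "true")` returns the file content with CQL validation switched on; write the result back to service.properties and re-deploy for the change to take effect.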
