CVRG Data Services Overview
The current CVRG data services were developed to meet the needs of the initial Driving Biomedical Project, the D. W. Reynolds study of Sudden Cardiac Death in the setting of coronary artery disease (see http://www.reynolds.jhmi.edu/index.html). Each data service is described below, with links to its installation instructions.
SNP Data Service
The SNP Data Service provides grid-enabled access to SNP data. The service uses a MySQL version 5.0.22 database and a data model developed in caCORE 3.2.1. The data model is based upon information captured by the JHU Institute for Genetic Medicine, on behalf of the Reynolds investigators. The data model associates subject information with biomarkers derived from primary SNP calls stored in SNP arrays.
Installation and Configuration Instructions for SNP Data Services
Protein DB Data Service
Protein DB Version 2.0 is a full-featured proteomics data service based on the PSI-OM. Protein DB has been harmonized with the caBIG caDSR and can be viewed in the NCI UML Model Browser. A data model is available on the JHU Proteomics Center site. Source code and system configuration information are available on the CCBM Protein DB page.
DICOM Image Data Service
The DICOM image data service has been instantiated at JHU. It leverages an Open-Source Clinical Image Management system (DCM4CHE). Access to the system is currently available through the use of virtualPACS.
CVRG DICOM Data Service Installation Instructions
VirtualPACS Installation Instructions
DCM4CHEE - Clinical Image and Object Management
dcm4che is a collection of open source applications and utilities for the healthcare enterprise. These applications have been developed in the Java programming language for performance and portability, supporting deployment on JDK 1.5 and up.
At the core of the dcm4che project is a robust implementation of the DICOM standard. The dcm4che-1.x DICOM toolkit is used in many production applications across the world, while the current (2.x) version of the toolkit has been re-architected for high performance and flexibility.
Also contained within the dcm4che project is dcm4chee (the extra 'e' stands for 'enterprise'). dcm4chee is an Image Manager/Image Archive (according to IHE). The application contains the DICOM and HL7 services and interfaces required to provide storage, retrieval, and workflow to a healthcare environment. dcm4chee is pre-packaged and deployed within the JBoss application server. By taking advantage of many JBoss features (JMS, EJB, Servlet Engine, etc.) and assuming the role of several IHE actors for the sake of interoperability, the application provides many robust and scalable services.
For general information, go to http://www.dcm4che.org.
CVRG DCM4CHEE Installation Instructions
ECG HL7aECG Data Service
In 2004, the Food & Drug Administration announced its intent to accept annotated ECG waveform data in XML following the Health Level 7 (HL7) Annotated ECG Waveform Data Standard (aECG). The HL7aECG data service leverages the XML Data Service, developed jointly by the caBIG In Vivo Imaging workspace and the CVRG. The HL7aECG files are stored in an open-source embeddable Oracle Berkeley DB XML database.
ECG HL7aECG Data Service Installation Instructions
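Because aECG data is plain XML, a stored document can be inspected with any XML parser. The following is a minimal sketch, not the service's own code: it assumes the standard HL7 v3 namespace (urn:hl7-org:v3) and an AnnotatedECG root element, and it parses a heavily trimmed, illustrative fragment rather than a schema-valid file. The function name summarize_aecg and the sample identifier are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace shared by HL7 v3 documents, including aECG.
HL7_NS = {"hl7": "urn:hl7-org:v3"}

# Illustrative fragment only: a real aECG file also carries subject,
# series, and waveform sections, which are omitted here.
SAMPLE_AECG = """<AnnotatedECG xmlns="urn:hl7-org:v3">
  <id root="example-root-id"/>
  <effectiveTime>
    <low value="20040101120000"/>
    <high value="20040101120010"/>
  </effectiveTime>
</AnnotatedECG>"""


def _attr(root, path, name):
    """Return attribute `name` of the element at `path`, or None if absent."""
    element = root.find(path, HL7_NS)
    return element.get(name) if element is not None else None


def summarize_aecg(xml_text):
    """Extract the document id and recording time window (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        "id": _attr(root, "hl7:id", "root"),
        "start": _attr(root, "hl7:effectiveTime/hl7:low", "value"),
        "end": _attr(root, "hl7:effectiveTime/hl7:high", "value"),
    }


if __name__ == "__main__":
    print(summarize_aecg(SAMPLE_AECG))
```

Missing elements are reported as None rather than raising, which is a convenient default when scanning files of uneven completeness.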
ECG Physionet Data Service
PhysioNet was established in 1999 as the outreach component of the Research Resource for Complex Physiologic Signals. Its PhysioBank is a large and growing archive of well-characterized digital recordings of physiologic signals and related data for use by the biomedical research community. The PhysioNet data service leverages the XML Data Service, developed jointly by the caBIG In Vivo Imaging workspace and the CVRG. Through the XML Data Service, metadata about the WFDB files are stored in an open-source embeddable Oracle Berkeley DB XML database. The WFDB files themselves are stored in a referenced location on the CVRG file system.
WFDB Data Service Installation Instructions
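The split described above, with metadata in the XML database and the WFDB files in a referenced location on the file system, can be illustrated with a small sketch. It is not the CVRG schema: the element names (wfdbRecord, samplingFrequency, location), the function names, and the storage root are all hypothetical, chosen only to show a metadata document that points at a file rather than containing it.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath


def build_metadata(record_name, relative_dir, sampling_hz):
    """Build a metadata document that references a WFDB header file.

    Element names are hypothetical; only the reference pattern matters.
    """
    record = ET.Element("wfdbRecord", name=record_name)
    ET.SubElement(record, "samplingFrequency").text = str(sampling_hz)
    # Store a path relative to the storage root, not the signal data itself.
    location = PurePosixPath(relative_dir) / f"{record_name}.hea"
    ET.SubElement(record, "location").text = str(location)
    return ET.tostring(record, encoding="unicode")


def resolve_header_path(metadata_xml, storage_root):
    """Turn the stored reference into a full path under `storage_root`."""
    record = ET.fromstring(metadata_xml)
    return PurePosixPath(storage_root) / record.findtext("location")


if __name__ == "__main__":
    metadata = build_metadata("rec100", "physionet/demo", 250)
    print(metadata)
    print(resolve_header_path(metadata, "/data/cvrg"))
```

Keeping only a relative reference in the metadata means the files can be moved to a different storage root without rewriting every stored document.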
OpenClinica Data Service
The OpenClinica Data Service provides grid-enabled, secure access to clinical information. The service consists of three main components:
- A PostgreSQL version 8.1.11 database as the backend system. This database is built using the OpenClinica relational database schema.
- A caCORE SDK 4.0 generated server application. The object model for this application has been developed in UML using the Enterprise Architect system. Version 1.1 of the object model allows queries against Study and Subject information.
- A caGrid-compatible data service. This service allows grid-enabled, secure access to data serviced through the object model.
The new version (ver 1.2) of the OpenClinica Data Service supports querying Case Report Form information as well as Study and Subject information. Detailed instructions for building and deploying the OpenClinica Data Service with the new object model are linked below.
How to Build and Deploy OpenClinica Data Service.
caArray Data Service
caArray is an open-source mRNA array data management system, accessible both through the web and programmatically, developed as part of the cancer Biomedical Informatics Grid (caBIG). caArray supports the annotation and exchange of mRNA array data as part of a grid. caArray is not currently deployed on the CVRG. A cross-registration of caBIG and CVRG services is underway, in which the CVRG will leverage caBIG's caArray and caBIG will leverage CVRG's ProteinDB and ECG services.
CVRG caBIG Collaboration Page on caBIG Knowledge Center Website
Berger Algorithm Parameter Data Service Information
The Berger Algorithm Parameter Data Service is a persistent storage mechanism for the parameter files used by the Berger Algorithm. The Berger Algorithm requires a parameter file for each ECG file that it processes. The Berger Algorithm was developed by Dr. Ron Berger at Johns Hopkins University. See Berger, R.D., Kasper, E.K., Baughman, K.L., Marban, E., Calkins, H., Tomaselli, G.F. (1997). Beat-to-beat QT interval variability: novel evidence for repolarization lability in ischemic and nonischemic dilated cardiomyopathy. Circulation. 96(5):1557-1565 and Berger, R.D. (2003). QT Variability. J. Electrocardiol. 36: 83-87.
Berger Algorithm Parameter Data Service Installation Instructions
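Since the algorithm needs one parameter file per ECG file, it can be useful to check that pairing before a batch run. The sketch below is an assumption-laden illustration: it supposes that an ECG file and its parameter file share the same base name, which is a hypothetical convention and not one documented for the service. The function name and the file extensions in the example are likewise made up.

```python
from pathlib import PurePosixPath


def missing_parameter_files(ecg_files, parameter_files):
    """Return the ECG files that have no parameter file with the same base name.

    Assumes (hypothetically) that files are paired by base name,
    e.g. "subject01.xml" with "subject01.par".
    """
    parameter_stems = {PurePosixPath(name).stem for name in parameter_files}
    return [
        name for name in ecg_files
        if PurePosixPath(name).stem not in parameter_stems
    ]


if __name__ == "__main__":
    ecgs = ["subject01.xml", "subject02.xml"]
    params = ["subject01.par"]
    print(missing_parameter_files(ecgs, params))
```

The result preserves the order of the input list, so it can be reported directly to whoever prepared the batch.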
Berger Algorithm Results Data Service Information
The Berger Algorithm Results Data Service is a persistent storage mechanism for the output files produced by the Berger Algorithm.
Berger Algorithm Results Data Service Installation Instructions
Physionet QT Algorithm Results Data Service Information
The Physionet QT Algorithm Results Data Service is a persistent storage mechanism for the output files produced by the Physionet QT Algorithm.
Physionet QT Algorithm Results Data Service Installation Instructions
General Purpose XML Data Service
A general purpose XML data service has been developed as part of the CVRG Project. Further development of this service is now a caGrid Community Project.
