6.2 Biomedical applications
The biomedical working group is a broad entity that brings together several user communities. Currently, applications from the bioinformatics and the medical imaging communities have been ported to the LCG2 infrastructure. At the beginning of the project, three pilot applications (CDSS, GPS@ and GATE, described below) were ready to run on the LCG2 infrastructure. Three additional applications have been proposed by the project partners since then (SiMRI3D, gPTM3D, and xmipp_MLrefine, introduced below).
A discussion has also started with the Mammogrid European FP5 project to port its mammogram analysis application to the EGEE infrastructure. Given the investment already made by the Mammogrid team in the AliEN middleware infrastructure, Mammogrid is waiting for the gLite middleware (partly AliEN based) to become stable before pushing its EGEE integration further.
Other external biomedical projects have expressed an interest in using the EGEE infrastructure, and discussions about their involvement in the project are on-going. The cost of middleware migration for projects that have already invested in a different middleware often proves to be prohibitive.
The status of biomedical applications is periodically updated on the following web page:
http://egee-na4.ct.infn.it/biomed/applications.html.
6.2.1 Pilot biomedical applications
6.2.1.1 Clinical Decision Support System (CDSS)
The Clinical Decision Support System is an application that extracts medically relevant knowledge from a large set of information with the objective of guiding the practitioners in their clinical practice. The CDSS does not substitute for human medical decisions, but it improves factors such as sensitivity, sensibility and working conditions.
Contact | Ignacio Blanquer, iblanque@dsic.upv.es
Description | The Clinical Decision Support System is a tool that assists diagnosis by providing a classification based on expert systems trained with a database validated by users. It covers three application areas (anaemia, brain tumours, and genetic profile of schizophrenia classification) and offers up to six different classification engines.
Deployment and status | Originally developed under a service-based approach, this application has now been ported to LCG2. Classifiers are installed on different Computing Elements and discovered through MDS. Access is provided through web services from Java components, using web browser certificates to authorise access.
Present infrastructure usage | The application is deployed on the UPV LCG2 resources, which are already included in the official set of EGEE resources.
Desired infrastructure usage | Installation of different components at other LCG2 sites. There is no need for large computing or storage resources, but BDII (Berkeley Database Information Index) entries must be defined. The BDII is an LDAP server backed by a Berkeley Database; it provides information according to the GLUE schema and is populated by regular queries to the information providers at each site.
Present users | About 10 medical users from 5 organisations, about 10 runs per day (each run on the order of minutes).
Plans | Enlarge the user community when more application components become available. Integrate into a more user-friendly interface (short term). Train new classifiers for more pathologies.
Main issues related to deployment | No problem reported.
6.2.1.2 Grid Protein Sequence Analysis (GPS@)
GPS@ (http://www.gpsa.fr/) aims to be an integrated grid portal devoted to molecular bioinformatics. The current version is under development and deployed on LCG2.
GPS@ is an experiment in porting the NPSA (Network Protein Sequence Analysis) services to the EGEE grid. NPSA is a production web portal hosting protein databases and algorithms for sequence analysis. NPSA is reachable at http://npsa-pbil.ibcp.fr and has been serving the bioinformatics community since 1998. Currently, the databanks and algorithms available to users are strongly restricted because of NPSA's limited resources (one cluster of 14 CPUs). The number of users connecting to the portal and the size of the data sets they can process are therefore limited by the server. The same user community will be eager to (transparently) use the grid version of the same service once it has proven to be as stable and as efficient as the original service.
Contact | Christophe Blanchet, christophe.blanchet@ibcp.fr
Description | Grid web portal for bioinformatics
Deployment and status | GPS@ has been deployed on LCG2 for testing and validation purposes only. Some international bioinformatics databanks have been deployed on the grid for tests. Some algorithms and other data have been tested on the grid through the sandbox at each job submission. GPS@ has not yet been integrated into the back office of the NPSA portal.
Present infrastructure usage | GPS@ is linked to its own EGEE user interface and is hosted by a standard Apache HTTP server on its own computer. It relies on the rest of the EGEE infrastructure (CNAF Resource Broker and Replica Catalog, Computing Elements, etc.) for computations and data deployment. Because of its experimental and testing status, GPS@ has not yet used large resources on the EGEE infrastructure.
Future infrastructure usage | As computing resources increase and the biomed VO is deployed, longer and more I/O-intensive bioinformatics jobs will be executed. According to the NPSA statistics since 1998, thousands of jobs per day with typical durations of 1 minute to 1 hour would be processed.
Present users | GPS@ has not yet been released to the scientific community, and so has few users. The reference statistics are those of the NPSA portal, which serves hundreds of bioinformaticians daily (about 3000 jobs/day).
Plans | GPS@ will be integrated into NPSA when it shows robustness and quality of service similar to those of the current service.
Main issues related to deployment | The long latencies induced by the LCG2 middleware must be overcome to provide a compelling replacement for NPSA. On the NPSA portal, bioinformaticians do not authenticate individually to run jobs. GPS@ requires a service certificate so that the server can start bioinformatics jobs on behalf of the users, as is routinely done in this community. GPS@ would also need a Replica Location Service for the biomed VO to enable grid file registration.
6.2.1.3 Geant4 Application for Tomographic Emission (GATE)
Radiotherapy and brachy-therapy use ionizing radiation to treat cancer; before each treatment, physicians and physicists plan the treatment using analytical treatment planning systems and medical image data of the tumour. Monte Carlo simulations are today the best tool to model and plan tumour treatment with the highest accuracy for complex requirements. Although they improve the accuracy of the treatment, Monte Carlo simulations are so time consuming that hospitals and clinical centres cannot use them for current treatment planning.
By using the grid environment provided by the EGEE project, we will be able to reduce the computing time of Monte Carlo simulations in order to provide a tool with a reasonable response time for specific cancer treatment requiring Monte Carlo accuracy.
GATE (Geant4 Application for Tomographic Emission) is a C++ platform based on the Monte Carlo Geant4 software. It was designed within the OpenGATE collaboration primarily to model nuclear medicine applications such as PET and SPECT. Its functionalities, combined with its ease of use, make this platform suitable for radiotherapy and brachy-therapy treatment planning.
The simulations are parallelized on the Grid by splitting the random number sequences necessary to run Monte Carlo simulations. Each simulation is then computed using anonymous medical images of the patient's tumour, located on storage resources of the Grid.
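The idea of splitting a random number sequence across jobs can be illustrated with a minimal Python sketch. It is not the GATE or Geant4 implementation: the generator is a textbook 64-bit linear congruential generator chosen only because it allows a cheap jump-ahead, and the names `jump`, `split_sequence` and `draw` are hypothetical. Each job receives the generator state at the start of its own contiguous block, so the blocks together reproduce the sequence a single serial run would have consumed.

```python
# Illustrative 64-bit linear congruential generator (LCG).
# Assumption: this is NOT the generator used by GATE/Geant4.
M = 2**64
A = 6364136223846793005
C = 1442695040888963407


def lcg_next(state):
    """Advance the generator by one step."""
    return (A * state + C) % M


def jump(state, k):
    """Advance the generator by k steps in O(log k) time.

    One step is the affine map x -> A*x + C (mod M); k steps are that map
    composed k times, computed here by repeated squaring.
    """
    acc_a, acc_c = 1, 0   # identity map (zero steps)
    cur_a, cur_c = A, C   # map for 1 step, then 2, 4, 8, ...
    while k > 0:
        if k & 1:
            acc_a, acc_c = (cur_a * acc_a) % M, (cur_a * acc_c + cur_c) % M
        cur_a, cur_c = (cur_a * cur_a) % M, (cur_a * cur_c + cur_c) % M
        k >>= 1
    return (acc_a * state + acc_c) % M


def split_sequence(seed, total_draws, n_jobs):
    """Cut a sequence of total_draws numbers into n_jobs contiguous blocks.

    Returns one (start_state, n_draws) pair per job.
    """
    base, extra = divmod(total_draws, n_jobs)
    blocks, offset = [], 0
    for job in range(n_jobs):
        n_draws = base + (1 if job < extra else 0)
        blocks.append((jump(seed, offset), n_draws))
        offset += n_draws
    return blocks


def draw(state, n_draws):
    """Return the next n_draws raw outputs that follow the given state."""
    values = []
    for _ in range(n_draws):
        state = lcg_next(state)
        values.append(state)
    return values


if __name__ == "__main__":
    seed = 12345
    serial = draw(seed, 1000)
    blocks = split_sequence(seed, 1000, 8)
    parallel = [x for start, n in blocks for x in draw(start, n)]
    print(parallel == serial)
```

In such a scheme each grid job would only need its `(start_state, n_draws)` pair in addition to the simulation input, and the sub-sequences do not overlap as long as no job draws more numbers than its block contains.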
Contact | Lydia Maigne, maigne@clermont.in2p3.fr
Description | Monte Carlo simulation platform for brachy-therapy, radiotherapy and nuclear medicine
Deployment and status | GATE has been installed at CC-IN2P3 (46 CPUs) and LPC Clermont-Ferrand (100 CPUs). It has also been interfaced to the GENIUS grid portal.
Present infrastructure usage | Sets of 5 to 10 jobs, 4 minutes each, for testing
Desired infrastructure usage | Sets of 20 jobs, 1 to 4 hours each, for data production
Present users | Developers only
Plans | The software should be installed on the CNB (Madrid) cluster. Larger-scale tests will be performed when the application is deployed on more nodes.
Main issues related to deployment | The queue waiting time (up to 30 minutes) is too long for short jobs (on the order of minutes of execution time).
6.2.2 Internal biomedical applications
6.2.2.1 3D MRI Simulator (SiMRI3D)
SiMRI3D is a 3D Magnetic Resonance Imaging simulator. Its input is a digital virtual volume providing the magnetic properties of objects at each voxel. The application simulates the MRI physics to produce a realistic Magnetic Resonance image. Such images simulated from a known source are very useful for:
- understanding MR physics and artefacts
- exploring new MR acquisition sequences without the need for a physical device
- assessing image processing algorithms using a known result.
Contact | Fabrice Bellet, fabrice.bellet@creatis.insa-lyon.fr
Description | Parallel (MPI-based) simulator of Magnetic Resonance images
Deployment and status | The MPI-based simulator has been implemented and some performance tests have been run on a local cluster (up to 18 CPUs). Tests on CINES supercomputers are on-going.
Present infrastructure usage | The application is not running on LCG2 (no MPI-enabled resource available).
Desired infrastructure usage | About 1000 to 2000 jobs were executed last year. Each job is a computation that can consume from seconds (for testing) up to weeks (for large-scale realistic images) of CPU time.
Users | Until the application can be deployed on the grid, only the developers (5 people) are using the code. There is a very large potential user community.
Plans | There is a strong user request for an MRI simulator service. Plans are to deploy such a service through a web portal once the simulator code can be executed on LCG2.
Main issues related to deployment | No MPI-enabled resources are available on the EGEE infrastructure.
6.2.2.2 Macromolecular 3D structure analysis (xmipp_MLrefine)
xmipp_MLrefine is the first successful extension to 3D of Maximum Likelihood phasing/registration in the field of three-dimensional structure determination of macromolecular nano-machines by electron microscopy (EM). The implications for the field are far reaching, and the work is performed under a Network of Excellence in 3DEM.
Contact | Angel Merino, AJ.Merino@cnb.uam.es
Description | Macromolecular 3D structure analysis
Deployment and status | The application has recently been ported to LCG2 and tested on both the Clermont and Madrid clusters.
Present infrastructure usage | -
Desired infrastructure usage | One experiment corresponds to about 500 jobs and one week of computation on the Madrid site.
Users | Developers only
Plans | Shorten experiment time by using more resources.
Main issues related to deployment | Shortage of computing resources.
6.2.2.3 DICOM data analysis tool (grille – Poste de Traitement Médical 3D, gPTM3D)
PTM3D is a DICOM data analyser developed at LIMSI (CNRS) by A. Osorio's team. It features transfer, archiving and visualization of DICOM-encoded data and runs on a standalone PC. gPTM3D targets the remote execution of time-consuming parts of the software suite, e.g. volume reconstruction of large or complex organs.
Contact | Cécile Germain-Renaud, germain@lal.in2p3.fr
Description | Radiological data interactive segmentation and analysis
Deployment and status | The application has been ported to LCG2 on top of the interactive job submission service (interactive job attribute in JDL file).
Present infrastructure usage | It has been deployed and tested on the Orsay cluster.
Desired infrastructure usage | Submission of short interactive jobs with guaranteed bandwidth between the user interface and the computing nodes.
Users | Currently, the application is mainly run by developers for porting purposes. A community of medical users trained to use the PTM3D software is eager to experiment with the grid-enabled version. The tests of the LCG2 version will be completed in October. The resulting software will be used for calibration experiments on datasets coming from high-performance scanners.
Plans | The interactivity mechanisms rely on a general-purpose interactivity portal, which in turn is built on top of a proxy mechanism that enables communication between the UI and the external world (hospital). This proxy system, although efficient from the performance point of view, should be integrated with, or redesigned on top of, the new higher-level APIs which will be defined in EGEE, in order to ensure compatibility with further evolutions of the EGEE core middleware. It is planned to demonstrate the application at the next "Journées Françaises de Radiologie".
Main issues related to deployment | The application requires that the worker node (WN) be able to open a TCP connection to a non-EGEE machine (it does not need to accept connections). The QoS needed for interactivity requires that some sites in the Biomed VO define a job management policy that gives a high priority (without preemption) to this class of jobs. This high priority can be implemented in more than one way and can be adapted to the specifics of the local policy.
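The "high priority without preemption" policy requested for interactive jobs can be sketched in a few lines of Python. This is only one possible reading of the requirement, not a description of any site's batch system: the function `schedule`, the single worker slot and the job tuples are all assumptions made for the illustration. A waiting interactive job moves ahead of queued batch jobs, but a job that is already running is never interrupted.

```python
import heapq


def schedule(jobs):
    """Simulate one worker slot with non-preemptive priority scheduling.

    jobs: list of (arrival_time, duration, priority, name) tuples, where a
    lower priority number means a more urgent job (hypothetical convention).
    Returns a list of (name, start_time, end_time) in execution order.
    """
    pending = sorted(jobs)   # ordered by arrival time
    waiting = []             # heap of (priority, arrival order, duration, name)
    timeline = []
    clock, i, seq = 0, 0, 0
    while i < len(pending) or waiting:
        # Admit every job that has arrived by the current time.
        while i < len(pending) and pending[i][0] <= clock:
            _, duration, priority, name = pending[i]
            heapq.heappush(waiting, (priority, seq, duration, name))
            seq += 1
            i += 1
        if not waiting:
            # The slot is idle: move the clock to the next arrival.
            clock = pending[i][0]
            continue
        # Start the most urgent waiting job; it runs to completion.
        _, _, duration, name = heapq.heappop(waiting)
        start = clock
        clock += duration
        timeline.append((name, start, clock))
    return timeline


if __name__ == "__main__":
    jobs = [
        (0, 60, 1, "batch-1"),
        (1, 60, 1, "batch-2"),
        (5, 2, 0, "interactive"),
    ]
    for name, start, end in schedule(jobs):
        print(name, start, end)
```

In this example the interactive job arrives while `batch-1` is running, waits for it to finish, and then runs before `batch-2` even though `batch-2` was queued earlier.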