The first EGAAP meeting was held at CERN on the 14th of June 2004. During this meeting, four new communities were identified and invited to deploy their applications on GILDA and then on EGEE:
- Earth Science Research
- Earth Science Industry (Geophysics)
- Computational Chemistry
- Astro-particle physics
Earth Science Research
Scientific area | Earth Observations
Contact | Monique Petitdidier, IPSL, CNRS, France, monique.petitdidier@cetp.ipsl.fr
VO manager | Wim Som de Cerff, sdecerff@knmi.nl
VO services | VO administration, Resource Broker and RLS at SARA
Application(s) deployed | Processing and validation of global atmospheric ozone observations made by the GOME satellite
Status on GILDA | IPSL cluster incorporated into EGEE at PM6
Status on EGEE-0 | First jobs submitted on EGEE-0
Number of nodes accessible on EGEE infrastructure |
Main requirements | Metadata handling, security (restricted access, groups and roles within a VO), scalability (e.g. number of files, file sizes), support for parallel programs (e.g. MPI, PVM)
Main issues related to deployment |
Earth Science Industry
Scientific area | Geophysics
Contact | Dominique Thomas, CGG, France, dthomas@cgg.com
VO manager |
VO services | VO administration, Resource Broker and RLS at SARA
Application(s) deployed | 3D analysis of Earth underground for oil and gas search.
Status on GILDA | RPM of application Egeode 1.0 deployed and under test.
Status on EGEE-0 |
Number of nodes accessible on EGEE infrastructure |
Main requirements | Metadata handling, security (restricted access, groups and roles within a VO), scalability (e.g. number of files, file sizes), support for parallel programs (e.g. MPI, PVM)
Main issues related to deployment |
Computational Chemistry
Scientific area | Computational chemistry
Contact | Antonio Laganà, Dipartimento di Chimica, University of Perugia, Italy, lag@unipg.it
VO manager | Osvaldo Gervasi, osvaldo@unipg.it
VO services | VO administration, Resource Broker and Replica Location Service located at CNAF
Application(s) deployed | “A priori” atomic and molecular simulations. The application consists of a portal from which users can start their simulations in parallel using MPI and retrieve the results.
Status on GILDA | First cluster in Perugia added to GILDA at PM6
Status on EGEE-0 |
Main requirements | Deployment of licensed software on the grid infrastructure. Possibility to use MPI.
Main issues related to deployment |
Astro-particle Physics
Scientific area | Astro-particle Physics
Contact | Harald Kornmayer, FZK, Germany, harald.kornmayer@iwr.fzk.de
VO manager | Harald Kornmayer, FZK, Germany, harald.kornmayer@iwr.fzk.de
VO services | VO administration, Resource Broker and Replica Location Service located at SARA
Application(s) deployed | MAGIC telescope Monte Carlo simulations.
Status on GILDA | Realisation of RPM(s) of MAGIC Monte Carlo in progress.
Status on EGEE-0 |
Number of nodes accessible on EGEE infrastructure |
Desired infrastructure usage |
Main requirements |
Main issues related to deployment |
Besides these communities, the curiosity and interest triggered by the tutorials and induction courses mentioned earlier in this document have led the following to express their intention to use GILDA and to join it both as communities of users and as sites offering computational and storage resources:
- Astrophysics
- Hydrology
- Seismology
- Grid Search Engines (the EU GRACE Project, http://www.grace-ist.org)
- Stock market simulators
- Digital Video Applications
- European Space Agency
In most cases, the Virtual Organisations have already been created and some tests have already been successfully performed on the GILDA test-bed.
The next EGAAP meeting to select the new EGEE Generic Applications is expected to be held on the 25th of November 2004, within the Second EGEE Project Conference.
6.4 Collaborations beyond application deployment
While EGEE's major focus is on providing a production-quality grid infrastructure, contacts established since the project kick-off show that there are many ways in which EGEE can collaborate with existing application-oriented grid projects beyond application deployment:
- Some grid application deployment programs (MammoGrid, GRACE, Diligent) have expressed interest in using EGEE middleware to deploy their own infrastructure.
- Some grid middleware development programs (myGrid, SIMDAT) have expressed interest in deploying on EGEE the high-level services or platforms they are developing or have developed, using EGEE middleware as the underlying layer that handles the interaction between their services and the resources.
- Others (BIRN, DEISA) are interested in the joint deployment of applications and/or the exchange of technology.
6.4.1 EGEE as a middleware provider to grid application deployment programs
EGEE middleware is designed to address the needs of a production-quality infrastructure. As such, it is appealing to all grid projects that are committed to providing an infrastructure but lack the critical mass needed to develop their own middleware. Indeed, the rapid change in grid standards has generated a lot of turmoil, and very few middleware stacks available today are both based on web services and usable by user communities.
As an example, NA4 was approached by the European project MammoGrid (FP5, eHealth unit). The aim of the MammoGrid project is to develop a Europe-wide database of mammograms that will be used to investigate a set of important healthcare applications and to explore the potential of the grid to support effective co-working between healthcare professionals. MammoGrid expressed interest in using EGEE middleware to deploy its own testbed. Two MammoGrid developers have been integrated into the NA4 biomedical technical team and benefit from early access to, and testing of, EGEE middleware.
Here, NA4 plays the role of a mediator between external projects and EGEE middleware developers.
Project name | MammoGrid
Project contact point | S. Amendolia, CERN
NA4 contact point | J. Montagnat
Other EGEE contact points | NA1, JRA1
Project web site | http://mammogrid.vitamib.com/
Project type (national, European, etc.) | European (FP5, eHealth unit)
Nature of envisaged collaboration | Use of EGEE middleware for deployment of a pan-European database of mammograms
Main requirements | Migration of EGEE middleware to gLite
Status | MammoGrid developers integrated into EGEE biomedical technical team
6.4.2 EGEE as an “underware” provider to grid middleware development projects
Over the last three years, several EU and national grid projects have focused on designing and implementing high-level environments for specific user communities. These environments, often described as virtual laboratories, are based on web services and aim to provide extensible open platforms for data and tool interoperability. However, these virtual laboratories need access to resources in order to deliver the services their customers need. EGEE can supply the necessary infrastructure, provided the virtual laboratory services are interfaced to EGEE middleware.
As an example, myGrid [4] aims to deliver a personalized collaborative problem-solving platform for e-Scientists working in a distributed environment, so that they can construct long-lived in silico experiments, find and adapt those of others, publish their own view in shared repositories, and be better informed about the provenance of the tools and data directly relevant to them. The focus is on data-intensive post-genomic functional analysis. Contacts have been established with a view to deploying myGrid on the EGEE infrastructure in the coming months.
Here, NA4 cannot act as a mediator between external projects and EGEE and the involvement of other EGEE activities, namely JRA1, is needed.
Project name | myGrid
Project contact point | C. Goble, Univ. Manchester, UK
NA4 contact point | V. Breton
Other EGEE contact points | B. Jones
Project web site | www.mygrid.org
Project type (national, European, etc.) | National (UK e-science)
Nature of envisaged collaboration | Porting the myGrid platform to EGEE
Main requirements | Migration of EGEE middleware to gLite
Status | Discussions under way
Contacts have also been established with the GridLab [7] project, which focuses on producing a set of application-oriented Grid services and toolkits providing capabilities. Close contacts already exist between the GENIUS and GAT developers.
6.4.3 Collaboration with other grid infrastructures
Contacts have been established with several projects (BIRN, DEISA), which also aim to deploy a grid research production infrastructure. In that case, collaboration focuses on joint deployment of applications and middleware interoperability.
Potential collaboration has been identified between EGEE and DEISA in the area of medical simulation and drug discovery. Applications in these two areas are presently under deployment on the EGEE production infrastructure and their scientific relevance would be significantly improved by parallel deployment on DEISA.
6.4.3.1 Potential collaborations with DEISA
GATE (Geant4 Application for Tomographic Emission) is a Monte Carlo platform dedicated to nuclear medicine, radiotherapy and brachytherapy applications. The goal of deploying GATE on the EGEE grid infrastructure is to study the possibility of offering hospitals and clinical centres a grid service for treatment planning. As Monte Carlo simulations are CPU-intensive, the idea is to parallelize them across multiple nodes or processors. Each sub-simulation of one large simulation will typically use the same anonymized binary medical image to compute a personalized treatment. The typical size of the medical images used is 40 MB.
Tests performed on DataGrid have shown that the overall computing time depends strongly on the workload of the grid nodes. For routine clinical use, a response time of the order of 8 hours from job submission to computed treatment must be guaranteed. A joint deployment on EGEE and DEISA would make it possible to study the benefit of transferring such jobs to a supercomputer in case of excessive workload on the grid.
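The routing idea described in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SiteEstimate` type, the `choose_target` function and the notion that queue wait and run time can be estimated in advance are all hypothetical and are not part of EGEE or DEISA middleware. The only figure taken from the text is the 8-hour response time.

```python
from dataclasses import dataclass

DEADLINE_HOURS = 8.0  # response time quoted in the text


@dataclass
class SiteEstimate:
    """Hypothetical turnaround estimate for one execution target."""
    name: str
    queue_wait_hours: float  # expected delay before the job starts
    run_hours: float         # expected run time once started

    @property
    def turnaround_hours(self) -> float:
        return self.queue_wait_hours + self.run_hours


def choose_target(grid: SiteEstimate, fallback: SiteEstimate,
                  deadline_hours: float = DEADLINE_HOURS) -> str:
    """Prefer the grid; use the fallback only when the grid is expected
    to miss the deadline and the fallback is expected to meet it."""
    if grid.turnaround_hours <= deadline_hours:
        return grid.name
    if fallback.turnaround_hours <= deadline_hours:
        return fallback.name
    # Neither target meets the deadline: pick the faster of the two.
    return min((grid, fallback), key=lambda s: s.turnaround_hours).name
```

For instance, with an estimated 6 hours of queueing and 5 hours of run time on the grid, the sketch would route the job to a fallback whose estimated turnaround is 2.5 hours.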
The requirements on networks and middleware can be summarized as follows:
- Guarantee of data integrity (e.g. to avoid information loss during network transfers);
- Guarantee of computation completion (e.g. if a node shuts down during the computation, a new computation has to be launched on another node very quickly);
- Guarantee of the global duration, i.e. the time between job submission and availability of results (e.g. less than 8 hours in order to be usable in the hospital context);
- Traceability: job history must be kept in case of legal inquiries.
The added value of a supercomputer in terms of computing power is limited by the fact that the GATE code is not massively parallel.
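As a rough illustration of the completion and traceability requirements listed above, the following Python sketch splits one simulation into sub-simulations, resubmits a sub-simulation when it fails, and records every attempt. All names (`split_events`, `run_with_resubmission`, the `run_job` callback and its `seed` argument) are hypothetical; a real deployment would rely on the grid workload management system rather than a local loop.

```python
from __future__ import annotations


def split_events(total_events: int, n_jobs: int) -> list[int]:
    """Divide total_events into n_jobs nearly equal sub-simulations."""
    base, extra = divmod(total_events, n_jobs)
    return [base + (1 if i < extra else 0) for i in range(n_jobs)]


def run_with_resubmission(job_sizes, run_job, max_attempts=3, base_seed=0):
    """Run every sub-simulation, resubmitting on failure.

    run_job(job_id, n_events, seed) is a caller-supplied (hypothetical)
    function that raises RuntimeError when the node fails.
    Returns (results, history); history records every attempt so that
    the job trail can be audited later.
    """
    results, history = [], []
    for job_id, n_events in enumerate(job_sizes):
        for attempt in range(1, max_attempts + 1):
            try:
                # Each sub-simulation gets its own seed so that the
                # random streams of the sub-jobs differ.
                result = run_job(job_id, n_events, seed=base_seed + job_id)
            except RuntimeError as exc:
                history.append((job_id, attempt, f"failed: {exc}"))
                continue  # resubmit the same sub-simulation
            history.append((job_id, attempt, "done"))
            results.append(result)
            break
        else:
            raise RuntimeError(f"job {job_id} failed {max_attempts} times")
    return results, history
```

Keeping the attempt history separate from the results is a deliberate choice here: it mirrors the traceability requirement without affecting how the partial results are merged.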
A new EGEE application, to be submitted at the EGAAP meeting in November, aims at exploring in silico Drug Discovery in the emerging framework of grids. The ultimate goal is to set up a fabric for scientific discovery to address neglected diseases.
As a first step, the application focuses on in silico docking. A docking algorithm computes the binding energy between a protein target and a chemical compound in order to evaluate the compound's potential action on the protein's active sites. Docking requires knowledge of the three-dimensional structure of both the compounds and the target proteins. It also requires a powerful algorithm to compute the binding energy from the 3D configurations of the molecules.
Two neglected diseases have been chosen to demonstrate the concept: Dengue and Malaria. These diseases could be addressed on a joint in silico screening prototype by the Swiss Biogrid initiative (Dengue) and EGEE (Malaria). Regarding malaria, preliminary tests have been deployed on the GILDA testbed using protein targets from P. falciparum and A. gambiae, the MSDChem library (EBI) and the GRAMM docking algorithm (Stony Brook University). The tests have demonstrated a significant improvement in execution time compared to a single processor: up to a factor of 10, from 30 hours to 3 hours.
Further tests are foreseen using the DOCK algorithm, whose parallel version could be deployed on a supercomputer in a collaboration between EGEE and DEISA.
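The screening workflow outlined above, in which every target is docked against every compound and the pairs are then ranked, can be sketched as follows. This is a minimal illustration, not the GRAMM or DOCK interface: the `dock` callback is a hypothetical stand-in for a real docking code, and the sketch assumes that a lower binding energy indicates a more promising compound. The `speedup` helper simply reproduces the arithmetic behind the factor of 10 quoted in the text.

```python
from itertools import product


def screen(targets, compounds, dock, top_n=3):
    """Dock every (target, compound) pair and return the top_n pairs
    with the lowest energy as (energy, target, compound) tuples.

    Each pair is independent of the others, which is what makes the
    workload easy to distribute over many grid nodes.
    """
    scores = [(dock(t, c), t, c) for t, c in product(targets, compounds)]
    return sorted(scores)[:top_n]


def speedup(serial_hours, parallel_hours):
    """Ratio of single-processor time to distributed time."""
    return serial_hours / parallel_hours


# Figures quoted in the text: 30 hours on one processor, 3 hours on the testbed.
observed = speedup(30, 3)
```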
6.4.3.2 Other collaborations
The Biomedical Informatics Research Network (BIRN) [5] is a National Institutes of Health initiative that fosters distributed collaborations in biomedical science. Currently BIRN involves a consortium of 15 universities and 22 research groups that participate in one or more of three test bed projects centred on brain imaging of human neurological disorders and associated animal models. BIRN middleware is based on the Storage Resource Broker technology developed at the San Diego Supercomputer Center.
Contacts have been established regarding the interoperability of EGEE and BIRN middleware, and one of the groups participating in the NA4 biomedical activity is installing a BIRN node in Madrid.
Project name | BIRN
Project contact point | M. Ellisman, San Diego Supercomputer Center, USA
NA4 contact point | J-M Carazo, CNB, Madrid
Other EGEE contact points | F. Gagliardi
Project web site | www.nbirn.net
Project type (national, European, etc.) | National (USA)
Nature of envisaged collaboration | Setting up joint EGEE/BIRN nodes
Main requirements | Interoperability of SRB and EGEE middleware
Status | Discussions under way; installation of a BIRN node in Madrid
7 Conclusion
This document has presented the strategy adopted by the project to host new application communities on its infrastructure. In the first 6 months of the project, several groups involving NA4 representatives (joint NA4/SA1 group, Project Technical Forum, EGEE Generic Application Advisory Panel) have been created to better integrate user communities. The GILDA demonstration and training infrastructure has also proven to be a major first step for new users. All in all, the project strategy is in its infancy and will evolve over the course of the project. Indeed, EGEE is the first open grid infrastructure and there is a lot to learn from the pilot applications and the first new user communities hosted on EGEE.
Contacts established since the start of the project have highlighted other ways in which EGEE can enable grids for e-science in Europe, beyond porting applications to its infrastructure:
- EGEE can provide the underlying middleware to which high-level environments such as virtual laboratories can be interfaced.
- Several projects have expressed interest in using EGEE middleware to deploy their own infrastructure.
- EGEE and DEISA have complementary features that allow the joint deployment of applications.
- EGEE is already collaborating with the largest grid research infrastructures outside Europe.