(ACRONYM)
Preservation of Scientific Data for International Long-Term Analysis
28/10/2011
Part B
Type of funding scheme:
Combination of Collaborative Project and Coordination and Support Action: Integrated Infrastructure Initiative (I3)
Work programme topic addressed:
INFRA-2012-3.2 – International cooperation with USA on common e-infrastructure for scientific data.
Name of the coordinating person:
Jamie SHIERS
List of Participants:
Part. no. | Participant organisation name | Short name | Country
1 (coord.) | European Organization for Nuclear Research | CERN | Switzerland
2 | Centre National de la Recherche Scientifique | CNRS | France
3 | Deutsches Elektronen Synchrotron | DESY | Germany
Table of Contents
1. Scientific and Technical Quality
2. Implementation
2.1. Individual participants
2.1.1. CNRS
CERN
DESY
3. Impact
4. Ethical Issues
4.1. Use of Human Data on the Grid Infrastructure
5. Gender Action Plan
6. Annex: References
7. Annex: Glossary
8. Annex: Letters of Support
1. Scientific and Technical Quality
2. Implementation
2.1. Individual participants
2.1.1. CNRS
Brief description of legal entity
The Centre National de la Recherche Scientifique (CNRS - National Centre for Scientific Research http://www.cnrs.fr) is a government-funded research organization under the administrative authority of the French research ministry. CNRS, consisting of 1211 laboratories, is the largest fundamental research organization in Europe and is extensively involved in national, European, and international projects covering all fields of knowledge.
Within the ROSCOE project, CNRS is involved on behalf of the Institut des Grilles (IDG), directed by Dr. Guy WORMSER. IDG was created in September 2007 to federate the French CNRS contributions to grid deployment and grid research, to reinforce interaction between computing science research and production infrastructures, and to represent CNRS in European projects. IDG offers users and scientific communities easy and transparent access to computing resources distributed across tens of sites. More than 15 CNRS laboratories from 4 scientific departments and 2 national institutes (IN2P3 and INSU) take part in IDG.
IDG represents the following Joint Research Units:
- Laboratoire de l’Accélérateur Linéaire (LAL, UMR8607), directed by Dr. Guy WORMSER
- Centre de REcherche en Acquisition et Traitement du sIgnal pour la Santé (CREATIS, UMR5220), directed by Dr Isabelle MAGNIN
- Institut de Recherche sur les Systèmes Atomiques et Moléculaires Complexes (IRSAMC), directed by Pierre LABASTIE and comprising four laboratories: Laboratoire Collisions Agrégats Réactivité (UMR5589), Laboratoire de Chimie et Physique Quantique (UMR5626), Laboratoire de Physique et Chimie de Nano-Objets (LPCNO, UMR5215), and Laboratoire de Physique Théorique Toulouse (UMR5152)
- Laboratoire de Physique Corpusculaire de Clermont-Ferrand (LPC, UMR6533), directed by Dr Alain BALDIT
- Laboratoire de Recherche en Informatique (LRI, UMR8623), directed by Michel BEAUDOUIN-LAFON
These units are Joint Research Units for which CNRS will represent the Partner Institutions via the special clause 10 in the Grant Agreement and Consortium Agreement. The CNRS Ile-de-France Sud Regional Office (DR4) will be the lead delegation coordinating and representing all the CNRS Units involved in the Project.
CNRS/LAL
Brief description of unit
The Laboratoire de l’Accélérateur Linéaire (LAL) is supervised jointly by the University Paris-Sud XI and the Institut National de Physique Nucléaire et de Physique des Particules (IN2P3) of CNRS. The laboratory’s research activity centres on particle physics supplemented with strong programmes in cosmology and astrophysics. Since 2001, the laboratory has contributed strongly to European grid projects in both the deployment of grid technologies and their utilization by scientists.
Main tasks in project and relevant experience
LAL will lead the management of the ROSCOE project capitalizing on its experience from previous grid projects, particularly its experience leading the “User Community Expansion & Support (NA4)” activity of EGEE-III whose size in terms of partners is nearly that of the entire ROSCOE project.
LAL will also contribute one full-time engineer to the Computer Science & Engineering (CSE) VRC. LAL has been an active grid resource centre since the European DataGrid project (2001-2004) and has extensive experience in running grid services. This experience will help the CSE develop scripts to extract relevant information from grid services for publishing in the Grid Observatory.
Profiles of individuals undertaking the work
Charles Loomis, PhD in high-energy physics, currently works at LAL as a CNRS research engineer and is also a partner in SixSq Sàrl. After obtaining his PhD, he worked for nine years on major high-energy physics experiments at Fermilab in the United States and at CERN in Geneva, Switzerland. This research required facility with all aspects of computing (programming custom electronics, developing software, managing code, and exploiting distributed computing facilities), giving him extensive, practical experience in all those areas. Since 2001, he has been deeply involved in European grid technology projects. In the European DataGrid project he managed the deployment and operation of the project’s grid infrastructure. Since then he has shifted towards more user-related activities and is now, in EGEE-III, coordinator of the User Community Expansion & Support work package. This experience will be extremely valuable in coordinating the ROSCOE project.
2.1.1.1. CNRS/CREATIS
Brief description of unit
CREATIS (Centre de REcherche en Acquisition et Traitement du sIgnal pour la Santé) is a French laboratory co-located at CNRS (unit UMR 5220), Inserm (U630), INSA Lyon and Université Claude-Bernard Lyon 1. Its main research directions are the following:
- The identification of major health issues that can be addressed by imaging.
- The identification of theoretical breakthroughs dedicated to live imaging to be made in signal processing and image processing, modelling and numerical simulation.
Creatis meets these challenges through a multidisciplinary approach organised as a matrix that highlights the interactions between eight research teams drawn from information and communication science and technology, engineering sciences, and life sciences.
Main tasks in project and relevant experience
Creatis will contribute to the Life-Science Virtual Research Community of ROSCOE. In particular, it is involved in VRC coordination, dissemination and training (NA2 and NA3), in the design and setting up of the LS gateway (SA2) and in application porting (SA3).
Creatis has been involved in e-Infrastructure projects since the beginning of the 2000s, including DataGrid and EGEE-I, II and III.
Profiles of individuals undertaking the work
Pr. Hugues Benoit-Cattin is a professor at INSA Lyon. He has been involved in several grid projects since the early 2000s, including DataGrid and EGEE I, II and III. In particular, he has been working on the porting to EGEE of SIMRI, a versatile simulator for MR imaging parallelized with MPI. In addition, he has been contributing to several national grid-related projects.
Ms Sorina Pop is a research engineer at CNRS. She has been working on the porting of grid applications for 2 years. In particular, she is working on the use of pilot-jobs for the GATE radiotherapy simulation application. She has been involved in EGEE-II and EGEE-III and in several national grid-related projects.
Dr Tristan Glatard is a junior scientist at CNRS. He leads the French national VIP project (Virtual Imaging Platform), which aims to port image simulators to DCIs. Before that, he was involved in the EU projects EGEE-II/III, in the French national projects AGIR (grid technology for medical imaging) and Gwendia (workflows for medical imaging), and in the Dutch national project VL-e (Virtual Lab for eScience). He is one of the developers of the MOTEUR workflow engine.
2.1.1.2. CNRS/IRSAMC
Brief description of the unit
IRSAMC (Institut de Recherche sur les Systèmes Atomiques et Moléculaires Complexes) brings together laboratories investigating fundamental chemistry and physics; these are joint research units of the University Paul Sabatier and CNRS. The goal of the institute is to promote the fundamental research of the constituent laboratories by encouraging collaborative scientific research and by providing a certain number of services.
Main tasks in project and relevant experience
SA1
CNRS/IRSAMC will support users in running quantum chemistry codes on the grid. These will include both widely used codes and more specialised codes developed at IRSAMC. Its expertise in code development, in particular in the field of correlated methods, and in the quantum chemistry treatment of complex problems will make IRSAMC's support action efficient for the chemistry community.
SA2
Moreover, IRSAMC can share with the community all the tools developed by its members that facilitate the easy and well-balanced design and production of FORTRAN codes.
SA3
IRSAMC commits to implementing and exploiting quantum chemistry codes on the grid in order to achieve original and high-impact scientific applications in frontier domains.
In particular, many quantum chemistry codes are developed or co-developed at IRSAMC, such as the Quantum Monte Carlo code QmcChem, which will be one of the first to be implemented on the grid.
Finally, IRSAMC commits to continuing to promote the important activity of standardization to ensure interoperability. IRSAMC has already devoted itself considerably to this task in the past. Some members of the centre were the main developers of Q5Cost, which is emerging as the de facto standard for quantum chemistry code data management.
Profiles of individuals undertaking the work
Stefano Evangelisti: Professor, author of about 120 publications in international journals. Research interests: methodological development in quantum chemistry, quantum chemistry software production, treatment of electronic correlation, interoperability and standardisation of quantum chemistry codes and data formats.
Thierry Leininger: Professor, author of about 50 publications in international journals. Research interests: methodological development in quantum chemistry, quantum chemistry software production; co-developer of the MOLPRO quantum chemistry code (mixed ab-initio/DFT methods).
Michel Caffarel: “Directeur de recherche”, author of about 60 publications in international journals. Research interests: methodological development in quantum chemistry, Quantum Monte Carlo treatment of chemical problems, quantum chemistry software production.
To be hired: He/She will participate in all the scientific activity related to the project, in particular by assisting with or performing the gridification of the codes, and by participating in the development of the standard for quantum chemistry (definition of a better data ontology, development and testing of the software libraries). He/She will also help run test cases on the grid, which will also cover new and original scientific problems.
The researcher hired within the action of the CCSMT will mainly have the profile of a quantum chemist involved in methodological development and code production. Such a background should provide a clear scientific understanding of the subjects and, at the same time, the ability to pick up readily the skills and technologies needed to address code interoperability, high-throughput and distributed computation, and code optimization. Expertise in the development of quantum chemistry codes and methods will be of crucial importance, as will knowledge of the grid environment and standardization.
2.1.1.3. CNRS/LRI
Brief description of unit
LRI (Laboratoire de Recherche en Informatique) is a joint laboratory of Université Paris-Sud-11 (UPS) and Centre National de la Recherche Scientifique (CNRS). LRI is located in Orsay, France. LRI is a member of Institut des Grilles. The Optimization and Learning group (TAO, http://tao.lri.fr/) of LRI is a joint team of LRI and INRIA-Saclay.
Main tasks in project and relevant experience
The TAO team includes three full professors/senior researchers, six associate professors/junior researchers, and about 18 post-docs or PhD students on average. Its areas of expertise are Computational Learning, Evolutionary Computation, and Autonomic Computing. The group is in charge of the overall supervision of the CSE VRC. TAO is involved in numerous international and national large-scale projects (http://tao.lri.fr/tiki-index.php?page=Contracts), and in the Microsoft-INRIA Joint lab project on Adaptive search for e-Science. The Autonomic Computing Special Interest Group (AC-SIG) of TAO has been involved in EGEE-II, has pioneered the creation of the Grid Observatory (GO) Cluster in EGEE-III, and has fostered national projects supporting the GO through Institut des Grilles, University and DIGITEO research.
Profiles of individuals undertaking the work
Cécile Germain-Renaud is a full Professor at UPS, and the leader of the AC-SIG. She authored and co-authored more than 50 papers in High Performance Computing and more recently Autonomic Computing. She has been coordinator of ANR and University projects. She leads the Grid Observatory cluster in EGEE-III, and participates in scientific committees (EGEE UF, NPC, CCGrid). She has initiated the Grids Meet Autonomic Computing workshop associated with the IEEE Conference on Autonomic Computing.
Michèle Sebag is a senior researcher within CNRS. She has authored or co-authored more than 80 papers in journals and main international conferences in Machine Learning and Evolutionary Computation. She is or has been a member of the editorial boards of Machine Learning Journal, Genetic Programming and Evolvable Hardware, and Knowledge and Information Systems; of the Steering Committee of the PASCAL NoE; of the Management Board of KD-Ubiq; and of the Program Committees of the major AI conferences (ICML, ECML, PKDD, ICDM, ILP, GECCO, PPSN). She has been area chair for ICML 2009, 2008, 2005 and ECML/PKDD 2008, 2005, program co-chair for ILP 2001, and vice-chair of ICDM 2003. She is the president of the French Association for Artificial Intelligence.
To be hired: One engineer will be recruited. He/She will be in charge of data pre-processing and restructuring in SA2, and will participate in support (SA1) and code development (SA3).
CNRS/LPC
Brief description of unit
UMR-CNRS 6533 - LPC Clermont-Ferrand – IN2P3 - Laboratoire de Physique Corpusculaire de Clermont-Ferrand, FRANCE.
The PCSV team at LPC (Laboratoire de Physique Corpusculaire – CNRS-IDG) studies the interface between high-energy physics and life sciences; its approach is to apply the computing methods and tools developed in high-energy physics to biomedical applications, with a specific focus on grid technology. Through its involvement in the DataGrid (FP5), EGEE (FP6-FP7), EMBRACE (FP6) and SHARE (FP6) European projects, as well as in the French GLOP and Rugbi projects funded by the French ministry of research, the team has acquired recognized expertise in the deployment of biomedical applications in grid environments. The team also leads the AuverGrid regional grid initiative in Auvergne, centred on LPC Clermont-Ferrand, which is also one of the main nodes providing resources to the biomedical virtual organization in EGEE.
Main tasks in project and relevant experience
David SARRAMIA is an assistant professor at the LPC laboratory. He will participate in the Computer Science & Engineering (CSE) VRC within the Scientific Gateways (SA2) work package, and more precisely in providing digital asset search and retrieval facilities to scientific communities through its gateway. He will contribute to the design of the grid ontology and its use.
Vincent BRETON is a researcher at CNRS and will lead the LS VRC.
Profiles of individuals undertaking the work
Dr. Vincent Breton obtained his PhD in Nuclear Physics from the University of Paris XI-Orsay in 1990. Since 1990, he has been a research associate at the French National Centre for Scientific Research (CNRS). In 2001, he founded the Computing Platform for Life Science research group (http://clrpcsv.in2p3.fr), which works on the application to biomedical sciences of the IT technologies and tools used in high-energy physics. He led application support and identification in the EGEE project (2004-2006). A member of the steering committee of the EMBRACE network of excellence dedicated to grids for bioinformatics (FP6), he is also the spokesperson of the international WISDOM collaboration.
Dr. David Sarramia is a computer scientist specialising in modelling and simulation. He obtained his PhD in Computer Science from Blaise Pascal University in 2002. He has extensive experience in modelling and simulating complex systems. His knowledge embraces UML, design methodology, other object-oriented formalisms, mathematical models and ontology. He has published in international and national conferences and journals. He has also worked on multi-scale modelling of complex systems, using several model types at the same time (model coupling). He has published in the area of modelling and simulation methods, proposing methods that allow simulation models to be derived from conceptual models independently of the functional models used. In this area of interactive simulation, he has also participated in the design of ontologies.
An engineer will be recruited to work on work packages NA2 and NA3.
CERN
Brief description of legal entity
CERN, the European Organization for Nuclear Research, is the largest particle physics laboratory in the world and is an International Organisation with its headquarters in Switzerland. Currently CERN is commissioning the Large Hadron Collider (LHC), a new particle accelerator on the Swiss-French border near Geneva, expected to be operational at the end of 2009. LHC is the world’s most powerful accelerator and will provide research facilities for several thousand high-energy physics researchers from all over the globe. The LHC experiments are designed and constructed by large international collaborations and will collect data over a period of 10-15 years. These experiments will run up to 1 million computing tasks per day and will generate around 15 petabytes of data per year. This data will be shared with all the participating institutes. The computing capacity required to analyse the data far exceeds the capacity needs of any comparable physics experiments today and relies on the combined resources of some 200 computer centres world-wide. CERN and the particle physics community have chosen grid technology to address the huge data storage and analysis challenge of LHC.
Main tasks in project and relevant experience
CERN will be involved in work packages NA2, NA3 as well as SA1, SA2 and SA3.
The IT department of CERN currently has 228 staff, predominantly engineers, who operate one of Europe’s largest research computer centres supporting about 17,000 users. The department has developed leading expertise in large-scale data centres and long-standing collaborations with industrial and academic partners in the fields of high performance computing and advanced networking.
The CERN IT department has been at the forefront of computing for many years and now leads the world’s largest grid project, EGEE (Enabling Grids for E-SciencE). CERN has also prominently contributed to a number of EGEE-related grid projects aiming at extending the EGEE production grid infrastructure to new geographical areas, to serve new applications domains and to support the grid community: BalticGrid-II, D4Science, D4Science-II, EGI_DS, enviroGRIDS, ETICS 2, GridTalk, Health-e-Child and SEE-GRID-SCI. Under FP6 and FP7, the department has been involved in some 20 European Commission-funded projects. CERN is a founding partner of the recently formed European Grid Initiative that will provide a sustainable grid infrastructure for Europe’s research communities.
Profiles of individuals undertaking the work
Dr Jamie Shiers currently leads the Grid Support group in CERN’s IT department. He has been involved in grids since the early days of the European Data Grid. From 2005 he led the WLCG Service Challenge activity and later organized the Common Computing Readiness Challenge 2008 and the Scale Test for the Experiment Programme 2009, two key demonstrations of the production readiness of WLCG.
Jakub T. Moscicki is a software engineer and researcher at CERN. He obtained his MSc in Computing at the AGH University of Science and Technology in Krakow. His research and engineering interests focus on distributed and parallel applications deployed on large-scale computing infrastructures such as grids. He is the technical leader of the Ganga project (and the creator of DIANE), now used by several hundred users from high-energy physics and other sciences.
Maarten Litmaath has more than 20 years of in-depth experience with POSIX-compliant computing and UNIX derivatives. He joined the CERN IT department in 2002 and since then has been working on building up the Worldwide LHC Computing Grid (WLCG) as well as the Enabling Grids for E-sciencE (EGEE) infrastructure.
DESY
Brief description of the legal entity
The “Stiftung Deutsches Elektronen-Synchrotron DESY” is one of the world's leading centers for the investigation of the structure of matter. DESY develops, operates, and uses accelerators and detectors for photon science and particle physics. As a member of the Helmholtz Association in Germany, DESY is a non-profit research organization supported by public funds.
Main tasks in the project and relevant experience
As a founding partner of the DE/CH federation in EGEE, and as a Tier-2 centre for WLCG, DESY has been successfully operating a grid infrastructure for many years. DESY acts as the host for a number of global and regional VOs, which are supported by many sites throughout the scientific community, including core grid services and support activities for the user communities.
DESY is a key partner in the ILC community, active in the development of accelerators and detectors for a future linear collider. By founding appropriate VOs, DESY has enabled institutes, universities, and scientists working towards the ILC to access grid resources. Within the HEP VRC, DESY plans to support the worldwide ILC effort by providing user support, help in adopting grid components, and the integration of experiment-specific frameworks.
DESY will also participate in the Photon Science VRC. The main tasks are in SA2, focussing on support of Photon Science users and metadata handling, and in JRA1 on future data management technologies.
Experience for SA2: DESY is one of the largest centres for Photon Science worldwide, operating a number of facilities, and is a founding member of the European XFEL. DESY has been supporting Photon Science users for the last 50 years.
Experience for JRA1: DESY is the host organization for dCache.org, one of the main storage solutions in use in HEP at most of the LHC Tier-1 and many larger Tier-2 sites. DESY, as leading partner of an international collaboration, has a long history of providing storage solutions for large laboratories in Europe and the US, such as FERMILAB (Chicago), BNL (New York), IN2P3 (Lyon), SARA (Amsterdam) and many others. DESY has successfully steered the development of the dCache storage technology towards industry standards, including state-of-the-art protocols like http(s) and NFS4.1 (pNFS).
Profiles of individuals undertaking the work
Dr. Andreas Gellrich graduated in experimental HEP and has been working for the DESY IT division as a staff member since 2001. He leads the grid activities at DESY and is the representative for DESY in EGEE. Dr. Gellrich has long experience in deploying and operating grid infrastructures and in supporting new communities in using the grid.
Dr. Frank Gaede also graduated in experimental HEP. He joined DESY IT in 2002 and holds one of the leading management roles for software development in the ILC. Dr. Gaede is the architect and main developer of the ILC software framework. From a very early stage in the development of the framework he has made the grid an essential component of the ILC computing model.
For the ILC activities on the grid, DESY plans to hire one person to work full-time for the HEP VRC. Additional individuals for the Photon Science activities are to be hired.