Grid and Astrophysical Research: current activities and future perspectives


Grid and Astrophysical Research: current activities and future perspectives

  • C. Vuerli (1,2), M. Sponza (1) and F. Pasian (1,2)

  • (1) INAF – Astronomical Observatory of Trieste (2) INAF – Information System Unit

  • ECSAC 2009

  • Veli Lošinj, Thursday 27 August 2009


  • Gridified Applications

    • Planck (presentation by F. Pasian)
    • MAGIC
    • Applications developed at II-SAS
    • Databases and a practical example: BaSTI and FRANEC
    • Other A&A Applications
  • A&A SSC in EGI: Lessons learned, tools and services, planning

Scientific targets

  • Cosmic Accelerators

    • AGN (Active Galactic Nuclei), PWN (Pulsar Wind Nebulae), SNR (Supernova remnants), GRB (gamma-ray bursts), …
  • Fundamental Questions

    • Dark Matter, Cosmic Rays, Quantum Gravity, Cosmology …

Major issue: Background rejection

    • Separate γ-rays from hadrons
    • Based on image parameters
  • Monte Carlo simulations required

    • No VHE “test beam” available
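The image-parameter-based separation above can be sketched as follows; this is a minimal illustration only, with invented thresholds and events (the real MAGIC analysis derives its gamma/hadron separation from classifiers trained on Monte Carlo samples):

```python
# Hedged sketch of background rejection via shower-image parameters.
# Cut values and events are invented for illustration.

def passes_gamma_cut(width_deg, length_deg):
    """Accept narrow, compact images typical of gamma-ray showers."""
    return width_deg < 0.12 and length_deg < 0.35

events = [
    {"width": 0.08, "length": 0.25},   # gamma-like image
    {"width": 0.20, "length": 0.50},   # hadron-like image
]
selected = [e for e in events if passes_gamma_cut(e["width"], e["length"])]
print(len(selected))  # → 1
```

In practice the cut efficiencies themselves must be calibrated against simulated showers, which is why Monte Carlo production is listed as a requirement.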

MAGIC VO exists since 2004

    • Initiative by H. Kornmayer et al.
  • Issue: in 2007-08 a new crew took over grid operations

    • UCM (Madrid)
    • IFAE and PIC in Barcelona


  • Why GRID?

    • Monte Carlo production and data reduction require lots of CPU
    • Data has to be distributed to all collaborators across Europe
    • Improved control over analysis & MC production
    • User access to shared resources and standardized analysis tools
    • Better and easier data management
    • Increased technical support, benefit from LCG experience
  • How to proceed?

    • Resume development of MC tools and start MC production
    • Migrate data to a grid-aware file system
    • Use grid tools for data transfer and distribution
    • Migrate existing analysis tools to Grid & produce new ones
    • Interfaces to access data, monitor jobs & transfers …
    • BUT: Convince users to use these tools! Training…

Current storage system requires too much maintenance

  • No file catalog exists, which requires custom tool development

  • Solution: adopt Tier1-grade Grid-based storage system

    • Standard tools + supported service @ PIC
    • LFC: Easier data management and monitoring
  • Transition to new scheme will be done while in production

Data access requirements:

    • Access data anytime from anywhere
  • Two approaches:

    • Data access using GridFTP or equivalent
      • Robust transfers, not easy file browsing
      • BUT: Not all institutes support Grid
    • Web access
      • Easy file browsing, not that easy transfers
  • Solution:

    • Build web-based service to interface to GridFTP
    • Use http doors as backup solution & for “non-Grid-expert” users
    • Use LFC + project database as backend
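The core of such a web front end can be sketched as a logical-name-to-replica lookup; this is a minimal illustration assuming an LFC-like catalogue, with all catalogue contents, host names and function names invented:

```python
# Hedged sketch: resolving a logical file name to a transfer URL, as a
# web service interfacing GridFTP might do. A plain dict stands in for
# the LFC (LCG File Catalog); entries below are invented.

FAKE_CATALOG = {
    "/magic/mc/run001.root": [
        "gsiftp://se.pic.es/magic/mc/run001.root",   # robust Grid transfer
        "http://se.pic.es/magic/mc/run001.root",     # http door (backup)
    ],
}

def resolve_transfer_url(lfn, prefer="gsiftp"):
    """Return the replica matching the preferred scheme, falling back to
    the first available replica (e.g. the http door for non-Grid users)."""
    replicas = FAKE_CATALOG.get(lfn, [])
    for url in replicas:
        if url.startswith(prefer + "://"):
            return url
    return replicas[0] if replicas else None

print(resolve_transfer_url("/magic/mc/run001.root"))                  # gsiftp replica
print(resolve_transfer_url("/magic/mc/run001.root", prefer="http"))   # http door
```

The design mirrors the two-approach split above: Grid-aware clients get a GridFTP URL, while everyone else falls back to web access.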

Computing at MAGIC now

    • Each institute uses its own computing resources (CPU + Storage)
    • Only a few of them can access a computing farm
    • Data center CPUs dedicated to “official” data analysis
  • We go towards opening the computing service to all users

    • Grid-based computing
    • Universal access to data in the SE and use of the CE (and additional CEs)
    • Standard analysis tools
    • PIC data center will still play a central role
      • Data management, manpower, …
  • more resources & better efficiency → more and better scientific outcome

The unified theory of Kuiper-belt and Oort-cloud formation: experiences from porting to EGEE

  • Jan Astalos

  • Institute of Informatics

  • Slovak Academy of Sciences

Oort: Application details

  • Collaboration

      • Slovakia: Astronomical institute, Slovak Academy of Sciences
      • Italy: INAF-Catania Astrophysical Observatory
      • Poland: Astronomical Observatory of the A.Mickiewicz University
  • Main goal

    • Working out unified theory of the formation of:
      • Kuiper belt and Scattered Disc
        • populations of small bodies beyond Neptune’s orbit
      • Oort cloud
        • very distant cometary reservoir
  • Method

      • simulation of the dynamical evolution of a large number (~10000) planetesimals (treated as test particles) in the proto-planetary disc; the trajectories of the particles are influenced by the perturbing forces from the giant planets, Galactic tide, and stars passing near or through the Oort cloud

Oort: More details

  • Computational methods:

    • Numerical integration of orbits using the RADAU integrator (included in the publicly available MERCURY package developed by J. Chambers)
    • Stellar perturbations – using advanced impulse approximation (proposed by P.A. Dybczynski)
  • Structure

    • Sequence of sub-simulations
    • Each sub-simulation consists of many independent tasks
    • Output from all tasks is needed for preparation of the next sub-simulation (requires user interaction)
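The structure described above can be sketched as follows: each sub-simulation is split into independent tasks (the grid jobs), and all task outputs must be merged before the next sub-simulation can be prepared. The splitting scheme and the trivial per-task step are placeholders, not the actual RADAU computation:

```python
# Illustrative sketch of the sub-simulation structure; the real tasks
# integrate particle orbits, here run_task is a placeholder.

def run_task(particles):
    """Placeholder for one independent grid job (a RADAU integration
    of a batch of test particles)."""
    return list(particles)

def run_sub_simulation(particles, n_tasks):
    """Split particles into independent tasks, then merge all outputs:
    the merged result seeds the next sub-simulation."""
    chunks = [particles[i::n_tasks] for i in range(n_tasks)]
    results = [run_task(c) for c in chunks]      # each chunk = one grid job
    return [p for r in results for p in r]       # barrier: all tasks needed

state = list(range(100))        # 100 stand-in test particles
for _ in range(3):              # three sub-simulations in sequence
    state = run_sub_simulation(state, n_tasks=4)
print(len(state))               # → 100
```

The merge step is the point requiring user interaction mentioned above: no task of sub-simulation N+1 can start until every task of sub-simulation N has finished.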

Oort: Running in Grid

  • Demands for CPU time

    • If a single 2.8 GHz CPU was used, the computation of the orbits of 4 giant planets and 10038 test particles for 1 Gyr would last about 21 years
  • Running in Grid

    • Tasks of each sub-simulation divided among 4 users and run in two Grids
      • EGEE: Virtual Organisation for Central Europe - 3/4
      • TriGrid (Trinacria Grid Virtual Laboratory) - 1/4
    • Simulation of 1Gyr was finished in ~ 5 months
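A quick check of the quoted figures: ~21 CPU-years of work finished in ~5 wall-clock months corresponds to an effective speed-up of roughly 50, plausible for tasks spread over two Grids:

```python
# Sanity check of the speed-up implied by the numbers on this slide.
cpu_years = 21          # estimated single-CPU (2.8 GHz) runtime
months_elapsed = 5      # actual wall-clock time on EGEE + TriGrid

speedup = cpu_years * 12 / months_elapsed
print(round(speedup))   # → 50
```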

Oort: Problems that we had to solve

  • Previous experiment done by users

    • parametric simulation with large number of short term tasks
  • Throughput

    • Problems with failed jobs
    • Jobs stuck in queues at some sites
    • Low performance on some worker nodes
  • Productivity

    • Too much effort needed for managing large number of jobs
    • Users needed easy-to-use tools for automatic job management
    • Use of Grid should be as simple as possible: prepare input data, start processing, download output
  • Fair resources sharing

    • Users wanted to use only a subset of available resources

Oort: Design

  • Goals

    • The framework should be as generic as possible - reusability
    • No external services – only standard EGEE services
    • Based only on technologies available in EGEE
    • Automatic – with minimal user interaction
    • As simple as possible – to minimise maintenance effort
  • Existing tools

    • not easy to use, or too application-specific
  • We made use of pilot jobs

    • Concept used in production by some of the virtual organisations in EGEE
    • Input data are not associated with a job; they are downloaded when the pilot job starts running on a worker node
    • Failed and waiting jobs can be simply discarded
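The pilot-job idea above can be illustrated with a pull model: tasks sit in a shared queue, and only pilots that actually start on a worker node fetch work, so pilots that fail or never leave the batch queue can simply be discarded. Names and task contents below are invented:

```python
# Minimal sketch of the pilot-job (pull) model. In reality the shared
# queue would be a central task server, not an in-process queue.
from queue import Queue, Empty

task_queue = Queue()
for task_id in range(6):
    task_queue.put(task_id)          # tasks registered centrally, not bound to jobs

def pilot(worker_name, completed):
    """A pilot job: once running on a worker node, repeatedly fetch and
    process tasks until none remain. Pilots that never start simply
    take nothing from the queue."""
    while True:
        try:
            task = task_queue.get_nowait()
        except Empty:
            return
        completed.append((worker_name, task))   # placeholder for real work

completed = []
pilot("wn1", completed)   # one pilot reached a worker node; it drains all tasks
print(len(completed))     # → 6
```

This is why throughput no longer depends on every submitted job succeeding: work is assigned only to pilots that are demonstrably running.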

Grid and Databases BaSTI as a practical example

  • Giuliano Taffoni, Santi Cassisi

  • INAF – Istituto Nazionale di Astrofisica

DB in EGEE Grid

  • Integration of DB in Grid recognized as a core research activity

  • Some tools/services developed for this purpose:

    • AMGA: a flexible and fast tool to store and access metadata. Well integrated in the middleware stack, it is stable and fast, although focused on metadata (but it is rapidly evolving…)
    • The OGSA-DAI project addresses data virtualization and focuses on the implementation of data services. It offers a set of Grid information-system interfaces able to locate data sources and interact with them through a JDBC adapter.
    • G-DSE is based on the idea that the middleware currently used to access computational resources can also be used to access DBs. It is fully integrated with LFC and supports GSI and SSL (Secure Sockets Layer)
    • The GRelC service provides both basic and advanced primitives to transparently access, query, manage and interact with different data sources. It is based on gSOAP and designed around a client/server model.


  • The BaSTI (a Bag of Stellar Tracks and Isochrones) database is a theoretical astrophysical catalogue that collects fundamental data sets on star formation and evolution.

    • Stellar evolution theory provides the main tool to constrain a galaxy’s star formation history from observations of its integrated magnitudes and spectra; it enables us to predict the spectral and photometric properties of stellar populations of varying ages and initial chemical compositions
  • The relational database of evolutionary predictions fulfils the following three main criteria for a reliable and homogeneous stellar evolution library:

    • the input physics employed in the model computations is the most up-to-date;
    • models for all initial chemical compositions are computed with the same evolutionary code and the same physical framework;
    • models and isochrones reproduce a large array of empirical constraints obtained from observations of single stars and local resolved stellar populations.

BaSTI: the Query Interface

Recent Extensions of the DB

  • Recently, the capabilities and relevance of the BaSTI archive have been greatly increased by including an extended set of population-synthesis models that allow the study of the integrated colours and magnitudes of distant, unresolved stellar systems

  • It is however evident that the potential, availability and possible extension of this database, like those of any data set, are strongly constrained by the capabilities of the hardware and software framework supporting the archive.

    • In this context, it is now clear that implementing the BaSTI archive, and the whole theoretical framework related to it, within the Grid would open up as-yet unexploited possibilities of development.

FRANEC and Grid

  • FRANEC: a Fortran 77 code that simulates the evolution of a star on the basis of a number of different physical inputs and parameters

  • Benefits coming from the Grid:

    • More computational resources
    • Other Grid-intrinsic added values: e.g. distribution of datasets over different sites → no single point of failure

FRANEC-Grid Interactions

FRANEC in gLite Environment

  • The FRANEC simulation software requires the creation of some specific services on top of the general gLite environment, used to run either an SMR (Synthetic Model Run) or a FIR (Full Isochrone Run)

  • Runs take place through subsequent steps:

    • Deployment of the simulation code on the Grid (copying the compressed code to the Grid using LFC)
    • Creation of an application-specific environment on top of the User Interface (UI). A set of scripts allows users to configure a pipeline and submit it to the Grid
      • When running a FIR, we use the parametric jobs execution capability offered by the WMS to parametrize jobs in terms of initial mass and/or metallicity
      • The JDL file is configured with some standard “requirements” based on the GlueSchema
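As a hedged illustration (all attribute values are invented; the actual executable names and requirements used by the pipeline are not given here), a parametric JDL for a FIR varying the initial mass might look like:

```
[
  JobType        = "Parametric";
  Executable     = "franec_wrapper.sh";   // hypothetical wrapper script
  Arguments      = "_PARAM_";             // substituted per parametric job
  Parameters     = 30;                    // e.g. 30 initial-mass values
  ParameterStart = 1;
  ParameterStep  = 1;
  InputSandbox   = {"franec_wrapper.sh"};
  OutputSandbox  = {"std.out", "std.err"};
  // a GlueSchema-based requirement, as mentioned above
  Requirements   = other.GlueCEPolicyMaxCPUTime > 720;
]
```

The WMS expands such a description into one job per parameter value, which is what makes the mass/metallicity scan a single submission.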

FRANEC in gLite Environment

      • Secure and easy access to stored models produced by previous simulations shall also be possible
      • Accessing simulated data through the Grid requires the implementation of a metadata description
      • A set of metadata identifies the parameters and associates each of them with the corresponding file location in the Grid (its SURL)
      • AMGA is used to handle the metadata service
      • e.g. if an EOS already exists on the Grid, the EOS jobs are not submitted; instead, the FRANEC job is configured to download the EOS from the Grid
    • The final step to fully integrate the EGEE Grid environment and the “BaSTI” Astronomical Community Portal implies the possibility of automatically updating the BaSTI DB when a new simulation is available, since the BaSTI DB is an external, independently managed entity.
      • We decided to test the G-DSE service to access the DB from the Grid.
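The EOS reuse logic described above can be sketched as follows, with a plain dictionary standing in for the AMGA metadata service; all parameter values, SURLs and function names are invented for illustration:

```python
# Hedged sketch: decide whether an EOS (equation of state) job must be
# submitted, or whether an existing EOS table can be downloaded from
# the Grid. A dict stands in for an AMGA metadata lookup.

FAKE_AMGA = {
    # (metallicity Z, helium abundance Y) -> SURL of a stored EOS table
    (0.0198, 0.273): "srm://se.example.org/franec/eos/z0198y273.dat",
}

def plan_franec_job(z, y):
    """Return a job plan: reuse a stored EOS if the metadata service
    already maps these parameters to a file on the Grid."""
    surl = FAKE_AMGA.get((z, y))
    if surl is not None:
        return {"submit_eos_job": False, "eos_input": surl}
    return {"submit_eos_job": True, "eos_input": None}

print(plan_franec_job(0.0198, 0.273)["submit_eos_job"])  # → False (reuse)
print(plan_franec_job(0.0010, 0.245)["submit_eos_job"])  # → True  (compute)
```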

FRANEC-Grid: Application Workflow

Auger: Use of the EGEE Grid for production of libraries of cosmic-ray showers simulated using the CORSIKA program with the EPOS model

  • Porting of the Astro-Wise environment to the Grid, and usage of part of it for the LOFAR long-term archive

  • HERSCHEL Data Reduction

  • Applications developed by the French A&A Community in EGEE:

    • Evolution of the interstellar medium and star formation through the family of PDR (Photo-Dissociation Region) codes; these compute in a consistent way the UV radiative transfer, the chemistry and the thermal balance arising in molecular interstellar gas.
    • Modeling of the very high energy (VHE; E > 100GeV) γ-ray emission and multi-wavelength data from AGN (Active Galactic Nuclei).
    • Simulations aimed at studying the stability of the Solar System.

A&A SSC Proposal

  • C. Vuerli (1,2) and F. Pasian (1,2)

  • On behalf of the EGEE-NA4 A&A Cluster

  • and of the A&A SSC Editorial Board

  • (1) INAF – Astronomical Observatory of Trieste (2) INAF – Information System Unit

EGI and the A&A Community

  • EGEE-III ends in April 2010 and the new EGI is now under construction

  • The A&A community in EGEE-III demonstrated with its applications that the Grid could be the right technology for a large fraction of them

  • A&A is therefore now moving toward a dedicated SSC, whose setup is in progress

  • An Editorial Board for the A&A SSC has been formed, and the scientific tasks and activities to be carried out in EGI are now under definition

  • A structure for the A&A SSC has already been proposed, although it is still a first draft

SSC Structure

Lessons learned in EGEE-III

  • Training

    • Users need to be trained when approaching the problem of gridifying their applications. Training may cover general aspects of Grid technology as well as specific aspects related to A&A
    • Tools: A&A-specific training events organized within the A&A SSC; exploitation of training opportunities offered by EGI and by NGIs.
  • Dissemination

    • Users should be aware of opportunities coming from the Grid technology for their scientific activity
    • Tools: publicize results achieved within the SSC at workshops and conferences and through any other communication means; the identification of use cases and demonstrator applications could also help in this case

Lessons learned in EGEE-III

  • Easy ways to access and use the Grid

    • Users are often reluctant to approach the Grid because they perceive the technology as tricky and complicated to use. Basic interfaces like CLIs don’t help in this case
    • Tools: Portals and Science Gateways can be decisive in overcoming this psychological barrier. Science Gateways in particular are emerging as a key tool for all communities engaged in building their SSCs; they have the great advantage of collecting and organizing, in a harmonized fashion, on-line training, dissemination material, applications, services… all of it easily accessible to end users

Lessons learned in EGEE-III

  • Improve the synergy with other technologies (HPC and the VObs)

    • The A&A community resorts almost equally to HPC and the Grid to procure computational resources. Interoperability between these technologies is important if our objective is an operational environment where users transparently make use of resources for their work. For this reason PRACE (an ESFRI project) is one of the most important projects that could establish a partnership with our SSC.
    • The VObs is an example of a Virtual Community that developed its Science Gateways through a suite of tools and services. Interfaces connecting the A&A SSC and the VObs are strategic.
    • Interoperability requires two fundamental services: Science Gateways, and tools for creating and managing workflows in which resources of different natures are intermixed.

ESFRI Projects

  • Try to involve big A&A projects that could benefit from the Grid. Tight relationships with ESFRI projects are strongly encouraged for any SSC

  • The current ESFRI projects recognized as relevant for A&A are:

    • E-ELT (Extremely Large Telescope): the follow-up project of the current generation of optical telescopes. With segmented mirrors and built-in adaptive optics, it is feasible to build a 40-m class telescope
    • SKA (Square Kilometer Array): the SKA will have a collecting area of one million square meters distributed over a distance of at least 3000 km. This area, necessary to collect the faint signals from the early universe, will result in a 100 times higher sensitivity compared to existing facilities. The radically new concept of an “electronic” telescope will allow very fast surveys
    • CTA (Cherenkov Telescope Array): the pioneering Cherenkov telescopes HESS and MAGIC have observed a multitude of gamma ray sources both in our galactic centre and outside our galaxy. The CTA will greatly extend the reach of these two projects and allow for further exciting scientific discoveries
    • PRACE (Partnership for Advanced Computing in Europe): a European strategic approach to high-performance computing. It concentrates the available resources in a limited number of world-class top-tier centers in a single infrastructure connected to national, regional and local centers, forming a scientific computing network that exploits the top-level machines

Tools/Services: Summary

  • Moreover…

    • Moving large volumes of data on the Grid
    • Code deployment on the Grid
    • Storing intermediate data produced by applications on Storage Elements
    • Storing/Retrieving data to/from the Grid
      • Grid compliant driver for CFITSIO Library


  • EGEE-III Home

  • EGI Home

  • EGEE-III Astronomy and Astrophysics Cluster (have a look in particular at the QRs, where significant A&A activity is reported)

  • A&A Transition to EGI

  • A&A session at EGEE User Forum IV (Catania, March 2009)

  • Mailing List of A&A Cluster Members


End of Presentation
