The French ACI GRID* initiative and its latest achievements using Grid'5000
Thierry Priol, Director, ACI GRID, Thierry.Priol@inria.fr
Franck Cappello, Grid’5000, Franck.Cappello@inria.fr
Objectives of the ACI GRID
  Push the national research effort on grid computing
  Increase the visibility of French Grid research activities
  Fund medium- and long-term research activities in Grid using a bottom-up approach (nothing imposed!)
  Stimulate synergies between research groups
  Encourage experimentation with the available grid infrastructure being deployed through national projects
  Develop new software for experimental grid infrastructures
    New system and programming environments for distributed computing or large data management
Organisation
  Programme Director: Thierry Priol since January 2004 (M. Cosnard before)
  Scientific council: Brigitte Plateau
  Budget: ~8 M€* (including 8 PhD grants)
  * This is incentive funding (around 98.3 M€ estimated by GridCoord)
Several kinds of projects Multidisciplinary project Software project Young research team Collaboration International Testbed
Middleware, tools, environments
  CGP2P (F. Cappello, LRI/CNRS)
  ASP (F. Desprez, ENS Lyon/INRIA)
  EPSN (O. Coulaud, INRIA)
  PADOUE (A. Doucet, LIP6)
  MEDIAGRID (C. Collet, IMAG)
  DARTS (S. Frénot, INSA-Lyon)
  Grid-TLSE (M. Dayde, ENSEEIHT)
  RMI (C. Pérez, IRISA)
  CONCERTO (Y. Maheo, VALORIA)
  CARAML (G. Hains, LIFO)
Algorithms
  TAG (S. Genaud, LSIIT)
  ANCG (N. Emad, PRISM)
  DOC-G (V-D. Cung, UVSQ)
Compiler techniques
  Métacompil (G-A. Silbert, ENMP)
Networks and communication
  RESAM (C. Pham, ENS Lyon)
  ALTA (C. Pérez, IRISA/INRIA)
GRID ASP: Client/Server Approach for Simulation over the Grid
Call 1 (2001 - 2003)
Project coordinator: F. Desprez
E-mail: Frederic.Desprez@inria.fr
Web: http://graal.ens-lyon.fr/ASP/
Participants: ENS-Lyon, INRIA, LORIA, LIFC, IRCOM, LST, SRSMC, Physique Lyon1
Objectives
  Build a portable set of tools for computational servers in an ASP (Application Service Provider) model
    DIET (Distributed Interactive Engineering Toolbox)
  Port several different applications
    physics, geology, chemistry, electronic device simulation, robotics, …
  Focus on issues
    resource localization (hierarchical), scheduling, performance evaluation (both static and dynamic), data persistence, data redistribution between servers
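The hierarchical resource localization mentioned above can be illustrated with a minimal Python sketch. This is not DIET's actual API or protocol: the `Server` and `Agent` classes, the service names, and the `est_seconds` estimate are all hypothetical, chosen only to show how a request can travel down a tree of agents and how the best candidate server can travel back up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Server:
    """Leaf of the hierarchy: a computational server (hypothetical model)."""
    name: str
    services: frozenset   # names of the problems this server can solve
    est_seconds: float    # completion-time estimate reported by the server


@dataclass
class Agent:
    """Inner node: forwards a request to its children and keeps the best answer."""
    name: str
    children: List[object] = field(default_factory=list)  # Agents or Servers

    def locate(self, service: str) -> Optional[Tuple[float, str]]:
        """Return (estimate, server name) of the best server below, or None."""
        best = None
        for child in self.children:
            if isinstance(child, Server):
                found = (child.est_seconds, child.name) if service in child.services else None
            else:
                found = child.locate(service)
            # Tuples compare by estimate first, so the smallest estimate wins.
            if found is not None and (best is None or found < best):
                best = found
        return best


# Illustrative two-level hierarchy; every name and number is made up.
lyon = Agent("lyon", [
    Server("s1", frozenset({"dgemm"}), 4.0),
    Server("s2", frozenset({"dgemm", "sparse_solve"}), 2.5),
])
nancy = Agent("nancy", [Server("s3", frozenset({"sparse_solve"}), 1.5)])
root = Agent("root", [lyon, nancy])

print(root.locate("dgemm"))         # best server offering "dgemm"
print(root.locate("sparse_solve"))  # best server offering "sparse_solve"
```

Each agent only needs to know its direct children, which is what keeps the lookup scalable when servers are spread over many sites.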
TLSE: Web expert site for sparse matrices based on grid infrastructure
Call 2 (2002 - 2004)
Project coordinator: Michel Daydé
E-mail: Michel.Dayde@enseeiht.fr
Web: http://www.enseeiht.fr/lima/tlse/
Participants: CERFACS, FéRIA-IRIT, LIP-ENSL, LaBRI, CEA, CNES, EADS, EDF, IFP
Objectives
  Design a Web expertise site for sparse matrices
  Dissemination of our expertise in sparse linear algebra
  Easy access and experimentation with software and tools: only statistics are provided, not computing resources
  Exploitation of the computing power of the grid for parametric studies
Contents: sparse matrix software, bibliography, collections of sparse matrices
CGP2P: Global P2P Computing, “Fusion of Desktop Grid and P2P systems”
Call 1 (2001 - 2003)
Coordinator: Franck Cappello
E-mail: fci@lri.fr
Web: www.lri.fr/~fci
Participants: LRI, LIFL, ID IMAG, LARIA, LAL, EADS
RMI: Programming the Grid with distributed Objects
Call 1 (2001 - 2003)
Project coordinator: C. Pérez
E-mail: Christian.Perez@irisa.fr
Web: http://www.irisa.fr/Grid-RMI/en/
Participants: IRISA, ENS-Lyon, LIFL, INRIA, EADS
Objectives
  Provide a framework to combine various communication middleware and runtimes
    For parallel programming:
      Message-based runtimes (MPI, PVM, …)
      DSM-based runtimes (TreadMarks, …)
    For distributed programming:
      RPC/RMI-based middleware (DCE, CORBA, Java)
      Middleware for discrete-event based simulation (HLA)
  Get the maximum performance from the network!
    Offer zero-copy mechanism to middleware/runtime
HydroGrid: distributed code coupling in hydrogeology, using software components
Call 2 (2002 - 2004)
Project coordinator: M. Kern
E-mail: Michel.Kern@inria.fr
Web: http://www-rocq.inria.fr/~kern/HydroGrid/HydroGrid-en.html
Participants: INRIA Rocquencourt, INRIA Rennes, IMFS Strasbourg, Geosciences Rennes
Objectives
  Simulate flow and transport of pollutants in the subsurface
  Take into account couplings between different physical phenomena
  Couple parallel codes on a grid (software from the ACI GRID RMI project)
  Links between numerical and software coupling
Example applications: reactive transport (top), density-driven flow (bottom), fractured media
Main feedback from Call 1 and Call 2 projects
  Lack of a large-scale testbed available for experiments
  Several small-scale testbeds at the regional level
  Duplication of effort when setting up testbeds
  Various types of Grids
  Need to be able to experiment with various software layers
  Incompatible with a production Grid
How to proceed…
The Grid’5000 project
Grid’5000 Objective
  Deploy an experimental large-scale computing infrastructure to allow any kind of experiment
  Experiments on any kind of grid (Virtual Supercomputer, Desktop Grid, …)
  Experimental conditions
  Configuration of the entire software stack, from the application to the operating system
The Grid’5000 Project
  Building a nationwide experimental platform for large-scale Grid & P2P experiments
    9 geographically distributed sites
    Every site hosts a cluster (from 256 CPUs to 1K CPUs)
    All sites are connected by RENATER (the French research and education network)
    RENATER hosts probes to trace network load conditions
    Design and develop a system/middleware environment to safely test and repeat experiments
  Use the platform for Grid experiments in real-life conditions
    Port and test applications, develop new algorithms
    Address critical issues of Grid system/middleware: programming, scalability, fault tolerance, scheduling
    Address critical issues of Grid networking: high-performance transport protocols, QoS
    Investigate original mechanisms
Planning
Grid’5000 foundations: Measurements and condition injection
Quantitative metrics:
  Performance: execution time, throughput, overhead, QoS (batch, interactive, soft real time, real time).
  Scalability: resource occupation (CPU, memory, disk, network), application algorithms, number of users, number of resources.
  Fault tolerance: tolerance to very frequent failures (volatility), tolerance to massive failures (a large fraction of the system disconnects), fault-tolerance consistency across the software stack.
Experimental condition injection:
  Background workloads: CPU, memory, disk, network; traffic injection at the network edges.
  Stress: high number of clients, servers, tasks, data transfers.
  Perturbation: artificial faults (crash, intermittent failure, memory corruption, Byzantine), rapid platform reduction/increase, slowdowns, etc.
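To make the link between perturbation and metrics concrete, here is a small Python sketch that injects crash faults into a simulated bag of tasks and measures the resulting overhead. It is a toy discrete-event model, not a Grid’5000 tool: the function `simulate_bag`, the `crash_at` schedule, and all numbers are assumptions made for illustration.

```python
import heapq
from collections import deque

INF = float("inf")


def simulate_bag(n_tasks, task_seconds, n_nodes, crash_at=None):
    """Run a bag of equal-length tasks on n_nodes workers with injected crashes.

    crash_at maps a node id to the time at which that node crashes for good
    (a hypothetical injection schedule). A task running on a crashing node is
    lost and re-queued. Returns (makespan, completed_tasks).
    """
    crash_at = crash_at or {}
    pending = deque(range(n_tasks))
    idle = []      # nodes that found no work the last time they asked
    events = []    # heap of (time, node, task, finished_ok)
    completed = 0
    makespan = 0.0

    def start(node, now):
        if not pending:
            idle.append(node)
            return
        crash = crash_at.get(node, INF)
        if crash <= now:
            return  # node is already dead: drop it
        task = pending.popleft()
        end = now + task_seconds
        if crash < end:
            heapq.heappush(events, (crash, node, task, False))  # dies mid-task
        else:
            heapq.heappush(events, (end, node, task, True))     # finishes normally

    for node in range(n_nodes):
        start(node, 0.0)

    while events:
        now, node, task, ok = heapq.heappop(events)
        if ok:
            completed += 1
            makespan = now
            start(node, now)
        else:
            pending.append(task)        # the lost task must be re-executed
            while idle and pending:     # wake idle nodes to pick it up
                start(idle.pop(), now)
    return makespan, completed


# Illustrative run: 4 tasks of 10 s on 2 nodes, node 1 crashes at t = 15 s.
baseline, _ = simulate_bag(4, 10.0, 2)
faulty, done = simulate_bag(4, 10.0, 2, crash_at={1: 15.0})
overhead = (faulty - baseline) / baseline
print(baseline, faulty, done, overhead)
```

Comparing the makespan with and without the injected crash gives the overhead metric. Repeating the run with the same `crash_at` schedule reproduces the same experimental condition.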
Grid’5000 principle: A highly reconfigurable experimental platform
Experiment workflow
Grid’5000 map
Hardware Configuration
Grid’5000 network provided by RENATER
Grid’5000 as an Instrument
High security for Grid’5000 and the Internet, despite the deep reconfiguration feature
  Grid’5000 is confined: communications between sites are isolated from the Internet and vice versa (level 2 MPLS, dedicated lambda).
A software infrastructure allowing users to access Grid’5000 from any Grid’5000 site and have a simple view of the system
  A user has a single account on Grid’5000; Grid’5000 is seen as a cluster of clusters; 9 (1 per site) unsynchronized home directories
Reservation/scheduling tools allowing users to select nodes and schedule experiments
  a reservation engine + batch scheduler (1 per site) + OAR Grid (a co-reservation scheduling system)
A user toolkit to reconfigure the nodes
  software image deployment and node reconfiguration tool
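The idea of co-reservation across per-site schedulers can be sketched as follows. This is not OAR Grid's algorithm or interface. It is a minimal Python illustration under an assumed data model, in which each site has a node capacity and a list of existing reservations `(start, end, nodes)`. The function looks for the earliest start time at which every site can supply the requested number of nodes for the whole duration. Site names and numbers are made up.

```python
def free_nodes(capacity, reservations, start, end):
    """Minimum number of free nodes at one site over the window [start, end)."""
    def used(t):
        return sum(n for s, e, n in reservations if s <= t < e)

    # Usage can only increase at a reservation start, so it is enough to check
    # the window start and every reservation start that falls inside the window.
    points = [start] + [s for s, e, n in reservations if start < s < end]
    return capacity - max(used(t) for t in points)


def earliest_common_slot(sites, request, duration, earliest=0):
    """Earliest start at which all sites can serve `request`, or None.

    sites:   {site: (capacity, [(start, end, nodes), ...])}
    request: {site: nodes wanted at that site}
    """
    # The earliest feasible start is either `earliest` or the end of some
    # existing reservation, so only those instants need to be tried.
    candidates = {earliest}
    for capacity, reservations in sites.values():
        candidates.update(e for s, e, n in reservations if e > earliest)
    for t in sorted(candidates):
        if all(free_nodes(sites[name][0], sites[name][1], t, t + duration) >= wanted
               for name, wanted in request.items()):
            return t
    return None


# Illustrative schedule (hypothetical capacities and reservations).
sites = {
    "rennes": (64, [(0, 10, 60)]),
    "orsay": (128, [(5, 20, 100)]),
}
print(earliest_common_slot(sites, {"rennes": 32, "orsay": 64}, duration=5))
```

A real co-reservation system would also have to lock the slot at each site atomically. The sketch only covers the search for a common window.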
OS Reconfiguration techniques Reboot OR Virtual Machines
Community: Grid’5000 users
More than 230 experiments
A series of Events
Grid@work (October 10-14, 2005)
Experiment: Geophysics: Seismic Ray Tracing in 3D mesh of the Earth
Jxta DHT scalability
Fully Distributed Batch Scheduler
Motivation: evaluation of a fully distributed resource allocation service (batch scheduler)
Vigne: unstructured network, flooding (random walk optimized for scheduling).
Experiment: a bag of 944 homogeneous tasks / 944 CPUs
  Synthetic sequential code (Monte Carlo application).
  Measure the mean execution time for a task (computation time depends on the resource)
  Measure the overhead compared with an ideal execution (central coordinator)
  Objective: 1 task per CPU.
Tested configuration:
Result:
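As a rough illustration of resource discovery by random walk in an unstructured overlay, the Python sketch below walks from node to node until it finds a free CPU or exhausts a hop budget. It is not Vigne's actual protocol, which the slide describes only as a random walk optimized for scheduling. The function name, the overlay representation, and the hop limit are assumptions.

```python
import random


def random_walk_allocate(neighbors, is_free, start, max_hops, rng):
    """Walk the overlay from `start` until a free node is found.

    neighbors: {node: [adjacent nodes]} describing the unstructured overlay
    is_free:   {node: True if the node's CPU is available}
    Returns (node, hops) on success or (None, hops) if the budget is exhausted.
    """
    node, hops = start, 0
    while True:
        if is_free[node]:
            return node, hops
        if hops == max_hops or not neighbors[node]:
            return None, hops
        node = rng.choice(neighbors[node])  # forward the request to a random peer
        hops += 1


# Illustrative ring of 8 nodes where only node 5 is free (made-up topology).
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
free = {i: (i == 5) for i in range(8)}
rng = random.Random(42)  # seeded so that a run can be repeated
node, hops = random_walk_allocate(ring, free, start=0, max_hops=200, rng=rng)
print(node, hops)
```

Unlike a central coordinator, no node needs a global view of the platform. The price is the extra hops spent searching, which is one source of the overhead the experiment measures.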
Large Scale experiment of DIET: A GridRPC environment
Solving the Flow-Shop Scheduling Problem
TCP limits over 10Gb/s links
  Highlighting TCP stream interaction issues on very high bandwidth links (congestion collapse) and poor bandwidth fairness
  Grid’5000 10Gb/s connections evaluation
  Evaluation of TCP variants over Grid’5000 10Gb/s links (BIC TCP, H-TCP, Westwood…)
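One common way to quantify bandwidth fairness between concurrent streams is Jain's fairness index, J = (Σx)² / (n · Σx²), which equals 1 when all streams get the same throughput and 1/n when a single stream takes everything. The slide does not say which metric the experiments used, so the choice of Jain's index here is an assumption, and the throughput values are illustrative, not measured results.

```python
def jain_fairness(throughputs):
    """Jain's fairness index for a list of per-stream throughputs."""
    if not throughputs:
        raise ValueError("need at least one stream")
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if squares == 0:
        return 1.0  # all streams are equally starved
    return total * total / (len(throughputs) * squares)


# Illustrative per-stream throughputs in Gb/s on a shared 10 Gb/s link.
print(jain_fairness([2.5, 2.5, 2.5, 2.5]))   # even sharing
print(jain_fairness([10.0, 0.0, 0.0, 0.0]))  # one stream starves the others
```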
Grid’5000 main achievements in 2006
A large-scale and highly reconfigurable Grid experimental platform
Used by Master's students, Ph.D. students, postdocs and researchers (results are presented in their reports, theses, papers, etc.)
Grid’5000 offers in 2006:
  9 clusters distributed over 9 sites in France
  about 10 Gigabit/s (directional) of bandwidth
  the capability for all users to reconfigure the platform [protocols/OS/Middleware/Runtime/Application]
Grid’5000 results in 2006:
  Grid’5000 has been open to French Grid researchers since July 2005
  Grid’5000 is open to other communities in 2006 (CoreGRID)
  Grid’5000 winter school (Philippe d’Anfray, ~January 2007)
  Connection to other Grid experimental platforms: Netherlands (from October 2006), Japan (under discussion)
  Sustainability ensured by INRIA after 2007
Concluding remarks
GRID in its wider definition
  Computing, data and knowledge Grids, P2P
  Not only focusing on the use of supercomputers… nor on Globus…
  An emphasis on middleware but also on applications/algorithms to make them Grid-aware
The French ACI GRID led to many European initiatives
  Several groups of the ACI GRID projects are involved in EU-funded projects (almost absent in FP5, involved in 10 projects in FP6 and leader of 3 projects)
  The idea to set up a Network of Excellence in Grid Research came from the ACI GRID (M. Cosnard)
  Ongoing discussions to give Grid’5000 a European dimension, funded under the 7th Framework Programme
Funding of Grid research is still available
  Through the “Agence Nationale de la Recherche”
To get more information about the ACI-GRID
  http://www-sop.inria.fr/aci/grid
  Thierry.Priol@inria.fr