resource localization (hierarchical) scheduling, performance evaluation (both static and dynamic), data persistence, data redistribution between servers
Clients
C, C++, Scilab, Web browser
TLSE: Web expert site for sparse matrices based on grid infrastructure
Grid’5000 foundations: Measurements and condition injection
Quantitative metrics:
Performance: Execution time, throughput, overhead, QoS (batch, interactive, soft real time, real time).
Scalability: Resource occupation (CPU, memory, disk, network), application algorithms, number of users, number of resources.
Fault tolerance: Tolerance to very frequent failures (volatility), tolerance to massive failures (a large fraction of the system disconnects), fault-tolerance consistency across the software stack.
Experimental condition injection:
Background workloads: CPU, memory, disk, network, traffic injection at the network edges.
Stress: high number of clients, servers, tasks, data transfers.
Grid’5000 principle: A highly reconfigurable experimental platform
Experiment workflow
Grid’5000 map
Hardware Configuration
Grid’5000 network provided by RENATER
Grid’5000 as an Instrument
High security for Grid’5000 and the Internet, despite the deep reconfiguration feature
Grid’5000 is confined: communications between sites are isolated from the Internet and vice versa (level 2 MPLS, dedicated lambda).
A software infrastructure allowing users to access Grid’5000 from any Grid’5000 site and have a simple view of the system
A user has a single account on Grid’5000; Grid’5000 is seen as a cluster of clusters, with 9 (1 per site) unsynchronized home directories
Reservation/scheduling tools allowing users to select nodes and schedule experiments
a reservation engine + batch scheduler (1 per site) + OAR Grid (a co-reservation scheduling system)
A user toolkit to reconfigure the nodes: software image deployment and node reconfiguration tool
OS reconfiguration techniques: reboot or virtual machines
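The co-reservation idea mentioned above (one reservation that holds on several sites at once) can be illustrated with a minimal sketch. It is not OAR Grid's algorithm or API: the function name, the site names and the free-interval data are all hypothetical, and each site is assumed to expose a sorted list of free time windows.

```python
def earliest_common_slot(free_by_site, duration):
    """Return the earliest start time at which every site is free for
    `duration` time units, or None if no such time exists.

    free_by_site: dict mapping a site name to a list of (start, end)
    free intervals (hypothetical data layout, not an OAR structure).
    """
    # The earliest feasible start always coincides with the start of
    # some free interval, so only those instants need to be tested.
    candidates = sorted(
        {start for intervals in free_by_site.values() for start, _ in intervals}
    )
    for t in candidates:
        fits_everywhere = all(
            any(s <= t and t + duration <= e for s, e in intervals)
            for intervals in free_by_site.values()
        )
        if fits_everywhere:
            return t
    return None


# Hypothetical per-site availability (arbitrary time units).
free = {
    "site_a": [(0, 4), (6, 12)],
    "site_b": [(2, 5), (7, 15)],
    "site_c": [(0, 10)],
}
slot_3 = earliest_common_slot(free, 3)  # a 3-unit window exists
slot_4 = earliest_common_slot(free, 4)  # no 4-unit window fits on all sites
print(slot_3, slot_4)
```

A real co-reservation system also has to count nodes per site and cope with reservations changing while it searches; the sketch ignores both.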
Community: Grid’5000 users
About 230+ Experiments
About 200 Publications
A series of Events
Grid@work (October 10-14 2005)
Experiment: Geophysics: Seismic Ray Tracing in 3D mesh of the Earth
Jxta DHT scalability
Fully Distributed Batch Scheduler
Motivation: evaluation of a fully distributed resource allocation service (batch scheduler)
Vigne: Unstructured network, flooding (random walk optimized for scheduling).
Experiment: a bag of 944 homogeneous tasks / 944 CPU
Synthetic sequential code (Monte Carlo application).
Measure of the mean execution time for a task (computation time depends on the resource)
Measure the overhead compared with an ideal execution (central coordinator)
Objective: 1 task per CPU.
Tested configuration:
Result :
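The overhead measure described above (mean task execution time compared with an ideal run under a central coordinator) can be sketched as follows. The function name and every timing value are invented for illustration; they are not the experiment's results, and the relative-overhead formula is one plausible reading of "overhead", not necessarily the one used.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def relative_overhead(measured_times, ideal_times):
    """Relative overhead of the measured mean task time with respect
    to the mean task time of the ideal (centrally coordinated) run."""
    ideal = mean(ideal_times)
    return (mean(measured_times) - ideal) / ideal


# Hypothetical per-task execution times in seconds. Times differ per
# task because computation time depends on the resource.
ideal_run = [100.0, 120.0, 110.0, 90.0]
distributed_run = [104.0, 126.0, 113.0, 98.0]

overhead = relative_overhead(distributed_run, ideal_run)
print(f"relative overhead: {overhead:.1%}")
```

Comparing means rather than individual tasks keeps the measure meaningful when the same task lands on different resources in the two runs.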
Large Scale experiment of DIET: A GridRPC environment
Solving the Flow-Shop Scheduling Problem
TCP limits over 10Gb/s links
Highlighting TCP stream interaction issues in very high bandwidth links (congestion collapse) and poor bandwidth fairness
Grid’5000 10Gb/s connections evaluation
Evaluation of TCP variants over Grid’5000 10Gb/s links (BIC TCP, H-TCP, Westwood…)
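Bandwidth fairness between concurrent streams, mentioned above, is often summarized with Jain's fairness index. The slides do not say which metric was used, so this is an assumed choice, and the throughput values below are invented rather than measured.

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Equals 1.0 when all streams get the same throughput and drops
    towards 1/n when a single stream takes all the bandwidth.
    """
    n = len(throughputs)
    squares = sum(x * x for x in throughputs)
    if n == 0 or squares == 0:
        raise ValueError("need at least one stream with non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * squares)


# Hypothetical per-stream throughputs in Gb/s on a shared 10 Gb/s link.
equal_share = jain_fairness([2.5, 2.5, 2.5, 2.5])
skewed_share = jain_fairness([9.0, 0.5, 0.3, 0.2])
print(equal_share, round(skewed_share, 3))
```

The index is scale-free, so it gives the same value whether throughputs are expressed in Mb/s or Gb/s.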
Grid’5000 main achievements in 2006
A large scale and highly reconfigurable Grid experimental platform
Used by Master's students, Ph.D. students, postdocs and researchers (results are presented in their reports, theses, papers, etc.)
Grid’5000 offers in 2006:
9 clusters distributed over 9 sites in France,
about 10 Gigabit/s (directional) of bandwidth
the capability for all users to reconfigure the platform [protocols/OS/Middleware/Runtime/Application]
Grid’5000 results in 2006:
300+ users
~200 publications,
~230 planned experiments
Grid’5000 has been open to French Grid researchers since July 2005
Grid’5000 is open to other communities in 2006 (CoreGRID)
Grid’5000 winter school (Philippe d’Anfray, ~January 2007)
Connection to other Grid experimental platforms
Netherlands (from October 2006), Japan (under discussion)
Sustainability ensured by INRIA after 2007
Concluding remarks
GRID in its wider definition
Computing, data and knowledge Grids, P2P
Not focusing only on the use of supercomputers, nor only on Globus…
An emphasis on middleware but also on applications/algorithms to make them Grid-aware
The French ACI GRID led to many European initiatives
Several groups of the ACI GRID projects are involved in EU funded projects (almost absent in FP5, involved in 10 projects in FP6 and leader of 3 projects)
The idea to set up a Network of Excellence in Grid Research came from the ACI GRID (M. Cosnard)
On-going discussions to have a European dimension of Grid’5000 funded under the 7th Framework Programme