GridCoord DoW


Information Collection - Questionnaires



Date: 27.10.2017


Method


We decided to publish an online questionnaire to be filled in by producers of Grid middleware. The questionnaire was built to describe each middleware in detail across a range of aspects, such as application domain, release version, use of legacy code, and supported operating systems. Questions were also designed to identify the specific Grid tools used, including any other existing Grid middleware.

A paper version of the questionnaire is attached at the end of the report. The online version is available directly (www-sop.inria.fr/oasis/GridCoord/middleware.php) or through the GridCoord website (www.gridcoord.org), which acts as a central platform and gives access to the three online questionnaires funded by the European project.


Diffusion


Once designed and published, the questionnaire had to be advertised to as many actors as possible. Reaching a broad enough audience is a well-known difficulty for such surveys. We used the following means to attract people's attention:

  • Direct GridCoord contacts: the questionnaire was advertised at the seminars and conferences that the author attended.

  • Web search: during the month of October, we searched the web for Grid middleware. Using the Google search engine and keywords like “grid”, “software”, “middleware”, and “tools”, we compiled a list of people, institutions, and web sites. A standard email was sent to each of them separately, encouraging them to fill in the questionnaire themselves and to forward the email to any other actors they knew.

Current status of the survey


The database contains 38 entries, each entered by people directly linked to the middleware concerned. The study was meant to be European-based, but we also recorded 4 entries from Globus, as well as BOINC and NaradaBrokering, which come from the United States, and Gfarm and Ninf-G, which come from Japanese organizations. These 8 entries were not taken into account in the figures that follow. We keep them nonetheless, as they will prove very useful for the gap analysis scheduled after this deliverable.
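The arithmetic behind the exclusion can be sketched as follows. This is a minimal illustration, assuming one entry each for BOINC, NaradaBrokering, Gfarm, and Ninf-G (inferred from the stated total of 8); the variable and function names are hypothetical.

```python
# Entries excluded from the figures, keyed by middleware name.
# The Globus count is stated in the text; the single entries for the
# other four are an assumption inferred from the stated total of 8.
NON_EUROPEAN_ENTRIES = {
    "Globus": 4,           # United States
    "BOINC": 1,            # United States
    "NaradaBrokering": 1,  # United States
    "Gfarm": 1,            # Japan
    "Ninf-G": 1,           # Japan
}
TOTAL_ENTRIES = 38


def european_entry_count(total, excluded):
    """Return how many entries remain once the excluded ones are removed."""
    return total - sum(excluded.values())


excluded_count = sum(NON_EUROPEAN_ENTRIES.values())
european_count = european_entry_count(TOTAL_ENTRIES, NON_EUROPEAN_ENTRIES)
print(excluded_count, european_count)
```

Under these assumptions, 30 entries remain as the basis for the figures in the rest of this section.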

Preliminary analysis


We read the database carefully and looked for meaningful distinctions by which to classify the entries. The simplest distinctions are based on the start and release dates of the middleware, the platforms supported, or the country of origin. A more informative distinction is based on the functionality the middleware provide.

Themes/Functionality


One way to distinguish between middleware is by their usage. Some are very specific and perform only one task, such as file management, while others, for which the term middleware is perhaps ill-applied, handle all the Grid-related tasks, from deployment to scheduling to monitoring. The following list sorts the submitted middleware by functionality.

  • General Grid Environments: Padico (Paco++), ProActive, UNICORE, P-GRADE, JGrid. These are “super-middleware”, which take care of practically all the Grid-related issues one might encounter when coding an application for a Grid.

  • Grid enablement: PACX-MPI, GEDDM, caraml, Paco++, and most database systems (cited later). These middleware do not offer the full range of Grid capabilities, but transform distributed code into code that can be deployed on a Grid.

  • Legacy code: Guigen, GEMLCA. These middleware take legacy code (code written long ago that can no longer be maintained but still works) and wrap it so that it runs on a Grid.

  • Components/GUI: Guigen, ICENI, COVISE, UNICORE, P-GRADE. This is perhaps the next step towards easy programming: instead of writing lines of code, the user draws the structure of the program by dragging modules together.

  • Database: GeneGrid, GEDDM, caraml (partly). Large databases may be geographically distributed, and processing them may need huge computing power. Grids help solve these problems.

  • Control/Supervision: MPICH-V, MARMOT, SGAS, InfoProviders [more precisely, resource discovery], VISIT [bridges computation and visualization]. These middleware provide some control over what is happening on the Grid, and give its maintainers a view of exactly what is running on it.

  • File Systems: GridLab, FTM, GridCast [BBC stream availability], MAPFS. An important feature of Grids is having files accessible from every node of the infrastructure. This requires advanced file systems which may, for example, fetch files on the fly and keep distant copies of the same file coherent.

  • Communications: VISIT. Some advanced communication abilities are required on a Grid, such as linking simulation and visualization together, or translating between one application and another.

  • Service infrastructure: DIET.

  • Job submission: GridWay. Job control is a key issue of an efficient Grid structure: one needs to be able to run concurrent applications, allocate free nodes, and stop jobs which are too greedy.

  • Business Grid: GRASP. This is perhaps the future of Grid middleware: making money with the concept by providing services or facilities in exchange for payment.

Peer-to-peer (P2P) has attracted attention recently, largely through fringe file-sharing applications. Within our database, P2PS is a middleware focused on P2P facilities for building applications on such an infrastructure. The general Grid environments also offer this as a service.
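The functional grouping listed above can be represented as a simple mapping, which makes it easy to spot middleware that fall into more than one theme. This is only a sketch: the names are transcribed from the list, “Padico (Paco++)” is recorded as Padico alone (an assumption), names are compared case-insensitively because the list spells some of them inconsistently, and the helper name is hypothetical.

```python
from collections import defaultdict

# Theme -> middleware, transcribed from the list in this section.
# "Padico (Paco++)" is stored as "Padico" only; that is an assumption.
CATEGORIES = {
    "General Grid Environments": ["Padico", "ProActive", "UNICORE", "P-GRADE", "JGrid"],
    "Grid enablement": ["Pacx-mpi", "geddm", "caraml", "paco++"],
    "Legacy code": ["Guigen", "GEMLCA"],
    "Components/GUI": ["Guigen", "ICENI", "COVISE", "UNICORE", "P-GRADE"],
    "Database": ["GeneGrid", "GEDDM", "caraml"],
    "Control/Supervision": ["mpich-v", "marmot", "sgas", "InfoProviders", "VISIT"],
    "File Systems": ["GridLab", "ftm", "GridCast", "MAPFS"],
    "Communications": ["VISIT"],
    "Service infrastructure": ["DIET"],
    "Job submission": ["GridWay"],
    "Business Grid": ["GRASP"],
}


def categories_by_middleware(categories):
    """Invert the mapping: lower-cased middleware name -> set of themes."""
    index = defaultdict(set)
    for category, names in categories.items():
        for name in names:
            index[name.lower()].add(category)
    return index


index = categories_by_middleware(CATEGORIES)
# Middleware that appear under more than one theme.
multi_theme = sorted(name for name, themes in index.items() if len(themes) > 1)
print(multi_theme)
```

Inverting the mapping shows that the themes are not mutually exclusive: several entries, such as UNICORE and VISIT, are listed under two headings.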

Dates


It is difficult to draw conclusions from the dates alone. At first glance, most projects started around the year 2000.

[Figure: Middleware development time before first release, shortest on the left, longest on the right]

It can also be seen that some middleware had a very quick development (less than a year), while others took two to four years (could they have been the subject of a thesis?). Only one middleware had not been released when it was submitted, but that may be because teams whose middleware is still in initial development were not aware of the survey. These observations should be treated with care, as project status also matters: has the project been abandoned, is it still under active development, does it provide support to its users?


Target (business, academic research, HPC Grid)


Another way of finding differences is to look at the intended target of these middleware. Who is the intended audience, and what kind of people will be interested in it? We had hoped this would be a separating criterion, with different entries targeting different activities, but the answers to the survey did not show this.

There was only one clear “business Grid”, GRASP, which is definitely targeted at the business market. Two other entries are aimed only at industry (GEDDM, GridCast): they are middleware that help to use the Grid efficiently.

Three others (FTM, GridLab, VISIT) are oriented only towards “other scientists”, those doing research in fields other than Grid computing. These should be mature enough to be used by non-specialists, but are most probably still research projects, as industry is not expected to be interested. COVISE and BOINC target both researchers who are not Grid specialists and industry. That should mean they are mature enough to become widespread, and this may well be the case: BOINC hosted the internet-wide SETI@home, while COVISE has been helping industry since 1997.

Most of the others ticked the research project box, meaning that the middleware is still under development, or is only meant to be used by people who experiment on Grids and to serve as a basis for computer science research.

Does this mean we are still in a development phase, or should we see the emerging “industrial” projects as bleeding-edge projects riding the first wave of a future Grid ubiquity?

Language/OS


Another important criterion is the programming language used to develop the middleware.

Most middleware use a single language. Java leads (10 entries), followed by C (7), then C++, which shows that Java and the C-like languages are predominant. There are also two small projects written in a single less common language, one in Python and the other in Mozart (P2PS).

Five middleware in the database are written in two languages, and five more use three.

Overall, Java, C, and C++ are the main programming languages used, and Perl is sometimes used in combination with one of them. This was to be expected, as they are among the most widely used programming languages overall.


Open source


The license is also a critical issue when the time comes to choose a software tool. We looked at which licenses were being applied within the database. A majority are open source: 11 are GPL, 8 are BSD-like, and 3 are LGPL. However, 10 were left unspecified. This may mean the question was overlooked, but more probably the choices offered did not include enough latitude: those that did not specify a license may have a restrictive license of their own that did not fit any of the options.
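As a rough check of the "majority" claim, the counts quoted in this paragraph can be tallied directly. This is a sketch that uses only the four figures given here; the total is simply their sum, and the variable names are hypothetical.

```python
# License counts as quoted in the paragraph above.
LICENSE_COUNTS = {"GPL": 11, "BSD-like": 8, "LGPL": 3, "unspecified": 10}
OPEN_SOURCE = {"GPL", "BSD-like", "LGPL"}

# Sum the open-source licenses and compare with all answers tallied here.
open_source_total = sum(
    count for name, count in LICENSE_COUNTS.items() if name in OPEN_SOURCE
)
answers_total = sum(LICENSE_COUNTS.values())
open_source_share = open_source_total / answers_total

print(f"{open_source_total} of {answers_total} ({open_source_share:.0%})")
```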

Interoperability

Operating System restrictions


An ideal tool for a heterogeneous Grid needs to run on any operating system. This is supported by 7 of the surveyed middleware, while 8 support only Linux. The others offer various combinations: 3 support Windows and *nix but not Mac, 4 are open to Posix and *nix, 3 to Posix and Mac, 2 are restricted to Linux and Solaris, and one is limited to Windows and Linux.

This can be seen as three categories: those that accept any OS, those that accept some variety but not all, and those that are OS-specific. We must remain careful in our interpretation, however, as some research projects are not meant to become pervasive, at least in their current stage of development: they are only tools to prove concepts and have no need of portability.
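The three-way split described above can be sketched as a small classification function. The reference set of operating systems and the function name are assumptions made for illustration; the questionnaire's actual answer options are not reproduced here.

```python
# Hypothetical reference set of operating-system families; the real
# questionnaire options may differ.
ALL_OS = frozenset({"Linux", "Windows", "Mac", "Solaris"})


def os_category(supported):
    """Classify a middleware by the set of operating systems it supports."""
    supported = frozenset(supported)
    if not supported:
        raise ValueError("at least one operating system is required")
    unknown = supported - ALL_OS
    if unknown:
        raise ValueError(f"unknown operating systems: {sorted(unknown)}")
    if supported == ALL_OS:
        return "any OS"
    if len(supported) == 1:
        return "OS-specific"
    return "some variety"


print(os_category({"Linux"}))             # a Linux-only middleware
print(os_category({"Linux", "Solaris"}))  # a restricted combination
print(os_category(ALL_OS))                # a fully portable middleware
```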


Interoperability with other middleware


This part concerns the capacity of a middleware to interoperate with other existing middleware. The first aspect is the interface: is the middleware capable of communicating with another? The second is reliance: does the middleware require another to be installed before it is fully functional?

  • INTERFACE: a middleware is said to interoperate with another when it has built-in functions that allow it to communicate with the other through its API. Since Globus has been in use for so long, it is the one most often chosen to interoperate with: 11 middleware interoperate with Globus, while 8 interoperate with MPI and 5 with UNICORE. CORBA and SOAP are each made accessible by two middleware. Tomcat, NorduGrid, PBS, LFS, and Condor were also cited.

  • RELIANCE: a middleware is said to rely on another when it needs the other to be installed in order to run. Here too, Globus is the most often required middleware: 13 of the middleware surveyed require it, while 8 require MPI. The others cited were UNICORE (required by 2 middleware), Mozart/Oz, OGSI.NET, Jini, CORBA, PM2, OGSA-DAI, and Tomcat.
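The distinction between interface and reliance can be illustrated with a small data model that tallies each relation separately. The records below are entirely hypothetical placeholders (`mw-a`, `mw-b`, `mw-c`) and do not reproduce the survey data; only the two relation names come from the definitions above.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Middleware:
    name: str
    # Systems this middleware can talk to through built-in functions.
    interoperates_with: set = field(default_factory=set)
    # Systems that must be installed for this middleware to run.
    relies_on: set = field(default_factory=set)


# Hypothetical sample records, for illustration only.
SAMPLE = [
    Middleware("mw-a", {"Globus", "MPI"}, {"Globus"}),
    Middleware("mw-b", {"Globus", "UNICORE"}),
    Middleware("mw-c", {"MPI"}, {"MPI", "Globus"}),
]


def tally(entries, relation):
    """Count how many entries name each system under the given relation."""
    counts = Counter()
    for entry in entries:
        counts.update(getattr(entry, relation))
    return counts


interface_counts = tally(SAMPLE, "interoperates_with")
reliance_counts = tally(SAMPLE, "relies_on")
print(interface_counts.most_common())
print(reliance_counts.most_common())
```

Keeping the two relations as separate sets reflects that a middleware may interoperate with a system it does not require, and vice versa.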

Country


A study of the countries where the middleware are developed may give an idea of which countries are active in the Grid domain.

Belgium, Italy, and Sweden each provided one middleware, and Spain provided 2. France (7), the UK (7), and Germany (10) responded very actively, entering many different middleware. Can we conclude that these 3 countries are the main driving forces in Europe?

The USA provided 3 (Globus, NaradaBrokering, BOINC) and Japan 2 (Gfarm, Ninf-G). This will allow a comparison with other continents later on.

