High energy & nuclear physics


Condor: a high-throughput computing platform for mapping many tasks to idle computers





  • Since 1986!

  • Major components

    • A central manager (CM) manages pool(s) of distributively owned or dedicated computers; CM = scheduler + coordinator
    • DAGMan manages a user's pool of interdependent tasks
    • The matchmaker schedules tasks onto computers using classified ads (ClassAds)
    • Checkpointing and process migration
    • No simple communications
  • Parameter studies, data analysis

  • Condor married Globus: Condor-G

  • Several hundred Condor pools in the world… or in your student room!
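
The matchmaking idea listed above can be illustrated with a toy sketch. It is not Condor's ClassAd language or API: the `Ad` class, the attribute names (`arch`, `memory_mb`, `idle`, `owner`) and the greedy `match` function are all hypothetical. The only point it makes is that jobs and machines both advertise attributes and requirements, and a match needs both sides to agree.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Ad:
    """A toy 'classified ad': attributes plus a requirement on the other party."""
    name: str
    attrs: Dict[str, object]
    requirements: Callable[[Dict[str, object]], bool]


def match(jobs: List[Ad], machines: List[Ad]) -> List[Tuple[str, Optional[str]]]:
    """Greedily pair each job with the first free machine where both sides agree."""
    free = list(machines)
    result = []
    for job in jobs:
        chosen = None
        for machine in free:
            # A match is symmetric: the job accepts the machine and vice versa.
            if job.requirements(machine.attrs) and machine.requirements(job.attrs):
                chosen = machine
                break
        if chosen is not None:
            free.remove(chosen)
        result.append((job.name, chosen.name if chosen else None))
    return result


# Hypothetical machine ads: desk1's owner refuses jobs submitted by "guest".
machines = [
    Ad("desk1", {"arch": "x86_64", "memory_mb": 2048, "idle": True},
       requirements=lambda job: job["owner"] != "guest"),
    Ad("desk2", {"arch": "x86_64", "memory_mb": 8192, "idle": True},
       requirements=lambda job: True),
]

# Hypothetical job ads with their own requirements on the machine.
jobs = [
    Ad("sim", {"owner": "alice"},
       requirements=lambda m: m["idle"] and m["memory_mb"] >= 4096),
    Ad("fit", {"owner": "guest"},
       requirements=lambda m: m["idle"]),
    Ad("plot", {"owner": "bob"},
       requirements=lambda m: m["idle"] and m["memory_mb"] >= 1024),
]

pairs = match(jobs, machines)
print(pairs)
```

In this sketch the job "fit" stays unmatched: the only machine left is desk1, whose owner policy rejects it. That mirrors how distributively owned machines keep control over what runs on them.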



  • A DAG is defined by a .dag file, listing each of its nodes and their dependencies

    • # diamond.dag
    • Job A a.sub
    • Job B b.sub
    • Job C c.sub
    • Job D d.sub
    • Parent A Child B C
    • Parent B C Child D
  • Each node will run the Condor job specified by its accompanying Condor submit file
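
To make the dependency semantics of diamond.dag concrete, here is a minimal sketch that parses the file shown above and groups its nodes into waves that could run in parallel. It is not how DAGMan is implemented: `parse_dag` and `execution_waves` are hypothetical helpers, the parser only understands the `Job` and `Parent … Child …` lines that appear on the slide, and the ordering relies on Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

DIAMOND_DAG = """\
# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D
"""


def parse_dag(text):
    """Return (submit_files, parents_of) for the Job / Parent...Child subset only."""
    submit_files = {}  # node name -> submit file
    parents_of = {}    # node name -> set of nodes that must finish first
    for line in text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        keyword = words[0].upper()
        if keyword == "JOB":
            name, submit = words[1], words[2]
            submit_files[name] = submit
            parents_of.setdefault(name, set())
        elif keyword == "PARENT":
            split = [w.upper() for w in words].index("CHILD")
            parents, children = words[1:split], words[split + 1:]
            for child in children:
                parents_of.setdefault(child, set()).update(parents)
    return submit_files, parents_of


def execution_waves(parents_of):
    """Group nodes into waves; every node in a wave has all its parents done."""
    sorter = TopologicalSorter(parents_of)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


submit_files, parents_of = parse_dag(DIAMOND_DAG)
waves = execution_waves(parents_of)
print(waves)  # [['A'], ['B', 'C'], ['D']]
```

The output shows the "diamond" shape: A runs first, B and C can run concurrently once A finishes, and D waits for both.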



Globus Toolkit: a set of integrated executable management grid services

  • Initially expected services

    • resource management (GRAM-DUROC)
    • communication (NEXUS - MPICH-G2, globus_io)
    • information (MDS)
    • data management (replica catalog)
    • security (GSI)
    • monitoring (HBM)
    • remote data access (GASS - GridFTP - RIO)
    • executable management (GEM)
    • execution
    • Commodity Grid (CoG) Kits (Java, Python, CORBA, Matlab…)










Globus Toolkit: latest stable versions (5.2.5: 2013; 6.0: 11/2014)

  • No more architecture figure…

  • Only 5 components related to 3 dimensions

    • GSI: security
    • MyProxy: credential repository/certificate authority
    • GSI-OpenSSH: GSI secure single sign-on remote shell
    • SimpleCA: certificate authority for testing purposes
    • GridFTP: file transfer
    • GRAM: job execution/resource management
    • Runtime libraries: C common libraries, XIO




Where to place instances (replicas) of a service?

  • Which (instance of a) resource/service should be used?

  • Prerequisite: knowledge of the state of the grid (CPU load, network load…)

  • Design of a grid service that processes high-level information requests, i.e., requests that express, and are specific to, the user's needs
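
The selection question above can be sketched as a scoring problem over monitored state. Everything here is an assumption made for illustration: the `Instance` fields, the normalisation of both loads to the range 0 to 1, the linear weighted score and the weights themselves do not come from the slides. The weights stand in for a "high-level" request, e.g. a CPU-bound user would weight CPU load more heavily.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    host: str
    cpu_load: float      # fraction of CPU in use, 0.0 to 1.0 (assumed metric)
    network_load: float  # fraction of link capacity in use, 0.0 to 1.0 (assumed metric)


def score(inst, cpu_weight=0.5, network_weight=0.5):
    """Lower is better: a weighted sum of the two monitored loads."""
    return cpu_weight * inst.cpu_load + network_weight * inst.network_load


def select_instance(instances, cpu_weight=0.5, network_weight=0.5):
    """Pick the service instance with the lowest weighted load."""
    if not instances:
        raise ValueError("no instance available")
    return min(instances, key=lambda i: score(i, cpu_weight, network_weight))


# Hypothetical snapshot of the grid state for three replicas of one service.
replicas = [
    Instance("node-a", cpu_load=0.90, network_load=0.10),
    Instance("node-b", cpu_load=0.40, network_load=0.30),
    Instance("node-c", cpu_load=0.20, network_load=0.80),
]

best_balanced = select_instance(replicas)
best_cpu_bound = select_instance(replicas, cpu_weight=1.0, network_weight=0.0)
print(best_balanced.host, best_cpu_bound.host)  # node-b node-c
```

The same snapshot yields different answers for different user needs, which is why the request has to carry user-specific information and the service needs up-to-date knowledge of the grid state.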











Just a new toy for scientists or a revolution?

  • Huge investments

  • Classical issues, but in a very complex functional, operational and application context

  • Complexity from heterogeneity, wide distribution, security, dynamicity

  • Functional shift from computing to information

  • Data management in grids: no longer prehistory, but still the Middle Ages

  • Still much work to do!!!

  • A global framework for grid computing, pervasive computing and Web services?



Just a new toy for scientists or a revolution? Neither of them!

  • Huge investments: too much?!

  • Classical issues, but in a very complex functional, operational and application context

  • Complexity from heterogeneity, wide distribution, security, dynamicity

  • Functional shift from computing to information

  • Data management in grids: no longer the Middle Ages, but not yet the 21st century => services

  • Supercomputing is still alive

  • A global framework for grid computing, pervasive computing and Web services… and SOA!

