page | 5/9 | date | 01.11.2017 | size | 446 b. | | #24870 |
Characteristics:
- dynamicity
- complex relationships
- frequent updates
- complex queries
A first approach: (distributed) directory (LDAP)
- easy to use
- tree structure
- distribution
- static
- mostly read; updates are inefficient
- hierarchical
- poor procedural language
A second approach: (relational) database
- Message passing (sockets, PVM, MPI)
- Distributed Shared Memory
- Data Parallelism (HPF, HPC++)
- Task Parallelism (Condor)
- Client/server - RPC
- Agents
- Integration system (Corba, DCOM, RMI)
- Parallelize the program with the right job structure, communication patterns/procedures, algorithms
- Discover the available resources
- Select the suitable resources
- Allocate or reserve these resources
- Migrate the data
- Monitor the execution; checkpoints?
- React to changes
- Collect results
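The discover, select and allocate steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of any particular grid middleware: the `Resource` class, its fields, the site names and the "most free CPUs first" ranking are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # Hypothetical description of one grid resource (all fields are assumptions).
    name: str
    free_cpus: int
    free_disk_gb: int
    available: bool = True


def discover(resources):
    """Discover: keep only the resources that are currently reachable."""
    return [r for r in resources if r.available]


def select(resources, cpus_needed, disk_needed_gb):
    """Select: keep resources that satisfy the job, most free CPUs first."""
    suitable = [
        r for r in resources
        if r.free_cpus >= cpus_needed and r.free_disk_gb >= disk_needed_gb
    ]
    return sorted(suitable, key=lambda r: r.free_cpus, reverse=True)


def allocate(resource, cpus_needed, disk_needed_gb):
    """Allocate: reserve the requested capacity on the chosen resource."""
    resource.free_cpus -= cpus_needed
    resource.free_disk_gb -= disk_needed_gb
    return resource.name


pool = [
    Resource("site-a", free_cpus=4, free_disk_gb=500),
    Resource("site-b", free_cpus=16, free_disk_gb=200),
    Resource("site-c", free_cpus=32, free_disk_gb=50),
    Resource("site-d", free_cpus=64, free_disk_gb=1000, available=False),
]

candidates = select(discover(pool), cpus_needed=8, disk_needed_gb=100)
chosen = allocate(candidates[0], cpus_needed=8, disk_needed_gb=100)
```

In a real grid the pool changes while the job runs, which is why the list continues with monitoring and reacting to changes; the sketch deliberately ignores that.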
It has long been forgotten!!!
Though it is a key issue!
Issues:
- indexing
- retrieval
- replication
- caching
- traceability
- (auditing)
And security!!!
Grids lack most of the tools needed to share (index, search, access), analyze, secure and monitor semantic data (information)
Several reasons
Why is it so difficult?
- Sensitivity but openness
- Multiple administrative domains, multiple actors, heterogeneity but a single global architecture/view/system
- Dynamicity and unpredictability but robustness
- Wide-area scale but high performance
Maintain a mapping between logical names for files and collections and one or more physical locations
Decide what, where and when a piece of data must be replicated
Important for many applications
- Multiple Petabytes of data per year
- Copy of everything at CERN (Tier 0)
- Subsets at national centers (Tier 1)
- Smaller regional centers (Tier 2)
- Individual researchers have copies of pieces of data
Much more complex with sensitive and complex data like medical data!!!
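The mapping between logical names and physical locations described above can be pictured as a small catalog. The following is a minimal sketch under stated assumptions: the `ReplicaCatalog` class, the `lfn:` naming and the `example.org` URLs are hypothetical and are not the interface of any existing replica service.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Maps a logical file name to the set of physical locations holding a copy."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, logical_name, physical_url):
        """Record that a copy of logical_name exists at physical_url."""
        self._replicas[logical_name].add(physical_url)

    def unregister(self, logical_name, physical_url):
        """Forget one copy; drop the logical name when no copy is left."""
        self._replicas[logical_name].discard(physical_url)
        if not self._replicas[logical_name]:
            del self._replicas[logical_name]

    def lookup(self, logical_name):
        """Return all known physical locations, sorted for stable output."""
        return sorted(self._replicas.get(logical_name, ()))


catalog = ReplicaCatalog()
# Hypothetical names: one logical file with a copy at two tiers.
catalog.register("lfn:/experiment/run42.dat", "gsiftp://tier0.example.org/data/run42.dat")
catalog.register("lfn:/experiment/run42.dat", "gsiftp://tier1.example.org/data/run42.dat")
replicas = catalog.lookup("lfn:/experiment/run42.dat")
```

The sketch covers only the mapping itself. Deciding what, where and when to replicate, and who may see a given entry, are separate policies layered on top of it.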
- Security, security, security (incl. privacy, monitoring, traceability…) at a semantic level
- Access protocols (incl. replication, caching, migration…)
- Indexing tools
- Brokering of data (incl. accounting)
- (Content-based) Query optimization and execution
- Mediation of data
- Knowledge discovery and data mining
Heavens, where are the data?
Use case: Italian student, cardiac incident in Klagenfurt (ALIAS-NATHCARE EU AS projects)
Data inside the grid ≠ data at the side of the grid!
Basic idea
- Use of metadata/indexes. Problem: indexes are (sensitive) information