Buffl

by jo G.

Distributed Computing

My definition

Distributed Computing is about systems (hardware and software) that work together across geographical distance, acting as a problem-solving environment for the benefit of their users

Why distributed computing?

Applications may require distributed computing resources, e.g. data is produced in one location but needed in one or several other locations, or different types of resources are not available in a single place
Using distributed computing is beneficial for practical reasons, e.g. more cost-efficient, higher reliability/availability (e.g. no single point of failure), easier to expand, …

(Typical) Architectures

Client-Server: clients contact server for work and push results back
Peer-to-Peer: no special machines, all responsibilities are distributed, peers can be both servers and clients at the same time
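
The client-server pattern above can be sketched in a few lines. This is a minimal in-process illustration, not a networked implementation: the class `Server`, the functions `client` and `run`, and the squaring "work" are all hypothetical, and threads stand in for separate machines. In a peer-to-peer setup, every participant would play both roles.

```python
import queue
import threading


class Server:
    """Hypothetical in-process stand-in for a server that hands out work."""

    def __init__(self, tasks):
        self._tasks = queue.Queue()
        for task in tasks:
            self._tasks.put(task)
        self._results = {}
        self._lock = threading.Lock()

    def get_work(self):
        """Return the next task, or None when no work is left."""
        try:
            return self._tasks.get_nowait()
        except queue.Empty:
            return None

    def push_result(self, task, result):
        """Store a result that a client pushes back."""
        with self._lock:
            self._results[task] = result

    def results(self):
        with self._lock:
            return dict(self._results)


def client(server):
    """Contact the server for work until none is left, pushing results back."""
    while True:
        task = server.get_work()
        if task is None:
            break
        server.push_result(task, task * task)  # assumed workload: squaring


def run(num_clients=3, tasks=range(10)):
    """Start several clients against one server and collect all results."""
    server = Server(tasks)
    threads = [threading.Thread(target=client, args=(server,))
               for _ in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.results()
```

Calling `run()` returns a dictionary that maps each task to its result, regardless of which client processed it.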

Distributed Computing – Examples

Telecommunication networks (telephone, mobile phone)

Network applications (WWW, massively multiplayer online games, VR communities, networked file systems, banking systems, airline reservation systems, …)
Big Data
Cloud Computing

➔ in this lecture we do not look at telecommunication networks or network applications (covered in other Telematics-related lectures); cluster computing is addressed in my other lecture on parallel computing

Meta-Computing

A metacomputer is the logical integration of independent, possibly heterogeneous parallel computers, coupled via a high-speed WAN connection, into one virtual system
The program run can be distributed, with the set of executing instances selected according to the application

Metacomputing means

distributing a single application across several parallel computers in such a way
that a heterogeneous collection of parallel computers works on the overall task in a coordinated manner

Motivation for interconnection

Interconnection (aggregation) of the main memory capacity and compute power of several supercomputers in order to

solve problems that cannot be solved on a single supercomputer (so-called grand challenge problems)
solve problems faster than is possible on a single supercomputer

Coupled simulations

Each simulation is computed on the optimal platform

Example 1: Re-entry of a spacecraft: coupling of aerodynamics and structural mechanics simulations
Example 2: Distribution of pollutants in groundwater: hydrodynamics: velocity field of the groundwater
Transport of chemicals in a given velocity field
Increased usability and utilization of systems
Access to other architectures & shifting of job workload
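
The groundwater example (a hydrodynamics code supplies a velocity field, a transport code moves chemicals in it) can be illustrated with a toy coupling. Everything here is an assumption for illustration: the function names, the 1D grid, the constant velocity field, and the first-order upwind scheme. Both parts run in one process, whereas on a metacomputer each would run on its own platform and exchange the field over the network.

```python
def groundwater_velocity(num_cells, courant=1.0):
    """Stand-in for the hydrodynamics simulation (hypothetical).

    Returns the velocity field as a Courant number v*dt/dx per cell.
    A real code would solve the groundwater flow equations here.
    """
    return [courant] * num_cells


def transport_step(concentration, velocity, inflow=0.0):
    """One first-order upwind advection step for flow in the +x direction.

    Assumes 0 <= velocity[i] <= 1 in every cell so the scheme stays stable.
    """
    new = []
    for i, c in enumerate(concentration):
        upstream = concentration[i - 1] if i > 0 else inflow
        new.append(c - velocity[i] * (c - upstream))
    return new


def coupled_run(concentration, steps):
    """Couple both parts: compute the field once, then transport in it."""
    velocity = groundwater_velocity(len(concentration))
    for _ in range(steps):
        concentration = transport_step(concentration, velocity)
    return concentration
```

With the assumed Courant number of 1.0, a pollutant pulse moves exactly one cell downstream per step, which makes the coupling easy to check by hand.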

Programming metacomputers

Programming via shared memory is very difficult

Message passing (MPI) is the method of choice
More generally applicable, since it can also be used on SMP
Popular on large MPP systems as well as clusters

Constraint: the effort to make an application fit for a metacomputer should be as low as possible

Requires a special MPI environment for metacomputers
Should build on the MPI standard
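
The message-passing style can be illustrated without an MPI installation. The sketch below is not MPI code: `run_ranks`, `send`, and `recv` are hypothetical stand-ins, threads play the role of ranks, and queues play the role of the network. In a real MPI program the same pattern would be expressed with point-to-point send/receive calls or a reduction.

```python
import queue
import threading


def run_ranks(size, data):
    """Sum `data` with `size` "ranks" that exchange results only via messages."""
    inboxes = [queue.Queue() for _ in range(size)]  # one mailbox per rank
    totals = queue.Queue()  # where rank 0 reports the final answer

    def send(dest, message):
        inboxes[dest].put(message)

    def recv(rank):
        return inboxes[rank].get()  # blocks until a message arrives

    def rank_main(rank):
        partial = sum(data[rank::size])  # each rank works on its own slice
        if rank == 0:
            total = partial
            for _ in range(size - 1):  # collect one message per other rank
                total += recv(0)
            totals.put(total)
        else:
            send(0, partial)

    threads = [threading.Thread(target=rank_main, args=(r,))
               for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals.get()
```

Because no rank reads another rank's variables directly, the same program structure works whether the ranks sit on one SMP node, on a cluster, or on several coupled machines.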

Examples (all older)

MetaMPICH
No need for "code rewriting"
Based on the MPI standard and the MPICH implementation
Flexible, adaptable connection topology to exploit the existing infrastructure

The Grid Concept

New class of infrastructure based on the internet
Offers a means of harnessing distributed resources
Supposed to provide scalable, secure, and high-performance mechanisms for ...

... discovering resources

... negotiating the access to resources

... using remote resources

Enables scientific collaborations to share resources on an unprecedented scale
Enables distributed groups (virtual organizations) to work together in ways that were previously impossible
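
The three mechanisms (discovering, negotiating access to, and using resources) can be pictured as a small pipeline. This is a toy model only: the `Resource` type, the catalog, and the functions `discover`, `negotiate`, `use`, and `submit` are hypothetical names, and real grid middleware handles each step with dedicated protocols and security infrastructure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of a remote compute resource."""
    name: str
    cpus: int
    allowed_orgs: frozenset  # virtual organizations that may use it


def discover(catalog, min_cpus):
    """Step 1: find resources that satisfy the job's requirements."""
    return [r for r in catalog if r.cpus >= min_cpus]


def negotiate(resource, org):
    """Step 2: check whether the virtual organization is granted access."""
    return org in resource.allowed_orgs


def use(resource, job):
    """Step 3: run the job remotely (here: just report where it ran)."""
    return f"{job} ran on {resource.name}"


def submit(catalog, org, job, min_cpus):
    """Chain the three steps; return None if no resource can be used."""
    for resource in discover(catalog, min_cpus):
        if negotiate(resource, org):
            return use(resource, job)
    return None
```

A job is placed on the first discovered resource whose access policy admits the submitting virtual organization; if none does, `submit` returns `None`.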

A Grid Checklist

Coordinates resources that are not subject to centralized control
User’s desktop vs. central computing, different administrative domains of the same company, different companies etc.
Addresses security, policy, payment, membership etc.

… using standard, open, general-purpose protocols and interfaces

Authentication, authorization, resource discovery and access

… to deliver non-trivial qualities of service

Response time, throughput, availability, security, co-allocation

Utility of the combined system must be significant (greater than sum of its parts)

The Worldwide LHC Computing Grid (WLCG)

Infrastructure for storing and analyzing the data from the LHC
Infrastructure can also be used by other data-intensive projects

Collaboration of more than 170 institutes in 34 countries worldwide
Multi-tiered, hierarchical model
