What are some basic MPI communication routines?
Point-to-Point Communication
MPI_Send() - sends data from one process to another
MPI_Recv() - receives data from another process
Collective Communication
MPI_Bcast() - broadcasts data from one process to all
MPI_Reduce() - combines results from multiple processes
MPI_Gather() - collects data from multiple processes to one
Compare MPI, OpenMP and CUDA
MPI: message passing between processes; targets distributed memory (clusters)
OpenMP: compiler directives for threading; targets shared memory (multicore CPUs)
CUDA: NVIDIA's programming model for massively parallel GPUs
What are some ILP techniques?
Pipelining: overlap the execution stages of successive instructions
Superscalar execution: issue multiple instructions in the same cycle (multiple execution units)
Out-of-Order execution (OoO): re-order independent operations to avoid stalls
What is Cache?
fast (expensive) memory that keeps copies of main-memory data (transparent to software)
cache-hit: in-cache memory access (cheap, fast)
cache-miss: non-cached memory access (expensive, slow)
What is Cache Associativity?
how the cache is structured, i.e. to which cache locations a main-memory line can map
Direct-mapped: each memory line maps to exactly 1 cache location
Fully-associative: a line can be stored anywhere in the cache
n-way set associative: each line maps to one set and can occupy any of the n ways in that set
What are sources of Cache Misses?
Compulsory: 1st access to a block (unavoidable)
Capacity: cache can't contain all needed blocks -> mitigate by increasing cache size
Conflict: multiple blocks collide on the same set -> mitigate by increasing associativity
Coherence: block invalidated by a write from another core
What is the difference between Weak and Strong Scaling?
describes how well an increasing amount of parallel resources can be used
Strong Scaling: (harder to achieve)
constant problem size
hope for a linear decrease in execution time with increasing number of cores
Weak Scaling:
increase the problem size in proportion to the number of cores
hope for constant execution time
Hardware View. What are the 2 big classes?
Distributed Memory
= main memory (RAM) is physically distributed (CPU can only access portion of it)
Shared Memory
= main memory (RAM) is physically shared (CPU can directly read/write all of it)
UMA - uniform memory access (all memory accessible at the same speed)
NUMA - non-uniform memory access (some parts of memory are accessible faster than others)
What are the 2 big classes of Software View?
Threading
= program consists of several threads
private data
shared data (read/written by all threads)
communication is implicit (via shared data)
Message Passing
= program consists of several processes
private data
no shared data
communication is explicit (sending messages)
Where in the Hardware and Software View are e.g. OpenMP, MPI, … located?
Shared memory + threading: OpenMP, Pthreads, Java / C++ threads
Distributed memory + threading: Cluster OpenMP, PGAS
Shared memory + message passing: CSP (Go, Erlang), MPI
Distributed memory + message passing: MPI (Message Passing Interface)
What is False Sharing?
Give 2 software solutions for False Sharing.
occurs in multi-threaded programs
multiple threads modify different variables that lie on the same cache line
—> causes unnecessary cache-line invalidations
Solutions:
(Cache-line) Padding
insert unused space between variables -> they end up on separate cache lines
Align Data Structures
-> cache line boundaries
What is True Sharing?
multiple threads share and modify the same variable
—> leading to cache-coherence synchronization overhead
Solution: Atomic operations, distributed counters, thread-local storage