CUDA?
Compute Unified Device Architecture
What are the CUDA function qualifiers?
__global__ = executed on GPU, invoked from host (CPU), cannot be called from device (GPU)
__device__ = executed on GPU, called from other GPU functions, cannot be called from host (CPU)
__host__ = only executed by CPU, called from host (the default if no qualifier is given)
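The three qualifiers can be illustrated in one small CUDA sketch (assumed example names, not from the source; needs nvcc and a GPU to run):

```cuda
__device__ float square(float x) {          // GPU-only helper, callable from kernels
    return x * x;
}

__global__ void squareAll(float *out, const float *in, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) out[i] = square(in[i]);      // __global__ may call __device__
}

__host__ void launch(float *d_out, const float *d_in, int n) {
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    squareAll<<<blocks, threads>>>(d_out, d_in, n);  // invoked from the host
}
```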
What is the CUDA execution hierarchy?
Stream: list of Grids that execute in-order
Grid: consists of up to 2^32 Thread Blocks
Thread Block: consists of up to 1024 CUDA threads
CUDA Thread: scalar execution context (individual worker)
CUDA Thread vs CPU Thread
a CUDA thread is not really a thread: it is a single iteration in the iteration space (grid) of a vectorizable loop
What is PGAS? Name some languages and libraries.
Partitioned Global Address Space
-> shared data is divided into local and remote parts
languages: Chapel, Coarray Fortran, …
libraries: Global Arrays (GA), GASPI, MPI-3.0 RMA, …
What is UPC and what are the basic forms of barriers?
Unified Parallel C: an extension to C implementing the PGAS model
basic forms of barriers:
Barrier: block until all other threads arrive (upc_barrier)
Split-phase barrier: upc_notify, upc_wait
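Both barrier forms can be sketched in a few lines of UPC (a hedged example, needs a UPC compiler such as Berkeley UPC; the array name is assumed):

```c
#include <upc.h>
#include <stdio.h>

shared int data[THREADS];        /* one element per UPC thread */

int main(void) {
    data[MYTHREAD] = MYTHREAD;

    upc_barrier;                 /* blocking: wait until all threads arrive */

    /* Split-phase: signal arrival, overlap independent local work, then wait. */
    upc_notify;
    printf("thread %d doing independent work\n", MYTHREAD);
    upc_wait;

    return 0;
}
```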
UPC Pointer?
P1-P4 classify a pointer by where the pointer itself resides (private or shared) and where it points (local or shared memory):
int *P1 -> private pointer to local memory
shared int *P2 -> private pointer to shared space
int *shared P3 -> shared pointer to local memory (not recommended)
shared int *shared P4 -> shared pointer to shared space
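The four flavours can be written down directly as UPC declarations (a sketch; needs a UPC compiler, the variable names are assumed):

```c
#include <upc.h>

shared int A[THREADS];       /* shared array, one element per thread */
int x;                       /* private local variable */

int *P1 = &x;                /* private pointer to local memory */
shared int *P2 = A;          /* private pointer to shared space */
int *shared P3;              /* shared pointer to local memory (not recommended) */
shared int *shared P4 = A;   /* shared pointer to shared space */

int main(void) { return 0; }
```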
What is DASH and what are some implementations of its runtime (DART)?
a C++ template library implementing the PGAS model
implementations:
DART-SHMEM: shared memory based
DART-CUDA: supports GPUs, based on DART-SHMEM
DART-GASPI: initial implementation, using GASPI
DART-MPI: MPI-3 RMA based ‘workhorse’ implementation
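A minimal sketch of DASH usage on top of one of these DART backends (API names such as dash::init, dash::Array, and dash::myid are assumed from the dash-project library and not verified against a specific version):

```cpp
#include <libdash.h>

int main(int argc, char *argv[]) {
    dash::init(&argc, &argv);         // initialize the DART runtime

    dash::Array<int> arr(100);        // array partitioned across all units
    if (dash::myid() == 0)
        arr[99] = 42;                 // global access, may be a remote write

    arr.barrier();                    // wait until the write is visible
    dash::finalize();
    return 0;
}
```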