undefined

Buffl

Reconfigurable Computing

by Tabea R.

Erstelle LUT:

z.B.: u = ab - c (NOT C)

bedeutet u = 1 wenn a = 1, b = 1 und c = 0; ansonsten immer 0

zeichne:

Draw two signals in the figure of the switch box below. You can arbitrarily choose which port is input and output.

Identify Two Paths:

Choose any two paths that you want to configure for the signals.
Ensure these paths do not overlap or intersect at the same programmable connection points to avoid conflicts
Da wo Linie durch Dreieck geht, Rechteck am Dreieck wird 1 -> erlaube Buffer passthrough
Da wo Linie am Dreieck vorbei geht, Rechteck am Dreieck wird 0 -> erlaube Buffer kein passthrough
Linie geht einmal durch das Dreieck, einmal vorbei

Setze die Multiplexer
Nummeriere die möglichen Ausgangssignale von den Trapezen (0 - 2)
Wenn die vorher gezogene Linie durch das 0. Signal geht, fülle im Trapez-Rechteck 0 0 aus (Binär für 0)
Wenn die vorher gezogene Linie durch das 1. Signal geht, fülle im Trapez-Rechteck 0 1 aus (Binär für 1)
Wenn die vorher gezogene Linie durch das 2. Signal geht, fülle im Trapez-Rechteck 1 0 aus (Binär für 2)

ONLY

N = 2

S(2) = 1 / ((1 - 0.99) + (0.99/2))

= 1 / (0.01 + 0,495)

= 1 / 0,505

~ 1,98

ONLY maximum speedup

The maximum theoretical speedup is 100x, regardless of the number of processors N, because the sequential portion limits further improvement.

Formel

Berechnung:

start with x as input variable
create comparison and arithmetic operation nodes (<, >, *, + )
create MUX
1. control (comparison node) input should come from the sides
2. resulting values of arithmetic operations come from the top
3. no more than two inputs and one decider/control
connect MUX with each other
add output variable y

Generate partitions for the graph below by hierarchical clustering until you receive a bi-partitioning. The edge weights denote the closeness between the operations o1,...,5. Use arithmetic mean for combining closeness values.

such die “größte” Verbindung (hier 45 o1 zu o4) und “lösche” diese (45 wird fürs rechnen ignoriert)
berechne aus dem daraus entstehenden neuen Knoten (o1,o4) die Verbindungen (Entfernungen addieren und /2 )
wiederhole Schritt 1 und 2 mit den neuen Verbindungswerten (größte wird wieder entfernt, Node zum Knoten hinzufügefügt, berechnung der Verbinungen …)
1. o1,o4 -> o5 = 10 aus Schritt 1
2. o1,o3,o4 -> o5 = 10 + 0 da o1,o4 -> o5= 10 und o3 -> o5 = 0
Wiederhole so lange bis nur noch zwei Nodes übrig sind

Determine the ASPA schedule (without any hardware constraints)

Determine the ALAP schedule without any hardware constraints

Determine the URGENCY by applying list scheduling using the urgency as priority function. Assume that one multiplier and one adder units are available having computation delays of 2 and 1 cycles, respectively

Determine the schedule by applying list scheduling using the urgency as priority function. Assume that one multiplier and one adder units are available having computation delays of 2 and 1 cycles, respectively

What does the MathProg Code for “Only Once” Constraint look like?

times ergibt sich aus D - 1 (weil Computer mit 0 startet)

nodes sind die Anzahl der Knoten

What does the MathProg Code for “Precedence Constraint” look like?

Give a full example for optimal scheduling using ILP in MathProg

Provide the optimal solution and sketch for scheduling using ILP

wobei x[OPERATION/NODE, TIME]

What is D_min?

Dmin = 3

Provide the ILP model with resource constraints a_mult=1 und a_add=1

Provide the optimal solution and sketch for scheduling using ILP, given constraints a_mult=1 und a_add=1

What is D_min, given constraints a_mult=1 und a_add=1?

Dmin = 3

Provide the ILP model using modulo scheduling and a super sink node

Provide a sketch using modulo scheduling

Optimal solution: x[1,0]=1, x[2,1]=1, x[3,4]=1, x[4,3]=1, x[5,1]=1, x[6,2]=1, x[7,5]=1, x[8,4]=1, x[9,5]=1, x[10,3]=1, x[11,0]=1, x[12,1]=1 y[1]=y[2]=y[3]=y[4]=y[5]=y[6]=y[7]=y[8]=y[9]=y[10]=0, y[11]=y[12]=1

What is the minimum II?

II_min = 6

What is the minimum D?

D_min = 7

What is a super sink node, why do we need it and how does it work?

What:

conceptual or artificial node introduced to simplify the problem by consolidating the flow into a single destination

Why:

network flow problems: there may be multiple sink nodes (nodes where flow can terminate). Introducing a super sink node allows all flow to be redirected to a single destination, simplifying the flow constraints.
simplify: can make it easier to define constraints and solve the problem

Given the DFG

and

Determine t_{p,max} and total resources in BLE for a fully parallel combinatorial implementation (no pipeline stages).

Lösung:

t_{p,max} berechnet sich aus den ns mit dem längsten Pfad
Total Resources berechnet sich aus 3x Adder-Node à 16 BLE und 3x Mult-Node à 256 BLE

Given the DFG

and

we want a high performance realization. Pipeline the circuit, such that a clock frequency of at least 100 MHz is possible. Use a minimum of additional resources for pipelining to just reach this goal. Determine the delay tp,max and total resources in BLE.

Note: The first pipeline stage is located at the output, all others are perfectly distributed in the combinatorial path

Calculate (f=100MHz): T = 1000/f = 1000/100 = 10ns
From the requirements, t_MUL is > 10ns as it is 15ns; therefore we have to add partition for the multiplier. The smallest amount of partitions/resources so we get A_MUL is < 10ns -> 15ns/P_MUL = 15ns/2 = 7.5ns
This means we have to add 2 partitions (and registers) to every multiplier; 2* number_of_multiplier = 2*3 = 6
We also have to add three registers for the addition
We update the BLE computation using P_MUL and P_ADD

The new t_p,max is 7.5ns
T_min = 7.5ns + 1ns = 8.5ns
f_max = 1/8.5ns = 1127.6 MHz

???????????????

Aufgabe 8 1.d & 2

Join Course

Preview

Author

Tabea R.

Information

Last changed
5 months ago

Report course

Exercises

Author

Tabea R.

Information