3 types of resource limitations (Munzner 2014)
Visualization designers must take into account three very different kinds of resource limitations: those of computers, of humans and of displays
Computational limits: Processing time / system memory
Display limits: Number of pixels, Information density (ratio of used space vs unused whitespace)
Human limits: Perception, attention and memory (e.g. change blindness)
Explain the visualization pipeline. What are the four stages?
Visualization Pipeline
A general model of the visualization process with 4 stages, where the user can interact with the process at each stage
4 stages: data acquisition -> filtering / enhancement -> visualization mapping -> rendering
Explain the data acquisition stage. What are three general cases?
3 general cases: simulation, databases, sensors
In all three cases the result is raw data, which is then visualized
Examples: global climate simulation, images of particles in fluid, weather radar data,…
Explain the filtering / enhancement stage. Give at least two examples.
process to obtain useful data (e.g. 3D volume) / derived data computed from the raw data
Examples:
data format conversion
co-registration of data sets
resampling to grid
interpolation / approximation of missing values
data reduction
clipping / cleaning / denoising
etc.
Explain the visualization mapping stage. Give at least two examples.
Derived data that was computed during the filtering / enhancement step is now mapped to some renderable representation
scalar field mapped to (->) isosurface
2D field -> height field
Decide
which parts of the data are shown?
How to represent them?
Graphical primitives: points, lines, surfaces, volumes
Visual channels: color, texture, transparency
Explain the rendering stage. Give at least two examples.
from renderable representation to display images / video
Generate 2D image / video
viewpoint specification
visibility calculation
lighting / shading, composition
compute values at sampling points
In which stage of the visualization pipeline happens resampling to a regular grid?
Filtering / enhancement step
In which stage of the visualization pipeline are the viewpoint and lighting parameters specified?
Rendering stage
In which stage of the visualization pipeline happen lighting and shading?
Rendering stage
In which stage of the visualization pipeline are colors assigned to every voxel?
Visualization mapping stage, for instance using a transfer function to assign color and opacity to each voxel.
In which stage of the visualization pipeline happen smoothing and noise suppression?
Filtering / enhancement stage (smoothing and noise suppression are enhancement operations)
Discuss independent vs. dependent variables in data. Give at least two examples each.
Data Representation
independent variables refer to the dimensionality of the domain of the problem
e.g. 2D or 3D Space, time
Do not depend on anything else
Dependent variables refer to the type and dimension of the data to be visualized
e.g. temperature, density values, velocity vectors
Depend on the location where we measured those, e.g. measured in respect to the independent variables
What are the independent and dependent variables in a 3D spatial curve 𝟇 : ℝ -> ℝ^3?
ℝ is the independent variable and ℝ^3 contains the dependent variables
The curve is a function of one parameter, and the output is always a 3D location, e.g. a position along the curve for each parameter value t
What are independent and dependent variables in a 3D vector field?
We have 3D space = independent variables
3D vectors given at every location so the vectors itself would be the dependent variables
What type of attribute are the following: Categorical, ordinal, or quantitative?
a) Type of cheese (e.g. Swiss, Brie)
b) Tire pressure (e.g. 2.3 bar, 2.5 bar)
c) first name (e.g. Alice, Bob)
d) Unemployment rate (e.g. 6%, 10%)
e) T-shirt sizes (e.g. medium, large)
a) categorical
b) quantitative
c) categorical
d) quantitative
e) ordinal
Draw an illustration of a Cartesian grid. Describe how such a grid is different from a regular grid. Which information needs to be specified explicitly for such a grid?
Cartesian grid = equidistant grid
Spacing is constant in x and y (or z in 3D) dimension
Difference to a uniform / regular grid: same spacing within each direction, but the spacing in x and y direction can differ
Information that has to be specified explicitly: grid dimensions and the spacing per axis (coordinates then follow from the indices)
Positions of the individual cells can be computed from the indices - don’t have to be stored
Neighborship information is implicit
What is a curvilinear grid? How is it characterized? How is it different from an unstructured grid? Which information needs to be specified explicitly for such a grid?
non-orthogonal grid
Grid points / position of individual vertices specified explicitly
still have regular grid cells so neighborhood information is still implicit
Unstructured grid: both location of the individual vertices and neighborhood information has to be specified explicitly
cells are tetrahedra or hexahedra
How to proof for a triangulation that it is a Delaunay triangulation?
Consider the circumcircle of each triangle: no vertex of another triangle may lie inside it. If a vertex does lie inside, the Delaunay property is violated, the triangulation is locally non-Delaunay, and we would have to perform e.g. an edge-flip operation to make it Delaunay.
Visualization (Definition)
The use of computer-supported, interactive, visual representations of data to amplify cognition
Medical Visualization
Preoperative planning of tumor resection, Virtual fetoscopy (4D Ultrasound), MRI scans,…
Why visualization?
lets you see things that would otherwise go unnoticed (data trends, outliers, dependencies, etc.)
gives answers faster
lets you interact with your data, study causes and effects, etc
helps to deal with increasing size and diversity of data
produces pretty, informative and interactive pictures
Lie factor
Size of effect shown in graphic / Size of effect in data (e.g. drawing a 2× change in the data as a 6× visual change gives a lie factor of 3)
What is Visualization good for?
Visual exploration (Nothing is known about the data)
find the unknown / unexpected
generate new hypothesis
Visual analysis (confirmative vis.) (There are hypotheses)
confirm or reject hypotheses
information drill-down
Presentation („Everything“ is known)
effective / efficient communication of results
The visible human project
bodies were frozen in a special material to preserve tissues and organs
Micro-thin layers were „shaved“ off the frozen block to expose the underlying tissues
a picture is taken
A „stack“ of 2D images is obtained
Passive Visualization Scenario
complexity / technical demands: low
benefits, possibilities: Low
Interactive Visualization Scenario
complexity / technical demands: middle
benefits, possibilities: middle
Computational / Visual Steering Scenario
complexity / technical demands: high
benefits, possibilities: high
Characteristics of data values
Attribute types (quantitative vs. qualitative)
Domain (continuous vs. discrete)
Value range (includes precision of values)
Data type (categorical, scalar, vector, tensor data)
Dimension (number of components)
Error and uncertainty
(physical) interpretation
Attribute types
Quantitative: numerical, measurable
e.g. length, mass, temperature
metric scale - allows measure of distance
Continuous (real) or discrete (distinct & separate values)
Qualitative: categorical data, not measurable
No metric scale; cannot be measured
Requires a subjective decision in order to be categorized
Discrete
Nominal
No natural ordering or indication of values, only equivalence and membership (=, ≠)
e.g. eye color (blue, green, brown)
Ordinal
Logical order relation (<, >) but no relative size or degree of difference
e.g. judgement of size (small, medium, large), Attitudes (strongly disagree, disagree, neutral, agree, strongly agree)
Categorical data: Values from a fixed number of categories
Scalar data: Given by a function
Vector data: Represent direction and magnitude, given by an n-tuple; e.g. a 2D vector field where every sample represents a 2D vector
Tensor data: A multi-dimensional matrix
Scientific Visualization
Deals with the reconstruction of a continuous real object from a given discrete representation
Data that has some physical or geometric correspondence
Information Visualization / Visual Analytics
Deals with data that is discrete and more abstract
Does not have a physical or geometric correspondence
Symbolic, tabular, networked, graphs, textual information
Scattered Data
Grid-free data
data points given without neighborhood relationship
influence on neighborhood defined by spatial proximity
Scattered data interpolation
Isocontours
Curves on which all points have a certain (constant) value
Within a (bilinearly interpolated) cell they are hyperbolas, not straight lines
Radial basis function
Construct a continuous function f from a given set of points and values which approximates the given values
Independent of the dimension of the parameter domain (1D / 2D / 3D)
Function represented as weighted sum of N radial functions
Nearby points have higher influence than far-away points
Each radial function is centered around a data point. Values decrease quickly the further away from this point / the functions center
Interpolation: we want a curve which is going through the initial data points and changes smoothly in between them
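A minimal sketch of RBF interpolation in Python, assuming a Gaussian radial function and an illustrative shape parameter eps (both are choices for this sketch, not from the lecture); the weights come from solving the N×N linear system mentioned above.

```python
import numpy as np

def rbf_interpolate(centers, values, query, eps=1.0):
    """Interpolate scattered data with Gaussian radial basis functions."""
    def phi(r):
        return np.exp(-(eps * r) ** 2)          # radial function, decays with distance

    # Pairwise distances between the N data points -> N x N system matrix
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    w = np.linalg.solve(phi(d), values)         # weights so that f(x_i) = value_i

    # f(query) = weighted sum of radial functions centered at the data points
    dq = np.linalg.norm(query[:, None, :] - centers[None, :, :], axis=-1)
    return phi(dq) @ w

# 2D scattered samples and one evaluation point
pts  = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vals = np.array([0.0, 1.0, 1.0, 2.0])
print(rbf_interpolate(pts, vals, np.array([[0.5, 0.5]])))
```

Note how this illustrates the drawbacks listed next: adding a sample changes the system matrix, so the weights must be re-solved.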
Drawbacks of radial basis functions
Every sample point has influence on whole domain
Adding a new sample requires re-solving the equation system
Computationally expensive (solving a system of linear equations)
What can we do?
Find a different radial function
Give up finding a smooth reconstruction and try finding a piecewise (local) reconstruction function
What is a good triangulation?
A measure for the quality of a triangulation is the aspect ratio of the so-defined triangles
Avoid long, thin triangles
Make triangles as „round“ as possible
Maximize the minimum angle in the triangulation
Maximize radius_of_in-circle / radius_of_circumcircle ratio
A Delaunay triangulation is an optimal triangulation
Delaunay triangulation
The circumcircle of any triangle does not contain another point of the set
Maximizes the minimum angle in the triangulation
Such a triangulation is unique (independent of the order of samples) for all but trivial cases
Building a Delaunay triangulation from an initial, non-optimal triangulation: successively improve the initial triangulation via local operations
Edge-flip operation:
An edge is locally Delaunay if there exists an empty circumcircle / the circumcircle of an adjacent triangle does not contain another point of the set
If an edge shared by two triangles is illegal, a flip operation generates a new edge that is legal
If a triangulation is locally Delaunay everywhere -> globally Delaunay
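A small sketch of the in-circle test behind the edge-flip operation; the function name and the assumption that triangle (a, b, c) is ordered counter-clockwise are illustrative, not from the lecture.

```python
import numpy as np

def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of CCW triangle (a, b, c)."""
    m = np.array([
        [a[0] - d[0], a[1] - d[1], (a[0] - d[0])**2 + (a[1] - d[1])**2],
        [b[0] - d[0], b[1] - d[1], (b[0] - d[0])**2 + (b[1] - d[1])**2],
        [c[0] - d[0], c[1] - d[1], (c[0] - d[0])**2 + (c[1] - d[1])**2],
    ])
    return np.linalg.det(m) > 0   # positive -> d inside -> the shared edge is illegal

# The shared edge of triangles (a, b, c) and (a, b, d) is locally Delaunay
# iff d is NOT inside the circumcircle of (a, b, c):
print(in_circumcircle((0, 0), (1, 0), (0, 1), (1.2, 1.2)))  # False: outside, edge legal
print(in_circumcircle((0, 0), (1, 0), (0, 1), (0.3, 0.3)))  # True: inside -> flip edge
```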
Voronoi Diagram
Problem: Looking for nearest neighbor
Partitions domain into Voronoi regions: Each Voronoi region contains one initial sample - the Voronoi samples
Points in a Voronoi region are closer to the respective sample than to any other sample
Voronoi vertices are the centers of the circumcircles of the Delaunay triangulation
The geometric dual (topologically equal) of the Delaunay triangulation
Isolines (2D)
An isoline (iso-contour) consists of all points at which the data has a specific value c: {(x,y) | f(x,y) = c} (Given a 2D scalar function and a scalar iso-value c)
Can be seen as a special kind of data condensation
Isolines are always closed curves (except when they exit the domain)
Isolines never (self-) intersect, thus they are nested
Isolines are always orthogonal to the scalar field’s gradient
The true isolines within a cell are hyperbolas
Marching-Cubes Algorithm
Approximates the surface by a triangle mesh
Surface vertices are found by linear interpolation along cell edges
Efficient triangulation by means of lookup tables
the standard geometry-based surface extraction algorithm for 3D scalar field
Computes isosurface for specific iso-value
Cell consists of 8 vertices
indices: (i+[0,1], j+[0,1], k+[0,1])
Consider a cell (defined by 8 vertices with associated data values) independently
Classify each vertex as inside or outside (outside the surface: value < iso-value; inside the surface: value >= iso-value)
Use the binary labeling of each cell to compute an index: outside = 0, inside = 1
Get per-cell triangulation from index: look up the triangulation for every index from a pre-computed table
Interpolate the edge location : for each triangle edge, find the vertex location along the edge using linear interpolation of the vertex values
Compute gradients : Calculate normals at each cube vertex (via finite differences), and interpolate along the edges
Consider ambiguous cases: use asymptotic decider as in 2D for this
Go to next cell
Summary:
256 Cases, Reduce to 15 Cases by symmetry
Causes holes if arbitrary choices are made
Up to 5 triangles per cube
Note that the triangulation is only an approximation of the true isosurface produced by trilinear interpolation
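A sketch of the per-cell classification and edge-interpolation steps described above; the 256-entry triangle lookup table itself is omitted, and the vertex numbering is an illustrative convention.

```python
import numpy as np

def cell_index(values, iso):
    """8-bit case index: bit i is set if cube vertex i is inside (value >= iso)."""
    idx = 0
    for i, v in enumerate(values):      # values: scalar data at the 8 cube vertices
        if v >= iso:
            idx |= 1 << i
    return idx                          # 0..255, used to look up the triangulation

def edge_vertex(p0, p1, v0, v1, iso):
    """Linear interpolation of the isosurface crossing along a cut cell edge."""
    t = (iso - v0) / (v1 - v0)          # assumes v0 != v1, i.e. the edge is cut
    return np.asarray(p0) + t * (np.asarray(p1) - np.asarray(p0))

vals = [0.2, 0.8, 0.1, 0.3, 0.0, 0.9, 0.4, 0.2]
print(cell_index(vals, iso=0.5))        # 34: vertices 1 and 5 are inside
print(edge_vertex([0, 0, 0], [1, 0, 0], 0.2, 0.8, iso=0.5))  # [0.5 0. 0.]
```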
Voxel
Data values are initially given at vertices of a 3D grid = voxels (volume elements)
Voxel = point sample in 3D
Phong’s illumination model (+ 3 components)
Considers ambient light and point lights as well as the material color and reflection properties
Ambient light: background light, constant everywhere
Diffuse reflector: reflects equally into all directions
Specular reflector: reflects mostly into the mirror direction
Lighting
Necessary to emphasize iso-surface shape: simulate reflection of light and its effect on color
Phong’s illumination model
How can a perfect mirror be simulated via the Phong illumination model?
By letting the shininess exponent n in the specular term go towards infinity (see specular reflection below)
Ambient light
Formula: C = ka Ca Od
Background light
constant everywhere
ka = ambient reflection coefficient in [0,1]
Ca = Color of the ambient light
Od = object color
Diffuse reflection
Scatters light equally in all direction
C = kd Cp Od cos θ, or C = kd Cp Od (l · n)
kd = diffuse reflection coefficient in [0,1]
if kd = 0 -> black; if kd = 1 -> brightest
Cp = color of the point light
θ = angle between light vector l and normal n
if l = n, meaning the light is precisely above the point and θ = 0°, then cos 0° = 1 -> 100% intensity
if the light shines onto the point at a 45° angle -> cos 45° ≈ 0.7 -> 70% intensity
at a 90° angle -> 0% intensity
highest diffuse reflection when normal vector = light vector, i.e. the point is precisely below the light source
Specular reflection
Highlight = reflection of light source
Glossy surfaces
Reflects mostly into the mirror direction
view dependent!
C = ks Cp Od cos^n 𝞅, or C = ks Cp Od (r · v)^n
ks = specular reflection coefficient in [0,1]
𝞅 = angle between the reflected light ray r and the vector to the viewer v
n = shininess exponent (controls the extent of the highlight)
the larger n, the smaller the highlight
highest specular reflection when reflected vector = view vector
when the exponent in the specular reflection formula (shininess factor) approaches infinity, we get almost a perfect mirror
Calculate vector to the viewer (Phong illumination model)
Camera position vector minus surface point vector
Calculate light vector (Phong illumination model)
Vector of position of a point light source minus vector of surface point
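A runnable sketch of the full model (ambient + diffuse + specular, one point light, scalar intensities), following the formulas above including the light and view vector construction from the last two answers; all coefficient values are illustrative.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def phong(point, normal, light_pos, view_pos,
          Od=1.0, Ca=1.0, Cp=1.0, ka=0.1, kd=0.6, ks=0.3, n=32):
    """Evaluate C = ka*Ca*Od + kd*Cp*Od*(l.n) + ks*Cp*Od*(r.v)^n."""
    N = normalize(normal)
    l = normalize(light_pos - point)   # light vector: light position minus surface point
    v = normalize(view_pos - point)    # view vector: camera position minus surface point
    r = 2 * np.dot(N, l) * N - l       # reflection of the light vector about the normal

    ambient  = ka * Ca * Od
    diffuse  = kd * Cp * Od * max(np.dot(l, N), 0.0)       # cos(theta), clamped
    specular = ks * Cp * Od * max(np.dot(r, v), 0.0) ** n  # cos^n(phi), clamped
    return ambient + diffuse + specular

p = np.array([0.0, 0.0, 0.0])
print(phong(p, normal=np.array([0.0, 0.0, 1.0]),
            light_pos=np.array([0.0, 0.0, 5.0]),
            view_pos=np.array([0.0, 2.0, 2.0])))
```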
Where does volumetric data come from?
medical scanners e.g. CT scan
automotive engineering
Volume rendering techniques
Techniques for 2D scalar fields (e.g. slicing)
Indirect volume rendering techniques (e.g. surface fitting)
direct volume rendering techniques
Indirect volume rendering techniques
e.g. surface fitting
Convert / reduce volume data / raw data to intermediate representation (surface representation) first, which can then be rendered with traditional techniques
MC-algorithm (Marching-cubes algorithm)
Direct volume rendering (DVR) techniques
consider data as a semi-transparent gel with physical properties and directly get a 3D representation of it
e.g. Ray-casting
Volume material attenuates reflected or emitted light
Get a 3D representation of the volume data taking into account emission and absorption (without making an intermediate representation first)
considers the physics of light transport in a dense medium
Optical properties are mapped to each voxel (emission = color, absorption = opacity)
The light reaching the viewer is simulated by ray casting
Slicing
Display volume data, mapped to colors, on a slice plane
Iso-surfacing
Generate opaque / semi-opaque surfaces (e.g. via the MC-Algorithm)
Transfer function
performs mapping of data values to visual properties like color and opacity
associate distinct materials (value ranges) to disting properties (color & opacity: as (RGBa))
assign a different color to each scalar value (for scalar data)
Opacity
Opacity alpha = 1 -> completely opaque
Opacity alpha = 0 -> completely transparent
Volume rendering integral
Ray-casting (Direct volume rendering)
Numerical approximation of the volume rendering integral
A ray is cast into the volume for each output pixel -> the volume is resampled at equidistant intervals along the ray (integral becomes a sum over samples) -> sample values are tri-linearly interpolated -> the transfer function is applied
Volumetric ray integration uses a front-to-back strategy (alpha-compositing), first-hit, average, or maximum
Ray-casting method (Direct volume rendering)
Defines a virtual image plane where viewer is looking through
Cast a ray through every pixel on the screen
for each pixel on the image plane
compute entry- and exit-point in volume
while current position inside volume
read density at current position
apply transfer function: scalar value -> color + alpha-value
compute shading
apply compositing
compute new position along ray
end while
set pixel color in image plane
end for
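A runnable 1D stand-in for the loop above, showing the front-to-back alpha-compositing and early-ray-termination steps; the toy transfer function and density profile are illustrative assumptions.

```python
import numpy as np

def transfer(density):
    """Toy transfer function: scalar value -> (color, opacity)."""
    return density, 0.3 * density        # denser material: brighter and more opaque

def cast_ray(samples):
    C, A = 0.0, 0.0                      # accumulated color and opacity
    for d in samples:                    # samples along the ray, front to back
        c, a = transfer(d)
        C += (1.0 - A) * a * c           # front-to-back alpha compositing
        A += (1.0 - A) * a
        if A > 0.99:                     # early ray termination: nearly opaque
            break
    return C, A

densities = np.clip(np.sin(np.linspace(0, np.pi, 50)), 0.0, 1.0)
print(cast_ray(densities))
```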
Volumetric compositing (Ray-casting)
accumulation of color and opacity along rays
Variations of compositing schemes
alpha-compositing
Surface rendering / first hit
Average
Maximum
First hit: stop ray traversal once an iso-surface is hit (value larger than a certain threshold) and shade the surface point
produces the same result as marching cubes, but with higher accuracy
Average compositing scheme
simply accumulates colors but does not account for opacity
values along the viewing ray are averaged
produces an x-ray image
Maximum intensity projection (MIP) scheme
only takes the maximum color along the ray and displays it
doesn’t account for opacity
Often used for magnetic resonance angiograms
good to extract vessel structures
Problems when doing direct rendering: Sampling artifacts
Too few samples along the ray
Interrupted artifacts -> no smooth surface of the visualization
Solution: Increase sampling rate to Nyquist frequency -> at least 2 samples per voxel
remove artifacts by stochastic jittering of ray-start position
Direct volume rendering vs. Surface rendering
Direct volume rendering
Direct representation
Conveys volume impression
Often realized in software (slow?!)
Transfer function specification
Surface rendering
Indirect representation
conveys surface impression
Hardware supported rendering (fast?!)
Iso-value-definition
How do you get values along the viewing ray (from volume data)?
Data in volumetric datasets are usually given in some kind of grid
ray casting: interpolation scheme e.g. trilinear interpolation or some other higher-order interpolation scheme
Flow visualisation
Visualize moving phenomena, e.g. wind tunnel / wind fields, weather / climate simulations, aerospace / car / ship design
Flow visualization - data sources
Flow simulation:
Design of ships, cars, airplanes
Weather simulations (e.g. atmospheric flow)
Medical blood flows
Measurements
wind tunnel
schlieren imaging
Modeling
Differential equations systems (dynamical systems)
Main application of flow visualization
Motion of fluids (gases, liquids)
Geometric boundary conditions
Velocity / flow field v(x,t)
Conservation of mass, energy, and momentum
Navier-Stokes equations
Computational fluid dynamics (CFD)
Flow visualization - classification
Dimension (2D or 3D)
Time-dependency: steady vs. time-varying flows
Direct vs. indirect flow visualization
Steady (time-independent) vs. time-varying (unsteady) flow
Steady (time-independent) flow:
flow static over time
e.g. laminar flows
simpler interrelationships
time-dependent (unsteady) flow
flow changes over time
e.g. turbulent flows
more complex interrelationships
Flow visualization - Approaches
Direct flow visualization (arrows, color coding,…)
Geometric flow visualization (stream lines/surfaces,…)
Sparse (feature-based) visualization
Dense (texture-based) visualization
1. Direct flow visualization
e.g. color coding, arrow plots, glyphs
Gives overview on current flow state
Visualization of vectors
Glyphs
Visualize local features of the vector field
Map vector or curl to arrow glyphs
Can visualize more features of vector field, e.g.using Velocity, Curvature, Rotation, Convergence,…
Flow visualisation with arrows
Vector per grid point pointing into the flow direction
use arrow length and / or color to highlight special regions
Arrows in 3D: Advantages - Disadvantages
Advantages
Simple
3D effects
Disadvantages
Ambiguity
Difficult spatial perception (1D-objects in 3D)
Inherent occlusion effects
Poor results if magnitude of velocity varies significantly and changes rapidly
-> Use 3D arrows of constant length and color code magnitude
2. Geometric flow visualization
Use intermediate representation (vector-field integration over time)
Visualization of temporal evolution (also consider data over time)
Stream lines, path lines, streak lines
Basic idea: trace particles along characteristic trajectories and map trajectories to particles, lines, balls, bands
Types of characteristic lines (Geometric flow visualization)
Stream lines: trajectories of massless particles in a “frozen” (steady) vector field
trajectories of massless particles at one time step
does not show movement over time, but within a frozen flow / vector field
Path lines: trajectories of massless particles in (unsteady / time-varying) flow
follow one particle through time and space
Streak lines: trace of dye that is continuously released into (unsteady / time-varying) flow at a fixed position
connect all particles that started at the same seed point
a new particle is continuously injected at the same seed point
all existing particles are advected and connected (from youngest to oldest)
-> Comparison of path / streak / stream lines: Identical for steady flows
Stream ribbons (flow oriented)
we would like to see places where the flow twists (vortices)
Trace two close-by particles (keep distance constant)
Or rotate band according to curl
Streak surface
simultaneously release particles along a seeding structure (line) and connect them all to form a surface
e.g. particle-based, triangulated, or semi-transparent streak surface
Characteristic lines are tangential to the flow. What does that mean
means that the line tangent (1st derivative) is aligned with the vector field / points in the direction of the vector field
Particle Tracing on Grids - most simple case: Cartesian grid
(Basic algorithm)
Select start point (seed point)
Find cell that contains start point
While (particle in domain) do
interpolate vector field at current position
integrate to new position
find new cell
draw line segment between latest particle positions
EndWhile
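A Python sketch of the algorithm above, assuming a 2D Cartesian grid with unit spacing, bilinear interpolation, and explicit Euler integration (the integration scheme, step size, and example field are illustrative choices).

```python
import numpy as np

def bilinear(field, x, y):
    """Bilinearly interpolate a vector field given at integer grid points."""
    i, j = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - i, y - j
    return ((1 - fx) * (1 - fy) * field[j, i]     + fx * (1 - fy) * field[j, i + 1] +
            (1 - fx) * fy       * field[j + 1, i] + fx * fy       * field[j + 1, i + 1])

def trace_stream_line(field, seed, h=0.1, steps=200):
    """Integrate from the seed point; stops when the particle leaves the domain."""
    ny, nx = field.shape[:2]
    path = [np.asarray(seed, dtype=float)]
    for _ in range(steps):
        p = path[-1]
        if not (0 <= p[0] < nx - 1 and 0 <= p[1] < ny - 1):
            break                               # particle left the domain
        v = bilinear(field, p[0], p[1])         # interpolate the vector field
        path.append(p + h * v)                  # explicit Euler step to new position
    return np.array(path)

# Circular flow v(x, y) = (-(y - c), x - c) around the grid center
ys, xs = np.mgrid[0:32, 0:32]
field = np.dstack([-(ys - 16.0), xs - 16.0]) * 0.1
print(trace_stream_line(field, seed=(20.0, 16.0))[:3])
```

In practice a higher-order scheme (e.g. Runge-Kutta) is preferred over Euler for accuracy.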
Stream line placement in 2D
seeding at regular grid points gives irregular results
Evenly-spaced streamlines:
Idea: streamlines should not get too close to each other
Choose seed point with distance d_sep from existing stream line
forward- and backward-integration until distance d_test
Stop stream line integration
when distance to neighboring stream line <= d_test
when stream line leaves flow domain
when stream line runs into fixed point (v(x*)=0)
When stream line gets too close to itself
after a certain number of maximal steps
the smaller d_sep the nearer together the streamlines are
the smaller d_test in relation to d_sep the nearer together the streamlines are
Streamline placement on surfaces
Image-space technique
Vector field is first projected to 2D image
2D image is scanned at intervals d_sep
Seedpoint is found and stream lines are traced in that region
Scanning continues until seedpoint in new region is found
Seeding and integration happen in image space
Discontinuity detection
Stop stream line integration when z-depth drops to zero (edge of model) or when z-depth changes too abruptly (edge of overlapping regions)
What challenges does arrow-based direct flow visualization have?
due to perspective projection it can be unclear in which direction an arrow points, so 3D arrows carry some ambiguity; there can also be occlusion effects
Give two examples for geometric (integration-based) flow visualization. How do these techniques relate to direct flow visualization?
Streamlines
Streaklines
Pathlines
-> When using direct flow visualization we directly show the flow.
-> With geometric (integration-based) visualization we consider movement of particles along trajectories in the flow.
True or False: The Jacobian matrix at a point in a constant 3D vector field has non-zero elements on the main diagonal.
False
True or False: If the Jacobian matrix at every point in a 3D vector field is the identity matrix, then the vector field is divergence free.
False (the divergence is the trace of the Jacobian, which would be 3, not 0)
True or False: The divergence at every point in a 3D vector field is a scalar value.
True
True or False: Streamlines in a steady 3D vector field never cross.
True (at a crossing point the field would need two directions at once; streamlines can only meet at critical points)
True or False: Path lines in a time-varying 2D vector field never cross.
False (path lines can cross, since a crossing point can be passed at different times)
3. Sparse (feature-based) visualization
Global computation of flow features
Vortices, shockwaves, vector field topology
Vortices
one of the most prominent features
Important in many applications (turbulent flows)
No formal, well accepted definition yet (“something swirling”)
Shock waves
characterized by sharp discontinuities in flow attributes (pressure, velocity magnitude,…)
Vector field topology (2D)
Idea: do not draw “all” stream lines, but only the “important” ones
show only the topological skeleton
Connection of critical points
Characterization of global flow structures
Critical points: singularities in vector field such that v(x*) = 0 (source, saddle, sink)
Points where magnitude of vector goes to zero and direction of vector is undefined
Stream lines reduced to single point
Type of critical point determines flow pattern around it
Vector field topology (3D)
Critical points in 3D
More complicated
Line and surface separatrices exist
Saddle connectors in 3D (Vector field topology (3D))
The intersection of the separation surfaces of the two saddles is the saddle connector
Dense (Texture-based) flow visualization
Global method to visualize vector fields
Dense sampling
better coverage of information
Critical point detection and classification
(Partially) solved problem of seeding
Flexibility in visual representation
Good controllability of visual style
From line-like (crisp) to fuzzy
Line Integral Convolution (LIC)
Global visualization technique (not only one particle path)
Dense representation of flow fields
Convolution along characteristic lines -> correlation along these lines
for 2D and (3D flows)
Start with a random texture (white noise)
Smear out the texture along trajectories of vector field
Results in high correlation along a stream line (in flow direction), but low correlation between neighboring stream lines
Algorithm for 2D LIC
look at stream line that passes through a pixel
Smear out -convolve- noise texture in direction of vector field (along stream line)
LIC is a convolution of
a noise texture T(x,y)
and a smoothing filter
Algorithm:
Stream line containing the point
randomly generated noise texture
compute intensity using an integral
smoothing filter kernel, normalized and usually symmetric
Influence of the filter length: the larger L, the finer the lines
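A hedged 2D LIC sketch with a box filter kernel; the nearest-neighbor texture sampling, unit-speed steps, and circular example field are simplifying assumptions for this sketch.

```python
import numpy as np

def lic(field, noise, L=10, h=0.5):
    """Smear a noise texture along the stream lines of a 2D vector field."""
    ny, nx = noise.shape
    out = np.zeros_like(noise)
    for y in range(ny):
        for x in range(nx):
            total, count = noise[y, x], 1        # box filter, start at the pixel itself
            for sign in (1.0, -1.0):             # integrate forward and backward
                p = np.array([x, y], float)
                for _ in range(L):
                    i, j = int(p[0]), int(p[1])
                    v = field[j, i]
                    speed = np.linalg.norm(v)
                    if speed == 0:
                        break                    # critical point: stop integration
                    p += sign * h * v / speed    # unit-speed step along the stream line
                    i, j = int(p[0]), int(p[1])
                    if not (0 <= i < nx and 0 <= j < ny):
                        break                    # left the domain
                    total += noise[j, i]         # accumulate noise along the line
                    count += 1
            out[y, x] = total / count
    return out

rng = np.random.default_rng(0)
noise = rng.random((64, 64))
ys, xs = np.mgrid[0:64, 0:64]
field = np.dstack([-(ys - 32.0), xs - 32.0])     # circular flow around the center
print(lic(field, noise).shape)                   # (64, 64)
```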
Filtering by convolution
Sliding a function g(x) along a function f(x)
Function f is averaged with a weight function g
-> Horizontal / Vertical Gaussian blur
Oriented LIC (OLIC)
Visualizes orientation (in addition to direction)
uses a sparse texture; i.e. smearing of individual drops
Asymmetric convolution kernel
3D LIC
only good if non-relevant data is discarded
True or False: LIC is a local method for visualizing a vector field
False, it's a global method
True or False: The larger the extent of the convolution kernel used in LIC, the lower is the correlation between adjacent intensity values along a streamline
False, a larger kernel smears the noise further along the stream line and thus increases the correlation along it
True or False: LIC images show high correlation between the intensity values at adjacent streamlines
False, the correlation between adjacent stream lines is low; the high correlation is along each stream line
True or False: LIC is restricted to 2D vector fields
False, LIC also works for 3D flows (3D LIC)
True or False: The convolution kernel used in LIC must be symmetric
False, e.g. Oriented LIC (OLIC) uses an asymmetric kernel to also show orientation
Mapping techniques - Graphical primitives
represent data items or links
points, lines, areas, surfaces
representation of links between data items: Connection, Containment
Mapping techniques - Visual channels
control appearance of graphical primitives based on data attributes
Position (horizontal, vertical, both), Color, Slope, Size (length (1D), 2D area, 3D volume), Shape
Effectiveness principle
some visual channels are better than others
encode most important data attributes with most effective / accurate channels
Expressive mapping
match type of visual channel to data type
Properties of visual channels
Pop-out (emphasize important information)
Discriminability (how many usable steps?)
Separability (judge each channel independently)
Relative vs. absolute judgement
-> Perceived color is highly context dependent
-> perception is relative
-> use popout to emphasize data
-> choose carefully with the mapping and color
Pop-out
Preattentive processing: automatic and parallel detection of basic features in visual information (200-250 msec)
Speed independent of distractor count
Works on many individual channels
Discriminability
How many usable steps?
Must be sufficient for number of discriminable bins
we can only distinguish a limited number of colors / brightness level
Separable vs. integral visual channels
Relative vs. absolute judgements
Perception highly context-dependent
Perceptual system mostly operates with relative judgements, not absolute ones
Weber's Law: the just-noticeable difference is a fixed percentage of the magnitude of the stimulus (e.g. bar length) -> if the difference between two stimuli is smaller than this percentage of the total magnitude, the difference cannot be perceived anymore
Diagram techniques
Categorical + quantitative data: Bar / pie chart, stacked bars
Time-dependent data: Line graph, ThemeRiver, Horizon graph
Single and multiple variables: Histogram, scatterplot, parallel coordinates, Glyphs, color mapping
Quantitative data
numerical, measurable
objective data produced through a systematic process, not subject to interpretation (e.g. length, mass, temperature)
Metric scale: allows measure of distance
Qualitative data
categorical, not measurable
no metric scale, cannot be measured
Bar chart
Attribute 1: categorical -> horizontal position
Attribute 2: quantitative (dependent) -> length / vertical position
Bars should always start at zero!
Bars support comparison
Pie chart
Attribute 1: categorical -> color
Attribute 2: quantitative (dependent) -> angle
angle / area judgement less accurate than bar length
often bar chart better choice
Stacked bar chart
Quantitative data wrt 2 categorical vars (horizontal & vertical)
Investigate part-to-whole relationship (100%)
Length and color hue
Parallel sets
Quantitative data wrt. multiple categorical attributes
Shows connections and proportions
Given a 2D scatterplot using color, point size, and position to encode data. With respect to separability of visual channels there is…
A) some interference between color and position
B) some interference between color and point size
C) some interference between point size and position
D) some interference between all visual channels
E) no interference betweem the different channels
B
Which ranking sorts the visual channels for encoding quantitative data according to accuracy, starting with the highest accuracy?
A) angle/tilt - position - luminance
B) position - luminance - angle/tilt
C) length - 2D area - curvature
D) position - 2D area - length
C
Which statement on pre-attentive processing or pop-out is true?
A) it works on many combinations of visual channels
B) Speed depends on the number of distractors
C) Automatic and parallel detection of basic visual features
C
Line Graph
quantitative data on common scale(s) wrt. time
Connection between points - trends, structures, groups
Banking to 45 degrees
Perceptual principle: most accurate angle judgement around 45°
Pick the aspect ratio (height/width) accordingly
Lines imply trends
ThemeRiver
Thematic changes in documents
Occurrence per topic / category mapped to width of river band
less distorted around center
Rearranging bands
Horizon graph
Reduces vertical space without losing precision
Split vertically into layered bands
Collapse color bands to show values in less vertical space
Optional mirroring of negative values
What is the pop-out effect / pre-attentive processing? How can it be used?
pre-attentive processing: automatic and parallel detection of basic features in visual information (200-250 msec)
independent of the number of distractors
usually only works on individual channels, when having a combination of channels it usually requires a serial search instead of pre-attentive processing
Sort the following visual channels according to how accurately humans can compare them starting with the highest accuracy: 2D area – length – curvature – angle/slope
quantitative data: length - angle / slope - 2D area - curvature
qualitative / categorical data: different ranking but didn’t talk about it in the lecture
What is the difference between separable and integral visual channels?
two visual channels that are separable (color and position): we can focus on one group (either one color or one position) -> separable visual channels
integral visual channels: not possible to see the different channels separately
Name an example for fully separable / integral visual channels.
fully separable visual channel: color + position, fully separable, 2 groups each
fully integral visual channel: red + green make different colors, major interference, 4 groups total: integral color
Which visual channel(s) can be used in a bar chart? For what types of data?
Horizontal position for the categorical attribute; length / vertical position for the quantitative (dependent) attribute; color hue can encode an additional categorical attribute
From a perceptual point of view, what works better: Bar charts or pie charts? Why?
Pie chart: usually estimate the angle or area
Bar chart: length and precision of the ending is read out
humans are much better at comparing lengths and positions than angles and areas -> bar chart is the better choice
How do Parallel sets work? What kind of data can be shown?
quantitative data with respect to multiple categorical attributes
shows connections and proportions
How does the ThemeRiver work? Which visual channels are used for which type(s) of data?
usually have categorical data with a certain frequency (how often it appears)
How do Horizon Graphs work? How can you read out values at a position?
split an original graph into some layers and use color coding to distinguish different layers, those layers can be collapsed on top of each other
Histogram
Binning: group values into equally spaced intervals (bins)
Bin width affects representation
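A tiny binning example with numpy; the bin count of 20 is an arbitrary illustrative choice (changing it changes the representation, as noted above).

```python
import numpy as np

data = np.random.default_rng(0).normal(size=1000)
counts, edges = np.histogram(data, bins=20)   # 20 equally spaced bins
print(counts.sum())                           # 1000: every sample falls into a bin
print(edges[0], edges[-1])                    # bin edges span the data range
```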
Box plot variations
shows summary statistics of a distribution (1 variable)
Probability density function (PDF)
q1: lower quartile: 25% of data below
q2: median
q3: upper quartile: 25% of data above, 75% of data below
Interquartile range: between q1 and q3
Variations:
Tukey’s box plot
Tufte’s quartile plot
median = dot
easier to follow trend of median, more compact
Scatterplots
Show correlations between 2 dependent variables
Typically quantitative (measurable) data attributes
find trends, outliers, distributions, correlations, clusters,…
encode additional attributes by size, color, shape,…
Scatterplot matrix
show (all possible) combinations of attributes in a scatterplot matrix
Each row / column is one attribute
overview of correlation and patterns between data attributes
-> Brushing: mark data subset
-> Linking: highlight brushed data in linked views
-> Move / alter / extend brush
Parallel coordinates
Represent multiple data variables
each variable is represented by a vertical axis, which are organized as evenly spaced parallel lines
Data on each axis is normalized to min / max
One data sample is represented by a connected set of points, one on each axis
recognize patterns between adjacent axes
steep learning curve for novices
Axis ordering is major challenge: order by quality metrics
Line point duality
Points in scatterplot map to lines in parallel coordinates
points in parallel coordinates map to lines in scatterplot
Radar chart (star plot, spider chart)
Radial axis arrangement
Items are polylines
axes in center very crowded: too much information in the center
Function plot for 1D scalar field
showing single variable
1D curve
Mapping of a discrete set of points to a set of lines by connecting adjacent points
Height field for a 2D scalar field
function plot for 2 independent variables x and y and 1 dependent variable
2D surface, f(x,y) can be interpreted as height value at (x,y)
Mapping of a discrete set of points to a set of faces by connecting adjacent points
Glyphs
small independent visual objects that depict attributes of a data record
Discretely placed in a display space
Data attributes are represented by different visual channels (e.g. shape, color, size, orientation)
Visual channels should be easy to distinguish and combine
Mainly used for multivariate data
Star glyphs
A star is composed of equally spaced spikes, originating from the center
Length of spikes represents value of respective attribute
ends of the spikes connected by a line
Stick figures
2D figures with limbs
Data encoded by
length
line thickness
angle between lines
recognize texture patterns: see changes by looking at texture and neighborhood; not individual glyphs, but the patterns they form
Chernoff faces
Data attributes represented by features of a face (eye position, nose length, mouth form,…)
Problem:
Faces are perceived holistically: we interpret the mood of the glyphs, although it is actually a data visualization
Efficiency?
Color
light is electromagnetic radiation
different wavelengths are perceived as different colors
Human eye can only see light between 380nm and 780nm (visible spectrum)
Visual effect of chromatic light (light spectrum) can be characterized by 3 channels
Hue: dominant wavelength
Saturation: pureness, amount of white light
Luminance / Brightness: intensity of light
Color mapping
Emphasize a specific target in a crowded display (pop-out)
Group, categorize, and chunk information
Possible problems:
Dependent on viewing and stimulus conditions
distract the user when inadequately used
Ineffective for color deficient individuals
Results in information overload
Color maps can be
categorical vs. ordered
sequential vs. diverging
discrete vs. continuous
Color mapping - Perceptual linear
equal steps in color map (i.e. magnitude of data) should be perceived equally
caution: for many color maps, equal steps in the data are not perceived as equal steps
Color mapping - Perceptual ordering
ordering of data should be represented by ordering of colors
rainbow colormap is perceptually unordered
Things to know when mapping data to colors
Perceived color is highly context dependent
Size matters
vary luminance too
make sure contrast is high
Colors are more useful for qualitative statements
do not use color if it is necessary to read out precise values
use “good” color maps
Which visual channels can be used in a scatterplot besides position?
size
color
shape
…
How does a scatterplot matrix work? How can you see correlations?
used to show more than two variables: but the combination of a lot of variables
positive correlation: points run from bottom-left to top-right
negative correlation: points run from top-left to bottom-right
How does linking and brushing work?
brushing: mark (interesting) data subset
Linking: highlight brushed data in linked views
a set of data points is selected (brushed) in one plot of the scatterplot matrix, and the corresponding data points are highlighted in the other scatterplots of the matrix
What does it mean when the lines between two axes in a parallel coordinates visualization meet in a point?
if the lines of a parallel coordinates plot meet in one point it means that we have a negative correlation between the two axes / attributes
What are glyphs? For which type of data are they typically used?
small independent visual objects that show attributes of a data record
can be placed individually somewhere in the visualization in display space
different data attributes can be encoded by different visual channels of a glyph (e.g. shape, color, size, orientation)
visual channels should be easy to distinguish and combine
mainly used for multivariate data to show different data attributes of a data record
How do star glyphs / stick figures work? How do they encode the data?
star glyphs:
show multiple attributes of a data record
composed of equally spaced spikes that start in the center; the length of each spike represents the value of a data attribute, and the ends of the spikes are connected by lines
stick figures:
data attributes can be mapped to the angle between the different limbs of the stick figure and the length / thickness of the individual limbs can be used to encode data
usually not perceived individually but in combination
recognize texture patterns
What are the advantages / disadvantages of a rainbow color map?
Disadvantages:
since it uses color hue to encode the information, some details may be lost
perceptually non-linear: the same step in the data is not represented by the same perceived color difference
perceptually unordered: no naturally ordering of the colors
Advantages:
many colors
good for categorical data, but not for quantitative data because it is perceptually non-linear and unordered
What does it mean, when a visual channel (e.g. color) is perceptually linear / ordered?
perceptual linear: equal steps in color map (i.e. magnitude of data) should be perceived equally
perceptual ordering: ordering of data should be represented by ordering of colors
What are the characteristics of a sequential / diverging color map?
Sequential:
are suited to ordered data that progress from low to high / min to max
Diverging:
has a neutral center
put equal emphasis on mid-range critical values and extremes at both ends of the data range.
Visualization is good for…
Visual exploration
find unknown / unexpected
generate new hypotheses
Visual analysis (confirmative vis.)
verify or reject hypotheses
Presentation
show / communicate results
Visual Analytics / Analysis
Visual Analysis of Scientific Data
Combines computational & interactive visual methods
multiple linked views
Interpret large & complex data
Drill-down into information
Find relations (“read between the lines”)
Detect features / patterns that are difficult to describe
Integrate expert knowledge
Multi-faceted Scientific Data
Spatiotemporal data
Multi-variate / multi-field data (multiple data attributes, e.g. temperature or pressure)
Multi-modal data (CT, MRI, large-scale measurements, simulations, etc.)
Multi-run / ensemble simulations (repeated with varied parameter settings)
Multi-modal scenarios (e.g. coupled climate model)
Cartography, geovis, etc.
Linear vs cyclic time
automatic animations
Flow visualization
Visualize summary statistics
Multi-variate / Multi-field data
-> comes from 1 simulation / measurement device
-> multiple data attributes, e.g. temperature or pressure
Attribute views (scatterplots, parallel coordinates, etc.)
Find patterns such as correlations or outliers
lack spatial relationships of data
which of the many data variables to show?
Volume rendering
Difficult to see multi-variate patterns
Layering and glyphs
Feature-based visualisation (brushing, segmentation,…)
Clustering, dimensionality reduction, etc.
Multi-modal data
-> comes from different data sources / different modalities
-> CT, MRI, large-scale measurements, simulations, etc.
Various types of grids with different resolution
Coregistration and normalization
Multi-volume rendering
Visual data fusion
Comparative visualization
How are visualization, interaction & computer analysis combined?
Comparative visualization taxonomy
Side-by-side comparison (juxtaposition)
Overlay in same coordinate system (superposition)
Explicit encoding of differences / correlations
Navigation
Change item visibility
Change which items are visible
Camera metaphor
Zoom, pan, rotate (3D)
Automated viewpoint selection
-> Guided navigation between characteristic views
-> Based on information-theoretic measures
Compute a rating for each view v_i and object o_j
Optimal viewpoint estimation based on object visibility, location in image, and distance to viewer
Animated transition
View 1 optimal for o1 (emphasize o1)
Overview optimal for o1 and o2
View 2 optimal for o2 (emphasize o2)
Ranking / quality metrics
Automatically order views / axes by quality metrics
Enhance clustering, correlations, outliers, image quality, etc.
Overview + detail visualization
Spatially separate overview / detail (e.g. juxtaposed views)
User has to switch attention between representations
Focus + context (F+C) visualization
Seamlessly integrates focus / context in single visualization
Originally spatial distortion was used
More space for focus
Keep context, without cropping away data outside of zoom area
Generalized F+C visualization
emphasize data in focus
keep context for orientation / navigation
focus specification, e.g. by pointing, brushing or querying
Line brush
Select function graphs that intersect a user-specified line
Similarity-based brushing
Select function graphs by similarity to user-sketched pattern
Similarity evaluated based on gradients (1st derivative)
Machine Learning Approaches
Supervised learning: learning with a labeled training set
Unsupervised learning: discovering patterns in unlabeled data
Reinforcement learning: Learning based on feedback or reward
Semi-Automatic Labeling Tool (SALT)
Labeling of large time series data by domain experts
Integrates supervised & unsupervised segmentation methods
User can iteratively improve labeling
Algorithmic extraction of values & patterns
Dimensionality reduction
Aggregation, summary statistics
Clustering, classification, outliers, etc.
Clustering
Given some data points, we’d like to understand their structure
Given a set of data points with some notion of distance between points, group them into clusters such that
members of a cluster are close / similar to each other
members of different clusters are dissimilar
Usually
points are in high-dimensional space
similarity defined by a distance measure (e.g. Euclidean)
Clustering is a hard problem: many applications involve 10 to 10,000 dimensions, and in high-dimensional spaces almost all pairs of points are at similar distances
Cluster Calendar View
Time series clustered by similarity (K-means)
Cluster affiliation of daily pattern shown in calendar
Density-based Clustering (DBSCAN)
Identify dense regions in data
Clusters can be arbitrarily shaped
Difficult to find good parameter settings
2 parameters: Radius epsilon and Number of Minimum Points MinPts
core point: epsilon-neighborhood contains at least minimum number (MinPts) of points
border point: in the epsilon-neighborhood of core point
noise: neither a core object nor a border object
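A short sketch using scikit-learn's DBSCAN on toy data; the eps and min_samples values are illustrative and in practice hard to choose, as noted above.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
blob1 = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
blob2 = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(50, 2))
X = np.vstack([blob1, blob2, [[10.0, 10.0]]])   # two dense regions + one outlier

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
print(set(labels))   # e.g. {0, 1, -1}: two clusters, -1 marks noise points
```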
Dimensionality reduction
Derive a low-dimensional target space from the high-dimensional measured space
use when you can’t directly measure what you care about
True dimensionality of dataset assumed to be smaller than dimensionality of measurements
Latent factors, hidden variables
Principal Component Analysis (PCA)
Find directions of largest variance
neglect directions of small variance (not descriptive)
Coordinate system transformation (rigid rotation)
Result:
new axes (eigenvectors) & explained variances (eigenvalues)
New axes usually don’t mean anything physical
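A minimal PCA sketch via eigendecomposition of the covariance matrix; the toy data with one nearly redundant axis is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X[:, 2] = 2 * X[:, 0] + 0.01 * rng.normal(size=200)   # 3rd axis nearly redundant

Xc = X - X.mean(axis=0)                   # center the data
cov = np.cov(Xc, rowvar=False)            # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)    # new axes = eigenvectors
order = np.argsort(eigvals)[::-1]         # sort by explained variance (eigenvalues)

k = 2                                     # keep the two largest-variance directions
proj = Xc @ eigvecs[:, order[:k]]         # low-dimensional representation
print(eigvals[order])                     # variance explained per new axis
```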
What is the main goal of visual exploration?
start with unknown dataset -> don’t know which features are in the data set, we want to explore it to find some hidden / unknown information
Purpose: to generate new hypotheses -> which we can verify / reject later in visual analysis step
Which three major areas / concepts are combined in visual analysis / analytics?
Visualization (e.g. Information / Scientific Vis., Computer Graphics)
Interaction techniques (e.g. Human Computer interaction)
Data Analysis (e.g. machine learning, data mining)
-> Visual Analytics aims to combine those areas in different approaches
Give examples how multivariate data can be encoded in a spatial context?
2 ways to show multivariate data:
using attribute views (scatterplots, parallel coordinates,…) that mainly focus on the attributes -> problem: we don't see the spatial context in these views
using volume rendering: show the data in its spatial context, e.g. combined with glyphs or layering
What are challenges when fusing multi-modal data stemming from different data sources?
here we usually have data coming from different sources (e.g. tumor from an MR scan, vessels from an MRA scan, skull from a CT scan,…)
this means the data can be given on various types of grids with different resolutions
challenge:
what should be hidden and which data should be visible and shown to the user
comparison of multiple modalities -> need different techniques for that
Visual data fusion intermixes data in a single visualization using a common frame of reference. Give at least two general approaches.
Layering techniques (e.g. glyphs, color, transparency)
Multi-volume rendering (coregistration, segmentation)
Helix glyphs
What are three general approaches for comparative visualization (according to the taxonomy of Gleicher et al. 2011)?
side-by-side comparison (juxtaposition)
overlay in the same coordinate system (superposition)
explicit encoding of differences / correlations
What is focus+context visualization? Explain the general approach. How is it different from an overview+detail visualization?
idea: to combine both the important area / focus and the context information in one single visualization
e.g. highlight with color, opacity / transparency, blurring, enlargement of focus
in overview + detail visualization the overview and the detail are shown next to each other (spatially separated)
issue: user has to switch attention between the representations
Give at least three examples of visual channels (graphical resources) that can be used for focus+context discrimination.
style
frequency / blurring
opacity / transparency
fisheye views
Give two examples for focus+context visualization techniques which use spatial distortion.
e.g. fisheye views and bifocal / perspective-wall displays
What is the main idea in clustering? Is clustering a supervised or unsupervised method?
unsupervised method -> tries to find structures in the dataset by itself
given a dataset with data points and some notion of distance between them, group them into clusters -> similar data points should be grouped together in a cluster, and dissimilar data points should be in different clusters
What is the main idea in dimensionality reduction? Name one example method? How does it work?
idea: given some high-dimensional data, it is often possible to derive a low-dimensional target space in which it is easier to find the interesting features
true dimensionality of dataset assumed to be smaller than dimensionality of measurements
example method: principal component analysis
Principal component analysis transforms data from a cartesian coordinate system into another coordinate system. Why is it then still considered a dimensionality reduction method?
transformation of the coordinate system
since the new axes are aligned with the directions of largest variance, it is easier to distinguish / find groups, and we can dismiss the principal components with small variance
Cartesian / equidistant grid
Samples at equidistant intervals along Cartesian coordinate axes
Neighboring samples are connected via edges
Cells formed by 4 (2D) or 8 (3D) samples
Cells and samples (grid vertices) are numbered sequentially with respect to increasing coordinates
It is a structured grid:
Neighboring information (topology) is given implicitly
Neighbors obtained by incrementing / decrementing indices
Structured Grids:
Uniform / Regular grid: orthogonal and equidistant grid
Rectilinear grid: varying sample-distances
Curvilinear grid: non-orthogonal grid, Grid-points specified explicitly, implicit neighborhood relationship
Unstructured Grids: grid points and neighborhood specified explicitly; Cells: tetrahedra, hexahedra