Buffl

Multimodal Interaction

by Lynn H.

Explain the difference between Icon, Index and Symbol

Icon = sign that relates to an object through its similarity to it

Index = sign that relates to an object through a direct causal relationship (e.g. smoke relates to fire)

Symbol = sign that relates to an object through an arbitrary or conventional relationship (e.g. a letter to a sound)

Explain the semiotic relationship of signs

Carrier: the physical sign displaying the symbol

Meaning: what is the message?

Object: what is depicted?

Explain the difference between earcon and auditory icon

earcon: abstract, synthetic tone -> used to create sound messages -> UI, learning

auditory icon: everyday sound, e.g. recorded in nature -> analogy with events -> functional feedback, getting attention

What are the pros and cons of earcons?

Earcons

Pro: Can be used to design and control the desired effect

Con: No association with everyday events, so the meaning needs to be learned
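
The "design and control" point can be illustrated with a minimal sketch that builds an earcon from a few synthetic sine tones. The function names, the note frequencies and the two motifs are assumptions chosen for illustration, not part of the course material.

```python
import math

def tone(freq_hz, duration_s, sample_rate=8000, fade_s=0.01):
    """One sine tone as a list of samples in [-1, 1], with a short linear fade in/out."""
    n = round(duration_s * sample_rate)
    fade = max(1, round(fade_s * sample_rate))
    samples = []
    for i in range(n):
        # Envelope rises from 0, stays at 1, and falls back to 0 to avoid clicks.
        env = min(1.0, i / fade, (n - 1 - i) / fade)
        samples.append(env * math.sin(2 * math.pi * freq_hz * i / sample_rate))
    return samples

def earcon(freqs_hz, note_s=0.125, sample_rate=8000):
    """Concatenate one tone per frequency into a short abstract motif."""
    out = []
    for f in freqs_hz:
        out.extend(tone(f, note_s, sample_rate))
    return out

# Hypothetical motifs: a rising pattern for "success", a falling one for "error".
RISING = earcon([440.0, 554.37, 659.25])
FALLING = earcon([659.25, 554.37, 440.0])
```

Every parameter (pitch, rhythm, length) is under the designer's control, which is the advantage noted above. The mapping of "rising" to "success" is arbitrary, so users have to learn it.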

What are the two levels of typography?

Microtypography

Small-scale text composition (serifs, ligatures, font, height)

Macrotypography

composition of text on paper/screen

Proportional font vs. Non-proportional font

proportional: every character has an individual width

non-prop.: each character has the same width (e.g. Courier)

Why shouldn't we utilize styles like this too much?

It looks inhomogeneous and disrupts the reader's attention

What are Kerning and Line Spacing?

Kerning is the adjustment of space between individual letter pairs to create visually balanced and pleasing text.

Line spacing is the vertical space between lines of text

What are the steps to Symbolic pre-processing?

Describe the steps of a full Text-to-Speech System

(Explain all the steps of TTS using the example of a voice assistant reading out an email)

Graphical source-filter

Complete the labels of the Source Filter Model

How does PSOLA work?

Segmentation (Analysis):
- The speech signal is segmented into short overlapping frames around pitch marks (periodic instants like glottal pulses in voiced sounds).
- These frames are usually centered around pitch periods to maintain the natural periodicity of the speech.
Modification:
- To change pitch, the algorithm shifts the spacing between segments.
- To change duration, it either repeats or skips certain segments.
Reconstruction (Synthesis):
- The modified frames are overlapped and added together, aligning them at pitch-synchronous points to produce a smooth signal.

Because the signal is manipulated, the result can sound a bit unnatural -> longer units and more units are used
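
The three PSOLA stages can be sketched in a few lines of plain Python. This is a simplified illustration, not a production implementation: it assumes the pitch marks are already given, that the pitch period is roughly constant, and the names `td_psola`, `pitch_factor` and `duration_factor` are made up for this example.

```python
import math

def hann(n):
    """Symmetric Hann window with n samples (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def td_psola(signal, marks, pitch_factor=1.0, duration_factor=1.0):
    """Simplified time-domain PSOLA. marks: sorted sample indices of pitch marks."""
    # Analysis: frames are two (average) pitch periods wide, centred on a mark.
    gaps = [b - a for a, b in zip(marks, marks[1:])]
    period = round(sum(gaps) / len(gaps))  # assumption: near-constant period
    window = hann(2 * period + 1)

    # Modification: new mark spacing sets the pitch, output length sets the duration.
    new_period = max(1, round(period / pitch_factor))
    out = [0.0] * round(len(signal) * duration_factor)
    norm = [0.0] * len(out)

    # Synthesis: place frames at the new marks and overlap-add them.
    t = round(marks[0] * duration_factor)
    while t < len(out):
        source_time = t / duration_factor
        # Nearest analysis mark; frames get repeated or skipped automatically.
        m = min(marks, key=lambda x: abs(x - source_time))
        for k in range(-period, period + 1):
            i, j = m + k, t + k
            if 0 <= i < len(signal) and 0 <= j < len(out):
                out[j] += signal[i] * window[k + period]
                norm[j] += window[k + period]
        t += new_period

    # Divide by the summed windows so overlapping frames do not change loudness.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
```

With both factors at 1.0 the frames land exactly where they came from and the input is reproduced. A `pitch_factor` above 1 packs the frames closer together (higher pitch), and a `duration_factor` above 1 reuses frames (longer signal).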

Describe these 4 signal generation methods:

Parametric synthesis
Concatenative synthesis
Unit-Selection synthesis
Statistical synthesis

Parametric Synthesis: Generates speech by manipulating parameters (like frequency, pitch, and formants) using a mathematical model, often resulting in robotic or unnatural-sounding speech.
Concatenative Synthesis: Constructs speech by piecing together small recorded segments of human speech (units), aiming for more natural-sounding output.
Unit-Selection Synthesis: A type of concatenative synthesis that selects the best-matching units from a large speech database based on target and concatenation costs for high naturalness.
Statistical Synthesis (e.g., HMM-based): Uses statistical models trained on speech data to generate speech waveforms, balancing flexibility and intelligibility, though often with smoother but less expressive output.
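
The target and concatenation costs in unit selection can be made concrete with a small dynamic-programming sketch. The unit format `(phone, pitch_hz, database_index)`, both cost functions and the toy data are assumptions for illustration only; real systems use much richer features.

```python
def select_units(targets, candidates, target_cost, concat_cost):
    """Pick one unit per target so that the summed target + concatenation cost is minimal."""
    # best[i][j] = (cheapest cost of a path ending in candidates[i][j], backpointer)
    best = [[(target_cost(targets[0], u), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + concat_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(targets[i], u), back))
        best.append(row)

    # Backtrack from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total

# Toy costs (hypothetical): pitch mismatch to the target, and a fixed penalty
# for joining units that were not adjacent in the recorded database.
def target_cost(target, unit):
    return abs(target[1] - unit[1])

def concat_cost(prev, unit):
    return 0 if unit[2] == prev[2] + 1 else 20

targets = [("a", 100), ("b", 110)]
candidates = [
    [("a", 100, 0), ("a", 105, 5)],
    [("b", 140, 1), ("b", 110, 6)],
]
path, total = select_units(targets, candidates, target_cost, concat_cost)
# path == [("a", 105, 5), ("b", 110, 6)], total == 5
```

In this toy case the search prefers the slightly off-pitch "a" because it joins without a penalty to the best "b", which shows why both cost types are considered together rather than picking the best unit per target in isolation.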

Chapter 9: Multimodal Output Systems
