Quality: definition
The result of a process of perception and assessment in which the assessor compares the perceived characteristics of a unit with individual expectations, appropriate requirements or social demands (Jekosch)
Requires a process of perception and assessment (Beurteilung).
Can only be measured with perceiving and judging subjects (e.g. test subjects)
Occurs in a specific context → „quality event“
(therefore) Depends on the perception and assessment situation.
Perceived Quality
Totality of features of an entity. Signal for the identity of the entity visible to the perceiver. (Jekosch)
Desired Condition:
Totality of features of individual expectations and/or relevant demands and/or requirements. (Jekosch)
Feature:
Recognizable and nameable characteristic of an entity.
Quality Feature:
A recognized characteristic of a unit that can be named and is relevant to the quality of the unit. (Jekosch)
Quality Element:
Contribution to the quality of an intangible or tangible product and/or activity or process in at least one phase of the quality circle. (Jekosch)
Quality of Service:
(view of system developer) the collective effect of service performance which determines the degree of satisfaction of the user of the service. Contains Service support, service operability, serveability, service security
Performance (view of the system developer):
The ability of a unit to provide the function it has been designed for
Quality of Experience: (View of the user)
The overall acceptability of an application or service, as perceived subjectively by the end user. Includes the complete end-to-end system effects. May be influenced by user expectations and context.
Usability:
The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.
Effectiveness:
The accuracy and completeness with which specified users can achieve specified goals in specified environments.
Efficiency
The resources expended in relation to the accuracy and completeness of the goals achieved.
Satisfaction
The comfort and acceptability of the system to its users and other people affected by its use
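The definitions above do not prescribe formulas. A minimal sketch of one common way to put numbers on effectiveness and efficiency, assuming effectiveness is taken as the task completion rate and efficiency as effectiveness per minute of user time (both function names and both operationalizations are assumptions for illustration):

```python
def effectiveness(completed_tasks, attempted_tasks):
    """Share of attempted tasks that were completed successfully (0..1)."""
    if attempted_tasks <= 0:
        raise ValueError("attempted_tasks must be positive")
    return completed_tasks / attempted_tasks


def efficiency(effectiveness_value, minutes_spent):
    """Effectiveness achieved per minute of user time (assumed resource measure)."""
    if minutes_spent <= 0:
        raise ValueError("minutes_spent must be positive")
    return effectiveness_value / minutes_spent


# Hypothetical session: 8 of 10 tasks solved in 20 minutes.
eff = effectiveness(8, 10)
print(eff, efficiency(eff, 20.0))
```

Satisfaction is not computed here; it is usually collected with a questionnaire.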
taxonomy of quality aspects
Psychophysics
investigates the relationship between physical quantities and perceptions
Psychometrics:
quantitative description of perceptual variables → measurement with human test subjects (a psychophysical measurement)
Physical event: characteristics
spatially, temporally & property determined
Perceptual event: 2 examples
auditory event, visual event
Explain the diagram of passive measurement
- A physical event is perceived and described by a test subject.
- A perceptual event takes place inside the test subject; it sits between the physical event and the description and links the two
- The boxes indicate that conversion processes take place between the physical event, the perceptual event and the description (they are not identical in quality or quantity)
Relationship between the events and scales that occur in a psychophysical measurement
- The letters S, H and B stand for physical, perceptual and description
- Left: the basic sets of the events
- Right: the scales of the events
- We are usually interested in the relationship between the physical event and the perception (S and H), i.e. h = f(s)
- B0 = H
- We measure the physical event and the description, i.e. s = f(s0) on the one hand and b0 = f(s0) on the other, where the test subject embodies the perceptual relationship h0 = f(s0)
- The test subject is therefore both the perceiving and the judging measuring organ
Which three types of measurement error can occur in a psychophysical measurement?
- The measurement inaccuracy of the physical measuring instrument
- The measurement inaccuracy of the psychophysical measuring organ
- Fluctuations in the perceiving measuring organ
Extended test-subject schema: explain
- Identical to the other schema up to the perceptual event
- Then the path splits, and a quality event arises through comparison with the reference (desired composition)
- Possible description of quality or of a quality feature
What is the “reference”?
- Includes all aspects of „individual expectations, appropriate requirements or social demands“
- Important for the selection of test subjects
- There are three levels: physical (volume level), psychophysical (roughness), semantic (aircraft noise)
- For determining system quality: psychological, semantic and functional references
Measurement:
All activities in the entire measurement chain to determine the value of a measurand.
Measured variable:
characteristic of the measured object, which is described numerically in the course of the measurement
Scaling:
the entirety of all activities that relate specifically to the process of assigning a value of a measured variable, the carrier of which is the measured object, to a corresponding scale value (measured value) according to predefined rules
Validity:
suitability of a measurement method with regard to its objective
Reliability: (+ 3 types)
reliability of a measurement
Parallel test reliability: correlation of two comparable measurements
Retest reliability: correlation when the measurement is repeated
Internal consistency: correlation of partial results of the same measurement
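A minimal sketch of how these reliability types could be quantified, assuming the Pearson correlation for retest and parallel test reliability and Cronbach's alpha for internal consistency (the notes only say "correlation", so both choices and all function names are assumptions):

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long rating series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sample_variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(items):
    """items: one list of scores per item, subjects in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(sample_variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance / sample_variance(totals))


# Hypothetical ratings of five stimuli in a first and a repeated session.
first_session = [3, 4, 2, 5, 4]
second_session = [3, 5, 2, 4, 4]
print(pearson(first_session, second_session))
```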
Objectivity:
Degree of interpersonal agreement of measurements (dependence on the experimenter)
Other criteria of good science
Economy, standardization, usefulness, comparability
Conclusion by analogy: what is it and for what step is it important?
The physiological-psychological parallelism of the process that I observe in myself justifies me in concluding that a fellow human being whose physiological and psychological conditions are analogous to mine also experiences something analogous to me in the same physiological events.
→ Representative selection of test subjects
Types of measurements (name 5)
- Observation procedures: e.g. investigation of the reaction of test subjects to auditory events or visual events
- Assessment procedure: e.g. description of the quality of speech samples by test subjects on predefined scales
- Instrumental methods: e.g. time or length measurement
- Calculation method: calculation of the weighted sound pressure level in dB(A)
- Statistical estimation method: consideration of statistical variables
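As an illustration of a calculation method, the sketch below computes an unweighted sound pressure level and the A-weighting correction for a single frequency. It assumes the usual reference pressure of 20 µPa and the analytic A-weighting curve commonly given for IEC 61672; neither is stated in these notes, and the function names are hypothetical.

```python
import math


def sound_pressure_level_db(pressure_pa, reference_pa=20e-6):
    """Sound pressure level in dB relative to the assumed 20 µPa reference."""
    return 20.0 * math.log10(pressure_pa / reference_pa)


def a_weighting_db(frequency_hz):
    """A-weighting correction in dB for one frequency (analytic curve)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    f2 = frequency_hz ** 2
    numerator = 12194.0 ** 2 * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to about 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.00


# Hypothetical 100 Hz tone with an RMS pressure of 0.02 Pa.
level = sound_pressure_level_db(0.02)
print(level + a_weighting_db(100.0))
```

For a broadband signal the correction would be applied per frequency band before summing the band energies; the sketch only covers the single-tone case.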
User types can be classified according to:
- Perceptual properties
- Behavioral characteristics
- Experience
- Motivation (pro/private, sporadic/regular)
- Individual preference, skills and knowledge
User expertise cube by Nielsen: What are the dimensions?
two curves about willingness to accept innovations: describe how they look
What are other ways to describe user types?
Sinus-Milieus
Hermann User Types (ICT-Enthusiast, ICT-indifferent, etc…)
4 ways to classify psychometric methods
- Scale level
- Presentation method (production method, constant method)
- Modality of the experiment (hearing test, conversation test etc.)
- Indirectness of measurement (indirect: thresholds, direct: mapping between physical and perceptual event scales)
Classical Psychophysics: what are the 3 main logics
- Perceptibility thresholds: nominal level
o 50% say feature is there
o 50% say it's not there
- Difference Thresholds: nominal or ordinal
o 50% say equal
o 50% say unequal
- Points of equal perception: ordinal
o 50% say larger
o 50% say smaller
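All three cases look for the stimulus level at which the answers split 50/50. A minimal sketch of estimating that point by linear interpolation between measured levels (the function name and the interpolation approach are assumptions; in practice a psychometric function is often fitted instead):

```python
def threshold_50(levels, proportions, target=0.5):
    """Stimulus level at which the response proportion crosses `target`.

    levels must be sorted in ascending order; proportions[i] is the share
    of subjects giving the answer of interest at levels[i].
    """
    points = list(zip(levels, proportions))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 == target:
            return x0
        if (p0 - target) * (p1 - target) < 0:
            # Linear interpolation between the two neighbouring levels.
            return x0 + (target - p0) * (x1 - x0) / (p1 - p0)
    if points and points[-1][1] == target:
        return points[-1][0]
    raise ValueError("target proportion is not crossed by the data")


# Hypothetical detection data: level in dB, share of "feature is there".
print(threshold_50([0, 10, 20], [0.1, 0.4, 0.8]))
```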
8 steps of scientific procedure
1. Determining the objective of the measurement
2. Specification of the measurement object
3. Definition of measured variables (black box vs. Glass box)
4. Determination of measurement environment (lab, field)
5. Specification of measurement method
6. Specification of experiment details (selection of stimuli, subjects)
7. Planning the test procedure
8. Data analysis
Difference between “between subject” and “within” design
- Between subjects: each subject tests only one system (-variant)
- Within subjects: each subject tests all systems (full factorial or partial factorial design)
What is a Graeco-Latin square, and when is it used?
- Full factorial design
- For a small number of stimuli and influencing factors
- All influencing factors and positions in the test are combined with one another
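A minimal sketch of building such a square, assuming the simple modular construction that works for odd orders only. Reading rows as subject groups, columns as positions in the test, and the two symbols in a cell as, for example, system variant and speech sample is a hypothetical mapping chosen for illustration.

```python
def graeco_latin_square(n):
    """Return an n x n square of (first, second) symbol pairs.

    Each symbol occurs once per row and column in both components,
    and every (first, second) pair occurs exactly once in the square.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("this simple construction needs an odd order n >= 3")
    return [[((i + j) % n, (i + 2 * j) % n) for j in range(n)] for i in range(n)]


# Each printed row is the presentation order for one subject group.
for row in graeco_latin_square(3):
    print(row)
```

Even orders need a different construction, and no Graeco-Latin square exists for order 6.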
What is the basic rule of scaling (skalierung)?
There is no one „ideal“ scale for all applications
- Select a suitable scale depending on the measuring task!
what are the types of scales and their characteristics?
- Nominal scales: scale elements represent identities; no relations defined between the identities
- Ordinal scales: scale elements have identity and rank order; however, ranks are not necessarily equidistant
- Interval scales: scale elements have identity, rank order and additivity, but no absolute zero
- Ratio scales: scale elements have identity, rank order, additivity and absolute zero; this allows ratios to be scaled
What is Magnitude estimation (ME) how does it work? other methods? Disadvantages?
- A way of Ratio-Scaling
- Goal is determining the „absolute size“ of a perception
- Task: Assign numbers to perceptions so that relationships are maintained
o Reference stimulus: loudness 10
o If the next is twice as loud: 20
o Half as loud: 5
- Other methods: sum of constant ratios, determination of a line length, magnitude production
- Disadvantages: numbers only have relative significance, no anchoring in „world knowledge“
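Because the numbers only carry ratio information, magnitude estimates are typically analyzed as ratios to the reference. A minimal sketch, assuming the geometric mean is used to pool estimates across subjects (a common choice for ratio data, not something these notes specify; function names are hypothetical):

```python
import math


def ratios_to_reference(estimates, reference_value):
    """Express each estimate as a multiple of the reference stimulus."""
    return [estimate / reference_value for estimate in estimates]


def geometric_mean(values):
    """Pool positive ratio-scale values without distorting the ratios."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Reference loudness 10; one subject answers 20 ("twice") and 5 ("half").
print(ratios_to_reference([10, 20, 5], 10))
# Two subjects judge the same stimulus as 5 and 20.
print(geometric_mean([5, 20]))
```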
What is category scaling?
- Assignment of stimuli to categories
- Possibilities
o Absolute category rating (ACR)
o Degradation category rating (DCR)/ Comparison category Rating (CCR)
§ Assignment of the difference between two stimuli to categories
What is a popular scale for overall quality?
- 5-level category scale for overall quality
- Mean value of judgements
- Mean opinion score (MOS)
- MOS-Scale
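A minimal sketch of computing a MOS from ACR ratings, assuming the categories are coded 5 (excellent) to 1 (bad) and assuming a normal-approximation 95% confidence interval; the interval and the function name are additions for illustration.

```python
import math


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    mos = sum(ratings) / n
    std = math.sqrt(sum((r - mos) ** 2 for r in ratings) / (n - 1))
    # 1.96 is the normal quantile; for few subjects a t quantile is safer.
    return mos, 1.96 * std / math.sqrt(n)


# Hypothetical judgements of one stimulus by five subjects.
mos, half_width = mean_opinion_score([4, 4, 5, 3, 4])
print(mos, half_width)
```

Averaging treats the category scale as an interval scale, which is only an approximation.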
What are problems with category scaling?
- Interpretation of attributes can vary
- Context effects: influence of previous stimuli
- Intervals are not equidistant (only ordinal scale)
- Tendency away from extremes / or saturation if extreme was used prematurely
What is a possible improvement to the problems with category scaling?
- Illustrates equidistance between the categories
- Continuous
- Drawn with thin lines beyond the end points for even more extreme values (counteracts the saturation effect)
what are CR-Scales
Category ratio scaling
a combination of ratio aspects and absolute scaling
Example: Borg CR10 scale
How do Likert scales work?
- Agreement or disagreement with a statement
- Elicits the respondent's attitude
- Continuous or simple category scale
what/ Why multidimensional Analysis?
Two different types
- Identification of perceptually relevant dimensions
- Consists of a data collection procedure and a data analysis procedure
- Two possibilities:
o Similarity assessment and multidimensional scaling
o Semantic differential and principal component analysis
Similarity assessment and multidimensional scaling:
- Rate pairs of stimuli with regard to their perceptual distance (scale: similar/dissimilar)
- Transform the dissimilarities into Euclidean distances in a description space (multidimensional scaling)
- The result is an arrangement of the stimuli in the space, together with their factor loadings
- Interpret the dimensions afterwards
- Advantage: no features are specified in advance
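A minimal sketch of the transformation step, assuming classical (Torgerson) multidimensional scaling with a simple power iteration in place of a full eigendecomposition. The notes do not say which MDS variant is meant, so the algorithm and all names here are assumptions.

```python
import math


def double_center(dist):
    """Turn a symmetric dissimilarity matrix into a centered Gram matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dominant_eigenpair(m, iterations=500):
    """Power iteration; returns (eigenvalue, unit eigenvector)."""
    n = len(m)
    v = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigenvalue, v


def classical_mds(dist, dims=2):
    """Coordinates of each stimulus in a `dims`-dimensional space."""
    b = double_center(dist)
    n = len(b)
    axes = []
    for _ in range(dims):
        lam, v = dominant_eigenpair(b)
        scale = math.sqrt(max(lam, 0.0))
        axes.append([scale * x for x in v])
        # Deflation: remove the found dimension before searching the next.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return [[axis[i] for axis in axes] for i in range(n)]


# Hypothetical averaged dissimilarity ratings for three stimuli.
print(classical_mds([[0, 1, 3], [1, 0, 2], [3, 2, 0]], dims=1))
```

The signs and the orientation of the recovered axes are arbitrary, which is why the dimensions still have to be interpreted afterwards.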
Semantic differential and principal component analysis:
- Predefined number of scales (ideally taken from a pilot test)
- Rate every stimulus on all scales: create a polarity profile
- Principal component analysis: the space spanned by the scales is reduced to a small number of dimensions
- Easier to interpret, because the dimensions come with descriptive attributes
- Disadvantage: only quality features defined on the scales in advance can be captured
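A minimal sketch of the polarity profile step only, assuming the profile is the mean rating per bipolar scale for one stimulus (the scale names and the -3..+3 coding are hypothetical). The subsequent principal component analysis is not shown.

```python
def polarity_profile(ratings):
    """ratings: {scale name: list of ratings, one per subject}.

    Returns the mean rating per scale for a single stimulus.
    """
    return {scale: sum(values) / len(values) for scale, values in ratings.items()}


# Hypothetical ratings of one speech sample on two bipolar scales.
profile = polarity_profile({
    "dark-bright": [1, 3],
    "rough-smooth": [-2, 0],
})
print(profile)
```

Stacking the profiles of all stimuli gives the stimulus-by-scale matrix on which the principal component analysis would then be run.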
What is preference mapping?
+ two models:
- Rate the dimensions from the previous methods, e.g. on a MOS scale
- Describes their importance for quality
- Vector model:
o Monotonic relationship between dimension value and quality (e.g. noise in speech signals)
- Ideal point model:
o Ideal at one point; larger or smaller values impair quality (e.g. loudness)
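The difference between the two models can be sketched with two small functions, assuming a linear form for the vector model and a quadratic penalty for the ideal point model. Both functional forms and all parameter values are illustrative assumptions, not fitted results.

```python
def vector_model(value, slope, offset):
    """Quality changes monotonically with the dimension value."""
    return offset + slope * value


def ideal_point_model(value, ideal, best_quality, penalty):
    """Quality peaks at `ideal` and drops on both sides of it."""
    return best_quality - penalty * (value - ideal) ** 2


# Hypothetical noise dimension: more noise, lower quality.
print([vector_model(noise, -0.5, 4.5) for noise in (0, 2, 4)])
# Hypothetical loudness dimension with an ideal level of 70.
print([ideal_point_model(level, 70, 4.5, 0.01) for level in (60, 70, 80)])
```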
6 steps for evaluating scaled measurement results:
1. Recoding into scale values (numbers)
2. Data entry (e.g. in SPSS)
3. Descriptive Analysis
4. Mean value comparisons
a. T-test
b. One-factor ANOVA
c. Multi-factor ANOVA
d. MANOVA
5. Correlation analysis
6. Regression analysis
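As an illustration of step 4a, the sketch below computes the test statistic for comparing the mean ratings of two systems. It assumes a between-subjects design and Welch's variant of the t-test (unequal variances); the notes name only "T-test", so the variant is an assumption. Looking up the p-value for the returned t and degrees of freedom is left to a statistics package.

```python
import math


def sample_variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each sample needs at least two values")
    va = sample_variance(a) / len(a)
    vb = sample_variance(b) / len(b)
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t = (sum(a) / len(a) - sum(b) / len(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical ratings of system A and system B by different subjects.
print(welch_t([4, 5, 4, 5], [2, 3, 2, 3]))
```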
When does usability engineering begin and end?
- Start: before the design
- End: after installation / only with actual use
What does the usability cycle look like?
What 4 things does the analysis include
- User with his characteristics and requirements
- Task
- System as well as comparable and traditional systems
- Environment
important aspects of a user?
- Experience, level of education, age, influence design decisions
- User is a „dynamic system“ evolving from novice to expert over time
- Users often use systems in unintended ways
What is meant by context?
- Physical context
- Social context (for example office)
why care about comparable systems?
- Conventional solutions
o May influence mental model
o Show strategies and shortcuts
o Show things that the new system should support
- Influence the reference against which the system is later measured
Target values, how to determine
- Determination of the values:
o Usability experts (compromises necessary)
o Users (often incorrect assessments)
Interpreting the pie chart:
- The size of a slice is the importance of the aspect
- The circle line indicates the target value
Two types of design
two important things to keep in mind
- Parallel Design
- Iterative Design
- Consistency within system (with other systems) through standardized interfaces and project standards (but usability is more important)
- Participatory design
3 types of prototyping
- Vertical prototyping: only part of the system features, in full depth
- Horizontal prototyping: the full breadth of system features, but only superficially
- Scenario-based: selected features, superficially
What is a Wizard of Oz simulation?
- Used when not all necessary modules are available
- A human (the wizard) simulates a function or component of the system
What is a cognitive walkthrough?
- Specification of a user, a task and a context of use
- Definition of ideal solution path
- Usability expert goes through the interaction step by step
- Questionnaire
what is pluralistic usability walkthrough?
- Use of a group of users, developers and usability experts
- Users bring domain knowledge
- Developers bring knowledge of basic restrictions
what is heuristic evaluation?
- A group of experts looks for problems and assigns them to the heuristics
- Different heuristics for different applications
- General heuristics
- Category-specific heuristics
- Product-specific heuristics
- „discount usability engineering method“
What is the most important and reliable method of empirical testing?
test with real users!
Advantages(4) and disadvantages(2) of tests with real users
Advantages:
- Real user behavior usually unpredictable even by usability experts
- Shows „real“ problems (not assumed)
- Quantification of problems possible
- Influence on satisfaction detectable
Disadvantages:
- High effort (costs, time)
- Availability of users
What are different requirements in empirical testing for research and practice? (hint: think about the objective of each)
- Research:
o Verification of hypothesis
o Statistical validation → high number of test subjects
- Practice:
o Detection of design problems
o No statistical significance necessary (3-6 test subjects)
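One commonly cited rationale for such small samples is the problem-discovery model attributed to Nielsen and Landauer, found(n) = 1 - (1 - p)^n. The sketch below assumes that model and their often-quoted average detection probability p = 0.31; neither appears in these notes, and p varies considerably between projects.

```python
def share_of_problems_found(subjects, detection_probability=0.31):
    """Expected share of design problems seen by at least one test subject."""
    if subjects < 0:
        raise ValueError("subjects must not be negative")
    if not 0.0 <= detection_probability <= 1.0:
        raise ValueError("detection_probability must be between 0 and 1")
    return 1.0 - (1.0 - detection_probability) ** subjects


# Diminishing returns for 3 to 6 test subjects under the assumed p.
for n in range(3, 7):
    print(n, round(share_of_problems_found(n), 2))
```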
Two types of analysis depending on the objective:
- Summative analysis:
o Overall quality of system
o Recording of various metrics
o Usually at the end of the design phase
- Formative analysis:
o Identification of design errors
o Submission of proposed solutions
o Mostly during the design phase
2 other methods with real users
- Thinking aloud (experimenter in the room)
- Focus groups (Discussion concerning system design)
What are 5 methods for “Feedback from the field”?
- Standard market research: regular surveys, telephone interviews
- Special studies: with specific objectives
- Analysis of log data: recordings of real interactions
- Analysis of secondary data: hotlines, complaints, change requests
- Analysis of monetary data
Quality characteristics from multidimensional analysis
older results (5)
- Comprehensibility or clarity
- Naturalness or fidelity or speaker recognizability
- Loudness
- Timbre
- Differentiation between background noise/voice signal interference
recent results (4)
- Sound coloration, frequency content or directness
- Continuity
- Noisiness
Further Quality features
- Ease of communication and conversation effectiveness
o Hearing strain, e. g. if the signal is too quiet, or background noise
o Speech effort (listening, echoes)
- Impairment of conversation effectiveness due to delays, depends on
o Interlocutor
o Conversational situation
o Motivation of the conversation
what is advantage of access?
- Examples:
o Mobile telephony
o Cordless telephony
o Calls to hard-to-reach regions
- Quantification (rule of thumb): Advantage Factor A
o Advantage corresponds to approximately half of the typical impairment of the corresponding system
o Calculation examples
§ Mobile radio encoder with Ie = 20 → A = 10
§ Satellite connection with delay impairment Id = 40 → A = 20
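The rule of thumb and the two calculation examples can be reproduced with a one-line function (the function name is hypothetical). In the E-model of ITU-T G.107 the advantage factor A is added back to the transmission rating, which is why it is expressed in the same units as the impairment factors.

```python
def advantage_factor(typical_impairment):
    """Rule of thumb: A is about half of the system's typical impairment."""
    if typical_impairment < 0:
        raise ValueError("impairment must not be negative")
    return typical_impairment / 2


print(advantage_factor(20))  # mobile radio encoder with Ie = 20
print(advantage_factor(40))  # satellite connection with Id = 40
```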
what are different types of psychophysical measurements?
(meaning 4 contrasts between methods)
- Conversation test vs. listening test
- Laboratory test vs. field test
- Overall quality vs. individual quality features
- Absolute vs. Relative assessment (ACR, Pair comparison)
comprehensibility:
- ability of the speech signal to convey content
intelligibility:
describes how well the content can be identified
communicability:
- describes how well an utterance serves communication
Comprehension:
result of the communication process
What are differences of measuring audio quality compared to speech quality?
- Different quality level
- Other listening situation
- Use of other higher resolution scales
- Direct pair comparison instead of absolute assessment
What is MUSHRA
- Multiple stimulus test with hidden reference and anchor
- Continuous slider
what is another way of measuring audio quality in a listening test?
double-blind triple-stimulus with hidden reference
A is always the reference
task is to identify the hidden reference
Advantages and disadvantages of conversation tests vs. listening tests
- Advantages: many speakers, natural situation, focus on content, detection of „conversational disturbances“
- Disadvantage: complex, less analytical
4 scenarios of conversation tests
- Postcard test: discussing pictures over the phone
- Kandinsky test: search game in which numbers are to be found in a picture
- Short conversation test: role play, leads to shorter yet highly structured dialogs
- Interactive tests: test subjects have to match numbers or addresses as quickly as possible