System Definition
one or more components of differing nature
The individual components interact with one another via internal interfaces
fulfils a defined purpose by providing or executing functions
A system progresses through a life-cycle, from development through realisation, commissioning and operation to its disposal
In order to define a system, it must be delineated
system boundary separates the system and its components from the system environment
components interact via external interfaces with the system environment
The system environment is not part of the system
The components of a system can be systems themselves
They are then referred to as subsystems
The complete system is then also termed a System-of-Systems (SoS)
The separation can occur on the same level and/or in a hierarchical fashion
System Boundaries
System Structures
Complexity of Systems
there is no objective definition of complexity
some properties correlate with higher complexity
comparison of transistor count and size of instruction sets of various processors
Combinatorial explosion of the state space of comparatively small state machines executed in parallel
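An illustration (the numbers are my own, not from the source): n state machines with k states each, executed in parallel, span up to k^n combined states; ten 10-state machines already yield 10^10 states.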
Failure
Any deviation of a function’s behaviour from its (intended) specification is a failure of the function or service
Failures are caused by errors in the components of a system
Error
An error is an internal system state that deviates from the expected state necessary to perform a function/service and always occurs at run time of the system
The occurrence of an error can lead to a propagation of further errors inside the system
When a propagating error reaches the system boundary it causes a failure
The reason for the occurrence of an error is a fault
Fault Classes
Development faults … occur during the development of a system
Physical faults … only affect physical components
Interaction faults … include all external faults
natural faults
malicious faults
operational faults
Fault
An active fault is a fault that causes an error during run time of the system
Otherwise the fault is still present in the system but in a non-activated, dormant state (a so-called dormant fault)
An external fault acts on components of a system from outside the system boundary
The external fault, either by itself or by activating an existing internal fault, causes an error in the system
The internal fault is also termed a vulnerability in this context
Natural faults … arising from natural phenomena
Malicious faults … intentionally brought into a system; affect a system from outside in order to inhibit its operation or to make it possible to gain control over it
Operational faults … faulty interactions of a user with the system during run time
Failure Modes
type of deviation from a specified service when a failure occurs
content failure … deviation of the content of information
timing failure … deviation of the timing
silent failure … complete absence of delivered information due to termination of the service
(in)consistent failures … whether the failure is experienced by all users of the service in the same way (consistent failure) or in differing ways (inconsistent failures)
magnitude of failure
Propagation of Failure
system consisting of more than one component
When a component experiences a failure due to a fault, this can cause further failures in the dependent components
in the worst case, a chain reaction occurs that propagates to the system boundary, causing a system failure
Random failure
caused by a random fault
The probability of random faults and their associated random failures can be quantified within certain limits
A random fault is created and activated during run time of a system with a certain probability
Systematic failures
failures with deterministic causes
Systematic failures are caused by systematic faults
can be (theoretically) eliminated from a system altogether
the probability of existence and activation of a systematic fault cannot be quantified
Software only contains systematic faults
Definition of Dependability
Dependability of a system is the ability to avoid service failures that are more frequent and more severe than is acceptable.
Dependability is a collective term used to describe the availability performance and its influencing factors: reliability performance, maintainability performance and maintenance support performance.
Dependability is a property of a system that defines its resilience against (service) failures
The criteria for acceptability are formalised by dependability requirements for the system
Does a dependable system have failures?
yes
Dependability allows for the occurrence of failures, as long as they occur rarely enough and with only minor consequences
Attributes of Dependability
Reliability … the probability that a system operates up to a certain time point without experiencing a failure (i. e. the “survival probability”)
Availability … the probability that a system can provide its service(s) at a certain time point (i. e. the probability of being “up and running”)
Maintainability … the probability that a renewal (i. e. a repair or replacement) is finished by a certain time point
Safety … but only in regard to failures!
Integrity … protection from undetected alterations of information or structure
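As formulas (a sketch in common notation, not from the source: T_F denotes the time to failure, T_R the duration of a renewal):

    R(t) = P(T_F > t)                  (reliability: survival up to time t)
    A(t) = P(system is up at time t)   (availability at time point t)
    M(t) = P(T_R <= t)                 (maintainability: renewal finished by t)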
Fault Prevention/Avoidance
The introduction of faults into the system should be prevented
Achieved via constraints enforced on the development activities
Fault Removal
Existing faults should be detected and removed from the system
Achieved by performing verification activities on implementation artefacts
Fault Tolerance
The effects of residual faults during operation of the system should be controlled to prevent failures
methodology employed in the design and operation of a system in order to increase a system’s dependability when residual faults are present
Fault Forecasting
The effects of residual faults which are not tolerated should be estimated (and the consequences accepted)
Achieved by empirical observation of components and stochastic modelling of system structures or qualitative analysis of failure consequences
Residual faults
faults that were neither prevented nor removed during development of the system – their presence must be assumed for every nontrivial system!
Fault Tolerance Phases
Error Detection … Detecting the errors caused by residual faults via acceptance checks on the output/state of modules
Damage Assessment/Confinement … Assessing/limiting the extent of the corruption of the system state due to the detected errors
Error Recovery … Correcting the corrupted system state to arrive at a correct system state
Fault Treatment … Locating and deactivating the responsible fault(s) to prevent immediate recurrence of errors
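A condensed Python sketch of the four phases (the module, the acceptance check and checkpoint-based backward recovery are assumptions for illustration, not the only possible realisation):

    def acceptance_check(result):
        # Error Detection: plausibility check on a module's output
        return isinstance(result, (int, float)) and 0 <= result <= 100

    def fault_tolerant_call(module, alternate, state):
        checkpoint = dict(state)      # establish a recovery point
        result = module(state)
        if acceptance_check(result):
            return result
        # Damage Assessment/Confinement: treat the state as corrupted
        state.clear()
        # Error Recovery (backward): roll back to the recovery point
        state.update(checkpoint)
        # Fault Treatment: switch to an alternate module so the
        # responsible fault is not activated again immediately
        return alternate(state)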
Redundancy domains
Physical (or HW) redundancy, by adding physical components
Information redundancy, by adding information (see the parity sketch below)
Temporal redundancy, by repeating operations
Software redundancy, by adding SW components
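A small Python example of information redundancy from the list above: a single parity bit (the helper names are my own):

    def add_parity(bits):
        # append one redundant bit so the total number of 1s is even
        return bits + [sum(bits) % 2]

    def check_parity(word):
        # detects any single bit flip (but cannot correct or locate it)
        return sum(word) % 2 == 0

    word = add_parity([1, 0, 1, 1])   # -> [1, 0, 1, 1, 1]
    word[2] ^= 1                      # a single bit flip occurs
    assert not check_parity(word)     # the flip is detected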
Full Fault Tolerance
a service is provided according to its specification without any impairments at all
Partial Fault Tolerance
a service is provided in a degraded mode only, when a failure would otherwise occur
also known as Graceful Degradation
Fault/Error Injection
Faults/errors are deliberately introduced into the system
Error injection is used when injection of an actual fault is too expensive
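A minimal error-injection sketch in Python (flipping a bit of an internal state variable to emulate the error a physical fault would cause; the names and bit width are illustrative):

    import random

    def inject_bit_flip(value, width=32):
        # emulate the *error* a hardware fault would cause, without
        # having to physically disturb the hardware itself
        return value ^ (1 << random.randrange(width))

    state = 42
    corrupted = inject_bit_flip(state)
    assert corrupted != state   # the injected error is now observable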
Physical Fault Tolerance
most basic FT strategy and applicable to all systems
cost- and space-intensive
based on adding redundant physical components to a system
Homogeneous redundancy
components are replicated and differ only unintentionally
Inhomogeneous redundancy
components differ intentionally in some aspects
Standby System
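A minimal Python sketch of a standby system (the component functions and failure detection via exceptions are assumptions for illustration; real standby systems typically detect failures via heartbeats or watchdogs):

    def run_with_standby(primary, standby, request):
        try:
            return primary(request)   # normal operation on the primary unit
        except Exception:
            # failure detected: switch over to the redundant standby unit
            return standby(request)

    def faulty_primary(request):
        raise RuntimeError("primary failed")

    def standby_unit(request):
        return "handled %s on standby" % request

    print(run_with_standby(faulty_primary, standby_unit, "req-1"))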
Software Fault Tolerance
Homogeneous redundancy of SW is meaningless
SW only contains systematic faults
Trust in reused components is often unjustified (“Software of Unknown Pedigree” – SOUP)
N-Version Programming (see the sketch after this list)
Recovery Block
Signature-based Control Flow Monitor
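Illustrating N-Version Programming from the list above: several independently developed versions of the same specification run on the same input, and a voter masks a minority of faulty versions (2-out-of-3 majority here). The toy versions below are invented for illustration:

    from collections import Counter

    def v1(xs):                       # straightforward mean
        return sum(xs) / len(xs)

    def v2(xs):                       # incremental mean, written differently
        m = 0.0
        for i, x in enumerate(xs, 1):
            m += (x - m) / i
        return m

    def v3(xs):                       # faulty version: returns the minimum
        return sorted(xs)[0]

    def majority_vote(results):
        value, count = Counter(results).most_common(1)[0]
        if count >= 2:                # 2-out-of-3 majority masks one failure
            return value
        raise RuntimeError("no majority: inconsistent results")

    xs = [2.0, 4.0, 6.0]
    print(majority_vote([round(v(xs), 9) for v in (v1, v2, v3)]))  # -> 4.0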
Distributed System
system consisting of digital computers, called nodes, connected by a network, that interact with one another to provide a set of services to its users
distinctions from conventional systems
Remoteness
Concurrency
Lack of global state
Partial failures
Asynchrony
Heterogeneity
Autonomy
Evolution
Mobility
Requirements for a Distributed System
Openness
Scalability
Security
Distribution Transparency
Openness
Interfaces for invoking services and communication between nodes are standardised and these standards are made available to the public
defined via Interface Description Languages (IDL)
requires portability, so that a service implementation can continue running even when the underlying components change
Scalability
system can also handle an increased workload
needs:
replication
hierarchical structures for localisation and management
decentralisation of services
replacing synchronous with asynchronous communication
replacing discovery services based on broadcast communication with actual location services based on point-to-point communication that work over wide-area networks
Security
enforcing policies over extensions and modifications
ensuring authentication and authorisation for different groups, including mobile ones
maintaining security for mobile users during location changes
ensuring the CIA triad when using mobile code
ensuring availability in the case of DoS attacks
Distribution Transparency
hiding the physical or logical distribution of nodes and resources from the user so that the DS appears to them as a single, monolithic system
Access transparency
hides differences in the invocation of services of the DS and access to its resources
Location transparency
hides the actual physical location of a service or resource
Relocation/migration transparency
hides the effects of relocating a resource between components of the DS while they are accessed (relocation) or otherwise (migration)
Replication transparency
hides the replication of resources inside the DS for increasing performance and resilience against component failures
Failure transparency
hides the effects of failures and recoveries of components of the DS
Persistence transparency
hides the persistence properties of a resource
Transaction transparency
hides interaction between components for achieving consistency when resources are modified by an invoked service
Concurrency transparency
hides the effects of simultaneous, competitive (i. e. non-cooperating) access to shared resources by multiple users/services
Entities in a Distributed System
node … single computer that executes a program
process … instantiation of a program at run time executed by an operating system (OS) running on a node
threads … further subdivision of a process; share the virtual memory space of their parent process
object … virtual entity providing a set of methods that can be invoked by other objects; realised by the processes/threads of a suitable runtime environment
Communication Paradigms
Direct communication
Indirect communication
Direct communication … the sender must know the receiver’s identity (and vice versa) and both must be active at the same time
RPC (Remote Procedure Call) … a process invokes a procedure in a remote process
RMI (Remote Method Invocation) … an object invokes a method of a remote object
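A minimal RPC example using Python’s standard xmlrpc module (host, port and the add function are arbitrary choices for illustration):

    # server.py
    from xmlrpc.server import SimpleXMLRPCServer

    def add(a, b):
        return a + b

    server = SimpleXMLRPCServer(("localhost", 8000))
    server.register_function(add, "add")
    server.serve_forever()

    # client.py -- invokes the remote procedure as if it were local
    from xmlrpc.client import ServerProxy

    proxy = ServerProxy("http://localhost:8000")
    print(proxy.add(2, 3))   # -> 5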
Indirect communication … sender and receiver need not know each other (decoupled in space) or need not be active at the same time (decoupled in time)
Publish-Subscribe … also called event-based; publishers generate events which are delivered to subscribers by an intermediary via notifications
Message Queues … producers store messages in persistent message queues, from which they can be extracted by consumers at a later time
Shared Data Space … entities read from or write to a shared storage independently from one another
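A toy in-memory Publish-Subscribe sketch in Python (the Broker class is my own; it shows only the space decoupling, whereas real brokers also decouple in time via persistence):

    from collections import defaultdict

    class Broker:
        """Intermediary: publishers and subscribers never know each other."""
        def __init__(self):
            self.subscribers = defaultdict(list)   # topic -> callbacks

        def subscribe(self, topic, callback):
            self.subscribers[topic].append(callback)

        def publish(self, topic, event):
            for notify in self.subscribers[topic]:  # deliver notifications
                notify(event)

    broker = Broker()
    broker.subscribe("sensor/temp", lambda e: print("got", e))
    broker.publish("sensor/temp", 21.5)   # -> got 21.5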
client … an entity that invokes a service provided by a server by sending a request to the server’s interface and waiting for a reply (or response) from the server
server … an entity that provides one or more services by waiting for a request from a client and, after processing it, answering with a reply (or response) back to the client
Tiered Architecture
Presentation layer … performs visualisation and handles user interaction
Application layer … performs the core/business logic
Data layer … provides and manages data
Two-Tier architecture
layers are split up between a client and a server entity
Presentation layer and, optionally, part of the application layer on the client (so-called Thin Client), remaining layers on the server
Presentation and application and, optionally, part of the data layer on the client (so-called Fat Client), remaining parts of the data layer on the server
Three-Tier architecture
layers are split up between a client and two server entities, usually in the following way:
Presentation layer on client
Application layer on an application server
Data layer on a database server
Peer-to-Peer
an alternative principle to client-server architecture for providing decentralised services
Each entity provides the same services and implements the same interface; the P2P model is therefore symmetric
Each entity participates in sharing the load – combined with an even distribution of peer entities over the network this avoids performance bottlenecks
Resources are distributed evenly (in the best case) over the entities, so localised failures have a smaller impact on the service
Middleware
facilitates achieving the requirements for a DS in a standardised way
Without middleware, each distributed application would need to achieve the requirements by itself, causing needless duplication of functionality and interoperability problems