What are AWS availability zones?
logical data center in a region
-> isolated from failures of other availability zones
-> infrastructure redundancy, with no common points of failure
-> inexpensive, low-latency network connectivity to other availability zones in the same AWS region
What are regions in AWS?
geographic cluster of availability zones
regions are isolated from each other for fault tolerance and stability
What do regions allow for?
users can control where resources are allocated
-> e.g. to meet legal requirements, such as those in Europe
-> to give customers low-latency access…
Can users choose the zone in an AWS region?
yes
How are costs structured for AWS communication between zones and regions?
within the same availability zone (private IP address) -> free
between zones in the same region -> inexpensive (charged per GB)
region to region -> cost
What SLAs are there in AWS?
region level
guarantee 99.99% region availability
region is unavailable when all running instances or tasks, deployed in two or more AZs, concurrently have no external connectivity
instance-level SLAs
guarantee 99.5% reachability of an EC2 instance
instance not available when your single EC2 instance has no external connectivity
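To make the two SLA percentages more tangible, the following minimal sketch converts an availability percentage into the maximum downtime that still meets it. The 30-day month and the function name `allowed_downtime_minutes` are assumptions for illustration and are not taken from the AWS SLA text.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Maximum downtime in minutes that still meets the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - availability_percent) / 100.0


# Region-level (99.99%) versus instance-level (99.5%) guarantee,
# both evaluated over an assumed 30-day month.
region_minutes = allowed_downtime_minutes(99.99)
instance_minutes = allowed_downtime_minutes(99.5)
print(f"region-level:   about {region_minutes:.2f} minutes per month")
print(f"instance-level: about {instance_minutes / 60:.1f} hours per month")
```

The gap between the two numbers is the reason for deploying across two or more availability zones instead of relying on a single instance.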
What does amazon elastic compute cloud provide?
virtual machines running inside the Amazon cloud
ephemeral storage tied to the virtual machine (node)
network accessible block storage that persists across time and can be mounted in the VM
virtual firewall to secure your network in the cloud
based on xen hypervisor
What type of hypervisor is Xen?
bare metal hypervisor
How does the xen hypervisor work?
One VM is called Domain 0 (DOM0) and runs the host OS.
It
starts first and runs the Xen management software,
manages other VMs,
has drivers for hardware
and provides virtual disks and network access to unprivileged VMs.
What is the main difference between xen and nitro hypervisor?
Xen handles network I/O, CPU interrupts and storage I/O in software
in Nitro, these are handled in hardware
hardware is faster
because these tasks are offloaded, Dom0 can be removed
-> no cores reserved for Dom0
What are Amazon Machine Images (AMI)?
also called VM template
copy of a server with OS and preinstalled software
predefined from amazon and third parties; user defined AMIs also possible
stored in Amazon S3
What is a potential problem with AMIs?
difficult to select
-> could contain trojans or backdoors
-> amazon provides reviews and ratings…
What is amazon Elastic Block Store (EBS)?
persistent block-level storage volumes
can be attached to EC2 instances.
independent of EC2 instances
can be detached and reattached to different instances.
It offers different volume types, each optimized for different performance characteristics and costs.
survive the termination of the EC2 instance.
EBS is suitable for applications that require persistent storage, databases, and boot volumes for EC2 instances.
What is the EC2 instance storage?
local, temporary storage that is directly attached to an EC2 instance.
It provides high-performance, low-latency storage specific to the instance it is attached to.
The storage is transient and data is lost if the instance is stopped or terminated.
ideal for temporary data, caching, and scratch space where durability or persistence is not required.
What is amazon elastic file system (EFS)?
scalable, shared file storage service
can be accessed by multiple EC2 instances simultaneously.
offers file system interface compatible with standard OSs,
supporting both read and write operations.
automatically scales storage capacity as files are added,
suitable for workloads that require shared access and dynamic scaling.
It is highly available and durable, with data replicated across multiple Availability Zones.
EFS is well-suited for content management, web serving, data sharing, and containerized applications.
What is amazon simple storage service (S3)?
scalable object storage service designed for storing and retrieving large amounts of unstructured data.
It offers virtually unlimited storage capacity and is accessible over the internet via API calls.
S3 provides high durability, availability, and data redundancy, with data distributed across multiple facilities within an AWS Region.
S3 is commonly used for backup and restore, data archiving, static website hosting, content distribution, and big data analytics.
Storage overview
What storage levels are there?
processor registers
very fast, very expensive
small size, small capacity
power on, immediate term
processor cache
RAM
fast, affordable
medium size, medium capacity
power on, very short term
flash / USB memory
slower, cheap
small size, large capacity
power off, short term
hard drives
slow, very cheap
large size, very large capacity
power off, mid term
tape backup
very slow, affordable
power off, long term
What is local storage? What types are there usually?
Flash or disk drives attached to the computer
Disks are larger and cheaper but slower and more power hungry
What is the meaning of RAID?
redundant array of independent (inexpensive) disks
Explain RAID0, RAID1, RAID5
RAID0
split (stripe) data among disks
improve performance (simultaneous reads/writes)
no redundancy
RAID1
duplicate (mirror) data among disks
no improved performance
redundancy
RAID5
stripe data among disks to improve performance
include parity information to recover if a drive fails
balance between performance, capacity and fault tolerance
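The parity idea behind RAID5 can be sketched with XOR: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the surviving ones. This is a simplified illustration; real RAID5 rotates parity across the disks, and the helper name `xor_blocks` and the sample blocks are made up for this example.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equally sized byte blocks position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data blocks, each imagined to live on a different disk.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate the failure of the disk holding the second block:
# XOR-ing the survivors with the parity block rebuilds the lost data.
survivors = [data_blocks[0], data_blocks[2], parity]
recovered = xor_blocks(survivors)
print(recovered == data_blocks[1])
```

With only one parity block per stripe, this scheme tolerates the loss of one disk but not two at the same time.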
How is a non-volatile flash memory cell built?
How are non-volatile flash cells written?
high positive voltage between source and control gate
injects electrons to floating gate
How are non-volatile flash cells read?
apply voltage to control gate
measure current between source and drain
presence / absence of charge in floating gate influences conductivity of channel
How can one reset data in non-volatile flash cells?
high negative voltage to control gate
-> block-wise, not cell wise…
What are SLC and MLC?
single level cells
store one bit
multi level cells
store multiple bits in single cell
sense the strength of the current, not only its presence…
-> more precise measurement required
-> amount of charge determines the state -> precise control of the charge deposit required
higher density, lower cost
higher bit-error rate
lower write speed, lower number of program-erase cycles, higher power consumption
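Why multi-level cells need more precise charge control follows from simple arithmetic: storing n bits requires distinguishing 2^n charge levels within the same voltage window. A minimal sketch (the function name `charge_levels` is chosen for illustration):

```python
def charge_levels(bits_per_cell: int) -> int:
    """Number of distinguishable charge states needed to store the given bits."""
    if bits_per_cell < 1:
        raise ValueError("a cell stores at least one bit")
    return 2 ** bits_per_cell


for bits in (1, 2, 3):
    print(f"{bits} bit(s) per cell -> {charge_levels(bits)} charge levels")
```

Each extra bit doubles the number of levels, which narrows the margin between neighbouring states and is consistent with the higher bit-error rate noted above.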
What was an effect of server consolidation?
pressure on storage
-> shift in importance from €/GB to €/IOPS (input/output operations per second)
thus storage solutions nowadays are all-flash or hybrid flash arrays
all flash:
flash or SSD
combination with RAID
higher IOPS, lower latency, higher bandwidth
more expensive
How can one provision storage?
direct attached storage
Storage devices are attached to the individual computer
Leads to over-provisioning
storage area network (SAN)
Network providing access to block-level data storage.
Typically specialized network separated from LAN.
No file abstraction, only block-level operations
Shared pool of spare resources
Network attached storage (NAS)
Storage devices connected to a file server.
Access available at a file-level for other computers.
What protocols are used in SAN?
Fibre Channel to access the SAN
iSCSI (Internet Small Computer Systems Interface)
ATA (Advanced Technology Attachment) over Ethernet (AoE) (ATA is normally a cable standard, here carried over Ethernet…)
Storage device visible to client as disk
can be mounted after formatting with a file system…
=> as the SAN is its own network, the hardware has to be accessed over the network (e.g. Ethernet)…
What is iSCSI?
SCSI -> Small Computer System Interface, a standard interface for peripherals such as drives
-> iSCSI -> SCSI commands over TCP/IP -> thus able to run on top of Ethernet…
What are drawbacks of SANs?
shared network bandwidth
shared performance of storage devices
security concerns due to transfer of data through network…
What are advantages of SANs?
flexible distribution of devices between clients
-> no changes to cabling
easy replacement of faulty servers with new servers booting from the same unit
easier disaster protection
Differences SAN vs NAS?
SAN -> hardware access over some network connection (block level…)
NAS -> dedicated file server (with file system) attached to the network…
-> so not usually own dedicated network…
How does NAS work?
file server attached to the network
-> has file system and provides access to files for clients
access to files using network file sharing protocols, e.g. NFS
frequently using internally a RAID
What are some exemplary implementations of NAS?
computer based
embedded systems based
ASIC based
What does clustered NAS allow for?
provide ability to distribute files and meta-data across multiple NAS servers
=> i.e. cluster of NAS as single…
What types of storage virtualization exist?
block virtualization
file virtualization
What do both block and file virtualization provide?
location transparency
=> ability to access data without knowing the actual location…
How does Block virtualization work?
some meta data determine mapping of virtual disk and block number (local in client system)
to physical disc and block number
IO redirection based on the metadata…
=> mapping local blocks (virtual) using some meta data to physical blocks in the e.g. SAN
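The metadata-driven redirection described above can be sketched as a lookup table from virtual to physical block addresses. The class `BlockMapper` and the disk names are hypothetical and only illustrate the principle; real implementations map ranges of blocks and keep the metadata persistent.

```python
class BlockMapper:
    """Toy mapping of (virtual disk, block) to (physical disk, block)."""

    def __init__(self):
        self._table = {}

    def map_block(self, virtual, physical):
        """Record the metadata entry for one virtual block."""
        self._table[virtual] = physical

    def resolve(self, virtual_disk, block):
        """Redirect an I/O request to the physical location."""
        return self._table[(virtual_disk, block)]


mapper = BlockMapper()
mapper.map_block(("vdisk0", 0), ("san-disk-A", 512))
mapper.map_block(("vdisk0", 1), ("san-disk-B", 17))

# The client only addresses vdisk0; the physical location stays hidden.
print(mapper.resolve("vdisk0", 1))

# Migrating a block only requires updating the mapping entry.
mapper.map_block(("vdisk0", 1), ("san-disk-C", 3))
print(mapper.resolve("vdisk0", 1))
```

Because the client never sees physical addresses, data can be moved between disks without the client noticing, which is the location transparency mentioned earlier.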
What is the usage of storage virtualization?
flexible mapping
only use parts of physical disk
thin provisioning
provision e.g. 2 GB, but in reality reserve only 1 GB and grow to 2 GB only if actually needed…
disk expansion and shrinking
non-disruptive data migration (between disks)
-> migrate and simply switch mapping function…
improved utilization
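Thin provisioning can be illustrated with a volume that advertises its full virtual size but only allocates physical blocks on the first write. This is a minimal sketch under simplifying assumptions (one byte string per block, no persistence); the class name `ThinVolume` is hypothetical.

```python
class ThinVolume:
    """Toy thin-provisioned volume: physical space is allocated on first write."""

    def __init__(self, virtual_blocks: int):
        self.virtual_blocks = virtual_blocks  # size promised to the client
        self._allocated = {}                  # block number -> data actually stored

    def _check(self, block: int) -> None:
        if not 0 <= block < self.virtual_blocks:
            raise IndexError("block outside the provisioned range")

    def write(self, block: int, data: bytes) -> None:
        self._check(block)
        self._allocated[block] = data

    def read(self, block: int) -> bytes:
        self._check(block)
        # Blocks that were never written read back as empty.
        return self._allocated.get(block, b"")

    @property
    def physical_blocks_used(self) -> int:
        return len(self._allocated)


volume = ThinVolume(virtual_blocks=1000)
volume.write(7, b"hello")
print(volume.virtual_blocks, volume.physical_blocks_used)
```

The client sees the full provisioned size, while the physical pool only has to back the blocks that are in use, which is where the improved utilization comes from.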
What different implementations are possible for block virtualization?
host based
storage device based
network based
How does host based block virtualization work?
host runs virtualization software
it maps logical to physical units
How does storage device based block virtualization work?
disk array, e.g. RAID provides virtualization level
new disk arrays provide storage controllers that allow the attachment of other storage controllers
primary controller provides pooling and meta-data management
and might provide replication and migration services
How does network based block virtualization work?
virtualization device is in the LAN and connected to a SAN
-> provides virtualization services
most frequent implementation…
in-band vs. out-of-band
What different types of file virtualization exist?
file system
e.g. NTFS, FAT32, UFS -> stores files on block based storage
clustered NAS
-> combines NAS from the same vendor
distributed file system
allow files located on multiple NAS to appear as if on single NAS (e.g. windows DFS, Linux DFS)
distributed file systems can combine NAS from different vendors
What is the amazon block storage?
a block storage volume
can be mounted
can be formatted appropriately
multiple volumes can be combined into a virtual RAID
snapshots of block storage volumes stored in S3 -> for backup or replication
What is amazon instance storage? What technologies are used?
disks attached to physical host -> i.e. directly to machine thus (aside from RAM) lowest latency…
some use NVMe or SATA-based SSDs -> high random I/O performance
What happens to the instance storage if an instance is stopped or terminated?
any data on instance store volumes lost
What is the amazon elastic file system?
scalable file storage
can be created and mounted into instances
files can be shared among instances
file system has to be explicitly created and destroyed
What does amazon S3 stand for?
Simple Storage Service
What is amazon S3?
reliable and inexpensive data storage infrastructure
-> supports objects from 1 byte to 5 TB
two-level namespace
slow compared to local disks or EBS
very high durability; availability is lower than durability
-> most users use S3 for short-term or long-term backup
How does the S3 two-level namespace work?
buckets:
flat collection of buckets, namespace shared across all Amazon customers
objects:
file in the buckets
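The two-level namespace can be modelled as a dictionary of buckets, each holding a flat dictionary of objects. The sketch below is a toy in-memory model, not the S3 API; the class `ObjectStore`, the customer names and the bucket name are made up. It illustrates that bucket names are unique across all customers, while object keys only need to be unique within their bucket.

```python
class ObjectStore:
    """Toy model of a two-level namespace: bucket name -> object key -> data."""

    def __init__(self):
        self._buckets = {}  # shared by all customers

    def create_bucket(self, name: str, owner: str) -> None:
        if name in self._buckets:
            raise ValueError(f"bucket name {name!r} is already taken")
        self._buckets[name] = {"owner": owner, "objects": {}}

    def put_object(self, bucket: str, key: str, data: bytes) -> None:
        self._buckets[bucket]["objects"][key] = data

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._buckets[bucket]["objects"][key]


store = ObjectStore()
store.create_bucket("lecture-notes-demo", owner="customer-a")
store.put_object("lecture-notes-demo", "aws/ec2.txt", b"notes")

# The slash is just part of the key; there is no real directory level.
print(store.get_object("lecture-notes-demo", "aws/ec2.txt"))

try:
    store.create_bucket("lecture-notes-demo", owner="customer-b")
except ValueError as error:
    print(error)
```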
How can one access S3 ?
in EC2
from the web
What are amazon EC2 instances?
instance -> running VM which is based on an AMI (amazon machine image)
instance type -> VM with different compute and memory capabilities
How does storage in EC2 work?
storage:
boot device volume
elastic block storage
instance storage
instance store volumes: local disks of the server
!both lost when the instance is terminated
for persistence: use EBS (Elastic Block Store) or EFS (Elastic File System)
What does amazon ec2 stand for?
elastic compute cloud
how are IP addresses handled in amazon ec2?
elastic IP address
static ip address required if:
use instance that must always be accessible by the same IP address
pay for the address independent of usage
Can one have arbitrary numbers of EC2 ?
account has limit for number of VMs of a certain type…
What are the states in an AWS instance lifecycle?
pending
running
stopping
stopped
shutting down
terminated
Which instance states are billed, which not?
billed:
running
stopping (if preparing to hibernate)
not billed:
pending
stopping (if preparing to stop)
stopped
shutting-down
terminated
What is the instance state pending for?
instance preparing to enter running
-> pending when launched for the first time or when restarting after being stopped
What is the instance state running for?
instance running and ready for use
What is the instance state stopping for?
instance preparing to be stopped or stop-hibernated
What is the instance state stopped for?
instance shut down and cannot be used
-> can be restarted at any time
What is the instance state shutting down for?
instance preparing to be terminated
What is the instance state terminated for?
instance has been permanently deleted and cannot be restarted
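The lifecycle described in the cards above can be summarised as a small state machine. The transition table below is a sketch derived from these state descriptions; it leaves out reboots (which stay within running), and the helper names are chosen for illustration.

```python
from collections import deque

# Allowed transitions between instance states, as described in the notes.
TRANSITIONS = {
    "pending": {"running"},
    "running": {"stopping", "shutting-down"},
    "stopping": {"stopped"},
    "stopped": {"pending", "shutting-down"},
    "shutting-down": {"terminated"},
    "terminated": set(),
}


def can_transition(current: str, target: str) -> bool:
    """True if the instance can move directly from current to target."""
    return target in TRANSITIONS[current]


def reachable_states(start: str) -> set:
    """All states that can eventually be reached from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in TRANSITIONS[state]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(can_transition("stopped", "pending"))
print(sorted(reachable_states("terminated")))
```

A stopped instance passes through pending again before it is running, and terminated is a dead end, which matches "cannot be restarted".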
How do boot times differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
usually < 1 min
instance store-backed:
usually < 5 min
How do size limits for the root device differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
16 TiB
instance store-backed:
10 GiB
How do root device volumes differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
Amazon EBS volume
instance store-backed:
instance store volume
How does data persistence differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
by default, the root volume is deleted upon termination; data on any other EBS volume persists after instance termination by default
instance store-backed:
data on any instance store volume persists only during the life of the instance
How do modifications differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
instance type, kernel, RAM disk and user data can be changed while the instance is stopped
instance store-backed:
instance attributes are fixed for the lifetime of the instance
How do charges differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
instance usage
EBS volume usage
storing the AMI as an Amazon EBS snapshot
instance store-backed:
instance usage
storing the AMI in Amazon S3
How does AMI creation/bundling differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
using a single command/call
instance store-backed:
requires installation and use of the AMI tools
How does the stopped state differ between EBS-backed and instance store-backed EC2 instances?
EBS-backed:
can be placed in the stopped state, where the instance is not running but the root volume is persisted in Amazon EBS
instance store-backed:
cannot be in the stopped state!
-> instances are either running or terminated
How does the host computer in EC2 instances behave in different instance states?
reboot:
instance stays on same host computer
stop/start (EBS-backed only; not available for instance store-backed)
in most cases -> moves to a new host computer; may stay on the same host if there are no problems with it
hibernate (EBS-backed only)
same as stop/start
terminate
none (as the instance no longer exists…)
How do the private and public IPv4 addresses of EC2 instances behave in different instance states?
reboot:
addresses stay the same
stop/start:
instance keeps its private IPv4 address
gets a new public IPv4 address (unless it has an Elastic IP address, which does not change during stop/start)
How do the Elastic IPv4 addresses of EC2 instances behave in different instance states?
reboot:
Elastic IP address remains associated with the instance
stop/start:
same as reboot
How do the IPv6 addresses of EC2 instances behave in different instance states?
reboot:
address stays the same
stop/start:
instance keeps its address
How does the instance store volume of EC2 instances behave in different instance states?
reboot:
data preserved
stop/start, hibernate or terminate:
data erased
How does the root device volume of EC2 instances behave in different instance states?
reboot:
volume is preserved
stop/start:
volume is preserved
terminate:
volume is deleted by default
How does the RAM of EC2 instances behave in different instance states?
reboot, stop/start:
RAM erased
hibernate:
RAM saved to a file on the root volume
How does billing of EC2 instances behave in different instance states?
reboot:
instance billing hour does not change
stop/start:
stop incurring charges as soon as the state changes to stopping
each time the instance transitions from stopped to running -> a new instance billing period starts (billing minimum of one minute every time you restart the instance)
hibernate:
incur charges while the instance is in the stopping state
terminate:
stop incurring charges as soon as the state changes to shutting-down
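The one-minute minimum per start can be illustrated with a small calculation. This sketch assumes per-second billing for each running period and ignores everything else that affects a real bill (instance type, operating system, region, EBS volumes); the function name and the sample durations are made up.

```python
MINIMUM_BILLED_SECONDS = 60  # one-minute minimum per start, as noted above


def billed_seconds(run_durations_s):
    """Total billed seconds for a list of separate running periods."""
    return sum(max(duration, MINIMUM_BILLED_SECONDS) for duration in run_durations_s)


# An instance that was started three times: for 30 s, 600 s and 5 s.
periods = [30, 600, 5]
print(billed_seconds(periods), "seconds billed for", sum(periods), "seconds of runtime")
```

Frequent short start/stop cycles are therefore billed for more time than the instance actually ran.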
What different instance placement groups exist?
cluster placement group
partition placement group
spread placement group
What are instance placement groups?
determine how instances are placed on the underlying hardware
=> i.e. makes sense if you have several interrelated instances (that e.g. communicate with each other…)
What is the cluster placement group?
logical grouping of instances
-> packed closely within an availability zone to increase network performance
What is the partition placement group?
spread instances across partitions such that
different partitions do not share the underlying hardware
each partition gets its own rack
partitions can be placed in different availability zones
reduce likelihood of correlated hardware failures
-> and improve performance in a partition
What is the spread placement group?
spreads instances across distinct underlying hardware
reduce correlated hardware failures
How does amazon manage network security?
accounts have their own virtual private cloud
-> resources launched in this VPC
=> resembles your network in your own data center…
What can one configure in the VPC?
Ip address range
subnets
route tables
network gateways
security settings
connect instances to the internet
connect your VPC to your data center
How are VPC created?
amazon creates default VPC (for your account)
additional VPCs can be created by the user
How can one access EC2?
primary means through web services API
interactive tools on top of API
AWS console
amazon command line tools
third-party infrastructure tools
-> management of whole infrastructure with multiple servers, accounts, reports, etc.
How does one access EC2 servers? (means of authentication)
private/public key pair
What is AWS CloudFormation?
allows you to model your infrastructure
infrastructure as code
specify all resources textually as a JSON template
allows standardizing components across your institution
automatic deployment of all resources, controlled and predictable
use code editor and versioning tools
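As an illustration of "infrastructure as code", the sketch below builds a very small CloudFormation-style JSON template in Python. The logical name `ExampleInstance`, the instance type and the image ID are placeholders chosen for this example; a real template needs a valid AMI ID for the target region.

```python
import json

# Minimal template describing a single EC2 instance.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative sketch: one EC2 instance",
    "Resources": {
        "ExampleInstance": {  # hypothetical logical resource name
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-EXAMPLE",  # placeholder, not a real AMI ID
                "InstanceType": "t3.micro",
            },
        }
    },
}

# The JSON text is what would be kept under version control.
template_json = json.dumps(template, indent=2)
print(template_json)
```

Because the template is plain text, it can be reviewed, diffed and versioned like any other source file.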
What different pricing models exist?
on-demand pricing
-> pay by the second for the instances you launch
reserved instances pricing
make a commitment to a consistent configuration, including type and region
-> long term (1 or 3 years)
spot-market pricing
request unused (available) instances at a lower current price
-> provide the maximum price you are willing to pay
What is the pricing for data transfer?
internet:
in: free
out: <$0.09 per GB
inside availability zone (private IP address)
none
regional transfer (private IP address)
between different availability zones in same region
$0.01 per GB (in/out)
public and elastic IP address inside EC2
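The rates listed above can be turned into a small cost estimate. The figures in this sketch are copied from these notes and may be outdated; "<$0.09" is treated as an upper bound, and the dictionary keys and function name are made up for illustration.

```python
# USD per GB, taken from the notes above (not current AWS pricing).
RATES_USD_PER_GB = {
    "internet_in": 0.00,
    "internet_out": 0.09,     # upper bound ("<$0.09 per GB")
    "same_az_private": 0.00,
    "cross_az": 0.01,         # charged per direction (in and out)
}


def transfer_cost(kind: str, gigabytes: float) -> float:
    """Estimated cost in USD for moving the given amount of data one way."""
    return RATES_USD_PER_GB[kind] * gigabytes


for kind in RATES_USD_PER_GB:
    print(f"{kind:16s} 100 GB -> at most ${transfer_cost(kind, 100):.2f}")
```

Under these rates, keeping chatty components in the same availability zone avoids transfer charges, at the price of losing redundancy across zones.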
What different pricing types exist for block storage and Elastic IP addresses?
block storage
snapshot to S3
Elastic IP address
Last modified 2 years ago