Genome
complete set of genetic information in an organism
Genomics = study and analysis of genomes
Genotype
genome of the individual
all genetic info inside a cell
inherited from parents
Phenotype
expression of the genotype as observable traits: eye color, size, skin, hair
determined by: genotype & environment
= interaction of several alleles of different genes, which may lie on different chromosomes
Allele
different version of the same gene located at specific position on chromosome
one allele from your mother and one from your father
dominant: allele is expressed
recessive: allele is masked
Diploid organisms: 2 alleles (homo- or heterozygous)
Codominant alleles
Two different alleles are expressed equally — neither one is dominant over the other.
both alleles appear fully in the phenotype
e.g.: AB blood type inheritance
A and B alleles are codominant.
If someone inherits A from one parent and B from the other → their blood type is AB.
Both A and B traits are fully expressed — not mixed or hidden.
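The codominance rule above can be written as a small lookup. This is only a sketch: the recessive O allele does not appear in these notes and is added here as an assumption so that the function covers all ABO genotypes, and the function name is made up for illustration.

```python
def abo_blood_type(allele_1: str, allele_2: str) -> str:
    """Return the ABO blood type for two inherited alleles ('A', 'B' or 'O')."""
    alleles = {allele_1, allele_2}
    if not alleles <= {"A", "B", "O"}:
        raise ValueError("alleles must be 'A', 'B' or 'O'")
    if alleles == {"A", "B"}:
        return "AB"  # codominance: both alleles are fully expressed
    if "A" in alleles:
        return "A"   # A is expressed, a recessive O (assumption) is masked
    if "B" in alleles:
        return "B"   # B is expressed, a recessive O (assumption) is masked
    return "O"       # only reached with two O alleles


print(abo_blood_type("A", "B"))  # AB
print(abo_blood_type("A", "O"))  # A
```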
Rhesus factor
type of protein on the surface of RBCs (red blood cells)
Polygenic traits
determined by more than 1 gene
e.g.: skin color, height, weight
DNA structure
Phosphate
Sugar
Bases
Pyrimidines: Cytosine, Uracil (RNA only), Thymine
Purines: Adenine, Guanine
Pairing: A-T, G-C (in RNA: A-U; G-U wobble pairs also occur)
Forward: 5'→3' (Watson)
Reverse: 3'→5' (Crick)
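The pairing rules and strand directions can be combined to derive the reverse strand from the forward strand. A minimal sketch, assuming plain uppercase or lowercase DNA letters (A, C, G, T); the function name is chosen here for illustration.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(forward_strand: str) -> str:
    """Return the reverse strand, written 5'->3', for a forward strand given 5'->3'."""
    paired = forward_strand.upper().translate(COMPLEMENT)  # base-by-base partner
    return paired[::-1]  # the partner strand runs antiparallel, so reverse it


print(reverse_complement("ATGCCG"))  # CGGCAT
```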
DNA folding
Chromatin: DNA (double-stranded helical structure) packaged with histone proteins
DNA complexes with histones = nucleosome formation (core of 8 histone proteins, with DNA wrapped around it about 1.65 times)
Chromatosome: Nucleosome + H1 Histone, folds up to produce 30nm fiber
Formation of loops (diameter 300nm)
Fibers (compressed and folded into 250nm wide Fiber)
tight coiling = Chromatid of chromosome
Histones
type of protein found in the cell nucleus
organizing and packaging DNA
gene regulation (influences whether genes are turned on or off), e.g. via methylation
rich in positively charged (basic) amino acids:
Lysine
Arginine
Phosphate groups of DNA = negatively charged
Lys / Arg side chains = positively charged
-> electrostatic attraction enables tight wrapping of DNA around histones
Methylation
introduction of Methyl functional group to Arg / Lys of Histone tail
catalyzed by: Histone methyl transferase enzyme
Methylation states: Arg up to 2x (mono-, di-), Lys up to 3x (mono-, di-, tri-)
Cell division
Mitosis:
Prophase
Metaphase
Anaphase
Telophase
where cell duplicates ALL contents (incl. chromosome)
Result: 2 identical daughter cells
Cell cycle
G1: cell growth
S: DNA Replication
G2
Mitosis: cell division
Cytokinesis
Interphase: G1, S, G2
Modification of crops
Traditional breeding
Mutagenesis
RNA interference
Transgenics
Gene editing
GFP (fluorescent protein)
green fluorescent protein
protein from the jellyfish Aequorea victoria
exhibits green fluorescence when exposed to blue to ultraviolet light
has 238 AA
Marker Protein to study developmental processes and metastasis in cancer
Monogenic hereditary diseases
5000 - 8000
defined as inherited conditions arising from mutations in a single gene
affect 6%
e.g.:
diastrophic dysplasia
nonsyndromic deafness
Sequencing Applications
Human genome sequencing
Disease diagnostics
Personalized medicine
Forensic genetics
Human Genome Sequencing: Generating a reference genome sequence
= Determining the complete human genome for the first time
create a "reference blueprint" of the human genome that future studies can rely on.
built using DNA from multiple individuals to provide a complete and general representation of the human genome.
Extremely complex and expensive (e.g., the Human Genome Project took over 10 years and billions of dollars).
Serves as a comparison standard for sequencing individual genomes.
The result is the "Human Reference Genome", which researchers worldwide use as a foundation.
Human Genome Sequencing: Generating a person's genome sequence
Determining the individual genome of a specific person
for medical purposes or ancestry analysis; differences are identified by comparison with the human reference genome
Quick and affordable = NGS
NGS (next generation sequencing) - Applications
Whole genome sequencing WGS
Exome sequencing Exome-Seq
RNA sequencing RNA-Seq
Methylation sequencing Methyl-Seq
Point Mutations
silent
missense
nonsense
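The three classes can be told apart by translating the codon before and after the change. The sketch below uses the standard genetic code and assumes a single-base change in a codon that is not itself a stop codon; the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(original_codon: str, mutated_codon: str) -> str:
    """Classify a single-base change as 'silent', 'missense' or 'nonsense'."""
    original_codon, mutated_codon = original_codon.upper(), mutated_codon.upper()
    differences = sum(a != b for a, b in zip(original_codon, mutated_codon))
    if len(original_codon) != 3 or len(mutated_codon) != 3 or differences != 1:
        raise ValueError("expected two codons that differ in exactly one base")
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutated_codon]
    if before == after:
        return "silent"    # same amino acid, protein unchanged
    if after == "*":
        return "nonsense"  # premature stop codon
    return "missense"      # different amino acid


print(classify_point_mutation("GAA", "GAG"))  # silent   (Glu -> Glu)
print(classify_point_mutation("GAG", "GTG"))  # missense (Glu -> Val)
print(classify_point_mutation("TAC", "TAA"))  # nonsense (Tyr -> stop)
```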
Gene fusion formation
1) Fusion by structural rearrangement
translocations
inversions
deletions
insertions
2) Fusion by transcription or splicing
transcription read through
mRNA trans-splicing
cis splicing
Library Preparation
DNA fragmentation
Mechanical or enzymatic → DNA is cut into small pieces
Adding adapters
Special sequences are attached so that the sequencer can recognize the fragments
Library Amplification
Amplification of the DNA so that the signal during sequencing is strong enough. Methods:
PCR (classic)
Emulsion PCR (e.g. for Ion Torrent)
Bridge PCR (Illumina)
Linear amplification (PacBio)
Size selection
Only fragments of a certain length are used further
Library Quality Control
e.g. with TapeStation → checks fragment size and quality
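A toy simulation of three of these steps (fragmentation, adding adapters, size selection) can make the order of operations concrete. Everything here is an assumption for illustration: the random genome, the normal distribution of fragment lengths, the 200-400 length window, and the two placeholder adapter sequences, which are not real kit adapters.

```python
import random

ADAPTER_5 = "ACACGT"  # placeholder adapter, not a real kit sequence
ADAPTER_3 = "TGTGCA"  # placeholder adapter, not a real kit sequence


def fragment_dna(sequence, mean_length, rng):
    """Cut a sequence into consecutive pieces of random length (toy shearing model)."""
    fragments = []
    position = 0
    while position < len(sequence):
        length = max(1, int(rng.gauss(mean_length, mean_length * 0.3)))
        fragments.append(sequence[position:position + length])
        position += length
    return fragments


def add_adapters(fragments):
    """Attach the known adapter sequences to both ends of every fragment."""
    return [ADAPTER_5 + fragment + ADAPTER_3 for fragment in fragments]


def size_select(library, min_length, max_length):
    """Keep only library molecules whose total length lies in the window."""
    return [entry for entry in library if min_length <= len(entry) <= max_length]


rng = random.Random(42)  # fixed seed so the run is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(5000))
fragments = fragment_dna(genome, mean_length=300, rng=rng)
library = add_adapters(fragments)
selected = size_select(library, 200, 400)
print(len(fragments), "fragments,", len(selected), "kept after size selection")
```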
Types of libraries (sequencing)
TruSeq PCR free library preparation kit
TruSeq Nano DNA library prep kit
TruSeq stranded total RNA kit
TruSeq DNA methylation kit
Nextera DNA library prep kit
Nextera rapid capture exome kit
Genetic Variation
Human genomes are 99.5% identical
Genetic variation arises from:
SNPs (Single Nucleotide Polymorphisms)
Indels (Insertions/Deletions)
Minisatellites (VNTRs) & microsatellites (also known as STRs)
Especially useful in forensic science
STRs are used to identify individuals
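STRs identify individuals because the number of repeat units at a locus differs between people. A minimal sketch of counting repeats, assuming a made-up sequence and the motif GATA as an example; the function name is hypothetical.

```python
import re


def longest_str_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Made-up sequences: the same locus with 5 and 7 repeats in two people.
person_1 = "CCT" + "GATA" * 5 + "TTC"
person_2 = "CCT" + "GATA" * 7 + "TTC"
print(longest_str_run(person_1, "GATA"))  # 5
print(longest_str_run(person_2, "GATA"))  # 7
```

A real profile compares repeat counts at many loci at once, which is what makes a match between two unrelated people unlikely.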
Forensic Genetics
STR profiling using Sanger sequencing
Comparison with national DNA databases (e.g. UK, Spain, Germany)
EDNAP = European DNA Profiling Group
Cancer Genomics – TCGA (The Cancer Genome Atlas)
large international project started in 2006
Sequenced over 20,000 samples across 33 cancer types
Aim: Understand the molecular basis of cancer and discover new biomarkers
Tools:
TCIA, SurvNet, TANRIC, FASMIC
Tumor subtyping
Personalized treatment
Risk assessment
Single-Cell Genomics
Goal: Analyze cell-to-cell variation (e.g. within tumors)
Tumor heterogeneity
Tumor evolution (e.g. during metastasis)
Cell isolation: FACS, DEPArray, CellRaft
Whole-genome amplification: MDA (Phi29 polymerase, high fidelity)
DOP-PCR, MALBAC → alternative amplification methods
Challenges: ADO (Allelic Dropout), error rates, non-uniform coverage
DNA Origami
Technique for folding DNA into nanoscale structures
Uses: long single-stranded DNA + short "staple strands"
Nano-containers
Drug delivery
Molecular sensors
DNA sequencing workflow
sample collection: from blood, saliva, tissue; goal: collect cells that contain DNA
DNA extraction: isolate pure DNA from cell + remove proteins, other cellular material
DNA fragmentation: Genome is too big = break DNA into smaller pieces (200-600 bp)
Library preparation: add adapters (short known sequences) to DNA fragments for recognition by the sequencer
Sequencing: run fragments through a sequencing machine (Illumina, Nanopore, PacBio) - each base sequence is read = output: 1000s of short sequence reads
Data processing: quality control + sequence alignment
Data analysis: identify variants + analyze genes, mutations, functional elements
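The last step, identifying variants, comes down to comparing an aligned sample sequence with the reference. The sketch below is deliberately simplified: it assumes the two sequences are already aligned and of equal length, so it finds only single-base differences and no indels. The sequences and the function name are made up.

```python
def find_snps(reference: str, sample: str):
    """List (position, reference_base, sample_base) for every mismatch (1-based)."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences (no indels)")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


reference = "ATGCGTACGTTAGC"
sample = "ATGCGTACATTAGC"
print(find_snps(reference, sample))  # [(9, 'G', 'A')]
```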
DNA data storage
DNA = natural data storage:
long lifespan
ultra compact
energy efficient
DNA encoding: converting binary code (0,1) into bases (A,T,C,G)
key components:
data
encoding system
writer (synthesizer)
reader (sequencer)
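The encoding system can be illustrated with the simplest possible mapping of two bits per base. The mapping below (00→A, 01→C, 10→G, 11→T) is an assumption chosen for this sketch; practical schemes add error correction and avoid long runs of the same base.

```python
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes to a DNA string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Convert a DNA string produced by encode() back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"DNA")
print(strand)          # CACACATGCAAC
print(decode(strand))  # b'DNA'
```

In terms of the key components, `encode` plays the role of the encoding system, a synthesizer would write `strand` as a physical molecule, and a sequencer would read it back before `decode` is applied.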
Sequencing applications
human genome sequencing
disease diagnosis
personalized medicine
forensic identification
Human Genome Project
1990 - 2003
first full sequence of human DNA (3.1 billion bp)
clone based & shotgun sequencing methods
discovery of around 30,000 human genes
Importance in Primer Design
Melting temperature (Tm):
temperature at which half of the primer-template duplexes have dissociated
55 - 65°C
Length & specificity:
long enough to anneal specifically to the target region
not so long that secondary structures form
optimal: 18-24 nucleotides
avoid regions with: high sequence similarity / repetitive DNA
Secondary structures:
hairpin loops & self complementary sequences = can interfere with Primer Binding & amplification
GC content:
GC content must be balanced
GC: 40-60%
GC content of target DNA: 30-70%
GC clamp: G or C bases within the last 5 bases of the 3' end = helps to promote specific binding
always consider: mutations & SNPs (=single nucleotide polymorphisms)
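The numeric criteria above can be checked automatically. A minimal sketch: the melting temperature uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short primers and is an assumption here, since these notes do not name a formula. The example primer and function names are made up, and secondary structures are not checked.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in °C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def has_gc_clamp(primer: str) -> bool:
    """True if at least one G or C lies within the last 5 bases of the 3' end."""
    return any(base in "GC" for base in primer.upper()[-5:])


def check_primer(primer: str) -> dict:
    """Compare a primer (written 5'->3') against the ranges listed in the notes."""
    return {
        "length_ok": 18 <= len(primer) <= 24,
        "gc_ok": 40 <= gc_content(primer) <= 60,
        "tm_ok": 55 <= wallace_tm(primer) <= 65,
        "gc_clamp": has_gc_clamp(primer),
    }


primer = "AGCGTACCTGATCGATGCAC"  # made-up 20-mer
print(gc_content(primer), wallace_tm(primer))  # 55.0 62
print(check_primer(primer))
```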
Sanger sequencing (first generation)
very accurate
slow + expensive
for shorter sequences (up to about 1000 bp)
dideoxy technique
components:
DNA template
Primer
Polymerase
dNTPs (DNA building blocks)
ddNTPs (stop bases)
Sanger sequencing process
Copy of DNA:
like PCR but add: special stop bases (ddNTPs)
mostly added: dNTPs = DNA grows
sometimes: ddNTPs = DNA immediately stops growing
add them for all bases (each has a different color): ddATP, ddTTP, ddCTP, ddGTP
Many fragments are formed:
we get 1000s of fragments of different length = each fragment ends with specific base (where stop base was added)
Sorting by length:
using gel-electrophoresis: shortest fragments move fastest!
-> what color appears at each position (length) = tells us which base was at that position in original sequence
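The read-out logic can be simulated in a few lines. This is an idealized sketch: it assumes exactly one terminated fragment per position and a perfect signal, and it represents the color label simply as the last letter of each fragment. The function name is hypothetical.

```python
import random


def sanger_read(new_strand: str, seed: int = 0) -> str:
    """Recover a sequence from its chain-terminated fragments."""
    # Chain termination: one fragment per position, each ending in a labelled ddNTP.
    fragments = [new_strand[:end] for end in range(1, len(new_strand) + 1)]
    # In the reaction tube the fragments are mixed in no particular order.
    random.Random(seed).shuffle(fragments)
    # Electrophoresis: the shortest fragments move fastest, so sort by length.
    fragments.sort(key=len)
    # The label on the last base of each fragment gives the base at that position.
    return "".join(fragment[-1] for fragment in fragments)


print(sanger_read("ATGCCGTA"))  # ATGCCGTA
```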
knowing the base order in a gene / genome tells us:
what gene it is
what protein it codes for
whether there is a mutation
how to diagnose genetic diseases
how to design personalized treatments
Illumina (NGS)
sequencing by synthesis (most common method)
DNA fixed to flow cell surface
Fluorescent nucleotides incorporated + detected by camera
13 billion reads per run
Platforms: MiSeq, HiSeq, NovaSeq
Oxford Nanopore (NGS)
Nanopore sequencing
produces long reads
DNA pulled through protein nanopores
changes in electrical current identify bases
Platform: MinION (portable device)
Ion Torrent (NGS)
Proton Detection Sequencing
doesn't need light!
detects pH changes from released H+ ions
As DNA bases are added = H+ is released = pH decreases = sensor measures the voltage change
Platform: Ion Proton, PGM
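In this kind of sequencing the nucleotides are offered one type at a time, and the signal grows with the number of bases incorporated in that step. A toy sketch of the idea, assuming a fixed flow order T, A, C, G (an assumption for illustration, not the instrument's actual flow order) and a noise-free signal; the function names are made up.

```python
def flow_signals(new_strand: str, flow_order: str = "TACG", max_flows: int = 100):
    """Return (nucleotide, incorporations) for each flow until the strand is complete."""
    signals = []
    position = 0
    flow = 0
    while position < len(new_strand) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        count = 0
        # Every matching base in a row is incorporated during the same flow,
        # releasing one H+ each, so the signal scales with the run length.
        while position < len(new_strand) and new_strand[position] == nucleotide:
            count += 1
            position += 1
        signals.append((nucleotide, count))
        flow += 1
    return signals


def call_bases(signals) -> str:
    """Turn the per-flow signals back into a base sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = flow_signals("TTAGC")
print(signals)              # [('T', 2), ('A', 1), ('C', 0), ('G', 1), ('T', 0), ('A', 0), ('C', 1)]
print(call_bases(signals))  # TTAGC
```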
SOLiD (NGS)
sequencing by ligation
DNA read through fluorescently labelled ligation reaction
complex data analysis & outdated
Pyro-sequencing-454 (NGS)
emits light signal when nucleotide is incorporated
Platform: Roche 454
no longer in use
Sanger vs. NGS
Genetic engineering
to modify / enhance the characteristics of individuals or organisms
can change:
one base pair
delete whole regions of DNA
introduce additional copies of a gene
combine genetic material from 2 different organisms
GE - Vectors
delivery tool used to transport a desired gene into a target cell
Viral vectors
Use modified or weakened viruses.
Viruses are naturally efficient gene carriers because they infect cells.
e.g.: Adenoviruses, Lentiviruses (HIV-based), AAV (Adeno-associated virus)
Advantages:
Very efficient at delivering genes into human cells.
Can lead to long-term gene expression.
Disadvantages:
Can trigger immune responses.
Often complex and expensive to produce.
Non viral vectors
use plasmids, liposomes, nanoparticles, or electroporation (electric pulses)
No virus is involved
Safer (less risk of immune reaction)
Simpler and cheaper to produce
less efficient at gene delivery
Often short-term expression
GE - Parts:
Promoter: initiates transcription of gene within a vector
Origin of replication: needed to start replication process in host cell
Cloning site: site where the gene of interest is inserted into the vector
Targeting sequence: facilitates targeting of the expressed protein to a specific location (e.g. periplasmic space of bacteria)
Antibiotic resistance: for selection of cells that carry the vector with the gene of interest
Epitope: allows antibody recognition (some vectors carry specific epitope sequences)
Reporter genes: enable identification of plasmids that carry the inserted DNA sequence
Protein purification tags: facilitate the purification process of the expressed protein
Gene
segment of DNA that contains the instructions to make a specific protein (sometimes RNA)
Inheritance Types