What is the C-value Paradox?
C-value (=size of genome) depends on organism
—> constant within a species
—> varies widely among species
Paradox: no strong correlation between complexity and genome size
Are in general bacterial or eukaryotic genomes smaller?
bacterial smaller than eukaryotes
bacteria: 580 kb to 13 Mb (20-30-fold size variation in prokaryotes)
eukaryotes: 8.8 Mb to 700 Gb (80,000-fold size variation)
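The fold-variation figures can be checked with a few lines of arithmetic. This is only a sketch using the size ranges quoted above; the constant and function names are made up for illustration.

```python
# Genome size ranges as quoted in these notes, in base pairs.
BACTERIA_BP = (580_000, 13_000_000)              # 580 kb to 13 Mb
EUKARYOTES_BP = (8_800_000, 700_000_000_000)     # 8.8 Mb to 700 Gb

def fold_variation(size_range):
    """Largest genome divided by smallest genome in the range."""
    smallest, largest = size_range
    return largest / smallest

bacteria_fold = fold_variation(BACTERIA_BP)
eukaryote_fold = fold_variation(EUKARYOTES_BP)

print(f"bacteria:   about {bacteria_fold:.0f}-fold")
print(f"eukaryotes: about {eukaryote_fold:.0f}-fold")
```

The bacterial value falls inside the 20-30-fold range, and the eukaryotic value rounds to roughly 80,000-fold.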
Is there a correlation between genome size and gene number?
In bacteria —> Yes!
In eukaryotes —> No
(some correlation, e.g. yeast < Drosophila < human, but range of variation is <50-fold)
Give the genome size and gene number of
Yeast
Human
and the
Yeast-Human C ratio
Yeast-Human G ratio
Yeast: 15 Mb, 6,000 genes
Human: 3,000 Mb (3 Gb), 25,000 genes
C ratio for DNA (human to yeast) —> 3,000/15 = 200
G ratio for genes (human to yeast) —> 25,000/6,000 = 4.2
—> most of C-value variation is due to amount of non-coding DNA
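The same comparison as a worked calculation, a minimal sketch that uses only the yeast and human figures from this card. The dictionary layout and the `kb_per_gene` helper are illustrative assumptions, not part of the notes.

```python
# Figures from the card above: genome size in Mb and number of genes.
yeast = {"genome_mb": 15, "genes": 6_000}
human = {"genome_mb": 3_000, "genes": 25_000}

def kb_per_gene(organism):
    """Average amount of genome (in kb) per gene."""
    return organism["genome_mb"] * 1_000 / organism["genes"]

c_ratio = human["genome_mb"] / yeast["genome_mb"]   # DNA content, human to yeast
g_ratio = human["genes"] / yeast["genes"]           # gene number, human to yeast

print(f"C ratio: {c_ratio:.0f}")
print(f"G ratio: {g_ratio:.1f}")
print(f"yeast: {kb_per_gene(yeast):.1f} kb per gene")
print(f"human: {kb_per_gene(human):.1f} kb per gene")
```

Genome size differs 200-fold while gene number differs only about 4-fold, so the average DNA per gene is far larger in human than in yeast.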
What is heterochromatin?
large regions of genome with no/very few genes
—> difficult to clone and usually not sequenced in genome projects
How much of the human genome is protein-coding?
2%
What is satellite DNA?
Where is it found?
distinct bands of DNA in density-gradient centrifugation —> heavier / lighter than the majority of genomic DNA
short sequences (2-2,000 bp) repeated in tandem (1,000 times in a row) that have:
high GC content (heavier)
high AT content (lighter)
found in heterochromatic regions and around centromeres
What are Minisatellites and Microsatellites?
Where are they found?
Minisatellites: sequences of 9-100 bp repeated 10-100 times
—> in subtelomeric regions
Microsatellites: very short sequences of 1-5 bp repeated 10-100 times
SRS = short repetitive sequences
STR = short tandem repeats
SSR = simple sequence repeats
—> dispersed throughout chromosomes (in and around genes)
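A rough sketch of how microsatellites could be located in a sequence, assuming the thresholds from this card (unit of 1-5 bp, at least 10 copies) and perfect repeats only. The function name, its parameters and the example sequence are invented for illustration; real repeat finders also tolerate imperfect copies.

```python
import re

def find_microsatellites(seq, min_unit=1, max_unit=5, min_repeats=10):
    """Return (start, unit, copies) for each perfect tandem repeat found."""
    # The lazy quantifier tries the shortest repeat unit first; \1 then
    # requires at least (min_repeats - 1) further back-to-back copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Made-up sequence: a (CA)12 repeat with short flanks on both sides.
example = "GGATTG" + "CA" * 12 + "TTGACC"
print(find_microsatellites(example))  # [(6, 'CA', 12)]
```

Depending on the flanking bases, the reported unit can be a rotation of the motif (for example AC instead of CA), so units are best compared after normalising rotations.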
What is the most common satellite DNA in humans?
CA dinucleotide repeat (50,000 copies)
Where are microsatellites useful?
DNA fingerprinting
What are examples of trinucleotide repeats?
Fragile X-syndrome (CCG)
Huntington’s disease (CAG)
Schizophrenia (CAG)
Myotonic Dystrophy (CTG)
many other neuromuscular disorders
What are Transposable Elements (TEs), and how much of the human genome do they make up?
pieces of DNA that can move within the genome and increase in number (jumping genes)
50% of human genome
What are the two major types of Transposable Elements? Which mechanisms are involved in them?
Transposons, Retrotransposons
classified by their mechanism of transposition
Conservative transposition -> Transposon (involve only DNA)
Replicative transposition -> Transposon (involve only DNA)
Retrotransposition -> Retrotransposon (requires RNA intermediate)
Which type of transposable element is the most abundant in a genome?
Retrotransposons
What are the mechanisms of Transposons and Retrotransposons?
Is the copy number increasing or decreasing?
Conservative transposition
TE move from one place to another
recombination between elements at different chromosomal locations (‘cut-and-paste’)
—> no increase
Replicative transposition
original element remains at donor site, while new copy inserted in new site (‘copy-and-paste’)
—> copy number increases
Retrotransposition
TE transcribed to RNA —> reverse transcribed to cDNA —> inserts in new chromosomal location (‘copy-and-paste’)
—> copy number increases
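The effect of each mechanism on copy number can be sketched with a toy model in which a genome is just a set of occupied positions. The `transpose` function, the mechanism labels and the positions are hypothetical and only illustrate the cut-and-paste versus copy-and-paste bookkeeping.

```python
def transpose(sites, donor, target, mechanism):
    """Return the set of occupied sites after one transposition event."""
    if donor not in sites:
        raise ValueError("no element at the donor site")
    if target in sites:
        raise ValueError("target site already occupied (toy simplification)")
    new_sites = set(sites)
    if mechanism == "conservative":
        # cut-and-paste: the element leaves the donor site
        new_sites.remove(donor)
    elif mechanism not in ("replicative", "retrotransposition"):
        raise ValueError(f"unknown mechanism: {mechanism}")
    # in every mechanism a copy ends up at the target site
    new_sites.add(target)
    return new_sites

start = {100}  # one element at made-up position 100
for mechanism in ("conservative", "replicative", "retrotransposition"):
    after = transpose(start, donor=100, target=500, mechanism=mechanism)
    print(mechanism, len(start), "->", len(after))
```

Only the conservative mechanism leaves the copy number unchanged; the two copy-and-paste mechanisms add one copy per event.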
How long are Transposons?
2,500-7,000 bp long
What are autonomous and non-autonomous transposons?
autonomous
terminal repeats at ends
encode single gene (transposase)
can move by themselves
non-autonomous
no transposase gene
cannot move by themselves
can move if another element in the genome produces transposase
What is the ‘Helper element’?
In which organism is it useful?
Give an example.
no inverted repeats
has transposase gene
cannot move, but can cause non-autonomous elements to move
useful for experiments in organisms like Drosophila, e.g.:
transposase gene of a TE replaced with any gene
—> helper element makes transposase
—> insert gene into Drosophila genome
—> helper element removed
—> new gene becomes stable part of genome
What are Retrotransposons?
Active - have intact promoter
transcribed
can retrotranspose
What is ‘DOA’?
‘Dead’ or ‘Dead On Arrival (DOA)’ - retroelements
often truncated at the 5’ end when inserted into DNA
—> removes promoter
—> can no longer be transcribed / retrotransposed
—> under no selective constraint and accumulate mutations at random —> ‘junk-DNA’
What are pseudogenes?
previously functional genes, lost function due to mutation:
stop codon introduced into ORF
insertion / deletion that disrupts reading frame
rare cases: function lost due to a parasitic / symbiotic relationship with the host —> genes not needed —> lost due to mutation (M. tuberculosis)
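The two main routes to loss of function can be illustrated on a toy ORF. This is only a sketch: the sequence is made up, the `classify_orf` helper is hypothetical, and it assumes the standard stop codons (TAA, TAG, TGA) and a known reading frame starting at position 0.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_codon(seq):
    """Index (in codons) of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

def classify_orf(seq):
    """Very rough label for a sequence expected to be one complete ORF."""
    if len(seq) % 3 != 0:
        return "reading frame disrupted"
    stop = first_stop_codon(seq)
    if stop is None:
        return "no stop codon"
    if stop < len(seq) // 3 - 1:
        return "premature stop codon"
    return "intact"

intact = "ATGGCTGAACGTTGGAAATAA"               # ATG GCT GAA CGT TGG AAA TAA
nonsense = intact[:14] + "A" + intact[15:]     # TGG -> TGA (point mutation)
frameshift = intact[:4] + intact[5:]           # single-base deletion

for name, seq in [("intact", intact), ("nonsense", nonsense),
                  ("frameshift", frameshift)]:
    print(name, "->", classify_orf(seq))
```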
What are unprocessed pseudogenes?
often arise through tandem duplication
—> entire section of DNA duplicated during replication
—> producing 2 copies of gene
if only 1 copy required: other copy accumulates mutations —> non-functional pseudogene
What are processed pseudogenes?
What are the key features?
(or retrotransposed genes)
mRNA of nuclear gene is reverse transcribed into cDNA —> re-inserts into genome
uses reverse transcriptase + integrase enzymes encoded by retroelement
Key features:
lacks the introns present in the ‘parental’ gene
if recent, possibly a poly-A sequence at 3’ end
usually lacks promoter sequences (—> Dead-on-Arrival)
Why do some genes appear to retrotranspose more than others?
expression level:
highly-expressed genes have more mRNA —> greater chance of being reverse transcribed
gene size:
short mRNAs may retrotranspose better than long
sequence-specific:
primary sequence of some genes may be better for retrotransposition
What affects genome size?
Why is there such great variation in genome size?
mutational and selective forces affect genome size
major classes of explanation:
adaptive
non-coding DNA is functionally important to organism
junk DNA
most of non-coding DNA serves no purpose
may even be parasitic / ‘selfish DNA’
Can C-value variation be explained by differences in mutation rate?
Describe with Laupala sp. and Drosophila.
approach:
Laupala sp. (Hawaiian crickets) have a genome 11 times larger than Drosophila (genome small, almost no pseudogenes)
Rates of DNA deletion estimated in both by comparing sequences of DOA transposable elements (similar to pseudogenes)
result:
spontaneous DNA loss faster in Drosophila
—> may explain why Dros. genome small and almost no pseudogenes (lost rapidly by deletion mutations)
Does this also hold for larger genomes (20 Gb)?
Grasshoppers (Podisma)
genome over 10X larger than Laupala
genome over 100X larger than Drosophila
many pseudogenes in nuclear DNA derived from mitochondrial genes (NUMTs)
—> Podisma very low rate of DNA loss
—> inverse correlation between genome size and rate of spontaneous DNA deletion holds for 3 insect groups with greatly different genome sizes
What are NUMTs?
Why are NUMTs non-functional?
Nuclear copies of mitochondrial genes = ‘new mites’
non-functional, bc:
genetic code different between mitochondria and nucleus
often lack promoter
don’t have signal sequence to target to mitochondria
What are the differences of
Deletion rate
Deletion size
Genome size
of Drosophila, Laupala, and Podisma?
Deletion rate: Dros > Lau > Pod
Deletion size: Dros > Lau > Pod
Genome size: Pod > Lau > Dros