List at least three gene transfer methods that can be used to introduce DNA into plants.
Biological e.g. Agrobacterium mediated transformation
Mechanical e.g. Particle bombardment
Physical e.g. electroporation, microinjection, silica/carbon fibers, laser-mediated
Chemical e.g. polyethylene glycol (PEG)
Briefly describe three genetically modified plants that are currently in the field/on the market (outside Europe). Which traits do they contain?
Recent Examples of GM Biotech crops approved or commercialized in Brazil, Argentina, Canada, the United States:
Arctic® Apples that do not brown when sliced
Innate™ potatoes, with lower levels of acrylamide, a potential carcinogen, resistance to bruising, and late blight resistance
HB4 Transgenic wheat approved for cultivation:
Transcription factor HB4 from sunflower transferred into wheat leads to drought tolerance
What is Bt-toxin? Which organism is the original source of Bt-genes? Briefly explain its mechanism of toxicity for insects.
Bacillus thuringiensis (a soil bacterium); Bt toxins are its insecticidal crystal (Cry) proteins
Mechanism:
toxin protein ingested by insect when insect feeds on plant
toxin activated in insect gut -> alkaline conditions and specific proteases needed
inserts into the gut epithelial cell membrane (after binding to specific receptors) and forms an ion channel -> the gut cells lyse and the insect dies
What does „codon-modification/optimization“ of a gene sequence mean? Why is it done?
Different organisms preferentially use different codons for the same amino acid. Before a sequence is transferred into another organism, its codons are adapted to the host's preferences (without changing the encoded protein) to increase the expression rate in that organism.
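The idea can be illustrated with a minimal sketch. The preferred-codon table below is hypothetical and only covers the amino acids of the toy sequence; a real project would take the preferences from a codon usage table of the host species.

```python
# Hypothetical preferred codons of the target plant (illustration only).
PREFERRED = {"M": "ATG", "K": "AAG", "L": "CTC", "S": "TCC", "*": "TGA"}

# Small slice of the standard genetic code, enough for the example sequence.
CODON_TO_AA = {"ATG": "M", "AAA": "K", "AAG": "K", "TTA": "L", "CTC": "L",
               "AGT": "S", "TCC": "S", "TAA": "*", "TGA": "*"}

def translate(cds):
    """Translate a coding sequence codon by codon."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))

def codon_optimize(cds):
    """Swap each codon for the host-preferred synonym; the protein stays the same."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(PREFERRED[CODON_TO_AA[c]] for c in codons)

original = "ATGAAATTAAGTTAA"          # toy sequence: M K L S stop
optimized = codon_optimize(original)
print(optimized)
print(translate(original) == translate(optimized))
```

Only synonymous codons are exchanged, so the amino acid sequence is identical before and after optimization.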
Briefly describe three general strategies to achieve herbicide resistance in plants.
Introduction of the resistance from a closely related species via a wide cross or, if present within the species, by conventional crossing
Mutation breeding (e.g. radiation or chemical mutagenesis), followed by screening for a resistance that arose by mutation
Introduction via gene transfer from a donor organism within or outside of the species
List five typical elements of a CRISPR/Cas-based genome editing vector to be transferred into a plant cell.
“guide” RNA,
Editing nuclease: sequence optimized;
Selectable marker gene,
Promoter,
Nuclear targeting peptide,
terminator
Can herbicide resistance also be achieved by conventional mutation breeding? How? Does GMO-regulation apply in such a case?
Would be possible via creating mutations (random mutagenesis)
It is not GMO as it is defined in the regulations
Describe the mechanism of CRISPR/Cas technology. What is it used for?
The CRISPR/Cas9 system is a highly precise defence mechanism of bacteria against viruses. The bacterium stores a segment of the genome of a virus that attacked it but was defeated. If the bacterium encounters the virus again, the sequence is recognised and cut.
We can reprogram the system by supplying a guide sequence that matches a target of our choice. If this sequence is found by the CRISPR/Cas9 complex, the DNA is cut there. This can be used to create knockouts or, going further, to introduce new segments.
Recombinant endonucleases (such as Zn-finger nucleases) can be used to modify plant genes (genome editing). Please explain briefly how this works.
They can be engineered to recognise a specific sequence and cut at that site. This can be used to create knockouts or to introduce a new segment at that locus.
Explain how a PCR works.
Denaturation
Annealing
Extension/Elongation
Briefly describe the use of particle bombardment for gene transfer into plants.
During particle bombardment, DNA sequences are attached to particles (e.g. gold). The particles are then shot at the cells under high pressure. Owing to their high speed, they are able to enter the cells.
How does oligonucleotide-directed mutagenesis work? Describe a commercial product generated with this technology.
Oligonucleotide-directed mutagenesis (ODM) is a method by which a single mutation can be induced at a specific site in the genome. For this purpose, oligonucleotides (short pieces of DNA whose sequence matches the target sequence) are synthetically produced and introduced into the cell. Because the oligonucleotide carries a deliberately introduced mismatched nucleotide, individual base pairs are exchanged or altered during the subsequent DNA repair at the target site. Among the DNA changes obtained, the one that corresponds to the breeding target must then be identified.
Canola (CIBUS):
Under the Falco™ brand, our first commercial product is a non-transgenic sulfonylurea herbicide tolerant canola (oilseed rape) providing an alternative weed control option for canola farmers globally, including lower herbicide costs, non-GM premiums and a synergistic option for soybean rotations.
Explain the terms Cisgene and Transgene.
Cisgenics: gene from sexually compatible plant
Intragenics: Newly arranged sequence from compatible plant
Transgenics: gene from a sexually incompatible species (or another organism)
Your aim is to abolish a specific enzymatic activity in a given plant. You have got the sequence of the corresponding gene encoding the enzyme. Develop and describe a strategy to achieve this goal.
CRISPR/Cas9 system (easy and cheap): design an sgRNA and combine it with the Cas9 protein (the cutting enzyme). The complex binds next to a PAM site and unwinds the double helix. When the matching sequence is found, Cas9 introduces a double-strand break. Repair of the cut is error-prone and often leads to mutations (small insertions or deletions), which makes CRISPR a great tool for knocking out genes.
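The first step, picking candidate target sites in the known gene sequence, can be sketched as follows. This is a minimal illustration that assumes the commonly used SpCas9 with an NGG PAM and a 20-nt spacer; it scans one strand only, the example sequence is made up, and off-target checks are not included.

```python
import re

def find_cas9_targets(seq, spacer_len=20):
    """Return (start, protospacer, PAM) for every NGG PAM on the given strand."""
    seq = seq.upper()
    targets = []
    # Lookahead pattern so that overlapping PAMs are all reported.
    for m in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = m.start()
        if pam_start >= spacer_len:  # need a full-length spacer 5' of the PAM
            spacer = seq[pam_start - spacer_len:pam_start]
            targets.append((pam_start - spacer_len, spacer, m.group(1)))
    return targets

# Hypothetical stretch of the target gene (not a real sequence).
gene = "ATGCATGCATGCATGCATGCATGC" + "TGG" + "AAA"
for start, spacer, pam in find_cas9_targets(gene):
    print(start, spacer, pam)
```

In practice one would also scan the reverse complement and prefer sites early in the coding sequence, so that a frameshift abolishes the enzyme activity.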
What is the role of the guide RNA in CRISPR/Cas-based gene editing?
A guide RNA (gRNA) is a piece of RNA that functions as a guide for RNA- or DNA-targeting enzymes, with which it forms complexes. Very often these enzymes will delete, insert or otherwise alter the targeted RNA or DNA. They occur naturally, serving important functions, but can also be designed to be used for targeted editing, such as with CRISPR-Cas9
Explain briefly how RNA-induced silencing works. How can this be used to generate plants with increased virus resistance?
dsRNA is produced in the cell and processed into siRNAs (small interfering RNAs), which are incorporated into a multicomponent nuclease complex (RISC). The complex then cleaves mRNA with the complementary sequence. If the plant expresses dsRNA that matches viral sequences, the viral RNA is degraded, which makes the plant resistant to the virus.
Describe briefly the use of antisense (RNAi) technology for generating amylose-free potatoes.
insertion of a coding sequence which produces dsRNA (hairpin construct) complementary to the sequence for the GBSSI enzyme.
Production of siRNA and incorporation into multicomponent nuclease complex.
Complex degrades mRNA encoding for the GBSSI enzyme.
Describe briefly the use of CRISPR/Cas-based technology for generating amylose-free potatoes.
no stable transfer of a transgene locus; instead the Cas9 protein/sgRNA complex or a transiently expressed construct is delivered (avoids a selection cycle and maintains variety quality, which matters because potato is a vegetatively propagated crop)
contains sgRNA for the GBSSI enzyme (enzyme involved in starch synthesis) locus -> knockout of all copies of GBSSI
-> no elongation of amylose, almost exclusively amylopectin is created
Select two strategies for generating amylose-free potatoes. Describe them briefly, compare them, and point out advantages and disadvantages.
Random mutagenesis (e.g. EMS) and afterwards TILLING
many other mutations as well -> decrease of quality
CRISPR/Cas9: targeted mutation that knocks out the enzyme for the elongation of amylose (GBSSI)
easier, fast, precise, but legally restricted (regulated as GMO in the EU)
The procedure for CRISPR/Cas-based gene editing can be done by transferring a DNA-construct encoding the Cas enzyme and the gRNA into the plant cell. Is the integration of the DNA strictly necessary? If not, describe the two options (with/without integration) and their pros and cons. Can CRISPR/Cas-based gene editing also work without DNA-transfer? How is this done?
stable transfer of a transgene locus encoding the CRISPR/Cas9 construct (easier selection and higher transformation rate, but generative propagation is needed to segregate the transgene out again)
or only transient delivery of the complex or vector without integration (better for vegetatively propagated crops and for maintaining the variety, but selection is hard)
Plant transformation constructs often contain a selectable marker. Why? Give an example for a selectable marker.
Makes it easier to select the plant with the transformation
Easier to distinguish between plants with and without the desired gene
a gene encoding an enzyme is added along with your gene
Common selectable marker genes:
Antibiotic resistance: NPTII - detoxifies kanamycin (antibiotic),
Herbicide resistance: bar - detoxifies glufosinate (herbicide),
positive selection marker: PMI (phosphomannose isomerase, mannose-6-phosphate isomerase): converts mannose into a usable carbohydrate
What is the aim of "reverse breeding"? Which function needs to be suppressed to achieve this goal?
directly produces parental lines for any heterozygous plant
generates perfectly complementing homozygous parental lines
Suppression of meiotic recombination
Explain RNAi-based strategies to protect plants from insect pests.
dsRNA is either applied to the plant or produced by the plant; after ingestion by the insect it is processed into siRNA, which directs cleavage of the insect mRNA with the complementary sequence.
How can CRISPR/Cas-based gene editing be used to introduce precise point mutations?
Use of sgRNA that enables cutting at a designated spot
Give your personally preferred definition of Plant Breeding and explain why you like it.
Plant breeding is the genetic improvement of plants for human benefit.
Short, easy and true.
In which typical phases can the activity of Plant Breeding be structured? For which of these can Biotechnology help breeders: give three examples that you appreciate most.
The activity of plant breeding can be partitioned in three phases: (Modern biotechnology helps plant breeders)
1) Planning: design/imagine a breeding target (the ‘ideal’ cultivar)
2) Generate (new) variants: e.g., crossing, mutagenesis, transgenics
Careful planning of new crosses, mutagenesis, transgenics
3) Find and maintain the ‘best’ new variants
selection (markers), testing of new candidates, check for purity
What is tissue culture in a plant breeding context? Name at least four typical applications of tissue culture in a plant breeding context.
collection of techniques used to maintain or grow plant cells, tissues or organs under sterile conditions on a nutrient culture medium of a known composition
In-vitro-propagation (production of clones)
virus elimination
embryo rescue in wide crosses
tissue culture for doubled haploids (embryogenesis)
protoplast fusion (= somatic hybridisation allows fusion of different plant species)
single cell regeneration (e.g. after transformation)
What are Doubled Haploids, how can these be produced? Which is the typical genetic constitution of a doubled haploid plant? Advantages and disadvantages of this method in breeding.
A doubled haploid is a plant that has two identical chromosome sets
Homozygosity, diploid
can be produced via:
regeneration from haploid cells (gametes e.g., pollen or ovary), Chromosome elimination, Colchicine
Faster homozygosity in comparison to inbreeding
Labour intensive, working with toxic substances
From which cells can haploid plants be derived, and how? Will a haploid plant typically be fertile and form seeds?
Production from gametes (pollen or ovaries)
(Colchicine treatment or spontaneous diploidization)
Haploids typically cannot reproduce sexually (they are sterile), since the single chromosome set cannot be divided further during meiosis
What is the genetic status of a doubled haploid plant in terms of homozygosity or heterozygosity? Is there an alternative procedure to achieve the same zygosity-status as with doubled haploids? If you know one name and describe it.
Homozygosity -> because this is needed for breeding
Inbreeding -> highest inbreeding -> selfing
Selfing takes place over several generations which is very time consuming
Explain and describe the term gene. Mention a few examples.
A gene is a molecular unit of heredity of a living organism
Gene for pea shape, flower colour
Explain and describe the term allele. Mention a few examples.
An allele is one of a number of alternative forms of the same gene or same genetic locus (generally a group of genes).
Allele for round or wrinkled pea shape
What are morphological markers, name several examples. What is their advantage? What their disadvantage?
You can see them while the plant is growing or on the plant in general
->easy to notice
Some traits develop late in life, some are not easily detectable, and there are only a few markers (a few dozen)
E.g., Spelt colour, flower colour, hypocotyl colour, hairy/naked pods etc.
White flowers and missing black spots on stipules are perfect markers for tannin free seeds
What are biochemical markers, name a few examples.
Biochemical markers are protein markers
They visualize polymorphisms in proteins, such as storage proteins or enzymatic proteins.
E.g., Storage proteins (as glutenins and gliadins), Isozyme-Markers (Endopeptidase)
What are the two principal types of DNA polymorphisms?
Point mutations (single nucleotide polymorphisms, SNP)
Insertion/Deletion (INDEL): a more or less large piece of DNA is missing or extra present
Mention at least four important types of DNA markers based on the methodology used for detection/analysis.
Use of restriction enzymes for generating DNA fragments of different size, their detection is based on DNA/DNA Hybridization -> RFLP
Markers based on PCR: microsatellites (SSR), CAPS, (RAPD, AFLP, S-SAP, ISSR, SRAP, STS, SCAR…)
Marker detection on high-throughput DNA chips, modern DNA sequencing methods or different variants of PCR:
SNP on chips, genotyping-by-sequencing
(DArT and numerous more)
What are restriction-endonucleases? What is their typical feature? For which molecular markers do we need restriction-endonucleases?
Restriction-endonucleases are enzymes that cut DNA sequences at recognition sites
We need them for RFLP markers
Use of restriction enzymes for generating DNA fragments of different size, their detection is based on DNA/DNA Hybridization
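How a restriction site polymorphism turns into fragments of different size can be sketched in a few lines. The example assumes EcoRI (recognition site GAATTC, cut after the first G); the two allele sequences are invented for illustration.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every recognition site; return fragment lengths."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)   # EcoRI cuts between G and AATTC
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical alleles: a point mutation destroys the second site in allele B.
allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(digest(allele_a))
print(digest(allele_b))
```

The two alleles give different fragment patterns, which is the size difference that an RFLP analysis detects after hybridization.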
What is the principle behind gel-electrophoresis of DNA? How does it work? Which equipment do you need for gel-electrophoresis?
Gel electrophoresis is a method for separation and analysis of biomacromolecules (DNA, RNA, proteins, etc.) and their fragments, based on their size and charge.
Nucleic acid molecules are separated by applying an electric field to move the negatively charged molecules through a matrix of agarose or other substances.
Gel, charged molecules ( e.g. DNA fragments), Buffer solution, electric field, colour for visualisation
What is a dominant marker and what is a co-dominant marker? Name a few examples in each category.
Co-dominant markers
Are markers for which both alleles are expressed when co-occurring in an individual
Heterozygotes can be distinguished from homozygotes
SSR, STR and RFLP
Dominant markers
In dominant markers only one allele is visible and heterozygosity cannot be determined directly
ISSR, RAPD and AFLP
What is meant by PCR?
Polymerase Chain Reaction
a technique used to amplify a segment of DNA of interest
PCR enables you to produce millions of copies of a specific DNA sequence from an initially small sample
Which ingredients are required for a PCR reaction?
Standard ingredients in the mixture are:
sequence of interest,
specific primers,
heat-resistant polymerase (Taq-polymerase),
dNTPs,
Buffer,
PCR machine
Describe the principle and the steps of the PCR reaction.
1. at approx. 90°C: Denaturing: the double stranded DNA becomes single stranded
2. at approx. 60°C: Annealing: primers bind to complementary sequences
3. at approx. 74°C: Elongation: the Taq-polymerase elongates the primers
Repeat steps 1-3 many times (typically 20-35 times)
Think about PCR reaction: at which rate are DNA molecules amplified with respect to their molecule size?
Result after n cycles (counting single strands, starting from one double-stranded template):
Linear amplification of PCR product of undefined length: 2n
Exponential amplification of PCR product of defined length (defined by the primer ends): 2^(n+1) - 2n - 2, i.e. roughly a doubling per cycle
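The two growth rates can be checked with a small bookkeeping sketch. It assumes ideal conditions (every strand is copied in every cycle) and counts single strands starting from one double-stranded template; under these assumptions the long products of undefined length number 2n and the products of defined length number 2^(n+1) - 2n - 2 after n cycles.

```python
def pcr_strand_counts(n_cycles):
    """Count single strands after n ideal cycles, starting from one dsDNA template."""
    original, long_products, defined = 2, 0, 0
    for _ in range(n_cycles):
        new_long = original                     # copies of the original strands
        new_defined = long_products + defined   # copies of long or defined strands
        long_products += new_long
        defined += new_defined
    return original, long_products, defined

for n in (1, 2, 3, 10, 30):
    print(n, pcr_strand_counts(n))
```

The long products grow by two strands per cycle, while the defined-length product approximately doubles each cycle and quickly dominates the reaction.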
Do SSR markers usually detect SNPs or INDELs? What is the difference between SNP and INDEL polymorphism?
Detection of SSR markers:
SSR markers are usually INDEL markers, detection needs therefore size determination
Basically there are TWO types of DNA polymorphisms that we assess using currently available DNA marker methods:
Point mutations (single nucleotide polymorphisms, SNP): a single base is exchanged
Insertion/Deletion (INDEL): a more or less large piece of DNA is missing or extra present
What is an SSR marker?
SSR = Simple Sequence Repeats (=microsatellites)
-> SSRs are highly variable and evenly distributed throughout the genome.
This type of repeated DNA is common in eukaryotes
Which separation and detection methods can be used for analyzing SSR markers?
Gel based systems:
Agarose gel (low resolution),
high resolution agarose gel,
Native acrylamide gel,
sequencing gel (denaturing acrylamide gel)
Capillary electrophoresis:
Capillary electrophoresis systems (e.g., capillary sequencer, capillary fragment analyzer, …),
Chip-electrophoresis systems (miniaturized electrophoresis systems, …)
Are SSR markers usually dominant or co-dominant markers?
SSR are co-dominant markers
Are SSR markers usually bi-allelic or multi-allelic?
SSR loci are often multi-allelic
Why are SSR markers still quite popular in plant breeding application?
Allow typically (but not always) co-dominant scoring
For many crop plants, many SSR markers (primer pairs) are available in the public database, which can be used for research and breeding
Many markers are already mapped and publicly available
What is meant by the term SNP marker?
SNP markers detect Single Nucleotide Polymorphisms,
Variations at specific nucleotide positions in the DNA of a species (base substitutions because of point mutations),
SNPs are the most abundant sequence polymorphisms,
many are silent mutations because of the degenerate code for amino acids, and
SNPs often occur outside the coding regions
Name a few methods for SNP marker detection with and/or without electrophoresis. Describe the principle of one method.
Methods based on hybridization (e.g. microarrays)
Enzyme-based methods (CAPS, RFLP, primer-extension…)
Detection post PCR (e.g. Single-stranded Conformation Polymorphism, Minisequencing, KASPar system, …)
What is the principle of the KASPar system for SNP detection?
Uses allele specific PCR primers for amplification of SNPs combined with fluorescence detection
What is meant with the term Genotyping by Sequencing (GBS)? Describe the principle of this method.
GBS is a method to discover SNPs and to perform a genotyping study.
For large and complex genomes, target enrichment or a reduction of genome complexity (e.g. with restriction enzymes) must be performed first to ensure sufficient overlap in sequence coverage between samples.
Name at least two methods that can be used for high throughput SNP marker genotyping. Describe advantages / disadvantages of at least two such methods.
SNP genotyping on microarrays
Disadvantages of SNP ARRAYS:
development costs, therefore available for ‚important‘ species,
only those SNP that are represented on the array can be measured (may have a bias)
Advantages:
easily comparable and highly reproducible, robust,
High number can be detected at relatively low costs
GBS – Genotyping by Sequencing:
open platform, not restricted to pre-selected SNPs,
can also be used for minor crops/less economically important species
Linkage mapping depends on which biological phenomenon that takes place during meiosis?
Recombination:
Interchromosomal and Intrachromosomal
Your task is to develop a linkage map. How would you perform this task? Design and describe step-by-step what you need to do and achieve in order to generate a new linkage map from the beginning to completion.
Have ideally two homozygous parent (lines) that are distinct from each other in two or multiple traits/loci -> creation of a hybrid/cross (F1) -> selfing (or backcrossing) (alternatively development of homozygous progeny) -> measuring of the traits in the F2/homozygous progeny -> comparison of the ratio parental phenotype/traits and the newly combined phenotype/traits
Best to have >100 progeny
You were given the task to identify markers and map quantitative trait loci (QTL) for starch content in sorghum. How would you perform this task? Design, describe, and possibly illustrate step-by-step what you need to do and achieve in order to map relevant QTL for these traits in this crop species from the beginning to completion.
Parents that differ in one trait, high and low starch content
Make a cross F1
Selfing or backcross with one parent -> segregating population
Recombinant inbred lines (RIL) or DH -> we want immortal populations -> DH or RIL population, genotyped with a lot of markers and phenotyped for the trait (with a very low percentage of heterozygosity)
Data analysis
QTL validation
Name and describe at least three possible population types useful for linkage mapping. Describe advantages/disadvantages of these population types.
F2: fast but not many recombinant events, low homozygosity
RIL: needs a long time but reproducible and more recombination events
DH lines: fast, labour intensive, 100% homozygosity, less recombinations than RIL
BC1F1: fast, only one recombination event
Your task is to generate a linkage map in an F2 population of 250 F2 plants. Will it make a difference if you use a marker system which allows DOMINANT marker scoring or one which allows CO-DOMINANT marker scoring? What is the difference between a so-called dominant marker and a co-dominant marker? Which marker system would you prefer for making a genetic map in an F2 population (dominant or co-dominant)? Justify your answer.
Yes, it makes a difference: with a dominant marker it is not possible to distinguish heterozygotes from homozygotes carrying the visible allele
Co-dominant because it has a finer resolution (we want to distinguish between homozygotes and heterozygotes)
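The loss of information with dominant scoring shows up in the expected class sizes. The sketch below assumes regular Mendelian segregation for one marker in 250 F2 plants; the observed counts are hypothetical and only serve to show a goodness-of-fit check.

```python
def f2_expected(n_plants, codominant=True):
    """Expected class sizes in an F2 for one marker (Mendelian segregation)."""
    if codominant:
        # Three distinguishable genotype classes, 1:2:1
        return {"A1A1": 0.25 * n_plants, "A1A2": 0.50 * n_plants, "A2A2": 0.25 * n_plants}
    # Dominant scoring: A1A1 and A1A2 both show the band, 3:1
    return {"band present": 0.75 * n_plants, "band absent": 0.25 * n_plants}

def chi_square(observed, expected):
    """Goodness-of-fit statistic for observed vs. expected class counts."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)

expected = f2_expected(250, codominant=True)
observed = {"A1A1": 60, "A1A2": 130, "A2A2": 60}  # hypothetical scoring result
print(expected)
print(f2_expected(250, codominant=False))
print(round(chi_square(observed, expected), 3))
```

With dominant scoring, three quarters of the plants fall into one class in which the genotype is ambiguous, so each plant contributes less information about recombination.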
In a mapping project you have genotyped your mapping population of 250 DH lines with 40 SSR markers and 500 SNP markers (using a SNP chip method) and for two morphological markers with a monogenic segregation pattern: flower color, seed color. Your task is to calculate a genetic map in this population. --> Reply true or false:
a. I can integrate only the DNA markers (which are SSR and SNP markers) and calculate a map combining these two molecular marker types BUT I cannot map the morphological markers at the same time. TRUE or FALSE?
b. I can integrate all markers (=morphological and SSR and SNP markers) in one combined data set and use this to calculate a genetic map combining all three marker types in one go. TRUE or FALSE?
c. I cannot combine these marker data. The only solution is that I have to calculate three separate genetic maps: one for the SSR markers, one with the SNP markers, and another one with the morphological markers. TRUE or FALSE?
Explain and justify your answer....
a. False
I can combine all markers; it does not matter which type of marker
There is no need to distinguish between DNA and morphological markers; for the mapping it does not matter, as long as we can distinguish the genotype classes (A1A1, A1A2, A2A2)
b. True, for more see a.
c. False, for more see a.
What is an immortal mapping population, name a few examples, describe advantages / disadvantages of an immortal population.
Populations in which the individuals have a high degree of homozygosity are called ‚immortal populations ‘, each line can be multiplied easily and remains genetically constant
examples: DH populations, RIL populations
What is a temporary mapping population, name a few examples, advantages / disadvantages of this population type for mapping?
Mapping populations with a high degree of heterozygosity (example BC1F1 or F2) are ‚temporary populations ‘
You want to create a mapping population with 100 progeny from a cross of two homozygous parents. Let's say you could generate either 100 recombinant inbred lines (RIL) using single seed descent (SSD) or 100 doubled haploid lines (DH) descending from F1 plants. Does the population type (RILs vs. DH) have an impact on the expected resolution of the genetic map?
(A) YES
(B) NO
Explain your answer....
Yes
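DH lines pass through only one meiosis (in the F1), so only one round of recombination is captured
RILs pass through several meioses during selfing and accumulate about double the amount of recombination events, so the resolution is higher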
What if I make a cross between two homozygous parents, then perform 2 generations of random mating before I start with selfing lines to generate a recombinant inbred line (RIL) population of 100 RILs? Will the resolution of a linkage map from such a population increase or decrease or remain largely the same, relative to a 'normal' RIL population without two generations of random mating? (mark correct answer(s))
a. increase
b. decrease
c. remain largely the same
Explain and justify your answer
it will increase
-> more possibilities for crossover
What if I make a cross between two parents and then:
a. I perform 2 generations of random mating before I start with selfing the lines with single seed descent (SSD) to the F5
b. I start selfing immediately from the F1 and self by SSD to the F5. I then choose for genetic mapping 200 random F5 plants in population A) or B).
The genetic resolution of the obtained linkage maps from population A and B will be
I) exactly the same TRUE/FALSE
II) The resolution of the map from population A will be better TRUE/FALSE
III) The resolution of the map from population B will be better TRUE/FALSE
Explain and justify your answer
I) false
II) true
III) false
The more recombination events the higher the resolution
The resolution of any linkage map depends on the number of effective meiotic events that took place in the population
Are two markers possibly linked if the experimentally obtained recombination rate between them is
a. 0.501
b. 0.01
c. 0.0005
d. 0.49
e. 0.05
(Yes/No)
a. No
b. Yes
c. Yes
d. No
e. Yes
Two loci are 1 cM apart when the recombination rate between these two loci is … (complete the sentence)
…. 1% per meiosis event, which means ONE crossover per 100 independent meiosis events
recombination rate [in %] = number of recombined gametes / number of all gametes * 100
Why do we use mapping functions to convert recombination rates in genetic distances in Centimorgan?
Recombination frequency (r) is not a good measure of genetic distance since the average recombination frequency cannot exceed 0.50 or 50%.
Meioses that contain multiple crossovers are counted on average as containing only one crossover, so r is biased downwards, especially when r > 0.20
A cross-over event usually reduces the chance for another crossover event nearby.
-> for these problems we need a mapping function
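Two widely used mapping functions are those of Haldane (assumes no crossover interference) and Kosambi (allows for some interference). The notes do not say which function the course uses, so the sketch below simply shows both for comparison; r is the recombination frequency and must be below 0.5.

```python
import math

def haldane_cm(r):
    """Haldane mapping function: distance in cM, no interference assumed."""
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi mapping function: distance in cM, allows for some interference."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

for r in (0.01, 0.10, 0.20, 0.40):
    print(f"r={r:.2f}  Haldane={haldane_cm(r):6.2f} cM  Kosambi={kosambi_cm(r):6.2f} cM")
```

For small r both functions return roughly 100 x r, while for larger r the map distance becomes clearly larger than the observed recombination percentage.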
Describe and explain the difference between: recombination rate (r) and genetic distance in cM
Recombination rate is the number of recombined gametes / number of all gametes,
Genetic distance is a distance that is generated from the recombination frequency and can be used to generate a genetic map.
Is the recombination frequency typically uniform along a chromosome?
No it is not uniform
There are fewer recombination events in the centromeric region
You should map a gene of interest in a bi-parental cross. Your aim is to map this gene with a precision of about 0.1 cM. You decided to develop a doubled haploid (DH) population for mapping.
a. What is the minimum number of DH lines in your mapping population in order to achieve this goal?
b. Could you alternatively use a recombinant inbred line (RIL) population as well?
c. How many RILs would you need to obtain an expected average map resolution of 0.1 cM?
a. 1000
b. Yes, a RIL population can be used as well
c. 500 RILs, since a RIL is about twice as informative as a DH line
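The arithmetic behind these numbers can be written down as a rough sketch. It assumes that one recombinant line among N DH lines corresponds to 100/N cM, and that a RIL counts as about two informative meioses, as stated in the answer; the function name and parameter are chosen here for illustration.

```python
import math

def lines_needed(resolution_cm, meioses_per_line=1.0):
    """Smallest population in which one recombinant line equals the target resolution."""
    # round() guards against floating point noise before taking the ceiling
    return math.ceil(round(100.0 / (resolution_cm * meioses_per_line), 6))

print(lines_needed(0.1))                        # DH lines
print(lines_needed(0.1, meioses_per_line=2.0))  # RILs, assumed twice as informative
```

This is an expected average resolution; the actual resolution around a given gene depends on where the crossovers happen to fall.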
What is the basic principle behind a genetic similarity coefficient based on molecular markers?
Genotypes that often have marker alleles in common are closer related to each other than genotypes that possess often different marker alleles.
Mention / describe two methods for illustrating / visualizing genetic relatedness (similarity, diversity) based on molecular markers in plant populations.
Jaccard coefficient: number of bands common to both genotypes divided by the number of all bands present in at least one of them,
shared band index (Dice): 2 x number of shared bands / (number of bands of genotype i + number of bands of genotype j),
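Both coefficients are easy to compute once each genotype is written as the set of marker bands it shows. In this sketch the shared band index is taken in its usual Dice form (twice the shared bands over the sum of both band counts), and the band names are made up.

```python
def jaccard(bands_i, bands_j):
    """Shared bands divided by all bands present in at least one genotype."""
    return len(bands_i & bands_j) / len(bands_i | bands_j)

def dice(bands_i, bands_j):
    """Twice the shared bands divided by the sum of both band counts."""
    return 2 * len(bands_i & bands_j) / (len(bands_i) + len(bands_j))

# Hypothetical band patterns of two genotypes
g1 = {"b1", "b2", "b3", "b4"}
g2 = {"b2", "b3", "b4", "b5", "b6"}

print(jaccard(g1, g2))
print(round(dice(g1, g2), 3))
```

A matrix of such pairwise similarities is the usual input for a cluster analysis (dendrogram) or an ordination plot.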
For which reasons / questions could it be quite interesting to know about diversity / relatedness in a plant breeding context, name and describe minimum three examples.
Helpful for selecting the parents
To know where to search for possible advantageous alleles
relatedness is important for differentiation between cultivars of different breeders
detect duplicates in your population
Describe briefly two multivariate statistical approaches useful to estimate and to visualize and illustrate genetic diversity in a plant population based on molecular markers.
Cluster-Analysis-Methods
Principal Component Analysis: Data reduction to visualize principal structure in the data
How can one map a gene responsible for a monogenic trait in a plant population, using genetic mapping. Describe and illustrate the principle and steps needed.
Have two distinct parents for the trait -> create an F1 and F2 -> do phenotyping and genotyping (high marker density every 5-10 cM one Marker) -> See which marker fits to the distribution of the trait
What is the principal difference between a qualitative trait compared to a quantitative trait? Give a few examples.
Qualitative: discrete variations (e.g. flower color in peas, hypocotyl colour in soy, qualitative resistances, …)
Quantitative: continuous variation (e.g. grain yield, plant height, protein contents, …)
What are the two principal reasons for quantitative variation of a character in a segregating population?
The environment is modulating the trait
Multiple genes contribute to the trait variation
You should find out among e.g. n putative F1 plants, which are real F1s and which are accidental selfings. Both: the egg-cell parent and the pollen parent appear phenotypically very similar and it is impossible to discriminate them based in their phenotypic appearance. Is it really necessary for the breeder to know which plants are selfings and which are true F1s?
a. YES it is or
b. NO it is not.
c. Explain your answer to the above question. Describe a strategy to solve this problem.
Yes, it is. Only true crosses lead to new combinations, which are relevant for selection
-> use molecular markers as quality control
What is the meaning of the term QTL? What is the basic definition of a QTL?
quantitative trait locus/loci: a locus in the genome that influences a quantitative trait
or a region on a chromosome, which is significantly associated with variation of a quantitative trait
Which questions can we answer when we apply QTL analysis?
Where are genes located that influence a trait,
how much of the variation can be explained by the locus,
How many (significant) QTL govern a certain trait in a certain situation?
Where in the genome are they located?
Which markers are closest to the QTL and how close are they?
What is the contribution of the individual QTL to the overall trait variation?
What is the gene action of the QTLs (additive, dominant, epistatic)?
Illustrate and explain the principal steps needed for a QTL analysis.
Parent 1 x Parent 2 (are differing from each other in the trait of interest) -> creation of a segregating population -> genotyping (with a lot of markers approximately every 5-10 cM) and phenotyping -> Data analysis -> QTL validation
What is a temporary mapping population, what is an immortal mapping population? Give a few examples.
Populations in which the individuals have a high degree of homozygosity are called ‚immortal populations‘, each line can be multiplied easily and remains genetically constant, examples: DH populations, RIL populations
Mapping populations with a high degree of heterozygosity (example BC1F1 or F2) are ‚temporary populations‘
Your task is mapping QTL for grain yield and seed color in soybean. You already made a cross between two homozygous parents, one with black seeds and low yield and one with yellow seeds and high yield, in the past. You can choose between two population types for mapping these characters:
a. an F2 population with 300 F2 plants and
b. a recombinant inbred line population (RIL) with 300 F6-derived RIL lines.
Will it make a difference which population you choose for which trait?
c. Which one would you prefer for mapping QTL for grain yield ?
d. Which one would you choose for mapping the trait seed color?
e. Explain why
Yes: A (F2) for seed color and B (RIL) for yield.
Seed color is a qualitative trait that can be scored reliably on single plants, so the F2 population is sufficient. Yield is influenced by many loci and by the environment, so it makes sense to use the nearly immortal population B, which can be tested in replicated trials.
Also, finer mapping is possible with population B because more recombination events have accumulated
What is the general principle behind statistical methods used for QTL analysis? Name and explain at least two typical statistical methods for QTL estimation (QTL mapping).
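General principle: we test whether marker genotypes are statistically associated with the variation of our trait; a significant association indicates a QTL linked to the marker
Single marker-trait analysis -> looks at each marker individually and tests (e.g. by t-test, ANOVA or regression) whether the marker genotype classes differ in their mean trait value
Simple interval mapping -> uses pairs of adjacent markers and tests positions within each marker interval for a QTL; the result is a likelihood (LOD) profile along the chromosome, and peaks above a significance threshold indicate the likely QTL position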
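A minimal sketch of single marker-trait analysis, assuming a regression of the phenotype on a numeric marker code and a LOD score computed from the residual sums of squares. The marker names, genotype codes and yield values are hypothetical and only serve as an illustration.

```python
import math

def single_marker_lod(genotypes, phenotypes):
    """Regress phenotype on marker genotype code; return (effect, r2, lod)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    rss0 = sum((y - mean_y) ** 2 for y in phenotypes)  # model without marker effect
    effect = sxy / sxx                                 # regression slope
    rss1 = rss0 - effect * sxy                         # model with marker effect
    r2 = 1 - rss1 / rss0                               # share of variation explained
    lod = (n / 2) * math.log10(rss0 / rss1)
    return effect, r2, lod

# Hypothetical RIL data: 0 = allele of parent 1, 1 = allele of parent 2
markers = {
    "M1": [0, 0, 0, 0, 1, 1, 1, 1],
    "M2": [0, 1, 0, 1, 0, 1, 0, 1],
}
yield_t_ha = [3.1, 2.9, 3.0, 3.2, 4.1, 3.9, 4.0, 4.2]

for name, geno in markers.items():
    effect, r2, lod = single_marker_lod(geno, yield_t_ha)
    print(f"{name}: effect={effect:.2f} R2={r2:.2f} LOD={lod:.2f}")
```

In this toy data set M1 separates the low and high yielding lines and gets a high LOD, while M2 is unrelated to yield and gets a LOD close to zero. Interval mapping applies the same idea to positions between markers instead of the markers themselves.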
Let's say you have mapped a gene for an important disease resistance trait in a plant population using genetic mapping. Your mapping was successful and you found markers that identify the chromosome and flank this gene at a distance of 2 cM left and 3 cM right of this gene. With this information, do you already know the sequence and function of the disease resistance gene under investigation?
a. If not, what could you do in order to find the gene that causes this resistance trait?
No. We do not yet know the gene or its function; we only have an interval and the sequence within it -> synteny analysis or knockout experiments with the candidate genes in the interval
Let's say you have mapped a gene for an important disease resistance trait in a plant population using genetic mapping. Your mapping was successful and you found markers that identify the chromosome and flank this gene at a distance of 1 cM left and 2 cM right of this gene. Which statement is true?
a. This is nice but only theoretical information but it is unfortunately useless for breeding, because I still do not know the exact sequence of the resistance gene.
TRUE/FALSE
b. This is really good, I can use this information in breeding, TRUE/FALSE
Explain and justify your answer.
b. True
Even though you don’t know the specific gene responsible for the resistance, you can use the flanking markers for MAS, e.g. in a marker-assisted backcross or simply in a cross, to check whether the region, and thus the resistance, is present in a genotype.
What is the principle of advanced-back cross QTL mapping? When is this approach applied?
Cross and multiple rounds of backcrossing.
Creation of sister lines with small segments of donor and QTL mapping.
Useful for transferring QTL alleles from a non-adapted line or wild relative into an adapted cultivar and mapping of QTL at the same time.
What is meant with the term ‘introgression library’?
An introgression library is a series of introgression lines that are almost identical to each other, each carrying a small segment from the donor line.
What is the principle of ‘association mapping’?
Association mapping, or linkage disequilibrium (LD) mapping, is an alternative approach for mapping quantitative traits in populations.
LD mapping is based on genetic variation in ‘natural’ populations and allows high-resolution mapping of desired genes.
What is meant with the term ‘Linkage Disequilibrium’ (LD)? What is LD mapping?
The non-random association between two markers or two genes (QTL) or between a marker and a gene (QTL)
An alternative approach for mapping quantitative traits in populations
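The non-random association can be quantified. The following sketch computes the two common LD measures D and r² for two biallelic loci from haplotype counts; the haplotype lists are hypothetical examples.

```python
def haplotype_freqs(haplotypes):
    """Return P(AB), P(A), P(B) from a list of two-locus haplotypes such as 'AB' or 'aB'."""
    n = len(haplotypes)
    p_ab = sum(1 for h in haplotypes if h == "AB") / n
    p_a = sum(1 for h in haplotypes if h[0] == "A") / n
    p_b = sum(1 for h in haplotypes if h[1] == "B") / n
    return p_ab, p_a, p_b

def ld_stats(p_ab, p_a, p_b):
    """D = P(AB) - P(A)P(B); r2 = D^2 / (P(A)P(a)P(B)P(b))."""
    d = p_ab - p_a * p_b
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2

# Hypothetical samples: one with associated loci, one in linkage equilibrium
linked = ["AB"] * 40 + ["ab"] * 40 + ["Ab"] * 10 + ["aB"] * 10
unlinked = ["AB"] * 25 + ["ab"] * 25 + ["Ab"] * 25 + ["aB"] * 25

for label, sample in (("linked", linked), ("unlinked", unlinked)):
    d, r2 = ld_stats(*haplotype_freqs(sample))
    print(f"{label}: D={d:.3f} r2={r2:.3f}")
```

D is zero when the alleles at the two loci occur together exactly as often as expected by chance; r² rescales D to the range 0 to 1, which makes marker pairs with different allele frequencies comparable.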
What is meant by the term MAS?
Marker assisted selection
MAS is a variant of indirect selection
It combines phenotypic and marker selection (in contrast to marker based selection)
Markers that are closely linked to the trait are used, and selection is done on those markers instead of on the trait itself
What is a MAGIC population in the context of QTL mapping? Why is this population type attractive for mapping? What is difficult about it?
MAGIC = Multi-parent Advanced Generation Inter-Cross
Advantage: population with LARGE genetic diversity:
* more parents leads to more diversity
* potentially segregating for multiple alleles at each locus
Bottlenecks:
* needs longer to develop,
* must have many lines (~100 per founder)
Give a few examples when Marker Assisted Selection would be of advantage compared to traditional phenotypic selection.
Phenotypic selection is labour-intensive, expensive or time-consuming
Phenotypic selection is only possible in advanced generations
Phenotypic selection needs much input (many environments, expensive testing methods, …)
The breeder wants to combine alleles which cannot be distinguished on the phenotype
In a back-crossing program when a breeder wants to transfer only a few ‘positive’ alleles from a donor line into a well adapted productive line. Especially in cases where the donor line is a wild relative or a landrace with poor agronomic performance
For molecular marker assisted breeding we must know the sequence of all genes of the plant species we want to breed. Is this statement (A) True or (B) False? Explain and justify your answer
(B) False
We just need markers, and need to know whether they are linked to our trait of interest and how much the loci influence the trait.
In the case of GS we need many markers and a phenotyped and genotyped training population to create a prediction model for the performance of a breeding population.
What is meant by the term Genomic Selection (GS)?
A form of marker-assisted selection that uses a training population, which is both phenotyped and genotyped, to fit a prediction model using all markers simultaneously.
This model is afterwards applied to a breeding population that is only genotyped, and its performance is predicted.
Early selection is possible, which reduces the high costs of phenotyping.
Which advantages are expected when using GS in plant breeding?
Lower costs compared to phenotypic tests
Genomic predictions of breeding values can be obtained earlier in the breeding program than phenotypic predictions.
GS therefore generally should increase the selection gain per unit time.
a. For which trait architecture would you recommend marker assisted selection?
b. For which trait architecture would you recommend genomic selection?
c. Which are the typical features of these two approaches?
a. classical MAS: for qualitative traits or traits influenced only by a few loci, possible to create QTL ideotype
b. GS: highly quantitative traits influenced by many minor loci, and traits that are hard to screen in large breeding populations (phenotyping is costly and selection based on it is slow)
c.
MAS: having closely linked markers to a few loci of interest that are screened for
GS: using many markers to predict the performance of an individual based on a training population
Illustrate and describe the components needed for using Genomic Selection.
training population that is genotyped and phenotyped,
good prediction model,
breeding population that is only genotyped
Let us say you have found markers flanking a QTL which you want to select in MAS. Will the reliability of your MAS be the same or different when you use both markers simultaneously in MAS compared to using only one marker in MAS? Explain and justify your answer.
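The reliability is higher if you use both markers simultaneously.
With one marker, a single crossover between marker and QTL is enough to break their linkage, so the marker is selected without the QTL allele. With two flanking markers this only happens after a double crossover (one on each side of the QTL), which is rare -> increases the selection precision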
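The gain in reliability can be estimated with a small calculation. This is a sketch under simplifying assumptions: no crossover interference, and map distances small enough that 1 cM corresponds to a recombination fraction of about 0.01. The distances of 1 cM and 2 cM are taken from the flanking-marker example in an earlier question.

```python
def error_one_marker(r):
    """Probability that a gamete selected for the marker allele lacks the QTL allele."""
    return r

def error_flanking_markers(r_left, r_right):
    """Same probability when both flanking marker alleles are selected.

    The QTL allele is lost only after a double crossover (one on each side).
    """
    double_crossover = r_left * r_right
    no_crossover = (1 - r_left) * (1 - r_right)
    return double_crossover / (no_crossover + double_crossover)

r_left, r_right = 0.01, 0.02  # about 1 cM and 2 cM
print(f"left marker only : {error_one_marker(r_left):.5f}")
print(f"right marker only: {error_one_marker(r_right):.5f}")
print(f"both markers     : {error_flanking_markers(r_left, r_right):.5f}")
```

Under these assumptions the error with both markers is roughly the product of the two single-marker errors, so it is far smaller than with either marker alone.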
Describe a situation when F2-enrichment could be a useful strategy in MAS. Explain and justify your answer.
F2 enrichment increases the frequency of desired alleles by removing unwanted individuals (= those that are homozygous for the undesired allele at one or more QTL)
If multiple genes (QTL) are to be combined in one genotype, F2 enrichment decreases the number of lines needed
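The effect of F2 enrichment on population size can be illustrated numerically. The sketch assumes unlinked QTL, perfect markers and inbred lines derived from the F2 without further selection; the number of QTL (5) and the 95% success probability are arbitrary example values.

```python
import math
from fractions import Fraction

def fraction_kept_in_f2(n_loci):
    """F2 plants that are not homozygous for the undesired allele at any locus."""
    return Fraction(3, 4) ** n_loci

def target_frequency(n_loci, enriched):
    """Expected share of derived inbred lines fixed for the desired allele at all loci.

    Desired allele frequency per locus: 1/2 in a plain F2, and 2/3 after
    enrichment (the kept plants are 1/3 AA and 2/3 Aa).
    """
    allele_freq = Fraction(2, 3) if enriched else Fraction(1, 2)
    return allele_freq ** n_loci

def min_lines(freq, prob=0.95):
    """Smallest number of lines giving probability `prob` of at least one target line."""
    return math.ceil(math.log(1 - prob) / math.log(1 - float(freq)))

n_qtl = 5
for enriched in (False, True):
    freq = target_frequency(n_qtl, enriched)
    print(f"enriched={enriched}: target frequency={float(freq):.4f}, "
          f"lines needed={min_lines(freq)}")
print(f"share of F2 plants kept: {float(fraction_kept_in_f2(n_qtl)):.3f}")
```

The more QTL have to be combined, the larger the saving in inbred lines, at the cost of genotyping the F2 and discarding part of it.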
What is the 'large p small n problem' that is associated with Genomic Selection? With which data analysis strategy has this problem been overcome in Genomic Selection?
We have more marker effects to estimate (p) than phenotyped individuals (n)
Solved by treating marker effects as random effects -> using estimation methods such as ridge regression BLUP (best linear unbiased prediction, RR-BLUP) or G-BLUP and several further approaches
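A minimal sketch of the ridge regression idea behind RR-BLUP, written in plain Python. It is not a full mixed-model implementation: the shrinkage parameter `lam` is simply fixed instead of being estimated from variance components, and the overall mean is approximated by the mean of the training phenotypes. All genotypes (coded -1/1) and phenotypes are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rr_marker_effects(X, y, lam):
    """Ridge estimates u = X' (X X' + lam I)^-1 (y - mean); solvable even if p > n."""
    n, p = len(X), len(X[0])
    mu = sum(y) / n
    k = [[sum(X[i][t] * X[j][t] for t in range(p)) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(k, [yi - mu for yi in y])
    u = [sum(X[i][t] * alpha[i] for i in range(n)) for t in range(p)]
    return mu, u

def predict(X_new, mu, u):
    """Predicted performance = mean + sum of marker effects carried by each line."""
    return [mu + sum(x * e for x, e in zip(row, u)) for row in X_new]

# Training population: 4 lines (n) genotyped at 6 markers (p), so p > n
X_train = [[1, 1, 1, -1, -1, 1],
           [1, -1, 1, 1, -1, -1],
           [-1, 1, -1, 1, 1, 1],
           [-1, -1, 1, -1, 1, -1]]
y_train = [5.0, 4.2, 3.1, 3.9]
# Breeding population: genotyped only
X_candidates = [[1, 1, 1, 1, -1, 1], [-1, -1, -1, -1, 1, -1]]

mu, u = rr_marker_effects(X_train, y_train, lam=2.0)
print([round(v, 2) for v in predict(X_candidates, mu, u)])
```

Ordinary least squares has no unique solution when p > n; the penalty `lam` shrinks all marker effects towards zero and makes the system solvable, which is what treating markers as random effects achieves.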
You have done an SSR marker analysis among 'putative' F1 plants from the cross female x male. Lane 1 on the gel shows the allele of the female, lane 2 that of the male, and all other lanes are the putative F1s.
a. Which are true F1s and which are not true F1s?
b. Plants that are not true F1s, what are these?
True F1s are 9, 11, 13 and 15, since they carry the alleles of both parents. All others are selfings of the mother plant
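The scoring rule used for the gel can be written down as a small sketch. It assumes a co-dominant SSR marker and homozygous parents with different alleles; the allele sizes and plant names are hypothetical and do not correspond to the gel in the question.

```python
def classify_progeny(female, male, progeny):
    """female/male: sets of allele sizes in bp; progeny: dict of name -> set of allele sizes."""
    result = {}
    for name, bands in progeny.items():
        has_female = bool(bands & female)
        has_male = bool(bands & (male - female))  # alleles unique to the male
        if has_female and has_male:
            result[name] = "true F1"
        elif has_female:
            result[name] = "selfing of female"
        else:
            result[name] = "off-type"
    return result

female_alleles = {150}
male_alleles = {162}
putative_f1 = {"P3": {150}, "P9": {150, 162}, "P5": {170}}
print(classify_progeny(female_alleles, male_alleles, putative_f1))
```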
Explain the virus defense of plants.
Plants have multiple defense mechanisms against viruses. The main mechanism is gene silencing (RNA silencing) -> foreign genetic material such as viral RNA is recognized and cut up
What is a marker?
A genetic marker is a gene or DNA sequence with a known location on a chromosome that can be used to identify individuals or species.
What is de novo genome sequencing and how does it differ from genome resequencing?
De novo sequencing is the sequencing of a genome, where no reference sequence is available.
In contrast, for genome resequencing a reference genome exists, to which the sequence reads are aligned.
What is Pacific Biosciences (PacBio) sequencing? Explain the advantages of this technology for plant genome sequencing.
The PacBio sequencing is single molecule real time sequencing (SMRT) and creates long reads. It records the DNA synthesis in real time.
Advantages: Long reads
Disadvantages: high error rate (but unbiased errors)
If an assembly is reported with an N50 size of 1 Megabase, what does it mean?
50% of the assembly is within contigs or scaffolds of 1 Mb or larger
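The definition can be made concrete with a short sketch that computes N50 from a list of contig or scaffold lengths. The lengths are hypothetical.

```python
def n50(lengths):
    """Length L such that sequences of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly

contig_lengths = [80, 70, 50, 40, 30, 20, 10]  # total = 300
print(n50(contig_lengths))  # 80 + 70 = 150 reaches half of 300, so N50 = 70
```

Sorting from longest to shortest and summing until half of the total is reached shows why a few long sequences raise N50, while many short contigs hardly affect it.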
What is a consensus error?
Likelihood of wrong base in final assembly
You have downloaded a genome assembly from a public website. As a first step, you run a script on the assembly to count the number of bases. 5% of the bases are reported as unspecified bases "N". Explain the reason behind this finding.
Gaps of unknown sequence between the contigs of a scaffold are filled with N
A proportion of 5-10% N in an assembly is normal
Multiple reasons possible:
Biology -> natural variation
technology -> e.g. sequencing errors, biased sequencing errors, errors within reference sequence
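The base count mentioned in the question could look like the following sketch, which reports the proportion of N in FASTA-formatted text. The example record is hypothetical; for a real assembly the text would be read from the downloaded file.

```python
def n_fraction(fasta_text):
    """Return the proportion of 'N' among all bases in FASTA-formatted text."""
    total = 0
    n_count = 0
    for line in fasta_text.splitlines():
        if line.startswith(">") or not line.strip():
            continue  # skip header lines and empty lines
        seq = line.strip().upper()
        total += len(seq)
        n_count += seq.count("N")
    return n_count / total if total else 0.0

example = ">scaffold_1\nACGTACGTNNACGTACGTAC\n"
print(n_fraction(example))  # 2 of 20 bases are N
```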
How can you evaluate the completeness of a genome assembly?
Completeness is evaluated based on gene content, e.g. with Benchmarking Universal Single-Copy Orthologs (BUSCO) or similar tools
Assemblies typically consist of contigs and scaffolds. Explain the difference.
The key difference between contig and scaffold is that a contig does not have gaps while a scaffold consists of contigs and gaps.
You are planning to study the genome of a not yet sequenced crop plant. Your first task is to select an accession for sequencing. Explain what kind of plant material / genotype will be best suited for the work.
Variety that is well known and widely used
Additionally, material from young plants, which contain a lot of DNA and have not yet accumulated many mutations
homozygous genotypes -> Inbred lines (e.g. from selfing or backcross)
Heterozygosity causes assembly fragmentation
DH genotypes (from plant biotechnology)
How can you identify contamination within an assembly? Name a few reasons why contaminations can occur.
Parasite, symbionts, accidental contamination
Organisms colonized by microbes
Sequence database curators perform contamination screens
What is genome annotation?
Attaching biological information to the genome sequence,
e.g. genes, transposable elements, variants
How can you discover genome variants and how can you confirm them?
Sequencing of a population
Choose one genome as the "reference"
Search for variants, i.e. positions that differ from the reference
Variants should be called on both strands
Variants called from just one strand may be artefacts due to context-dependent biases
Explain if, despite high base quality, a base call may still be wrong.
Explain the differences between a hierarchical shotgun strategy and a whole-genome shotgun strategy.
Hierarchical shotgun sequencing: the genome is first broken into large fragments, which are cloned (e.g. in a BAC library) and ordered; each clone is then shotgun sequenced separately
Whole-genome shotgun sequencing: the entire genome is broken directly into small fragments for sequencing
In contrast to the hierarchical strategy, no BAC library is created in whole-genome shotgun sequencing
You receive a file containing the scaffolds of an assembled plant genome. The assembly statistics indicates an N50-length of 1000 base pairs for the scaffolds. Explain what these assembly metrics imply. Do you think this is a high-quality genome assembly?
An N50 of 1000 bp means that 50% of the assembly is in scaffolds of 1000 bp or longer
There will always be contigs that are too short to be of any use to anyone
An N50 of only 1000 bp is very small (shorter than many genes), so this is most likely not a high-quality assembly
What are advantages and disadvantages of an introgression library?
Advantage:
Such near isogenic sister lines allow a very precise estimation of the effects of genes on the donor segments on the traits of interest.
Disadvantage:
Generation of a good introgression library is time and resource demanding.
One needs to backcross several times and has to analyze the progeny with appropriate markers after each backcross
Mention at least two questions we cannot answer with QTL analysis.
Cannot answer:
which gene exactly is influencing the trait,
whether the loci that were found are the only loci influencing the trait:
we do not know if we found ALL QTL for this trait
What is the principal advantage of association mapping compared to mapping in bi-parental crossing populations?
Association mapping uses existing populations, in which many recombination events have accumulated during their (breeding) history.
After many recombination events, only very tightly linked markers should still be associated (be in LD) with the allele of interest, which gives a higher mapping resolution.
In contrast, classical bi-parental crossing populations used in classical QTL mapping have undergone relatively few recombination events