Why do we need theoretical modeling of protein structures?
experimental protein structure determination is expensive and time-consuming and sometimes fails
for many applications, we do not need to know the exact positions of all atoms, just of residues, and theoretical calculations can provide that
(not sufficient for drug design, though)
Which two concepts are important in structure prediction?
Darwin: Evolutionary Biology
Boltzmann: Statistical Thermodynamics
Explain the concept of Evolutionary Biology by Charles Darwin in the context of theoretical structure prediction of proteins.
Existing proteins are derived from a common ancestor by mutation (duplications, insertions, deletions, …)
Evolutionarily related (homologous) proteins are similar in structure.
There is a model to describe the evolution of proteins based on sequence alignment.
sequence alignment: quantifies how similar two sequences are, which serves as a measure of their evolutionary distance
When a structure of a protein that is homologous to our target protein exists, we can model the evolutionary changes that happened since they diverged from the common ancestor.
evolutionarily related / homologous proteins keep a similar structure even if they have differences in their sequences.
How well does the Darwin approach work to predict protein structures? What does the quality depend on?
sometimes it works well; the quality depends on the template
requirements
good homologous protein template available
we need to find said protein
How does sequence alignment work?
assign a weight to each transformation (insertion, deletion, mutation) that converts sequence a into sequence b
substitution weights are determined by a scoring matrix, e.g. BLOSUM; insertions and deletions are scored with gap penalties
align the amino acids so that the total weight (score) of the transformations is maximized
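The alignment steps above can be sketched with a global dynamic-programming alignment (the Needleman-Wunsch scheme). This is a minimal illustration, not the method of any particular tool: the function name and the flat match/mismatch/gap scores are assumptions chosen for brevity, whereas a real alignment would look up substitution scores in a matrix such as BLOSUM.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b.

    Returns (score, aligned_a, aligned_b). The flat match/mismatch/gap
    scores are illustrative assumptions, not BLOSUM values.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # match or mutation
                score[i - 1][j] + gap,      # residue of a aligned to a gap
                score[i][j - 1] + gap,      # residue of b aligned to a gap
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                aligned_a.append(a[i - 1])
                aligned_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))
```

Calling `needleman_wunsch("ACGT", "AGT")` places one gap in the shorter sequence; the toy sequences are hypothetical and only serve to show the mechanics.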
What is homology? What does “45 % homology” mean?
“Descent from a common ancestor”
we assume homology when biological objects have similar properties
Homology is not the same as sequence similarity! But it can be assumed when sequences/structures are similar.
“45 % homology”: 45 % of the two proteins is derived from a common ancestor, 55 % from different ancestors. It does NOT necessarily mean that they share 45 % of their sequence.
homology: qualitative
sequence similarity: quantitative
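To illustrate why sequence similarity is the quantitative measure, here is a small sketch that computes percent identity from two already aligned sequences. The function name is hypothetical, and the choice of denominator (only columns without a gap) is an assumption; other conventions divide by the full alignment length or by the shorter sequence.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    identical = sum(1 for x, y in pairs if x == y)
    return 100.0 * identical / len(pairs)
```

A number like this describes the sequences only; whether the proteins are homologous remains a yes-or-no inference drawn from it.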
Explain the concept of Statistical Thermodynamics by Boltzmann in the context of theoretical structure prediction of proteins.
Assumption: The native protein structure/conformation is stable and located in the global free energy minimum.
To find the native fold, conformations of the peptide chain are sampled, producing the native structure as well as non-native decoys.
however, not all possible conformations are tried (this would take far too long)
some sequences just “fall” into certain folds (e.g. α-helices) – there are libraries for that
A potential-energy function is used to identify the native structure (the one with the lowest energy)
find the funnel
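The link between energy and the native structure can be made concrete with Boltzmann populations, p_i = exp(-E_i / kT) / Z. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, energies are taken to be in kcal/mol, and any energy values passed in are placeholders rather than output of a real potential-energy function.

```python
import math

# Gas constant in kcal/(mol*K); plays the role of k_B for molar energies.
R_KCAL = 0.0019872041

def boltzmann_populations(energies, temperature=298.15):
    """Return the Boltzmann population of each conformation.

    energies: list of conformation energies in kcal/mol (assumed unit).
    """
    kt = R_KCAL * temperature
    e_min = min(energies)
    # Shift by the minimum energy for numerical stability; populations are unchanged.
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    z = sum(weights)  # partition function over the sampled conformations
    return [w / z for w in weights]
```

For a set of sampled decoys, the conformation with the lowest energy receives the largest population, which is the reasoning behind picking the minimum of the energy funnel as the predicted native structure.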
Why is the Boltzmann approach alone not sufficient to obtain good structures? What is the alternative?
We do not have the computational power to calculate all possible structures.
We do not have sufficiently accurate potential-energy functions to identify the correct native fold.
ROSETTA combines the Boltzmann and the Darwin approach.
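The computational-power argument above can be made tangible with a back-of-the-envelope estimate in the spirit of Levinthal's paradox. All numbers here are assumptions for illustration only: three conformational states per residue, a 100-residue chain, and a sampling rate of 10^13 conformations per second; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def exhaustive_search_years(n_residues, states_per_residue=3,
                            conformations_per_second=1e13):
    """Years needed to enumerate every conformation once (illustrative estimate)."""
    total_conformations = states_per_residue ** n_residues
    seconds = total_conformations / conformations_per_second
    return seconds / SECONDS_PER_YEAR

# Under the assumptions above, a 100-residue chain is far out of reach.
years = exhaustive_search_years(100)
print(f"{years:.1e} years")
```

Even with generous assumptions the result is astronomically large, which is why sampling has to be guided, for example by fragment libraries or homologous templates.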
Explain the typical procedure of structure prediction.
You have the sequence of your protein of interest.
You search the PDB for a homologous sequence.
You align the two sequences.
important! If there are mistakes here, you get the wrong structure! Always check if the alignment makes sense.
You create a model using the template.
Troubleshooting in homology modeling.
Check the initial alignment of sequences! If this is wrong, you cannot get a correct model.
You need to validate your structure! Computational methods usually give you several alternatives, some of which will be wrong:
globally (wrong fold/wrong orientation of domains)
locally (bad loops, alignment shifts)
Therefore, you should filter out wrong models and check for wrong regions in partially correct models.
Your structures can be checked by comparison with the experimental data you have
residue-residue distance (disulfide bridges, chemical cross-linking, etc.)
solvent accessibility
secondary structures (NMR, CD)
shape and size (cryo-EM, SAXS,…)
Also check for bond lengths, angles, stereochemistry, chemical environment of residues, etc.
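As an example of checking a model against experimental data, the sketch below tests residue-residue distance restraints (such as those from disulfide bridges or chemical cross-linking) against Cα coordinates of a model. The function name, the input layout, and any cutoff distances passed in are assumptions for illustration; appropriate cutoffs depend on the experiment and must be chosen by the user.

```python
import math

def check_distance_restraints(ca_coords, restraints):
    """Return the restraints that a model violates.

    ca_coords: dict mapping residue number -> (x, y, z) of the C-alpha atom, in Å.
    restraints: list of (res_i, res_j, max_distance) tuples, distances in Å.
    Both input formats are hypothetical conventions for this sketch.
    """
    violations = []
    for res_i, res_j, max_distance in restraints:
        distance = math.dist(ca_coords[res_i], ca_coords[res_j])
        if distance > max_distance:
            violations.append((res_i, res_j, distance))
    return violations
```

Models with many violated restraints can be filtered out, while a few violations clustered in one region point to a local problem such as a bad loop or an alignment shift.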
Which applications does homology modeling have?
protein engineering
antibody design
implementation of new functions into known enzymes
de-novo enzyme design
stabilization of protein structures (e.g. thermostability for industry)
Which strategies can be used to increase the stability of protein structures?
reduce entropy difference between folded and unfolded proteins
stabilize α-helix dipoles
introduce a negatively charged amino acid at the N-terminus of the helix (which carries a partial positive charge)
increase hydrophobic interactions in the core
reduce hydrophobic surface accessible to water
introduction of hydrogen bonds, disulfide bridges, salt bridges, etc.
Explain the cycle of de-novo enzyme design.
Choose a catalytic mechanism.
Create active site (quantum mechanical calculations to find optimal transition state stabilization).
RosettaMatch: search PDB for backbone structures that could support the active site.
Geometry optimization
Redesign of residues surrounding the transition state.
Missing:
- Dominant approaches of homology modeling.
- The origin of different protein folds
- TIM barrel – one structure, many functions.