How are interactions of different conditions modeled in linear regression?
The values of the two conditions are multiplied and included as an extra interaction term.
Example: an experiment with binary condition variables x1 and x2: y = beta0 + beta1*x1 + beta2*x2 + beta3*(x1*x2)
betas are the weights
beta0 is the offset (intercept)
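A minimal sketch (not from the lecture, values simulated) of how an interaction of two binary conditions is added to the design by multiplying their indicators:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.integers(0, 2, size=100)   # condition 1 on/off
x2 = rng.integers(0, 2, size=100)   # condition 2 on/off
y = 1.0 + 0.5 * x1 - 0.3 * x2 + 2.0 * (x1 * x2) + rng.normal(0, 0.1, 100)

# Design matrix: intercept, main effects, and the interaction column x1*x2
X = np.column_stack([np.ones(100), x1, x2, x1 * x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # approx [beta0, beta1, beta2, beta_interaction]
```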
What's a design matrix and what distinguishes an incomplete from a complete design?
A matrix of all conditions applied during an experiment (one row per sample, one column per condition).
A design is complete if every possible combination of condition levels was tested; if some combinations are missing, the design is incomplete.
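A small sketch (assumed example, not from the lecture) of a complete design for two binary conditions, where every combination of levels appears:

```python
from itertools import product
import numpy as np

levels = [0, 1]
complete_design = np.array(list(product(levels, levels)))
print(complete_design)
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]
# An incomplete design would omit one of these rows, e.g. the (1, 1) combination.
```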
Explain the default setup for linear regression for gene expression.
generic scenario
p = number of conditions or measurements
n = number of samples
Ideally we only keep the most important of the p predictors
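A sketch of this generic setup with simulated numbers (n, p, and beta chosen only for illustration): expression of one gene modeled as y = X*beta + noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 5                        # n samples, p conditions/measurements
X = rng.normal(size=(n, p))         # design matrix
beta = np.array([2.0, 0.0, -1.5, 0.0, 0.5])  # only a few predictors matter
y = X @ beta + rng.normal(0, 0.1, n)         # expression of one gene
```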
What is the general equation to be solved for regression?
Least squares loss function: beta-hat = argmin_beta ||y - X*beta||^2
beta-hat = estimate of the parameter vector
=> followed by hypothesis testing for each beta
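A sketch of the least-squares fit followed by a per-coefficient hypothesis test (H0: beta_j = 0). The use of statsmodels here is an assumption; the lecture does not name a library.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 0.0, -2.0]) + rng.normal(0, 0.5, n)

# beta-hat = argmin_beta ||y - X*beta||^2, then a t-test per coefficient
results = sm.OLS(y, sm.add_constant(X)).fit()
print(results.params)    # estimated betas (intercept first)
print(results.pvalues)   # p-value for each beta
```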
Name some other loss functions
Least absolute deviations (more robust against outliers)
least trimmed squares
M-estimator
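A sketch (purely illustrative, simulated data) of least absolute deviations: minimize the sum of |residuals| instead of squared residuals, which limits the influence of outliers.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, -1.0]) + rng.normal(0, 0.1, 100)
y[::10] += 20.0                        # inject a few outliers

lad_loss = lambda beta: np.sum(np.abs(y - X @ beta))
beta_lad = minimize(lad_loss, x0=np.zeros(2), method="Nelder-Mead").x
print(beta_lad)                        # close to [1, -1] despite the outliers
```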
What's the Huber loss?
behaves like the squared loss for small residuals
behaves like a linear loss for large residuals
much more robust against outliers
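A sketch of the Huber loss itself (the delta threshold is an illustrative choice): quadratic for |r| <= delta, linear beyond, so large residuals contribute only linearly.

```python
import numpy as np

def huber_loss(residuals, delta=1.0):
    r = np.abs(residuals)
    quadratic = 0.5 * r**2               # small residuals: like least squares
    linear = delta * (r - 0.5 * delta)   # large residuals: grows linearly
    return np.where(r <= delta, quadratic, linear)

print(huber_loss(np.array([0.1, 0.5, 5.0])))
```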
What happens to the linear regression setup with high-dimensional data?
High-dimensional linear regression: when p is much larger than n, the least squares problem is underdetermined and overfits.
Idea:
select a few coefficients of beta that are highly predictive
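A small sketch (simulated data, not from the lecture) of the p >> n problem: ordinary least squares reaches zero training error with almost all coefficients nonzero, which is why sparse selection is needed.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 20, 200                     # far more conditions than samples
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.1, n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(X @ beta_hat, y))                 # True: zero training error (overfitting)
print(np.count_nonzero(np.abs(beta_hat) > 1e-8))    # nearly all coefficients are nonzero
```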
What's the main idea behind LASSO?
extending least squares by a penalty function
beta has to be sparse (L1 regularization)
lambda is the regularization parameter (trade-off between sparsity and prediction performance)
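A sketch of LASSO on the same kind of p >> n data, here via scikit-learn's Lasso (an assumed library choice; its `alpha` argument plays the role of lambda):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n, p = 20, 200
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.1, n)

# Least squares + lambda * ||beta||_1; larger alpha -> sparser beta
lasso = Lasso(alpha=0.1).fit(X, y)
print(np.count_nonzero(lasso.coef_))   # only a few coefficients remain nonzero
```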
Name 3 popular model selection methods
k-fold cross validation
train on the training folds with multiple lambda values
evaluate each lambda value on the held-out fold
information criteria (BIC, AIC)
Stability selection (based on subsampling, bootstrapping)
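A sketch of lambda selection by k-fold cross-validation using scikit-learn's LassoCV (fold count and grid size are illustrative choices; data is simulated):

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(6)
n, p = 40, 200
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.1, n)

# Fit LASSO for a grid of lambda (alpha) values on each training split and
# pick the value with the best average held-out error across 5 folds.
model = LassoCV(cv=5, n_alphas=50).fit(X, y)
print(model.alpha_)                    # selected regularization strength
print(np.count_nonzero(model.coef_))   # sparsity of the selected model
```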