Separate the data into two datasets, e.g. by a dummy variable with values 0 and 1
> ifemale <- which(data$group==0)   # row indices where dummy == 0
> data_female <- data[ifemale,]     # subset for group 0 (female)
> imale <- which(data$group==1)     # row indices where dummy == 1
> data_male <- data[imale,]         # subset for group 1 (male)
Check residuals vs. leverage plot
rule of thumb (Cook’s distance)
preparation (code)
identification of leveraged observations
Rule of thumb: an observation has high influence if its Cook's distance exceeds 4/(n - p - 1)
n <- dim(data)[1]     # number of observations
p <- dim(data)[2]-1   # number of predictors (all columns except the response)
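A minimal sketch of the identification step, assuming the fitted model is named linreg_clean as in the plot call below:
cd <- cooks.distance(linreg_clean)   # Cook's distance per observation
cutoff <- 4/(n - p - 1)              # rule of thumb from above
which(cd > cutoff)                   # indices of high-influence observations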
plot(linreg_clean) -> even though these points have the highest leverage, that does not mean they are outliers
ANOVA
What is it used for?
Formula
Interpretation
Multicollinearity
anova(reg)
Depending on the significance level
Variable is significant
Variable is not significant
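A short sketch of this interpretation, assuming a fitted model named reg:
pvals <- anova(reg)[["Pr(>F)"]]   # p-value per predictor (the Residuals row has none)
pvals < 0.05                      # TRUE = variable is significant at the 5% level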
Jarque-Bera Test
Normality
library(tseries)
jarque.bera.test(summary(reg)$residuals)
Hypothesis test
H0: Errors follow a normal distribution
H1: Errors don't follow a normal distribution
F-Test
var.test(x, y)   # x, y: the two samples to compare
H0: population variances are equal
H1: population variances are not equal
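For example, applied to the split from the beginning (the column name sales is a placeholder):
var.test(data_female$sales, data_male$sales)   # F-test of equal variances between the two groups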
Durbin-Watson Test
Autocorrelation
library(car)
durbinWatsonTest(reg)
Hypothesis tests
H0: residuals are not autocorrelated
H1: residuals are positively autocorrelated
Breusch-Pagan Test
Heteroscedasticity
library(lmtest)
bptest(reg)
H0: data is not heteroscedastic (homoscedastic)
H1: data is heteroscedastic
Boxplot
boxplot(data)
Correlation Formulas
Correlation Matrix
Correlation Plots
cor(data)
pairs(data) -> pairwise scatterplots of all variables
VIF
library(car)
vif(reg)
a VIF value that exceeds 5 or 10 indicates a problematic amount of collinearity
VIF barplot incl. line
barplot(vif(reg), horiz = TRUE)
abline(v = 5, lwd = 3, lty = 2)
v: add vertical line at x = 5
lwd: define linewidth
lty: define line type
clean data from outliers
Option: Boxplot
hout <- boxplot(data)$out                # values flagged as outliers by the boxplot
iout <- which(data$variable %in% hout)   # row indices of those values (%in% handles multiple outliers)
data_clean <- data[-iout,]               # drop the outlier rows
Option: Scatterplot - Maximum point
iout <- which.max(data$variable)   # index of the maximum point
data_clean <- data[-iout,]
stepAIC
Formula (incl. arguments)
Variable selection
library(MASS)
stepAIC(reg, direction="backward")
backward: removes predictors sequentially from the given model with decreasing complexity
forward: adds predictors sequentially to the given model with increasing complexity
both: a forward-backward search that, at each step, decides whether to include or exclude a predictor
the last displayed model contains the selected variables; the removed variables are those that did not improve the model fit (their removal lowered the AIC)
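A minimal sketch, assuming a fitted full model named reg:
library(MASS)
selected <- stepAIC(reg, direction = "both")   # prints each inclusion/exclusion step
summary(selected)                              # the final selected model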
Steps for Transformation
Transform data & plot in transformed space
> logreg=lm(log(sales)~log(price),data)
> plot(log(data$price),log(data$sales), pch = 16, col = "blue")
> abline(logreg,col="red")
Plot in original space
> plot(data$price,data$sales, pch = 16, col = "blue")
> x=seq(from=min(data$price),to=max(data$price),by=0.01)
> y=predict(logreg,list(price=x))   # predictions on the log scale
> matlines(x,exp(y),col="red")      # back-transform with exp() for the original scale
Plot the residuals
Analyze R²
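A short sketch for these two steps, assuming the logreg model from above:
plot(fitted(logreg), residuals(logreg))   # residuals vs. fitted values in the transformed space
summary(logreg)$r.squared                 # R² of the transformed model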
Color Coding
plot(x,y,col=data$x2+3)   # +3 shifts the 0/1 values to two visible palette colors
data$x2: the variable by which the points should be separated by color
Confidence interval with predicted variables
How would the arguments change if we wanted a prediction interval instead of a confidence interval?
> predict(lm, newdata = data.frame(income=50, competitor=3, mallvisitors=2000), interval="confidence")
> predict(lm, newdata = data.frame(income=50, competitor=3, mallvisitors=2000), interval="prediction")
CI plot
formula
what does it mean if observations are outside the prediction interval?
library(HH)
ci.plot(lm,conf.level=0.90)
Prediction is too optimistic
Name the Seven Assumptions
Linearity
Strict Exogeneity
No Correlation of X and the Errors
No Correlation of the Errors
No Perfect Multicollinearity
Homoscedasticity
Normality of the Error Terms
Multicollinearity - Tests
Correlation matrix (cor)
VIF
Non-Normality of the Error Terms - Tests
Analyze the QQ-plot
Jarque-Bera Test
Heteroscedasticity - Tests
Breusch-Pagan Test
Autocorrelation - Tests
Durbin-Watson Test
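A compact sketch running the tests listed above on one fitted model reg (packages as introduced earlier):
library(car); library(tseries); library(lmtest)
vif(reg)                           # multicollinearity
jarque.bera.test(residuals(reg))   # normality of the error terms
bptest(reg)                        # heteroscedasticity
durbinWatsonTest(reg)              # autocorrelation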