Which of the following is NOT an arithmetic operator in R?
A) +
B) -
C) **
D) %/%
C (the documented exponentiation operator is ^; R's parser also accepts ** as a synonym for it)
How do you install a package in R?
install.packages("package_name")
Which of the following modes does NOT exist in R?
A) logical
B) integer
C) array
D) complex
What function is used to create a vector?
A) vector()
B) c()
C) create_vector()
D) list()
How can you access the second element of a vector named v?
A) v[2]
B) v(2)
C) v{2}
D) v<2>
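A minimal sketch for the two cards above, assuming a hypothetical example vector v: c() builds the vector and square brackets pick out elements, with positions counted from 1.

```r
# Hypothetical example vector; the values are illustrative only
v <- c(10, 20, 30, 40)

# R indexes from 1, so v[2] is the second element
second <- v[2]
print(second)
```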
What does sample(v, size=3, replace=TRUE) do?
A) Selects three unique elements from v
B) Selects three elements from v with replacement
C) Replaces three elements in v
D) Sorts v in ascending order
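As a rough illustration of sampling with replacement (the vector v and the seed are assumptions made only for this sketch):

```r
set.seed(1)  # fixed seed only so the sketch is reproducible
v <- c("a", "b", "c", "d", "e")

# Draw three elements; with replace = TRUE the same element may be drawn more than once
draw <- sample(v, size = 3, replace = TRUE)
print(draw)

# With replacement, size may even exceed length(v)
big_draw <- sample(v, size = 10, replace = TRUE)
print(length(big_draw))
```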
What function is used to count occurrences of a categorical variable?
A) summary()
B) count()
C) table()
D) unique()
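A small sketch of counting categories with table(), using a made-up categorical variable named colour (hypothetical data):

```r
# Hypothetical categorical variable
colour <- c("red", "blue", "red", "green", "blue", "red")

# table() counts how often each category occurs
counts <- table(colour)
print(counts)

# unique() only lists the distinct values, without counts
print(unique(colour))
```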
Which function generates a bar plot for a categorical variable?
A) hist()
B) plot()
C) barplot()
D) scatter()
What function is used to create a histogram in R?
A) hist()
B) barplot()
C) scatterplot()
D) pie()
What parameter in boxplot() determines the color of the boxes?
A) color
B) fill
C) col
D) boxcolor
Which function is used to apply a transformation to every element of a vector?
A) lapply()
B) apply()
C) transform()
D) mutate()
What does scale() do in R?
A) Standardizes data to have mean 0 and standard deviation 1
B) Converts data into categorical form
C) Normalizes data between 0 and 1
D) Computes the logarithm of the data
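To make the difference between options A and C concrete, here is a sketch on a hypothetical numeric vector x. scale() returns a one-column matrix, so it is converted back to a plain vector here; the min-max line is written by hand and is not something scale() does by default.

```r
x <- c(2, 4, 6, 8, 10)

# scale() centers on the mean and divides by the standard deviation
z <- as.numeric(scale(x))
print(mean(z))  # approximately 0
print(sd(z))    # approximately 1

# Min-max normalization to the range [0, 1], for comparison
x01 <- (x - min(x)) / (max(x) - min(x))
print(range(x01))
```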
What function calculates Pearson’s correlation in R?
A) pearson()
B) cor()
C) lm()
D) cov()
In linear regression using lm(y ~ x), what does y represent?
A) The dependent variable
B) The independent variable
C) The correlation coefficient
D) The residual error
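A minimal sketch for the cor() and lm() cards, assuming hypothetical data that follow an exact line so the result is easy to check by eye:

```r
# Hypothetical data with an exact linear relationship y = 1 + 2 * x
x <- c(1, 2, 3, 4, 5)
y <- 1 + 2 * x

fit <- lm(y ~ x)   # y: dependent variable, x: independent variable
print(coef(fit))   # intercept and slope

# cor() computes Pearson's correlation by default
print(cor(x, y))
```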
Which function is used to perform a t-test in R?
A) t.test()
B) test.t()
C) hypothesis()
D) stat.test()
What does a p-value less than 0.05 indicate?
A) Strong evidence against the null hypothesis
B) The null hypothesis is true
C) The data is normally distributed
D) No conclusion can be made
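The t.test() and p-value cards can be sketched together. The two groups below are invented measurements; note that t.test() runs Welch's version (unequal variances) unless var.equal = TRUE is set.

```r
# Hypothetical measurements for two independent groups
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4)
group_b <- c(6.2, 6.8, 7.1, 6.5, 7.0, 6.6)

# t.test() defaults to Welch's test (var.equal = FALSE)
res <- t.test(group_a, group_b)
print(res$p.value)

# Decision at the conventional 5% level
if (res$p.value < 0.05) {
  print("Reject the null hypothesis of equal means")
} else {
  print("Fail to reject the null hypothesis")
}
```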
Which function is used to perform an ANOVA test in R?
A) chisq.test()
B) aov()
C) anova_test()
D) lm()
The Chi-square test is used to analyze which type of data?
A) Continuous
B) Categorical
C) Binary
D) Ordinal
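A sketch of a Chi-square test on categorical data, assuming a hypothetical 2x2 table of counts (the row and column labels are made up):

```r
# Hypothetical 2x2 contingency table of counts
tab <- matrix(c(30, 10,
                15, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("treated", "control"),
                              outcome = c("yes", "no")))

# For 2x2 tables chisq.test() applies Yates' continuity correction by default
res <- chisq.test(tab)
print(res$p.value)
```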
Which function in R is commonly used for PCA?
A) pca.test()
B) prcomp()
C) factor.pca()
D) pc.analysis()
What does PCA aim to do?
A) Standardize the data
B) Reduce dimensionality
C) Perform hypothesis testing
D) Normalize categorical variables
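One possible illustration of PCA with prcomp(), using the built-in iris data set (the choice of data set and of keeping two components is an assumption for this sketch):

```r
# Built-in iris data: four numeric measurement columns
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Proportion of variance explained by each principal component
print(summary(pca)$importance)

# Keep only the first two components: 4 dimensions reduced to 2
reduced <- pca$x[, 1:2]
print(dim(reduced))
```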
Which function is used for hierarchical clustering in R?
A) cluster.hclust()
B) hclust()
C) hier.cluster()
D) dendro.cluster()
K-means clustering requires specifying which parameter?
A) The number of variables
B) The number of clusters (k)
C) The correlation matrix
D) The Euclidean distance
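The two clustering cards can be sketched side by side. The points are simulated (hypothetical) and deliberately well separated; nstart = 5 is an extra safeguard chosen for this sketch, not a requirement.

```r
set.seed(42)  # kmeans() uses random starting centers

# Hypothetical data: two well-separated groups of 10 points in 2D
pts <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))

# Hierarchical clustering works on a distance matrix; cut the tree into 2 groups
hc <- hclust(dist(pts))
hc_groups <- cutree(hc, k = 2)

# k-means needs the number of clusters (k) up front
km <- kmeans(pts, centers = 2, nstart = 5)

# Cross-tabulate the two cluster assignments
print(table(hc_groups, km$cluster))
```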
When should you use a one-sample t-test?
A) When comparing the mean of a sample to a known population mean
B) When comparing two independent samples
C) When analyzing paired observations
D) When testing for categorical relationships
When should you use a paired t-test?
A) When comparing two measurements from the same subjects
C) When testing for correlation
D) When testing multiple variables simultaneously
When should you use a one-way ANOVA?
A) When comparing means across more than two groups
B) When analyzing paired data
C) When comparing two independent groups
D) When testing for correlation
What is the null hypothesis in a Chi-square test?
A) There is no association between categorical variables
B) The means of multiple groups are equal
C) The dataset follows a normal distribution
D) The correlation coefficient is zero
When should you use the Wilcoxon Rank-Sum Test instead of a t-test?
A) When data is not normally distributed
B) When testing for correlation
C) When comparing means across more than two groups
D) When working with categorical data
What is the Kruskal-Wallis test used for?
A) Comparing more than two independent groups when data is not normally distributed
B) Testing for correlation
C) Analyzing paired data
D) Checking for multicollinearity
What does the Mann-Whitney U test compare?
A) Differences between two independent groups when normality cannot be assumed
B) Differences between paired samples
C) The variance of multiple groups
D) The presence of outliers
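In R, the Wilcoxon rank-sum test and the Mann-Whitney U test are the same procedure, run with wilcox.test(); kruskal.test() covers more than two groups. A sketch with invented data:

```r
# Hypothetical measurements for independent groups
group_a <- c(1.2, 2.3, 1.8, 2.9, 2.1, 1.5)
group_b <- c(3.4, 4.1, 3.9, 5.2, 4.6, 3.7)
group_c <- c(6.1, 5.8, 6.9, 7.4, 6.3, 5.5)

# Two independent groups: Wilcoxon rank-sum (Mann-Whitney U) test
res <- wilcox.test(group_a, group_b)
print(res$p.value)

# Three or more independent groups: Kruskal-Wallis test
kw <- kruskal.test(list(group_a, group_b, group_c))
print(kw$p.value)
```

For paired samples, wilcox.test() takes paired = TRUE, which gives the Wilcoxon signed-rank test instead.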
What is the purpose of logistic regression?
A) Predicting categorical outcomes
B) Predicting continuous outcomes
C) Testing for normality
D) Comparing means across groups
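A sketch of logistic regression with glm(), using the built-in mtcars data as a stand-in for a yes/no outcome (the choice of am as outcome and wt as predictor is only for illustration):

```r
# Built-in mtcars data: am is binary (0 = automatic, 1 = manual), wt is weight in 1000 lbs
fit <- glm(am ~ wt, data = mtcars, family = binomial)
print(coef(fit))

# Predicted probability of a manual transmission for a hypothetical car with wt = 3
p <- predict(fit, newdata = data.frame(wt = 3), type = "response")
print(p)
```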
When should you use multiple linear regression?
A) When predicting a continuous outcome based on multiple predictor variables
B) When analyzing a categorical dependent variable
C) When testing for correlation between two variables
D) When checking for normality
What is multicollinearity in regression analysis?
A) A situation where predictor variables are highly correlated with each other
B) A method to check if residuals are normally distributed
C) A test for model accuracy
D) A measure of goodness-of-fit
Which statistical test should you use to compare the means of two independent groups?
A) Paired t-test
B) Independent t-test
C) One-way ANOVA
D) Chi-square test
If you want to test whether a dataset follows a normal distribution, which test should you use?
A) Chi-square test
B) One-way ANOVA
C) Shapiro-Wilk test
D) Mann-Whitney U test
You want to compare the means of three different independent groups. Which test should you use?
A) Independent t-test
B) One-way ANOVA
C) Chi-square test
D) Wilcoxon test
What statistical test should be applied to analyze the relationship between two categorical variables?
D) Pearson correlation
You have two independent groups with non-normally distributed data. Which test should you use?
B) Mann-Whitney U test
D) Paired t-test
You want to measure the relationship between two continuous variables. Which test should you use?
A) One-way ANOVA
C) Pearson correlation
If you suspect that the relationship between two variables is not linear, which correlation test should you use?
A) Pearson correlation
B) Spearman correlation
D) Shapiro-Wilk test
You are comparing the means of two independent groups with small sample sizes. Which assumption must be met for using an independent t-test?
A) Normality of data in both groups
B) Data must be categorical
C) Groups must be paired
D) The number of samples must be equal
You are comparing means across four different groups. If your ANOVA result is significant, what should be your next step?
A) Conclude all group means are different
B) Perform post-hoc tests (e.g., Tukey's test)
C) Conduct a Pearson correlation test
D) Recalculate p-values with a lower alpha
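A sketch of the ANOVA-then-post-hoc workflow. It uses three hypothetical groups rather than four to stay short; the steps are the same.

```r
# Hypothetical data: three groups of five observations
values <- c(5.0, 5.2, 4.8, 5.1, 4.9,
            6.0, 6.2, 5.9, 6.1, 5.8,
            7.1, 6.9, 7.0, 7.2, 6.8)
group <- factor(rep(c("A", "B", "C"), each = 5))

# One-way ANOVA: is there any difference among the group means?
fit <- aov(values ~ group)
print(summary(fit))

# Tukey's HSD shows which pairs of groups differ
posthoc <- TukeyHSD(fit)
print(posthoc)
```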
You are analyzing survey responses with two categorical variables. Which statistical test is appropriate?
D) Spearman correlation
You suspect that your two groups do not follow a normal distribution. What is the best alternative to an independent t-test?
A) Mann-Whitney U test
D) Wilcoxon signed-rank test
You need to compare medians of two related samples. Which test should you use?
C) Wilcoxon signed-rank test
You need to predict whether customers will buy a product (yes/no) based on various predictor variables. Which statistical method should you use?
A) Linear regression
C) Logistic regression
You are performing a regression analysis, but your residuals show strong heteroscedasticity. What should you do?
A) Transform the dependent variable or use robust regression
B) Ignore the issue if p-values are significant
C) Apply a Pearson correlation test instead
D) Remove outliers until residuals are homogeneous
You suspect collinearity among predictor variables in your regression model. How can you check for it?
A) Calculate Variance Inflation Factor (VIF)
B) Perform a t-test on each predictor
C) Run a Shapiro-Wilk test
D) Use a Chi-square test
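The VIF of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the others. The sketch below computes it by hand in base R on three mtcars columns (chosen only for illustration); the car package's vif() function is a common alternative.

```r
# Hypothetical predictor set taken from the built-in mtcars data
predictors <- mtcars[, c("wt", "disp", "hp")]

# For each predictor: regress it on the others and convert R^2 to a VIF
vif_manual <- sapply(names(predictors), function(p) {
  others <- setdiff(names(predictors), p)
  r2 <- summary(lm(reformulate(others, response = p), data = predictors))$r.squared
  1 / (1 - r2)
})
print(round(vif_manual, 2))
```

A VIF of 1 means no linear overlap with the other predictors; larger values indicate stronger collinearity.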
How do you append the rows of one data frame to another?
D = rbind(D, new_students)
How do you create a new ID column in a data frame?
ID = paste0("ID_", 1:nrow(D))
D = cbind(ID, D)
What does D$ID = ID do?
It adds ID as a new column of D, placed as the last column.
How do you combine two data frames by a shared column?
M = merge(D1, D2, by="ID")
By default this keeps only rows whose ID occurs in both data frames; with all = TRUE, all rows from both are kept and missing values are filled with NA.
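A small sketch of the two merge() behaviours, with two hypothetical data frames that share only some IDs:

```r
# Hypothetical data frames; ID_2 and ID_3 occur in both
D1 <- data.frame(ID = c("ID_1", "ID_2", "ID_3"), age = c(21, 25, 30))
D2 <- data.frame(ID = c("ID_2", "ID_3", "ID_4"), score = c(88, 92, 75))

inner <- merge(D1, D2, by = "ID")              # only IDs present in both
outer <- merge(D1, D2, by = "ID", all = TRUE)  # all IDs, gaps filled with NA

print(inner)
print(outer)
```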
How do you store a data frame and export it?
write.table(D, "my_dataframe.csv", sep=";", col.names=TRUE, row.names=FALSE)
How do you import data?
D = read.table("name.csv", sep=";", header=TRUE)
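A round-trip sketch of export and import. The data frame is hypothetical, and a temporary file is used instead of a fixed file name so nothing in the working directory is overwritten.

```r
# Hypothetical data frame
D <- data.frame(ID = c("ID_1", "ID_2"), score = c(1.5, 2.5))

# Write to a temporary file with ";" as the separator
path <- tempfile(fileext = ".csv")
write.table(D, path, sep = ";", col.names = TRUE, row.names = FALSE)

# Read it back; header = TRUE treats the first line as column names
D_back <- read.table(path, sep = ";", header = TRUE)
print(D_back)
```

Note that R's logical constants are TRUE and FALSE in capitals; True and False are not defined.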
What types of correlation can be applied?
Pearson, Spearman and Kendall
What does the Pearson correlation assume?
A) Assumes a monotone relationship but makes no assumption about the distribution of the variables
B) Assumes linearity, a monotone relationship and that both variables are normally distributed
C) No assumptions about distribution or relationship (can work with ties)
B
What does the Kendall correlation assume?
What does the Spearman correlation assume?
A
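All three correlation types are available through the method argument of cor(). The sketch below uses a made-up relationship that is monotone but not linear, where the rank-based measures reach 1 while Pearson stays below 1.

```r
x <- 1:10
y <- x^3   # monotone but clearly non-linear

r_pearson  <- cor(x, y, method = "pearson")
r_spearman <- cor(x, y, method = "spearman")
r_kendall  <- cor(x, y, method = "kendall")

print(c(pearson = r_pearson, spearman = r_spearman, kendall = r_kendall))
```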
What is inferential statistics?
Inferential statistics uses a sample to draw conclusions about the population.
When does the p-value get smaller?
The p-value gets smaller with:
- A greater difference between the means of the two groups
- Smaller variances within the groups
- Larger sample sizes
What is the p-value?
The p-value is the probability of obtaining results at least as extreme as those observed if the null hypothesis (e.g., no association between the two variables) is true.
Which test should you use to analyse multiple groups that are normally distributed and have equal variance?
ANOVA, followed by Tukey's HSD test
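Which test when: multiple groups, normal distribution, unequal variance?
Welch's ANOVA and Games-Howell test
Which test when: two groups, normal distribution, unequal variance?
Welch's t-test
Which test when: two groups, no normal distribution?
Mann-Whitney U test
Which test when: multiple groups, no normal distribution?
Kruskal-Wallis test and Dunn's test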
How do you check for normal distribution?
With the Shapiro-Wilk test: shapiro.test()
How do you check for equal variance?
With Bartlett's test: bartlett.test()
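A sketch of both assumption checks on simulated (hypothetical) data. In each test the null hypothesis is that the assumption holds, so a small p-value signals a violation.

```r
set.seed(7)  # fixed seed only so the sketch is reproducible
a <- rnorm(30, mean = 10, sd = 2)
b <- rnorm(30, mean = 12, sd = 2)

# Shapiro-Wilk: null hypothesis is that the sample comes from a normal distribution
p_normal <- shapiro.test(a)$p.value
print(p_normal)

# Bartlett: null hypothesis is that all groups have equal variance
p_var <- bartlett.test(list(a, b))$p.value
print(p_var)
```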