Which statements regarding the comparisons of expression values are correct?
1) The main reason why expression values are log-transformed is to meet the assumption of a t-test
2) If one compares the mean of a sample of 10 measurements to a reference value, one can use a t-distribution with 9 degrees of freedom to determine a significant deviation
3) A paired t-test is also called a non-parametric test
4) Non-overlapping error bars representing the standard error of the mean always imply a significant (p<5%) difference between the means of the two populations
5) Non-overlapping error bars representing the 95% C.I. of the mean always imply a significant (p<5%) difference between the means of the two populations
6) An ANOVA that compares two groups for one factor gives exactly the same p-value as a two-sample t-test
1) The main reason why expression values are log-transformed is to meet the assumption of a t-test -> wrong; the main reason is that we analyze relative expression levels, i.e. we are interested in fold changes, not absolute changes
2) If one compares the mean of a sample of 10 measurements to a reference value, one can use a t-distribution with 9 degrees of freedom to determine a significant deviation -> correct (one-sample t-test, N-1 = 9 degrees of freedom)
3) A paired t-test is also called a non-parametric test -> wrong; a t-test is parametric because it assumes a (normal) distribution, whereas non-parametric tests do not assume a particular distribution
4) Non-overlapping error bars representing the standard error of the mean always imply a significant (p<5%) difference between the means of the two populations -> wrong
5) Non-overlapping error bars representing the 95% C.I. of the mean always imply a significant (p<5%) difference between the means of the two populations -> correct
6) An ANOVA that compares two groups for one factor gives exactly the same p-value as a two-sample t-test -> correct
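Statement 6 can be checked numerically. The following is a minimal sketch, assuming the classical equal-variance (Student) t-test rather than Welch's variant; the expression values are hypothetical and only serve as an illustration. For two groups the ANOVA F statistic equals the squared t statistic, and since an F-distribution with (1, N-2) degrees of freedom is the distribution of a squared t with N-2 degrees of freedom, the two p-values coincide.

```python
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances (Student's t-test)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def one_way_anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical log2 expression values, for illustration only
patients = [8.1, 9.3, 8.7, 9.2, 8.6]
controls = [7.4, 8.0, 7.9, 7.3, 7.8]

t = pooled_t_statistic(patients, controls)
f = one_way_anova_f([patients, controls])
print(t ** 2, f)  # the two numbers agree up to floating-point rounding
```

With Welch's unequal-variance t-test the p-values would in general differ from the ANOVA result.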
In a paper it is written that “… Gene TBP53 is 2.3-fold higher expressed in patients (mean=8.8, sd=0.56, N=5) than in controls (mean=7.7, sd=0.37, N=5), which is significant (t-test, p=0.005).” Which statements are correct?
1) Mean expression levels are given in log2 space
2) The standard error of the mean for patients is 0.56
3) The standard error of the mean for patients is 0.56/sqrt(5) ≈ 0.25
4) Less than 0.5% of the patients have lower expression levels than controls
5) If they tested 10,000 genes on a microarray and this is the gene with the most significant p-value, it is very likely a false positive result
1) Mean expression levels are given in log2 space -> correct
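2) The standard error of the mean for patients is 0.56 -> wrong; 0.56 is the standard deviation
3) The standard error of the mean for patients is 0.56/sqrt(5) ≈ 0.25 -> correct
4) Less than 0.5% of the patients have lower expression levels than controls -> wrong; the p-value refers to the difference between the means, not to the fraction of individual patients
5) If they tested 10,000 genes on a microarray and this is the gene with the most significant p-value, it is very likely a false positive result -> correct; with 10,000 tests, about 50 genes are expected to reach p<=0.005 by chance alone even if no gene is truly differentially expressed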
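The arithmetic behind these answers can be reproduced in a few lines. This is a small sketch using only the numbers quoted from the paper; the variable names are chosen here for illustration. Note that the difference of the two means, read as log2 values, corresponds to a fold change of about 2.1, which is in the range of the reported 2.3-fold and is what suggests log2 space.

```python
import math

# Values quoted in the paper
mean_patients, sd_patients, n_patients = 8.8, 0.56, 5
mean_controls = 7.7

# Standard error of the mean = sd / sqrt(N)
sem_patients = sd_patients / math.sqrt(n_patients)

# If the means are log2 values, their difference is a log2 fold change
fold_change = 2 ** (mean_patients - mean_controls)

# Expected number of genes with p <= 0.005 if all 10,000 null hypotheses were true
n_genes, p_value = 10_000, 0.005
expected_false_positives = n_genes * p_value

print(round(sem_patients, 2), round(fold_change, 2), expected_false_positives)
```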
You have 12 brain samples from Alzheimer patients and 12 age- and sex-matched controls. You isolate RNA twice from each sample (i.e. make technical replicates) and sequence 48 RNA-Seq libraries in total. You quantify gene expression levels for all detectable genes and want to identify genes differentially expressed in Alzheimer patients using a two-sample t-test. Which statements are correct?
1) The two technical replicates are not independent since they come from the same individual
2) One could average the expression levels of the two technical replicates per individual and then calculate a t-test with N=12 per group in order to avoid pseudoreplication
3) The technical replicates make it possible to estimate how much variation is caused by RNA isolation, library preparation and sequencing
4) Since the RNAs were independently isolated, one can calculate the t-test assuming N=24 for patients and N=24 for controls
5) The variance between technical replicates is expected to be smaller than the variance between two Alzheimer patients (i.e. between biological replicates)
1) The two technical replicates are not independent since they come from the same individual -> correct
2) One could average the expression levels of the two technical replicates per individual and then calculate a t-test with N=12 per group in order to avoid pseudoreplication -> correct
3) The technical replicates make it possible to estimate how much variation is caused by RNA isolation, library preparation and sequencing -> correct
4) Since the RNAs were independently isolated, one can calculate the t-test assuming N=24 for patients and N=24 for controls -> wrong; this would be pseudoreplication, since the replicates of one individual are not independent
5) The variance between technical replicates is expected to be smaller than the variance between two Alzheimer patients (i.e. between biological replicates) -> correct
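Averaging the technical replicates, as described in statement 2, could look like the sketch below. The sample IDs and expression values are hypothetical, and only three patients are shown to keep the example short; the same step would be applied to all 12 patients and 12 controls before running the t-test with N=12 per group.

```python
from statistics import mean, variance

# Hypothetical log2 expression values of one gene:
# two technical replicates per individual (illustration only)
patients = {
    "P01": (8.9, 9.1),
    "P02": (8.2, 8.4),
    "P03": (9.4, 9.3),
}

def collapse_replicates(samples):
    """Average the technical replicates so each individual contributes one value."""
    return {individual: mean(reps) for individual, reps in samples.items()}

per_individual = collapse_replicates(patients)

# Technical variance: average within-individual variance of the replicates
technical_var = mean([variance(reps) for reps in patients.values()])

# Variance between per-individual means (biological variation,
# still containing a share of the technical noise)
biological_var = variance(list(per_individual.values()))

print(per_individual, technical_var, biological_var)
```

In this made-up example the technical variance is far smaller than the variance between individuals, which is the pattern statement 5 expects.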
Which statements regarding the false discovery rate (FDR) are true?
1) The FDR is used to correct for parametric assumptions
2) The FDR is used to correct for multiple testing
3) The Benjamini-Hochberg procedure is a common way to calculate the FDR
4) The Bonferroni correction is a common way to calculate the FDR
5) Storey’s q-value is a common way to calculate the FDR
1) The FDR is used to correct for parametric assumptions -> wrong
2) The FDR is used to correct for multiple testing -> correct
3) The Benjamini-Hochberg procedure is a common way to calculate the FDR -> correct
4) The Bonferroni correction is a common way to calculate the FDR -> wrong; Bonferroni controls the family-wise error rate rather than the FDR and is too conservative for this purpose
5) Storey’s q-value is a common way to calculate the FDR -> correct
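To make the difference between statements 3 and 4 concrete, here is a minimal sketch of Benjamini-Hochberg adjusted p-values next to Bonferroni-adjusted ones. The function name and the five p-values are chosen here for illustration; in practice one would typically rely on an established statistics package rather than this hand-written version.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values from five tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.27]

bh = benjamini_hochberg(p_values)
bonferroni = [min(1.0, p * len(p_values)) for p in p_values]

print(bh)          # approximately [0.005, 0.02, 0.05125, 0.05125, 0.27]
print(bonferroni)  # approximately [0.005, 0.04, 0.195, 0.205, 1.0]
```

Except for the smallest p-value, the Bonferroni values are larger than the Benjamini-Hochberg values, which illustrates why Bonferroni is considered too conservative when thousands of genes are tested.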
Which statements regarding the phenomenon of p-value hacking are correct?
1) If one performs 10 independent statistical tests, there is a >40% chance that at least one of them has a p-value of <0.05, even if the null hypothesis is true for all ten tests
2) If one gets a p-value of 0.06 for the test of a hypothesis, then decides to correct for sex, and then gets a p-value of 0.04, the chance that the null hypothesis is true is still below 5%
3) A p-value of 0.05 means that the alternative hypothesis is true with 95% probability
4) Collecting data until the p-value drops below 0.05 is not regarded as p-value hacking as it does not alter the probability of rejecting the null hypothesis
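1) If one performs 10 independent statistical tests, there is a >40% chance that at least one of them has a p-value of <0.05, even if the null hypothesis is true for all ten tests -> correct (chance of getting no p-value <0.05 in ten tests = 0.95^10 ≈ 0.599, so the chance of at least one p<0.05 is ≈ 40.1%)
2) If one gets a p-value of 0.06 for the test of a hypothesis, then decides to correct for sex, and then gets a p-value of 0.04, the chance that the null hypothesis is true is still below 5% -> wrong; choosing the analysis after seeing the result inflates the false-positive rate, and a p-value is not the probability that the null hypothesis is true
3) A p-value of 0.05 means that the alternative hypothesis is true with 95% probability -> wrong; the p-value is the probability of observing data at least this extreme if the null hypothesis is true
4) Collecting data until the p-value drops below 0.05 is not regarded as p-value hacking as it does not alter the probability of rejecting the null hypothesis -> wrong; stopping as soon as significance is reached inflates the false-positive rate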
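The calculation in answer 1 generalizes to any number of independent tests. A short sketch, with a function name chosen here for illustration and assuming all null hypotheses are true and the tests are independent:

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of at least one p < alpha among n independent tests under the null."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 10, 100):
    print(n, round(prob_at_least_one_false_positive(n), 4))
```

For ten tests this gives about 0.4013, matching the >40% stated above.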