Which statements regarding the generation of (single-cell) RNA-Seq libraries for Illumina sequencers are correct?
1) The Human Cell Atlas is an international project that was finished in 2017
2) The total RNA content per human cell is ~10 ng
3) Early barcoding refers to the pooling of cDNA from different samples before library generation and thereby increases the throughput of library generation
4) Current single-cell RNA-seq technologies have made it possible to generate scRNA-seq datasets of up to 1000 cells
5) UMIs can be used to identify cDNA molecules that have been generated by amplification and not by different reverse transcription events
1) The Human Cell Atlas is an international project that was finished in 2017 -> wrong, it started in 2016 and is still ongoing
2) The total RNA content per human cell is ~10 ng -> wrong, it is rather ~10-30 pg, so roughly 1000x less than 10 ng (and ~0.25 million mRNA molecules per cell)
3) Early barcoding refers to the pooling of cDNA from different samples before library generation and thereby increases the throughput of library generation -> correct
4) Current single-cell RNA-seq technologies have made it possible to generate scRNA-seq datasets of up to 1000 cells -> wrong, far more: the largest published scRNA-seq dataset comprises over 1 million cells (so about 1000x more)
5) UMIs can be used to identify cDNA molecules that have been generated by amplification and not by different reverse transcription events -> correct
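The idea behind statement 5 can be illustrated with a minimal sketch. It assumes each mapped read has already been reduced to a (cell barcode, gene, UMI) tuple and that identical tuples are PCR duplicates. The function name `count_molecules` and the example reads are hypothetical, and real pipelines additionally correct for sequencing errors in the UMI.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell, gene) pair.

    reads: iterable of (cell_barcode, gene, umi) tuples, one per mapped read.
    Reads sharing all three values are assumed to be amplification copies
    of one cDNA molecule and are counted only once.
    """
    seen = set()
    counts = defaultdict(int)
    for cell, gene, umi in reads:
        key = (cell, gene, umi)
        if key in seen:
            continue  # same UMI seen again: treated as a PCR duplicate
        seen.add(key)
        counts[(cell, gene)] += 1
    return dict(counts)


# Hypothetical example reads (made-up barcodes and UMIs)
reads = [
    ("CELL_A", "GAPDH", "AACGT"),
    ("CELL_A", "GAPDH", "AACGT"),  # amplification copy of the read above
    ("CELL_A", "GAPDH", "TTGCA"),  # different UMI: separate RT event
    ("CELL_B", "GAPDH", "AACGT"),  # same UMI, but a different cell
    ("CELL_A", "ACTB", "AACGT"),   # same UMI, but a different gene
]
umi_counts = count_molecules(reads)
print(umi_counts)
```

Five reads collapse to four molecules here, because only the second read shares cell, gene and UMI with an earlier one.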
Which statements regarding quantitative RNA-seq analysis are correct?
1) BLAT is commonly used to map RNA-seq reads
2) Normalization allows absolute expression levels to be compared among samples
3) Read counts per transcript are more difficult to estimate than read counts per gene
4) RPKM is a measure of the expression normalized for transcript length and normalized for the total number of reads sequenced
5) Demultiplexing is the process in which reads are separated based on their barcodes that label different RNA-seq libraries
1) BLAT is commonly used to map RNA-seq reads -> wrong, BLAT is too slow; commonly used mappers are e.g. Bowtie, BWA, STAR, NextGenMap, GSNAP
2) Normalization allows absolute expression levels to be compared among samples -> wrong, it does not allow absolute expression levels to be compared (which is nearly impossible among samples); one normalizes to the total number of reads, assuming either that each sample contains the same amount of RNA molecules or that most genes are not differentially expressed
3) Read counts per transcript are more difficult to estimate than read counts per gene -> correct
4) RPKM is a measure of the expression normalized for transcript length and normalized for the total number of reads sequenced -> correct
5) Demultiplexing is the process in which reads are separated based on their barcodes that label different RNA-seq libraries -> correct
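The RPKM definition in statement 4 can be written out as a short calculation. This is a sketch using the usual formula (reads per kilobase of transcript per million mapped reads); the function name `rpkm` and the example numbers are made up for illustration.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Dividing by length in kb corrects for transcript length, and dividing
    by total reads in millions corrects for sequencing depth.
    """
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: two transcripts in a library of 10 million mapped reads
short_transcript = rpkm(500, 2_000, 10_000_000)
long_transcript = rpkm(1_000, 4_000, 10_000_000)
print(short_transcript, long_transcript)
```

The longer transcript collects twice as many reads but is twice as long, so both transcripts end up with the same RPKM.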
Which of the following statements regarding statistics are correct?
1) If the mean and standard deviation of a normal distribution are 0 and 1, respectively, 95% of the data fall between -1.96 and +1.96
2) To halve the standard error of the mean for a normal distribution, one needs to increase the sample size two-fold
3) The central limit theorem states that for many population distributions that are not normally distributed, the distribution of sample means will become increasingly close to a normal distribution as the sample size increases
4) The standard deviation cannot be negative
5) If a sample of n=100 is taken from a population whose standard deviation is equal to 10 then the standard error of the mean is expected to be 10
1) If the mean and standard deviation of a normal distribution are 0 and 1, respectively, 95% of the data fall between -1.96 and +1.96 -> correct
2) To halve the standard error of the mean for a normal distribution, one needs to increase the sample size two-fold -> wrong, a four-fold increase would be necessary because SEM = s/sqrt(N)
3) The central limit theorem states that for many population distributions that are not normally distributed, the distribution of sample means will become increasingly close to a normal distribution as the sample size increases -> correct
4) The standard deviation cannot be negative -> correct (because it is the square root of the variance)
5) If a sample of n=100 is taken from a population whose standard deviation is equal to 10 then the standard error of the mean is expected to be 10 -> wrong, SEM = s/sqrt(N) with s=10 and N=100 -> 10/sqrt(100) = 10/10 = 1
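The numbers in answers 1, 2 and 5 can be checked with a few lines of Python. This is a small sketch using only the standard library (`statistics.NormalDist` requires Python 3.8 or later); the helper name `standard_error` is introduced here for illustration.

```python
import math
from statistics import NormalDist


def standard_error(sd, n):
    """Standard error of the mean: SEM = s / sqrt(N)."""
    return sd / math.sqrt(n)


# Answer 5: s = 10, N = 100 gives an SEM of 1, not 10
sem_100 = standard_error(10, 100)

# Answer 2: halving the SEM needs four times the sample size
sem_400 = standard_error(10, 400)

# Answer 1: share of a standard normal distribution within +/- 1.96
standard_normal = NormalDist(mu=0, sigma=1)
coverage = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(sem_100, sem_400, round(coverage, 3))
```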