What kind of methods do you know to filter for empty droplets and what do you need to consider for that?

Cell Ranger:

Barcode rank plot: UMI counts per barcode (i.e. transcript molecules detected, not genes) plotted against the barcode rank to determine the number of cells

everything ______

CAVE: always provide an estimate of the expected number of cells

emptyDrops:

(1) all droplets with fewer than 100 UMIs are treated as empty and filtered out (they define the ambient RNA profile)

(2) find knee point and inflection point

(3) test each barcode in this range against the ambient RNA profile (significant deviation = cell)
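Steps (1) and (2) can be sketched in a few lines of Python. This is only a rough illustration, not the emptyDrops implementation: the function name `barcode_rank_knee`, the `lower` argument and the knee heuristic (point of the log-log rank curve farthest above the straight line between its two ends) are assumptions made for this sketch, and the inflection point is not computed.

```python
import numpy as np

def barcode_rank_knee(umi_counts, lower=100):
    """Return (number of ambient barcodes, UMI count at the knee).

    umi_counts: total UMIs per barcode; `lower` must be > 0.
    The knee heuristic below is an illustrative assumption.
    """
    counts = np.asarray(umi_counts, dtype=float)

    # Step (1): barcodes below `lower` UMIs are treated as empty droplets.
    ambient = counts < lower

    # Step (2): rank the remaining barcodes by decreasing UMI count.
    kept = np.sort(counts[~ambient])[::-1]
    if kept.size < 2:
        raise ValueError("need at least two barcodes above the lower bound")

    # Barcode rank curve on log-log axes.
    x = np.log10(np.arange(1, kept.size + 1))
    y = np.log10(kept)

    # Straight line (chord) from the first to the last point of the curve;
    # the knee is taken as the point lying farthest above that chord.
    chord = y[0] + (y[-1] - y[0]) * (x - x[0]) / (x[-1] - x[0])
    knee_idx = int(np.argmax(y - chord))

    return int(ambient.sum()), float(kept[knee_idx])
```

Barcodes with at least the returned knee count would be called cells directly; the barcodes between `lower` and the knee are the ones that still need a closer look.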

How can you estimate the rate of doublets?

idea: sequencing human samples/tumors in mice:

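just plot #mouse UMIs vs. #human UMIs per barcode -> double-positive fraction (only mixed-species doublets are visible, so the true doublet rate is higher)

alternative: pool samples from different donors and use the SNPs in the RNA-seq reads; droplets carrying SNPs of more than one donor are doublets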
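The species-mixing idea (human vs. mouse UMIs per barcode) can be sketched as follows. The function name, the `purity` threshold of 0.9 and the correction for invisible same-species doublets (assuming cells pair up at random) are assumptions for this sketch, not a fixed standard.

```python
import numpy as np

def species_mixing_doublet_rate(human_umis, mouse_umis, purity=0.9):
    """Return (observed mixed fraction, estimated total doublet rate)."""
    human = np.asarray(human_umis, dtype=float)
    mouse = np.asarray(mouse_umis, dtype=float)
    total = human + mouse

    # Ignore barcodes without any UMIs.
    keep = total > 0
    frac_human = human[keep] / total[keep]

    # Assumed rule: a barcode is single-species if at least `purity`
    # of its UMIs come from one species, otherwise double-positive.
    is_human = frac_human >= purity
    is_mouse = frac_human <= 1 - purity
    is_mixed = ~(is_human | is_mouse)
    observed = float(is_mixed.mean())

    # Species proportions among the single-species barcodes.
    n_h, n_m = is_human.sum(), is_mouse.sum()
    if n_h == 0 or n_m == 0:
        return observed, float("nan")  # no correction possible
    p_h = n_h / (n_h + n_m)
    p_m = 1.0 - p_h

    # With random pairing only a fraction 2 * p_h * p_m of all doublets
    # is human+mouse; human+human and mouse+mouse doublets stay hidden.
    estimated_total = observed / (2 * p_h * p_m)
    return observed, float(estimated_total)
```

With a 50/50 mix of species, half of all doublets are same-species, so the estimated total rate is about twice the double-positive fraction.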

other methods available:

scDblFinder (R package):

Scrublet (Python package): artificially create doublets (pool expression values of different columns), then cluster the observed transcriptomes together with the synthetic doublets; observed cells that end up among the synthetic doublets are likely doublets

CAVE: differentiate between clustered cells and continuous expression data (trajectory), but doublet filtering might not be as important, as these cells cluster away anyway
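The synthetic-doublet idea can be illustrated with plain NumPy. This is a simplified sketch and not Scrublet's actual API or scoring: the function name, the parameters `n_sim` and `k`, the cells-by-genes layout (cells as rows here, not columns) and the score (fraction of synthetic doublets among the k nearest neighbours) are all assumptions.

```python
import numpy as np

def synthetic_doublet_scores(counts, n_sim=None, k=15, seed=0):
    """counts: cells x genes raw count matrix.

    Returns one score per observed cell: the fraction of its k nearest
    neighbours that are synthetic doublets (illustrative score only).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]
    n_sim = n_sim or 2 * n_cells  # assumed default: 2 synthetic per cell

    # Synthetic doublets: add the raw counts of two random observed cells.
    i = rng.integers(0, n_cells, n_sim)
    j = rng.integers(0, n_cells, n_sim)
    combined = np.vstack([counts, counts[i] + counts[j]])

    # Library-size normalisation and log transform.
    lib = np.maximum(combined.sum(axis=1, keepdims=True), 1.0)
    norm = np.log1p(combined / lib * 1e4)

    # Squared Euclidean distances from each observed cell to all profiles.
    obs = norm[:n_cells]
    sq = (norm ** 2).sum(axis=1)
    d2 = sq[:n_cells, None] - 2.0 * obs @ norm.T + sq[None, :]
    d2[np.arange(n_cells), np.arange(n_cells)] = np.inf  # exclude self

    # Indices >= n_cells belong to synthetic doublets.
    nn = np.argpartition(d2, k, axis=1)[:, :k]
    return (nn >= n_cells).mean(axis=1)
```

Doublets made of two similar cells look like singlets after normalisation, so a score like this mainly picks up doublets of different cell types, which fits the caveat about continuous trajectories.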

What kind of distribution would you use to model scRNAseq and what kind of parameter does it take into account? How can you then estimate the technical variation?

Poisson/Negative Binomial/Regularized Negative Binomial distribution

Poisson: one parameter; models the distribution from the mean gene expression across cells (variance = mean)

Regularized Negative Binomial: middle ground between Negative Binomial and Poisson distribution (most stable)
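For comparison, the standard mean-variance relations of the two base distributions are given below. The symbols are notation chosen here and not taken from the notes: $\mu$ is the mean expression of a gene and $\theta$ the Negative Binomial dispersion parameter.

```latex
\begin{align}
\text{Poisson:} \quad & \operatorname{Var}(X) = \mu \\
\text{Negative Binomial:} \quad & \operatorname{Var}(X) = \mu + \frac{\mu^{2}}{\theta}
\end{align}
```

For $\theta \to \infty$ the extra term vanishes and the Negative Binomial reduces to the Poisson distribution.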

technical variation: the variance of the residuals can be computed from the fitted Poisson distribution
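A minimal sketch of the residual idea, assuming Pearson residuals and a simple expected count per cell and gene (cell total x gene total / grand total). The function name and the `theta` argument are assumptions for this sketch; `theta=np.inf` gives the Poisson case, a finite `theta` a Negative Binomial variance.

```python
import numpy as np

def pearson_residual_variance(counts, theta=np.inf):
    """counts: cells x genes raw count matrix.

    Returns the per-gene variance of the Pearson residuals.
    """
    counts = np.asarray(counts, dtype=float)

    # Expected count if a gene only followed sequencing depth:
    # mu[c, g] = (total of cell c) * (total of gene g) / grand total.
    cell_totals = counts.sum(axis=1, keepdims=True)
    gene_totals = counts.sum(axis=0, keepdims=True)
    mu = cell_totals * gene_totals / counts.sum()

    # Model variance: mu for Poisson (theta = inf), mu + mu^2 / theta for NB.
    var = mu + mu ** 2 / theta

    # Pearson residual; the floor avoids division by zero for empty genes.
    resid = (counts - mu) / np.sqrt(np.maximum(var, 1e-12))
    return resid.var(axis=0)
```

Genes whose residual variance stays close to 1 behave like pure technical noise under the model; clearly larger values point to additional (biological) variation.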
