DSAVE: Detection of misclassified cells in single-cell RNA-Seq data
Journal article, 2020

Single-cell RNA sequencing has become a valuable tool for investigating cell types in complex tissues, where clustering of cells enables the identification and comparison of cell populations. Although many studies have sought to develop and compare different clustering approaches, a deeper investigation into the properties of the resulting populations is lacking. Specifically, the presence of misclassified cells can influence downstream analyses, highlighting the need to assess subpopulation purity and to detect such cells. We developed DSAVE (Down-SAmpling based Variation Estimation), a method to evaluate the purity of single-cell transcriptome clusters and to identify misclassified cells. The method utilizes down-sampling to eliminate differences in sampling noise and uses a log-likelihood based metric to help identify misclassified cells. In addition, DSAVE estimates the number of cells needed in a population to achieve a stable average gene expression profile within a certain gene expression range. We show that DSAVE can be used to find potentially misclassified cells that are not detectable by similar tools and reveal the cause of their divergence from the other cells, such as differing cell state or cell type. With the growing use of single-cell RNA-seq, we foresee that DSAVE will be an increasingly useful tool for comparing and purifying subpopulations in single-cell RNA-Seq datasets.
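The two ideas named in the abstract, down-sampling cells to a common depth and scoring each cell with a log-likelihood, can be illustrated with a minimal sketch. This is not the DSAVE implementation: the function names, the multinomial model, the pooled cluster profile, and the 0.5 pseudo-count are all assumptions made here for illustration only.

```python
import numpy as np
from math import lgamma


def downsample_counts(counts, target, rng):
    """Down-sample one cell's count vector to `target` total counts.

    Counts are drawn without replacement, so no gene can gain counts.
    """
    counts = np.asarray(counts, dtype=np.int64)
    if counts.sum() < target:
        raise ValueError("cell has fewer counts than the down-sampling target")
    return rng.multivariate_hypergeometric(counts, target)


def multinomial_log_likelihood(counts, probs):
    """Log-likelihood of a count vector under a multinomial with gene probabilities `probs`."""
    counts = np.asarray(counts, dtype=float)
    log_coeff = lgamma(counts.sum() + 1) - sum(lgamma(c + 1) for c in counts)
    return log_coeff + float(np.sum(counts * np.log(probs)))


def cell_log_likelihoods(matrix, target, seed=0):
    """Score every cell (row) of a cells x genes count matrix against the pooled profile.

    All cells are first down-sampled to the same depth so that differences in
    sampling noise do not drive the score (illustrative assumption, see lead-in).
    """
    rng = np.random.default_rng(seed)
    down = np.array([downsample_counts(row, target, rng) for row in matrix])
    profile = down.sum(axis=0) + 0.5  # pseudo-count avoids log(0); hypothetical choice
    probs = profile / profile.sum()
    return np.array([multinomial_log_likelihood(row, probs) for row in down])


if __name__ == "__main__":
    # Simulated cluster: 20 similar cells plus one cell with a different profile.
    rng = np.random.default_rng(1)
    main = rng.multinomial(2000, [0.7, 0.2, 0.05, 0.05], size=20)
    odd = rng.multinomial(3000, [0.05, 0.05, 0.2, 0.7], size=1)
    scores = cell_log_likelihoods(np.vstack([main, odd]), target=1000)
    print("lowest-scoring cell index:", int(np.argmin(scores)))
```

In this toy setting, a cell whose expression profile diverges from the rest of the cluster receives a markedly lower log-likelihood, which is the sense in which such a metric can flag candidate misclassified cells.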


Johan Gustafsson

Chalmers, Biology and Biological Engineering, Systems Biology

Jonathan Robinson

Chalmers, Biology and Biological Engineering, Systems Biology

Juan Salvador Inda Diaz

University of Gothenburg

Chalmers, Mathematical Sciences, Applied Mathematics and Statistics

Elias Björnson

Wallenberg Lab.

Chalmers, Biology and Biological Engineering, Systems Biology

Rebecka Jörnsten

Chalmers, Mathematical Sciences, Applied Mathematics and Statistics

University of Gothenburg

Jens B Nielsen

BioInnovation Institute

Chalmers, Biology and Biological Engineering, Systems Biology


1932-6203 (ISSN) 19326203 (eISSN)

Vol. 15, no. 12 (December), e0243360




Cell and Molecular Biology




