How are the displayed TPM expression values calculated?
The expression values are in transcripts per million (TPM), as calculated in Packer et al., 2019. Please note that this differs from the TPM commonly used in bulk RNA-seq analysis: this single-cell version has no gene length normalization. Raw UMI counts are first normalized by dividing by a cell-specific size factor. Normalized counts are then averaged across all the cells corresponding to each annotated cell type. For each cell type, each gene's average value is divided by the sum of the averaged expression values of all genes in that cell type, and the result is multiplied by 1,000,000 to give the TPM value.
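The TPM steps described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used to produce the displayed values. The function name `cell_type_tpm` and its arguments are hypothetical, and the default size factor (each cell's total UMI count divided by the mean total across cells) is an assumption; see Packer et al., 2019 for the actual size factor definition.

```python
import numpy as np

def cell_type_tpm(counts, cell_types, size_factors=None):
    """Per-cell-type TPM from a genes x cells matrix of raw UMI counts.

    counts       : array-like, shape (n_genes, n_cells)
    cell_types   : array-like, length n_cells, one annotation per cell
    size_factors : optional array-like, length n_cells
    """
    counts = np.asarray(counts, dtype=float)
    cell_types = np.asarray(cell_types)

    if size_factors is None:
        # ASSUMPTION: total UMIs per cell, scaled by the mean total.
        totals = counts.sum(axis=0)
        size_factors = totals / totals.mean()
    size_factors = np.asarray(size_factors, dtype=float)

    # Step 1: divide each cell's counts by its size factor.
    normalized = counts / size_factors

    tpm = {}
    for ct in np.unique(cell_types):
        # Step 2: average normalized counts over the cells of this type.
        mean_expr = normalized[:, cell_types == ct].mean(axis=1)
        # Step 3: scale so the genes of this cell type sum to one million.
        # No gene length normalization is applied.
        # A cell type with no counts at all would divide by zero here.
        tpm[str(ct)] = mean_expr / mean_expr.sum() * 1_000_000
    return tpm
```

With this scaling, the TPM values of all genes within a cell type sum to 1,000,000, so values are comparable as relative abundances within a cell type.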
The proportion of cells expressing a gene is calculated as the percentage of individual cells corresponding to a given cell type with at least 1 UMI for the gene.
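The proportion calculation can be sketched in the same way. Again, this is only an illustration: the function name `percent_expressing` and the `min_umi` parameter are hypothetical, and the input is assumed to be a genes x cells matrix of raw UMI counts.

```python
import numpy as np

def percent_expressing(counts, cell_types, min_umi=1):
    """Percentage of cells of each type with at least `min_umi` UMIs per gene.

    counts     : array-like, shape (n_genes, n_cells), raw UMI counts
    cell_types : array-like, length n_cells, one annotation per cell
    """
    counts = np.asarray(counts)
    cell_types = np.asarray(cell_types)

    # True where a gene is detected in a cell (at least min_umi UMIs).
    detected = counts >= min_umi

    result = {}
    for ct in np.unique(cell_types):
        # The mean of a boolean row is the fraction of detecting cells.
        result[str(ct)] = detected[:, cell_types == ct].mean(axis=1) * 100
    return result
```

Note that this uses raw counts rather than size-factor-normalized values, since the criterion is stated in terms of UMIs.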
What is the thresholding procedure used?
For details on the thresholding procedure, please see the Methods in our preprint. The True Positive and False Discovery rates are indicated in the documentation of each dataset.
Why are expression levels for some genes unreliable? What are methanol-fixed cells?
Some samples (e.g. L1, L4) were obtained from live cells that were sorted based on fluorescent marker expression to enrich for neurons or neuron types of interest. Other samples (e.g. adult hermaphrodite and male) were obtained from cells that were methanol-fixed, stained with DAPI, and sorted to collect diploid cells. See the individual dataset documentation pages for specific information.
Datasets collected with these different methods may display differences in expression, which should be taken into account when comparing gene expression between datasets. More details will be provided in upcoming publications.
For “live cells” datasets, overexpression of fluorescent markers driven by various promoters and 3′ UTRs may cause the RNA-seq data to incorrectly indicate expression of the corresponding genes.
