Deconvoluting spatial data
With the deconvolution_spatial workflow, one or more spatial slides can be deconvoluted in a single run. The workflow expects one MuData object per slide, with the spatial data saved in mdata.mod["spatial"]. All slides are deconvoluted using the same reference, which is provided as a single MuData object with the gene expression data saved in mdata.mod["rna"].
The workflow can run deconvolution using Cell2Location and Tangram.
Steps
Cell2Location
For the reference and each spatial slide, the following steps are run. Note that if multiple slides are deconvoluted in one run, the same parameter settings are used for every slide.
Gene selection. There are two options:
Genes from a user-provided feature set (csv file) are used for deconvolution. All genes in that list must be present in both the spatial slides and the scRNA-Seq reference.
Feature selection is performed according to Cell2Location, i.e. via the function cell2location.utils.filtering.filter_genes.
Note: if no csv file is provided, the workflow runs the feature selection via cell2location.utils.filtering.filter_genes. Gene selection is therefore not optional.
Regression/reference model is fitted, and a plot of the training history as well as QC plots are saved in the ./figures/Cell2Location directory. Additionally, a csv file Cell2Loc_inf_anver.csv with the estimated expression of every gene in every cell type is saved in ./cell2location.output.
(Optional) Reference model is saved in ./cell2location.output.
Spatial mapping model is fitted. Training history and QC plots are saved in the ./figures/Cell2Location directory. A plot of the spatial embedding coloured by q05_cell_abundance_w_sf is also saved in ./figures/Cell2Location.
(Optional) Spatial mapping model is saved in ./cell2location.output.
MuData objects of the spatial slide and the reference are saved in ./cell2location.output. The MuData object of the spatial slide contains the estimated cell type abundances.
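The requirement in the gene selection step, that every gene of a user-provided list is present in both the spatial slides and the reference, can be checked before starting a run. The following is a minimal sketch and not part of the workflow: read_gene_list and missing_genes are hypothetical helpers, the csv layout (one gene name per row, no header) is an assumption, and the plain Python lists stand in for the var_names of the spatial and rna modalities.

```python
import csv
import io


def read_gene_list(handle):
    """Read gene names from the first column of a csv file, skipping blank rows.

    Assumes one gene per row and no header line (an assumption, not the
    documented format of the workflow).
    """
    reader = csv.reader(handle)
    return [row[0].strip() for row in reader if row and row[0].strip()]


def missing_genes(gene_list, spatial_genes, reference_genes):
    """Return the genes of gene_list that are absent from either dataset."""
    spatial, reference = set(spatial_genes), set(reference_genes)
    return {
        "spatial": [g for g in gene_list if g not in spatial],
        "reference": [g for g in gene_list if g not in reference],
    }


# Toy data standing in for the csv file and the gene names of both datasets.
csv_text = "Cd3e\nCd19\nMs4a1\n"
gene_list = read_gene_list(io.StringIO(csv_text))
report = missing_genes(
    gene_list,
    spatial_genes=["Cd3e", "Cd19", "Actb"],
    reference_genes=["Cd3e", "Cd19", "Ms4a1"],
)
print(report)  # {'spatial': ['Ms4a1'], 'reference': []}
```

An empty list for both keys means the gene list satisfies the requirement. The same check applies to the Tangram steps, which place the same condition on a user-provided gene list.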
Tangram
For the reference and each spatial slide, the following steps are run. Note that if multiple slides are deconvoluted in one run, the same parameter settings are used for every slide.
Gene selection. There are two options:
Genes from a user-provided feature set (csv file) are used for deconvolution. All genes in that list must be present in both the spatial slides and the scRNA-Seq reference.
sc.tl.rank_genes_groups is used to select genes. The top n genes of each group make up the reduced gene set.
Note: if no csv file is provided, the workflow runs the feature selection via sc.tl.rank_genes_groups. Gene selection is therefore not optional.
Data is preprocessed with tangram.pp_adatas.
Tangram model is fitted with tangram.mapping_utils.map_cells_to_space, and annotations are transferred from the single-cell data onto space with tangram.project_cell_annotations.
Plot of the spatial embedding coloured by tangram_ct_pred is saved in ./figures/Tangram.
MuData objects of the spatial slide and the reference are saved in ./tangram.output. The MuData object of the spatial slide contains the deconvolution predictions.
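The rule from the gene selection step, that the top n genes of each group make up the reduced gene set, can be illustrated as follows. This is a minimal sketch under stated assumptions: top_n_union is a hypothetical helper, and the dictionary of ranked genes is toy data standing in for the per-group rankings that sc.tl.rank_genes_groups stores in adata.uns["rank_genes_groups"]["names"]. It does not reproduce how the workflow itself builds the set.

```python
def top_n_union(ranked_genes, n):
    """Combine the first n genes of every group into one duplicate-free list.

    ranked_genes maps a group name to its genes, ordered from best to worst.
    Genes are returned in the order in which they are first encountered.
    """
    selected = []
    seen = set()
    for genes in ranked_genes.values():
        for gene in genes[:n]:
            if gene not in seen:
                seen.add(gene)
                selected.append(gene)
    return selected


# Toy rankings; "Cd3e" is outside the top 2 of the second group.
ranked = {
    "T cell": ["Cd3e", "Cd3d", "Il7r"],
    "B cell": ["Cd79a", "Ms4a1", "Cd3e"],
}
reduced = top_n_union(ranked, n=2)
print(reduced)  # ['Cd3e', 'Cd3d', 'Cd79a', 'Ms4a1']
```

Because a gene can rank highly in more than one group, the reduced set can contain fewer than n times the number of groups genes.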
Steps to run
Activate conda environment
conda activate pipeline_env
Generate yaml and log file
panpipes deconvolution_spatial config
Specify the parameter settings in the pipeline.yml file
Run complete deconvolution workflow with
panpipes deconvolution_spatial make full --local
The Deconvoluting spatial data tutorial guides you through the deconvolution workflow of Panpipes step by step.
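After a run, it can be useful to confirm that the output locations named in the steps above exist. The following is a minimal sketch, not a Panpipes feature: existing_outputs is a hypothetical helper, and it assumes the directories are created relative to the directory in which the workflow was run. The demonstration uses a temporary directory so that it does not depend on a real run.

```python
import tempfile
from pathlib import Path

# Output locations taken from the step descriptions of each method.
EXPECTED_DIRS = {
    "Cell2Location": ["figures/Cell2Location", "cell2location.output"],
    "Tangram": ["figures/Tangram", "tangram.output"],
}


def existing_outputs(run_dir="."):
    """Report, per method, which expected output directories exist in run_dir."""
    root = Path(run_dir)
    return {
        method: {d: (root / d).is_dir() for d in dirs}
        for method, dirs in EXPECTED_DIRS.items()
    }


# Demonstration on a temporary directory with only one output present.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "figures" / "Cell2Location").mkdir(parents=True)
    status = existing_outputs(tmp)

for method, dirs in status.items():
    print(method, dirs)
```

Calling existing_outputs() without arguments checks the current working directory instead.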