Normalization methods

panpipes currently supports the following normalization methods:

RNA

  1. Standard normalization using scanpy’s normalize_total and log1p.

Additionally, the normalized data can be scaled using scanpy’s sc.pp.scale.
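The arithmetic behind these three steps can be sketched in plain Python. This is only an illustration, not panpipes or scanpy code: the toy matrix, the `target_sum` of 1e4 and the use of the sample standard deviation are assumptions made for the example, and in practice you would call scanpy's `normalize_total`, `log1p` and `sc.pp.scale` on an AnnData object.

```python
import math
from statistics import mean, stdev


def normalize_total(counts, target_sum=1e4):
    """Rescale each cell (row) so that its counts sum to target_sum."""
    out = []
    for row in counts:
        total = sum(row)
        factor = target_sum / total if total > 0 else 0.0
        out.append([v * factor for v in row])
    return out


def log1p(matrix):
    """Apply log(1 + x) to every entry."""
    return [[math.log1p(v) for v in row] for row in matrix]


def scale(matrix):
    """Center each feature (column) to mean 0 and scale it to unit variance."""
    cols = list(zip(*matrix))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]  # sample standard deviation (assumption)
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mus, sds)]
        for row in matrix
    ]


# Toy data: 3 cells (rows) x 3 genes (columns).
counts = [[1, 3, 0], [10, 30, 0], [4, 2, 2]]
normalized = normalize_total(counts)  # cells 0 and 1 become identical
logged = log1p(normalized)
scaled = scale(logged)
```

Cells 0 and 1 have the same composition but a tenfold difference in depth, so they end up with identical normalized profiles; that is the effect the library-size step is meant to achieve.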

PROT

  1. clr using muon’s prot processing, with the option to specify the margin for normalization:

    1. clr margin=0: normalize within each cell’s count distribution, across all features (row-wise, as you would do for RNA data)

    2. clr margin=1: normalize within each feature’s distribution, across all cells (column-wise, recommended for proteomics data)

    If you come from R, please note that the margins are transposed in the Python and anndata world.

  2. dsb using muon’s prot processing. This method is only applicable when you have raw 10x inputs (see supported input files).
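To make the two clr margins described above concrete, here is a minimal sketch in plain Python. It assumes a log1p-based CLR variant, log1p(x / exp(mean(log1p(x)))), and a cells x features matrix; whether this matches the exact formula and parameter names used by muon is an assumption, and `clr_vector` and `clr` are hypothetical helper names for this example only.

```python
import math


def clr_vector(values):
    """CLR of one vector: log1p(x / exp(mean(log1p(x))))."""
    log_mean = sum(math.log1p(v) for v in values) / len(values)
    denom = math.exp(log_mean)
    return [math.log1p(v / denom) for v in values]


def clr(matrix, margin=1):
    """Apply CLR to a cells x features matrix.

    margin=0: within each cell, across all features (row-wise).
    margin=1: within each feature, across all cells (column-wise).
    """
    if margin == 0:
        return [clr_vector(list(row)) for row in matrix]
    if margin == 1:
        cols = [clr_vector(list(col)) for col in zip(*matrix)]
        return [list(row) for row in zip(*cols)]
    raise ValueError("margin must be 0 or 1")


# Toy data: 3 cells (rows) x 3 proteins (columns).
counts = [[5, 100, 0], [50, 1000, 0], [20, 10, 30]]
per_cell = clr(counts, margin=0)     # each row normalized on its own
per_feature = clr(counts, margin=1)  # each column normalized on its own
```

With margin=1 the reference level is computed separately for every protein, so a protein with a high background across all cells is not favoured over one with a low background.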

ATAC

  1. Standard normalization using scanpy’s normalize_total and log1p.

  2. TFIDF with 3 flavours:

    1. “signac”: following signac’s defaults, using muon’s atac processing

    2. “logTF”: logging the TF term, using muon’s atac processing

    3. “logIDF”: logging the IDF term, using muon’s atac processing
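The difference between the three TFIDF flavours can be illustrated with a small plain-Python sketch. It rests on several assumptions that the text above does not confirm: TF is taken as a peak's count divided by the cell's total counts, IDF as the number of cells divided by the peak's total counts, the "signac" flavour as log1p(TF * IDF * scale_factor), and `scale_factor` as 1e4. The function `tfidf` is a hypothetical helper, not the muon API.

```python
import math


def tfidf(counts, flavour="signac", scale_factor=1e4):
    """TF-IDF on a cells x peaks matrix under the assumptions stated above."""
    n_cells = len(counts)
    col_sums = [sum(col) for col in zip(*counts)]
    idf = [n_cells / s if s > 0 else 0.0 for s in col_sums]
    out = []
    for row in counts:
        total = sum(row)
        tf = [v / total if total > 0 else 0.0 for v in row]
        if flavour == "signac":
            # log of the scaled TF * IDF product
            new_row = [math.log1p(t * i * scale_factor) for t, i in zip(tf, idf)]
        elif flavour == "logTF":
            # only the TF term is logged
            new_row = [math.log1p(t * scale_factor) * i for t, i in zip(tf, idf)]
        elif flavour == "logIDF":
            # only the IDF term is logged
            new_row = [t * math.log1p(i) for t, i in zip(tf, idf)]
        else:
            raise ValueError(f"unknown flavour: {flavour}")
        out.append(new_row)
    return out


# Toy data: 3 cells (rows) x 3 peaks (columns).
counts = [[1, 0, 1], [0, 2, 2], [1, 1, 0]]
signac_norm = tfidf(counts, "signac")
logtf_norm = tfidf(counts, "logTF")
logidf_norm = tfidf(counts, "logIDF")
```

In all three variants a zero count stays zero, so the sparsity pattern of the peak matrix is preserved.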

Spatial Transcriptomics

  1. Standard normalization using scanpy’s normalize_total and log1p.

  2. Analytic Pearson Residual normalization using scanpy’s normalize_pearson_residuals.
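As an illustration of the second option, the sketch below computes analytic Pearson residuals in plain Python. It assumes the usual formulation: the expected count is mu = (cell total * gene total) / grand total, the residual is (x - mu) / sqrt(mu + mu^2 / theta), and residuals are clipped to plus or minus sqrt(number of cells). The value theta=100 and the zero-mu guard are assumptions of this example, and `pearson_residuals` is a hypothetical helper. In scanpy the corresponding function is found under `sc.experimental.pp`.

```python
import math


def pearson_residuals(counts, theta=100.0, clip=None):
    """Analytic Pearson residuals for a spots x genes count matrix."""
    n_cells = len(counts)
    if clip is None:
        clip = math.sqrt(n_cells)  # default clipping threshold (assumption)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    total = sum(row_sums)
    out = []
    for row, r in zip(counts, row_sums):
        res_row = []
        for x, c in zip(row, col_sums):
            mu = r * c / total  # expected count under the null model
            if mu == 0:
                res_row.append(0.0)  # guard for empty rows or columns
                continue
            z = (x - mu) / math.sqrt(mu + mu * mu / theta)
            res_row.append(max(-clip, min(clip, z)))
        out.append(res_row)
    return out


# Toy data: 3 spots (rows) x 2 genes (columns).
counts = [[0, 4], [4, 0], [2, 2]]
residuals = pearson_residuals(counts)
clipped = pearson_residuals(counts, clip=1.0)
```

The third spot matches its expected counts exactly, so its residuals are zero, while the first two spots get residuals of equal size and opposite sign.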

Layers nomenclature within each modality

Raw, normalised and scaled data are saved in dedicated layers within each modality, for example:

atac.layers["logTF_norm"] = atac.X.copy()

Using the following nomenclature:

| method                 | layer                | modality     |
|------------------------|----------------------|--------------|
| raw counts             | “raw_counts”         | RNA/ATAC/PROT |
| standard log1p         | “logged_counts”      | RNA or ATAC  |
| scaled counts          | “scaled_counts”      | RNA or ATAC  |
| clr                    | “clr”                | PROT         |
| dsb                    | “dsb”                | PROT         |
| TFIDF (signac flavour) | “signac_norm”        | ATAC         |
| TFIDF (logTF)          | “logTF_norm”         | ATAC         |
| TFIDF (logIDF)         | “logIDF_norm”        | ATAC         |
| standard log1p         | “lognorm”            | Spatial      |
| Pearson Residuals      | “norm_pearson_resid” | Spatial      |