infercnvpy.tl.infercnv
- infercnvpy.tl.infercnv(adata, *, reference_key=None, reference_cat=None, reference=None, lfc_clip=3, window_size=100, step=10, dynamic_threshold=1.5, exclude_chromosomes=('chrX', 'chrY'), chunksize=5000, n_jobs=None, inplace=True, layer=None, key_added='cnv')
Infer Copy Number Variation (CNV) by averaging gene expression over genomic regions.
This method is heavily inspired by infercnv but is more computationally efficient. The method is described in more detail on the inferCNV method page.
There, you can also find instructions on how to prepare input data.
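To make "averaging gene expression over genomic regions" concrete, here is a minimal sketch in plain Python. It is not infercnvpy's implementation: it assumes the genes are already sorted by genomic position on a single chromosome and uses a plain unweighted mean, whereas the library's actual smoothing kernel may differ. The helper name `running_mean` is hypothetical.

```python
def running_mean(values, window_size, step):
    """Average each full window of `window_size` consecutive genes,
    keeping only every `step`-th window (illustrative helper, not library code)."""
    if window_size > len(values):
        raise ValueError("window_size must not exceed the number of genes")
    return [
        sum(values[start:start + window_size]) / window_size
        for start in range(0, len(values) - window_size + 1, step)
    ]


# Toy log fold changes for 8 genes ordered along one chromosome.
lfc = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]

# Windows start at genes 0, 2 and 4 because step=2.
smoothed = running_mean(lfc, window_size=4, step=2)
print(smoothed)
```

With `step=1` every window is computed; larger steps trade resolution for a smaller output matrix.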
- Parameters:
adata (AnnData) – Annotated data matrix.
reference_key (Optional[str] (default: None)) – Column name in adata.obs that contains tumor/normal annotations. If set to None, the average of all cells is used as the reference.
reference_cat (Union[None, str, Sequence[str]] (default: None)) – One or more values in adata.obs[reference_key] that annotate normal cells.
reference (Optional[ndarray] (default: None)) – Directly supply an array of average normal gene expression. Overrides reference_key and reference_cat.
lfc_clip (float (default: 3)) – Clip log fold changes at this value.
window_size (int (default: 100)) – Size of the running window (number of genes to include in the window).
step (int (default: 10)) – Only compute every nth running window, where n = step. Set to 1 to compute all windows.
dynamic_threshold (Optional[float] (default: 1.5)) – Values < dynamic_threshold * STDDEV will be set to 0, where STDDEV is the standard deviation of the smoothed gene expression. Set to None to disable this step.
exclude_chromosomes (Optional[Sequence[str]] (default: ('chrX', 'chrY'))) – List of chromosomes to exclude. The default is to exclude the sex chromosomes (gonosomes).
chunksize (int (default: 5000)) – Process the dataset in chunks of cells. This allows running infercnv on datasets with many cells, where the dense matrix would not fit into memory.
n_jobs (Optional[int] (default: None)) – Number of jobs for parallel processing. Default: use all cores. Data will be submitted to workers in chunks, see chunksize.
inplace (bool (default: True)) – If True, save the results in adata.obsm, otherwise return the CNV matrix.
layer (Optional[str] (default: None)) – Layer from adata to use. If None, use X.
key_added (str (default: 'cnv')) – Key under which the CNV matrix will be stored in adata if inplace=True. The matrix is stored in adata.obsm["X_{key_added}"] and additional information in adata.uns[key_added].
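The reference, clipping, and thresholding parameters above can be illustrated with a small pure-Python sketch. This is a toy under stated assumptions, not the library's code: expression values are assumed to be log-transformed already, the threshold is assumed to compare absolute values against the cutoff, and the standard deviation is taken over a single toy vector rather than the full smoothed matrix. The helper names `mean_reference`, `clip`, and `denoise` are hypothetical.

```python
import statistics


def mean_reference(expression, labels=None, reference_cat=None):
    """Per-gene mean over the reference cells; falls back to all cells
    when no annotation is given (mirrors reference_key=None)."""
    if labels is None or reference_cat is None:
        rows = expression
    else:
        cats = {reference_cat} if isinstance(reference_cat, str) else set(reference_cat)
        rows = [row for row, label in zip(expression, labels) if label in cats]
    if not rows:
        raise ValueError("no reference cells found")
    n_genes = len(rows[0])
    return [sum(row[g] for row in rows) / len(rows) for g in range(n_genes)]


def clip(values, lfc_clip):
    """Limit log fold changes to the range [-lfc_clip, lfc_clip]."""
    return [max(-lfc_clip, min(lfc_clip, v)) for v in values]


def denoise(values, dynamic_threshold):
    """Zero out values whose magnitude is below dynamic_threshold * STDDEV.
    Comparing absolute values is an assumption of this sketch."""
    if dynamic_threshold is None:
        return list(values)
    cutoff = dynamic_threshold * statistics.pstdev(values)
    return [0.0 if abs(v) < cutoff else v for v in values]


# Three cells x three genes of (assumed log-transformed) expression.
expression = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 8.0, 2.0]]
labels = ["normal", "normal", "tumor"]

reference = mean_reference(expression, labels, reference_cat="normal")
lfc = [x - r for x, r in zip(expression[2], reference)]  # tumor cell vs. reference
clipped = clip(lfc, lfc_clip=3)
denoised = denoise([0.2, -0.1, 3.0, 0.1], dynamic_threshold=1.5)
print(reference, lfc, clipped, denoised)
```

Passing `dynamic_threshold=None` to `denoise` returns the values unchanged, matching the documented way to disable the step.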
- Return type:
- Returns:
Depending on inplace, either return the smoothed and denoised gene expression matrix sorted by genomic position, or add it to adata.