episcanpy.api.pp.filter_cells
- episcanpy.api.pp.filter_cells(adata, min_counts=None, min_features=None, max_counts=None, max_features=None, inplace=True, copy=False)
Filter cell outliers based on counts and numbers of genes expressed.
For instance, only keep cells with at least min_counts counts or min_features genes expressed. This is to filter measurement outliers, i.e. “unreliable” observations.
Only provide one of the optional parameters min_counts, min_features, max_counts, max_features per call.
- Parameters
- adata
The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
- min_counts
Minimum number of counts required for a cell to pass filtering.
- min_features
Minimum number of genes expressed required for a cell to pass filtering.
- max_counts
Maximum number of counts allowed for a cell to pass filtering.
- max_features
Maximum number of genes expressed allowed for a cell to pass filtering.
- inplace
Perform computation in place or return the result.
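The thresholds above are compared against two per-cell numbers: the total counts in a cell and the number of features with a nonzero value. The following is a minimal NumPy sketch of that idea on a made-up toy matrix, not the library's implementation. It assumes the bounds are inclusive (>= for min_*, <= for max_*); the variable names are illustrative only.

```python
import numpy as np

# Toy count matrix, made up for illustration: 4 cells (rows) x 5 features (columns).
X = np.array([
    [0, 2, 0, 1, 0],
    [3, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
])

# Per-cell statistics that the thresholds are compared against.
counts_per_cell = X.sum(axis=1)          # total counts per cell
features_per_cell = (X > 0).sum(axis=1)  # features with a nonzero value per cell

# Assumption: bounds are inclusive (>= for min_*, <= for max_*).
keep_min_features = features_per_cell >= 2  # analogous to min_features=2
keep_max_counts = counts_per_cell <= 3      # analogous to max_counts=3

# Subsetting rows with a boolean mask keeps only the passing cells.
X_min_features = X[keep_min_features]
X_max_counts = X[keep_max_counts]
```

Each mask has one boolean per cell, so a lower bound and an upper bound select different subsets of rows from the same matrix.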
- Returns
Depending on inplace, returns the following arrays or directly subsets and annotates the data matrix:
Examples
>>> import scanpy as sc
>>> import episcanpy.api as epi
>>> adata = sc.datasets.krumsiek11()
>>> adata.n_obs
640
>>> adata.var_names
['Gata2' 'Gata1' 'Fog1' 'EKLF' 'Fli1' 'SCL' 'Cebpa' 'Pu.1' 'cJun' 'EgrNab' 'Gfi1']
>>> # add some true zeros
>>> adata.X[adata.X < 0.3] = 0
>>> # simply compute the number of genes per cell
>>> epi.pp.filter_cells(adata, min_features=0)
>>> adata.n_obs
640
>>> adata.obs['nb_features'].min()
1
>>> # filter manually
>>> adata_copy = adata[adata.obs['nb_features'] >= 3]
>>> adata_copy.n_obs
554
>>> adata_copy.obs['nb_features'].min()
3
>>> # actually do some filtering
>>> epi.pp.filter_cells(adata, min_features=3)
>>> adata.n_obs
554
>>> adata.obs['nb_features'].min()
3
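Because only one threshold may be given per call, applying both a lower and an upper bound takes two successive calls. The sketch below mimics that pattern on a plain NumPy array. The helper filter_rows is hypothetical and written for this illustration; it is not part of episcanpy, and it assumes inclusive bounds.

```python
import numpy as np

def filter_rows(X, min_features=None, max_features=None):
    """Hypothetical helper: keep rows by number of nonzero features, one bound per call."""
    # Reject calls that give no bound or both bounds.
    if (min_features is None) == (max_features is None):
        raise ValueError("Provide exactly one of min_features or max_features.")
    n_features = (X > 0).sum(axis=1)
    if min_features is not None:
        mask = n_features >= min_features  # assumed inclusive lower bound
    else:
        mask = n_features <= max_features  # assumed inclusive upper bound
    return X[mask], n_features[mask]

# Made-up toy matrix: 4 cells (rows) x 4 features (columns).
X = np.array([
    [0.0, 0.9, 0.0, 0.5],
    [0.4, 0.6, 0.7, 0.8],
    [0.0, 0.0, 0.0, 0.2],
    [0.5, 0.0, 0.6, 0.9],
])
X[X < 0.3] = 0  # add true zeros, as in the example above

# Two successive calls, one threshold each.
kept, n_feat = filter_rows(X, min_features=2)
kept, n_feat = filter_rows(kept, max_features=3)
```

The first call drops the cell with no nonzero features, and the second drops the cell with four, leaving the two cells that have two and three nonzero features.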