episcanpy.api.pp.filter_cells

episcanpy.api.pp.filter_cells(adata, min_counts=None, min_features=None, max_counts=None, max_features=None, inplace=True, copy=False)

Filter cell outliers based on counts and numbers of genes expressed.

For instance, only keep cells with at least min_counts counts or min_features genes expressed. This is to filter measurement outliers, i.e. “unreliable” observations.

Only provide one of the optional parameters min_counts, min_features, max_counts, max_features per call.
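Because each call accepts a single threshold, applying both a lower and an upper bound takes two passes. The following is a minimal pure-Python sketch of that rule, not episcanpy's implementation: filter_cells_sketch is a hypothetical stand-in that works on a list of rows, and it assumes that a feature counts as "expressed" when its value is greater than zero and that passing more than one threshold is rejected with an error.

```python
def filter_cells_sketch(matrix, min_counts=None, min_features=None,
                        max_counts=None, max_features=None):
    """Hypothetical stand-in for illustration; not the episcanpy function."""
    options = {
        "min_counts": min_counts,
        "min_features": min_features,
        "max_counts": max_counts,
        "max_features": max_features,
    }
    given = [(name, value) for name, value in options.items() if value is not None]
    # Assumption: exactly one threshold may be supplied per call.
    if len(given) != 1:
        raise ValueError("Provide exactly one of: " + ", ".join(options))
    name, threshold = given[0]

    kept = []
    for row in matrix:
        if name.endswith("counts"):
            number = sum(row)                       # total counts in this cell
        else:
            number = sum(1 for v in row if v > 0)   # features with a nonzero value
        if name.startswith("min"):
            keep = number >= threshold
        else:
            keep = number <= threshold
        if keep:
            kept.append(row)
    return kept


# Rows are cells, columns are features.
cells = [
    [0, 0, 1],   # 1 count, 1 feature
    [2, 3, 0],   # 5 counts, 2 features
    [4, 4, 4],   # 12 counts, 3 features
    [9, 9, 9],   # 27 counts, 3 features
]

# Two passes: first a lower bound on features, then an upper bound on counts.
step1 = filter_cells_sketch(cells, min_features=2)
step2 = filter_cells_sketch(step1, max_counts=20)
print(step2)
```

With the real function, the same pattern would be two consecutive calls on the same AnnData object, each with a single threshold argument.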

Parameters
adata

The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

min_counts

Minimum number of counts required for a cell to pass filtering.

min_features

Minimum number of expressed genes required for a cell to pass filtering.

max_counts

Maximum number of counts allowed for a cell to pass filtering.

max_features

Maximum number of expressed genes allowed for a cell to pass filtering.

inplace

Perform the computation in place or return the result.

Returns

Depending on inplace, returns the following arrays or directly subsets and annotates the data matrix:

cells_subset : ndarray

Boolean index mask that does filtering. True means that the cell is kept. False means the cell is removed.

number_per_cell : ndarray

Depending on what was thresholded (counts or genes), the array stores the number of counts or the number of expressed genes per cell.
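To make the two return values concrete, here is a small pure-Python sketch of how such a mask and per-cell array relate to the data. It uses plain lists rather than ndarrays, the toy matrix and the threshold value are made up for illustration, and it assumes that "expressed" means a value greater than zero.

```python
# Toy matrix: rows are cells, columns are genes (illustrative values only).
matrix = [
    [0, 0, 1],
    [2, 3, 0],
    [4, 4, 4],
]

# Per-cell quantities that a threshold can be applied to.
counts_per_cell = [sum(row) for row in matrix]
features_per_cell = [sum(1 for v in row if v > 0) for row in matrix]

# Thresholding on genes: number_per_cell holds the per-cell gene numbers,
# and cells_subset is True for every cell that passes the threshold.
min_features = 2
number_per_cell = features_per_cell
cells_subset = [n >= min_features for n in number_per_cell]

# Applying the mask keeps only the cells marked True.
kept_rows = [row for row, keep in zip(matrix, cells_subset) if keep]

print(counts_per_cell)    # [1, 5, 12]
print(number_per_cell)    # [1, 2, 3]
print(cells_subset)       # [False, True, True]
```

Had the threshold been on counts instead, number_per_cell would hold counts_per_cell and the mask would be derived from it in the same way.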

Examples

>>> import scanpy as sc
>>> import episcanpy.api as epi
>>> adata = sc.datasets.krumsiek11()
>>> adata.n_obs
640
>>> adata.var_names
['Gata2' 'Gata1' 'Fog1' 'EKLF' 'Fli1' 'SCL' 'Cebpa'
 'Pu.1' 'cJun' 'EgrNab' 'Gfi1']
>>> # add some true zeros
>>> adata.X[adata.X < 0.3] = 0
>>> # simply compute the number of genes per cell
>>> epi.pp.filter_cells(adata, min_features=0)
>>> adata.n_obs
640
>>> adata.obs['nb_features'].min()
1
>>> # filter manually
>>> adata_copy = adata[adata.obs['nb_features'] >= 3]
>>> adata_copy.n_obs
554
>>> adata_copy.obs['nb_features'].min()
3
>>> # actually do some filtering
>>> epi.pp.filter_cells(adata, min_features=3)
>>> adata.n_obs
554
>>> adata.obs['nb_features'].min()
3