episcanpy.api.tl.dpt¶
- episcanpy.api.tl.dpt(adata, n_dcs=10, n_branchings=0, min_group_size=0.01, allow_kendall_tau_shift=True, neighbors_key=None, copy=False)¶
Infer progression of cells through geodesic distance along the graph [Haghverdi16] [Wolf19].
Reconstruct the progression of a biological process from snapshot data. Diffusion Pseudotime has been introduced by [Haghverdi16] and implemented within Scanpy [Wolf18]. Here, we use a further developed version, which is able to deal with disconnected graphs [Wolf19] and can be run in a hierarchical mode by setting the parameter n_branchings>1. We recommend, however, to only use
dpt()
for computing pseudotime (n_branchings=0) and to detect branchings viapaga()
. For pseudotime, you need to annotate your data with a root cell. For instance:adata.uns['iroot'] = np.flatnonzero(adata.obs['cell_types'] == 'Stem')[0]
This requires to run
neighbors()
, first. In order to reproduce the original implementation of DPT, use method==’gauss’ in this. Using the default method==’umap’ only leads to minor quantitative differences, though.New in version 1.1.
dpt()
also requires to rundiffmap()
first. As previously,dpt()
came with a default parameter ofn_dcs=10
butdiffmap()
has a default parameter ofn_comps=15
, you need to passn_comps=10
indiffmap()
in order to exactly reproduce previousdpt()
results.- Parameters
- adata :
AnnData
AnnData
Annotated data matrix.
- n_dcs :
int
int
(default:10
) The number of diffusion components to use.
- n_branchings :
int
int
(default:0
) Number of branchings to detect.
- min_group_size :
float
float
(default:0.01
) During recursive splitting of branches (‘dpt groups’) for n_branchings > 1, do not consider groups that contain less than min_group_size data points. If a float, min_group_size refers to a fraction of the total number of data points.
- allow_kendall_tau_shift :
bool
bool
(default:True
) If a very small branch is detected upon splitting, shift away from maximum correlation in Kendall tau criterion of [Haghverdi16] to stabilize the splitting.
- neighbors_key :
str
|None
Optional
[str
] (default:None
) If not specified, dpt looks .uns[‘neighbors’] for neighbors settings and .obsp[‘connectivities’], .obsp[‘distances’] for connectivities and distances respectively (default storage places for pp.neighbors). If specified, dpt looks .uns[neighbors_key] for neighbors settings and .obsp[.uns[neighbors_key][‘connectivities_key’]], .obsp[.uns[neighbors_key][‘distances_key’]] for connectivities and distances respectively.
- copy :
bool
bool
(default:False
) Copy instance before computation and return a copy. Otherwise, perform computation inplace and return None.
- adata :
- Return type
- Returns
Depending on copy, returns or updates adata with the following fields.
If n_branchings==0, no field dpt_groups will be written.
- dpt_pseudotime
pandas.Series
(adata.obs, dtype float) Array of dim (number of samples) that stores the pseudotime of each cell, that is, the DPT distance with respect to the root cell.
- dpt_groups
pandas.Series
(adata.obs, dtype category) Array of dim (number of samples) that stores the subgroup id (‘0’, ‘1’, …) for each cell. The groups typically correspond to ‘progenitor cells’, ‘undecided cells’ or ‘branches’ of a process.
- dpt_pseudotime
Notes
The tool is similar to the R package destiny of [Angerer16].