episcanpy.api.ct.load_features

episcanpy.api.ct.load_features(file_features, chromosomes=['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', 'X', 'Y'], path='', input_file_format=None, sort=False)

The function load_features transforms a bed file into a usable set of units for measuring methylation levels. The input has to be a bed-like file. You also need to specify the chromosomes you use as a list of strings, for example ['1', '7', '8', '9', 'X', 'Y', 'M']. The chromosome list does not need to be ordered. If you don't specify the chromosomes, the default is the human genome (chromosomes 1 to 22, X and Y, as listed in the signature). The bed (or gtf) file needs to be sorted; the sort parameter can be used to sort a bed file by start coordinate.

The output is a dictionary whose keys are chromosomes and whose values are lists of [start, end, name] entries, one for every feature extracted.
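To make the shape of the returned dictionary concrete, here is a minimal sketch in plain Python. It is not episcanpy's implementation: `parse_bed_lines` is a hypothetical stand-in, and it assumes that the feature name sits in the fourth column and that chromosome names in the file match the entries of the chromosome list exactly.

```python
def parse_bed_lines(lines, chromosomes):
    """Hypothetical stand-in: group bed-like rows by chromosome."""
    features = {chrom: [] for chrom in chromosomes}
    for line in lines:
        fields = line.split()
        # Skip header/comment lines and rows without chrom, start, end
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom not in features:
            continue  # chromosome not requested by the caller
        # Assumption: the fourth column, if present, holds the feature name
        name = fields[3] if len(fields) > 3 else f"{chrom}_{start}_{end}"
        features[chrom].append([start, end, name])
    return features


bed_lines = [
    "1\t100\t200\tpromoter_A",
    "1\t500\t900\tenhancer_B",
    "X\t50\t75\tpromoter_C",
    "M\t10\t20\tignored",
]
# The chromosome list does not have to be ordered
features = parse_bed_lines(bed_lines, chromosomes=["X", "1"])
print(features["1"])  # [[100, 200, 'promoter_A'], [500, 900, 'enhancer_B']]
```

Features on chromosomes that are not in the list (here 'M') are dropped in this sketch; whether load_features does the same is an assumption.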

This function loads the entire annotation file. If you want to use only parts of gtf/gff files, please see the functions load_features_gff and load_features_gtf.

Parameters
file_features

the name of the bed file you want to load.

chromosomes

list of chromosomes corresponding to the bed file. If not specified, the human chromosomes are used by default.

path

optional path where your bed file is located.

input_file_format

if None, the input format is deduced from the file extension (accepted: bed, gtf, gff). If a str is specified, the file extension is ignored and the feature file is loaded as the specified input file format.

sort

if True, the bed file is sorted based on starting coordinates.
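
The behaviour described for input_file_format and sort can be sketched as follows. The helpers `resolve_format` and `sort_features` are hypothetical names used only for illustration and are not part of episcanpy; how the library itself handles an unknown extension is an assumption here.

```python
import os

ADMITTED_FORMATS = ("bed", "gtf", "gff")


def resolve_format(file_features, input_file_format=None):
    """Hypothetical helper: an explicit format overrides the file extension."""
    if input_file_format is not None:
        fmt = input_file_format.lower()
    else:
        # Deduce the format from the extension, e.g. "promoters.bed" -> "bed"
        fmt = os.path.splitext(file_features)[1].lstrip(".").lower()
    if fmt not in ADMITTED_FORMATS:
        # Assumption: unsupported formats are rejected with an error
        raise ValueError(f"unsupported feature file format: {fmt!r}")
    return fmt


def sort_features(features):
    """Hypothetical helper: sort each chromosome's rows by start coordinate."""
    return {
        chrom: sorted(rows, key=lambda row: row[0])
        for chrom, rows in features.items()
    }


print(resolve_format("promoters.bed"))          # bed
print(resolve_format("annotation.txt", "gtf"))  # gtf

unsorted = {"1": [[500, 900, "b"], [100, 200, "a"]]}
print(sort_features(unsorted))  # {'1': [[100, 200, 'a'], [500, 900, 'b']]}
```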