Register the training data

Once the single-cell dataset is prepared, this tutorial shows how to register it and load it into the UNAGI tool.

Load the UNAGI tool

import warnings
warnings.filterwarnings('ignore')
from UNAGI import UNAGI
unagi = UNAGI()

Set up the training data

Register the data with the UNAGI.setup_data function. The data path can point either to a single .h5ad file or to the directory that contains your dataset. UNAGI.setup_data also requires the name of the column in the anndata.obs dataframe that stores the stage information (stage_key) and the total number of time-points (total_stage). In addition, running the graph convolutional network (GCN) requires a cell neighbor graph for each stage of the data. Use the neighbors parameter to set the number of neighbors in each cell neighbor graph, and the threads parameter to set the number of threads used to build the graphs with sklearn.neighbors.kneighbors_graph.

data_directory = 'PATH_TO_DATA.h5ad' # path to the data or the folder containing the data
stage_column_name = 'column_name_of_stage_information_in_your_data' # column in the obs dataframe of the data that contains the stage information
total_time_points = 4 # change it to the total number of time-points in your data
gcn_neighbors = 30 # number of neighbors to be used in the knn graph
cpu_threads = 5 # number of threads to be used in the knn graph
unagi.setup_data(
    data_path=data_directory,
    stage_key=stage_column_name,
    total_stage=total_time_points,
    neighbors=gcn_neighbors,
    threads=cpu_threads,
)
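
To make the role of the neighbors and threads parameters more concrete, the sketch below builds one k-nearest-neighbor graph per stage with sklearn.neighbors.kneighbors_graph. It is only an illustration on random toy data, not UNAGI's internal implementation: the helper name build_stage_graphs, the toy feature matrix, and the stage labels are assumptions made for this example.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph


def build_stage_graphs(features, stages, n_neighbors=30, n_jobs=5):
    """Hypothetical helper: build one kNN connectivity graph per stage.

    features: (n_cells, n_features) array
    stages:   (n_cells,) array of stage labels, one per cell
    Returns a dict mapping each stage label to a sparse adjacency matrix.
    """
    graphs = {}
    for stage in np.unique(stages):
        stage_features = features[stages == stage]
        # A cell cannot have more neighbors than there are other cells in its stage.
        k = min(n_neighbors, stage_features.shape[0] - 1)
        graphs[stage] = kneighbors_graph(
            stage_features,
            n_neighbors=k,
            mode="connectivity",   # unweighted 0/1 adjacency
            include_self=False,    # a cell is not its own neighbor
            n_jobs=n_jobs,         # parallel threads for the neighbor search
        )
    return graphs


# Toy data (assumption): 3 stages with 20 cells each and 10 features per cell.
rng = np.random.default_rng(0)
toy_features = rng.normal(size=(60, 10))
toy_stages = np.repeat(np.array(["0", "1", "2"]), 20)

graphs = build_stage_graphs(toy_features, toy_stages, n_neighbors=5, n_jobs=2)
for stage, graph in graphs.items():
    print(stage, graph.shape, graph.nnz)
```

Each stage gets its own graph, so cells are only connected to neighbors from the same time-point. A larger neighbors value produces denser graphs, while threads only affects how fast the neighbor search runs.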