{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Register the training data\n", "\n", "After preparing the single-cell dataset, this tutorial shows how to register and load the data into UNAGI tool" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load UNAGI tool" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore')\n", "from UNAGI import UNAGI\n", "unagi = UNAGI()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set up the training data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Users can register the data using the `UNAGI.setup_data` function. You can specify the `.h5ad` format dataset or the directory of your dataset folder. `UNAGI.setup_data` funciton also requires users to specify the column name (stage_key) to store the stage information in the anndata.obs dataframe and the total number of time-points (stage). Besides, to run the graph convolution neural network (GCN), it's mandatory to construct the cell neighbor graphs for each stage of data. Users can use `neighbors` and `threads` parameters to specify the number of neighbors of cell neighbor graphs and the number of threads to build graphs using `sklearn.neighbors.kneighbors_graph`. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_directory = 'PATH_TO_DATA.h5ad' # path to the data or the folder containing the data\n", "stage_column_name = 'column_name_of_stage_information_in_your_data' # key in the obsm of the data that contains the stage information\n", "total_time_points = 4 # change it to the total number of time-points in your data\n", "gcn_neighbors = 30 # number of neighbors to be used in the knn graph\n", "cpu_threads= 5 # number of threads to be used in the knn graph" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "unagi.setup_data(data_path = data_directory, stage_key=stage_column_name, total_stage=total_time_points, neighbors=gcn_neighbors, threads=cpu_threads)" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }