{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Designing iDREM hyper-parameters\n",
    "\n",
    "UNAGI utilizes [iDREM](https://github.com/phoenixding/idrem) (Ding et al. 2018) to reconstruct the temporal gene regulatory networks. Currently, UNAGI only supports using iDREM on the **Human** and **Mouse** data. There are several hyper-parameters that can be adjusted to better fit the input data, including Normalize_Data, Minimum_Absolute_Log_Ratio_Expression, Convergence_Likelihood, and Minimum_Standard_Deviation. \n",
    "\n",
    "-   Normalize_Data: Choosing from [**'log_normalize_data'**, **'normalize_data'**, **'no_normalization'**]\n",
    "    -   **log_normalize_data**: The expression vector $(v_0,v_1,...,v_n)$ will be transformed to $(0, log_2 (v_1 )−log_2(v_0 ),...,log_2 (v_n )-log_2 (v_{n−1} ))$. This normalization method should be used if the expression is not in log space.\n",
    "    -   **normalize_data**: The expression vector $(v_0,v_1,...,v_n)$ will be transformed to $(v_{1} − v_{0} , ..., v_{n} − v_{n−1})$. If the expression in already log space and a time ‘0’ experiment was conducted, then this normalization should be used.\n",
    "    -   **no_normalization**: The expression vector $(v_0,v_1,...,v_n)$ will be transformed to $(v_0,v_1,...,v_n)$. If the expression is in log space and no time point ‘0’ experiment was conducted, then we will add a pesudo time ‘0’ (starting time point) with all gene expression equals to 0.\n",
    "-   Minimum_Absolute_Log_Ratio_Expression: After transformation (Log normalize data, Normalize data, or No Normalization/add 0) if the absolute value of the gene’s largest change is below this threshold, then the gene will be ﬁltered. How change is deﬁned depends on whether the Change should be based on parameter is set to Maximum−Minimum or Diff erence from 0.\n",
    "    -   Smaller minimum absolute log ratio expression will lead to more branches in the reconstructed temporal gene regulatory network.\n",
    "-   Convergence_Likelihood %: This parameter controls the percentage likelihood gain required to continue searching for better parameters for the model. \n",
    "    -   Increasing this parameter can lead to a faster running time, decreasing it may lead to better values of the parameters.\n",
    "-   Minimum Standard Deviation: This parameter controls the minimum standard deviation on the Gaussian distributions. \n",
    "    -   Increasing this parameter is recommended if applying DREM to scRNA-seq data to avoid potential overﬁtting of low variance in expression due to small discrete counts.\n",
    "\n",
    "The function `UNAGI.register_iDREM_parameters()` allows users to specify thier customized hyper-parameters. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "from UNAGI import UNAGI\n",
    "unagi = UNAGI()\n",
    "\n",
    "#....... load the data and setup the model architecture and hyperparameters ..........#\n",
    "\n",
    "iDREM_Path = 'directory_to_iDREM_tool'\n",
    "Normalize_Data = 'normalize_data'\n",
    "Minimum_Absolute_Log_Ratio_Expression = 0.01\n",
    "Convergence_Likelihood = 0.0001\n",
    "Minimum_Standard_Deviation = 0.01\n",
    "unagi.register_iDREM_parameters(iDREM_Path, Normalize_data = Normalize_Data, Minimum_Absolute_Log_Ratio_Expression = Minimum_Absolute_Log_Ratio_Expression, Convergence_Likelihood = Convergence_Likelihood, Minimum_Standard_Deviation = Minimum_Standard_Deviation)\n",
    "unagi.run_UNAGI(iDREM_Path)"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}