{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Junction-Level Differential Gene Analysis (JDEG)\n", "\n", "In this section, we perform junction-level differential gene analysis using the adjacency matrix.\n", "This analysis aims to identify genes that exhibit significant differences in junction-level expression\n", "between different conditions or cell types.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 1: Identify Junction Markers Using MAST\n", "\n", "This step is identical to [Step 1](./step7_1_MAST.ipynb) of the EDEG analysis.\n", "The only difference is that the `input_anndata` file is changed to `AdjacencyComp_PDAC.h5ad`, which contains junction-level features instead of exon-level ones.\n", "The rest of the analysis pipeline remains the same, including the use of the MAST model through Seurat." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 2: Identify Junction-Level Markers Using Exon-Level Information\n", "\n", "To determine gene-level differential expression, we now aggregate junction marker information by gene, enabling a higher-level view of differences.\n", "\n", "The following function converts junction-level marker results from Seurat into gene-level statistics. It uses the Stouffer method to combine the junction p-values of each gene and also computes the average log2 fold change.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from DOLPHIN.EDEG.generate_JDEG import run_jdeg" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "pd_JDEG = run_jdeg(seurat_output=\"/mnt/data/kailu/00_scExon/10_GO_PDAC/02_model/02_exon_adj/MAST/AdjacencyComp_MAST_ductal.csv\",\n", "                   output=\"./PDAC_MAST_ductal_junction_final.csv\")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | Exon_names | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj | Gene_names | MAST_abs_avg_log2FC | MAST_stouffer_pval | MAST_stouffer_pval_adj_bonf |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | RPS26-12 | 4.578045e-194 | -2.627707 | 0.949 | 0.989 | 6.177248e-189 | RPS26 | 1.923962 | 6.474450e-153 | 6.062675e-149 |
| 1 | FXYD2-22 | 5.593814e-170 | -4.929407 | 0.299 | 0.994 | 7.547845e-165 | FXYD2 | 4.119512 | 0.000000e+00 | 0.000000e+00 |
| 2 | FXYD2-32 | 3.611924e-162 | -4.767175 | 0.226 | 0.994 | 4.873642e-157 | FXYD2 | 4.119512 | 0.000000e+00 | 0.000000e+00 |
| 3 | RPL34-8 | 5.754681e-156 | -2.005695 | 0.996 | 1.000 | 7.764906e-151 | RPL34 | 2.007840 | 5.145729e-143 | 4.818461e-139 |
| 4 | FXYD2-42 | 1.697355e-154 | -4.520503 | 0.165 | 0.989 | 2.290275e-149 | FXYD2 | 4.119512 | 0.000000e+00 | 0.000000e+00 |
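
To make the gene-level columns above easier to interpret, here is a minimal sketch of how junction p-values can be combined per gene with the unweighted Stouffer Z-score method. It is not the `run_jdeg` implementation: the helper names `stouffer_combine` and `aggregate_by_gene`, the p-value floor, the use of the mean of absolute log2 fold changes, and the Bonferroni correction by number of genes are all assumptions made for illustration. The sample rows are the five junctions shown in the table, so the results will differ from the table, which aggregates every junction of each gene.

```python
import math
from collections import defaultdict
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def stouffer_combine(p_values, floor=1e-300):
    """Combine p-values with the unweighted Stouffer method.

    Each p-value is mapped to z_i = -Phi^{-1}(p_i); the combined score is
    Z = sum(z_i) / sqrt(k) and the combined p-value is the upper tail of Z.
    `floor` is a hypothetical guard against p-values reported as exactly 0.
    """
    z_scores = []
    for p in p_values:
        p = min(max(p, floor), 1.0 - 1e-16)  # keep p strictly inside (0, 1)
        z_scores.append(-_STD_NORMAL.inv_cdf(p))
    combined_z = sum(z_scores) / math.sqrt(len(z_scores))
    # erfc keeps precision in the far tail, where 1 - cdf would round to 0
    combined_p = 0.5 * math.erfc(combined_z / math.sqrt(2.0))
    return combined_z, combined_p


def aggregate_by_gene(rows):
    """Summarise (gene, p_val, avg_log2FC) junction rows per gene."""
    grouped = defaultdict(list)
    for gene, p_val, log2fc in rows:
        grouped[gene].append((p_val, log2fc))

    summary = {}
    for gene, records in grouped.items():
        _, p = stouffer_combine([r[0] for r in records])
        summary[gene] = {
            "n_junctions": len(records),
            # assumption: mean of absolute values, not absolute value of the mean
            "abs_avg_log2FC": sum(abs(r[1]) for r in records) / len(records),
            "stouffer_pval": p,
            # assumption: Bonferroni correction over the number of genes
            "stouffer_pval_adj_bonf": min(1.0, p * len(grouped)),
        }
    return summary


# Junction rows taken from the table above: (Gene_names, p_val, avg_log2FC)
rows = [
    ("RPS26", 4.578045e-194, -2.627707),
    ("FXYD2", 5.593814e-170, -4.929407),
    ("FXYD2", 3.611924e-162, -4.767175),
    ("RPL34", 5.754681e-156, -2.005695),
    ("FXYD2", 1.697355e-154, -4.520503),
]
summary = aggregate_by_gene(rows)
for gene, stats in summary.items():
    print(gene, stats)
```

Combined p-values this small can fall below the smallest representable double, which is one plausible reason a gene with many strongly significant junctions is reported as `0.000000e+00`.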