Junction-Level Differential Gene Analysis (JDEG)

In this section, we perform junction-level differential gene analysis using the adjacency matrix. This analysis aims to identify genes that exhibit significant differences in junction-level expression between different conditions or cell types.

Step 1: Identify Junction Markers Using MAST

This step is identical to Step 1 of the EDEG analysis. The only difference is the input_anndata file, which is now AdjacencyComp_PDAC.h5ad and contains junction-level features instead of exon-level ones. The rest of the pipeline is unchanged, including the use of the MAST model through Seurat.

Step 2: Identify Gene-Level Markers Using Junction-Level Information

To determine gene-level differential expression, we now aggregate the junction marker information by gene, which gives a higher-level view of the differences.

The following function converts the junction-level marker results from Seurat into gene-level statistics. It combines the junction p-values of each gene with the Stouffer method and also computes the gene's average log2 fold change.

[1]:
from DOLPHIN.EDEG.generate_JDEG import run_jdeg
[2]:
pd_JDEG = run_jdeg(seurat_output = "/mnt/data/kailu/00_scExon/10_GO_PDAC/02_model/02_exon_adj/MAST/AdjacencyComp_MAST_ductal.csv",
    output = "./PDAC_MAST_ductal_junction_final.csv")
[5]:
pd_JDEG.head()
[5]:
   Exon_names          p_val  avg_log2FC  pct.1  pct.2      p_val_adj  Gene_names  MAST_abs_avg_log2FC  MAST_stouffer_pval  MAST_stouffer_pval_adj_bonf
0    RPS26-12  4.578045e-194   -2.627707  0.949  0.989  6.177248e-189       RPS26             1.923962       6.474450e-153                6.062675e-149
1    FXYD2-22  5.593814e-170   -4.929407  0.299  0.994  7.547845e-165       FXYD2             4.119512        0.000000e+00                 0.000000e+00
2    FXYD2-32  3.611924e-162   -4.767175  0.226  0.994  4.873642e-157       FXYD2             4.119512        0.000000e+00                 0.000000e+00
3     RPL34-8  5.754681e-156   -2.005695  0.996  1.000  7.764906e-151       RPL34             2.007840       5.145729e-143                4.818461e-139
4    FXYD2-42  1.697355e-154   -4.520503  0.165  0.989  2.290275e-149       FXYD2             4.119512        0.000000e+00                 0.000000e+00