{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Alternative Splicing Analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 1: Convert Outrigger PSI Output to `.h5ad` Format\n", "\n", "This step converts the Outrigger PSI matrix into an `.h5ad` file for downstream analysis. \n", "Splicing events that could not be quantified in a cell are kept as missing (NaN) values rather than filled in.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from DOLPHIN.AS.convert_psi_to_h5ad import run_convert_psi" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 795/795 [05:34<00:00, 2.38it/s]\n" ] } ], "source": [ "adata_psi = run_convert_psi(\n", "    metadata_path=\"./fsla_meta.csv\",\n", "    outrigger_path=\"./outrigger_output\",\n", "    out_name=\"fsla\",\n", "    out_directory=\"./\"\n", ")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "AnnData object with n_obs × n_vars = 795 × 9487\n", " obs: 'celltype1', 'celltype2'\n", " var: 'gene_name'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adata_psi" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table>\n",
 "<thead>\n",
 "<tr>\n",
 "<th></th>\n",
 "<th>isoform1=junction:10:100246936-100253420:-|isoform2=junction:10:100250333-100253420:-@exon:10:100250248-100250332:-@junction:10:100246936-100250247:-</th>\n",
 "<th>isoform1=junction:10:100256477-100260965:-|isoform2=junction:10:100260320-100260965:-@exon:10:100260218-100260319:-@junction:10:100256477-100260217:-</th>\n",
 "<th>isoform1=junction:10:100489762-100490705:-|isoform2=junction:10:100490323-100490705:-@exon:10:100490008-100490322:-@junction:10:100489762-100490007:-</th>\n",
 "<th>isoform1=junction:10:100496432-100497666:-|isoform2=junction:10:100497281-100497666:-@exon:10:100497135-100497280:-@junction:10:100496432-100497134:-</th>\n",
 "<th>isoform1=junction:10:100498208-100499159:-|isoform2=junction:10:100498805-100499159:-@exon:10:100498705-100498804:-@junction:10:100498208-100498704:-</th>\n",
 "<th>isoform1=junction:10:100516961-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100516961-100526398:-</th>\n",
 "<th>isoform1=junction:10:100523930-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100523930-100526398:-</th>\n",
 "<th>isoform1=junction:10:100983818-100986748:-|isoform2=junction:10:100984075-100986748:-@exon:10:100983948-100984074:-@junction:10:100983818-100983947:-</th>\n",
 "<th>isoform1=junction:10:101611770-101624744:-|isoform2=junction:10:101612478-101624744:-@exon:10:101612337-101612477:-@junction:10:101611770-101612336:-</th>\n",
 "<th>isoform1=junction:10:101624811-101672914:-|isoform2=junction:10:101667981-101672914:-@exon:10:101667886-101667980:-@junction:10:101624811-101667885:-</th>\n",
 "<th>...</th>\n",
 "<th>isoform1=junction:X:78945496-78960507:+|isoform2=junction:X:78945496-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+</th>\n",
 "<th>isoform1=junction:X:78947864-78960507:+|isoform2=junction:X:78947864-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+</th>\n",
 "<th>isoform1=junction:X:79361480-79362941:-|isoform2=junction:X:79362692-79362941:-@exon:X:79362581-79362691:-@junction:X:79361480-79362580:-</th>\n",
 "<th>isoform1=junction:X:81202246-81276983:+|isoform2=junction:X:81202246-81202436:+@exon:X:81202437-81202576:+@junction:X:81202577-81276983:+</th>\n",
 "<th>isoform1=junction:Y:12909408-12912726:+|isoform2=junction:Y:12909408-12911838:+@exon:Y:12911839-12911968:+@junction:Y:12911969-12912726:+</th>\n",
 "<th>isoform1=junction:Y:13359987-13366266:-|isoform2=junction:Y:13360529-13366266:-@exon:Y:13360430-13360528:-@junction:Y:13359987-13360429:-</th>\n",
 "<th>isoform1=junction:Y:19587508-19590082:+|isoform2=junction:Y:19587508-19589520:+@exon:Y:19589521-19589612:+@junction:Y:19589613-19590082:+</th>\n",
 "<th>isoform1=junction:Y:19735751-19741317:-|isoform2=junction:Y:19739663-19741317:-@exon:Y:19739528-19739662:-@junction:Y:19735751-19739527:-</th>\n",
 "<th>isoform1=junction:Y:20582694-20588023:+|isoform2=junction:Y:20582694-20584473:+@exon:Y:20584474-20584524:+@junction:Y:20584525-20588023:+</th>\n",
 "<th>isoform1=junction:Y:2854772-2866792:+|isoform2=junction:Y:2854772-2865087:+@exon:Y:2865088-2865245:+@junction:Y:2865246-2866792:+</th>\n",
 "</tr>\n",
 "</thead>\n",
 "<tbody>\n",
 "<tr><th>SRR18388386</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>...</td><td>NaN</td><td>NaN</td><td>1.0</td><td>0.000000</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18387779</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>0.054945</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18387770</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>...</td><td>NaN</td><td>NaN</td><td>1.0</td><td>0.000000</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18388394</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18387788</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>...</td><td>NaN</td><td>NaN</td><td>1.0</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1.0</td></tr>\n",
 "</tbody>\n",
 "</table>\n",
 "<p>5 rows × 9487 columns</p>\n",
 "</div>
" ], "text/plain": [ " isoform1=junction:10:100246936-100253420:-|isoform2=junction:10:100250333-100253420:-@exon:10:100250248-100250332:-@junction:10:100246936-100250247:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:100256477-100260965:-|isoform2=junction:10:100260320-100260965:-@exon:10:100260218-100260319:-@junction:10:100256477-100260217:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:100489762-100490705:-|isoform2=junction:10:100490323-100490705:-@exon:10:100490008-100490322:-@junction:10:100489762-100490007:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:100496432-100497666:-|isoform2=junction:10:100497281-100497666:-@exon:10:100497135-100497280:-@junction:10:100496432-100497134:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:100498208-100499159:-|isoform2=junction:10:100498805-100499159:-@exon:10:100498705-100498804:-@junction:10:100498208-100498704:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:100516961-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100516961-100526398:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:100523930-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100523930-100526398:- \\\n", "SRR18388386 1.0 \n", "SRR18387779 1.0 \n", "SRR18387770 NaN \n", "SRR18388394 1.0 \n", "SRR18387788 1.0 \n", "\n", " 
isoform1=junction:10:100983818-100986748:-|isoform2=junction:10:100984075-100986748:-@exon:10:100983948-100984074:-@junction:10:100983818-100983947:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:101611770-101624744:-|isoform2=junction:10:101612478-101624744:-@exon:10:101612337-101612477:-@junction:10:101611770-101612336:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:10:101624811-101672914:-|isoform2=junction:10:101667981-101672914:-@exon:10:101667886-101667980:-@junction:10:101624811-101667885:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " ... \\\n", "SRR18388386 ... \n", "SRR18387779 ... \n", "SRR18387770 ... \n", "SRR18388394 ... \n", "SRR18387788 ... \n", "\n", " isoform1=junction:X:78945496-78960507:+|isoform2=junction:X:78945496-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+ \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:X:78947864-78960507:+|isoform2=junction:X:78947864-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+ \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:X:79361480-79362941:-|isoform2=junction:X:79362692-79362941:-@exon:X:79362581-79362691:-@junction:X:79361480-79362580:- \\\n", "SRR18388386 1.0 \n", "SRR18387779 NaN \n", "SRR18387770 1.0 \n", "SRR18388394 NaN \n", "SRR18387788 1.0 \n", "\n", " isoform1=junction:X:81202246-81276983:+|isoform2=junction:X:81202246-81202436:+@exon:X:81202437-81202576:+@junction:X:81202577-81276983:+ \\\n", "SRR18388386 0.000000 \n", "SRR18387779 0.054945 \n", "SRR18387770 0.000000 \n", "SRR18388394 NaN \n", 
"SRR18387788 NaN \n", "\n", " isoform1=junction:Y:12909408-12912726:+|isoform2=junction:Y:12909408-12911838:+@exon:Y:12911839-12911968:+@junction:Y:12911969-12912726:+ \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:Y:13359987-13366266:-|isoform2=junction:Y:13360529-13366266:-@exon:Y:13360430-13360528:-@junction:Y:13359987-13360429:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:Y:19587508-19590082:+|isoform2=junction:Y:19587508-19589520:+@exon:Y:19589521-19589612:+@junction:Y:19589613-19590082:+ \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:Y:19735751-19741317:-|isoform2=junction:Y:19739663-19741317:-@exon:Y:19739528-19739662:-@junction:Y:19735751-19739527:- \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:Y:20582694-20588023:+|isoform2=junction:Y:20582694-20584473:+@exon:Y:20584474-20584524:+@junction:Y:20584525-20588023:+ \\\n", "SRR18388386 NaN \n", "SRR18387779 NaN \n", "SRR18387770 NaN \n", "SRR18388394 NaN \n", "SRR18387788 NaN \n", "\n", " isoform1=junction:Y:2854772-2866792:+|isoform2=junction:Y:2854772-2865087:+@exon:Y:2865088-2865245:+@junction:Y:2865246-2866792:+ \n", "SRR18388386 1.0 \n", "SRR18387779 1.0 \n", "SRR18387770 1.0 \n", "SRR18388394 1.0 \n", "SRR18387788 1.0 \n", "\n", "[5 rows x 9487 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adata_psi.to_df().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 2: Cell Clustering Using PSI Values\n", "\n", "This step processes the _PSI.h5ad file to facilitate cell clustering. 
\n", "To enable PCA and downstream analyses, each missing PSI (NaN) value is replaced with a random value between 0 and 1; observed PSI values are left unchanged. \n", "The resulting matrix is saved as a new `.h5ad` file containing the imputed PSI values.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from DOLPHIN.AS.convert_random_psi import run_psi_random" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "adata_psi_random = run_psi_random(\n", "    outrigger_psi_data=\"./alternative_splicing/fsla_PSI.h5ad\",\n", "    out_name=\"fsla\",\n", "    out_directory=\"./\"\n", ")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table>\n",
 "<thead>\n",
 "<tr>\n",
 "<th></th>\n",
 "<th>isoform1=junction:10:100246936-100253420:-|isoform2=junction:10:100250333-100253420:-@exon:10:100250248-100250332:-@junction:10:100246936-100250247:-</th>\n",
 "<th>isoform1=junction:10:100256477-100260965:-|isoform2=junction:10:100260320-100260965:-@exon:10:100260218-100260319:-@junction:10:100256477-100260217:-</th>\n",
 "<th>isoform1=junction:10:100489762-100490705:-|isoform2=junction:10:100490323-100490705:-@exon:10:100490008-100490322:-@junction:10:100489762-100490007:-</th>\n",
 "<th>isoform1=junction:10:100496432-100497666:-|isoform2=junction:10:100497281-100497666:-@exon:10:100497135-100497280:-@junction:10:100496432-100497134:-</th>\n",
 "<th>isoform1=junction:10:100498208-100499159:-|isoform2=junction:10:100498805-100499159:-@exon:10:100498705-100498804:-@junction:10:100498208-100498704:-</th>\n",
 "<th>isoform1=junction:10:100516961-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100516961-100526398:-</th>\n",
 "<th>isoform1=junction:10:100523930-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100523930-100526398:-</th>\n",
 "<th>isoform1=junction:10:100983818-100986748:-|isoform2=junction:10:100984075-100986748:-@exon:10:100983948-100984074:-@junction:10:100983818-100983947:-</th>\n",
 "<th>isoform1=junction:10:101611770-101624744:-|isoform2=junction:10:101612478-101624744:-@exon:10:101612337-101612477:-@junction:10:101611770-101612336:-</th>\n",
 "<th>isoform1=junction:10:101624811-101672914:-|isoform2=junction:10:101667981-101672914:-@exon:10:101667886-101667980:-@junction:10:101624811-101667885:-</th>\n",
 "<th>...</th>\n",
 "<th>isoform1=junction:X:78945496-78960507:+|isoform2=junction:X:78945496-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+</th>\n",
 "<th>isoform1=junction:X:78947864-78960507:+|isoform2=junction:X:78947864-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+</th>\n",
 "<th>isoform1=junction:X:79361480-79362941:-|isoform2=junction:X:79362692-79362941:-@exon:X:79362581-79362691:-@junction:X:79361480-79362580:-</th>\n",
 "<th>isoform1=junction:X:81202246-81276983:+|isoform2=junction:X:81202246-81202436:+@exon:X:81202437-81202576:+@junction:X:81202577-81276983:+</th>\n",
 "<th>isoform1=junction:Y:12909408-12912726:+|isoform2=junction:Y:12909408-12911838:+@exon:Y:12911839-12911968:+@junction:Y:12911969-12912726:+</th>\n",
 "<th>isoform1=junction:Y:13359987-13366266:-|isoform2=junction:Y:13360529-13366266:-@exon:Y:13360430-13360528:-@junction:Y:13359987-13360429:-</th>\n",
 "<th>isoform1=junction:Y:19587508-19590082:+|isoform2=junction:Y:19587508-19589520:+@exon:Y:19589521-19589612:+@junction:Y:19589613-19590082:+</th>\n",
 "<th>isoform1=junction:Y:19735751-19741317:-|isoform2=junction:Y:19739663-19741317:-@exon:Y:19739528-19739662:-@junction:Y:19735751-19739527:-</th>\n",
 "<th>isoform1=junction:Y:20582694-20588023:+|isoform2=junction:Y:20582694-20584473:+@exon:Y:20584474-20584524:+@junction:Y:20584525-20588023:+</th>\n",
 "<th>isoform1=junction:Y:2854772-2866792:+|isoform2=junction:Y:2854772-2865087:+@exon:Y:2865088-2865245:+@junction:Y:2865246-2866792:+</th>\n",
 "</tr>\n",
 "</thead>\n",
 "<tbody>\n",
 "<tr><th>SRR18388386</th><td>0.548814</td><td>0.715189</td><td>0.602763</td><td>0.544883</td><td>0.423655</td><td>0.645894</td><td>1.000000</td><td>0.891773</td><td>0.963663</td><td>0.383442</td><td>...</td><td>0.192200</td><td>0.916999</td><td>1.000000</td><td>0.000000</td><td>0.224325</td><td>0.646099</td><td>0.377303</td><td>0.239175</td><td>0.843921</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18387779</th><td>0.471649</td><td>0.285935</td><td>0.872293</td><td>0.419384</td><td>0.465397</td><td>0.191993</td><td>1.000000</td><td>0.549905</td><td>0.656898</td><td>0.418817</td><td>...</td><td>0.690972</td><td>0.570053</td><td>0.554895</td><td>0.054945</td><td>0.350364</td><td>0.765937</td><td>0.074863</td><td>0.808629</td><td>0.241341</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18387770</th><td>0.467157</td><td>0.207176</td><td>0.913840</td><td>0.688435</td><td>0.001312</td><td>0.802888</td><td>0.192368</td><td>0.410850</td><td>0.828048</td><td>0.916628</td><td>...</td><td>0.655658</td><td>0.150540</td><td>1.000000</td><td>0.000000</td><td>0.511076</td><td>0.095635</td><td>0.835670</td><td>0.217615</td><td>0.790823</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18388394</th><td>0.921630</td><td>0.576743</td><td>0.486409</td><td>0.646680</td><td>0.844161</td><td>0.301350</td><td>1.000000</td><td>0.214558</td><td>0.589372</td><td>0.956229</td><td>...</td><td>0.258662</td><td>0.366379</td><td>0.270519</td><td>0.613285</td><td>0.829367</td><td>0.948184</td><td>0.816119</td><td>0.677352</td><td>0.157222</td><td>1.0</td></tr>\n",
 "<tr><th>SRR18387788</th><td>0.978044</td><td>0.774697</td><td>0.780661</td><td>0.542142</td><td>0.946817</td><td>0.996528</td><td>1.000000</td><td>0.841271</td><td>0.634649</td><td>0.234987</td><td>...</td><td>0.359507</td><td>0.236903</td><td>1.000000</td><td>0.950538</td><td>0.375510</td><td>0.247062</td><td>0.320256</td><td>0.047280</td><td>0.274863</td><td>1.0</td></tr>\n",
 "</tbody>\n",
 "</table>\n",
 "<p>5 rows × 9487 columns</p>\n",
 "</div>
" ], "text/plain": [ " isoform1=junction:10:100246936-100253420:-|isoform2=junction:10:100250333-100253420:-@exon:10:100250248-100250332:-@junction:10:100246936-100250247:- \\\n", "SRR18388386 0.548814 \n", "SRR18387779 0.471649 \n", "SRR18387770 0.467157 \n", "SRR18388394 0.921630 \n", "SRR18387788 0.978044 \n", "\n", " isoform1=junction:10:100256477-100260965:-|isoform2=junction:10:100260320-100260965:-@exon:10:100260218-100260319:-@junction:10:100256477-100260217:- \\\n", "SRR18388386 0.715189 \n", "SRR18387779 0.285935 \n", "SRR18387770 0.207176 \n", "SRR18388394 0.576743 \n", "SRR18387788 0.774697 \n", "\n", " isoform1=junction:10:100489762-100490705:-|isoform2=junction:10:100490323-100490705:-@exon:10:100490008-100490322:-@junction:10:100489762-100490007:- \\\n", "SRR18388386 0.602763 \n", "SRR18387779 0.872293 \n", "SRR18387770 0.913840 \n", "SRR18388394 0.486409 \n", "SRR18387788 0.780661 \n", "\n", " isoform1=junction:10:100496432-100497666:-|isoform2=junction:10:100497281-100497666:-@exon:10:100497135-100497280:-@junction:10:100496432-100497134:- \\\n", "SRR18388386 0.544883 \n", "SRR18387779 0.419384 \n", "SRR18387770 0.688435 \n", "SRR18388394 0.646680 \n", "SRR18387788 0.542142 \n", "\n", " isoform1=junction:10:100498208-100499159:-|isoform2=junction:10:100498805-100499159:-@exon:10:100498705-100498804:-@junction:10:100498208-100498704:- \\\n", "SRR18388386 0.423655 \n", "SRR18387779 0.465397 \n", "SRR18387770 0.001312 \n", "SRR18388394 0.844161 \n", "SRR18387788 0.946817 \n", "\n", " isoform1=junction:10:100516961-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100516961-100526398:- \\\n", "SRR18388386 0.645894 \n", "SRR18387779 0.191993 \n", "SRR18387770 0.802888 \n", "SRR18388394 0.301350 \n", "SRR18387788 0.996528 \n", "\n", " isoform1=junction:10:100523930-100526974:-|isoform2=junction:10:100526555-100526974:-@exon:10:100526399-100526554:-@junction:10:100523930-100526398:- \\\n", "SRR18388386 
1.000000 \n", "SRR18387779 1.000000 \n", "SRR18387770 0.192368 \n", "SRR18388394 1.000000 \n", "SRR18387788 1.000000 \n", "\n", " isoform1=junction:10:100983818-100986748:-|isoform2=junction:10:100984075-100986748:-@exon:10:100983948-100984074:-@junction:10:100983818-100983947:- \\\n", "SRR18388386 0.891773 \n", "SRR18387779 0.549905 \n", "SRR18387770 0.410850 \n", "SRR18388394 0.214558 \n", "SRR18387788 0.841271 \n", "\n", " isoform1=junction:10:101611770-101624744:-|isoform2=junction:10:101612478-101624744:-@exon:10:101612337-101612477:-@junction:10:101611770-101612336:- \\\n", "SRR18388386 0.963663 \n", "SRR18387779 0.656898 \n", "SRR18387770 0.828048 \n", "SRR18388394 0.589372 \n", "SRR18387788 0.634649 \n", "\n", " isoform1=junction:10:101624811-101672914:-|isoform2=junction:10:101667981-101672914:-@exon:10:101667886-101667980:-@junction:10:101624811-101667885:- \\\n", "SRR18388386 0.383442 \n", "SRR18387779 0.418817 \n", "SRR18387770 0.916628 \n", "SRR18388394 0.956229 \n", "SRR18387788 0.234987 \n", "\n", " ... \\\n", "SRR18388386 ... \n", "SRR18387779 ... \n", "SRR18387770 ... \n", "SRR18388394 ... \n", "SRR18387788 ... 
\n", "\n", " isoform1=junction:X:78945496-78960507:+|isoform2=junction:X:78945496-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+ \\\n", "SRR18388386 0.192200 \n", "SRR18387779 0.690972 \n", "SRR18387770 0.655658 \n", "SRR18388394 0.258662 \n", "SRR18387788 0.359507 \n", "\n", " isoform1=junction:X:78947864-78960507:+|isoform2=junction:X:78947864-78952192:+@exon:X:78952193-78952335:+@junction:X:78952336-78960507:+ \\\n", "SRR18388386 0.916999 \n", "SRR18387779 0.570053 \n", "SRR18387770 0.150540 \n", "SRR18388394 0.366379 \n", "SRR18387788 0.236903 \n", "\n", " isoform1=junction:X:79361480-79362941:-|isoform2=junction:X:79362692-79362941:-@exon:X:79362581-79362691:-@junction:X:79361480-79362580:- \\\n", "SRR18388386 1.000000 \n", "SRR18387779 0.554895 \n", "SRR18387770 1.000000 \n", "SRR18388394 0.270519 \n", "SRR18387788 1.000000 \n", "\n", " isoform1=junction:X:81202246-81276983:+|isoform2=junction:X:81202246-81202436:+@exon:X:81202437-81202576:+@junction:X:81202577-81276983:+ \\\n", "SRR18388386 0.000000 \n", "SRR18387779 0.054945 \n", "SRR18387770 0.000000 \n", "SRR18388394 0.613285 \n", "SRR18387788 0.950538 \n", "\n", " isoform1=junction:Y:12909408-12912726:+|isoform2=junction:Y:12909408-12911838:+@exon:Y:12911839-12911968:+@junction:Y:12911969-12912726:+ \\\n", "SRR18388386 0.224325 \n", "SRR18387779 0.350364 \n", "SRR18387770 0.511076 \n", "SRR18388394 0.829367 \n", "SRR18387788 0.375510 \n", "\n", " isoform1=junction:Y:13359987-13366266:-|isoform2=junction:Y:13360529-13366266:-@exon:Y:13360430-13360528:-@junction:Y:13359987-13360429:- \\\n", "SRR18388386 0.646099 \n", "SRR18387779 0.765937 \n", "SRR18387770 0.095635 \n", "SRR18388394 0.948184 \n", "SRR18387788 0.247062 \n", "\n", " isoform1=junction:Y:19587508-19590082:+|isoform2=junction:Y:19587508-19589520:+@exon:Y:19589521-19589612:+@junction:Y:19589613-19590082:+ \\\n", "SRR18388386 0.377303 \n", "SRR18387779 0.074863 \n", "SRR18387770 0.835670 \n", "SRR18388394 0.816119 \n", 
"SRR18387788 0.320256 \n", "\n", " isoform1=junction:Y:19735751-19741317:-|isoform2=junction:Y:19739663-19741317:-@exon:Y:19739528-19739662:-@junction:Y:19735751-19739527:- \\\n", "SRR18388386 0.239175 \n", "SRR18387779 0.808629 \n", "SRR18387770 0.217615 \n", "SRR18388394 0.677352 \n", "SRR18387788 0.047280 \n", "\n", " isoform1=junction:Y:20582694-20588023:+|isoform2=junction:Y:20582694-20584473:+@exon:Y:20584474-20584524:+@junction:Y:20584525-20588023:+ \\\n", "SRR18388386 0.843921 \n", "SRR18387779 0.241341 \n", "SRR18387770 0.790823 \n", "SRR18388394 0.157222 \n", "SRR18387788 0.274863 \n", "\n", " isoform1=junction:Y:2854772-2866792:+|isoform2=junction:Y:2854772-2865087:+@exon:Y:2865088-2865245:+@junction:Y:2865246-2866792:+ \n", "SRR18388386 1.0 \n", "SRR18387779 1.0 \n", "SRR18387770 1.0 \n", "SRR18388394 1.0 \n", "SRR18387788 1.0 \n", "\n", "[5 rows x 9487 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adata_psi_random.to_df().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 3: Differential Alternative Splicing Analysis\n", "\n", "In this step, we perform differential alternative splicing analysis using the Wilcoxon test. \n", "To enable this, missing PSI (NaN) values are imputed using the average PSI across all events \n", "within each cell cluster. Specifically, for each cluster, we calculate the mean PSI across all \n", "available events and use this value to fill NaNs in that cluster's cells. 
This ensures that \n", "events with sparse coverage are still imputed from the overall splicing profile \n", "of their respective cluster.\n", "\n", "Events are also filtered: only those with a valid (non-NaN) PSI in at least 10 cells are kept, \n", "which here leaves 4978 of the 9487 events.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from DOLPHIN.AS.generate_differential_as import run_differential_as" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of splicing events before filtering: 9487\n", "Number of splicing events after filtering (>= 10 cells with valid PSI): 4978\n" ] } ], "source": [ "adata_psi_DAS = run_differential_as(\n", "    outrigger_psi_data=\"./alternative_splicing/fsla_PSI.h5ad\",\n", "    out_name=\"fsla\",\n", "    cluster_name=\"celltype1\",\n", "    out_directory=\"./\"\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "DOLPHIN", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.15" } }, "nbformat": 4, "nbformat_minor": 2 }