DOLPHIN Model module

This module provides functions to run the DOLPHIN model.

DOLPHIN.model.run_model.run_DOLPHIN(data_type, graph_in, fea_in, current_out_path='./', params=None, device='auto', seed_num=0)

Run the DOLPHIN model on single-cell RNA-seq data to obtain latent cell embeddings.

Parameters:

data_type (str) –
Specifies the type of input single-cell RNA-seq data. Options are:
- "full-length": For full-length RNA-seq data.
- "10x": For 10x Genomics RNA-seq data.
graph_in (object) – The input graph structure (precomputed from exon-level data).
fea_in (anndata.AnnData) – The input feature matrix, provided as an AnnData object.
current_out_path (str, optional) – Output directory where the resulting cell embeddings will be saved. The embeddings are written to DOLPHIN_Z.h5ad. Default is './'.
params (dict, optional) –
A dictionary of model hyperparameters. If not provided, default parameters will be used depending on data_type. Customizable parameters include:
- "gat_channel" : Number of GAT output channels per head.
- "nhead" : Number of GAT attention heads.
- "gat_dropout" : Dropout rate in the GAT layer.
- "list_gra_enc_hid" : Encoder MLP hidden layer sizes.
- "gra_p_dropout" : Dropout rate in the encoder.
- "z_dim" : Dimensionality of the latent space.
- "list_fea_dec_hid" : Feature decoder MLP hidden layer sizes.
- "list_adj_dec_hid" : Adjacency decoder MLP hidden layer sizes.
- "lr" : Learning rate.
- "batch" : Mini-batch size.
- "epochs" : Number of training epochs.
- "kl_beta" : KL divergence loss weight.
- "fea_lambda" : Feature reconstruction loss weight.
- "adj_lambda" : Adjacency reconstruction loss weight.
device (str, optional) –
Computational device to run the model on. Options are:
- 'auto' (default): Automatically selects 'cuda' if a GPU is available, otherwise falls back to 'cpu'.
- 'cuda' or 'cuda:0': Use the first available GPU.
- 'cpu': Run on CPU only.
GPU acceleration is recommended for large datasets or faster training.
seed_num (int, optional) – Random seed for reproducibility. Default is 0.
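
The documented argument values can be checked before starting a long training run. The following is a minimal sketch and not part of DOLPHIN: check_run_args and resolve_device are hypothetical helpers, and the key and option names are copied from the parameter descriptions above. Whether a partial params dictionary is merged with the defaults is not stated here, so the sketch only checks key names and leaves the values alone.

```python
# Hypothetical pre-flight checks; these helpers are NOT part of DOLPHIN.

# Option values and keys copied from the parameter descriptions above.
DATA_TYPES = {"full-length", "10x"}
PARAM_KEYS = {
    "gat_channel", "nhead", "gat_dropout", "list_gra_enc_hid",
    "gra_p_dropout", "z_dim", "list_fea_dec_hid", "list_adj_dec_hid",
    "lr", "batch", "epochs", "kl_beta", "fea_lambda", "adj_lambda",
}


def check_run_args(data_type, params=None):
    """Raise ValueError if data_type or any params key is not documented."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown data_type: {data_type!r}")
    unknown = set(params or {}) - PARAM_KEYS
    if unknown:
        raise ValueError(f"unknown params keys: {sorted(unknown)}")


def resolve_device(device, gpu_available):
    """Mirror the documented 'auto' rule: 'cuda' if a GPU is available, else 'cpu'."""
    if device == "auto":
        return "cuda" if gpu_available else "cpu"
    return device  # explicit choices such as 'cpu' or 'cuda:0' pass through
```

For example, check_run_args("10x", {"z_dim": 32}) passes (the value 32 is illustrative, not a library default), while a misspelled key such as "zdim" raises ValueError. In practice gpu_available might come from torch.cuda.is_available(), assuming the model runs on PyTorch.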

Returns:

Nothing is returned. As a side effect, the latent cell embedding matrix is saved to DOLPHIN_Z.h5ad under current_out_path.

Return type:

None
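
Because the function returns None, the embeddings have to be read back from disk after the run. The sketch below shows one way this might look. embedding_path and run_and_load are hypothetical names, the import path is taken from the dotted name in the signature above, and the choice of "10x" is only an example. Where the embeddings sit inside the saved AnnData object is not specified here, so the sketch returns the whole object.

```python
import os


def embedding_path(current_out_path="./"):
    """Path where, per the description above, the embeddings are written."""
    return os.path.join(current_out_path, "DOLPHIN_Z.h5ad")


def run_and_load(graph_in, fea_in, out_dir="./"):
    """Run the model, then read the saved embeddings back from disk."""
    # Imported lazily so this sketch loads even without the packages installed.
    import anndata
    from DOLPHIN.model.run_model import run_DOLPHIN

    # run_DOLPHIN returns None; the result is only available as a file.
    run_DOLPHIN("10x", graph_in, fea_in, current_out_path=out_dir,
                device="auto", seed_num=0)
    return anndata.read_h5ad(embedding_path(out_dir))
```

Here graph_in and fea_in are assumed to have been prepared beforehand in the formats described under Parameters.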