SIDISH.SIDISH
- class SIDISH.SIDISH(adata, bulk, device='cpu', seed=1234, use_spatial_graph=False, k_neighbors=None)
SIDISH (Semi-Supervised Iterative Deep Learning for Identifying High-Risk Cells).
This framework integrates single-cell and bulk RNA-seq data to identify High-Risk cancer cells and potential biomarkers.
- Parameters:
adata (AnnData) – Single-cell RNA-seq data.
bulk (pd.DataFrame) – Bulk RNA-seq data.
use_spatial_graph (bool, optional) – Whether to use spatial graph information (default=False).
k_neighbors (int, optional) – Number of neighbors to use for constructing the spatial graph (default=None in the signature; the documented default is 5).
device (str, optional) – Computation device, 'cpu' or 'cuda' (default='cpu').
seed (int, optional) – Random seed for reproducibility (default=1234).
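To make the role of k_neighbors concrete, here is a minimal sketch of a k-nearest-neighbor graph built from 2D coordinates. It assumes Euclidean distance and uses a hypothetical helper, knn_graph, with made-up coordinates; how SIDISH actually constructs its spatial graph is not described on this page, so treat this as an illustration of the general idea only.

```python
import math

def knn_graph(coords, k):
    """Map each point index to the indices of its k nearest other points.

    Hypothetical helper for illustration; not part of SIDISH.
    """
    if k >= len(coords):
        raise ValueError("k must be smaller than the number of points")
    graph = {}
    for i, p in enumerate(coords):
        # Euclidean distance from point i to every other point, paired with its index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(coords) if j != i]
        dists.sort()  # nearest first; ties are broken by index
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Made-up spot coordinates, not taken from any real dataset.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = knn_graph(coords, k=2)
print(graph)
```

A larger k connects each cell to more of its surroundings, while a smaller k keeps the graph local.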
Methods
__init__(adata, bulk[, device, seed, ...])
annotateCells(test_adata, percentile_cells, mode) – Extracts latent representations from the trained VAE.
get_MarkerGenes([logfc_threshold, ...]) – Identifies marker genes for the specified group using different statistical methods.
get_embedding([n_neighbors, resolution, ...])
get_percentille(percentile)
init_Phase1(epochs, i_epochs, latent_size, ...) – Initializes Phase 1: training a Variational Autoencoder (VAE) on single-cell RNA-seq data.
init_Phase2(epochs, hidden, lr, dropout, ...) – Initializes Phase 2: training a Deep Cox model for survival analysis using bulk RNA-seq data.
plotUMAP(resolution[, figure_size, ...]) – Performs UMAP dimensionality reduction and Leiden clustering on the latent space.
plot_CellType_UMAP([size, resolution, celltype])
plot_HighRisk_UMAP([size, resolution, celltype])
plot_KM([penalizer, data_name, ...]) – Plots Kaplan-Meier survival curves for High-Risk and background patient groups.
plot_double_Perturbation_Heatmap(...[, top_n])
plot_perturbation_UMAP_default(genes_of_interest) – Generates UMAP visualizations for specified genes after in-silico perturbation.
plot_perturbation_UMAP_differential(...[, ...]) – Generates UMAP visualizations for specified genes after in-silico perturbation.
plot_top_perturbed_genes(gene_data[, top_n]) – Plots a barplot of the top N genes with the highest percentage reduction in High-Risk cells after in-silico perturbation.
reload(path[, num_workers])
run_Perturbation([n_jobs])
run_double_Perturbation(genes[, top_n, ...])
run_double_Perturbation_score(genes[, ...])
train(iterations, percentile, steepness, path) – Trains the SIDISH framework iteratively, refining the identification of High-Risk cells.
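
For readers unfamiliar with the Kaplan-Meier curves mentioned under plot_KM, the sketch below computes the standard Kaplan-Meier estimate in plain Python for two patient groups. The function kaplan_meier and the follow-up data are hypothetical and exist only for illustration; this is not how plot_KM is implemented, and its parameters are not reproduced here.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with an observed event.

    durations: follow-up time for each patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    Hypothetical helper for illustration; not part of SIDISH.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Observed events at time t, and everyone leaving the risk set at t.
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        leaving = sum(1 for d in durations if d == t)
        if deaths > 0:
            # Multiply by the fraction of at-risk patients who survive past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Made-up follow-up data for two groups (time units are arbitrary).
high_risk = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
background = kaplan_meier([4, 6, 9, 10], [0, 1, 0, 1])
print(high_risk)
print(background)
```

Censored patients lower the number at risk without producing a step in the curve, which is why the estimate only changes at times with an observed event.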