Quick Start

This guide briefly walks through the basics of using BALSAM and DeltaTopic.

Data Preparation

Read your data files (for example, h5ad files) with scanpy:

import scanpy as sc
adata = sc.read(filename_spliced)
adata_unspliced = sc.read(filename_unspliced)

Alternatively, build the AnnData object from NumPy arrays:

from scipy.sparse import csr_matrix
import anndata as ad
adata = ad.AnnData(csr_matrix(X_spliced))
adata.layers["counts"] = adata.X.copy()
adata.obsm["unspliced_expression"] = csr_matrix(X_unspliced)

Register the spliced and unspliced counts (needed only if you read the data with scanpy; the NumPy route above has already done this):

adata.layers["counts"] = adata.X.copy()
adata.obsm["unspliced_expression"] = adata_unspliced.X.copy()
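Because the unspliced matrix is stored in adata.obsm, it needs one row per cell in adata. The sketch below is a quick sanity check to run before registering the counts. It is not part of DeltaTopic: check_aligned is a hypothetical helper, the toy matrices stand in for adata.X and adata_unspliced.X, and the requirement that both matrices cover the same genes is an assumption rather than something stated here.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def check_aligned(spliced, unspliced):
    """Hypothetical helper: raise ValueError unless both count matrices
    have the same shape and contain no negative values."""
    if spliced.shape != unspliced.shape:
        raise ValueError(
            f"shape mismatch: spliced {spliced.shape} vs unspliced {unspliced.shape}"
        )
    for name, mat in (("spliced", spliced), ("unspliced", unspliced)):
        # For sparse input only the stored entries need checking.
        values = mat.data if issparse(mat) else np.asarray(mat)
        if values.size and values.min() < 0:
            raise ValueError(f"{name} matrix contains negative counts")
    return spliced.shape


# Toy stand-ins for adata.X and adata_unspliced.X: 5 cells x 4 genes.
rng = np.random.default_rng(0)
X_spliced = csr_matrix(rng.poisson(2.0, size=(5, 4)))
X_unspliced = csr_matrix(rng.poisson(1.0, size=(5, 4)))
n_cells, n_genes = check_aligned(X_spliced, X_unspliced)
```

If the check raises, subset or reorder the two objects so that they share the same cells before registering the unspliced counts.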

Set up the AnnData object:

from DeltaTopic.nn.util import setup_anndata
setup_anndata(adata, layer="counts", unspliced_obsm_key = "unspliced_expression")

Note

If you are training BALSAM only, you can skip reading and registering the unspliced counts.

Training

Import the model and train:

from DeltaTopic.nn.modelhub import DeltaTopic
model = DeltaTopic(adata, n_latent = 32)
model.train(400)

Save model states and output the latent space:

import pandas as pd
model.save(SavePATH)  # e.g. "./saved_model/"
model.get_parameters(save_dir=SavePATH)  # spike and slab parameters
topics_np = model.get_latent_representation() # latent topic proportions
pd.DataFrame(topics_np).to_csv(SaveFILENAME)
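The CSV written above has only integer row and column labels. A minimal sketch of a labelled table follows, assuming that the rows of the latent matrix come in the same order as adata.obs_names (worth verifying for your setup). The random array stands in for model.get_latent_representation(), and the cell and topic names are hypothetical.

```python
import numpy as np
import pandas as pd

# Stand-in for model.get_latent_representation(): 6 cells x 4 topics.
rng = np.random.default_rng(0)
topics_np = rng.dirichlet(np.ones(4), size=6)

# Stand-in for adata.obs_names (hypothetical cell identifiers).
cell_names = [f"cell_{i}" for i in range(topics_np.shape[0])]

topics_df = pd.DataFrame(
    topics_np,
    index=pd.Index(cell_names, name="cell"),
    columns=[f"topic_{k}" for k in range(topics_np.shape[1])],
)

# With no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = topics_df.to_csv()
```

Keeping the cell identifiers in the file makes it easier to join the topic proportions back to cell metadata later.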

Analysis

Finally, run your preferred analysis on the latent space and topic loadings. For examples of the analyses used in the paper, please refer to the Rmd files in the project repository.
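As one simple starting point (not the analysis from the paper), each cell can be assigned to its highest-weight topic and the cells counted per topic. The sketch below uses a small hand-written matrix in place of the saved topic proportions.

```python
import numpy as np

# Toy topic proportions: 4 cells x 3 topics, each row sums to 1.
topics_np = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])

# Index of the largest proportion in each row, i.e. the dominant topic per cell.
dominant_topic = topics_np.argmax(axis=1)

# Number of cells assigned to each topic (topics with no cells get 0).
cells_per_topic = np.bincount(dominant_topic, minlength=topics_np.shape[1])
```

A hard assignment like this discards the mixed membership that a topic model provides, so it suits a first look better than a final result.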