API
- class spatialmeta.AnnDataST(*args, **kwargs)[source]
Anndata object for Spatial Transcriptomics data
- class spatialmeta.AnnDataJointSMST(*args, **kwargs)[source]
Anndata object for Joint Spatial Metabolomics and Transcriptomics data
Data
- spatialmeta.data.load_adata(sample_name: str, modality: Literal['ST', 'SM', 'joint']) AnnDataST | AnnDataSM | AnnDataJointSMST[source]
Load the AnnData object for the given sample name and modality.
- Parameters:
sample_name – str The name of the sample. Use list_datasets to get the list of all available datasets.
modality – Literal[“ST”, “SM”, “joint”] The modality of the dataset. Choose from “ST”, “SM”, or “joint”.
Preprocessing
- spatialmeta.pp.read_sm_csv_as_anndata(sm_file: str) AnnDataSM[source]
Read SM csv file as AnnData object.
- Parameters:
sm_file – str. The file path and name of the SM csv file.
- Returns:
AnnDataSM. The AnnData object with SM data.
- spatialmeta.pp.get_mz_reference(p: ImzMLParser, ppm_tolerance: int = 5) DataFrame[source]
Get m/z reference from ImzMLParser object.
- Parameters:
p – ImzMLParser. The ImzMLParser object.
ppm_tolerance – int. The ppm tolerance, default is 5.
- Returns:
pd.DataFrame. The m/z reference for the SM data, and the column names are “m/z”, “Interval Width (+/- Da)”.
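The relationship between a ppm tolerance and the “Interval Width (+/- Da)” column can be illustrated with a little arithmetic. The sketch below assumes the usual definition of parts per million (width = m/z × ppm / 10^6); the helper name `ppm_to_da` is hypothetical and is not part of spatialmeta, and the library's actual binning may differ.

```python
def ppm_to_da(mz: float, ppm_tolerance: float = 5) -> float:
    """Half-width in Da of a +/- ppm window centred on mz (standard ppm definition)."""
    return mz * ppm_tolerance / 1e6


# The same ppm tolerance gives a wider absolute window at higher m/z.
for mz in (100.0, 500.0, 1000.0):
    print(f"m/z {mz:7.1f}: +/- {ppm_to_da(mz):.4f} Da")
```

Because the window scales with m/z, a fixed ppm tolerance is stricter in absolute terms for light ions than for heavy ones.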
- spatialmeta.pp.read_sm_imzml_as_anndata(p: ImzMLParser, mz_reference: DataFrame) AnnDataSM[source]
Read SM imzML file as AnnData object.
- Parameters:
p – ImzMLParser. The ImzMLParser object.
mz_reference – pd.DataFrame. The m/z reference.
- Returns:
AnnDataSM. The AnnData object with SM data.
- spatialmeta.pp.merge_sm_pos_neg(adata_SM_pos: AnnDataSM, adata_SM_neg: AnnDataSM) AnnDataSM[source]
Merge the positive and negative SM data.
- Parameters:
adata_SM_pos – AnnDataSM. The AnnData object with positive SM data.
adata_SM_neg – AnnDataSM. The AnnData object with negative SM data.
- Returns:
AnnDataSM. The merged AnnData object with SM data.
- spatialmeta.pp.calculate_qc_metrics_sm(adata_SM: AnnDataSM)[source]
Calculate the total intensity and mean intensity of each spot.
- Parameters:
adata_SM – AnnDataSM. The AnnDataSM object.
- spatialmeta.pp.filter_cells_sm(adata_SM: AnnDataSM, min_total_intensity: int | None = None, min_mean_intensity: int | None = None, max_total_intensity: int | None = None, max_mean_intensity: int | None = None) AnnDataSM[source]
Filter cells based on total intensity and mean intensity.
- Parameters:
adata_SM – AnnDataSM. The AnnDataSM object.
min_total_intensity – int, minimum total intensity, default None.
min_mean_intensity – int, minimum mean intensity, default None.
max_total_intensity – int, maximum total intensity, default None.
max_mean_intensity – int, maximum mean intensity, default None.
- Returns:
AnnDataSM. The filtered AnnDataSM object.
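The filtering logic described above can be sketched on a plain list of per-spot intensity vectors. This is an illustration only: `filter_spots` is a hypothetical helper, and treating the thresholds as inclusive bounds is an assumption, not something the documentation states.

```python
from typing import List, Optional, Sequence


def filter_spots(
    intensities: Sequence[Sequence[float]],
    min_total_intensity: Optional[float] = None,
    min_mean_intensity: Optional[float] = None,
    max_total_intensity: Optional[float] = None,
    max_mean_intensity: Optional[float] = None,
) -> List[int]:
    """Return the indices of spots that pass every threshold that is not None."""
    kept = []
    for index, row in enumerate(intensities):
        total = sum(row)
        mean = total / len(row)
        # Assumption: bounds are inclusive, so a spot exactly at a threshold is kept.
        if min_total_intensity is not None and total < min_total_intensity:
            continue
        if max_total_intensity is not None and total > max_total_intensity:
            continue
        if min_mean_intensity is not None and mean < min_mean_intensity:
            continue
        if max_mean_intensity is not None and mean > max_mean_intensity:
            continue
        kept.append(index)
    return kept


spots = [[0, 1, 2], [10, 20, 30], [100, 200, 300]]
kept = filter_spots(spots, min_total_intensity=10, max_mean_intensity=150)
print(kept)
```

Here the first spot fails the minimum total intensity and the last spot exceeds the maximum mean intensity, so only the middle spot remains.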
- spatialmeta.pp.filter_metabolites_sm(adata_SM: AnnDataSM, min_cells: int | None = None, max_cells: int | None = None) AnnDataSM[source]
Filter metabolites based on the number of cells.
- Parameters:
adata_SM – AnnDataSM. The AnnDataSM object.
min_cells – int, minimum number of cells, default None.
max_cells – int, maximum number of cells, default None.
- Returns:
AnnDataSM. The filtered AnnDataSM object.
- spatialmeta.pp.new_spot_sample(adata_SM, adata_ST, spatial_key_SM='spatial', spatial_key_ST='spatial', min_diam=500) DataFrame[source]
Generate new spots by resampling in the intersection of the convex hull of SM and ST spots.
- Parameters:
adata_SM – AnnDataSM. The AnnData object with SM data.
adata_ST – AnnDataST. The AnnData object with ST data.
spatial_key_SM – str. The spatial key for SM data, default is “spatial”.
spatial_key_ST – str. The spatial key for ST data, default is “spatial”.
min_diam – int. The minimum diameter of the hexagonal grid, default is 500.
- Returns:
pd.DataFrame. The new spots in the intersection of the convex hull of SM and ST spots.
- spatialmeta.pp.spot_align_byknn(new_dot_in_df: DataFrame, adata_SM: AnnData, adata_ST: AnnData, spatail_key_SM: str = 'spatial', spatial_key_ST: str = 'spatial', min_dist: int = 500, n_neighbors: int = 5, dist_fold: float = 1.5) Tuple[AnnDataSM, AnnDataST][source]
Reassign the new spots to the SM and ST data by KNN.
- Parameters:
new_dot_in_df – pd.DataFrame. The new spots in the intersection of the convex hull of SM and ST spots, the output of function ‘new_spot_sample()’.
adata_SM – AnnDataSM. The AnnData object with SM data.
adata_ST – AnnDataST. The AnnData object with ST data.
spatail_key_SM – str. The spatial key for SM data, default is “spatial”.
spatial_key_ST – str. The spatial key for ST data, default is “spatial”.
min_dist – int. The minimum distance of the spot, which should match min_diam in function ‘new_spot_sample()’, default is 500.
n_neighbors – int. The neighbors for KNN calculation, default is 5.
dist_fold – float. The fold applied to min_dist to obtain the distance cutoff for nearest spots, default is 1.5. For example, if min_dist is 500 and dist_fold is 1.5, the cutoff is 500 * 1.5 = 750, and neighboring spots farther away than this distance are filtered out.
- Returns:
Tuple[AnnDataSM, AnnDataST]. The AnnData object with SM and ST data after reassignment.
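The interaction of n_neighbors, min_dist and dist_fold can be sketched as a nearest-neighbor search followed by a distance cutoff. The helper `knn_within_cutoff` below is hypothetical and works on plain coordinate tuples; how spatialmeta aggregates the retained neighbors into the new spots is not shown here and is not claimed.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def knn_within_cutoff(
    new_spot: Point,
    old_spots: Sequence[Point],
    n_neighbors: int = 5,
    min_dist: float = 500,
    dist_fold: float = 1.5,
) -> List[int]:
    """Indices of the n_neighbors nearest old spots that lie within min_dist * dist_fold."""
    cutoff = min_dist * dist_fold  # e.g. 500 * 1.5 = 750
    ranked = sorted((math.dist(new_spot, spot), i) for i, spot in enumerate(old_spots))
    return [i for distance, i in ranked[:n_neighbors] if distance <= cutoff]


old = [(0, 0), (600, 0), (0, 700), (800, 800), (2000, 0)]
neighbors = knn_within_cutoff((0, 0), old, n_neighbors=4)
print(neighbors)
```

The fourth-nearest spot at (800, 800) is about 1131 units away, beyond the 750 cutoff, so it is dropped even though it is among the four nearest.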
- spatialmeta.pp.joint_adata_sm_st(adata_SM_new: AnnDataSM, adata_ST_new: AnnDataST) AnnDataJointSMST[source]
Merge the SM and ST data into a joint AnnData object.
- Parameters:
adata_SM_new – AnnDataSM. The AnnData object with SM data after reassignment.
adata_ST_new – AnnDataST. The AnnData object with ST data after reassignment.
- Returns:
AnnDataJointSMST. The joint AnnData object with SM and ST data.
- spatialmeta.pp.normalize_total_joint_adata_sm_st(joint_adata: AnnDataJointSMST, target_sum_SM: int | None = 10000.0, target_sum_ST: int | None = 10000.0)[source]
Normalize the total intensity of the SM and ST data in the joint AnnData object.
- Parameters:
joint_adata – AnnDataJointSMST. The joint AnnData object with SM and ST data.
target_sum_SM – Optional[int]. The target sum for SM data, default is 1e4.
target_sum_ST – Optional[int]. The target sum for ST data, default is 1e4.
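Total-sum normalization with a separate target per modality can be sketched as follows. This assumes the conventional meaning of “normalize total” (rescale each spot so its values sum to the target); `normalize_total` is a hypothetical helper operating on plain lists, not the library function.

```python
from typing import Dict, List


def normalize_total(values: List[float], target_sum: float = 1e4) -> List[float]:
    """Rescale one spot's values so that they sum to target_sum."""
    total = sum(values)
    if total == 0:
        # An all-zero spot cannot be rescaled; return it unchanged.
        return list(values)
    return [v * target_sum / total for v in values]


# One joint spot with a gene (ST) block and a metabolite (SM) block,
# each normalized independently to its own target sum.
joint_spot: Dict[str, List[float]] = {"ST": [1.0, 1.0, 2.0], "SM": [2.0, 3.0, 5.0]}
normalized = {
    "ST": normalize_total(joint_spot["ST"], target_sum=1e4),
    "SM": normalize_total(joint_spot["SM"], target_sum=1e4),
}
print(normalized)
```

Normalizing the two blocks separately keeps one modality's overall signal level from dominating the other in the joint object.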
- spatialmeta.pp.spatial_variable_joint_adata_sm_st(joint_adata: AnnDataJointSMST, n_top_genes: int = 2000, n_top_metabolites: int = 800, add_key: str = 'highly_variable_moranI', batch_key: str | None = None, min_samples: int = 2, min_frac: float = 0.8, min_logfc: float = 3)[source]
Calculate the spatial variables for the joint AnnData object and remove the batch-specific spatial variables.
- Parameters:
joint_adata – AnnDataJointSMST. The joint AnnData object with SM and ST data.
n_top_genes – int. The number of top genes, default is 2000.
n_top_metabolites – int. The number of top metabolites, default is 800.
add_key – str. The key for the spatial variables, default is “highly_variable_moranI”.
batch_key – Optional[str]. The batch key, default is None.
min_samples – int. The minimum number of samples, default is 2.
min_frac – float. The minimum fraction, default is 0.8.
min_logfc – float. The minimum log fold change, default is 3.
- spatialmeta.pp.spatial_variable(adata: AnnData | AnnDataSM | AnnDataST, *, layer: str | None = None, n_top_variable: int = 2000, add_key: str = 'highly_variable_moranI', batch_key: str | None = None, min_samples: int = 2, min_frac: float = 0.8, min_logfc: float = 3)[source]
Calculate the spatial variables and add the results to the AnnData object.
- Parameters:
adata – AnnData. The AnnData object.
layer – Optional[str]. The layer key, default is None.
n_top_variable – int. The number of top variables, default is 2000.
add_key – str. The key for the spatial variables, default is “highly_variable_moranI”.
batch_key – Optional[str]. The batch key, default is None.
min_samples – int. The minimum number of samples, default is 2.
min_frac – float. The minimum fraction, default is 0.8.
min_logfc – float. The minimum log fold change, default is 3.
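The default key “highly_variable_moranI” suggests that features are ranked by Moran's I, a standard measure of spatial autocorrelation. The sketch below computes the textbook statistic for one feature with a binary neighbor matrix; it is an assumption that spatialmeta uses this form, and the function `morans_i` and the toy neighbor graph are purely illustrative.

```python
from typing import List, Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Textbook Moran's I: (N / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    numerator = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denominator = sum(d * d for d in dev)
    return (n / w_sum) * (numerator / denominator)


# Four spots in a row; each spot neighbors the adjacent spots only.
chain: List[List[int]] = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit next to each other
alternating = morans_i([1, 5, 1, 5], chain)  # neighbors always differ
print(clustered, alternating)
```

Spatially clustered features score positive and alternating features score negative, which is why a high Moran's I is a natural criterion for spatially variable genes or metabolites.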
- spatialmeta.pp.rank_gene_and_metabolite_groups(adata: AnnData, var_object: str = 'type', use_raw: bool = False, groupby_ST: str = 'leiden', groupby_SM: str = 'leiden', key_added_ST: str = 'rank_genes_groups', key_added_SM: str = 'rank_metabolites_groups', **kwargs)[source]
Rank gene and metabolite groups and add the results to the AnnData object.
- Parameters:
adata – AnnData. The AnnData object.
var_object – str. The variable object, default is “type”.
use_raw – bool. The flag of using raw data, default is False.
groupby_ST – str. The groupby key for ST data, default is “leiden”.
groupby_SM – str. The groupby key for SM data, default is “leiden”.
key_added_ST – str. The key for the rank gene groups, default is “rank_genes_groups”.
key_added_SM – str. The key for the rank metabolite groups, default is “rank_metabolites_groups”.
**kwargs – dict. The keyword arguments for sc.tl.rank_genes_groups.
- spatialmeta.pp.corrcoef_stsm_inall(adata: AnnDataJointSMST, inputlist=None, list_type='gene', use_raw=True, ntop=10)[source]
Calculate the correlation coefficients between the ST and SM data in the joint AnnData object and add the results to the AnnData object uns as ‘corrcoef_stsm_inall_top’ and ‘corrcoef_stsm_inall’.
- Parameters:
adata – AnnDataJointSMST. The joint AnnData object with SM and ST data.
inputlist – Optional[list]. The input list, default is None.
list_type – str. The list type, default is “gene”.
use_raw – bool. The flag of using raw data, default is True.
ntop – int. The number of top genes or metabolites, default is 10.
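Conceptually, the function correlates each listed gene (or metabolite) with the features of the other modality across spots and keeps the ntop strongest partners. The sketch below uses the Pearson coefficient and ranks by signed value; both choices are assumptions, and `pearson` and `top_correlated` are hypothetical helpers rather than spatialmeta functions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def top_correlated(
    gene: Sequence[float], metabolites: Dict[str, Sequence[float]], ntop: int = 10
) -> List[Tuple[str, float]]:
    """The ntop metabolites most positively correlated with the gene across spots."""
    scores = {name: pearson(gene, values) for name, values in metabolites.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:ntop]


gene = [1.0, 2.0, 3.0, 4.0]  # expression of one gene in four spots
metabolites = {
    "met_a": [2.0, 4.0, 6.0, 8.0],
    "met_b": [4.0, 3.0, 2.0, 1.0],
    "met_c": [1.0, 3.0, 2.0, 4.0],
}
top = top_correlated(gene, metabolites, ntop=2)
print(top)
```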
- spatialmeta.pp.corrcoef_stsm_ingroup(adata: AnnDataJointSMST, inputlist=None, list_type='gene', groupby='leiden', use_raw=True, ntop=5)[source]
Calculate the correlation coefficients between the ST and SM data in the joint AnnData object in each group and add the results to the AnnData object uns as ‘corrcoef_stsm_ingroup_top’ and ‘corrcoef_stsm_ingroup’.
- Parameters:
adata – AnnDataJointSMST. The joint AnnData object with SM and ST data.
inputlist – Optional[list]. The input list, default is None.
list_type – str. The list type, default is “gene”.
groupby – str. The groupby key, default is “leiden”.
use_raw – bool. The flag of using raw data, default is True.
ntop – int. The number of top genes or metabolites, default is 5.
- spatialmeta.pp.spatial_distance_cluster(adata: AnnData, groupby: str = 'leiden', spatial_key: str = 'spatial', metric: str = 'euclidean', use_raw: bool = False, **kwargs) DataFrame[source]
Calculate the spatial distance between clusters.
- Parameters:
adata – AnnData. The AnnData object.
groupby – str. The groupby key, default is “leiden”.
spatial_key – str. The spatial key, default is “spatial”.
metric – str. The metric, default is “euclidean”.
use_raw – bool. The flag of using raw data, default is False.
- Returns:
pd.DataFrame. The spatial distance between clusters.
- spatialmeta.pp.calculate_dot_df(adata: AnnData, groupby: str = 'leiden', spatial_key: str = 'spatial', use_raw: bool = False, **kwargs) DataFrame[source]
Calculate the dot dataframe.
- Parameters:
adata – AnnData. The AnnData object.
groupby – str. The groupby key, default is “leiden”.
spatial_key – str. The spatial key, default is “spatial”.
use_raw – bool. The flag of using raw data, default is False.
- Returns:
pd.DataFrame. The dot dataframe.
- spatialmeta.pp.metabolite_annotation(adata_SM: AnnDataSM, adduct_type: str, adduct_method: str, tolerance_ppm: float = 5, inplace: bool = True) DataFrame[source]
Annotate metabolites based on the m/z values.
- Parameters:
adata_SM – AnnDataSM. The AnnDataSM object.
adduct_type – str. The adduct type.
adduct_method – str. The adduct method, ‘add’ or ‘sub’.
tolerance_ppm – float. The tolerance in ppm, default is 5.
inplace – bool. Whether to modify the AnnDataSM object in place, default is True.
- Returns:
pd.DataFrame. The annotated metabolites.
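Annotation by m/z usually means comparing each observed peak with the theoretical m/z of candidate metabolites after accounting for the adduct, and accepting matches within a ppm tolerance. The sketch below illustrates that idea under stated assumptions: singly charged ions, ‘add’ meaning the adduct mass is added to the neutral mass and ‘sub’ meaning it is subtracted, and a two-entry candidate table. The helper names are hypothetical and the library's reference database and matching rules are not shown.

```python
from typing import Dict, List

PROTON_MASS = 1.007276  # Da; used here as the adduct mass for a protonated ion


def theoretical_mz(neutral_mass: float, adduct_mass: float, adduct_method: str) -> float:
    """Theoretical m/z of a singly charged ion (assumed meaning of 'add' / 'sub')."""
    if adduct_method == "add":
        return neutral_mass + adduct_mass
    if adduct_method == "sub":
        return neutral_mass - adduct_mass
    raise ValueError("adduct_method must be 'add' or 'sub'")


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6


def annotate(
    observed_mz: float,
    candidates: Dict[str, float],
    adduct_mass: float,
    adduct_method: str,
    tolerance_ppm: float = 5,
) -> List[str]:
    """Names of candidates whose theoretical m/z lies within tolerance_ppm of the peak."""
    return [
        name
        for name, mass in candidates.items()
        if ppm_error(observed_mz, theoretical_mz(mass, adduct_mass, adduct_method))
        <= tolerance_ppm
    ]


# Monoisotopic neutral masses in Da (illustrative candidate table).
candidates = {"glucose": 180.06339, "lactic acid": 90.03169}
hits = annotate(181.07070, candidates, PROTON_MASS, "add")
print(hits)
```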
- spatialmeta.pp.merge_and_assign_var_data(joint_adata: AnnDataJointSMST, var_anno_df: DataFrame, columns_to_assign: List[str])[source]
Merge the var_anno_df with joint_adata.var and joint_adata.raw.var and assign the columns to joint_adata.var and joint_adata.raw.var.
- Parameters:
joint_adata – AnnDataJointSMST. The joint AnnData object with SM and ST data.
var_anno_df – pd.DataFrame. The var annotation dataframe.
columns_to_assign – List[str]. The columns to assign.
- Returns:
None.
- spatialmeta.pp.calculate_metabolite_enrichment(metabolite_list: List[str], cutoff: float = 0.05, type: str = 'sub_class') DataFrame[source]
Calculate the metabolite enrichment.
- Parameters:
metabolite_list – List[str]. The list of metabolites.
cutoff – float. The cutoff, default is 0.05.
type – str. The type, default is “sub_class”.
- Returns:
pd.DataFrame. The metabolite enrichment results.
- spatialmeta.pp.add_obs_to_adata(object_adata: AnnData, adata: AnnData, obs_key: str)[source]
Add obs to object_adata from adata.
- Parameters:
object_adata – AnnData. The AnnData object.
adata – AnnData. The AnnData object.
obs_key – str, the key of obs to be added.
- spatialmeta.pp.add_hvf_to_jointadata(joint_adata: AnnDataJointSMST, adata_SM: AnnDataSM, adata_ST: AnnDataST, hvf_key_SM: str = 'highly_variable_moranI', hvf_key_ST: str = 'highly_variable_moranI', hvf_key_joint: str = 'highly_variable_moranI')[source]
Add highly variable features to joint_adata.
- Parameters:
joint_adata – AnnDataJointSMST. The AnnDataJointSMST object.
adata_SM – AnnDataSM. The AnnDataSM object.
adata_ST – AnnDataST. The AnnDataST object.
hvf_key_SM – str, the key of highly variable features in adata_SM.var, default “highly_variable_moranI”.
hvf_key_ST – str, the key of highly variable features in adata_ST.var, default “highly_variable_moranI”.
hvf_key_joint – str, the key of highly variable features in joint_adata.var, default “highly_variable_moranI”.
- spatialmeta.pp.calculate_scale_factor(adata_SM: AnnDataSM, adata_ST: AnnDataST, spatial_key_SM: str = 'spatial', spatial_key_ST: str = 'spatial') Tuple[float, float][source]
Calculate the scaling factor between SM and ST data.
- Parameters:
adata_SM – AnnDataSM. The AnnData object with SM data.
adata_ST – AnnDataST. The AnnData object with ST data.
spatial_key_SM – str. The spatial key for SM data, default is “spatial”.
spatial_key_ST – str. The spatial key for ST data, default is “spatial”.
- Returns:
Tuple[float, float]. The scaling factor for width and height.
- spatialmeta.pp.spot_transform_by_manual(adata: AnnData, horizontal_flip: bool = False, vertical_flip: bool = False, rotation: int | None = None, scale_width: float = 1, scale_height: float = 1, translation_x: float | None = None, translation_y: float | None = None, spatial_key_SM: str = 'spatial', new_spatial_key_SM: str = 'new1_spatial')[source]
Manually transform the spatial coordinates of SM data.
- Parameters:
adata – AnnData. The AnnData object.
horizontal_flip – bool. If True, flip the spatial coordinates horizontally, default is False.
vertical_flip – bool. If True, flip the spatial coordinates vertically, default is False.
rotation – Optional[int]. The rotation angle; if not None, rotate the spatial coordinates, default is None.
scale_width – float. The scaling factor for width, default is 1.
scale_height – float. The scaling factor for height, default is 1.
translation_x – Optional[float]. The translation along the x coordinate, default is None.
translation_y – Optional[float]. The translation along the y coordinate, default is None.
spatial_key_SM – str. The spatial key for SM data, default is “spatial”.
new_spatial_key_SM – str. The new spatial key for SM data, default is “new1_spatial”.
- Returns:
AnnDataSM. The AnnData object with transformed SM data.
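The effect of these parameters on a set of coordinates can be sketched with plain tuples. The order of operations (flip, rotate, scale, translate), flipping and rotating about the origin, a counter-clockwise rotation in degrees, and the helper name `transform_spots` are all assumptions made for illustration; the library may apply the steps differently.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def transform_spots(
    coords: Sequence[Point],
    horizontal_flip: bool = False,
    vertical_flip: bool = False,
    rotation: Optional[float] = None,
    scale_width: float = 1.0,
    scale_height: float = 1.0,
    translation_x: Optional[float] = None,
    translation_y: Optional[float] = None,
) -> List[Point]:
    """Apply flip, rotation, scaling and translation, in that (assumed) order."""
    transformed = []
    for x, y in coords:
        if horizontal_flip:
            x = -x  # mirror across the y axis
        if vertical_flip:
            y = -y  # mirror across the x axis
        if rotation is not None:
            theta = math.radians(rotation)  # assumed: degrees, counter-clockwise
            x, y = (
                x * math.cos(theta) - y * math.sin(theta),
                x * math.sin(theta) + y * math.cos(theta),
            )
        x, y = x * scale_width, y * scale_height
        x += translation_x or 0.0
        y += translation_y or 0.0
        transformed.append((x, y))
    return transformed


moved = transform_spots(
    [(1.0, 0.0)], rotation=90, scale_width=2.0, scale_height=2.0, translation_x=10.0
)
print(moved)
```

In practice the scale factors would typically come from calculate_scale_factor, with flips, rotation and translation chosen by visual inspection of the two spot layouts.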
Model
Alignment Model for ST and SM
This class is designed to align spatial transcriptomics (ST) and spatial metabolomics (SM) data. First, the model learns a separate latent space for ST and SM using two independent variational autoencoders (VAEs). Then, the model learns (1) an affine matrix A to align the coordinates of ST and SM data, (2) (optional) a diffeomorphic transformation matrix V to align the spatial coordinates of ST and SM data, (3) (optional) a linear transformation matrix W to align the latent spaces of ST and SM data, and (4) (optional) a linear transformation matrix V to align the metabolite latent space to the histology image features.
- class spatialmeta.model.AlignmentModule(*, adata_st: ~spatialmeta.util._classes.AnnDataST, adata_sm: ~spatialmeta.util._classes.AnnDataSM, hidden_stacks: ~typing.List[int] = [128], n_latent: int = 64, bias: bool = True, use_batch_norm: bool = True, use_layer_norm: bool = False, dropout_rate: float = 0.1, activation_fn: ~typing.Callable = <class 'torch.nn.modules.activation.ReLU'>, device: str | ~torch.device = 'cpu', batch_embedding: ~typing.Literal['embedding', 'onehot'] = 'onehot', encode_libsize: bool = False, batch_hidden_dim: int = 8, reconstruction_method_st: ~typing.Literal['mse', 'zg', 'zinb'] = 'zinb', reconstruction_method_sm: ~typing.Literal['mse', 'zg', 'g'] = 'g')[source]
AlignmentModule is a class for aligning spatial transcriptomics and metabolomics datasets.
- Parameters:
adata_st – AnnDataST. The spatial transcriptomics dataset.
adata_sm – AnnDataSM. The spatial metabolomics dataset.
hidden_stacks – List[int]. The hidden layer sizes of the encoder and decoder.
n_latent – int. The latent dimension.
bias – bool. If True, use bias in the linear layers.
use_batch_norm – bool. If True, use batch normalization.
use_layer_norm – bool. If True, use layer normalization.
dropout_rate – float. The dropout rate.
activation_fn – Callable. The activation function.
device – Union[str, torch.device]. The device to run the model.
batch_embedding – Literal[“embedding”, “onehot”]. The batch embedding method.
encode_libsize – bool. If True, encode the library size.
batch_hidden_dim – int. The batch hidden dimension.
reconstruction_method_st – Literal[‘mse’, ‘zg’, ‘zinb’]. The reconstruction method for the spatial transcriptomics dataset.
reconstruction_method_sm – Literal[‘mse’, ‘zg’, ‘g’]. The reconstruction method for the spatial metabolomics dataset.
- spatialmeta.model.AlignmentModule.fit_vae(self, max_epoch: int = 30, n_per_batch: int = 128, kl_weight: float = 2.0, n_epochs_kl_warmup: int | None = 400, optimizer_parameters: Iterable | None = None, weight_decay: float = 1e-06, lr: bool = 5e-05, random_seed: int = 12, validation_split: float = 0.1)
Fit the two VAE models independently for the spatial transcriptomics and metabolomics datasets.
- Parameters:
max_epoch – int. The maximum number of epochs.
n_per_batch – int. The number of samples per batch.
kl_weight – float. The weight of the KL divergence loss.
n_epochs_kl_warmup – Union[int, None]. The number of epochs for KL divergence warmup.
optimizer_parameters – Iterable. The optimizer parameters.
weight_decay – float. The weight decay.
lr – float. The learning rate.
random_seed – int. The random seed.
validation_split – float. The validation split.
- Returns:
Dict. The loss record.
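KL warmup gradually increases the weight of the KL term during early training. A common choice, assumed here and not confirmed by the documentation, is a linear ramp from 0 to kl_weight over n_epochs_kl_warmup epochs; `kl_weight_at` is a hypothetical helper.

```python
from typing import Optional


def kl_weight_at(
    epoch: int, kl_weight: float = 2.0, n_epochs_kl_warmup: Optional[int] = 400
) -> float:
    """Effective KL weight at a given epoch under a linear warmup (assumed schedule)."""
    if not n_epochs_kl_warmup:
        # None or 0 disables warmup: the full weight applies from the first epoch.
        return kl_weight
    return kl_weight * min(1.0, epoch / n_epochs_kl_warmup)


for epoch in (0, 30, 200, 400):
    print(epoch, kl_weight_at(epoch))
```

Under this assumed schedule, the defaults max_epoch=30 and n_epochs_kl_warmup=400 would leave the effective weight at 0.15 by the end of training, so the two parameters are worth setting together.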
- spatialmeta.model.AlignmentModule.fit_alignment(self, data: dict, initial_scale: bool | None = None, a: float = 50.0, p: float = 2.0, expand: float = 2.0, nt: float = 3, niter: int = 500, diffeo_start: float = 0, diffeo: bool = False, epV: float = 0.2, sigmaM: float = 1.0, sigmaR: float = 500000.0, align_sm_spot_to: Literal['histology', 'ST'] = 'histology', align_spot_outline: bool = True, align_sm_feature_to_st_feature: bool = False, align_sm_feature_to_histology_feature: bool = False, debug_path: str | None = None)
Fit the alignment model for spatial transcriptomics (ST) features and spatial metabolomics (SM) features by computing the affine matrix A with stochastic gradient descent. See the original implementation at https://github.com/JEFworks-Lab/STalign
- Parameters:
data – dict. The data dictionary containing features of ST and SM to be aligned.
initial_scale – Optional[bool]. The initial scale for the image, default is None.
a – float. Smoothness scale of velocity field.
p – float. Power of Laplacian in velocity regularization.
expand – float. The expansion factor.
nt – float. Number of timesteps for integrating velocity field.
niter – int. The number of iterations.
diffeo_start – float. The starting step of diffeomorphism.
epV – float. Gradient descent step size for the velocity field. The default is set to a small value (2e-1), compared with the original implementation, to avoid divergence, so the user may need to adjust this value to allow the velocity field to converge.
sigmaM – float. Standard deviation of image matching term for Gaussian mixture modeling in cost function.
sigmaR – float. Standard deviation of regularization term for Gaussian mixture modeling in cost function.
align_sm_spot_to – Literal[‘histology’, ‘ST’]. Whether to align the SM spots to spots sampled from the histology image or to the ST spots. Should be either ‘histology’ or ‘ST’.
align_spot_outline – bool. Whether to align the spot outline or all spots.
align_sm_feature_to_histology_feature – bool. Whether to align the SM feature to histology gray scaled feature.
align_sm_feature_to_st_feature – bool. Whether to align the ST feature with SM feature.
debug_path – Optional[str]. The optional temporary file path used to save intermediate alignment results.
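Once an affine matrix A has been learned, applying it to spot coordinates is a single matrix product in homogeneous coordinates. The sketch below assumes a 3×3 matrix acting on column vectors (x, y, 1); the row and column convention used by STalign or spatialmeta may differ, and `apply_affine` and the example matrix are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def apply_affine(A: Sequence[Sequence[float]], points: Sequence[Point]) -> List[Point]:
    """Map each (x, y) through a 3x3 affine matrix in homogeneous coordinates."""
    return [
        (
            A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2],
        )
        for x, y in points
    ]


# Example matrix: uniform scaling by 2 followed by a shift of (+5, -3).
A = [
    [2.0, 0.0, 5.0],
    [0.0, 2.0, -3.0],
    [0.0, 0.0, 1.0],
]
aligned = apply_affine(A, [(0.0, 0.0), (1.0, 1.0)])
print(aligned)
```

The optional diffeomorphic step would add a smooth, spatially varying displacement on top of this global linear map.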
Integration Model for ST and SM
This class is designed to handle vertical and horizontal integration of spatial transcriptomics (ST) and spatial metabolomics (SM) data. The model learns a shared latent space to predict spatial sub-clusters characterized by unique transcriptional and metabolic states.
- spatialmeta.model.ConditionalVAESTSM(*args, **kwargs)[source]
This class implements a Conditional Variational Autoencoder with Mixture of Experts (MoE) for vertical and horizontal integration of ST and SM.
- Parameters:
adata – AnnDataJointSMST object containing the spatial multi-omics data.
hidden_stacks – List of integers specifying the number of hidden units in each stack of the encoder and decoder, default is [128].
batch_keys – Optional list of strings specifying the batch keys for batch correction.
n_latent – Integer specifying the dimensionality of the latent space, default is 10.
bias – Boolean indicating whether to include bias terms in the linear layers, default is True.
use_batch_norm – Boolean indicating whether to use batch normalization in the linear layers, default is True.
use_layer_norm – Boolean indicating whether to use layer normalization in the linear layers, default is False.
dropout_rate – Float specifying the dropout rate for the linear layers, default is 0.1.
activation_fn – Callable specifying the activation function to use in the linear layers, default is nn.ReLU.
device – String or torch.device specifying the device to use for computation, default is “cpu”.
batch_embedding – Literal[“embedding”, “onehot”] specifying the type of batch embedding to use, default is “onehot”.
encode_libsize – Boolean indicating whether to encode library size information, default is False.
batch_hidden_dim – Integer specifying the dimensionality of the batch hidden layer, default is 8.
reconstruction_method_st – Literal[‘mse’, ‘zg’, ‘zinb’] specifying the reconstruction method for the spatial transcriptomics data, default is ‘zinb’.
reconstruction_method_sm – Literal[‘mse’, ‘zg’, ‘g’] specifying the reconstruction method for the spatial metabolomics data, default is ‘g’.
- spatialmeta.model.ConditionalVAESTSM.fit(self, max_epoch: int = 35, n_per_batch: int = 128, mode: Literal['single', 'multi'] | None = None, **kwargs)
Fits the model.
- Parameters:
max_epoch – Integer specifying the maximum number of epochs to train the model, default is 35.
n_per_batch – Integer specifying the number of samples per batch, default is 128.
mode – Optional string specifying the mode of training. Can be either ‘single’ or ‘multi’, default is None.
reconstruction_reduction – String specifying the reduction method for the reconstruction loss, default is ‘sum’.
kl_weight – Float specifying the weight of the KL divergence loss, default is 1.
reconstruction_st_weight – Float specifying the weight of the reconstruction loss for spatial transcriptomics, default is 1.
reconstruction_sm_weight – Float specifying the weight of the reconstruction loss for spatial metabolomics, default is 1.
reconstruction_st_corr_weight – Float specifying the weight of the correlation reconstruction loss for spatial transcriptomics, default is 1.
reconstruction_sm_corr_weight – Float specifying the weight of the correlation reconstruction loss for spatial metabolomics, default is 1.
n_epochs_kl_warmup – Integer specifying the number of epochs for KL divergence warmup, default is 400.
optimizer_parameters – Iterable specifying the parameters for the optimizer, default is None.
weight_decay – Float specifying the weight decay for the optimizer, default is 1e-6.
lr – Float specifying the learning rate for the optimizer.
random_seed – Integer specifying the random seed, default is 12.
kl_loss_reduction – String specifying the reduction method for the KL divergence loss, default is ‘mean’.
mmd_weight – Float specifying the weight of the MMD loss, default is 1.
- Returns:
Dictionary containing the training loss values.
- spatialmeta.model.ConditionalVAESTSM.get_latent_embedding(self, latent_key: Literal['z', 'q_mu'] = 'q_mu', n_per_batch: int = 128, show_progress: bool = True) ndarray
Get the latent embedding of the data.
- Parameters:
latent_key – String specifying the key of the latent variable to return, default is “q_mu”.
n_per_batch – Integer specifying the number of samples per batch, default is 128.
show_progress – Boolean indicating whether to show the progress bar, default is True.
- Returns:
Numpy array containing the latent embedding.
- spatialmeta.model.ConditionalVAESTSM.get_normalized_expression(self, latent_key: Literal['z', 'q_mu'] = 'q_mu', n_per_batch: int = 128, show_progress: bool = True) ndarray
Get the normalized expression of the data.
- Parameters:
latent_key – String specifying the key of the latent variable to return, default is “q_mu”.
n_per_batch – Integer specifying the number of samples per batch, default is 128.
show_progress – Boolean indicating whether to show the progress bar, default is True.
- Returns:
Numpy array containing the normalized expression.
- spatialmeta.model.ConditionalVAESTSM.get_modality_contribution(self, latent_key: Literal['z', 'q_mu'] = 'q_mu')
Get the contribution of each modality to the joint latent space.
- Parameters:
latent_key – Which latent representation to use, either “z” for the sampled latent or “q_mu” for the mean of the latent distribution
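The difference between the two latent_key options can be sketched with the standard VAE reparameterization, where the sampled latent is the mean plus scaled Gaussian noise. That spatialmeta samples “z” exactly this way is an assumption; `sample_latent` and the example values are hypothetical.

```python
import random
from typing import List, Sequence


def sample_latent(
    q_mu: Sequence[float], q_sigma: Sequence[float], rng: random.Random
) -> List[float]:
    """Reparameterized sample z = q_mu + q_sigma * eps with eps ~ N(0, 1)."""
    return [mu + sigma * rng.gauss(0.0, 1.0) for mu, sigma in zip(q_mu, q_sigma)]


rng = random.Random(12)
q_mu = [0.5, -1.0]     # posterior mean: deterministic for a given input
q_sigma = [0.1, 0.2]   # posterior standard deviation
z = sample_latent(q_mu, q_sigma, rng)  # stochastic: changes with the random draw
print(q_mu, z)
```

Because “q_mu” is deterministic, it is usually the more reproducible choice for downstream clustering or visualization, while “z” reflects the uncertainty of the posterior.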
Integration Model for SM Only
This class is designed to handle spatial metabolomics (SM) data. The model learns a shared latent space to predict spatial sub-clusters characterized by unique metabolic states.
- spatialmeta.model.ConditionalVAESM(adata: ~anndata._core.anndata.AnnData, hidden_stacks: ~typing.List[int] = [128], batch_keys: ~typing.List[str] | None = None, n_latent: int = 10, bias: bool = True, use_batch_norm: bool = True, use_layer_norm: bool = False, dropout_rate: float = 0.1, activation_fn: ~typing.Callable = <class 'torch.nn.modules.activation.ReLU'>, device: str | ~torch.device = 'cpu', batch_embedding: ~typing.Literal['embedding', 'onehot'] = 'onehot', encode_libsize: bool = False, batch_hidden_dim: int = 8, reconstruction_method: ~typing.Literal['mse', 'zg', 'zinb', 'g'] = 'g')[source]
This class implements a Conditional Variational Autoencoder (CVAE) for vertical integration of SM data.
- Parameters:
adata – AnnDataSM object containing the SM data.
hidden_stacks – List of integers specifying the number of hidden units in each stack of the encoder and decoder, default is [128].
batch_keys – Optional list of strings specifying the batch keys for batch correction.
n_latent – Integer specifying the dimensionality of the latent space, default is 10.
bias – Boolean indicating whether to include bias terms in the linear layers, default is True.
use_batch_norm – Boolean indicating whether to use batch normalization in the linear layers, default is True.
use_layer_norm – Boolean indicating whether to use layer normalization in the linear layers, default is False.
dropout_rate – Float specifying the dropout rate for the linear layers, default is 0.1.
activation_fn – Callable specifying the activation function to use in the linear layers, default is nn.ReLU.
device – String or torch.device specifying the device to use for computation, default is “cpu”.
batch_embedding – Literal[“embedding”, “onehot”] specifying the type of batch embedding to use, default is “onehot”.
encode_libsize – Boolean indicating whether to encode library size information, default is False.
batch_hidden_dim – Integer specifying the dimensionality of the batch hidden layer, default is 8.
reconstruction_method – Literal[‘mse’, ‘zg’, ‘zinb’, ‘g’] specifying the reconstruction method, default is ‘g’. mse is mean squared error, zg is zero-inflated Gaussian, and zinb is zero-inflated negative binomial.
return: Dictionary containing the training loss values.
- spatialmeta.model.ConditionalVAESM.fit(self, max_epoch: int = 35, n_per_batch: int = 128, reconstruction_reduction: str = 'sum', kl_weight: float = 15.0, reconstruction_weight: float = 1.0, n_epochs_kl_warmup: int | None = 0, optimizer_parameters: Iterable | None = None, weight_decay: float = 1e-06, lr: bool = 5e-05, random_seed: int = 12, kl_loss_reduction: str = 'mean', mmd_weight: float = 1.0)
Fits the model.
- Parameters:
max_epoch – Integer specifying the maximum number of epochs to train the model, default is 35.
n_per_batch – Integer specifying the number of samples per batch, default is 128.
reconstruction_reduction – String specifying the reduction method for the reconstruction loss, default is ‘sum’.
kl_weight – Float specifying the weight of the KL divergence loss, default is 15.
reconstruction_weight – Float specifying the weight of the reconstruction loss, default is 1.
n_epochs_kl_warmup – Integer specifying the number of epochs for KL divergence warmup, default is 0.
optimizer_parameters – Iterable specifying the parameters for the optimizer, default is None.
weight_decay – Float specifying the weight decay for the optimizer, default is 1e-6.
lr – Float specifying the learning rate for the optimizer.
random_seed – Integer specifying the random seed, default is 12.
kl_loss_reduction – String specifying the reduction method for the KL divergence loss, default is ‘mean’.
- Returns:
Dictionary containing the training loss values.
- spatialmeta.model.ConditionalVAESM.get_latent_embedding(self, latent_key: Literal['z', 'q_mu'] = 'q_mu', n_per_batch: int = 128, show_progress: bool = True) ndarray
Get the latent embedding of the data.
- Parameters:
latent_key – String specifying the key of the latent variable to return, default is “q_mu”.
n_per_batch – Integer specifying the number of samples per batch, default is 128.
show_progress – Boolean indicating whether to show the progress bar, default is True.
- Returns:
Numpy array containing the latent embedding.
- spatialmeta.model.ConditionalVAESM.get_normalized_expression(self, latent_key: Literal['z', 'q_mu'] = 'q_mu', n_per_batch: int = 128, show_progress: bool = True) ndarray
Get the normalized expression of the data.
- Parameters:
latent_key – String specifying the key of the latent variable to return, default is “q_mu”.
n_per_batch – Integer specifying the number of samples per batch, default is 128.
show_progress – Boolean indicating whether to show the progress bar, default is True.
- Returns:
Numpy array containing the normalized expression.
Plotting
- spatialmeta.pl.make_colormap(colors: list, show_palette: bool = False)[source]
Make a colormap from a list of colors.
- Parameters:
colors – list of colors to use in the colormap.
show_palette – whether to display the colormap as a palette, default is False.
- Returns:
LinearSegmentedColormap object.
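A linear segmented colormap interpolates between anchor colors. The conceptual sketch below reproduces that idea in plain Python for evenly spaced hex colors; hex_to_rgb and interpolate_colors are illustrative names only, and the function itself returns a matplotlib LinearSegmentedColormap rather than using this code.

```python
from typing import List, Tuple


def hex_to_rgb(color: str) -> Tuple[float, ...]:
    """Convert '#RRGGBB' to an (r, g, b) tuple with components in [0, 1]."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) / 255 for i in (0, 2, 4))


def interpolate_colors(colors: List[str], value: float) -> Tuple[float, ...]:
    """Map a value in [0, 1] to a color by linear interpolation between anchors."""
    if len(colors) < 2:
        raise ValueError("at least two colors are required")
    value = min(1.0, max(0.0, value))  # clamp to the valid range
    rgb = [hex_to_rgb(c) for c in colors]
    position = value * (len(rgb) - 1)        # position along the anchor list
    lower = min(int(position), len(rgb) - 2)  # index of the left anchor
    t = position - lower                      # blend factor between the two anchors
    return tuple(a + (b - a) * t for a, b in zip(rgb[lower], rgb[lower + 1]))
```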
- spatialmeta.pl.create_fig(figsize: tuple = (8, 4))[source]
Create a figure with the specified size and axis properties.
- Parameters:
figsize – tuple specifying the size of the figure, default is (8, 4).
- Returns:
figure and axis objects.
- spatialmeta.pl.create_subplots(nrow: int, ncol: int, figsize: tuple = (8, 4))[source]
Create a figure with a grid of subplots, using the specified size and axis properties.
- Parameters:
nrow – number of rows in the subplot.
ncol – number of columns in the subplot.
figsize – tuple specifying the size of the figure, default is (8, 4).
- Returns:
figure and axis objects.
- spatialmeta.pl.plot_spot_sm_st(adata_SM: AnnDataSM, adata_ST: AnnDataST, SM_spatial_key: str = 'spatial', ST_spatial_key: str = 'spatial', stacked: bool = False, ST_color: str = '#C7C8CC', SM_color: str = '#C499BA', s: int = 10, **kwargs)[source]
Plot the spatial distribution of spots from Spatial Transcriptomics and Spatial Metabolomics data.
- Parameters:
adata_SM – AnnData object containing the Spatial Metabolomics data.
adata_ST – AnnData object containing the Spatial Transcriptomics data.
SM_spatial_key – key in adata_SM.obsm where the spatial coordinates are stored, default is “spatial”.
ST_spatial_key – key in adata_ST.obsm where the spatial coordinates are stored, default is “spatial”.
stacked – whether to overlay both datasets in a single plot (True) or plot them side by side (False), default is False.
ST_color – color to use for the Spatial Transcriptomics data, default is “#C7C8CC”.
SM_color – color to use for the Spatial Metabolomics data, default is “#C499BA”.
s – size of the scatter points, default is 10.
kwargs – additional keyword arguments to pass to the scatter function.
- Returns:
None.
- spatialmeta.pl.plot_newdot_sm_st(new_dot_in_df: DataFrame, adata_SM: AnnDataSM, adata_ST: AnnDataST, ST_spatial_key: str = 'spatial', SM_spatial_key: str = 'spatial', ST_color: str = '#C7C8CC', SM_color: str = '#C499BA', new_dot_color: str = '#FF0000')[source]
Plot the spatial distribution of new spots in the Spatial Metabolomics and Spatial Transcriptomics data.
- Parameters:
new_dot_in_df – DataFrame containing the spatial coordinates of the new spots.
adata_SM – AnnData object containing the Spatial Metabolomics data.
adata_ST – AnnData object containing the Spatial Transcriptomics data.
SM_spatial_key – key in adata_SM.obsm where the spatial coordinates are stored, default is “spatial”.
ST_spatial_key – key in adata_ST.obsm where the spatial coordinates are stored, default is “spatial”.
ST_color – color to use for the Spatial Transcriptomics data, default is “#C7C8CC”.
SM_color – color to use for the Spatial Metabolomics data, default is “#C499BA”.
new_dot_color – color to use for the new spots, default is “#FF0000”.
- Returns:
None.
- spatialmeta.pl.plot_markerfeature(adata: AnnData, groupby: str, palette: dict, marker_feature_list: list, figsize: tuple = (10, 4), logfoldchanges_max: int = 10, logfoldchanges_min: int = -2, uns_key: str = 'rank_genes_groups', save_path: str | None = None)[source]
Create a scatter plot of the marker features in the AnnData object.
- Parameters:
adata – AnnData object containing the data.
groupby – key in adata.obs to use for grouping the data.
palette – dictionary containing the colors to use for each group.
marker_feature_list – list of marker features to highlight.
figsize – tuple specifying the size of the figure, default is (10, 4).
logfoldchanges_max – maximum logfoldchanges value to consider, default is 10.
logfoldchanges_min – minimum logfoldchanges value to consider, default is -2.
uns_key – key in adata.uns where the data is stored, default is ‘rank_genes_groups’.
save_path – path to save the figure, default is None.
- Returns:
None.
- spatialmeta.pl.plot_marker_gene_metabolite(adata, groupby: str, palette: dict, ST_marker_feature_list: list, SM_marker_feature_list: list, figsize: tuple = (10, 4), save_path: str | None = None, logfoldchanges_max_ST: int = 10, logfoldchanges_max_SM: int = 10, key_ST: str = 'rank_genes_groups', key_SM: str = 'rank_metabolites_groups')[source]
Create a scatter plot of the marker genes and metabolites in the AnnData object.
- Parameters:
adata – AnnDataJointSMST object containing the ST and SM data.
groupby – key in adata.obs to use for grouping the data.
palette – dictionary containing the colors to use for each group.
ST_marker_feature_list – list of marker genes to highlight.
SM_marker_feature_list – list of marker metabolites to highlight.
figsize – tuple specifying the size of the figure, default is (10, 4).
save_path – path to save the figure, default is None.
logfoldchanges_max_ST – maximum logfoldchanges value to consider for ST, default is 10.
logfoldchanges_max_SM – maximum logfoldchanges value to consider for SM, default is 10.
key_ST – key in adata.uns where the ST data is stored, default is ‘rank_genes_groups’.
key_SM – key in adata.uns where the SM data is stored, default is ‘rank_metabolites_groups’.
- Returns:
None.
- spatialmeta.pl.plot_corrcoef_stsm_inall(adata: AnnDataJointSMST, row_cluster: bool = True, col_cluster: bool = True, figsize: tuple = (10, 10))[source]
Create a clustermap of the correlation coefficients between genes and metabolites.
- Parameters:
adata – AnnDataJointSMST object containing the ST and SM data.
row_cluster – whether to cluster the rows, default is True.
col_cluster – whether to cluster the columns, default is True.
figsize – tuple specifying the size of the figure, default is (10, 10).
- Returns:
None.
- spatialmeta.pl.plot_corrcoef_stsm_ingroup(adata: AnnDataJointSMST, row_cluster: bool = True, cluster: str = 'cluster_0', col_cluster: bool = True, figsize: tuple = (10, 10))[source]
Create a clustermap of the correlation coefficients between genes and metabolites in a specific cluster.
- Parameters:
adata – AnnDataJointSMST object containing the ST and SM data.
row_cluster – whether to cluster the rows, default is True.
cluster – cluster to consider, default is ‘cluster_0’.
col_cluster – whether to cluster the columns, default is True.
figsize – tuple specifying the size of the figure, default is (10, 10).
- Returns:
None.
- spatialmeta.pl.plot_spatial_deconvolution(adata: AnnData, palettes: dict | None = None, min_spots: int = 10, img_alpha: float = 0.5, show_color_bar: bool = True, s: float = 10, vmin: float = 0.5, show: bool = False, valid_cell_types: list | None = None, ax=None, marker='H', use_pie_chart: bool = False, key_celltype_predict_proportions: str = 'celltype_predict_proportions', key_celltype_predict: str = 'celltype_predict')[source]
Create a spatial plot of the cell type predictions.
- Parameters:
adata – AnnData object containing the data.
palettes – dictionary containing the colors to use for each group, default is None.
min_spots – minimum number of spots to consider, default is 10.
img_alpha – alpha value for the image, default is 0.5.
show_color_bar – whether to show the color bar, default is True.
s – size of the spots, default is 10.
vmin – minimum value for the spots, default is 0.5.
show – whether to show the plot, default is False.
valid_cell_types – list of valid cell types, default is None.
ax – axis to use for the plot, default is None.
marker – marker to use for the spots, default is ‘H’.
use_pie_chart – whether to use a pie chart, default is False.
key_celltype_predict_proportions – key in adata.obs containing the cell type proportions, default is ‘celltype_predict_proportions’.
key_celltype_predict – key in adata.obs containing the cell type predictions, default is ‘celltype_predict’.
- Returns:
None.
- spatialmeta.pl.plot_gene_corrcoef_sm_ingroup(adata: AnnDataJointSMST, genename: str, groupby: str = 'leiden', use_raw: bool = True, ntop: int = 5)[source]
Create a dotplot of the correlation coefficients between a given gene and metabolites within each group.
- Parameters:
adata – AnnDataJointSMST object containing the ST and SM data.
genename – name of the gene.
groupby – key in adata.obs to use for grouping the data, default is ‘leiden’.
use_raw – whether to use the raw data, default is True.
ntop – number of top metabolites to consider, default is 5.
- Returns:
None.
- spatialmeta.pl.plot_metabolite_corrcoef_st_ingroup(adata: AnnDataJointSMST, metabolite: str, groupby: str = 'leiden', use_raw: bool = True, ntop: int = 5)[source]
Create a dotplot of the correlation coefficients between a given metabolite and genes within each group.
- Parameters:
adata – AnnDataJointSMST object containing the ST and SM data.
metabolite – name of the metabolite.
groupby – key in adata.obs to use for grouping the data, default is ‘leiden’.
use_raw – whether to use the raw data, default is True.
ntop – number of top genes to consider, default is 5.
- Returns:
None.
- spatialmeta.pl.plot_volcano_corrcoef_gene(adata: AnnDataJointSMST, metabolite: str, use_raw: bool = True, threshold: float = 0.25, nonmarker_size: int = 8, marker_size: int = 20, marker_alpha: float = 1, color_threshold: float = 0.25, color_above: str = '#D2649A', color_below: str = '#40A578', color_above_font: str = '#D2649A', color_below_font: str = '#40A578', fontsize: int = 8, color_neutral: str = 'grey', figsize: tuple = (6, 5), title: str = 'Volcano Plot for Correlation Coefficients', show: bool = True)[source]
Create a volcano plot of the correlation coefficients between genes and a given metabolite.
- Parameters:
adata – AnnDataJointSMST object containing the ST and SM data.
metabolite – name of the metabolite.
use_raw – whether to use the raw data, default is True.
threshold – threshold for the correlation coefficient, default is 0.25.
nonmarker_size – size of the non-marker, default is 8.
marker_size – size of the marker, default is 20.
marker_alpha – alpha value for the marker, default is 1.
color_threshold – threshold for the color, default is 0.25.
color_above – color for values above the threshold, default is ‘#D2649A’.
color_below – color for values below the threshold, default is ‘#40A578’.
color_above_font – font color for labels of values above the threshold, default is ‘#D2649A’.
color_below_font – font color for labels of values below the threshold, default is ‘#40A578’.
fontsize – font size for the labels, default is 8.
color_neutral – color for neutral values, default is ‘grey’.
figsize – tuple specifying the size of the figure, default is (6, 5).
title – title of the plot, default is ‘Volcano Plot for Correlation Coefficients’.
show – whether to show the plot, default is True.
- Returns:
None.
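The color parameters of the volcano plot can be read as a three-way classification of each correlation coefficient. The sketch below is a minimal plain-Python illustration of that reading. It assumes that "above" means a coefficient greater than +color_threshold and "below" means one less than -color_threshold; the source does not confirm this rule, and classify_correlation is a hypothetical helper.

```python
def classify_correlation(corr: float,
                         color_threshold: float = 0.25,
                         color_above: str = "#D2649A",
                         color_below: str = "#40A578",
                         color_neutral: str = "grey") -> str:
    """Pick a point color from a correlation coefficient (assumed symmetric rule)."""
    if corr > color_threshold:
        # Assumption: strong positive correlation.
        return color_above
    if corr < -color_threshold:
        # Assumption: strong negative correlation.
        return color_below
    # Everything inside [-threshold, +threshold] is treated as neutral.
    return color_neutral
```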
- spatialmeta.pl.plot_volcano_corrcoef_metabolite(adata: AnnDataJointSMST, gene: str, use_raw: bool = True, threshold: float = 0.25, nonmarker_size: int = 8, marker_size: int = 20, marker_alpha: float = 1, color_threshold: float = 0.25, color_above: str = '#D2649A', color_below: str = '#40A578', color_above_font: str = '#D2649A', color_below_font: str = '#40A578', fontsize: int = 8, color_neutral: str = 'grey', figsize: tuple = (6, 5), title: str = 'Volcano Plot for Correlation Coefficients', show: bool = True)[source]
Create a volcano plot of the correlation coefficients between metabolites and a given gene.
- Parameters:
adata – AnnDataJointSMST object containing the ST and SM data.
gene – name of the gene.
use_raw – whether to use the raw data, default is True.
threshold – threshold for the correlation coefficient, default is 0.25.
nonmarker_size – size of the non-marker, default is 8.
marker_size – size of the marker, default is 20.
marker_alpha – alpha value for the marker, default is 1.
color_threshold – threshold for the color, default is 0.25.
color_above – color for values above the threshold, default is ‘#D2649A’.
color_below – color for values below the threshold, default is ‘#40A578’.
color_above_font – font color for labels of values above the threshold, default is ‘#D2649A’.
color_below_font – font color for labels of values below the threshold, default is ‘#40A578’.
fontsize – font size for the labels, default is 8.
color_neutral – color for neutral values, default is ‘grey’.
figsize – tuple specifying the size of the figure, default is (6, 5).
title – title of the plot, default is ‘Volcano Plot for Correlation Coefficients’.
show – whether to show the plot, default is True.
- Returns:
None.
- spatialmeta.pl.plot_trajectory_with_arrows(adata: AnnData, path_key: str = 'trajectory_1', img_key: str = 'scaledres', color: str = 'trajectory_1', fig=None, ax=None, arrow_head_width: int = 15, arrow_width: float = 0.05, show: bool = False, **kwargs)[source]
Create a spatial plot of the trajectory with arrows.
- Parameters:
adata – AnnData object containing the data.
path_key – key in adata.uns containing the path, default is ‘trajectory_1’.
img_key – key in adata.uns containing the image, default is ‘scaledres’.
color – key in adata.obs containing the color, default is ‘trajectory_1’.
fig – figure to use for the plot, default is None.
ax – axis to use for the plot, default is None.
arrow_head_width – width of the arrow head, default is 15.
arrow_width – width of the arrow, default is 0.05.
show – whether to show the plot, default is False.
- Returns:
None.
- spatialmeta.pl.plot_clustermap_with_smoothing(adata: AnnData, window_size: int = 200, cmap: str = 'vlag', feature_top: int = 10, key: str = 'rank_genes_groups', save_path: str | None = None, figsize: tuple = (10, 10), return_data: bool = False, **kwargs)[source]
Create a clustermap of the top features with smoothing applied.
- Parameters:
adata – AnnData object containing the data.
window_size – size of the window for smoothing, default is 200.
cmap – colormap to use for the clustermap, default is ‘vlag’.
feature_top – number of top features to consider, default is 10.
key – key in adata.uns to use for the features, default is ‘rank_genes_groups’.
save_path – path to save the plot, default is None.
figsize – tuple specifying the size of the figure, default is (10, 10).
return_data – whether to return the data, default is False.
- Returns:
None.
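The window_size parameter controls how strongly values are smoothed before clustering. As a rough illustration, the sketch below applies a trailing moving average in plain Python. The choice of a trailing window and the name moving_average are assumptions for this example; the smoothing actually used by the function may differ.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window_size: int = 200) -> List[float]:
    """Trailing moving average; early positions average over the values seen so far."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    smoothed: List[float] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window_size:
            # Drop the value that just left the window.
            running -= values[i - window_size]
        count = min(i + 1, window_size)
        smoothed.append(running / count)
    return smoothed
```

Larger windows give smoother profiles but blur sharp local changes.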
- spatialmeta.pl.plot_features_trajectory(adata: AnnData, features: list, bins: int = 100, palette: dict | list | None = None, figsize: tuple = (10, 5), scale: bool = False, save_path: str | None = None)[source]
Create a line plot of the mean values of features along a trajectory.
- Parameters:
adata – AnnData object containing the data.
features – list of features to plot.
bins – number of bins to use for grouping the data, default is 100.
palette – dictionary or list of colors to use for the features, default is None.
figsize – tuple specifying the size of the figure, default is (10, 5).
scale – whether to scale the values, default is False.
save_path – path to save the plot, default is None.
- Returns:
None.
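Grouping data into bins along a trajectory and averaging within each bin can be sketched as follows. This is a plain-Python illustration under stated assumptions: positions stands for a hypothetical per-spot trajectory coordinate, bins are equal-width, and binned_means is not part of the library.

```python
from typing import List, Optional, Sequence


def binned_means(positions: Sequence[float], values: Sequence[float],
                 bins: int = 100) -> List[Optional[float]]:
    """Mean of values in equal-width bins over positions; empty bins give None."""
    if len(positions) != len(values):
        raise ValueError("positions and values must have the same length")
    if bins <= 0:
        raise ValueError("bins must be positive")
    if not positions:
        return [None] * bins
    lowest, highest = min(positions), max(positions)
    span = highest - lowest
    sums = [0.0] * bins
    counts = [0] * bins
    for position, value in zip(positions, values):
        # The maximum position is assigned to the last bin.
        index = 0 if span == 0 else min(int((position - lowest) / span * bins), bins - 1)
        sums[index] += value
        counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```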
- spatialmeta.pl.plot_network(adata: AnnData, groupby: str = 'leiden', spatial_key: str = 'spatial', use_raw: bool = False, palette: dict | None = None, edge_use_scale: bool = True, node_use_scale: bool = True, node_scale_factor: float = 10.0, edge_weight_threshold: float = 0.1, edge_scale_factor: float = 1.0, seed: int | None = None, top_n_neighbors: int = 5, show_weight: bool = False, show_labels: bool = True, node_min_size: float = 1.0, split_by: str | None = None, return_data: bool = False, show: bool = True, iterations: int = 50)[source]
Create a network plot of the data, generated from the spatial distances and spot sizes.
- Parameters:
adata – AnnData object containing the data.
groupby – key in adata.obs to use for grouping the data, default is ‘leiden’.
spatial_key – key in adata.obsm containing the spatial data, default is ‘spatial’.
use_raw – whether to use the raw data, default is False.
palette – dictionary containing the colors to use for each group, default is None.
edge_use_scale – whether to use scale for the edge, default is True.
node_use_scale – whether to use scale for the node, default is True.
node_scale_factor – factor to scale the node, default is 10.0.
edge_weight_threshold – threshold for the edge weight, default is 0.1.
edge_scale_factor – factor to scale the edge, default is 1.0.
seed – seed to use for random number generation, default is None.
top_n_neighbors – number of top neighbors to consider, default is 5.
show_weight – whether to show the weight, default is False.
show_labels – whether to show the labels, default is True.
node_min_size – minimum size of the node, default is 1.0.
split_by – key in adata.obs to use for splitting the data, default is None.
return_data – whether to return the data, default is False.
show – whether to show the plot, default is True.
iterations – number of iterations to use for the layout, default is 50.
- Returns:
None.
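One plausible reading of edge_weight_threshold and top_n_neighbors is that weak edges are dropped first and only the strongest remaining neighbors of each node are kept. The plain-Python sketch below illustrates that reading only. The nested-dictionary input format, the order of the two filters, and the name filter_edges are assumptions, not the library's documented behaviour.

```python
from typing import Dict


def filter_edges(weights: Dict[str, Dict[str, float]],
                 edge_weight_threshold: float = 0.1,
                 top_n_neighbors: int = 5) -> Dict[str, Dict[str, float]]:
    """Keep, per source node, the strongest edges at or above the threshold."""
    kept: Dict[str, Dict[str, float]] = {}
    for source, targets in weights.items():
        # Step 1 (assumed): discard edges below the weight threshold.
        strong = [(t, w) for t, w in targets.items() if w >= edge_weight_threshold]
        # Step 2 (assumed): keep only the top_n_neighbors heaviest edges.
        strong.sort(key=lambda item: item[1], reverse=True)
        kept[source] = dict(strong[:top_n_neighbors])
    return kept
```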
- spatialmeta.pl.Wrapper(adata: AnnData)[source]
Plotly wrapper for spatial data.
- Parameters:
adata – AnnData object containing spatial data.
log – Log of the current state of the plot, defaults to None.
n_clicks – Dictionary of the number of clicks for each action, defaults to dict(annotate=0).
- spatialmeta.pl.Wrapper.to_plotly(self, init_feature='COL3A1')
Use Dash to create a plotly figure for spatial data.
- Parameters:
init_feature – Initial feature to display, defaults to ‘COL3A1’.
- Returns:
Dash app.