SpaMTP: Spatial Metabolomics Analysis • SpaMTP

Mouse Urinary Bladder Vignette

This tutorial highlights the utility of SpaMTP on single-cell resolution Spatial Metabolomics. This vignette will use public mouse bladder Spatial Metabolomics data first published by Römpp et al.(2010), and used in the Cardinal V3 article.

Using SpaMTP we will highlight:

Loading Spatial Metabolomic (SM) Data
Using Cardinal tools - Tissue Segmentation using SSC
Automated Metabolite Annotation of m/z Masses
Simplifying Lipid Nomenclature into Lipid Categories and Classes
Differential Expression Analysis of Annotated Metabolites
Metabolite Expression Visualisation
Pathway Analysis
Re-Clustering using PCA and Seurat
Finding Spatially Correlated Metabolites

Author: Andrew Causer

Install and Import R Libraries

First we need to import the required libraries for this analysis.

## Install SpaMTP if not previously installed
if (!require("SpaMTP"))
    devtools::install_github("GenomicsMachineLearning/SpaMTP")

#General Libraries
library(SpaMTP)
library(Cardinal)
library(Seurat)
library(dplyr)

#For plotting + DE plots
library(ggplot2)
library(EnhancedVolcano)

Load SM Data using SpaMTP

Sample Pre-Processing

Pre-processing is an import step in any pipeline and should be performed prior to any downstream analysis. For spatial metabolic data, there are various packages which have robust functions and applications for removing technical noise from a SM dataset, these include Cardinal and SCILS-LAB. This pre-processing pipeline is copied from Cardinal’s V3 Vignette. We will not run it in this case because it can take some time. The code required for the pre-processing is provided below but we will just load the processed data object below.

Cardinal Pre-Processing Code

##### Using Cardinal #####

# ## Load raw data
# bladder <- readImzML("HR2MSImouseurinarybladderS096", folder = "vignette_data_files/", mass.range=c(400,1000), resolution=10, units="ppm") 
  
# ## Get mean intensity values across all spots (like bulk data)
# mse_mean <- summarizeFeatures(bladder, FUN="mean") 

# ## Generates a list of significant peaks to use as a reference for normalisation and pre-processing
# ref <- mse_mean %>% 
#     peakPick(method="mad",
#             SNR=6) %>%
#     peakAlign("mean",
#             tolerance=15,
#             units="ppm") %>%
#     peakFilter() %>%
#     process()

# ## Normalise and pre-process the raw data using the reference peak
# mse_peaks <- mse %>% 
#     normalize(method="tic") %>%
#     peakBin(ref=mz(ref),
#             tolerance=15,
#             units="ppm") %>%
#     process()

# ## Save the data
# write.table(fData(mse_peaks),  "mouse-bladder-peaks.tsv")
# writeMSIData(mse_peaks, "mouse-bladder-peaks.imzML") # We will load this data in for the rest of the analysis using SpaMTP
# save(mse_mean, file="mouse-bladder-mean.RData")

Load Processed SM Data

The first step is loading the data. In this case our pre-processed data was saved in the format of .ibd and .imzML files. Alternative formats that data can be loaded into SpaMTP are through an intensity matrix or a Cardinal Object. These are displayed in different vignettes.

This SpaMTP function is a wrapper on Cardinal’s readImzML() function. For more information on how this function works, and the relative inputs it accepts please head to Cardinal’s documentation page.

Here, we have specified the file path were the .imzML and .imb files are stored, along with the mass range of m/z values to include and also set the resolution or bin size of each peak to 10ppm.

bladder <- LoadSM(name = "mouse-bladder-peaks", path = "vignette_data_files/Mouse_Urinary_Bladder/", mass.range=c(400,1000), resolution=10, units="ppm", bin_package = "Cardinal")

We can visualise the data below. Because SpaMTP is built on a Seurat Object, we can plot and use useful Seurat functions. In this case, we will plot the spatial distribution of the total number of features (m/z’s) present per pixel.

ImageFeaturePlot(bladder, features = "nFeature_Spatial", size = 1) + coord_flip() + scale_x_reverse()

After visualising the data we can see our tissue section, and that there are some areas outside the tissue region that contain pixel information, this is noise that will be removed later. Published in the original paper by Römpp et al.(2010) we will visualise three m/z values that define organ structures within the urinary bladder.

ImageMZPlot(bladder, mzs = c(770.5104, 770.56, 741.5307), size = 1.5) & coord_flip() & scale_x_reverse()

Switching between SpaMTP and Cardinal Objects

Now that we have loaded and visualised out data to demonstrate it’s quality, we will identify similar groups of pixels based on their relative metabolite composition. This is similar to clustering performed with Spatial Transcriptomics data. Following the Cardinal workflow, we will use a method called Spatially-aware Shrunken Centroid (SSC) segmentation. To generate these results, we can convert our SpaMTP object to Cardinal:

bladder_cardinal <- ConvertSeuratToCardinal(data = bladder, assay = "Spatial", slot = "counts")

Gathering Intensity, m/z values and metadata from Seurat Object ...

Converting intensity matrix and Generating Cardinal Object ...

Lets observe the Cardinal Object structure after conversion:

bladder_cardinal

## An object of class 'MSProcessedImagingExperiment'
##   <305 feature, 34840 pixel> imaging dataset
##     imageData(1): intensity
##     featureData(0):
##     pixelData(6): Seurat_metadata.orig.ident
##         Seurat_metadata.nCount_Spatial ... Seurat_metadata.x_coord
##         Seurat_metadata.y_coord
##     run(1): run0
##     raster dimensions: 260 x 134
##     coord(2): x = 1..260, y = 1..134
##     mass range: 403.0434 to 994.1133 
##     centroided: FALSE

We can see that the Object layout is correct and we have 305 m/z values across 34,840 pixels. Next we can generate a plot using Cardinal:

bladder_cardinal %>%
    subsetFeatures(mz >= 750, mz <= 800) %>%
    plot()

We can see that our data has converted properly to a Cardinal Object. Now we can run SSC and check the results match.

Tissue Segmentation using SSC

Following the Cardinal tutorial, we will use the same values for:

Spatial neighborhood radius = 2
Maximum number of segments/clusters = 10
Sparsity thresholding parameter = 24

set.seed(1)
ssc <- spatialShrunkenCentroids(bladder_cardinal,
            r=2, s=c(24), k=10,
            method="gaussian")

col_palette = list("1" = "#9FBEAC", 
                   "2" = "#C2B03B",
                   "3" = "#F99D1D",
                   "4" = "#008E87", 
                   "5" = "#0074B0",  
                   "6" = "#DE4D6C", 
                   "7" = "#F99D1D",
                   "8" = "#DE4D6C",
                   "9" = "#BF212E")

image(ssc, model=list(s=24), col= unlist(unname(col_palette)))

We can see that our clustering results match that of previously published results. Based on this, we can add the SSC results to our Cardinal object and then match them to their respective spots within the SpaMTP object.

Although the majority of functions within the SpaMTP package are used as wrappers for analysing and modifying Seurat objects, there are a few functions that are designed to assist in SM analysis in Cardinal. The function below is one, which allows you to add the SCC results back to a Cardinal Object.

bladder_cardinal <- add_ssc_annotation(bladder_cardinal, ssc, resolution = 24)

We can now check where our SSC results were added in our Cardinal Object:

head(data.frame(pixelData(bladder_cardinal)), n = 3)

run	x	y	Seurat_metadata.orig.ident	Seurat_metadata.nCount_Spatial	Seurat_metadata.nFeature_Spatial	Seurat_metadata.sample	Seurat_metadata.x_coord	Seurat_metadata.y_coord	ssc
run0	1	1	SeuratProject	54142.66	32	mouse-bladder-peaks	1	1	4
run0	2	1	SeuratProject	33354.74	12	mouse-bladder-peaks	2	1	4
run0	3	1	SeuratProject	62429.26	59	mouse-bladder-peaks	3	1	4

Based on the image above, there are a few clusters that are associated with regions outside the tissue. Because we are not interested in these (technical noise), we will subset our SpaMTP object to only include the clusters of interest (cluster 2, 5 and 6).

# subset the pixels to only include our ssc clusters of interest
tissue_clusters = c(2,5,6)
test <- subsetPixels(bladder_cardinal, ssc %in% tissue_clusters)

# plot a key marker m/z only across our segments of interest in Cardinal
image(test, mz =798.5410)

Add SSC segmentation results to SpaMTP Object

Below, we will add the relative results from the SSC analysis to our SpaMTP object:

bladder$ssc <- as.factor(pixelData(bladder_cardinal)$ssc)

ImageDimPlot(bladder, group.by = "ssc", size = 2, cols = col_palette) + coord_flip() + scale_x_reverse()

We can see that the pixel annotations have been added correctly, and we have a ggplot2 figure that can easily be manipulated using ggplot2 commands. Following the processing and annotation of pixels across our dataset, we will next annotate the m/z values.

Data QC

After importing the data and adding in the SSC results, we can generate some QC plots to observe the distribution of the data.

options(repr.plot.width = 20, repr.plot.height = 5)

MZRidgePlot(seurat.obj = bladder, group.by = "ssc", bins = 10, log.data = TRUE, cols = col_palette, verbose = F)+ theme_minimal() +theme(panel.grid = element_blank()) |
MZVlnPlot(seurat.obj = bladder,group.by = "ssc", log.data = TRUE, show.points = F,cols = col_palette, verbose = F) |
MZBoxPlot(seurat.obj = bladder,group.by = "ssc", log.data = TRUE, show.points = F, top.cutoff = 0.05,bottom.cutoff = 0.05,cols = col_palette, verbose = F)

These plots display the intensity of each m/z summed across all pixels per group. Based on this, there seems to be a majority of m/z values which display global expression across the tissue which is expected, suggesting that there are no outlying m/z values.

We can also look at the distribution of specific features across groups by providing a feature name. Lets try and visualise the total intensity of “mz-740.471557617188” across each cluster using the box.plot function:

options(repr.plot.width = 10, repr.plot.height = 5)

MZBoxPlot(seurat.obj = bladder,group.by = "ssc", mzs = "mz-740.471557617188", log.data = TRUE, show.points = F, top.cutoff = 0.05,bottom.cutoff = 0.05,cols = col_palette)

Based on this we can see that the “mz = 740.471557617188” is expressed in all clusters and is highest in cluster 6.

Metabolite Annotation of m/z Masses

One of the major steps in the analysis of metabolomics is the identification/annotation of m/z masses to their relative common metabolite name. There are various public websites that can be used to run this, however, they are often complicated and do not take in an R object or annotate them directly.

SpaMTP has a user-friendly function that assigns possible annotations to each m/z value based on a few variables:

db: Database to query the m/z masses against -> options are = “HMDB”, “LipidMaps”, “GNPS” or “ChEBI”
polarity: Polarity mode the experiment was run in -> either “positive” or “negative”
adducts: Specify different adduct ions that are likely formed or lost during the mass spectrometry imaging.

bladder_annotated <- AnnotateSM(bladder, db = Lipidmaps_db, ppm_error = 15, adducts = "M+K", polarity = "positive")

Filtering 'Lipidmaps_db' database by M+K adduct/s

Searching database against input m/z's to return annotaiton results

Adding annotations to Seurat Object .... 

Returning Seurat object that include ONLY SUCCESSFULLY ANNOTATED m/z features

bladder_annotated

## An object of class Seurat 
## 79 features across 34840 samples within 1 assay 
## Active assay: Spatial (79 features, 0 variable features)
##  1 layer present: counts
##  1 spatial field of view present: fov

Following annotation with the LIPID_MAPS database, there are 79 m/z masses that were successfully annotated!

These annotations are stored in the feature meta.data slot within our SpaMTP Object, lets take a look:

head(bladder_annotated[["Spatial"]]@meta.data, n = 3)

	raw_mz	mz_names	observed_mz	all_IsomerNames	all_Isomers	all_Isomers_IDs	all_Adducts	all_Formulas	all_Errors
16	426.9289	mz-426.928863525391	426.928863525391	Laurenenyne A; Laurenenyne B; Katsuurenyne A	LMPK02000053; LMPK02000054; LMPK02000055	LIPIDMAPS:LMPK02000053; LIPIDMAPS:LMPK02000054; LIPIDMAPS:LMPK02000055	M+K	C15H18O2Br2	3.8565
17	431.0381	mz-431.038116455078	431.038116455078	5,7,3’,4’,5’-Pentahydroxy-3,6,8-trimethoxyflavone	LMPK12113357	LIPIDMAPS:LMPK12113357	M+K	C18H16O10	1.4116
37	465.0228	mz-465.022796630859	465.022796630859	Aspergillusether D	LMPK13090053	LIPIDMAPS:LMPK13090053	M+K	C20H20O6Cl2	8.725

We can see that for each m/z value, along with all possible annotations, we also observe information including database ID, the adduct used to annotate this metabolite, the common chemical formula, and the plus-minus ppm error between the observed mass and the reference mass of these annotated molecules in our reference dataset.

In addition to storing the m/z annotations in the feature meta.data slot, if save.intermediates = TRUE is set within the AnnotateSM() function, then an intermediate data.frame will be stored in the @tools slot of the SpaMTP object. This data.frame is a less compressed version of the annotation data.frame stored in the feature meta.data slot, which is used later for pathway analysis. For more information visit the hidden window below:

SStoring an intermediate data.frame for downstream pathway analysis

str(bladder_annotated@tools) #The stored dataframe is called db_3

## List of 1
##  $ db_3:'data.frame':    98 obs. of  12 variables:
##   ..$ ID          : int [1:98] 16 17 37 39 40 41 55 61 61 68 ...
##   ..$ Match       : logi [1:98] TRUE TRUE TRUE TRUE TRUE TRUE ...
##   ..$ observed_mz : num [1:98] 427 431 465 467 468 ...
##   ..$ Reference_mz: num [1:98] 427 431 465 467 468 ...
##   ..$ Error       : num [1:98] 3.86 1.41 8.73 1.16 13.2 ...
##   ..$ Adduct      : chr [1:98] "M+K" "M+K" "M+K" "M+K" ...
##   ..$ Formula     : chr [1:98] "C15H18O2Br2" "C18H16O10" "C20H20O6Cl2" "C24H25O5Cl" ...
##   ..$ Exactmass   : num [1:98] 388 392 426 428 429 ...
##   ..$ Isomers     : chr [1:98] "LMPK02000053; LMPK02000054; LMPK02000055" "LMPK12113357" "LMPK13090053" "LMPK13080006" ...
##   ..$ InchiKeys   : chr [1:98] "ZAKKSVIAKVSQCR-SXATZZDESA-N; ZAKKSVIAKVSQCR-LKQQYTGOSA-N; ZAKKSVIAKVSQCR-UNHDMXBBSA-N" "STOZTZBHYTVXHP-UHFFFAOYSA-N" "JYBXDWXPXSTXCT-SOFGYWHQSA-N" "RQNMKGDKKQRCKL-HZOWPXDZSA-N" ...
##   ..$ IsomerNames : chr [1:98] "Laurenenyne A; Laurenenyne B; Katsuurenyne A" "5,7,3',4',5'-Pentahydroxy-3,6,8-trimethoxyflavone" "Aspergillusether D" "Emeguisin B" ...
##   ..$ Isomers_IDs : chr [1:98] "LIPIDMAPS:LMPK02000053; LIPIDMAPS:LMPK02000054; LIPIDMAPS:LMPK02000055" "LIPIDMAPS:LMPK12113357" "LIPIDMAPS:LMPK13090053" "LIPIDMAPS:LMPK13080006" ...

Based on these results, we can now perform a serious of analyses to identify which metabolites are associated with certain biological features/processes.

Simplifying Lipid Nomenclature into Lipid Catagories and Classes

Because lipids are made of many carbon molecules in a long chain there are many different kinds of lipids that share the same molecular weight. This results in some annotations for a mass being very long, due to the numerous possible lipids matching that m/z value. Lets use a SpaMTP function to simplify the lipid nomenclature into common lipid categories and classes! This gives more simplified annotations to a level that most SM datasets are capable of identifying.

bladder_annotated@assays$Spatial@meta.data <- RefineLipids(bladder_annotated@assays$Spatial@meta.data, annotation.column = "all_IsomerNames", lipid_info = "simple")

refined_annotations <- bladder_annotated@assays$Spatial@meta.data # get the refined annotations from our feature meta.data slot

cat("non-lipid metabolites ... ")
head(refined_annotations, 1)


cat("refined lipid names ... ")
refined_annotations[11,]

non-lipid metabolites

	raw_mz	mz_names	observed_mz	all_IsomerNames	all_Isomers	all_Isomers_IDs	all_Adducts	all_Formulas	all_Errors	Lipid.Maps.Category	Lipid.Maps.Main.Class	Species.Name	Species.Name.Simple
16	426.9289	mz-426.928863525391	426.928863525391	Laurenenyne A; Laurenenyne B; Katsuurenyne A	LMPK02000053; LMPK02000054; LMPK02000055	LIPIDMAPS:LMPK02000053; LIPIDMAPS:LMPK02000054; LIPIDMAPS:LMPK02000055	M+K	C15H18O2Br2	3.8565	NA	NA	NA	NA

refined lipid metabolites

	raw_mz	mz_names	observed_mz	all_IsomerNames	all_Isomers	all_Isomers_IDs	all_Adducts	all_Formulas	all_Errors	Lipid.Maps.Category	Lipid.Maps.Main.Class	Species.Name	Species.Name.Simple
78	534.2962	mz-534.296203613281	534.296203613281	PC(O-14:0/2:0); PC(16:0/0:0); PC(0:0/16:0); PC(16:0/0:0)[rac]; PE(19:0/0:0); 1-(2-methoxy-6Z-octadecenyl)-sn-glycero-3-phosphoethanolamine	LMGP01020019; LMGP01050018; LMGP01050074; LMGP01050113; LMGP02050028; LMGP02060020	LIPIDMAPS:LMGP01020019; LIPIDMAPS:LMGP01050018; LIPIDMAPS:LMGP01050074; LIPIDMAPS:LMGP01050113; LIPIDMAPS:LMGP02050028; LIPIDMAPS:LMGP02060020	M+K	C24H50NO7P	1.0362	GP	PC; LPC; LPE	PC O-16:0; LPC 16:0; LPE 19:0	PC(16:0); LPC(16:0); LPE(19:0)

We can see that for metabolites that are not lipids, the names are returned as NA, however those that are annotated as lipids have been simplified into a more human readable format.

Some key m/z masses that were observed and annotated in the original publication are listed in the table below:

m/z mass	Annotated Metabolite
741.5307	SM(34:1)
798.5410	PC(34:1)
812.5566	PE(38:1)

Lets see how these m/z’s are annotated using SpaMTP:

key_mzs <- c(741.5307, 798.5410, 812.5566)
key_mz_names <- c()

for (mz in key_mzs){
    key_mz_names <- c(key_mz_names, FindNearestMZ(bladder_annotated, mz)) #Find the nearest m/z in our dataset to the mass provided
}

matches <- refined_annotations[refined_annotations$mz_names %in% key_mz_names,][c("raw_mz","Species.Name.Simple")] 
matches

	raw_mz	Species.Name.Simple
196	741.5306	SM(34:1); PA(37:1)
231	798.5411	PC(34:1); PE(37:1)
238	812.5562	PC(35:1); PE(38:1)

ImageMZAnnotationPlot(bladder_annotated, metabolites = c("SM(34:1)", "PC(34:1)"), size = 1, column.name = "Species.Name.Simple", plot.exact = F, dark.background = F, plot.pixel = T) & coord_flip() & scale_x_reverse()  & scale_colour_gradientn(colors = viridis(100))

We can see that our annotations match those published previously. Due to the limitations with MSI based technologies, we can not exactly identify the chemical structure of each molecule, meaning that some m/z values have multiple lipid names/annotations.

Differential Expression Analysis of Annotated Metabolites

One key step in understanding biological processes is determining genes, proteins or metabolites that demonstrate significantly different expression between given populations or cell types. Based on the analysis above, for this urinary bladder we identified 3 main tissue regions these being :

ssc segment	Tissue Region
2	adventitia
5	muscle
6	urothelium

These regions are identified also in the following references: 1, 2

Lets use SpaMTP’s FindAllDEMs() function to identify differentially expressed m/z metabolites (DEMs) between each of these three tissue regions. This function uses similar methods to those established for genetic data, whereby pixels are pseudo-bulked into random pools and assessed for differential expression (edgeR).

ROI <- subset(bladder_annotated, subset = ssc %in% c("2", "5", "6")) # subset our dataset to only include our 3 clusters (Region of Interest - ROI)

Now our object has been subset to only include our clusters of interest, we can now perform differential expression analysis.

# Performs pooling, pseudo-bulking and edgeR Differential Expression Analysis
cluster_DEMs <- FindAllDEMs(data = ROI, ident = "ssc", n = 3, logFC_threshold = 1.2, 
                            DE_output_dir =NULL, return.individual = FALSE, 
                            run_name = "FindAllDEMs", annotation.column = "all_IsomerNames" ) # DE results are returned in a data.frame

Pooling one sample into 3 replicates...

Running edgeR DE Analysis for  FindAllDEMs  -> with samples [ 5, 2, 6 ]

Starting condition: 5

Starting condition: 2

Starting condition: 6

Lets observe the top 10 DE metabolites for each cluster in a heatmap:

DEMsHeatmap(cluster_DEMs, only.pos = FALSE, order.by = "logFC",
            plot_annotations_column = "annotations", nlabels.to.show = 1, 
            n = 10, save_to_path = NULL
            ,plot.save.height = 10, plot.save.width = 15, annotation_colors = col_palette[c("2","5","6")])

We can also observe the spatial expression patterns of some top DE metabolites.

(ImageDimPlot(ROI, group.by = "ssc", split.by = "ssc", size = 2, cols = col_palette)/
ImageMZAnnotationPlot(ROI, metabolites = c("PC(20:4(5Z,8Z,11Z,14Z)/0:0)", "farnesyl triphosphate","SM(d18:1/19:0)"), size = 2)) & coord_flip() & scale_x_reverse()

We can see that the expression of these metabolites matches the spatial location of each of these three tissue regions quite well.

Noted in the heatmap above, there are many lipids which are identified as being DE. In particular, there are two lipids of interest which have been identified in the original publication as key metabolites that differentiate the muscle (cluster 5) and urothelium (cluster 6). These being SM(34:1) and PC(34:1) respectively.

The code below performs differential expression analysis just between the the muscle (cluster 5) and urothelium (cluster 6) pixels, to determine if the differential expression of these lipids can be detected.

### Subsets the original dataset to only include cluster 5 and 6
ROI_x <- subset(bladder_annotated, subset = ssc %in% c("5", "6"))

### Performs differential expression analysis between the two groups
cluster_DEMs_x <- FindAllDEMs(data = ROI_x, ident = "ssc", n = 3, logFC_threshold = 1.2, 
                            DE_output_dir =NULL, return.individual = FALSE, 
                            run_name = "FindAllDEMs", annotation.column = "all_IsomerNames" )

### Runs lipid nomenclature simplification on the DEM results
cluster_DEMs_x$DEMs <- RefineLipids(cluster_DEMs_x$DEMs, annotation.column = "annotations", lipid_info = "simple")

### Subsets the DEM results to only get up regulated metabolites for cluster 5 and 6
urothelium <- cluster_DEMs_x$DEMs[cluster_DEMs_x$DEMs$cluster == "6",]

### Sets up colouring for significant spots in volcano plot
keyvals <- ifelse(
    urothelium$logFC < -2  & urothelium$PValue < 10e-4, 'royalblue',
      ifelse(urothelium$logFC > 1 & urothelium$PValue < 10e-4, 'red',
        'black'))
  keyvals[is.na(keyvals)] <- 'black'
  names(keyvals)[keyvals == 'red'] <- 'Cluster 6'
  names(keyvals)[keyvals == 'black'] <- 'Non-Sig'
  names(keyvals)[keyvals == 'royalblue'] <- 'Cluster 5'

### Changes the shape of the lipid annotations  of interest in volcano plot
metabolites <- matches$Species.Name.Simple[1:2]

keyvals.shape <- ifelse(
    urothelium$Species.Name.Simple == metabolites[1], 15,
      ifelse(urothelium$Species.Name.Simple == metabolites[2], 17,
        20))
  keyvals.shape[is.na(keyvals.shape)] <- 20
  names(keyvals.shape)[keyvals.shape == 20] <- 'Other'
  names(keyvals.shape)[keyvals.shape == 15] <- metabolites[1]
  names(keyvals.shape)[keyvals.shape == 17] <- metabolites[2]

### Plots volcano plot with DEM results
volc_plot <- EnhancedVolcano::EnhancedVolcano( urothelium,
                                  selectLab = metabolites, 
                                  lab = urothelium$Species.Name.Simple,
                                  #FCcutoff = 0,
                                  colCustom = keyvals,
                                  shapeCustom = keyvals.shape,
                                  cutoffLineType = 'blank',
                                  pCutoff = 10e-4,
                                  FCcutoff = 1,
                                  pointSize = 6,
                                  labSize = 5,
                                  labCol = 'black',
                                  labFace = 'bold',
                                  colAlpha = 4/5,
                                  x = 'logFC',
                                  y = 'PValue', 
                                  gridlines.major = FALSE,
                                  gridlines.minor = FALSE)

volc_plot

In the volcano plot above, we can see that our analysis correctly identified SM(34:1) to be up-regulated in cluster 5, and PC(34:1) was up-regulated in cluster 6.

Lipids are a key metabolite expressed within the urinary bladder. Lets take a deeper look into the differentially expressed lipids groups with respect to their main category and class:

### Runs lipid nomenclature simplification on the DEM results generated between cluster 2, 5 and 6
cluster_DEMs$DEMs <- RefineLipids(cluster_DEMs$DEMs, annotation.column = "annotations", lipid_info = "simple")

# lets look at the Differenitally expressed Lipids groups
UP <- cluster_DEMs$DEMs[cluster_DEMs$DEMs$regulate == "Up",] # lets get the upregulated lipids for each cluster

# Compute counts of Lipid Maps Categories for 'Up' and 'Down' entries
category_counts <- UP %>%
  group_by(cluster, Lipid.Maps.Category) %>%
  summarise(count = n()) %>%
  ungroup()

category_counts <- category_counts %>%
  mutate(Lipid.Maps.Category = ifelse(is.na(Lipid.Maps.Category), "Non-Lipids", Lipid.Maps.Category))


# Plot bar graph
ggplot(category_counts, aes(x = Lipid.Maps.Category, y = count, fill = cluster)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Relative Number of Up Regulated Lipids Grouped by Catagory",
       x = "Lipid Maps Category",
       y = "Count") +
  theme_classic() +
  scale_fill_manual(values = col_palette)

The results displayed in the bar graph above demonstrate the number of different types of lipid categories differentially expressed in each cluster. Lipid categories are the most simple/broad annotation of a lipid. For example, GL stands for glycerolipids, which can represent such lipids classes as diglycerols and triglycerols. Likewise, SP stands for sphingolipids, which include lipids from classes such as sphingomyelins and glycosphingolipids. For more information on different lipid categories and classes, please visit the following links (1,2).

Lets now look at the different lipid classes that were differentially expressed by each cluster.

# Compute counts of Lipid Maps Categories for 'Up' and 'Down' entries
category_counts <- UP %>%
  group_by(cluster, Lipid.Maps.Main.Class) %>%
  summarise(count = n()) %>%
  ungroup()

category_counts <- category_counts %>%
  mutate(Lipid.Maps.Category = ifelse(is.na(Lipid.Maps.Main.Class), "Non-Lipids", Lipid.Maps.Main.Class))


# Plot bar graph
ggplot(category_counts, aes(x = Lipid.Maps.Main.Class, y = count, fill = cluster)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Relative Number of Up Regulated Lipids Grouped by Classes",
       x = "Lipid Maps Category",
       y = "Count") +
  theme_classic() +
  scale_fill_manual(values = col_palette)

Metabolite Expression Visualisation

Based on the analysis performed above, there are various ways we can visualise the data spatially. SpaMTP has a large range of methods to visualise data including binning common metabolites, 3D plots displaying metabolite/gene expression and more. Below, we will demonstrate some of the key plotting methods.

First, using the DE lipid results from above, we can plot the combined expression of all glycerophospholipids (GP) that were up expressed by our urothelium cells (cluster 6).

### Identified the m/z values that are GP lipids up expressed in cluster 6
mz <- na.omit(UP[c(UP$cluster == "6"& UP$Lipid.Maps.Category == "GP"),]$gene)

### Adds the binned expression value of all of these lipids into a column in the SpaMTP object's @meta.data slot
ROI <- BinMetabolites(ROI, mz, slot = "counts", bin_name = "GPs")

head(ROI, n = 3)

	orig.ident	nCount_Spatial	nFeature_Spatial	sample	x_coord	y_coord	ssc	GPs
130_11	SeuratProject	70658.19	56	mouse-bladder-peaks	130	11	5	0.000
123_12	SeuratProject	50833.22	25	mouse-bladder-peaks	123	12	5	2823.010
124_12	SeuratProject	60326.05	40	mouse-bladder-peaks	124	12	5	4422.283

ImageFeaturePlot(ROI, features = c("GPs"), size = 2, dark.background = F) & coord_flip() & scale_x_reverse()

We can see clearly that the combined expression of these GP lipids is highly expressed in the urothelium cells.

Next, we can observe the expression of two key lipids (same as those described above) in a 3D plot:

### Identified the m/z values that are to be plot
mzs <- unlist(lapply(c(798.5411, 741.5306), function(x) FindNearestMZ(data = ROI, target_mz = x)))

Plot3DFeature(ROI, features = mzs, assays = "Spatial", between.layer.height = 100, names = c("PC(34:1)","SM(34:1)"), plot.height = 400, plot.width = 700)

SpaMTP also includes a useful density plot function that generates a HTML which allows the user to simultaneously visualise any m/z peak and it’s relative metabolite annotation spatially. Users have access to a mass intensity plot where they can select any m/z peak, which will then be plotted and all possible metabolite annotations will be listed. Lets demonstrate below:

DensityMap(object = ROI, assay = "Spatial", slot = "counts", folder = "vignette_data_files/Mouse_Urinary_Bladder/")

Interactive Bar and Dot Plot

Instructions: Enter a number to describe the number of equally spaced points at which the density is to be estimated.

Pathway Analysis

In order to understand biological processes and complex diseases at a deeper level, we often look at biological pathways as a whole rather the expression of just individual metabolites/genes. SpaMTP pathway analysis uses a reference dataset that contains various biological pathways (i.e. KEGG or RAMP_DB) to determine significant changes in biological, cellular or molecular processes.

The first pathway analysis method uses the Fishers Exact Test to identify significant differentially expressed pathways based on the presence of specified features. Users can provide a list of metabolites or metabolites and genes, identified through DE analysis. This method will then identify significant pathways based on the relative proportion of features matching between the list provided and those present within the pathway.

The below code is performing FishersExactTest Pathway analysis on metabolites significantly over expressed in the urothelium (cluster 6):

## Subsets to include only UP-regulated m/z values in cluster 6 (relative to cluster 2 and 5)
cluster_6 <- UP[UP$cluster == "6",] 

## Based on these mz values we can get their corresponding annotations which are in the database ID format (stored in `$all_Isomer_IDs`)
metabolite_ids <- ROI[["Spatial"]]@meta.data[ROI[["Spatial"]]@meta.data$all_IsomerNames %in% cluster_6$annotations,]$all_Isomers_IDs

## We can then unlist them to get all the unique possible annotations 
metabolite_ids <- unique(unlist(lapply(metabolite_ids, function(x){strsplit(x, "; ")[[1]]})))

### NOTE: Rather then matching the annotation results back to the metabolite annotations, DE can be run changing the 'annotation.column' = "all_Isomer_IDs`

Lets now run FishersPathwayTest

cluster_6_pathways <- FishersPathwayAnalysis(list("metabolites" = metabolite_ids),
                            max_path_size = 500,
                            alternative = "greater",
                            min_path_size = 5,
                            pathway_all_info = T,
                            pval_cutoff = 0.05,
                            verbose = TRUE)

Fisher Testing ......
Loading files ......
Loading files finished!
Expanding database to extract all potential metabolites
Parsing the information of given analytes class
Begin metabolic pathway analysis ......
Merging datasets
Running test
Calculating p value......
P value obtained
Done

cluster_6_pathways[1:4,]

pathway_name	pathway_id	type	pathwayCategory	p_val	fdr	ratio	analytes_in_pathways	total_in_pathways
Phospholipid Biosynthesis	SMP00025	hmdb	smpdb3	0.0013372	0.1017251	0.1200000	3	25
Phospholipid Biosynthesis	map00564	kegg	kegg	0.0013372	0.1017251	0.1200000	3	25
Phosphatidylethanolamine Biosynthesis PE(15:0/18:2(9Z,12Z))	SMP15122	hmdb	smpdb3	0.0050468	0.1017251	0.1666667	2	12
Phosphatidylethanolamine Biosynthesis PE(20:1(11Z)/18:3(6Z,9Z,12Z))	SMP15441	hmdb	smpdb3	0.0050468	0.1017251	0.1666667	2	12

We can also run FisherPathwayTest using just m/z values. Lets try this below and see how the results differ. This time we will use Chebi_db, Lipidmaps_db and HMDB_db to annotate the m/z values, and we will use all possible positive adducts. (Note: the mz values are stored in the data.frame column called gene).

cluster_6_mz_pathways <- FishersPathwayAnalysis(list("mzs" = cluster_6$gene),
                            min_path_size = 5,
                            max_path_size = 500,    
                            alternative = "greater",
                            pathway_all_info = T,
                            pval_cutoff = 1,
                            verbose = TRUE)

cluster_6_mz_pathways[1:4,]

pathway_name	pathway_id	type	pathwayCategory	p_val	fdr	ratio	analytes_in_pathways	total_in_pathways
De Novo Triacylglycerol Biosynthesis TG(18:0/24:1(15Z)/16:1(9Z))	SMP18561	hmdb	smpdb3	0.072197	0.9307476	0.6	3	5
De Novo Triacylglycerol Biosynthesis TG(18:0/24:1(15Z)/18:0)	SMP18556	hmdb	smpdb3	0.072197	0.9307476	0.6	3	5
De Novo Triacylglycerol Biosynthesis TG(18:0/24:1(15Z)/18:1(9Z))	SMP18563	hmdb	smpdb3	0.072197	0.9307476	0.6	3	5
De Novo Triacylglycerol Biosynthesis TG(18:0/24:1(15Z)/18:2(9Z,12Z))	SMP18568	hmdb	smpdb3	0.072197	0.9307476	0.6	3	5

Now, lets visualise these results.We will first plot the results generated based on the provided metabolite ID’s.

VisualisePathways(bladder_annotated,pathway_df = cluster_6_pathways,p_val_threshold = 0.1,assay = "Spatial",slot = "counts")

We can also plot the results generated from running Fishers Exact Test using the m/z values. This will allow us to plot the expression of each pathway spatially!

VisualisePathways(bladder_annotated,pathway_df = cluster_6_mz_pathways,p_val_threshold = 0.1,assay = "Spatial",slot = "counts")

The second form of pathway analysis identifies differentially expressed pathways per group/cluster based on a the relative expression of metabolites and/or genes. Using a similar approach to GSEA features are ranked based on log fold change between groups, and the relative enrichment scores of these features are used to identify significant pathways differentially expressed by each group.

The code below uses the “ssc” clustering and the pseudo-bulking DE results to identify significant pathways per cluster:

DE_pathways <- FindRegionalPathways(SpaMTP = ROI,
                                ident = "ssc",
                                DE.list = list(cluster_DEMs$DEMs),
                                analyte_types = c("metabolites"),
                                SM_assay = "Spatial",
                                ST_assay = NULL)

DE_pathways

pathwayName	pval	padj	log2err	ES	NES	size	leadingEdge	Cluster_id	adduct_info	leadingEdge_metabolites	leadingEdge_metabolites_id	met_regulation	group_importance	pathwayRampId	sourceId	type	pathwayCategory
Fabry disease	0.0282251	0.0282251	0.3524879	-0.7072785	-1.576707	5	RAMP_C_0….	6	682.457092285156[M+K];686.583801269531[M+K];769.561889648438[M+K];772.525268554688[M+K];844.525817871094[M+K]	GlcCer(d18:1/12:0);Cer(d18:1/24:1(15Z));SM(d18:1/18:0);PC(10:0/22:0);LacCer(d18:1/12:0)	LIPIDMAPS:LMSP0501AA01;LIPIDMAPS:LMSP02010009;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMGP01010564;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓;↓	11.40207	RAMP_P_000000110	SMP00525	hmdb	smpdb3
Gaucher Disease	0.0282251	0.0282251	0.3524879	-0.7072785	-1.576707	5	RAMP_C_0….	6	682.457092285156[M+K];686.583801269531[M+K];769.561889648438[M+K];772.525268554688[M+K];844.525817871094[M+K]	GlcCer(d18:1/12:0);Cer(d18:1/24:1(15Z));SM(d18:1/18:0);PC(10:0/22:0);LacCer(d18:1/12:0)	LIPIDMAPS:LMSP0501AA01;LIPIDMAPS:LMSP02010009;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMGP01010564;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓;↓	11.40207	RAMP_P_000000111	SMP00349	hmdb	smpdb3
Globoid Cell Leukodystrophy	0.0282251	0.0282251	0.3524879	-0.7072785	-1.576707	5	RAMP_C_0….	6	682.457092285156[M+K];686.583801269531[M+K];769.561889648438[M+K];772.525268554688[M+K];844.525817871094[M+K]	GlcCer(d18:1/12:0);Cer(d18:1/24:1(15Z));SM(d18:1/18:0);PC(10:0/22:0);LacCer(d18:1/12:0)	LIPIDMAPS:LMSP0501AA01;LIPIDMAPS:LMSP02010009;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMGP01010564;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓;↓	11.40207	RAMP_P_000000112	SMP00348	hmdb	smpdb3
Krabbe disease	0.0282251	0.0282251	0.3524879	-0.7072785	-1.576707	5	RAMP_C_0….	6	682.457092285156[M+K];686.583801269531[M+K];769.561889648438[M+K];772.525268554688[M+K];844.525817871094[M+K]	GlcCer(d18:1/12:0);Cer(d18:1/24:1(15Z));SM(d18:1/18:0);PC(10:0/22:0);LacCer(d18:1/12:0)	LIPIDMAPS:LMSP0501AA01;LIPIDMAPS:LMSP02010009;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMGP01010564;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓;↓	11.40207	RAMP_P_000000113	SMP00526	hmdb	smpdb3
Metachromatic Leukodystrophy (MLD)	0.0282251	0.0282251	0.3524879	-0.7072785	-1.576707	5	RAMP_C_0….	6	682.457092285156[M+K];686.583801269531[M+K];769.561889648438[M+K];772.525268554688[M+K];844.525817871094[M+K]	GlcCer(d18:1/12:0);Cer(d18:1/24:1(15Z));SM(d18:1/18:0);PC(10:0/22:0);LacCer(d18:1/12:0)	LIPIDMAPS:LMSP0501AA01;LIPIDMAPS:LMSP02010009;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMGP01010564;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓;↓	11.40207	RAMP_P_000000114	SMP00347	hmdb	smpdb3
Sphingolipid metabolism: integrated pathway	0.0015977	0.0111840	0.4550599	-0.7460415	-1.941830	8	RAMP_C_0….	6	682.457092285156[M+K];741.530578613281[M+K];743.546752929688[M+K];769.561889648438[M+K];825.623657226562[M+K];853.655090332031[M+K]	GlcCer(d18:1/12:0);SM(d18:1/16:0);SM(d18:0/16:0);SM(d18:1/18:0);SM(d18:1/22:0);SM(d18:1/24:0)	LIPIDMAPS:LMSP0501AA01;LIPIDMAPS:LMSP03010003;LIPIDMAPS:LMSP03010004;LIPIDMAPS:LMSP03010031;LIPIDMAPS:LMSP03010006;LIPIDMAPS:LMSP03010008	↓;↓;↓;↓;↓;↓;↓	11.40207	RAMP_P_000053493	WP4726	wiki	NA
Starch and sucrose metabolism	0.0282251	0.0282251	0.3524879	-0.7072785	-1.576707	5	RAMP_C_0….	6	682.457092285156[M+K];686.583801269531[M+K];769.561889648438[M+K];772.525268554688[M+K];844.525817871094[M+K]	GlcCer(d18:1/12:0);Cer(d18:1/24:1(15Z));SM(d18:1/18:0);PC(10:0/22:0);LacCer(d18:1/12:0)	LIPIDMAPS:LMSP0501AA01;LIPIDMAPS:LMSP02010009;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMGP01010564;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓;↓	11.40207	RAMP_P_000000115	map00500	kegg	kegg

Based on this, we can see that there is only one significant differentially expressed pathway. Before we annotated our sample based on only M+K analytes, and using the LIPIDMAPS database. To increase the number of possible pathways detected we can broaden our annotation range to include M+H and M+K analytes from the HMDB, LIPIDMAPS and ChEBI databases and re-run the analysis.

## Get Data
ROI_pathays <- subset(bladder, subset = ssc %in% c("2", "5", "6"))

## Re-annotate
ROI_pathays <- AnnotateSM(ROI_pathays, db = rbind(Chebi_db,Lipidmaps_db,HMDB_db), ppm_error = 15, polarity = "positive", adducts = c("M+H","M+K"))


## Re-run DE analysis
DE <- FindAllDEMs(data = ROI_pathays, ident = "ssc", n = 3, logFC_threshold = 1.2, 
                            DE_output_dir =NULL, return.individual = FALSE, 
                            run_name = "FindAllDEMs", annotation.column = "all_IsomerNames")

DE_pathways <- FindRegionalPathways(SpaMTP = ROI_pathays,
                                ident = "ssc",
                                DE.list = list(DE$DEMs),
                                analyte_types = c("metabolites"),
                                SM_assay = "Spatial",
                                ST_assay = NULL)

DE_pathways

pathwayName	pval	padj	log2err	ES	NES	size	leadingEdge	Cluster_id	adduct_info	leadingEdge_metabolites	leadingEdge_metabolites_id	met_regulation	group_importance	pathwayRampId	sourceId	type	pathwayCategory
Fabry disease	0.0060913	0.0071065	0.4070179	-0.7810077	-1.637572	5	RAMP_C_0….	6	682.457092285156[M+K];734.567321777344[M+H];769.561889648438[M+K];844.525817871094[M+K]	Galactosylceramide (d18:1/12:0) ;PC(16:0/16:0);SM(d18:1/18:0);LacCer(d18:1/12:0)	chebi:76297;hmdb:HMDB0000564;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓	11.45531	RAMP_P_000000110	SMP00525	hmdb	smpdb3
Gaucher Disease	0.0060913	0.0071065	0.4070179	-0.7810077	-1.637572	5	RAMP_C_0….	6	682.457092285156[M+K];734.567321777344[M+H];769.561889648438[M+K];844.525817871094[M+K]	Galactosylceramide (d18:1/12:0) ;PC(16:0/16:0);SM(d18:1/18:0);LacCer(d18:1/12:0)	chebi:76297;hmdb:HMDB0000564;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓	11.45531	RAMP_P_000000111	SMP00349	hmdb	smpdb3
Globoid Cell Leukodystrophy	0.0060913	0.0071065	0.4070179	-0.7810077	-1.637572	5	RAMP_C_0….	6	682.457092285156[M+K];734.567321777344[M+H];769.561889648438[M+K];844.525817871094[M+K]	Galactosylceramide (d18:1/12:0) ;PC(16:0/16:0);SM(d18:1/18:0);LacCer(d18:1/12:0)	chebi:76297;hmdb:HMDB0000564;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓	11.45531	RAMP_P_000000112	SMP00348	hmdb	smpdb3
Krabbe disease	0.0060913	0.0071065	0.4070179	-0.7810077	-1.637572	5	RAMP_C_0….	6	682.457092285156[M+K];734.567321777344[M+H];769.561889648438[M+K];844.525817871094[M+K]	Galactosylceramide (d18:1/12:0) ;PC(16:0/16:0);SM(d18:1/18:0);LacCer(d18:1/12:0)	chebi:76297;hmdb:HMDB0000564;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓	11.45531	RAMP_P_000000113	SMP00526	hmdb	smpdb3
Metachromatic Leukodystrophy (MLD)	0.0060913	0.0071065	0.4070179	-0.7810077	-1.637572	5	RAMP_C_0….	6	682.457092285156[M+K];734.567321777344[M+H];769.561889648438[M+K];844.525817871094[M+K]	Galactosylceramide (d18:1/12:0) ;PC(16:0/16:0);SM(d18:1/18:0);LacCer(d18:1/12:0)	chebi:76297;hmdb:HMDB0000564;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓	11.45531	RAMP_P_000000114	SMP00347	hmdb	smpdb3
Sphingolipid metabolism: integrated pathway	0.0095774	0.0095774	0.3807304	-0.6155733	-1.629880	10	RAMP_C_0….	6	503.981231689453[M+H];682.457092285156[M+K];686.583801269531[M+K];703.57470703125[M+H];743.546752929688[M+K];769.561889648438[M+K];825.623657226562[M+K];851.640014648438[M+K];853.655090332031[M+K]	8-oxo-dATP(4-);Galactosylceramide (d18:1/12:0) ;Cer(d18:1/24:1(15Z));N-heptadecanoyl-15-methylhexadecasphingosine-1-phosphocholine;SM(d14:0/20:0);SM(d18:1/18:0);SM(d18:1/22:0);beta-D-glucosyl-(1<->1’)-N-[(15Z)-tetracosenoyl]sphinganine;SM(d18:1/24:0)	chebi:30616;chebi:76297;hmdb:HMDB0004953;chebi:78646;chebi:78647;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMSP03010006;chebi:84699;LIPIDMAPS:LMSP03010008	↓;↓;↓;↓;↓;↓;↓;↓;↓;↓	11.45531	RAMP_P_000053493	WP4726	wiki	NA
Starch and sucrose metabolism	0.0060913	0.0071065	0.4070179	-0.7810077	-1.637572	5	RAMP_C_0….	6	682.457092285156[M+K];734.567321777344[M+H];769.561889648438[M+K];844.525817871094[M+K]	Galactosylceramide (d18:1/12:0) ;PC(16:0/16:0);SM(d18:1/18:0);LacCer(d18:1/12:0)	chebi:76297;hmdb:HMDB0000564;LIPIDMAPS:LMSP03010001;LIPIDMAPS:LMSP0501AB02	↓;↓;↓;↓	11.45531	RAMP_P_000000115	map00500	kegg	kegg

We can see that we now have many more differentially expressed pathways! Lets plot these results:

PlotRegionalPathways(regpathway = DE_pathways)

This plot highlights that cluster 5 (bladder muscle) is enriched for Biochemical pathways. Alternatively, the urothelium (cluster 6) is shown to down regulate numerous pathways including Sphingolipid metabolism. This matches with previous research suggesting that the metabolism of sphingolipids is mainly associated with muscle cells of the badder.

These results may not tell the entire story however, the ‘SSC’ clustering results do not capture every region of the tissue according to the original publication. There is a specific area between the bladder and urothelium called the ‘lamina propria’ which is currently unidentifiable based on the ‘SSC’ clustering result (this was also mentioned in the original paper).

Re-Clustering using PCA and Seurat

Because SpaMTP is built on the foundations of a Seurat Object, it allows the user to still use many useful Seurat functions on their spatial metabolomic data using SpaMTP. Below, we will demonstrate how to use metabolite-based PCA results from SpaMTP, along with Seurat’s clustering functions, to identify this rare cell type known as the (lamina propria).

First we will run clustering:

ROI <- RunMetabolicPCA(ROI, assay = "Spatial", slot = "counts")
ROI <- FindNeighbors(ROI, dims = 1:30, verbose = FALSE) # use the SpaMTP object generated from PCA Pathway Analysis
ROI <- RunUMAP(ROI, dims = 1:30, verbose = FALSE)
ROI <- FindClusters(ROI, resolution = 0.3, cluster.name = "clusters", verbose = FALSE)

Now, lets plot the results:

## Custom palette for SpaMTP/Seurat Clustering
SpaMTP_palette = list("0" = "#0074B0", 
                      "1" = "#008E87", 
                      "2" = "#DE4D6C",
                      "3" = "#BF212E", 
                      "4" = "#C2B03B" ,
                      "5" = "#92f09a",
                      "6" = "#9FBEAC")

DimPlot(ROI, group.by = "clusters", cols = SpaMTP_palette, pt.size = 2) | ImageDimPlot(ROI, size = 2, group.by = "clusters", cols = SpaMTP_palette)  + coord_flip() + scale_x_reverse()

Comparing these results to the SSC segmentation we can see this clustering can still detect the urinary bladder muscle (cluster 0/1), urothelium (cluster 2/3) and the adventitia (cluster 4). In addition, we are able to detect a region that matches a similar location to that of the lamina propria (cluster 5) identified by the original publication.

Finding Spatially Correlated Metabolites

An added benefit of using spatial technologies rather then single-cell/bulk methods is the retainment of spatial context. This allows us to analyse trends across the whole tissue section that would otherwise be lost in bulk analyses.

In the original publication they defined an m/z = 743.5482 as the lamina propria.

Lets use SpaMTP’s feature co-localisation function to check the top metabolites that spatially correlate to cluster 5!

correlated_features <- FindCorrelatedFeatures(ROI, ident = "clusters", SM.assay = "Spatial", nfeatures = 6) #find the spatial correlation between each cluster and their relative metabolites

Because we are interested in the lamina propria, lets focus on the top correlated metabolites with cluster 5:

df <- correlated_features[correlated_features$ident == "5",] # subsets to only include results from cluster 5
df$mz_name <- paste0("mz-",round(df$mz, digits = 6)) # simplifies the m/z names for the plot

## Plots the top 10 results
ggplot(df, aes(x = -rank, y = correlation)) + geom_bar(stat = "identity") + geom_text(aes(label = mz_name), vjust = -0.5, size = 3, color = "black") + coord_flip() + theme_classic()

We can see that our top hit is m/z = 734.5467, which is described in Cardinal’s paper as the peak defining the lamina propria.

We can check by looking at the spatial expression of this m/z:

## Formats Plots 
plots <- list()
for (i in c("0", "2", "3", "4", "5")){
    suppressWarnings(clst <- subset(ROI, subset = clusters %in% i))
    plots[[i]] <- ImageDimPlot(clst, group.by = "clusters", cols = SpaMTP_palette, size = 1.5)
}
plots[["mz"]] <- ImageFeaturePlot(ROI, features = 'mz-743.546752929688', size = 1.2) & scale_fill_gradientn(colors = viridis::turbo(100)) 


## Plots data
patchwork::wrap_plots(plots, ncol = 3, nrow = 2)& coord_flip() & scale_x_reverse()

In addition to finding correlated m/z’s based on spatial location, we can also identify metabolites that demonstrate a strong spatial pattern. This is done using Moran’s I scores to determine the most spatially variable features.

ROI <- FindSpatiallyVariableMetabolites(ROI, assay = "Spatial", image = "fov") # Finds Spatially Strong m/z's

ImageFeaturePlot(ROI, features = GetSpatiallyVariableMetabolites(ROI, assay = "Spatial", n = 6), size = 1)  & coord_flip() & scale_x_reverse()

We can also run pathway analysis on the new clustering results. Lets do that now, this time using Seurat’s FindAllMarker() function to calculating DE metabolites, highlighting SpaMTP’s flexibility. You will see below that FindAllMarker() requires normalised data. SpaMTP has a NormalizeSMData() function that allows for TIC (Total Ion Current) normalisation. For more info on this form of normalisation, please read Jauhiainen et.al (2014).

## Annotate ROI with additional datasets
ROI <- AnnotateSM(ROI, db = rbind(Chebi_db,Lipidmaps_db, HMDB_db), polarity = "positive")

## Performs TIC normalisation
ROI <- NormalizeSMData(ROI)

## Run FindAllMarkers() on clustering results
Idents(ROI) <- "clusters"
DE <- FindAllMarkers(ROI, assay = "Spatial")

## Runs Pathway Analysis between clusters
DE_pathways <- FindRegionalPathways(SpaMTP = ROI,
                                ident = "clusters",
                                DE.list = list(DE),
                                analyte_types = c("metabolites"),
                                SM_assay = "Spatial",
                                ST_assay = NULL)

Below, we will visualise the top 5 pathways differentially expressed between clusters:

PlotRegionalPathways(regpathway = DE_pathways, num_display = 4)

We can also plots some key pathways of interest that were also upregulated in our dataset.

key_pathways <- c("Biochemical pathways: part I","G alpha (i) signalling events","Metabolism of carbohydrates","Sphingolipid metabolism: integrated pathway",
                  "Sensory perception of sweet, bitter, and umami (glutamate) taste", "Sensory perception of taste",
                  "G alpha (i) signaling events")

PlotRegionalPathways(regpathway = DE_pathways, selected_pathways = key_pathways)

Based on these findings, we observe that the lamina propria (cluster 5) has an up regulation of metabolites associated with sensory perception pathways. This matches with biology, as this is the layer of the bladder that contains sensory neurons.

There are a large range of additional applications SpaMTP can be used for, such as, multi-modality analysis using both spatial transcriptomics and spatial metabolomics. Check out the SpaMTP github or documentation site for more.

Session Info

sessionInfo()

## R version 4.3.2 (2023-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.3.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Australia/Brisbane
## tzcode source: internal
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] htmltools_0.5.8.1      EnhancedVolcano_1.20.0 ggrepel_0.9.6         
##  [4] ggplot2_3.5.1          dplyr_1.1.4            Seurat_5.1.0          
##  [7] SeuratObject_5.0.2     sp_2.1-4               Cardinal_3.4.3        
## [10] S4Vectors_0.40.2       EBImage_4.44.0         BiocParallel_1.36.0   
## [13] BiocGenerics_0.48.1    ProtGenerics_1.34.0    SpaMTP_1.0.0          
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.5                    matrixStats_1.5.0          
##   [3] spatstat.sparse_3.1-0       bitops_1.0-9               
##   [5] httr_1.4.7                  RColorBrewer_1.1-3         
##   [7] tools_4.3.2                 sctransform_0.4.1          
##   [9] R6_2.5.1                    lazyeval_0.2.2             
##  [11] uwot_0.2.2                  withr_3.0.2                
##  [13] gridExtra_2.3               progressr_0.15.1           
##  [15] cli_3.6.3                   Biobase_2.62.0             
##  [17] textshaping_0.4.1           spatstat.explore_3.3-4     
##  [19] fastDummies_1.7.4           biglm_0.9-3                
##  [21] shinyjs_2.1.0               labeling_0.4.3             
##  [23] sass_0.4.9                  spatstat.data_3.1-4        
##  [25] ggridges_0.5.6              pbapply_1.7-2              
##  [27] pkgdown_2.1.1               systemfonts_1.1.0          
##  [29] svglite_2.1.3               scater_1.30.1              
##  [31] parallelly_1.41.0           Rfast2_0.1.5.2             
##  [33] CardinalIO_1.0.0            limma_3.58.1               
##  [35] rstudioapi_0.17.1           generics_0.1.3             
##  [37] ica_1.0-3                   spatstat.random_3.3-2      
##  [39] crosstalk_1.2.1             Matrix_1.6-5               
##  [41] ggbeeswarm_0.7.2            abind_1.4-8                
##  [43] lifecycle_1.0.4             yaml_2.3.10                
##  [45] edgeR_4.0.16                SummarizedExperiment_1.32.0
##  [47] SparseArray_1.2.4           Rtsne_0.17                 
##  [49] grid_4.3.2                  promises_1.3.2             
##  [51] crayon_1.5.3                miniUI_0.1.1.1             
##  [53] lattice_0.21-9              beachmat_2.18.1            
##  [55] cowplot_1.1.3               zeallot_0.1.0              
##  [57] pillar_1.10.1               knitr_1.49                 
##  [59] fgsea_1.28.0                GenomicRanges_1.54.1       
##  [61] future.apply_1.11.3         codetools_0.2-19           
##  [63] Rnanoflann_0.0.3            fastmatch_1.1-6            
##  [65] leiden_0.4.3.1              glue_1.8.0                 
##  [67] spatstat.univar_3.1-1       data.table_1.16.4          
##  [69] vctrs_0.6.5                 png_0.1-8                  
##  [71] spam_2.11-0                 gtable_0.3.6               
##  [73] cachem_1.1.0                xfun_0.50                  
##  [75] S4Arrays_1.2.1              mime_0.12                  
##  [77] Rfast_2.1.4                 survival_3.5-7             
##  [79] SingleCellExperiment_1.24.0 signal_1.8-1               
##  [81] pheatmap_1.0.12             statmod_1.5.0              
##  [83] fitdistrplus_1.2-2          ROCR_1.0-11                
##  [85] nlme_3.1-163                matter_2.4.1               
##  [87] RcppAnnoy_0.0.22            GenomeInfoDb_1.38.8        
##  [89] bslib_0.8.0                 irlba_2.3.5.1              
##  [91] vipor_0.4.7                 KernSmooth_2.23-22         
##  [93] colorspace_2.1-1            DBI_1.2.3                  
##  [95] tidyselect_1.2.1            compiler_4.3.2             
##  [97] ontologyIndex_2.12          BiocNeighbors_1.20.2       
##  [99] xml2_1.3.6                  desc_1.4.3                 
## [101] ggdendro_0.2.0              DelayedArray_0.28.0        
## [103] plotly_4.10.4               scales_1.3.0               
## [105] lmtest_0.9-40               tiff_0.1-12                
## [107] stringr_1.5.1               digest_0.6.37              
## [109] goftest_1.2-3               fftwtools_0.9-11           
## [111] spatstat.utils_3.1-2        rmarkdown_2.29             
## [113] XVector_0.42.0              pkgconfig_2.0.3            
## [115] jpeg_0.1-10                 sparseMatrixStats_1.14.0   
## [117] MatrixGenerics_1.14.0       fastmap_1.2.0              
## [119] rlang_1.1.4                 htmlwidgets_1.6.4          
## [121] shiny_1.10.0                DelayedMatrixStats_1.24.0  
## [123] farver_2.1.2                jquerylib_0.1.4            
## [125] zoo_1.8-12                  jsonlite_1.8.9             
## [127] mclust_6.1.1                BiocSingular_1.18.0        
## [129] RCurl_1.98-1.16             magrittr_2.0.3             
## [131] kableExtra_1.4.0            scuttle_1.12.0             
## [133] GenomeInfoDbData_1.2.11     dotCall64_1.2              
## [135] patchwork_1.3.0             munsell_0.5.1              
## [137] Rcpp_1.0.14                 ggnewscale_0.5.0           
## [139] viridis_0.6.5               reticulate_1.40.0          
## [141] RcppZiggurat_0.1.6          stringi_1.8.4              
## [143] zlibbioc_1.48.2             MASS_7.3-60                
## [145] plyr_1.8.9                  rgoslin_1.6.0              
## [147] parallel_4.3.2              listenv_0.9.1              
## [149] deldir_2.0-4                splines_4.3.2              
## [151] tensor_1.5                  locfit_1.5-9.10            
## [153] igraph_2.1.2                spatstat.geom_3.3-4        
## [155] RcppHNSW_0.6.0              reshape2_1.4.4             
## [157] ScaledMatrix_1.10.0         evaluate_1.0.3             
## [159] RcppParallel_5.1.9          httpuv_1.6.15              
## [161] RANN_2.6.2                  tidyr_1.3.1                
## [163] purrr_1.0.2                 polyclip_1.10-7            
## [165] future_1.34.0               scattermore_1.2            
## [167] rsvd_1.0.5                  xtable_1.8-4               
## [169] RSpectra_0.16-2             later_1.4.1                
## [171] viridisLite_0.4.2           ragg_1.3.3                 
## [173] tibble_3.2.1                beeswarm_0.4.0             
## [175] IRanges_2.36.0              cluster_2.1.4              
## [177] globals_0.16.3