Abstract
Spatial omics is a broad term referring to technologies that allow for biomolecules to be observed within their native tissue context. These technologies have been used by biomedical researchers to gain a better understanding of cellular interactions, tumor microenvironment dynamics, and immune cell infiltration. While the basic outputs, such as spatial coordinates, segmentation masks, and transcript/protein matrices, are provided by the instrument software, the true biological insights come from several downstream, specialized analysis steps. Since spatial omics remains a relatively new field, no unified analysis pipeline has yet been established to encompass all platforms. Most workflows are adapted from single-cell RNA sequencing analysis frameworks, while incorporating additional steps that are specific to spatial data, especially for imaging-based technologies. At the same time, the diversity of platforms, data modalities, and output formats has introduced substantial challenges for data representation, interoperability, and cross-platform integration, highlighting the need for flexible, spatially aware, and user-friendly data structures made specifically for imaging-based data, not merely adapted from other methods. This review summarizes the general analytical steps following spatial omics data acquisition, commonly used data infrastructures and tools, existing gaps, and future directions in the field.
Keywords
1. Introduction
Advances in transcriptomic technologies have dramatically expanded the ability of biomedical researchers to characterize cellular function, diversity, and organization. Bulk RNA-sequencing first enabled high-throughput measurement of gene expression at the tissue level, offering a global view of transcriptional activity but inherently averaging signals across heterogeneous cell populations[1]. This limitation motivated the development of single-cell RNA sequencing (scRNA-seq), which transformed the field by resolving gene expression at the level of individual cells[2]. With single-cell data came a new generation of computational methods, ranging from normalization frameworks suited to sparse count data, to algorithms for clustering, lineage inference, batch correction, and large-scale data integration, each designed to extract structure from increasingly complex cellular landscapes[3].
Despite its success, single-cell profiling requires dissociating tissues, which disrupts the native spatial organization of cells and removes the spatial context of gene expression. This loss of spatial information limits the ability to study cell-cell interactions (CCIs), tissue architecture, and the tumor environment. Spatial omics technologies emerged to restore this critical dimension[4]. Platforms such as sequencing-based spatial transcriptomics (sST), in situ hybridization (ISH), and multiplexed imaging now provide high-dimensional molecular measurements directly within intact tissue architectures. As these methods emerged, analytical workflows were first adapted from previous technologies (e.g., scRNA-seq), then gradually evolved to incorporate image processing, spatial statistics, probabilistic modeling, multimodal integration, and new frameworks for mapping cell types and interactions in situ.
Together, this progression from bulk to single cell to spatial profiling reflects a broader transition from measuring what genes are expressed, to understanding which cells express them, and finally, to uncovering where those cells reside and interact within their native microenvironments. The remainder of this review focuses on the analytical landscape that unfolds once spatial omics data is generated, beginning with an overview of the major spatial technologies and the forms of data they produce. This review will focus on sST and proteomics platforms since these are currently the most popular in the field, and most computational analysis tools have been designed to take these modalities as their primary input (Figure 1).
Figure 1. Schematic overview of computational analysis in spatial omics (transcriptomics and proteomics). Input depicts the output files from various spatial omics platforms, usually containing a count matrix with spatial coordinates and image(s). Most analysis is followed by various secondary analysis steps to clean and pre-process data: quality control and filtering; normalization; and cell segmentation. Tertiary analysis includes analyzing the data to understand the biological context. Created in BioRender. Plummer, J. (2026) https://app.biorender.com/illustrations/69d50cc9c059ab84dd7db164. PCA: principal component analysis; UMAP: uniform manifold approximation and projection; t-SNE: t-distributed stochastic neighbor embedding.
2. Types of Spatial Omics Technologies
Spatial omics technologies can be broadly grouped into three categories based on their measurement approach: sequencing-based spatial transcriptomics (sST), imaging-based spatial transcriptomics (iST), and imaging-based spatial proteomics (iSP)[5]. Sequencing-based platforms such as Visium, Visium HD, Slide-seq V2, and Stereo-seq provide transcriptome-wide coverage but with varying spatial resolution, from near-single-cell to spot-based measurements averaging multiple cells. Imaging-based transcriptomics platforms such as Xenium, MERSCOPE, and CosMx offer subcellular resolution, but mainly profile targeted gene panels. Spatial proteomics, which has gained significant momentum in recent years[6], enables highly multiplexed protein measurement through cyclical immunofluorescence (IF) platforms (e.g., CellScape, COMET, CODEX, MACSima) and imaging mass cytometry (IMC). Beyond these commercial platforms, academia has also contributed significantly to the advancement of spatial omics with approaches such as Spatial-Mux-seq[7], which enables simultaneous multimodal spatial profiling, and Deep-STARmap and Deep-RIBOmap[8], which allow single-cell transcriptomic profiling in 3D tissue blocks.
These platforms have fundamental differences that directly impact the data structure of their outputs, computational requirements, and appropriate analytical strategies. This review will cover general workflows for processing each of these categories of spatial omics data. However, even within these categories, platforms exhibit substantial heterogeneity in their technical specifications, necessitating flexible analytical approaches tailored to each.
2.1 sST
sST allows for transcriptome-wide profiling across tissue sections. These methods rely on spatially indexed capture surfaces where transcripts are tagged with location-specific barcodes[9]. While alternative sST approaches exist, including microdissection-based methods (e.g., LCM-seq) and in situ sequencing (ISS) (e.g., FISSEQ), spatially barcoded capture platforms dominate current research and public repositories. Therefore, this review focuses primarily on these barcoding-based methods (Figure 2).
Figure 2. Overview of the computational workflow for analyzing sST and iST data. Input designates the output files from various spatial platforms. Filtering is conducted as a quality control step. Preprocessing allows for multiple samples to be incorporated into downstream analysis. Secondary analysis includes clustering and cell type annotation using marker genes. Spatial differential expression and neighborhood analysis are used to assess region-specific patterns and CCIs. Tertiary analysis includes visualization and integration across other modalities. Created in BioRender. Plummer, J. (2026) https://app.biorender.com/illustrations/69602bcf9f79a39835492d13?slideId=4056948a-e057-4b5a-a9b8-3a8c09b04843. sST: sequencing spatial transcriptomics; iST: imaging based spatial transcriptomics; CCIs: cell-cell interactions.
Most sST platforms use capture areas that are larger than individual cells, with diameters ranging from 10 μm to 100 μm in earlier technologies. Visium HD stands out with 2 μm × 2 μm squares, reaching near-single-cell resolution. However, because capture areas are predefined and spatially uniform rather than aligned to cell boundaries, even high-resolution methods do not achieve true single-cell resolution, where each measurement corresponds to an intact, segmented cell.
sST platforms face an inherent resolution-coverage trade-off. Lower resolution spots aggregate more cellular material, yielding higher transcript counts per measurement but obscuring cellular heterogeneity. Higher resolution capture areas approach single-cell dimensions but suffer from extreme sparsity, often capturing only tens to hundreds of unique molecular identifiers (UMIs) per bin. To address this, high-resolution data typically require computational binning or imputation to achieve adequate coverage for downstream analysis. Conversely, lower resolution data necessitates deconvolution methods to infer cell-type composition within each multi-cell spot, often leveraging reference single-cell datasets or spatial patterns to guide decomposition.
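The binning step described above can be sketched in a few lines. This is a minimal illustration in plain Python, assuming that fine bins are indexed by integer (row, col) grid positions and that counts are held as per-gene dictionaries. The function name `aggregate_bins`, the gene names, and the toy values are hypothetical; production pipelines operate on sparse matrices instead.

```python
from collections import defaultdict


def aggregate_bins(bin_counts, factor):
    """Sum UMI counts from fine bins into coarser square bins.

    bin_counts: dict mapping (row, col) grid index -> {gene: UMI count}
    factor: number of fine bins per coarse bin side
            (e.g., 4 merges 2 um bins into 8 um bins)
    """
    coarse = defaultdict(lambda: defaultdict(int))
    for (row, col), genes in bin_counts.items():
        # Integer division maps each fine bin to its enclosing coarse bin.
        key = (row // factor, col // factor)
        for gene, count in genes.items():
            coarse[key][gene] += count
    return {key: dict(genes) for key, genes in coarse.items()}


# Toy 2 um bins (hypothetical values).
fine = {
    (0, 0): {"EPCAM": 1},
    (0, 3): {"EPCAM": 2, "CD3E": 1},
    (3, 3): {"CD3E": 1},
    (4, 0): {"PTPRC": 3},
}
binned = aggregate_bins(fine, factor=4)
print(binned)
```

Larger values of `factor` raise counts per bin at the cost of mixing more cells into each measurement, which is the trade-off discussed above.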
The raw output for sequencing-based platforms consists of BCL files, which are converted to FASTQ files containing spatial barcodes, UMIs, and cDNA sequences (Table 1). Standard genomic alignment of these FASTQs produces BAM files, which are then processed into spot-by-gene count matrices for downstream analysis. Several platform manufacturers provide dedicated preprocessing workflows to perform read alignment, barcode assignment, and count matrix generation, such as Space Ranger and the Stereo-seq Analysis Workflow (SAW) from STOmics. For technologies that lack such software, open-source, community-created pipelines perform the same functions, such as the WDL Analysis Research Pipelines (WARP)[11]. Some platforms, such as Visium, also generate tissue images at varying resolutions, such as high-resolution histology images that enable morphological analysis or lower-resolution images for quality control (QC) and spot-to-tissue registration.
| Platform | Primary Output Files | Pipeline for initial processing/visualization |
| Sequencing-based Spatial Transcriptomics | ||
| Visium | Barcode mappings (.parquet), spot x gene matrices (.h5), images (.tiff, .png) | Space Ranger |
| Visium HD | Barcode mappings (.parquet), binned spot x gene matrices (.h5), feature slices (.h5), images (.tiff, .png), cell segmentation (.geojson) | Space Ranger |
| GeoMx | Counts data (.dcc), configuration file (.pkc), sequencing data (.fastq), images (.ome.tif, .ome.xml) | GeomxTools |
| Stereo-seq | Gene expression matrices (.gef, .gem), images (.tiff, .tar.gz) | SAW |
| Slide-seq | Bead x gene matrix (.h5ad), aligned BAM, FASTQ metrics (.txt), UMI metrics (.csv.gz), gene metrics (.csv.gz), cell metrics (.csv.gz) | WARP (Slide-seq pipeline) |
| DBiT-seq | Images (.tif, .png), tissue positions (.csv), fragment files (.tsv.gz) | ATX_epigenomics (GitHub), AtlasXBrowser |
| Imaging-based Spatial Transcriptomics | ||
| CosMx | Expression matrix (.csv), polygons (.csv), FOV positions (.csv), transcript data (.csv) | AtoMx SIP |
| Xenium | Images (.tif, .ome.tif), cell summary file (.csv.gz), cell and nucleus segmentation (.zarr.zip, .csv.gz, .parquet), transcript data (.parquet, .zarr.zip), cell x gene matrix (.tsv.gz, .h5, .zarr.zip) | Xenium Onboard Analysis, Xenium Ranger |
| MERSCOPE | Transcripts (.csv), cell boundaries (.parquet), cell x gene matrix (.csv), images (.tif) | MERSCOPE Visualizer |
| Imaging-based Spatial Proteomics | ||
| CODEX (Phenocycler) | Raw images (.qptiff), multi-channel images (.tif), cell locations (.csv), cell x protein matrix (.csv) | SOPA[10] |
| MIBI | Multi-channel images (.tif), segmentation (.tif, .fcs, .csv, .txt), cell data table (.csv) | MIBIscope System |
| CyTOF | Multi-channel images (.tif), segmentation (.csv), cell x protein matrix (.csv) | CyTOF instrument |
| CellScape | Raw images (.ome.tif), multi-channel images (.ome.tif), segmentation (.csv), cell data table (.csv) | QuPath |
| MACSima | Multi-channel images (.ome.tif), cell data table (.csv, .fcs) | MACS iQ View Analysis Software |
| COMET | Raw images (.ome.tif), multi-channel images (.ome.tif), segmentation and dot overlays (.csv), cell data table (.csv) | HORIZON |
BAM: binary alignment map; SAW: Stereo-seq Analysis Workflow; WARP: WDL Analysis Research Pipelines; FOV: field of view; SIP: spatial informatics platform; SOPA: spatial omics pipeline and analysis; MIBI: multiplexed ion beam imaging; COMET: comprehensive multiplexed epitope tracking.
2.2 iST
iST encompasses two main approaches: multiplexed ISH, which uses repeated rounds of probe hybridization and imaging, and ISS, which sequences barcoded transcripts directly in tissue[12]. Both use highly multiplexed fluorescence imaging. ISH-based platforms include MERFISH, seqFISH+, Xenium, and CosMx, while ISS-based platforms include STARmap and BaristaSeq. Unlike sequencing-based approaches, which infer transcript location through barcoded capture spots, imaging-based platforms localize each RNA molecule with single-molecule precision. This enables true single-cell and even subcellular resolution (Figure 2).
Because these platforms require predefined probe sets, they are inherently panel-based. Panel size varies by technology, ranging from tens or hundreds of genes to several thousand and, for some platforms, up to the whole transcriptome. Although this usually limits transcriptome breadth relative to sequencing-based methods, panel-based approaches typically achieve higher sensitivity and lower technical noise.
Most iST platforms include proprietary software for image processing and transcript decoding, such as Xenium Ranger and the AtoMx spatial informatics platform, though the extent of on-instrument processing varies. Each detected transcript receives (x, y) coordinates relative to the tissue section or field of view along with an assigned gene identity. Cell segmentation is required to associate transcripts with individual cells, and this is often performed using nuclear and/or membrane fluorescent markers (e.g., DAPI, DiO). Final processed datasets typically include a transcript table, segmentation masks, a cell-by-gene feature matrix, cell morphology measurements, and high-resolution images.
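The way a transcript table and a segmentation mask combine into a cell-by-gene matrix can be sketched as follows. This is a simplified illustration, not any vendor's implementation: it assumes a 2D integer label mask where 0 marks background, transcript coordinates in the same units as a hypothetical `pixel_size` parameter, and toy gene names.

```python
from collections import Counter, defaultdict


def assign_transcripts(transcripts, label_mask, pixel_size=1.0):
    """Build per-cell gene counts from transcripts and a label mask.

    transcripts: iterable of (x, y, gene), coordinates in units of pixel_size
    label_mask: 2D list of ints; 0 = background, k > 0 = cell k
    Returns (counts, n_unassigned); counts[cell_id] is a Counter of genes.
    """
    counts = defaultdict(Counter)
    n_unassigned = 0
    n_rows, n_cols = len(label_mask), len(label_mask[0])
    for x, y, gene in transcripts:
        # x indexes columns and y indexes rows of the mask.
        col, row = int(x // pixel_size), int(y // pixel_size)
        inside = 0 <= row < n_rows and 0 <= col < n_cols
        cell_id = label_mask[row][col] if inside else 0
        if cell_id == 0:
            n_unassigned += 1  # background or outside the imaged area
        else:
            counts[cell_id][gene] += 1
    return dict(counts), n_unassigned


# Toy mask with two cells and a background gap between them.
mask = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]
transcripts = [
    (0.5, 0.5, "CD3E"),
    (1.2, 1.7, "CD3E"),
    (3.1, 0.2, "EPCAM"),
    (2.5, 1.0, "EPCAM"),  # falls on background
    (3.9, 2.9, "KRT8"),
]
counts, n_unassigned = assign_transcripts(transcripts, mask)
print(counts, n_unassigned)
```

Because every count depends on the mask, any change to the segmentation changes the resulting matrix.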
2.3 iSP
iSP platforms extend the principles of iST to the proteome. Instead of nucleic acid probes, these methods use fluorescently labeled or metal-tagged antibodies to detect dozens to over one hundred protein markers within intact tissue sections. Because proteins represent the functional effectors of cellular processes, such as signaling molecules, receptors, and transcription factors, spatial proteomics reveals layers of biological information that are not directly captured from RNA measurements alone. Critically, these platforms can measure post-translational modifications, activation states (e.g., phosphorylation), and protein abundance, which often correlate poorly with transcript levels due to translational regulation and protein stability.
These assays typically operate through iterative imaging cycles or mass-spectrometry-based detection, enabling high multiplexing without spectral overlap. They provide single-cell or subcellular spatial resolution, often capturing protein localization to membranes, cytoplasmic domains, or specific intracellular compartments. In contrast with transcriptomic imaging methods that detect discrete transcript puncta, proteomic data consist of high-dimensional intensity-based measurements across multiple channels, requiring robust image registration, normalization, artifact correction, and segmentation strategies. Akoya provides an IO60 panel that can analyze 60 protein markers, and other commercialized platforms can detect 20- to 60-plex panels, while some can extend to 100-plex[13].
The primary outputs of iSP platforms consist of multi-channel fluorescence images or ion images from mass spectrometry, with each channel corresponding to a specific protein marker. Cell segmentation, similar to iST, is typically performed using nuclear and/or membrane markers; it defines cellular boundaries and assigns protein expression measurements to individual cells. The resulting cell-by-protein expression matrix contains intensity-based measurements rather than discrete counts, representing protein abundance through mean fluorescence intensity, integrated intensity, or other summary statistics per cellular compartment (Figure 3). Most platforms also provide spatial coordinate information and morphological feature measurements that enable spatially aware downstream analysis.
Figure 3. Overview of computational workflow for analyzing iSP data. Created in BioRender. Plummer, J. (2026) https://app.biorender.com/illustrations/69680e6ab268012d91f5251c?slideId=4056948a-e057-4b5a-a9b8-3a8c09b04843. iSP: imaging-based spatial proteomics; PCA: principal component analysis.
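The cell-by-protein matrix described in this section can be illustrated with a small sketch that computes mean intensity per cell and per channel from a label mask. It is a simplified example in plain Python under stated assumptions: images are 2D lists that share the mask's shape, 0 marks background, and the function name `mean_intensity_per_cell`, marker names, and intensities are hypothetical.

```python
def mean_intensity_per_cell(channels, label_mask):
    """Summarize each channel as mean intensity per segmented cell.

    channels: dict mapping marker name -> 2D list of pixel intensities
    label_mask: 2D list of ints with the same shape; 0 = background
    Returns (means, areas): means[cell_id][marker] and areas[cell_id] in pixels.
    """
    sums, areas = {}, {}
    for r, mask_row in enumerate(label_mask):
        for c, cell_id in enumerate(mask_row):
            if cell_id == 0:
                continue  # skip background pixels
            areas[cell_id] = areas.get(cell_id, 0) + 1
            cell_sums = sums.setdefault(cell_id, {m: 0.0 for m in channels})
            for marker, image in channels.items():
                cell_sums[marker] += image[r][c]
    means = {
        cell_id: {m: total / areas[cell_id] for m, total in marker_sums.items()}
        for cell_id, marker_sums in sums.items()
    }
    return means, areas


# Toy two-channel image with two cells.
mask = [
    [1, 1, 0],
    [2, 2, 2],
]
channels = {
    "CD45": [[10, 20, 5], [0, 0, 3]],
    "PanCK": [[0, 2, 1], [30, 60, 90]],
}
means, areas = mean_intensity_per_cell(channels, mask)
print(means, areas)
```

Integrated intensity corresponds to the per-cell sums before division by area; which summary statistic is appropriate depends on the marker and platform.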
2.4 Diverse output formats from spatial omics platforms
Different spatial omics platforms generate data with distinct formats, structures, and analytical requirements. To accommodate spatial coordinates, multimodal measurements, and associated imaging data, several data representations and software ecosystems have been proposed across analysis communities (Figure 4). In the Bioconductor R repository ecosystem, SpatialExperiment[14] extends the widely used scRNA-seq data structure SingleCellExperiment[15] by introducing native support for spatial coordinates and associated imaging data. Specifically, spatial metadata is stored alongside cell-level annotations, where spatial coordinates are represented as a dedicated component, and histological images can also be linked to the object. Seurat[16] is another popular R framework for scRNA-seq analysis that already supports various sST and iST platforms.
Figure 4. Overview of data representations for spatial omics. This schematic illustrates how these frameworks organize spatial omics data into common conceptual components. Across both R and Python ecosystems, core elements include: (i) feature-by-cell gene expression matrices (e.g., assays or X), (ii) cell- or spot-level metadata (e.g., colData or obs), (iii) feature-level annotations (e.g., rowData or var), and (iv) reduced dimensional representations and graphs for downstream analysis. In spatially resolved datasets, these structures are extended to incorporate spatial coordinates and, in some cases, linked imaging data. Created in BioRender. Plummer, J. (2026) https://app.biorender.com/illustrations/69e1151c16f29279e0b09ccf.
In the Python ecosystem, AnnData[17] serves as the core data structure adopted by the single-cell gene expression analysis toolkit Scanpy[18]. Voyager[19] is implemented in both the R/Bioconductor and Python/PyPI ecosystems, and it introduces the SpatialFeatureExperiment data structure, which extends SpatialExperiment by incorporating Simple Features[20].
Because of the heterogeneity of ST data, challenges in cross-platform integration, and the large data volume of high-resolution images from imaging-based technologies, SpatialData[21] was developed. It is a flexible and extensible framework to manage large-scale spatial omics data and support out-of-core computation. It also enables consistent spatial alignment and cross-modal integration through shared coordinate systems. SpatialData provides a dedicated I/O interface (spatialdata-io) that supports data ingestion from a wide range of commonly used commercial spatial omics platforms. SpatialData objects are stored in the SpatialData Zarr file format, an extension of Zarr and OME-NGFF that enables storage of large images, data, and metadata, all linked to each other in an efficient and interoperable way.
Within the SpatialData framework, spatial information is organized into five core spatial elements: images, labels, points, shapes, and tables. Raster images are represented as images; transcript coordinates are stored as points; segmentation masks are encoded as labels; geometric objects such as cell boundaries, nuclei outlines, or circular regions are represented as shapes; and molecular measurements such as gene and protein expression, fluorescence intensity, and associated metadata are stored in tables in AnnData format.
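To make the five element types concrete, the sketch below groups toy data under the same five names. It is a conceptual stand-in written with a plain dataclass, not the SpatialData API: the class name `SpatialContainer`, the element names, and all values are hypothetical, and real tables would be AnnData objects rather than nested dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class SpatialContainer:
    """Conceptual grouping of the five element types (not the SpatialData API)."""

    images: dict = field(default_factory=dict)  # name -> raster (channel, y, x)
    labels: dict = field(default_factory=dict)  # name -> integer segmentation mask
    points: dict = field(default_factory=dict)  # name -> list of (x, y, gene)
    shapes: dict = field(default_factory=dict)  # name -> {cell_id: polygon vertices}
    tables: dict = field(default_factory=dict)  # name -> per-cell measurements

    def element_names(self):
        """List the stored element names for each of the five element types."""
        kinds = ("images", "labels", "points", "shapes", "tables")
        return {kind: sorted(getattr(self, kind)) for kind in kinds}


# Toy sample: one cell covering the top row of a 2 x 2 pixel image.
sample = SpatialContainer()
sample.images["dapi"] = [[[9, 7], [0, 1]]]
sample.labels["cells"] = [[1, 1], [0, 0]]
sample.points["transcripts"] = [(0.4, 0.6, "CD3E")]
sample.shapes["cell_boundaries"] = {1: [(0, 0), (2, 0), (2, 1), (0, 1)]}
sample.tables["cell_table"] = {1: {"CD3E": 1, "area_px": 2}}
summary = sample.element_names()
print(summary)
```

In this toy layout the cell ID shared by the label mask, the polygon, and the table is what links a row of measurements to its position in the tissue.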
3. Advanced Data Analysis Steps and Pipelines
Once the data is generated, there are several steps that must be taken to generate biological insights. Not all of these steps are necessary in every case, but they are common in most workflows. These additional pipelines allow for better data refinement and integrative analysis methods.
3.1 Segmentation refinement
Cell segmentation represents a foundational step in imaging-based spatial omics, as accurate assignment of transcripts and proteins to individual cells is a prerequisite for all downstream single-cell analyses. Segmentation errors can propagate through the entire analytical pipeline. Misassignment of transcripts creates artificial co-expression patterns, under-segmentation merges distinct cell types and obscures heterogeneity, and over-segmentation fragments individual cells into spurious subpopulations. These errors fundamentally compromise biological interpretation in cell phenotyping, spatial niche analysis, cell-cell communication inference, and tumor microenvironment characterization[22]. While manufacturers provide default segmentation outputs, these general-purpose algorithms often require tissue-specific parameter optimization, necessitating fine-tuning.
Most platforms include proprietary segmentation refinement tools that allow parameter adjustment without requiring complete re-segmentation. For instance, 10x Genomics released Xenium Ranger, which includes a re-segmentation function enabling users to tune nuclear expansion distance, DAPI intensity thresholds, and expected cell size constraints. Similarly, Vizgen’s Post-processing Tool enables manual boundary correction and regeneration of single-cell expression matrices from updated segmentation masks, and NanoString’s FastReseg[23] uses transcript-guided corrections to improve boundary accuracy in three dimensions. However, these vendor-specific tools are typically limited to their respective platforms and may not accommodate complex tissue architectures or specialized segmentation requirements.
To address these limitations, numerous third-party segmentation methods have been developed, predominantly leveraging deep learning approaches trained on diverse tissue types and imaging modalities (Table 2). Deep learning algorithms such as Cellpose and StarDist are popular choices due to their robust performance across tissue types, pre-trained models, and ability to handle challenging morphologies. Cellpose uses a flow-based representation of cell shapes, making it particularly effective for irregular cell boundaries, while StarDist employs star-convex polygon representations optimized for round or elliptical cells.
| Tool/Pipeline | Method Summary | Input | Output | Language |
| Segmentation Only | ||||
| FastReseg | Scores transcripts, identifies missegmented transcripts via a SVM, then reassigns transcripts via decision tree | Imaging-based transcriptomics data, reference transcript profile (scRNA-seq/spatial) | Reassigned transcripts | R |
| Cellpose[24] | Generates topological maps, neural network predicts gradients, uses gradient tracking to group pixels in cells | Fluorescence/histology images | Segmentation masks | Python |
| StarDist[25] | Object detection based on U-Net, predicts star-convex polygons for every pixel | Fluorescence/histology images | Segmented images | Python |
| Segger[26] | Encodes data as heterogeneous graph, GNN is then trained on cell-transcript links and refines them | Imaging-based transcriptomics data, scRNA-seq reference (optional) | Reassigned transcripts | Python |
| BIDCell[27] | Self-supervised deep-learning model using biologically-informed multiple loss functions to optimize learnable parameters for segmentation | Spatial transcriptomics data, histology image, scRNA-seq reference | Segmentation masks, cell x gene matrix | Python |
| Baysor[28] | Models data as mixture of cell-specific distributions, uses Bayesian mixture models to separate the mixture | Imaging-based transcriptomics data | Molecule coordinates, cell polygons | Julia |
| Proseg[29] | Based on unsupervised probabilistic model of the spatial distribution of transcripts | Imaging-based transcriptomics data | Reassigned transcripts, cell polygons | Rust |
| RAMCES[30] | Uses CNN to learn optimal markers, utilizes weighted combination of selected markers for segmentation | Imaging-based proteomics data | Marker rankings and weighted images | Python |
| Segmentation and Cell Type Classification | ||||
| JSTA[31] | Uses DNN to assign pixel-level cell type labels | Spatial transcriptomics data, scRNA-seq reference | Reassigned pixel labels | Python |
| ClusterMap[32] | Integrates spatial and expression data, density peak clustering to identify biologically meaningful structures | Imaging-based transcriptomics data | Segmentation mask, cell type annotation, tissue region map | Python |
| CelloType[33] | Transformer-based DNN with multiple branches to perform object detection, segmentation, and classification concurrently | Imaging-based transcriptomics data, histology image | Segmentation mask, object boxes and classes | Python |
| Bering[34] | Graph deep learning model utilizes transcript colocalization for cell type annotation, transcript representations are transferred to segmentation task | Imaging-based transcriptomics data | Cell type annotations, reassigned transcripts | Python |
SVM: support vector machine; GNN: graph neural network; CNN: convolutional neural network; DNN: deep neural network; JSTA: joint cell segmentation and cell type annotation.
Some more recent segmentation methods aim to leverage spatial omics-specific information. Segger, a graph neural network-based approach, constructs spatial graphs from transcript locations and uses message passing to refine cell boundaries while requiring fewer computational resources than image-based deep learning methods. BIDCell and other transcript-aware methods similarly exploit the observation that transcripts from the same cell should cluster spatially, using this biological prior to inform segmentation decisions.
Frequently in iST data, substantial fractions of detected transcripts remain unassigned to any cell after segmentation. While this is conventionally treated as technical noise from extracellular space or segmentation errors, accumulating evidence suggests biological relevance for some unassigned transcripts[35]. Troutpy has been developed to analyze spatial patterns of unassigned transcripts, testing whether they exhibit non-random spatial organization that would indicate biological signal rather than uniform technical noise. If unassigned transcripts show spatial clustering, co-localization with specific cell types, or enrichment in particular tissue regions, this information can inform iterative segmentation refinement by expanding boundaries in regions with high unassigned transcript density or adjusting segmentation algorithms to capture cellular protrusions.
Unlike imaging-based technologies, sST does not require segmentation refinement because of its spot-based nature, but high-resolution variants such as Visium HD can benefit from cell-aware aggregation. Tools like Bin2cell[36] address this by applying image-based segmentation, using StarDist or similar methods on H&E images, to group bins into putative single cells, enabling cell-level rather than bin-level analysis and improving compatibility with single-cell analytical frameworks.
As an alternative to simple cell segmentation, several methods have emerged that treat segmentation and cell type assignment as coupled problems rather than sequential steps, leveraging the observation that cell type identity constrains expected morphology. One of the most popular tools for this is JSTA, which uses a scRNA-seq reference and initial watershed segmentation as input to a deep neural network to classify cells. This is followed by iterative reassignment of pixels based on local RNA densities until convergence is reached. These types of approaches can improve both segmentation and cell annotation accuracy when cell type groupings are available, though often at increased computational cost.
After initial refinement, segmentation quality should be evaluated to determine if further refinement is necessary. In the absence of manual annotations, this is typically done through biological QC metrics: examining cell size distributions, doublet rates, unassigned transcript fractions, and whether known cell-type-specific markers localize appropriately. When ground truth cellular boundaries exist, overlap metrics like Intersection over Union provide quantitative accuracy measures, though manual segmentation remains labor-intensive and subjective for complex tissues.
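Intersection over Union for a single cell is the number of pixels shared by the predicted and reference masks divided by the number of pixels covered by either. A minimal sketch, assuming both masks are same-shaped binary 2D lists; the function name and toy masks are illustrative, and returning 0.0 for two empty masks is a convention chosen here.

```python
def intersection_over_union(mask_a, mask_b):
    """IoU between two binary masks given as same-shaped 2D lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks have no defined overlap; 0.0 is a convention used here.
    return intersection / union if union else 0.0


predicted = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
reference = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
iou = intersection_over_union(predicted, reference)
print(round(iou, 3))
```

Here the masks share 2 pixels and together cover 6, so the IoU is 1/3.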
3.2 QC
QC involves filtering out low quality data points that may introduce technical noise, create spurious clusters, or obscure genuine biological signals. QC strategies for spatial omics largely build upon established single-cell RNA-seq workflows with some being adapted to accommodate spatial context and platform-specific characteristics.
Common QC procedures for sST include filtering cells or spots based on outlier total counts, number of unique transcripts, high percentage of mitochondrial genes, and number of cells per capture spot. These steps are routinely implemented using widely adopted analysis frameworks in both Python and R ecosystems such as Scanpy and Seurat. Importantly, Bhuva et al.[37] highlighted the critical impact of library size variation in sST data, demonstrating that differences in sequencing depth can substantially influence downstream analyses and interpretation. Their findings underscore the need for careful consideration of library size-related biases when performing QC and normalization in spatial omics studies.
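The filtering logic above can be sketched with the standard library alone. This example flags spots whose total counts lie more than a chosen number of median absolute deviations (MADs) from the median, or whose mitochondrial fraction is too high. The thresholds (3 MADs, 20% mitochondrial), field names, and toy values are assumptions for illustration and should be tuned per dataset; Scanpy and Seurat offer equivalent functionality on real data structures.

```python
from statistics import median


def mad_bounds(values, n_mads=3.0):
    """Return (low, high) bounds at n_mads median absolute deviations."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med - n_mads * mad, med + n_mads * mad


def filter_spots(spots, max_mito_fraction=0.2, n_mads=3.0):
    """Keep IDs of spots passing count-outlier and mitochondrial filters."""
    low, high = mad_bounds([s["total_counts"] for s in spots], n_mads)
    kept = []
    for s in spots:
        total = s["total_counts"]
        # Spots with zero counts are treated as failing the mitochondrial check.
        mito_fraction = s["mito_counts"] / total if total else 1.0
        if low <= total <= high and mito_fraction <= max_mito_fraction:
            kept.append(s["id"])
    return kept


# Toy spots (hypothetical values): "f" has very low counts, "g" is mito-rich.
spots = [
    {"id": "a", "total_counts": 1000, "mito_counts": 50},
    {"id": "b", "total_counts": 1100, "mito_counts": 40},
    {"id": "c", "total_counts": 900, "mito_counts": 60},
    {"id": "d", "total_counts": 1050, "mito_counts": 30},
    {"id": "e", "total_counts": 950, "mito_counts": 45},
    {"id": "f", "total_counts": 50, "mito_counts": 2},
    {"id": "g", "total_counts": 1000, "mito_counts": 400},
]
kept = filter_spots(spots)
print(kept)
```

MAD-based bounds adapt to each sample's own distribution, which is one way to account for the library size variation noted above.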
For iST, QC requirements have some overlap with sST (e.g., filtering on total counts), but there are distinct differences. Primarily, image background denoising and cell segmentation refinement should be completed prior to biological QC, and mitochondrial gene QC is generally not required. Because of the absence of standardized metrics for evaluating iST data quality, Plummer et al.[38] proposed a set of quantitative metrics for assessing data quality. These include technical measurements such as transcripts per cell to evaluate sensitivity, as well as normalized transcripts per nucleus as a complementary indicator for cell segmentation. The authors released SpatialQM for metrics calculation, and the SpatialTouchstone portal as a shared resource for the academic community to compare data quality across datasets.
When working with iSP, QC begins with raw image assessment and preprocessing, including evaluation of focus quality, signal-to-noise ratios across channels, and proper registration between imaging cycles. Background correction and denoising are essential as protein intensity measurements are particularly sensitive to autofluorescence, non-specific antibody binding, and cycle-to-cycle variation in staining efficiency. Channel-specific QC should be used to evaluate antibody performance by examining signal distributions and identifying channels with abnormally low signal or high background. After cell segmentation, cellular measurements undergo filtering based on total protein expression, number of detected markers, and morphological features.
While applying non-spatial QC methods is sufficient in many cases, leveraging spatial context during QC can enable distinction between biological heterogeneity and technical artifacts. An example of this would be fibrotic and necrotic areas, which are regions with naturally low counts. When such cells or spots form spatially coherent patterns aligned with known tissue structures, excluding them solely based on mitochondrial gene thresholds may inadvertently remove biologically meaningful signals. Incorporating spatial context by visualizing suspicious data points can help distinguish biologically relevant regions from technical artifacts.
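One simple way to bring spatial context into QC is to ask, for each flagged spot, what fraction of its spatial neighbors are also flagged. The sketch below is one hypothetical heuristic, not a published method: the function name, radius, and coordinates are illustrative assumptions. High fractions suggest a spatially coherent region worth inspecting before removal, while isolated flagged spots look more like technical artifacts.

```python
from math import hypot


def flagged_neighbor_fraction(coords, flagged, radius):
    """For each flagged spot, fraction of neighbors within radius also flagged.

    coords: dict mapping spot ID -> (x, y)
    flagged: set of spot IDs that failed a non-spatial QC threshold
    """
    result = {}
    for spot in flagged:
        x0, y0 = coords[spot]
        neighbors = [
            other
            for other, (x, y) in coords.items()
            if other != spot and hypot(x - x0, y - y0) <= radius
        ]
        if neighbors:
            n_flagged = sum(other in flagged for other in neighbors)
            result[spot] = n_flagged / len(neighbors)
        else:
            result[spot] = 0.0  # no neighbors within the radius
    return result


# Toy layout: a, b, c form a low-count patch; z is an isolated low-count spot.
coords = {
    "a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (1, 1),
    "e": (5, 5), "f": (6, 5), "g": (5, 6), "z": (6, 6),
}
flagged = {"a", "b", "c", "z"}
fractions = flagged_neighbor_fraction(coords, flagged, radius=1.5)
print(fractions)
```

In this toy layout the patch members score 2/3 and the isolated spot scores 0, mirroring the distinction between coherent tissue regions and scattered artifacts.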
3.3 Pre-processing
After filtering low-quality spots/cells, normalization aims to correct for technical variation in order to enable valid biological comparisons. Most of the normalization methods originally used for sST were taken directly from scRNA-seq analysis. The most common approach is to scale the data (e.g., CPM, library-size normalization) and then apply a log transformation. This is a useful default due to its simplicity and interpretability; however, alternative methods, such as SCTransform[39], which models the UMI counts for each gene using a generalized linear model, tend to perform better[40].
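The default scale-then-log approach can be sketched in a few lines. This is a minimal standard-library illustration, not the implementation used by any particular package; the function name, the list-of-rows layout, and the scale factor of one million (CPM) are assumptions.

```python
import math

def normalize_log_cpm(counts, scale=1e6):
    """Scale each cell/spot to `scale` total counts, then apply log(1 + x).

    `counts` is a list of rows, one per cell/spot, each a list of raw gene counts.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Empty observations should have been removed during QC; keep zeros.
            normalized.append([0.0] * len(row))
            continue
        normalized.append([math.log1p(x / total * scale) for x in row])
    return normalized

# Two observations with the same composition but a tenfold depth difference.
raw = [[1, 1, 2], [10, 10, 20]]
logcpm = normalize_log_cpm(raw)
```

After scaling, the two toy observations receive the same values because they differ only in total counts, which is the technical effect this normalization is meant to remove.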
While it is generally acceptable to apply the previously described scRNA-seq normalization methods to sequencing-based technologies, it is debatable whether this is the case for imaging-based technologies. This is because the total counts are determined by probe hybridization efficiency, not sequencing coverage. Additionally, the use of targeted gene panels may introduce bias towards specific gene sets or cell types, which cannot be accounted for with scRNA-seq normalization methods. In cases where cell volume data is available, Atta et al.[41] recommend utilizing that data for normalization with cell area serving as a proxy if volume is unavailable. This assumes that the transcription rate is constant between cells with transcript density being the value to normalize by. However, the efficacy of this method is largely impacted by segmentation accuracy.
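Size-based normalization amounts to converting counts into transcript densities. The sketch below is a simplified illustration under the assumption that a per-cell volume or area is available from segmentation; it is not the procedure published by Atta et al., and the function name is hypothetical.

```python
def normalize_by_size(counts, sizes):
    """Divide each cell's counts by its volume (or area) to get transcript density.

    `counts` is a list of per-cell count rows; `sizes` holds one positive
    volume or area value per cell, in whatever unit segmentation reports.
    """
    if len(counts) != len(sizes):
        raise ValueError("one size value is needed per cell")
    densities = []
    for row, size in zip(counts, sizes):
        if size <= 0:
            raise ValueError("cell size must be positive; check segmentation")
        densities.append([x / size for x in row])
    return densities

# A large cell and a small cell with the same transcript density.
counts = [[20, 4], [10, 2]]
areas = [100.0, 50.0]
density = normalize_by_size(counts, areas)
```

Any error in the segmented area propagates directly into the density, which is why segmentation accuracy limits this approach.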
When it comes to iSP, normalization refers to fluorescence intensity rather than count data. Common single-marker transformations include Z-score normalization, log or inverse hyperbolic sine transformation, and min-max scaling, each with distinct assumptions about data distribution and interpretation. Inverse hyperbolic sine transformation has gained favor in spatial proteomics due to its ability to handle zero values naturally while approximating log behavior for high intensities, making it robust across markers with different dynamic ranges[42].
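The two properties attributed to the inverse hyperbolic sine transformation, a defined value at zero and log-like behavior at high intensity, can be checked numerically. In this sketch the cofactor (a divisor applied before the transformation) and its default of 5 are assumptions; appropriate values depend on the platform and marker.

```python
import math

def arcsinh_transform(intensities, cofactor=5.0):
    """Apply asinh(x / cofactor) to a list of marker intensities."""
    if cofactor <= 0:
        raise ValueError("cofactor must be positive")
    return [math.asinh(x / cofactor) for x in intensities]

raw = [0.0, 1.0, 50.0, 1000.0]
transformed = arcsinh_transform(raw)

# For large x, asinh(x / c) approaches log(2 * x / c).
log_approx = math.log(2 * 1000.0 / 5.0)
```

Zero maps to zero without a pseudocount, while the value for the brightest toy intensity is nearly identical to its logarithmic approximation.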
Traditional normalization methods assume technical variation is independent of spatial location. However, sST data often exhibit spatial gradients in technical quality that are confounded with biological spatial patterns. Standard normalization may inadvertently remove genuine biological gradients while failing to correct spatially structured technical artifacts. Because of this, some researchers even recommend avoiding normalization before spatial domain identification if the method is not spatially-aware, as normalization can blur spatial boundaries and reduce the detectability of spatially organized gene expression programs[37]. Tools such as SpaNorm[43] address this issue by modeling spatial and non-spatial technical effects separately, using spatial autocorrelation to distinguish spatially smooth technical variation from sharp biological boundaries. However, this is more computationally expensive than non-spatial methods.
Normalization results can be evaluated qualitatively by visualizing the results via boxplots or histograms to verify the distributions are not skewed. After normalization, another common pre-processing step is dimensionality reduction. This serves multiple purposes in spatial omics analysis including denoising data through low-rank approximation (e.g. PCA), reducing computational burden for downstream analysis, and enabling two-dimensional visualization of cellular heterogeneity (e.g. UMAP, t-SNE).
Following these initial steps, the data is prepared for downstream biological analysis, including cell type identification and CCI inference.
3.5 Cell typing and spatial domain identification
Cell typing, also called cell type classification/annotation, is the process of converting quantitative gene/protein counts into biologically interpretable labels. The most popular methods across different technologies include clustering, cell state/phenotype scoring, reference-based mapping, and deep learning/machine learning tools. Other methods, such as deconvolution and segmentation-free analysis, are more specific to certain cases such as low-resolution sST data or imaging-based spatial data where accurate cell segmentation is difficult (Table 3).
| Tool/Pipeline | Method Summary | Input | Output | Language(s) |
| Clustering | ||||
| BayesSpace[44] | Fully Bayesian statistical method that uses a low-dimensional representation of the expression matrix to model spatial clustering | Spatial omics data | Spot and subspot level cluster labels | R |
| BASS[45] | Uses Bayesian hierarchical modeling framework for joint clustering and spatial domain detection | Spatial omics data | Single-cell level cluster labels, spatial domain labels, cell type proportions | R |
| Pixie[46] | Extracts pixel-level features, unsupervised clustering to identify pixel-level phenotypes, maps clusters back to image | Imaging-based proteomics data | Pixel-level cluster labels | Python |
| SpaCell[47] | Image feature extraction with CNN, K-means clustering on latent matrix representing image and gene-count data | Spatial transcriptomics data, histology image | Single-cell/spot level cluster labels | Python |
| Cell State/Phenotype Scoring | ||||
| FGSEA[48] | Takes ranked list of genes and calculates enrichment score based on the position and frequency at which genes from a gene set appear in that list | Spatial omics data, gene sets | Gene set enrichment scores | R |
| AUCell[49] | Calculates enrichment of gene sets as an AUC across all ranked genes in a cell/spot | Spatial omics data, gene sets | Gene set activity scores | Python, R (implemented in SCENIC) |
| WSUM[50] | Multiplies each target gene in a gene set by its associated weight from the input data which are then summed to get final enrichment score | Spatial omics data, gene sets | Gene set enrichment scores | Python, R (implemented in decoupleR) |
| WMEAN[50] | Similar to WSUM but divides the summed enrichment score by the sum of the absolute value of weights | Spatial omics data, gene sets | Gene set enrichment scores | Python, R (implemented in decoupleR) |
| M-scores[51] | Compares the expression distributions between query and reference samples for given gene sets | Spatial omics data, gene sets | Gene set dysregulation scores | MyPROSLE webtool |
| Reference-based Mapping | ||||
| scType[52] | Analyzes detected gene signature at each spot against marker gene database (scTypeDB) or reference and scores them | Spatial transcriptomics data, scRNA-seq reference (optional) | Single-cell/spot-level cell type labels | Python, R |
| Spatial-ID[53] | DNN trained on reference, GCN constructs spatial neighbor graph, autoencoders to encode gene expression patterns and embed spatial information | Spatial transcriptomics data, scRNA-seq reference | Single-cell/spot-level cell type labels | Python |
| TopACT[54] | Independently classifies cell type of each spot, uses dynamically scaled local neighborhood | Imaging-based transcriptomics data, scRNA-seq reference | Spot-level cell type annotations | Python |
| SpatialScope[55] | Deep generative model to learn distributions from scRNA-seq data which are then used to identify cell type labels | Spatial transcriptomics data, scRNA-seq reference | Spatial maps of cell types at single-cell resolution | Python |
| RedeHist[56] | U-Net to extract histological features, DNN produces latent embeddings for nucleus mask, generates cell abundance matrix | Spatial transcriptomics data, histology image, scRNA-seq reference | Single-cell cell type labels, whole transcriptome expression profiles, and coordinates | Python |
| CytoSpace[57] | Estimates cell type proportions and cells per spot, samples scRNA-seq data to match estimations, assigns single cells to spatial spots via shortest augmenting path optimization | Spatial transcriptomics data, scRNA-seq reference | Single-cell/spot level cell type labels | Python |
| Tangram[58] | Iterative learning of spatial alignment of sc/snRNA-seq data | Spatial transcriptomics data, histology image, sc/snRNA-seq reference | Single-cell labels/deconvolution | Python |
| Deep Learning/Machine Learning | ||||
| Novae[59] | Self-supervised graph attention network that encodes local environments into spatial representations | Spatial transcriptomics data, histology image (optional) | Single-cell/spot level cluster labels | Python |
| CellTune[60] | Feature extraction followed by training two gradient-boosted tree models in parallel, human updated labels to iteratively improve model | Imaging-based proteomics data | Single-cell level cell type labels, cell gating, marker positivity predictions | Desktop Application |
| CELESTA[61] | Identifies “anchor cells” via protein expression profile, classify “non-anchor” cells via both protein expression and known cell types within spatial neighborhood iteratively until convergence | Imaging-based proteomics data | Single-cell level cell type labels | R |
| CellSighter[62] | Ensemble of CNN models performing multi-class classification, calculates probability of each cell belonging to a class | Imaging-based proteomics data | Single-cell level cell type labels | Python |
| STARLING[63] | Probabilistic machine learning model that accounts for segmentation errors | Imaging-based proteomics data | Single-cell level cell type labels, cluster labels, per-cell segmentation error probabilities | Python |
| Deconvolution | ||||
| CARD[64] | Non-negative matrix factorization model with CAR modeling assumption | Sequencing-based transcriptomics data, scRNA-seq reference | Spot cell type proportions | R |
| GIST[65] | Bayesian probabilistic model using prior estimates of cell type proportions from paired tissue image to optimize estimates derived from spatial data | Sequencing-based transcriptomics data, histology image | Spot cell type proportions | R |
| Spotiphy[66] | Probabilistic generative modeling to estimate cell type proportions in capture and non-capture spots | Sequencing-based transcriptomics data, histology image, scRNA-seq reference | Spot cell type proportions, inferred scRNA profiles, and pseudo single-cell resolution image with cell type labels | Python |
| Cell2Location[67] | Bayesian model decomposes spatial expression matrix into reference cell type signatures to estimate abundance of cell types at each location | Sequencing-based transcriptomics data, scRNA-seq reference | Spot cell type proportions | Python |
| SpaDecon[68] | Combine spatial and reference gene expression matrices, autoencoder identifies relevant features for cell types, infer cell type proportions | Sequencing-based transcriptomics data, histology image (optional), scRNA-seq reference | Spot cell type proportions | Python |
| RCTD[69] | Reference-based probabilistic model predicts cell types on pixels, predicts maximum-likelihood cell type proportions | Sequencing-based transcriptomics data, scRNA-seq reference | Spot cell type proportions | R |
| Segmentation-Free Analysis | ||||
| FICTURE[70] | Multilayered Dirichlet model for stochastic variational inference of pixel-level spatial factors | Spatial transcriptomics data, scRNA-seq reference (optional) | Pixel-level cell type labels | Python |
| SSAM[71] | Gaussian KDE to get spatial mRNA density, identify cell type signatures from gene expression vector field, signatures mapped to vector field via Pearson’s correlation | Imaging-based transcriptomics data, scRNA-seq reference (optional) | Pixel-level cell type labels | Python |
| Sainsc[72] | KDE to model 2D gene expression, models cell type assignments using cosine similarity of gene expression with reference | Nanometer resolution spatial transcriptomics data, scRNA-seq reference (optional) | Pixel-level cell type labels | Python/Rust, Julia |
FGSEA: fast gene set enrichment analysis; AUC: area under the curve; WSUM: weighted sum; WMEAN: weighted mean; GCN: graph convolutional network; CAR: conditional autoregressive; KDE: kernel density estimation; SSAM: spot-based spatial cell-type analysis by multidimensional mRNA density estimation; RCTD: robust cell type decomposition; GIST: guiding-image spatial transcriptomics; CARD: conditional autoregressive-based deconvolution.
Clustering involves grouping data points that share intrinsic characteristics. Similar to scRNA-seq analysis, this means clustering by gene expression where cells with more similar expression patterns would be grouped together. This requires the manual assignment of cell type labels to the identified clusters based on the top marker genes. Graph-based clustering methods, such as Leiden[73] and Louvain[74] which are implemented in packages such as Scanpy, construct k-nearest neighbor graphs and identify communities, while other approaches use hierarchical clustering, k-means, or mixture models. For spatial data, it is possible to cluster both on gene expression and spatial location, typically assuming that closer spots/cells are more similar such as in methods like BayesSpace. Technically, single-cell clustering methods can still be used on spatial data, but it has been demonstrated that methods accounting for spatial coordinates tend to perform better at the cost of higher computational complexity[75]. The optimal method primarily depends on the input data (e.g., resolution, distinct spatial patterns, tissue type) and whether computational efficiency is a primary concern, with no single tool showing universally optimal performance. For example, Bayesian methods tend to see better performance when spots are organized in a regular grid or lattice structure[76].
Cell state or phenotype scoring provides an alternative to discrete cell type assignment by quantifying continuous biological states through marker gene expression. These include, but are not limited to, rank-based enrichment methods (e.g. FGSEA, AUCell), aggregation-based methods (e.g., WSUM, WMEAN), and expression distribution comparison methods (e.g., M-scores). These types of methods take gene sets, which are pre-defined groups of genes that participate in the same pathway or perform a common function, as part of their input and provide scores to quantify their activity for a given group of cells or capture spots. Gene sets can be derived from literature, pathway databases (e.g., MSigDB, KEGG), or computationally from reference datasets. In spatial contexts, scoring enables mapping of functional programs across tissue architecture, revealing spatial organization of processes like immune activation zones, hypoxic regions, or proliferative niches that may not correspond to distinct cell types. Among 18 different cell state scoring methods, Toro-Domínguez et al.[77] found that normalized WSUM, M-scores, and AUCell consistently performed well across multiple evaluation metrics.
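The two aggregation-based scores follow directly from their definitions in Table 3. The sketch below shows only the core arithmetic for a single cell or spot; it omits any additional normalization a package such as decoupleR may apply, and the gene set with its weights is a hypothetical example.

```python
def wsum(expression, weights):
    """Weighted sum: expression of each gene in the set times its weight, summed.

    `expression` maps gene -> normalized expression for one cell/spot;
    `weights` maps gene -> weight for one gene set. Missing genes count as 0.
    """
    return sum(expression.get(gene, 0.0) * w for gene, w in weights.items())

def wmean(expression, weights):
    """Weighted mean: WSUM divided by the sum of absolute weights."""
    denom = sum(abs(w) for w in weights.values())
    if denom == 0:
        raise ValueError("gene set has no non-zero weights")
    return wsum(expression, weights) / denom

# Hypothetical signed gene set: positive weights for markers, negative for an inhibitor.
proliferation = {"MKI67": 1.0, "TOP2A": 1.0, "CDKN1A": -1.0}
cell = {"MKI67": 2.0, "TOP2A": 1.0, "CDKN1A": 0.5, "ACTB": 9.0}

score_sum = wsum(cell, proliferation)
score_mean = wmean(cell, proliferation)
```

Dividing by the sum of absolute weights makes WMEAN comparable across gene sets of different sizes, whereas WSUM grows with the number of genes in the set.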
Reference-based cell type annotation leverages existing single-cell or single-nucleus RNA-seq atlases to transfer cell type labels to spatial data. These methods typically compute marker gene signatures or use full expression profiles to match spatial observations to reference cell types. Reference-based annotation is particularly powerful when high-quality, tissue-matched references are available, enabling assignment of detailed cell states that would be difficult to resolve through de novo clustering alone. Performance of these methods depends critically on reference quality, biological concordance between reference and spatial datasets, and the presence of spatial cell types in the reference. Novel cell states or spatially restricted populations absent from the reference will likely be misannotated or forced into inappropriate categories.
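In its simplest form, label transfer compares each spatial observation with the average profile of every reference cell type. The sketch below uses Pearson correlation and is a generic illustration, not the algorithm of any tool in Table 3. The `min_corr` cutoff, the marker panel, and the reference profiles are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def transfer_labels(spatial_profiles, reference_profiles, min_corr=0.5):
    """Assign each spatial profile the best-correlated reference cell type.

    Profiles whose best correlation is below `min_corr` are left "unassigned"
    rather than forced into the closest category.
    """
    labels = []
    for profile in spatial_profiles:
        scores = {ct: pearson(profile, ref) for ct, ref in reference_profiles.items()}
        best = max(scores, key=scores.get)
        labels.append(best if scores[best] >= min_corr else "unassigned")
    return labels

# Hypothetical reference means over a four-gene panel.
reference = {
    "T cell": [5.0, 0.0, 1.0, 0.0],
    "B cell": [0.0, 6.0, 1.0, 0.0],
    "Macrophage": [0.0, 0.0, 1.0, 7.0],
}
cells = [[4.0, 1.0, 1.0, 0.0], [0.0, 3.0, 0.5, 0.2], [1.0, 1.0, 1.0, 1.0]]
labels = transfer_labels(cells, reference)
```

The third toy cell matches no reference profile and stays unassigned. Without such a cutoff it would be given whichever label happened to score highest, which is the failure mode described above for populations missing from the reference.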
In addition to traditional approaches, machine learning and deep learning techniques have become increasingly popular for cell type labeling. Graph-based models and neural networks in particular have gained traction due to their ability to represent datasets with complex, non-linear relationships. While these methods are powerful, they are often more computationally expensive, requiring access to GPUs, having runtimes that scale exponentially with dataset size, or both.
In the case of low-resolution sST, it is sometimes necessary to perform spot deconvolution to determine the proportions of cell types within a spot and, in some instances, their locations within a spot. Because spot-based sST is the oldest spatial omics technology commercially available, there are many deconvolution tools available, and their performances can vary depending on a variety of factors including the mRNA capture mechanism and detection mechanisms used[78]. Li et al.[79] note that reference-based methods tend to be more accurate and robust compared to non-reference-based methods with CARD, Cell2location, Tangram, and RCTD having the best overall performance.
When accurate cell segmentation in an imaging-based spatial dataset is challenging, such as in densely-packed tissues, segmentation-free methods can provide an alternative method of mapping cell type spatial patterns. These approaches bypass explicit cell boundary definition and instead analyze spatial expression patterns directly. Methods differ in their units of analysis: FICTURE and SSAM operate on hexagonal or square spatial bins, learning latent cell type representations from transcript spatial distributions without requiring cell assignment. Sainsc uses a probabilistic framework to infer cell type spatial distributions from transcript point patterns. Some methods (e.g., Baysor) offer hybrid approaches, optionally using segmentation when available or operating in segmentation-free mode when boundaries are ambiguous.
When evaluating the accuracy of cell type labels, it is best to compare them against ground truth labels. In cases where marker genes were not used for labeling, the top marker genes of each labeled group can be compared to existing literature to verify the labels. Calculating the specificity of marker genes to a certain cell type group by using spatial autocorrelation metrics, such as Moran’s I or Geary’s C, can also help determine if the identified groupings are truly distinct. After mapping cell types in the tissue, it is then possible to identify distinct niches where these cells tend to localize and investigate how they differ via methods such as differential expression analysis. These spatial domains can then serve as input for other downstream analysis tasks such as neighborhood analysis and CCI modeling.
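Moran's I can be computed from a vector of values and a spatial weight matrix. The sketch below implements the standard global statistic, I = (N / W) · Σ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)², with W the sum of all weights. The sparse dictionary layout for weights and the one-dimensional chain of spots are assumptions chosen to keep the example small.

```python
def morans_i(values, weights):
    """Global Moran's I.

    `values` holds one measurement per spot/cell; `weights` maps an ordered
    pair (i, j), i != j, to the spatial weight w_ij.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    w_total = sum(weights.values())
    if denom == 0 or w_total == 0:
        raise ValueError("Moran's I is undefined for constant values or empty weights")
    num = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    return (n / w_total) * (num / denom)

def chain_weights(n):
    """Binary, symmetric adjacency for n spots arranged along a line."""
    w = {}
    for i in range(n - 1):
        w[(i, i + 1)] = 1.0
        w[(i + 1, i)] = 1.0
    return w

w = chain_weights(6)
clustered = [1, 1, 1, 0, 0, 0]    # marker confined to one contiguous region
alternating = [1, 0, 1, 0, 1, 0]  # marker in a checkerboard-like pattern

i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
```

Positive values indicate that neighboring spots tend to share similar expression, values near zero indicate no spatial structure, and negative values indicate that neighbors tend to differ.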
3.6 Neighborhood analysis and CCI modeling
CCIs orchestrate fundamental biological processes ranging from development and tissue homeostasis to immune responses and disease progression. While scRNA-seq data has been used to infer potential cellular communication networks based on ligand and receptor gene expression, these predictions lack spatial information, which can help inform which interactions are more likely to occur based on physical proximity. Spatial omics technologies address this limitation by enabling analysis of both molecular profiles and spatial organization, dramatically improving the ability of researchers to map functional interaction networks within native tissue architecture such as the tumor microenvironment[5].
Neighborhood analysis precedes CCI modeling by identifying which cell types are spatially organized in ways that enable interaction. These methods quantify whether cell populations exhibit spatial clustering, avoidance, or mixing patterns that deviate from what would be expected by chance. One of the most widely used approaches is neighborhood enrichment analysis, which quantifies spatial co-localization between cell type pairs by comparing observed versus expected frequencies of neighboring relationships. Implementations in packages like Squidpy[80] construct spatial graphs connecting cells within a defined radius, then calculate enrichment scores indicating whether certain cell type pairs appear as neighbors more or less frequently than random permutations would predict. Typical approaches either use biologically motivated distances or test multiple radii to identify distance-dependent relationships. Spatial co-occurrence analysis extends this by examining how cell type relationships change across spatial scales. Rather than a single distance threshold, co-occurrence functions quantify whether cell type pairs are enriched or depleted at increasing distances, revealing multi-scale spatial organization.
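The permutation logic behind neighborhood enrichment can be sketched as follows. This is a minimal standard-library illustration and not Squidpy's implementation; the brute-force radius graph, the number of permutations, and the function names are assumptions, and the quadratic graph construction would not scale to real datasets.

```python
import math
import random
from collections import Counter

def radius_graph(coords, radius):
    """Undirected edges (i, j), i < j, between cells at most `radius` apart."""
    return [(i, j)
            for i in range(len(coords))
            for j in range(i + 1, len(coords))
            if math.dist(coords[i], coords[j]) <= radius]

def pair_counts(edges, labels):
    """Count edges per unordered cell type pair."""
    counts = Counter()
    for i, j in edges:
        counts[tuple(sorted((labels[i], labels[j])))] += 1
    return counts

def neighborhood_enrichment(coords, labels, radius, n_perm=200, seed=0):
    """Z-score of observed neighbor counts against label permutations."""
    rng = random.Random(seed)
    edges = radius_graph(coords, radius)
    observed = pair_counts(edges, labels)
    types = sorted(set(labels))
    pairs = [(a, b) for k, a in enumerate(types) for b in types[k:]]

    # Null distribution: shuffle labels over fixed positions and recount.
    null = {p: [] for p in pairs}
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = pair_counts(edges, shuffled)
        for p in pairs:
            null[p].append(permuted[p])

    zscores = {}
    for p in pairs:
        mu = sum(null[p]) / n_perm
        sd = math.sqrt(sum((x - mu) ** 2 for x in null[p]) / n_perm)
        zscores[p] = (observed[p] - mu) / sd if sd > 0 else 0.0
    return zscores

# Toy tissue: two well-separated 3 x 3 patches, one per cell type.
coords = [(x, y) for x in range(3) for y in range(3)]
coords += [(x + 10, y) for x in range(3) for y in range(3)]
labels = ["A"] * 9 + ["B"] * 9
z = neighborhood_enrichment(coords, labels, radius=1.5)
```

In the toy tissue the two types never touch, so the A-B score is negative (avoidance) and the A-A score is positive (clustering). Repeating the call with several radii gives a crude view of distance-dependent relationships.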
Following identification of proximally organized cell populations, CCI modeling aims to infer the specific molecular signaling events, typically ligand-receptor pairs, that mediate possible interactions. Ligand-receptor database approaches form the foundation of most CCI methods (Table 4). These tools query curated databases of known ligand-receptor pairs against expression profiles to identify cell type pairs with complementary expression: one cell type expresses a ligand while a neighboring cell type expresses its cognate receptor. Statistical significance is assessed by comparing observed ligand-receptor co-expression to null distributions generated by permuting cell type labels while preserving spatial structure, or by randomizing spatial locations while maintaining cell type identities. However, database-driven approaches face several critical limitations. Different ligand-receptor databases vary substantially in coverage and annotation quality, with one recent analysis identifying less than 50% overlap in ligand-receptor pairs between major databases[90]. One reason for this is directional ambiguity: many receptors also function as ligands, and most databases do not capture bidirectional signaling or complex multi-subunit receptors. Additionally, expression does not necessarily equate to functional interaction. Co-expression of a ligand-receptor pair is necessary but not sufficient, because post-translational modifications, receptor trafficking, spatial barriers, and regulatory context all influence whether a predicted interaction actually occurs. Lastly, cross-sample comparisons of CCI predictions are sensitive to normalization choices and library size variation, potentially creating artificial differences in interaction strength.
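The label-permutation test described above can be sketched for a single ligand-receptor pair. The score used here, mean ligand expression in the sender type multiplied by mean receptor expression in the receiver type, is one common choice and is an assumption rather than the scoring rule of any specific tool in Table 4. This simplified version also ignores spatial proximity, which spatially aware methods would add by restricting or weighting cell pairs.

```python
import random

def lr_score(ligand, receptor, labels, sender, receiver):
    """Mean ligand expression in sender cells times mean receptor in receiver cells."""
    lig = [x for x, lab in zip(ligand, labels) if lab == sender]
    rec = [x for x, lab in zip(receptor, labels) if lab == receiver]
    if not lig or not rec:
        return 0.0
    return (sum(lig) / len(lig)) * (sum(rec) / len(rec))

def lr_permutation_test(ligand, receptor, labels, sender, receiver,
                        n_perm=500, seed=0):
    """Empirical p-value for the observed score under shuffled cell type labels."""
    rng = random.Random(seed)
    observed = lr_score(ligand, receptor, labels, sender, receiver)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if lr_score(ligand, receptor, shuffled, sender, receiver) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: senders express only the ligand, receivers only the receptor.
labels = ["sender"] * 10 + ["receiver"] * 10
ligand = [5.0] * 10 + [0.0] * 10
receptor = [0.0] * 10 + [5.0] * 10
score, p_value = lr_permutation_test(ligand, receptor, labels, "sender", "receiver")
```

A small p-value indicates only that the expression pattern is specific to the labeled cell types. It does not establish that signaling occurs, for the reasons listed above.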
| Tool/Pipeline | Method Summary | Input | Output | Language |
| Neighborhood Analysis | ||||
| SPACEc[81] | Generates vectors representing cell counts for each window of nearest neighbors then clusters them to identify commonly composed neighborhoods | Imaging-based proteomics data | Cellular neighborhoods | Python |
| Kandinsky[82] | Infers spot/cell neighborhoods using KNN, centroid distance, Delaunay triangulation, queen contiguity, and/or membrane distance; uses neighborhoods for clustering, calculating co-localization, and detecting hot/cold expression areas | Spatial omics data | Cell/spot neighborhoods, groupings, co-localization Z-scores, hot/cold areas | R |
| BANKSY[83] | Uses pair of spatial kernels to encode transcriptomics texture of microenvironment around each cell, augments features of each cell | Spatial omics data | Neighbor-augmented expression matrix | Python, R |
| CCI Modeling | ||||
| COMMOT[84] | Collective optimal transport to infer cell-cell communication | Spatial transcriptomics data | Cell-cell communication network | Python |
| SpaOTsc[85] | Structured optimal transport of signal senders to target signal receivers to obtain cell-cell communications | Spatial transcriptomics data, scRNA-seq reference (optional) | Mapping between spatial and scRNA-seq data, spatial subclustering, cell-cell communications, spatial distance for intercellular signaling, spatial map of intercellular gene-gene regulatory information flow | Python |
| SpaTalk[86] | Graph network and knowledge graph to model and score ligand-receptor-target signaling network | Spatial transcriptomics data, scRNA-seq reference | Cell type decomposition matrix, cell-cell communication and ligand-receptor-target networks | R |
| CellPhoneDB V3[87] | Public repository of ligands, receptors, and their interactions which are used to assess cellular crosstalk | Spatial transcriptomics data, scRNA-seq reference | Ranked cellular interactions | Python |
| HoloNet[88] | Models cell-cell communication events as multi-view network, attention-based graph learning model predicts target gene expression, decode functional communication events | Spatial transcriptomics data | Cell-cell communication events, functional communication events | Python |
| stLearn[89] | Spatially-constrained two-level permutation analysis to compute ligand-receptor scores | Spatial transcriptomics data, histology image (optional) | Ligand-receptor scores | Python |
| Neighborhood Analysis and CCI Modeling | ||||
| Squidpy | Models data as spatial graph with cells/spots as nodes and neighborhood relations as edges, can perform neighborhood enrichment tests and ligand-receptor interaction analysis | Spatial omics data | Spatial neighborhoods, ligand-receptor interactions | Python |
CCI: cell-cell interaction; KNN: k-nearest neighbors.
To address these limitations, spatial methods improve upon the accuracy of predictions made by database approaches by restricting analyses to cell pairs within defined interaction distances or by weighting interactions by spatial distance. COMMOT, for example, models ligand-receptor signaling as a spatial flow problem, using optimal transport theory to infer the ‘communication intensity’ between cell locations based on both expression levels and spatial proximity, capturing directional communication patterns and spatial gradients of signaling activity. SPACEc uses spatial correlation between ligand expression in sender cells and receptor expression in receiver cells across tissue regions to identify spatially coordinated signaling.
While CCI inference is a critical area of study, it suffers from a lack of standardized methods of assessment and validation, with researchers typically cross-referencing their predictions with existing literature or performing experimental validation[91]. As such, it is difficult to compare the performance of existing tools, with benchmarking studies providing varying results for the same tools depending on their metrics of evaluation and input databases. In general, it is recommended to subject the results from any of these tools to multiple methods of validation to best ensure the predictions are accurate.
3.7 Multi-sample and multimodal integration
The workflow previously described applies to single-slice data; however, it can be adapted to account for multi-slice data with additional steps. Spatial omics studies increasingly incorporate multiple tissue samples, biological replicates, or patient cohorts to achieve statistical power for identifying reproducible biological patterns, assess inter-individual variation, and enable robust biomarker discovery. However, as in single-cell analysis, technical variation between samples arising from batch-specific library preparation, sequencing runs, reagent lots, or operator differences can obscure biological signals if not properly corrected.
Many batch correction methods exist, most of them originating from single-cell analysis pipelines. For instance, Seurat provides a built-in Canonical Correlation Analysis workflow that captures sources of variation shared between batches and identifies Mutual Nearest Neighbors as anchors. Harmony[92] performs fuzzy clustering in PCA space and has low algorithmic complexity, aiming to balance computational efficiency and integration performance. Among deep learning-based approaches, scVI[93] uses a variational autoencoder to model raw counts with a zero-inflated negative binomial distribution, and it is trained by stochastic gradient descent with support for GPU acceleration. In their review of batch correction and integration methods, Ludington et al.[94] compared 11 tools and found that probabilistic methods, such as scVI, excel at removing unwanted technical variation while preserving meaningful biological structure, with Harmony performing competitively.
In addition to batch correction, slice alignment is also necessary to ensure that corresponding features occupy the same coordinate system (Table 5). PASTE is one of the best-known alignment methods. It computes pairwise alignment between slices using optimal transport to account for both transcriptional similarity and physical distance between spots. However, many early alignment methods like this assume full overlap between sections, which is often not the case. To address this issue, PASTE2 was developed to allow for partial overlap between slices, and STalign uses Large Deformation Diffeomorphic Metric Mapping to accommodate non-linear distortions. If multiple z-sections or contiguous blocks were imaged, some alignment tools (e.g., PASTE2) allow for the reconstruction of the data into a navigable 3D model to analyze tissue structure and gradients in three dimensions.
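Once spots in two slices have been matched, placing them in a common coordinate system reduces, in the simplest case, to estimating a rotation and translation. The sketch below solves that least-squares rigid alignment in 2D for hard one-to-one matches. It is not PASTE, PASTE2, or STalign: it assumes the correspondences are already known, which is the part those tools actually infer, and it cannot model partial overlap or non-linear distortion.

```python
import math

def rigid_align(source, target):
    """Least-squares rotation and translation mapping source points onto target.

    `source` and `target` are equal-length lists of (x, y) tuples, where
    source[i] is assumed to correspond to target[i].
    Returns (theta, aligned_source), with theta in radians.
    """
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n

    # Accumulate the cosine and sine terms of the centered cross-covariance.
    cos_term = sin_term = 0.0
    for (xs, ys), (xt, yt) in zip(source, target):
        xs, ys, xt, yt = xs - sx, ys - sy, xt - tx, yt - ty
        cos_term += xs * xt + ys * yt
        sin_term += xs * yt - ys * xt
    theta = math.atan2(sin_term, cos_term)

    # Rotate about the source centroid, then move onto the target centroid.
    c, s = math.cos(theta), math.sin(theta)
    aligned = [
        (c * (x - sx) - s * (y - sy) + tx, s * (x - sx) + c * (y - sy) + ty)
        for x, y in source
    ]
    return theta, aligned

# Toy example: the target slice is the source rotated by 30 degrees and shifted.
source = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
angle = math.radians(30)
target = [
    (math.cos(angle) * x - math.sin(angle) * y + 5.0,
     math.sin(angle) * x + math.cos(angle) * y - 2.0)
    for x, y in source
]
theta, aligned = rigid_align(source, target)
```

With a soft (probabilistic) mapping between spots, as produced by optimal transport, the same idea applies with each pair weighted by its mapping probability.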
| Tool/Pipeline | Method Summary | Input | Output | Language |
| Multi-slice Alignment/Integration | ||||
| PASTE[95] | Pairwise alignment via optimal transport that models both transcriptional similarity and physical distance/Multiple slice integration by combining fused Gromov-Wasserstein barycenter with NMF | Pair of spatial transcriptomics slices (assumes full overlap) | Pairwise mappings between slices/NMF decomposition of center slice gene expression, mapping between center slice and input slices | Python |
| PASTE2[96] | Extension of PASTE allowing for partial alignment, can utilize histology images to aid alignment by identifying spots with similar histology | Pair of spatial transcriptomics slices, histology images (optional) | Partial alignment matrix, overlap percentage, stacked slices for 3D reconstruction | Python |
| JADE[97] | Encoders extract low-dimensional embeddings, graph attention module to get embedding-space alignment, roundtrip learning scheme to refine embeddings and alignments alternately within training iterations | Pair of spatial transcriptomics slices | Probabilistic alignment matrix, embedding representation of each spot | Python |
| Spateo | Uses probabilistic model for aligning slices to create aligned 3D point clouds | Spatial transcriptomics slices | 3D reconstruction of slices | Python |
| Multi-dataset Alignment/Integration | ||||
| STalign[98] | Rasterize source and target coordinates into images, solves mapping between images, applies mapping to source | Pair of spatial transcriptomics datasets, histology image for single-cell and spot-resolution alignment | Source aligned coordinates | Python |
| SLAT[99] | Cross-dataset SVD to project omics profiles into shared low-dimensional space, GCN to encode local and global information, align graphs | Spatial omics datasets | Cell to cell/spot to spot matching, similarity scores, cell type levels | Python |
| spatiAlign[100] | Autoencoder generates low-dimensional gene representations, optimized by self-supervised contrastive learning, reconstructs original input | Spatial transcriptomics datasets | Learned lower-dimensional representations, reconstructed gene expression matrix | Python |
| MIIT[101] | Spatial data are processed to reference matrices, registered to stained images, source section is registered to image space of target section, data from source are fused to match spatial organization of target | Spatial omics datasets, histology images | Integrated spatial omics data from source | Python |
| STAligner[102] | Graph attention autoencoder neural network to extract spatially aware embedding, constructs spot triples, iterative optimization | Spatial transcriptomics dataset(s) | Batch-corrected spatial embeddings | Python |
| SPACEL | Uses GCN and adversarial learning algorithm to find spatial domains that are spatially and transcriptomically coherent across slices | Spatial transcriptomics dataset(s) | Single slice cell type deconvolution, domain identification across slices, 3D reconstruction of slices (for consecutive slices) | Python |
| STAIR | Uses heterogeneous graph attention network to learn spatial features and get consistent spatial domains across slices | Spatial transcriptomics dataset(s) | Aligned spatial embeddings, de novo 3D reconstruction of slices | Python |
| CAST | Uses deep GNN and physical alignment for single-cell level alignment | Spatial transcriptomics dataset(s) | Common features across slices, alignment of pairs of slices, projection of one slice to another | Python |
| Spatial Multi-omics Integration | ||||
| SANTO[103] | Identifies overlap between slices, dynamic graph CNN to extract local and global embeddings for spatial coordinates and omics feature expression, generate soft mapping to generate full alignment/stitching | Pair of spatial omics slices (includes spatial epigenomics) | Transformed spatial coordinates of source slice, coarse and fine rotation and translation | Python |
| stLVG[104] | Vector-guided graph model with location, direction, and angle-based edge weights learns cross-slice features via adversarial module, learns cell representations via multi-view contrastive learning module | Spatial omics datasets (including epigenomics) | Learned cell embeddings | Python |
| SpaMV[105] | Two GAT encoders per omics dataset to extract shared and private information, infers shared representations | Spatial omics datasets (including epigenomics and metabolomics) | Inferred latent variables | Python |
| Multi-modal Integration | ||||
| SIMO[106] | KNN to construct spatial graph and modality map, fused Gromov-Wasserstein optimal transport to get mapping between cells and spots, label transfer of non-transcriptomic data via Unbalanced Optimal Transport | Spatial transcriptomics data, single-cell data | Single-cell data mapped to spatial | Python |
| SpatialEx+[107] | Generates missing spatial omics data for H&E images (SpatialEx), omics cycle modules to establish omics-omics associations | Adjacent spatial omics slices (different omics), corresponding H&E images | Spatial omics profiles aligned with H&E | Python |
| MISO[108] | Extracts low-dimensional embeddings from each modality, calculates outer products of modality-specific embeddings to get interaction feature vectors, which serve as input for k-means clustering | Spatial omics data, histology images | Clustered embeddings | Python |
| MaxFuse[109] | Fuzzy smoothed embedding followed by iterative co-embedding, data smoothing, and cell matching | Spatial omics data, single-cell data | Joint embedding coordinates | Python |
NMF: non-negative matrix factorization; JADE: joint alignment and deep embedding; SVD: singular value decomposition; GCN: graph convolutional network; SLAT: spatial-linked alignment tool; MIIT: multi-omics imaging integration toolset; GNN: graph neural network; CNN: convolutional neural network; GAT: graph attention network; SIMO: spatial integration of multi-omics.
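To make the "outer product of modality-specific embeddings" step listed for MISO in Table 5 more concrete, the following is a minimal NumPy sketch of how per-cell interaction features could be formed from two embeddings. It is an illustration under simplifying assumptions, not the MISO implementation: the function name `interaction_features` and the random toy embeddings `rna_emb` and `img_emb` are hypothetical, and the resulting matrix would still need to be passed to a k-means implementation of the user's choice.

```python
import numpy as np


def interaction_features(emb_a: np.ndarray, emb_b: np.ndarray) -> np.ndarray:
    """Flattened per-cell outer product of two modality embeddings.

    emb_a has shape (n_cells, p) and emb_b has shape (n_cells, q);
    the result has shape (n_cells, p * q).
    """
    if emb_a.shape[0] != emb_b.shape[0]:
        raise ValueError("both embeddings must describe the same cells/spots")
    # outer[i, p, q] = emb_a[i, p] * emb_b[i, q]
    outer = np.einsum("ip,iq->ipq", emb_a, emb_b)
    return outer.reshape(emb_a.shape[0], -1)


# Toy data (hypothetical): 5 cells, a 3-dimensional transcriptomic embedding
# and a 2-dimensional histology embedding.
rng = np.random.default_rng(0)
rna_emb = rng.normal(size=(5, 3))
img_emb = rng.normal(size=(5, 2))

features = interaction_features(rna_emb, img_emb)
print(features.shape)
```

Each row of `features` captures pairwise products between the dimensions of the two embeddings, so a downstream clustering step can respond to combinations of signals across modalities and not only to each modality on its own.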
Beyond alignment between slices generated by the same spatial technology, alignment and integration of slices across platforms have become increasingly popular, given that no single spatial omics platform simultaneously achieves high spatial resolution, comprehensive molecular coverage, and multimodal measurement. For instance, Pitino et al.[110] applied Xenium, CosMx, and Akoya and integrated the resulting data with H&E to achieve multi-modality analysis. Alignment between Xenium and Visium typically relies on registered H&E images with anatomical landmarks to reduce manual intervention. Landmark-free alignment frameworks are also emerging for automated cross-platform registration. However, as these platforms often require serial sections or distinct samples, spatial alignment remains a significant challenge.
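As a simple illustration of landmark-based registration, the sketch below fits a 2D affine transform to paired landmarks by least squares and applies it to source coordinates. It is a generic textbook approach, not the procedure of any tool named in this review; the function names `fit_affine` and `apply_affine` and the toy landmark coordinates are assumptions made for the example.

```python
import numpy as np


def fit_affine(src_landmarks, dst_landmarks) -> np.ndarray:
    """Least-squares 2D affine transform mapping src landmarks onto dst.

    Returns a (3, 2) parameter matrix P such that [x, y, 1] @ P ~ [x', y'].
    At least three non-collinear landmark pairs are required.
    """
    src = np.asarray(src_landmarks, dtype=float)
    dst = np.asarray(dst_landmarks, dtype=float)
    design = np.hstack([src, np.ones((src.shape[0], 1))])
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params


def apply_affine(params: np.ndarray, coords) -> np.ndarray:
    """Apply a fitted affine transform to an (n, 2) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    return np.hstack([coords, np.ones((coords.shape[0], 1))]) @ params


# Toy example (hypothetical): the target slice is the source slice rotated
# by 30 degrees and shifted by (10, -5).
theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
shift = np.array([10.0, -5.0])

src_marks = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 80.0], [60.0, 40.0]])
dst_marks = src_marks @ rotation.T + shift

params = fit_affine(src_marks, dst_marks)

# Any other source coordinate (for example a cell centroid) can now be mapped.
cells = np.array([[25.0, 25.0], [70.0, 10.0]])
aligned_cells = apply_affine(params, cells)
print(np.round(aligned_cells, 2))
```

An affine model only captures rotation, translation, scaling, and shear; serial sections with local tissue deformation generally need a non-linear registration on top of such an initial fit.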
A comprehensive review[111] benchmarked 24 multi-slice ST alignment and integration tools, including 10 Bayesian inference statistical mapping tools, 10 graph-based tools, and 4 image processing and registration tools. As a result, the authors recommended autoencoder-based tools (e.g., STalign, SLAT, spatiAlign) among all compared methods. SLAT in particular was highlighted for its effectiveness in cross-dataset alignment in seqFISH and Stereo-seq data, and cross-platform alignment between Visium and Xenium, as well as its potential to extend to 3D tissue reconstruction. Another review by Yan et al.[112] provides helpful guidelines for selecting an alignment method based on the user's input data and analysis priorities. They recommend utilizing SPACEL if annotated region labels are available. If labels are not available, they suggest PASTE2, Spateo, or STAIR depending on the platform used to generate the data. In the case where those methods fail, they recommend CAST, SLAT, STalign, or STAligner.
Alignment and integration methods can be evaluated qualitatively by visually inspecting the overlapping aligned slices to determine how well regions of interest and tissue boundaries align. Alignment can also be assessed quantitatively by calculating the expression similarity of two overlapping regions, with higher similarity meaning better alignment, or by selecting regional “landmarks” in the slices that should align and calculating the degree of overlap. In many cases, spatial omics data also benefit from integration with other non-spatial data modalities, such as using single-cell reference data for cell type labeling. SIMO allows for the integration of scRNA-seq in addition to other single-cell modalities (e.g., scATAC-seq) with sST. Another popular type of integration is molecular data with H&E or immunofluorescence histology, which allows researchers to bridge the gap between molecular phenotypes and classic morphological features. Spatial overlays in these multi-modal visualizations provide intuitive biological context and help identify tissue niches where gene expression correlates with pathological structures.
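The two quantitative checks described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a metric defined by any of the cited benchmarks: landmark agreement is summarized here as the mean Euclidean distance between paired landmarks, expression similarity as the mean cosine similarity between each spot and its nearest spot in the other slice, and the function names and toy data are hypothetical.

```python
import numpy as np


def landmark_error(aligned_landmarks, target_landmarks) -> float:
    """Mean Euclidean distance between paired landmarks (lower is better)."""
    diff = np.asarray(aligned_landmarks, float) - np.asarray(target_landmarks, float)
    return float(np.linalg.norm(diff, axis=1).mean())


def matched_expression_similarity(coords_a, expr_a, coords_b, expr_b) -> float:
    """Mean cosine similarity between each spot in A and its nearest spot in B.

    Uses a brute-force nearest-neighbour search, which is only suitable
    for small examples.
    """
    coords_a = np.asarray(coords_a, float)
    coords_b = np.asarray(coords_b, float)
    expr_a = np.asarray(expr_a, float)
    expr_b = np.asarray(expr_b, float)
    sq_dist = ((coords_a[:, None, :] - coords_b[None, :, :]) ** 2).sum(axis=2)
    nearest = sq_dist.argmin(axis=1)
    matched = expr_b[nearest]
    dot = (expr_a * matched).sum(axis=1)
    norm = np.linalg.norm(expr_a, axis=1) * np.linalg.norm(matched, axis=1)
    cosine = dot / np.where(norm == 0, np.nan, norm)  # ignore all-zero profiles
    return float(np.nanmean(cosine))


# Toy slices (hypothetical): three spots on a line, each expressing one gene.
coords_target = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
expr = np.eye(3)

# A well-aligned source slice sits exactly on the target; a misaligned one
# is shifted by one spot along x.
sim_good = matched_expression_similarity(coords_target, expr, coords_target, expr)
sim_bad = matched_expression_similarity(coords_target + [1.0, 0.0], expr,
                                        coords_target, expr)
err = landmark_error([[0.0, 0.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]])
print(sim_good, sim_bad, err)
```

In this toy case the shifted slice matches most spots to a neighbour with a different expression profile, so its similarity score drops relative to the well-aligned slice.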
4. Current Challenges in Spatial Omics Analysis
Despite remarkable technological and computational advances[4], spatial omics faces several persistent challenges that limit accessibility, reproducibility, and biological insight. One of the most persistent issues in the field is accurate cell segmentation in complex tissues. While deep-learning approaches have dramatically improved over classical ones, such as watershed segmentation, they still fail systematically in several biological contexts. Dense tissues where cell boundaries touch or overlap challenge segmentation algorithms that assume clear inter-cellular space. Additionally, morphologically complex cells (e.g., neurons) violate the compact, convex shape assumptions of many segmentation models. Lastly, weak or absent membrane staining, common in poorly fixed tissue or regions with particularly dense extracellular matrices, provides insufficient boundary information for accurate segmentation.
Platform heterogeneity and lack of standardization fragment the spatial omics ecosystem, impeding reproducibility, cross-study comparison, and method generalization[5]. Platforms differ across multiple dimensions including resolution, throughput, sensitivity, coordinate systems, and data formats. Because of this, computational tools can perform inconsistently across platforms. For example, normalization methods appropriate for sequencing-based platforms may be invalid for targeted imaging panels, which may be biased towards the characterization of specific cell types[102,113]. Cross-platform benchmarking is rare, leaving researchers uncertain whether methods validated on one platform will generalize to their data[38]. The lack of community standards compounds these issues. Initiatives to address this issue, like SpatialData, have emerged but are not yet widely adopted.
Another major issue that needs to be addressed is scalability, a challenge shared with single-cell studies. Spatial omics datasets can occupy several terabytes, particularly those from high-resolution platforms and those that include high-resolution images. Individual images can occupy gigabytes, and processing them, including denoising, segmentation, and normalization, can take hours and often requires access to a GPU. Similarly, spatial statistics often scale quadratically or worse with cell numbers, becoming prohibitive for datasets with hundreds of thousands to millions of cells.
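To illustrate why pairwise spatial statistics become prohibitive, consider finding all pairs of cells within a fixed radius. A dense distance matrix for one million cells would hold 10^12 entries, roughly 8 TB in double precision, whereas a spatial index only visits nearby points. The sketch below contrasts the two approaches on a small random point set; it assumes SciPy is available, and the function names, radius, and simulated coordinates are hypothetical choices made for the example.

```python
import numpy as np
from scipy.spatial import cKDTree


def neighbor_pairs_dense(coords: np.ndarray, radius: float) -> np.ndarray:
    """All index pairs (i < j) within `radius`, via a full n x n distance matrix."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    i, j = np.nonzero(np.triu(dist <= radius, k=1))
    return np.column_stack([i, j])


def neighbor_pairs_kdtree(coords: np.ndarray, radius: float) -> np.ndarray:
    """The same pairs, found with a KD-tree without building the full matrix."""
    tree = cKDTree(coords)
    return tree.query_pairs(r=radius, output_type="ndarray")


# Simulated cell centroids (hypothetical): 500 cells in a unit square.
rng = np.random.default_rng(42)
coords = rng.uniform(0.0, 1.0, size=(500, 2))
radius = 0.05

dense_pairs = neighbor_pairs_dense(coords, radius)
tree_pairs = neighbor_pairs_kdtree(coords, radius)

# Both approaches should report the same neighbour pairs.
same = set(map(tuple, dense_pairs.tolist())) == set(map(tuple, tree_pairs.tolist()))
print(len(dense_pairs), same)
```

The dense version is adequate for a few thousand cells, but its memory use grows with the square of the cell count, which is why neighbourhood-based analyses on large datasets typically rely on spatial indexing or sparse neighbour graphs.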
Finally, the scarcity of well-annotated spatial reference atlases limits supervised analysis approaches that have proven powerful in scRNA-seq. While projects like the Human Cell Atlas, Human Protein Atlas, and Tabula Sapiens provide comprehensive single-cell references covering diverse tissues and cell types, spatial references remain fragmented, tissue-specific, and rarely standardized. Existing spatial atlases cover only a subset of tissues and often represent single developmental stages or healthy tissue, lacking disease contexts, species diversity, or technical replicates. This gap forces researchers to rely on transferring labels from scRNA-seq references to spatial data, which can result in inaccurate labeling depending on the reference quality as previously mentioned. Moreover, the spatial context itself provides defining information ignored by reference-based label transfer. A cell expressing certain markers might represent different biological entities depending on its spatial location, and purely transcription-based label transfer cannot capture these spatial identity dimensions.
5. Conclusion
Spatial omics technologies have rapidly expanded our ability to study tissues in their native architectural context, bridging the gap between molecular profiling and spatial organization. From early sequencing-based platforms to increasingly sophisticated imaging-based transcriptomic and proteomic methods, each technological generation has deepened biological insight while introducing new computational challenges. As datasets grow in resolution, multiplexing, and physical scale, analytical frameworks must evolve beyond adaptations of single-cell workflows toward methods that explicitly model spatial structure, tissue morphology, and multimodal complexity[4]. Given the heterogeneity of platform data formats, a universal framework is highly desirable. While Seurat and AnnData can store image information, SpatialData provides native support for raster images and greater flexibility across multiple platforms, making it a promising candidate for a unified data representation in sST. In this review, we have summarized the file output formats of commonly used platforms, the data structures used to store and process them, and downstream computational analysis tools.
Looking ahead, the next phase of spatial omics will require methodological innovations that unify molecules, cells, tissue architecture, and morphology into coherent analytical frameworks. Equally important will be efforts to standardize file formats, establish benchmarking datasets, and develop reproducible workflows to ensure that analyses remain transparent and comparable across studies. As spatial profiling of tissues continues to expand, assays beyond transcriptomics and proteomics, such as epigenomics and metabolomics, have moved into the spatial dimension. Currently, there are significantly fewer analysis tools developed specifically with these modalities in mind. Instead, they are often integrated with sST and proteomics as shown in Table 5. As these modalities become more common, dedicated tools and documented methods for processing and normalizing these datasets will be needed. With the continued progress of spatial omics towards higher resolutions, larger sample sizes, and richer multimodal measurements, computational methods will play an increasingly central role in unlocking the biological insights embedded within these data. By addressing current analytical gaps and building frameworks designed for the spatial dimension from the ground up, the field is poised to deliver transformative advances in tissue biology, disease mechanisms, and translational medicine.
Authors contribution
Alexander M, Liu Y: Writing-original draft, visualization.
Dezem FS: Writing-review & editing.
Chasteen H: Visualization.
Plummer J: Conceptualization, writing-review & editing.
Conflicts of interest
The authors declare no conflicts of interest.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Availability of data and materials
Not applicable.
Funding
The work was supported by the Ovarian Cancer Research Alliance (Grant No. ECIG-2022–3-1143) and the Chan Zuckerberg Foundation (Grant Nos. OS00001235 and 2024–345901), all awarded to Jasmine Plummer.
Copyright
© The Author(s) 2026.
References
3. Andrews TS, Kiselev VY, McCarthy D, Hemberg M. Tutorial: Guidelines for the computational analysis of single-cell RNA sequencing data. Nat Protoc. 2021;16(1):1-9.[DOI]
8. Sui X, Lo JA, Luo S, He Y, Tang Z, Lin Z, et al. Scalable spatial single-cell transcriptomics and translatomics in 3D thick tissue blocks. Nat Meth. 2025;22:2574-2584.[DOI]
9. Chen A, Liao S, Cheng M, Ma K, Wu L, Lai Y, et al. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays. Cell. 2022;185(10):1777-1792.[DOI]
11. Degatano K, Awdeh A, Cox RS III, Dingman W, Grant G, Khajouei F, et al. Warp analysis research pipelines: Cloud-optimized workflows for biological data processing and reproducible analysis. Bioinformatics. 2025;41(10):btaf494.[DOI]
12. Chu YH, Hardin H, Zhang R, Guo Z, Lloyd RV. In situ hybridization: Introduction to techniques, applications and pitfalls in the performance and interpretation of assays. Semin Diagn Pathol. 2019;36(5):336-341.[DOI]
13. Liu Y, Dai Y, Wang L. Spatial omics at the forefront: Emerging technologies, analytical innovations, and clinical applications. Cancer Cell. 2026;44(1):24-49.[DOI]
14. Righelli D, Weber LM, Crowell HL, Pardo B, Collado-Torres L, Ghazanfar S, et al. SpatialExperiment: Infrastructure for spatially-resolved transcriptomics data in R using Bioconductor. Bioinformatics. 2022;38(11):3128-3131.[DOI]
15. Amezquita RA, Lun ATL, Becht E, Carey VJ, Carpp LN, Geistlinger L, et al. Orchestrating single-cell analysis with Bioconductor. Nat Meth. 2020;17(2):137-145.[DOI]
16. Satija R, Farrell JA, Gennert D, Schier AF, Regev A. Spatial reconstruction of single-cell gene expression data. Nat Biotechnol. 2015;33(5):495-502.[DOI]
17. Virshup I, Rybakov S, Theis FJ, Angerer P, Wolf FA. Anndata: Access and store annotated data matrices. J Open Source Softw. 2024;9(101):4371.[DOI]
18. Wolf FA, Angerer P, Theis FJ. SCANPY: Large-scale single-cell gene expression data analysis. Genome Biol. 2018;19(1):15.[DOI]
19. Moses L, Einarsson PH, Jackson K, Luebbert L, Booeshaghi AS, Antonsson S, et al. Voyager: Exploratory single-cell genomics data analysis with geospatial statistics. BioRxiv [Preprint]. 2023.[DOI]
20. Pebesma E. Simple features for R: Standardized support for spatial vector data. R J. 2018;10(1):439.[DOI]
21. Marconato L, Palla G, Yamauchi KA, Virshup I, Heidari E, Treis T, et al. SpatialData: An open and universal data framework for spatial omics. Nat Meth. 2025;22(1):58-62.[DOI]
22. Mitchel J, Gao T, Cole E, Petukhov V, Kharchenko PV. Impact of segmentation errors in analysis of spatial transcriptomics data. BioRxiv [Preprint]. 2025.[DOI]
23. Wu L, Beechem JM, Danaher P. Using transcripts to refine image based cell segmentation with FastReseg. Sci Rep. 2025;15:30508.[DOI]
24. Stringer C, Pachitariu M. Cellpose3: One-click image restoration for improved cellular segmentation. Nat Meth. 2025;22(3):592-599.[DOI]
25. Schmidt U, Weigert M, Broaddus C, Myers G. Cell detection with star-convex polygons. In: Frangi AF, Schnabel JA, Davatzikos C, Alberola-López C, Fichtinger G, editors. Medical image computing and computer assisted intervention–MICCAI 2018; 2018 Sep 16-20; Granada, Spain. Cham: Springer; 2018. p. 265-273.[DOI]
26. Heidari E, Moorman A, Unyi D, Pasnuri N, Rukhovich G, Calafato D, et al. Segger: Fast and accurate cell segmentation of imaging-based spatial transcriptomics data. BioRxiv [Preprint]. 2025.[DOI]
29. Jones DC, Elz AE, Hadadianpour A, Ryu H, Glass DR, Newell EW. Cell simulation as cell segmentation. Nat Meth. 2025;22(6):1331-1342.[DOI]
32. He Y, Tang X, Huang J, Ren J, Zhou H, Chen K, et al. ClusterMap for multi-scale clustering analysis of spatial gene expression. Nat Commun. 2021;12:5909.[DOI]
35. Salas SM, Dammann M, Rubens RK, Drummer F, Halle L, Becker S, et al. Exploration of RNA outside segmented cells in spatial transcriptomics reveals extrasomatic RNA organization. BioRxiv [Preprint]. 2025.[DOI]
38. Plummer JT, Dezem FS, Cook DP, Park J, Zhang L, Liu Y, et al. Standardized metrics for assessment and reproducibility of imaging-based spatial transcriptomics datasets. Nat Biotechnol. 2025;1-13.[DOI]
41. Atta L, Clifton K, Anant M, Aihara G, Fan J. Gene count normalization in single-cell imaging-based spatially resolved transcriptomics. Genome Biol. 2024;25(1):153.[DOI]
42. Li W, Mao L, Liu Y, Peng F, Sachs N, Wu W, et al. Toward computationally complete spatial omics. BioRxiv [Preprint]. 2026.[DOI]
48. Korotkevich G, Sukhov V, Budin N, Shpak B, Artyomov MN, Sergushichev A. Fast gene set enrichment analysis. BioRxiv [Preprint]. 2016.[DOI]
49. Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, et al. SCENIC: Single-cell regulatory network inference and clustering. Nat Meth. 2017;14(11):1083-1086.[DOI]
51. Toro-Domínguez D, Martorell-Marugán J, Martinez-Bueno M, López-Domínguez R, Carnero-Montoro E, Barturen G, et al. Scoring personalized molecular portraits identify systemic lupus erythematosus subtypes and predict individualized drug responses, symptomatology and disease progression. Brief Bioinform. 2022;23(5):bbac332.[DOI]
56. Zhong Y, Zhang J, Ren X. Spatial transcriptomics prediction from histology images at single-cell resolution using RedeHist. BioRxiv [Preprint]. 2024.[DOI]
59. Blampey Q, Benkirane H, Bercovici N, Mulder K, Gessain G, Ginhoux F, et al. Novae: A graph-based foundation model for spatial transcriptomics data. Nat Meth. 2025;22(12):2539-2550.[DOI]
60. Bussi Y, Shainshein D, Ovits E, Posner S, Azulay N, Maimon N, et al. CellTune: An integrative software for accurate cell classification in spatial proteomics. BioRxiv [Preprint]. 2025.[DOI]
62. Amitay Y, Bussi Y, Feinstein B, Bagon S, Milo I, Keren L. CellSighter: A neural network to classify cells in highly multiplexed images. Nat Commun. 2023;14:4302.[DOI]
64. Ma Y, Zhou X. Spatially informed cell-type deconvolution for spatial transcriptomics. Nat Biotechnol. 2022;40(9):1349-1359.[DOI]
65. Zubair A, Chapple RH, Natarajan S, Wright WC, Pan M, Lee HM, et al. Cell type identification in spatial transcriptomics data can be improved by leveraging cell-type-informative paired tissue images using a Bayesian probabilistic model. Nucleic Acids Res. 2022;50(14):e80.[DOI]
68. Coleman K, Hu J, Schroeder A, Lee EB, Li M. SpaDecon: Cell-type deconvolution in spatial transcriptomics with semi-supervised learning. Commun Biol. 2023;6:378.[DOI]
70. Si Y, Lee C, Hwang Y, Yun JH, Cheng W, Cho CS, et al. FICTURE: Scalable segmentation-free analysis of submicron-resolution spatial transcriptomics. Nat Meth. 2024;21(10):1843-1854.[DOI]
73. Traag VA, Waltman L, van Eck NJ. From Louvain to Leiden: Guaranteeing well-connected communities. Sci Rep. 2019;9:5233.[DOI]
74. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech. 2008;2008(10):P10008.[DOI]
75. Yuan Z, Zhao F, Lin S, Zhao Y, Yao J, Cui Y, et al. Benchmarking spatial clustering methods with spatially resolved transcriptomics data. Nat Meth. 2024;21(4):712-722.[DOI]
78. Singh A, Cakmak P, Lun JH, Macas J, Plate KH, Reiss Y, et al. Benchmarking cell-type deconvolution in cross-platform transcriptomic data. BioRxiv [Preprint]. 2025.[DOI]
80. Palla G, Spitzer H, Klein M, Fischer D, Schaar AC, Kuemmerle LB, et al. Squidpy: A scalable framework for spatial omics analysis. Nat Meth. 2022;19(2):171-178.[DOI]
81. Tan Y, Kempchen TN, Becker M, Haist M, Feyaerts D, Liu J, et al. SPACEc: A streamlined, interactive Python workflow for multiplexed image processing and analysis. Nat Commun. 2025;16:10652.[DOI]
82. Andrei P, Grieco M, Acha-Sagredo A, Dhami P, Fung K, Rodriguez-Justo M, et al. Kandinsky: Enabling neighbourhood analysis of spatial omics data for functional insights on cell ecosystems. BioRxiv [Preprint]. 2025.[DOI]
84. Cang Z, Zhao Y, Almet AA, Stabell A, Ramos R, Plikus MV, et al. Screening cell–cell communication in spatial transcriptomics via collective optimal transport. Nat Meth. 2023;20(2):218-228.[DOI]
86. Shao X, Li C, Yang H, Lu X, Liao J, Qian J, et al. Knowledge-graph-based cell-cell communication inference for spatially resolved transcriptomic data with SpaTalk. Nat Commun. 2022;13:4429.[DOI]
88. Li H, Ma T, Hao M, Guo W, Gu J, Zhang X, et al. Decoding functional cell–cell communication events by multi-view graph learning on spatial transcriptomics. Brief Bioinform. 2023;24(6):bbad359.[DOI]
91. Cesaro G, Nagai JS, Gnoato N, Chiodi A, Tussardi G, Klöker V, et al. Advances and challenges in cell–cell communication inference: A comprehensive review of tools, resources, and future directions. Brief Bioinform. 2025;26(3):bbaf280.[DOI]
93. Lopez R, Regier J, Cole MB, Jordan MI, Yosef N. Deep generative modeling for single-cell transcriptomics. Nat Meth. 2018;15(12):1053-1058.[DOI]
94. Ludington L, Ouardini K, Secheresse X, Loeb R, Pignet A, Domingues OD, et al. Comprehensive benchmarking of batch integration methods for spatial transcriptomics using a large-scale cancer atlas. BioRxiv [Preprint]. 2026.[DOI]
97. Guo Y, Liu JS, Cheng H, Ma Y. JADE: Joint alignment and deep embedding for multi-slice spatial transcriptomics. BioRxiv [Preprint]. 2025.[DOI]
100. Zhang C, Liu L, Zhang Y, Li M, Fang S, Kang Q, et al. spatiAlign: An unsupervised contrastive learning model for data integration of spatially resolved transcriptomics. GigaScience. 2024;13:giae042.[DOI]
101. Wess M, Midtbust E, Guillem JCC, Viset T, Størkersen Ø, Krossa S, et al. Spatial integration of multi-omics data from serial sections using the novel Multi-Omics Imaging Integration Toolset. GigaScience. 2025;14:giaf035.[DOI]
102. Zhou X, Dong K, Zhang S. Integrating spatial transcriptomics data across different conditions, technologies and developmental stages. Nat Comput Sci. 2023;3(10):894-906.[DOI]
104. Lou Y, Li X, Yang Q, Dai H, Ma K, Zuo C. Vector-guided graph learning for spatial multi-slice multi-omics alignment. Cell Rep Meth. 2025;5(12):101241.[DOI]
105. Liu Y, Ma K, Xu H, Xu K, Hu Y, Lin Z, et al. Interpretable spatial multi-omics data integration and dimension reduction with SpaMV. BioRxiv [Preprint]. 2025.[DOI]
106. Yang P, Jin K, Yao Y, Jin L, Shao X, Li C, et al. Spatial integration of multi-omics single-cell data with SIMO. Nat Commun. 2025;16:1265.[DOI]
107. Liu Y, Wang C, Wang Z, Chen L, Li Z, Song J, et al. High-parameter spatial multi-omics through histology-anchored integration. Nat Meth. 2026;23(2):373-386.[DOI]
108. Coleman K, Schroeder A, Loth M, Zhang D, Park JH, Sung JY, et al. Resolving tissue complexity by multimodal spatial omics modeling with MISO. Nat Meth. 2025;22(3):530-538.[DOI]
110. Pitino E, Pascual-Reguant A, Segato-Dezem F, Wise K, Salvador-Martinez I, Crowell HL, et al. STAMP: Single-cell transcriptomics analysis and multimodal profiling through imaging. Cell. 2025;188(18):5100-5117.[DOI]
111. Khan M, Arslanturk S, Draghici S. A comprehensive review of spatial transcriptomics data alignment and integration. Nucleic Acids Res. 2025;53(12):gkaf536.[DOI]
112. Yan Y, Gu T, Sun C, Zhang Y, Cui Y, Lin S, et al. Benchmarking alignment methods for spatial transcriptomics data. Nat Comput Sci. 2026;1-18.[DOI]
113. Atta L, Clifton K, Anant M, Aihara G, Fan J. Gene count normalization in single-cell imaging-based spatially resolved transcriptomics. BioRxiv [Preprint]. 2024.[DOI]
Copyright
© The Author(s) 2026. This is an Open Access article licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.