Spatial information matters: are traditional imputation methods effective for spatial transcriptomics data?
Recent advancements in spatially resolved transcriptomics (SRT) have enabled near single-cell resolution, providing rich spatial context crucial for uncovering biological insights. However, high-resolution SRT datasets remain sparse and prone to dropout events that may impede accurate interpretation. Computational imputation methods are often employed to recover missing values, yet existing state-of-the-art (SOTA) techniques (designed for tabular, single-cell RNA, or general SRT data) have not been systematically benchmarked on datasets produced by newer SRT technologies. In this study, we evaluate seven SOTA imputation methods across five emerging SRT platforms encompassing 23 datasets. Our results reveal that no single method consistently excels, with most struggling to accurately identify valid dropouts. Motivated by these limitations, we introduce ‘SpaMean-Impute’, a novel imputation method tailored for SRT datasets that incorporates spatial information to mitigate dropout effects and detect valid dropouts. The proposed method outperforms the SOTA imputation methods across evaluation metrics including adjusted Rand index (ARI), normalized mutual information (NMI), adjusted mutual information (AMI), and homogeneity (HOMO), with average improvements of 16.15% in ARI, 18.45% in NMI, 18.96% in AMI, and 13.98% in HOMO. The proposed method is also computationally efficient: compared with the SOTA deep-learning-based imputation methods, it is ∼33× faster and requires, on average, 1500 MB less memory during imputation. Source code, datasets, and benchmarking scripts are available at: https://github.com/FahimHafiz/SpaMean-Impute.
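For readers unfamiliar with the clustering metrics named above, the following is a minimal, dependency-free sketch of how ARI, NMI, and HOMO can be computed from two label vectors. It is an illustration under stated assumptions, not the authors' benchmarking code: the function names and toy labels are hypothetical, the reference labels are assumed to be some ground-truth annotation, NMI is assumed to use arithmetic-mean normalization, and AMI is omitted because it additionally requires the expected mutual information under random labelings.

```python
from collections import Counter
from math import comb, log


def _entropy(labels):
    """Shannon entropy (natural log) of a label vector."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _mutual_info(truth, pred):
    """Mutual information between two label vectors of equal length."""
    n = len(truth)
    joint = Counter(zip(truth, pred))
    t_counts, p_counts = Counter(truth), Counter(pred)
    return sum(
        (c / n) * log(c * n / (t_counts[t] * p_counts[p]))
        for (t, p), c in joint.items()
    )


def adjusted_rand_index(truth, pred):
    """ARI from pair counts; assumes at least two samples."""
    n = len(truth)
    sum_joint = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_t = sum(comb(c, 2) for c in Counter(truth).values())
    sum_p = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_t * sum_p / comb(n, 2)
    max_index = (sum_t + sum_p) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_joint - expected) / (max_index - expected)


def normalized_mutual_info(truth, pred):
    """NMI with arithmetic-mean normalization (an assumption of this sketch)."""
    h_t, h_p = _entropy(truth), _entropy(pred)
    if h_t == 0 and h_p == 0:
        return 1.0
    return _mutual_info(truth, pred) / ((h_t + h_p) / 2)


def homogeneity(truth, pred):
    """Homogeneity: 1 - H(truth | pred) / H(truth) = MI / H(truth)."""
    h_t = _entropy(truth)
    return 1.0 if h_t == 0 else _mutual_info(truth, pred) / h_t


# Hypothetical toy labels: `relabelled` matches `truth` up to a renaming of
# cluster ids, while `coarse` merges and splits some clusters.
truth = [0, 0, 1, 1, 2, 2]
relabelled = [1, 1, 0, 0, 2, 2]
coarse = [0, 0, 0, 1, 1, 1]

perfect_scores = {
    "ARI": adjusted_rand_index(truth, relabelled),
    "NMI": normalized_mutual_info(truth, relabelled),
    "HOMO": homogeneity(truth, relabelled),
}
coarse_ari = adjusted_rand_index(truth, coarse)
coarse_homo = homogeneity(truth, coarse)
```

All three metrics are invariant to how cluster ids are named, which is why they suit comparing clusterings obtained after different imputation methods. In practice, scikit-learn provides `adjusted_rand_score`, `normalized_mutual_info_score`, `adjusted_mutual_info_score`, and `homogeneity_score` in `sklearn.metrics`, which cover all four metrics.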