SC22 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

GPU-Accelerated Differential Dependency Analysis of Single-Cell Transcriptomics Data

Workshop: Eighth Computational Approaches for Cancer Workshop (CAFCW22)

Authors: Gil Speyer (Arizona State University); Xishuang Dong and Seungchan Kim (Prairie View A&M University)

Abstract: Complex diseases such as cancer and neurological disorders require a systemic approach to understand underlying causes and identify therapeutic targets to help patients. More comprehensive analyses, however, often bring significant computational challenges. EDDY (Evaluation of Differential DependencY) is a computational method to identify rewiring of biological pathways between biological conditions such as drug responses or subtypes of disease [1]. Through its probabilistic framework with resampling and permutation, aided by the incorporation of annotated gene sets, EDDY demonstrated greater sensitivity than other methods. Further development integrated prior knowledge into these interrogations [2]. However, the considerable computational cost of this statistical rigor limited its application to larger datasets. Fortunately, its ample, independent computations and manageable memory footprint made EDDY a strong candidate for graphics processing unit (GPU) implementation. Custom kernels decompose the independence test loop, network construction, network enumeration, and Bayesian network scoring to accelerate the computation, and GPU-accelerated EDDY consistently benchmarked at a performance enhancement of two orders of magnitude [3]. EDDY has been applied to the determination of rewired pathways controlling differing small molecule responses in cancer cell lines [4]. Further investigations extended this to pathways associated with pulmonary hypertension [5].
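
The permutation step mentioned above can be illustrated with a minimal sketch. It is not the EDDY implementation: the statistic used here (mean absolute difference between gene-gene correlation matrices) is a simplified stand-in for EDDY's comparison of dependency networks, and the function names `dependency_difference` and `permutation_p_value` are hypothetical. It does show why the workload suits a GPU: every permutation is an independent computation.

```python
import numpy as np


def dependency_difference(x_a, x_b):
    """Stand-in statistic (an assumption, NOT EDDY's divergence): mean absolute
    difference between the gene-gene correlation matrices of two conditions.
    Rows are samples, columns are genes."""
    corr_a = np.corrcoef(x_a, rowvar=False)
    corr_b = np.corrcoef(x_b, rowvar=False)
    upper = np.triu_indices_from(corr_a, k=1)  # each gene pair counted once
    return float(np.mean(np.abs(corr_a[upper] - corr_b[upper])))


def permutation_p_value(x_a, x_b, n_permutations=500, seed=0):
    """Estimate how often a random relabeling of samples yields a statistic
    at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = dependency_difference(x_a, x_b)
    pooled = np.vstack([x_a, x_b])
    n_a = x_a.shape[0]
    exceed = 0
    # Each iteration is independent of the others, so the loop could be
    # distributed across parallel workers.
    for _ in range(n_permutations):
        order = rng.permutation(pooled.shape[0])
        stat = dependency_difference(pooled[order[:n_a]], pooled[order[n_a:]])
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (exceed + 1) / (n_permutations + 1)
```

Called with two samples-by-genes arrays for one gene set, the function returns the observed statistic and an empirical p-value; a small p-value suggests that the dependencies among those genes differ between the two conditions.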

The recent emergence of single-cell transcriptomic and spatial transcriptomic data raises additional computational challenges, mainly due to an order of magnitude increase in sample size compared to bulk-cell transcriptomic data, often bringing the number of samples (cells) to analyze to hundreds of thousands. This called for additional optimization of the existing EDDY-GPU code. By working with an NVIDIA team through Princeton Hackathon 2022, we were able to dramatically increase the computational speed of EDDY-GPU. New sampling strategies have been implemented to adjust to sample counts at this scale. In addition, the latest code development phase identified various performance bottlenecks; addressing them not only improved acceleration but also allowed for the incorporation of even larger gene sets, such as immune pathways. Hence, EDDY’s statistical rigor can now be brought to bear in the inference of specific diagnostic and treatment strategies for the individual patient, with an implementation that allows this data analysis to be run on a physician’s desktop within a reasonable time. We will present preliminary results using this newly improved EDDY-GPU with single-cell transcriptomic data from cancer, Alzheimer’s disease, and pulmonary hypertension.
