Supercomputing in the Biological Sciences: Toward Zettascale and Yottascale Simulations
Description
Biological systems present some of the most demanding, compute-intensive applications in high-performance computing. Mechanistic understanding of viruses and molecular machines, as well as computational drug design, often requires free energy calculations. These calculations need enormous amounts of conformational sampling to reach equilibrium thermodynamics. Even modest amounts of sampling (e.g., 1 millisecond of physiological time) require 10^12 time steps. Because biomolecules carry electrostatic charges, long-range electrostatic forces play important roles. Thus, biological simulations are often much more intensive than materials science applications, which typically do not include long-range electrostatic interactions. Additional factors, such as the fact that many processes are far from equilibrium and that chemical reactions can be critical (requiring quantum mechanical calculations), further complicate these systems. If we neglect chemical reactions and non-equilibrium effects, we estimate that simulating 1 second of physiological time for the human genome (in the case of 23 chromosomes) would require at least 10 YF (1 YF = 10^24 FLOPs). While these calculations are far beyond the scope of current platforms, they provide a roadmap for the way forward in biomolecular simulation. To strive toward this vision, we perform large-scale explicit solvent molecular dynamics (MD) simulations that are feasible on current platforms, and we scope out much larger systems with coarse-grained approaches as part of a multiresolution strategy. Such simulations play an important role in integrating disparate forms of experimental data into a single coherent picture. We used explicit solvent MD simulations (2.64 million atoms) to identify the accommodation corridor in the ribosome, critical for tRNA selection during protein synthesis (Sanbonmatsu, et al., PNAS, 2005).
Microsecond explicit solvent simulations of the ribosome (2.2 million atoms) also laid the foundations for our energy landscape calculations using all-atom structure-based simulations of spontaneous accommodation events (Whitford, et al., PLoS Comput. Biol., 2013; Whitford, et al., RNA, 2010). We are applying a similar strategy to chromatin architecture, which plays a key role in embryo development, brain function, and cancer. As a first step, we have performed the first explicit solvent simulation of an entire gene locus (GATA4), consisting of 427 nucleosomes and over one billion atoms (the first published billion-atom biomolecular simulation) (Jung, et al., J. Comp. Chem., 2019). We will also describe coarse-grained simulations of the X-chromosome consistent with high-throughput capture sequencing data (Lappala, et al., PNAS, 2021), which help us to scope more detailed and more intensive simulations.
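The step count quoted above (10^12 time steps for 1 millisecond) can be checked with a short calculation. The sketch below assumes an integration time step of 1 femtosecond, which is typical for all-atom molecular dynamics but is not stated in the abstract; the function name and constants are illustrative only.

```python
# Assumption (not stated in the abstract): a 1 fs integration time step.
TIMESTEP_FS = 1.0
FS_PER_SECOND = 1e15


def steps_required(simulated_seconds, timestep_fs=TIMESTEP_FS):
    """Return the number of MD integration steps needed to cover
    `simulated_seconds` of physiological time at the given time step."""
    return round(simulated_seconds * FS_PER_SECOND / timestep_fs)


if __name__ == "__main__":
    targets = [
        ("1 microsecond", 1e-6),
        ("1 millisecond", 1e-3),
        ("1 second", 1.0),
    ]
    for label, seconds in targets:
        print(f"{label}: {steps_required(seconds):.0e} steps")
```

Under this assumption, 1 millisecond corresponds to 10^12 steps, matching the figure in the description, and 1 second corresponds to 10^15 steps. A longer time step (for example 2 fs, passed as `timestep_fs=2.0`) halves these counts.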
Event Type
Invited Talk
Time
Wednesday, 16 November 2022
1:30pm - 2:15pm CST
Location
Dallas Ballroom/Omni Hotel
TP
XO/EX
Recorded