PIs in Johns Hopkins University active in the last 90 days
Allocations with low numbers of SUs (10,000 or fewer) are usually educational allocations, startup allocations, or extensions. Allocations with fewer than 10 SUs are usually used for storage purposes.

Name | Project Title | Teragrid Resource | Discipline | Board Type | Base Allocation
Enis Afgan | Galaxy Gateway Infrastructure Development | IU/TACC Jetstream | Computer and Computation Research | Startup | 300,000
" | " | IU/TACC Storage (Jetstream Storage) | " | " | 2,000
Lauren Corlies | Figuring Out Gas & Galaxies in Enzo (FOGGIE): Resolving the Small-Scale Structure of Gas Flows in the Circumgalactic Medium | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | Extragalactic Astronomy and Cosmology | Research | 40,106
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 30,000
Kiara Eldred | Mechanisms of photoreceptor subtype specification | IU/TACC Jetstream | Developmental Biology | Startup | 50,000
Mallory Freeberg | Developing workflows for post-transcriptional gene regulation analyses. | IU/TACC Jetstream | Biological Sciences | Startup | 100,000
Rigoberto Hernandez | Nonequilibrium Molecular Dynamics Simulations, VII | PSC Regular Memory (Bridges) | Chemistry | Research | 2,764,229
" | " | SDSC Comet GPU Nodes (Comet GPU) | " | " | 171,429
" | " | SDSC Dell Cluster with Intel Haswell Processors (Comet) | " | " | 136,757
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 10,000
" | " | TACC Dell PowerEdge C8220 Cluster with Intel Xeon Phi coprocessors (Stampede) | " | " | 6,144
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 2,000
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 1,065
" | " | PSC Storage (Bridges Pylon) | " | " | 500
Margaret Johnson | Modeling protein interactions and assembly at the cellular scale | LSU Cluster (superMIC) | Biophysics | Research | 100,000
Anthony Kolasny | Campus Champions Allocations for Johns Hopkins University (TRA130018) | SDSC Dell Cluster with Intel Haswell Processors (Comet) | Educational Infrastructure | Campus Champions | 50,000
" | " | LSU Cluster (superMIC) | " | " | 50,000
" | " | IU/TACC Jetstream | " | " | 50,000
" | " | PSC Regular Memory (Bridges) | " | " | 40,000
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 1,563
" | " | PSC Large Memory Nodes (Bridges Large) | " | " | 1,000
" | " | PSC Storage (Bridges Pylon) | " | " | 500
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
" | " | TACC Dell PowerEdge C8220 Cluster with Intel Xeon Phi coprocessors (Stampede) | " | " | 0
" | " | Open Science Grid (OSG) | " | " | 0
Benjamin Langmead | Scaling core genomics algorithms on Xeon Phi | LSU Cluster (superMIC) | Computer and Information Science and Engineering | Research | 1,000,000
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 149,318
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
Kevin Manalo | Campus Champion for Johns Hopkins University | LSU Cluster (superMIC) | Training | Campus Champions | 50,000
" | " | PSC Regular Memory (Bridges) | " | " | 50,000
" | " | IU/TACC Jetstream | " | " | 50,000
" | " | SDSC Dell Cluster with Intel Haswell Processors (Comet) | " | " | 50,000
" | " | Open Science Grid (OSG) | " | " | 50,000
" | " | XStream/Stanford University GPU Supercomputer (Cray CS-Storm, Intel Ivy-Bridge, NVIDIA K80) | " | " | 5,000
" | " | SDSC Comet GPU Nodes (Comet GPU) | " | " | 2,500
" | " | PSC Bridges GPU (Bridges GPU) | " | " | 2,500
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 1,600
" | " | PSC Large Memory Nodes (Bridges Large) | " | " | 1,000
" | " | PSC Storage (Bridges Pylon) | " | " | 500
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 500
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
Rajat Mittal | Multi-Physics Computational Modeling of Cardiac Flows and Heart Murmurs Using a Parallelized Immersed Boundary Solver | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | Fluid, Particulate, and Hydraulic Systems | Research | 136,821
" | " | TACC Dell PowerEdge C8220 Cluster with Intel Xeon Phi coprocessors (Stampede) | " | " | 136,721
" | " | PSC Regular Memory (Bridges) | " | " | 75,906
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 5,000
" | " | PSC Storage (Bridges Pylon) | " | " | 500
Timothy Mueller | Graduate-level course in computational materials design | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | Materials Research | Educational | 1,302
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
" | First-principles study of materials for energy storage and conversion | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | Research | 8,891
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
Elizabeth Ploetz | Coarse-grained model of amyloid beta interactions with organelle-specific bilayers as a function of pH | LSU Cluster (superMIC) | Biophysics | Startup | 50,000
" | " | PSC Regular Memory (Bridges) | " | " | 50,000
" | " | SDSC Dell Cluster with Intel Haswell Processors (Comet) | " | " | 50,000
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 1,563
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 500
" | " | PSC Storage (Bridges Pylon) | " | " | 500
" | " | TACC Dell PowerEdge C8220 Cluster with Intel Xeon Phi coprocessors (Stampede) | " | " | 0
James Taylor | The Galaxy XSEDE Gateway | IU/TACC Jetstream | Biological Sciences | Research | 2,000,000
" | " | PSC Storage (Bridges Pylon) | " | " | 30,000
" | " | PSC Large Memory Nodes (Bridges Large) | " | " | 17,000
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 9,205
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 2,048
" | " | XSEDE Extended Collaborative Support | " | " | 1
Winston Timp | Long-read sequencing analysis of non-model organisms | IU/TACC Jetstream | Genetics and Nucleic Acids | Startup | 50,000
Daniel Tward | Computational Anatomy Gateway | SDSC Dell Cluster with Intel Haswell Processors (Comet) | Neuroscience Biology | Research | 700,000
" | " | TACC Dell/Intel Knights Landing, Skylake System (Stampede2) | " | " | 111,495
" | " | SDSC Comet GPU Nodes (Comet GPU) | " | " | 10,000
" | " | TACC Long-term tape Archival Storage (Ranch) | " | " | 500
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 500
" | " | XSEDE Extended Collaborative Support | " | " | 1
Chenguang Wang | A Global Sensitivity Analysis Software for Monotone Missing Data | PSC Regular Memory (Bridges) | Statistics and Probability | Startup | 50,000
" | " | PSC Storage (Bridges Pylon) | " | " | 500
Tamer Zaki | Direct numerical simulations of transitional and turbulent boundary layers | SDSC Dell Cluster with Intel Haswell Processors (Comet) | Fluid, Particulate, and Hydraulic Systems | Research | 1,038,900
" | " | SDSC Medium-term disk storage (Data Oasis) | " | " | 4,000

Project Abstract

Galaxy Gateway Infrastructure Development

PI: Enis Afgan



The Galaxy project (galaxyproject.org) enables multiple avenues of access to its flagship product, the Galaxy application: access via a web service such as Galaxy Main (usegalaxy.org), local installation, and self-service cloud deployments. This XSEDE startup allocation focuses on developing the technologies that enable cloud deployments, in contrast to the allocation that focuses on resources delivered to researchers via the Galaxy Main application (usegalaxy.org). In the upcoming year, we will focus on developing a more integrated and portable solution for how Galaxy utilizes Jetstream, followed by a focus on scaling. For the portability aspect, we will extend the CloudLaunch application (https://github.com/galaxyproject/cloudlaunch/) to operate natively with container technologies and develop a new version of the CloudMan application (https://github.com/galaxyproject/cloudman/tree/v2.0) that will offer infrastructure and management capabilities for Galaxy as well as other applications. Integrating support for container technologies into these applications will enable a build-once-run-anywhere model. In particular, it will be possible to leverage artifacts that already exist in the community (e.g., Galaxy flavors) and make those readily available on Jetstream. Once the initial deployment process is sufficiently functional, we will focus on scaling the infrastructure available to Galaxy. Scaling will be achieved by deploying and (re)configuring a cluster management system (e.g., Kubernetes) atop provisioned resources while handling the necessary integration steps for the deployed applications. In addition to the development efforts for this allocation, the Galaxy project holds numerous training events. Traditionally, we have used AWS instances for a number of these events. As the deployment integrations with Jetstream solidify, we will increasingly transition to using Jetstream to supply the necessary resources to training participants.

Project Abstract

Figuring Out Gas & Galaxies in Enzo (FOGGIE): Resolving the Small-Scale Structure of Gas Flows in the Circumgalactic Medium

PI: Lauren Corlies



How galaxies acquire, process, expel, and re-acquire their gas has long been at the heart of the open questions in galaxy evolution. Yet, while the diffuse gaseous halos surrounding galaxies, known as the circumgalactic medium (CGM), play host to these large-scale flows, the CGM has been chronically under-resolved in essentially every hydrodynamic cosmological simulation of galaxy evolution to date. Here, we request computing resources to carry out two projects aimed at understanding the small-scale physics of the CGM and how it relates to galaxy evolution:
1. Using a novel refinement method, we will resolve the CGM over half of cosmic time to unprecedented small scales (< 500 parsecs). Our preliminary tests indicate that resolution has a profound effect on the physical morphology, density, temperature, ionization, and chemical structure of the CGM. We will then explore a range of different star formation feedback models, many of which have been "successful" at reproducing the observations in simulations with lower CGM resolution. The bulk of these simulations will be for a single halo to enable easier comparisons, with additional runs for the "best" model in 5 other test halos to test for consistency and halo-to-halo variation.
2. Observations suggest that the absorption seen in spectral data is driven by small, dense clouds of cold gas residing in a diffuse, hot ambient medium. We will simulate for the first time how stripped material from such clouds mixes in conditions similar to those found in the outer galactic halo. With resolutions as small as 1 pc, we will conduct a parameter study of 16 simulations to place constraints on which physical properties of the clouds are most important for shaping the observed absorption line profiles.
In total, we request 202,000 SUs on Stampede2 with 118 TB of persistent backup on Ranch. Simultaneously studying the CGM at cosmological and parsec scales will provide unique insights into the small-scale, multiphase nature of the CGM that will inform both future simulations and future interpretations of observations.

Project Abstract

Mechanisms of photoreceptor subtype specification

PI: Kiara Eldred



A central challenge in developmental neurobiology is to understand how the myriad types of neurons in the human nervous system are generated. An attractive context to study how neurons choose their fates is the vertebrate retina, with its limited number of subtypes arranged in a well-defined layered structure. However, despite decades of research, the current developmental model is surprisingly vague, with the >60 neuronal subtypes of the retina specified in broad temporal windows and further diversified by stochastic (i.e. random) mechanisms. Though specific signaling pathways and transcription factors have been implicated in these mechanisms, a fundamental molecular and developmental understanding of cell fate specification in the vertebrate retina has not been achieved. Moreover, developmental features unique to the primate retina, such as the stochastic patterning of three cone subtypes, have been intractable to mechanistic studies. We propose an ambitious project to elucidate the temporal and stochastic mechanisms controlling the specification of the color-detecting cone photoreceptors in the human retina. The stochastic patterning of three cone cell subtypes, sensitive to red, green, or blue light, enables trichromatic color vision, a property unique to primates among mammals. Specification of cone subtypes is a two-step process, with a temporal decision to take on blue or red/green fate, followed by a stochastic choice to take on red or green fate. To determine the mechanisms controlling cone subtype fate, we have adapted a powerful system to differentiate human retinal organoids from induced pluripotent stem cells (iPSCs). These “retinas in a dish” recapitulate what is known about human cone cell specification, including gene expression patterns, cell morphologies, and developmental timing. This system enables us to address questions about human development for the first time on multiple scales: molecular, genetic, cellular, developmental, and evolutionary. 
With this retinal organoid system, we will test how the timing of retina-intrinsic thyroid hormone signaling controls blue and red/green cone fates. We will assess how natural variation in DNA elements affects DNA looping to control stochastic gene expression and red and green cone fates. We will differentiate non-human primate retinal organoids to evaluate the hypothesis that human trichromatic color vision arose from a recombination event involving the genes encoding the red- and green-detecting opsins. Finally, we will use single-cell RNA-seq to track the developmental trajectories of photoreceptors in order to understand the temporal mechanisms controlling fate specification. The resources granted to us by XSEDE will allow us to analyze RNA-seq and single-cell RNA-seq data from retinal organoids at different developmental time points, align sequencing results from over 700 individuals, run GWAS, and analyze ChIP-seq and ATAC-seq data from photoreceptors. Together, these experiments will provide insights into cone subtype specification that will be crucial for therapeutic applications and will lay the groundwork for deciphering the temporal and stochastic mechanisms that produce all of the neuronal subtypes of the human retina.

Project Abstract

Developing workflows for post-transcriptional gene regulation analyses.

PI: Mallory Freeberg



Regulation of gene activity at the level of mRNAs is central to many aspects of cell biology, development, and disease. The 3’-untranslated regions of transcripts in particular are major sites of post-transcriptional regulation by RNA-binding proteins (RBPs) and small RNAs (sRNAs). Next-generation sequencing applications are extremely useful for studying and understanding molecular processes mediated by RBPs and sRNAs, which means that large datasets are generated by biologists aiming to answer biological questions in this field. Currently, processing and analyzing such large datasets is typically farmed out to computational cores or research groups; however, many biologists would prefer to analyze their own data in order to more fully understand and interpret them. In order to facilitate bench scientists analyzing their own data, the Galaxy platform offers infrastructure, computing resources, and GUIs to execute computational tools on large biological datasets. The final component to the Galaxy platform is training materials to help biologists use these established resources. Here, we will develop training materials for analyzing datasets related to RBP- and sRNA-mediated regulation of gene expression in an effort to facilitate biologists using Galaxy for their own data analysis needs. In addition, we continue collaborating with biologists to explore gene regulation as it relates to their own topics of research.

Project Abstract

Nonequilibrium Molecular Dynamics Simulations, VII

PI: Rigoberto Hernandez



Continuing our ongoing work on XSEDE, a series of simulations is proposed to enhance the computational effort in support of our subcontract to NSF Grant #CHE-1503408 (CCI Center for Sustainable Nanotechnology [CSN]) and the just-renewed NSF Grant #CHE-1700749 on “Dynamical Consistency in Nonequilibrium Molecular Dynamics Simulations and Applications.” 2.0 million SUs on Stampede are requested for nonequilibrium simulations of particles with varying degrees of surface patterning and structure spanning a range of specificity from a mean-field model of Janus particles, and the elaboration of stochastic hard collision (SHC) model particles as a simplified solvent for chemical reactions. 3.40 million SUs on Stampede are requested to run all-atom and coarse-grained models of sustainable nanoparticles interacting with each other or with membranes. 3.80 million SUs on Comet are requested to perform adaptive steered molecular dynamics (ASMD) simulations of increasingly complex proteins. In total, we are requesting 5.40 million SUs on Stampede and 3.80 million SUs on Comet to perform a series of specific nonequilibrium simulations that will advance our understanding of structure and dynamics in a broad range of nanomaterials across multiple scales.
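As a quick consistency check on the figures quoted in this abstract, the per-project requests can be summed by resource. This is only an illustrative sketch: the list name `REQUESTS`, the helper `totals_by_resource`, and the short purpose labels are assumptions introduced here, and the SU values are the ones stated in the text above.

```python
from collections import defaultdict

# SU requests as stated in the abstract (resource, SUs, short label).
# The labels are shorthand introduced for this sketch only.
REQUESTS = [
    ("Stampede", 2_000_000, "Janus / SHC nonequilibrium simulations"),
    ("Stampede", 3_400_000, "sustainable nanoparticle models"),
    ("Comet", 3_800_000, "ASMD simulations of proteins"),
]


def totals_by_resource(requests):
    """Sum the requested SUs for each resource."""
    totals = defaultdict(int)
    for resource, sus, _label in requests:
        totals[resource] += sus
    return dict(totals)


if __name__ == "__main__":
    for resource, sus in totals_by_resource(REQUESTS).items():
        print(f"{resource}: {sus / 1e6:.2f} million SUs")
```

Integer SU counts are used instead of floating-point millions so that the sums compare exactly with the totals quoted in the abstract.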

Project Abstract

Modeling protein interactions and assembly at the cellular scale

PI: Margaret Johnson



Current estimates of protein interactions in a single cell such as yeast enumerate ~6000 proteins and approximately 26000 distinct interactions between them [1]. While the physical laws that govern these biomolecular interactions are known, it is a significant challenge to extend our knowledge of molecular forces to predict how such a network of many stochastic interacting components behaves in the cell. In my research group at Johns Hopkins, we use theory and computational modeling to predict pathways and mechanisms of multi-protein self-organization processes in the cell using accurate spatial and temporal dynamics. Our current emphasis is on understanding protein recruitment and assembly on the membrane, such as occurs in clathrin-mediated endocytosis (CME) [2]. We develop novel algorithms [3] for detailed single-particle reaction diffusion (RD) simulations of multi-protein assembly, and additionally use molecular simulation and experiment to quantify the rates of individual protein interactions. The RD simulations we perform to study mechanisms of protein assembly require binding rates as parameters. Although many of the rates have been determined experimentally, they are only for solution binding (3D), and the rate constants for proteins diffusing on the membrane surface (2D) must be approximated. To better quantify these 2D rate constants and develop a generalized approach for best predicting 2D rates from 3D values, we will use molecular dynamics (MD) simulation coupled with enhanced sampling techniques [4] to calculate binding between specific adaptor proteins involved in CME in humans. To this end, we initially need to study the thermodynamics of protein-protein interactions to decipher the binding sites and binding modes. We will employ the metadynamics approach [5] to calculate the free energy landscape of protein interactions. 
A major emphasis in the renewal cycle will be placed on the MD simulations, which were not performed in the initial proposal, and demand a large increase in computational resources.

Project Abstract

Campus Champions Allocations for Johns Hopkins University (TRA130018)

PI: Anthony Kolasny



The Campus Champions Allocations for Johns Hopkins University (JHU) provides initial XSEDE resources as researchers initiate their allocation grants. It has provided computational time for initial porting and preliminary results.

Project Abstract

Scaling core genomics algorithms on Xeon Phi

PI: Benjamin Langmead



With the advent of high-throughput life science instruments such as massively parallel DNA sequencers, the bottlenecks in life science research are increasingly computational. Scientists depend crucially on efficient, accurate software tools for analyzing sequencing datasets. We propose to optimize three read alignment tools written in my lab, Bowtie, Bowtie 2 and Vargas, as well as a fourth tool called HISAT. These are widely used tools for solving two variants of the read alignment problem: standard DNA alignment and spliced RNA alignment. Our aim is to substantially improve thread scaling for these tools on the Intel Xeon Phi Knights Landing architecture. This requires that we use VTune to investigate existing scaling bottlenecks. It also requires that we perform thread-scaling experiments across a wide range of tool parameters and numbers of simultaneous threads.

Project Abstract

Campus Champion for Johns Hopkins University

PI: Kevin Manalo



We are looking to support users needing to expand beyond local resources (local = MARCC = Maryland Advanced Research Computing Center).

Project Abstract

Multi-Physics Computational Modeling of Cardiac Flows and Heart Murmurs Using a Parallelized Immersed Boundary Solver

PI: Rajat Mittal



Heart disease is the single most consequential disease in the industrialized world, and current trends point to a worsening outlook. The thesis of the current research is that transformative ideas in the modeling, computation, and analysis of cardiac blood flows and associated heart sounds can enable highly effective and inexpensive simulation-guided therapies and diagnostic tools for heart disease in the not-too-distant future. Understanding cardiac flows, especially inside the left ventricle, is an essential first step to diagnosing and treating several heart-related diseases. Numerous efforts have been made over the past two decades to analyze cardiac flows through in-vitro and in-vivo experiments and computational methods. Though medical imaging techniques such as phase-contrast MRI and CT provide details about the anatomy and some gross features of the flow, they offer little insight into complex fluid behavior. Most computational and laboratory (in-vitro) studies have employed simplified geometries while attempting to match physiological values of key hemodynamic parameters, and have ignored pivotal interactions between flexible heart tissues, especially heart valves, and the complex flow patterns inside the heart chambers. More importantly, the role of cardiac blood flow in heart function and cardiac disease progression has yet to be fully explored. Heart sounds contain much information on the cardiac flow and associated cardiovascular health and disease. Auscultation of heart sounds is an effective and non-invasive diagnostic modality. However, the underlying physics of heart sound generation has so far made few inroads into auscultation. Because the causal mechanisms of heart sounds are poorly understood, correlations between heart sounds and underlying pathology are currently based primarily on deduction, inference, and heuristics. 
The long-term goal of the current project is to develop image-based, biophysically-detailed, multi-scale, multi-physics models of the ventricular blood flows, coagulation biochemistry and heart sounds in health and disease that run efficiently on large-scale, CPU clusters, and to use these models to understand the biophysics of cardiac dysfunction and therapy. In this year, we will investigate i) left ventricular thrombus inhibition by anticoagulants in infarcted left ventricles, ii) thrombus formation in bioprosthetic transcatheter aortic valves, iii) heart sound generation and propagation in the realistic human thorax, and iv) pulsatile blood flow dynamics in a stenosed aorta.

Project Abstract

Graduate-level course in computational materials design

PI: Timothy Mueller



In the coming semester I will be teaching a graduate-level course on computational materials design, with a focus on predictive methods for calculating the properties of materials. The course (EN.510.633, "Computational Materials Design") can be found by searching the following web site from the Johns Hopkins registrar's office: https://sis.jhu.edu/classes/default.aspx. Methods to be covered include potential models / force fields, density functional theory, quantum Monte Carlo, molecular dynamics, cluster expansions, and machine learning. The course will include five computational laboratory assignments in which the students will be asked to apply the methods they have learned to calculate material properties. Seventeen students have already registered for the class, and resources are requested for up to twenty students. This course has been taught three times before (Fall 2013, Fall 2014, and Fall 2016) using the TACC Stampede supercomputer, and resources are requested to teach the course on Stampede2. Many of the students taking the course are unfamiliar with modern high-performance computing environments, and access to the XSEDE resources gives these students valuable experience in running modern software packages in a realistic research environment. The software for this course was available on TACC Stampede, but one of the software packages (the General Utility Lattice Program, or GULP) is not yet available on Stampede2. If this allocation is granted, I will request that GULP be installed for this course. To repeat the success of this course in prior years, I am requesting a renewal of this educational grant.

Project Abstract

First-principles study of materials for energy storage and conversion

PI: Timothy Mueller



Computational resources on TACC Stampede2 are requested to support three research projects. In the first project, resources are requested to calculate the adsorption energies of O, OH, and OOH on the surfaces of Ni-Pt nanoparticles, to develop an improved model of the structure and properties of these nanocatalysts in oxidizing conditions. In the second project, resources are requested to construct cluster expansions of CO and H adsorbed on different Cu surfaces, to improve our understanding of how surface coverage impacts the catalytic properties of Cu for CO2 and CO reduction. In the third project, resources are requested to generate cluster expansions of hydrogen adsorbed on the surfaces of transition metal phosphides and platinum, to better understand the mechanism of the hydrogen evolution reaction on transition metal phosphides and how they compare to platinum. Detailed benchmark and scaling information for the calculations to be performed on Stampede2 is provided.

Project Abstract

Coarse-grained model of amyloid beta interactions with organelle-specific bilayers as a function of pH

PI: Elizabeth Ploetz



OVERVIEW
Much Alzheimer’s disease (AD) research is focused on elucidating which neuronal organelles are the predominant producers of the amyloid-β peptide (Aβx). Researchers want to know if the overproduction of Aβx that occurs in this disease is due to a mistrafficking of the amyloid precursor protein (APP) and/or the β-secretase enzyme (BACE-1) that processes APP in AD, such that these molecules have increased residence in organelles that promote Aβx production. Publications from 2013-2014 have shown that transgenic models of AD possessing mutations in Presenilin 1 (PS1) have defective vacuolar ATPases. As a result, lysosomes are too alkaline, and Aβx degradation and clearance are inhibited. Unfortunately, these findings are irrelevant for the vast majority of AD cases, which do not involve mutations to PS1. It is known, however, that in sporadic AD (SAD) there is a decreased intracellular pH. How this acidosis affects organellar pH has not been investigated. Since BACE-1 has an acidic pH optimum, determining how the intracellular acidosis of SAD modifies the pH of the organelles within which APP is processed may help pinpoint which organelle has the greatest potential to produce Aβx and what specific environmental modification could be therapeutic. The aim of the proposed work is to gain a molecular-level understanding from computer simulations of how changes in the lumenal pH of crude organelle models (double bilayer "organelle cross-sections" with organelle-specific lipid compositions) impact Aβ42 self-aggregation and Aβ42–lipid interactions. This aim will be achieved using coarse-grained molecular dynamics simulations of liposomes with encapsulated Aβx. The proposed work is driven by the hypothesis that lower pH values will promote self-aggregation of Aβx and interactions of Aβx with the liposome walls, leading to disruption of the integrity of the liposome. 
Coarse-grained (CG) molecular dynamics (MD) simulations will be performed of double lipid bilayer systems in which the lumenal compartment is filled with encapsulated Aβ42 (25 μM). 25 μM was the upper Aβ42 concentration estimate from an in vitro study that quantified the orders-of-magnitude higher Aβ42 concentration found in vesicles as opposed to the pM-nM levels found extracellularly. Different liposome lipid compositions will be used in an attempt to begin to take into account the known changes in membrane fluidity, curvature, width, packing defects, and surface charge found in different biological membranes. The lipid compositions were taken from the experimental literature. For this initial study, further complexities such as membrane-bound proteins, realistic lumenal contents, and alterations in lipid composition due to AD will not be included in the “organelle” model. For each organelle model, the effect of pH will simply be taken into account by changing the type of bead parameters used for the CG sites for the negatively charged lipids and the amino acids. Three different pHs will be simulated, roughly corresponding to pH 3, pH 4, and pH 5-7. While this corresponds to a larger range of pH values than the expected shifts due to AD pathology, use of these ranges will allow for the observation of trends. In real life, it is expected that hyperacidification not only leads to changes in the way Aβx interacts, but also to higher Aβx concentrations due to higher BACE-1 activity. Since this has not been quantified experimentally, the Aβx concentration will not be varied simultaneously with pH in the simulations. Aggregation of Aβ42 and Aβ42–lipid preferential interfacial interactions, as well as lipid sorting, will be quantified using the Fluctuation Solution Theory (FST). 
METHODS
25 μM Aβ42 will be simulated inside double lipid bilayer systems using the lipid compositions previously determined experimentally; the asymmetry of the lipid distributions will be approximated based upon knowledge of the plasma membrane asymmetry. No new force field (FF) parameters will need to be developed. Simulations will be performed using the Martini FF version 2.2P (which accounts for orientational polarization of water) combined with an elastic network model, ElNeDyn (Elastic Network in Dynamics, parameterized to reproduce the backbone deformations, etc. observed in atomistic simulations of proteins). The Martini FF was chosen because it is a thermodynamics-based coarse-graining, it is not too coarse (chemical specificity for different amino acids and different lipids is retained), parameters are available for all of the molecules necessary and for any possible extensions that might be desired, the force field is more transferable than many other CG FFs, and the CG representation can easily be transferred back to a fine-grained (FG) representation at any time. Because the Martini FF does not currently allow for secondary structure (2°) changes, and because these changes are believed to be important for this work, three different sets of simulations will be performed for each lipid composition and lumenal pH value, each with a different initial conformation for all of the Aβ42 molecules. One set will start with all the Aβ42 molecules in a helix-kink-helix conformation determined from NMR experiments in an apolar environment (PDB 1IYT), the second with all the Aβ42 molecules in β-sheet structures (monomer configurations will be extracted from the solution NMR structure of Aβ42 fibrils) (PDB 2BEG), and the third with all the Aβ42 molecules in the conformations determined by NMR studies of Aβ42 in aqueous solution (PDB 1Z0Q). In each of the three starting configurations, the Aβ42 molecules will be placed at random configurations within the liposome lumen. 
Although the FF does not allow for secondary structural changes, it has been shown to correctly account for tertiary structural changes and for correct peptide-lipid orientations. The simulation time necessary is unknown, but the simulations will be run until all properties analyzed have converged. Any properties that do not converge with appropriate levels of effort will not be reported without a thorough discussion of the statistical uncertainty. The preferential solvation, ps, will be quantified to allow for a rigorous analysis of the changes in aggregation properties that does not rely upon a correct visual interpretation of the simulated trajectories and that goes beyond direct-contact interactions to also include through-space interactions, which give large contributions to thermodynamic properties. The preferential solvation of component A by components B and C is defined as $ps_{AB,C} = \delta x_{B,A} - \delta x_{C,A}$. The $\delta x_{j,i}$ are the excess local mole fractions, and can be calculated once the Kirkwood-Buff integrals (KBIs) of the system have been calculated. The KBIs will be calculated based upon the particle fluctuations within small, grand canonical regions within the simulation boxes. A large, positive $ps_{AB,C}$ would indicate that component A has a stronger preference to be solvated by B than by C. Trends in the excess coordination numbers (extractable from an FST analysis) with pH will allow the changes in the Aβ42 molecular association equilibrium constant with pH to be calculated. All of the proposed analysis procedures rely upon established, derived expressions that involve no approximations or parameters; the PI has extensive experience with FST. The analysis methods will provide information analogous to that obtained by Timasheff’s osmotic pressure studies of protein-protein interactions.
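The fluctuation route to the KBIs mentioned above can be sketched as follows. This is a minimal illustration, assuming particle counts have already been sampled in a small open sub-volume of the simulation box; it uses the standard fluctuation expression for a KBI and is not the PI's analysis code. The function names and the toy counts are hypothetical.

```python
def kirkwood_buff_integral(counts_i, counts_j, volume, same_species=False):
    """Estimate G_ij from particle counts sampled in an open sub-volume.

    G_ij = V * [ (<Ni Nj> - <Ni><Nj>) / (<Ni><Nj>) - delta_ij / <Ni> ]
    counts_i and counts_j hold one count per sampled frame.
    """
    n_frames = len(counts_i)
    mean_i = sum(counts_i) / n_frames
    mean_j = sum(counts_j) / n_frames
    mean_ij = sum(a * b for a, b in zip(counts_i, counts_j)) / n_frames
    g = (mean_ij - mean_i * mean_j) / (mean_i * mean_j)
    if same_species:
        g -= 1.0 / mean_i  # self term, present only when i == j
    return volume * g


def excess_coordination_number(counts_i, counts_j, volume, same_species=False):
    """N_ij = rho_j * G_ij, with rho_j = <Nj> / V."""
    rho_j = sum(counts_j) / len(counts_j) / volume
    g_ij = kirkwood_buff_integral(counts_i, counts_j, volume, same_species)
    return rho_j * g_ij


if __name__ == "__main__":
    # Toy counts for one species over three frames (illustrative only).
    peptide_counts = [8, 10, 12]
    g_pp = kirkwood_buff_integral(peptide_counts, peptide_counts, 100.0, True)
    n_pp = excess_coordination_number(peptide_counts, peptide_counts, 100.0, True)
    print(g_pp, n_pp)
```

A positive excess coordination number indicates net association of the two species relative to a random distribution, and a negative one indicates net exclusion. In practice the sub-volume size would need to be varied to check for finite-size effects.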
Additional properties to be analysed include the lipid organization within the membrane for different compositions and the effects of Aβ42 on the liposome itself, e.g., curvature and membrane tension, since it is believed that lipid surfaces catalyze Aβ42 aggregation. ANTICIPATED RESULTS The $ps_{\mathrm{A}\beta\,\mathrm{lipid},\,\mathrm{water}}$ (see Methods), Aβ42-Aβ42 excess coordination numbers, and deformation of the membrane will increase as the pH decreases irrespective of the lipid composition used.

Project Abstract

The Galaxy XSEDE Gateway

PI: James Taylor



We have developed and continue to support the Galaxy genomics analysis system (Goecks et al. 2010). Our main public Galaxy analysis website currently supports more than 50,000 genomics researchers performing hundreds of thousands of analysis jobs every month. Galaxy’s positive impact on biomedical research is felt in two areas. First, Galaxy enables biomedical researchers to perform complex compute-intensive analyses with nothing to install and configure. Second, it allows tool developers to deploy their analysis applications without the need to design interfaces and maintain compute infrastructure. Both of these are profoundly felt. Although we do not require users to cite our work, over 700 papers published in 2014 alone used Galaxy as the analysis platform. Making XSEDE resources available through the main Galaxy instance not only increases the number of users whose science can be supported, but the unique capabilities of XSEDE resources also increase the types of analyses that can be offered.

Project Abstract

Long-read sequencing analysis of non-model organisms

PI: Winston Timp



Long-read sequencing has finally reached an inflection point where the accuracy and yield are sufficient to begin affordable interrogation of non-model organisms, that is, organisms without a well-established genome sequence. Evidence of this abounds: the avian phylogenetics project, the rapid spread of microbial sequencing, and even the recent genome assembly of many agricultural products, both plant and animal. We intend to leverage long-read omics data generated by both ourselves and others to perform comparisons between species to examine conservation and, within specific species or families, to examine unique adaptations. XSEDE will help by allowing us to set up a Galaxy service that is easy for our collaborators at other institutions to use. Specifically, we will make available datasets for a comparative genomics project (protein, cds, predicted gene families, etc.) and allow our collaborators with limited computational experience, but extensive biological knowledge, to perform BLAST searches, orthology predictions, gene ontology assignments, conservation investigation and other analyses through the Galaxy platform.

Project Abstract

Computational Anatomy Gateway

PI: Daniel Tward



Computational Anatomy is a discipline focused on the quantitative analysis of the variability in biological shape. Large Deformation Diffeomorphic Metric Mapping (LDDMM) is the key algorithm, which assigns computable descriptors of anatomical shapes and a metric distance between them. This is achieved by describing populations of anatomy as a group of diffeomorphic transformations applied to a template, and using a metric on the space of diffeomorphisms. LDDMM is being used extensively in the neuroimaging (\url{www.mristudio.org}) and cardiovascular imaging (\url{www.cvrgrid.org}) communities. Modern high-resolution scanners are producing images which require the computational power of multiple processing units. One of our goals is to analyze the high-resolution images on large-memory (1TB memory) machines with the existing OpenMP implementation of LDDMM. The initial startup year was extremely successful in meeting our goals and a science gateway has been created. The Computational Anatomy Science Gateway (\url{https://www.xsede.org/gateways-listing}) has been established to support the neuroimaging community. This year, we plan to continue developing the science gateway and expanding its role in supporting this community. The neuroimaging community (\url{www.mristudio.org}) that this gateway will support has 6000 users, who are beginning to transfer to our science gateway model, while we continue to expand our user base. Our local cluster has processed over 47K brains since 2009, which required 450K CPU-hrs on 8-core nodes with 32GB of memory. Over the past year we have begun to offload the majority of jobs from our local cluster to XSEDE resources. The gateway is enabling the processing of large populations in parallel on multiple cores. With these new data sets, innovations in software optimization and algorithms that benefit from statistical knowledge of populations are being explored.
Significant development with GPGPUs has occurred, and we hope to further exploit the KNL environment in the same manner. In addition to the neuroimaging community, the cardiovascular community (\url{www.cvrgrid.org}) will also utilize the gateway for shape analysis (\url{http://cvrgrid.org/features/cardiac-c}) in the future if resources allow.

Project Abstract

A Global Sensitivity Analysis Software for Monotone Missing Data

PI: Chenguang Wang



Randomized trials are commonly plagued by missing data when outcomes are scheduled to be measured at fixed points in time after randomization. The analysis of such trials relies on untestable assumptions about the missing data mechanism. To address such missing data issues, it has been recommended that reporting the sensitivity of the trial results to these assumptions be mandatory. Scharfstein et al. (2017) proposed a formal methodology for conducting sensitivity analysis of randomized trials in which outcomes are missing because subjects prematurely withdraw from study participation. In their approach, the untestable and testable assumptions were guaranteed to be compatible; their testable assumptions were based on a flexible, semi-parametric model for the distribution of the observable data. In this project, our goal is to implement the method as software packages.
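To make the idea of a sensitivity analysis concrete, here is a deliberately simplified sketch for a single outcome measured at one visit. It assumes an exponential-tilt model in which the distribution of the missing outcomes is the observed distribution reweighted by exp(alpha * y), with alpha as the untestable sensitivity parameter. This is a swapped-in toy illustration, not the monotone, multi-visit method of Scharfstein et al. (2017), and the function names are hypothetical.

```python
import math


def tilted_mean(observed, alpha):
    """Mean of the missing outcomes under an exponential-tilt assumption.

    E[Y | missing] = sum(y * exp(alpha * y)) / sum(exp(alpha * y)),
    where the sums run over the observed outcomes.
    """
    # Subtract the largest exponent for numerical stability; the shift cancels.
    shift = max(alpha * y for y in observed)
    weights = [math.exp(alpha * y - shift) for y in observed]
    return sum(w * y for w, y in zip(weights, observed)) / sum(weights)


def overall_mean(observed, n_missing, alpha):
    """Combine the observed mean with the tilted mean for the dropouts."""
    n_obs = len(observed)
    mean_obs = sum(observed) / n_obs
    mean_mis = tilted_mean(observed, alpha)
    return (n_obs * mean_obs + n_missing * mean_mis) / (n_obs + n_missing)


if __name__ == "__main__":
    outcomes = [1.0, 2.0, 3.0, 4.0]  # toy observed outcomes
    for alpha in (-1.0, 0.0, 1.0):   # alpha = 0 treats dropouts like completers
        print(alpha, overall_mean(outcomes, n_missing=4, alpha=alpha))
```

Scanning alpha over a plausible range and reporting how the estimate moves is the essence of the sensitivity analysis: alpha = 0 reproduces the estimate obtained when dropouts are assumed to resemble completers, while nonzero values shift the assumed outcomes of dropouts up or down.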

Project Abstract

Direct numerical simulations of transitional and turbulent boundary layers

PI: Tamer Zaki



Very thin boundary layers at the liquid-solid interface often determine the dynamics of the entire flow configuration. Direct numerical simulations were performed of the transitional boundary layer, and resources are requested for simulation of the fully turbulent boundary layer downstream. Spatially and temporally resolved flow fields of both computations will be hosted within the Johns Hopkins Turbulence DataBase (JHTDB: \url{http://turbulence.pha.jhu.edu}) and will be publicly available to the entire turbulence research community. The simulation setup is divided into two zones. The first is a C-grid that includes the flow upstream of the leading edge of a plate, the interaction of incoming disturbances with the leading edge, the transitional boundary layer downstream, and ultimately the formation of an equilibrium turbulent boundary layer at $Re_\tau = 480$. This simulation was completed in the first two years of the project. The second zone, which is the focus of this renewal, is a Cartesian domain that overlaps with the first zone and continues the turbulent boundary layer simulation to a higher Reynolds number, $Re_\tau = 2,000$. The transitional boundary layer is characterized by a random juxtaposition of laminar and turbulent regions, with the turbulence appearing sporadically in space and time. Within the turbulent patches, the perturbations are more energetic than in equilibrium flows, and the skin friction and heat transfer rates also exceed their values in fully turbulent conditions. These effects will be examined using ensemble averaging techniques applied to naturally forming turbulent spots in direct numerical simulations of boundary-layer transition. The dynamics of the laminar-turbulent interface that surrounds the turbulent patches, and which dictates the spreading rate of these spots, will also be studied.
Farther downstream, the flow relaxes towards a fully turbulent state, with a continual increase in the Reynolds number with distance from the leading edge. Unlike earlier simulations of fully turbulent boundary layers, where inflow perturbations are artificially synthesized, our simulation of transition provides a physical and accurate inflow for our direct numerical simulation of the turbulent boundary layer. We will study the turbulent/non-turbulent interface that separates the boundary layer from the free stream, and how it is distorted by large-scale coherent motions within the near-wall region. The origin and dynamics of these large-scale structures will also be studied, as they play a central role in the near-wall turbulence cycle.
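The conditional (ensemble) averaging over turbulent patches described in this abstract can be illustrated with a minimal sketch. It assumes a simple threshold on the magnitude of a velocity fluctuation as the laminar/turbulent detector; the abstract does not specify the detector the project uses, so the threshold rule, the function names and the toy data are all hypothetical.

```python
def threshold_detector(fluctuation, threshold):
    """Flag samples as turbulent where |u'| exceeds a threshold (an assumption)."""
    return [abs(u) > threshold for u in fluctuation]


def intermittency_and_conditional_means(signal, indicator):
    """Return (gamma, turbulent-conditioned mean, laminar-conditioned mean).

    signal:    samples of a quantity of interest, e.g. skin-friction coefficient
    indicator: booleans, True where the flow is flagged turbulent
    gamma:     intermittency factor, the fraction of samples flagged turbulent
    """
    turbulent = [s for s, flag in zip(signal, indicator) if flag]
    laminar = [s for s, flag in zip(signal, indicator) if not flag]
    gamma = len(turbulent) / len(signal)
    mean_turb = sum(turbulent) / len(turbulent) if turbulent else float("nan")
    mean_lam = sum(laminar) / len(laminar) if laminar else float("nan")
    return gamma, mean_turb, mean_lam


if __name__ == "__main__":
    # Toy data only: four samples of a fluctuation and a wall quantity.
    u_prime = [0.0, 0.1, 2.0, -3.0]
    wall_quantity = [1.0, 1.0, 4.0, 6.0]
    flags = threshold_detector(u_prime, threshold=1.0)
    print(intermittency_and_conditional_means(wall_quantity, flags))
```

The unconditional mean is recovered as gamma times the turbulent-conditioned mean plus (1 - gamma) times the laminar-conditioned mean, which is a useful consistency check on any detector.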