Current Research Projects
Pegasus Workflow Management System
The Pegasus project encompasses a set of technologies that help workflow-based applications execute in a number of different environments, including desktops, campus clusters, grids, and now clouds. Scientific workflows allow users to easily express multi-step computations, for example retrieving data from a database, reformatting the data, and running an analysis. Once an application is formalized as a workflow, the Pegasus Workflow Management System can map it onto available compute resources and execute the steps in the appropriate order. Pegasus can easily handle workflows with several million computational tasks.
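The idea of executing steps in the appropriate order can be illustrated with a minimal sketch. This is not the Pegasus API: the task names and the `execution_order` helper are hypothetical, and the ordering relies only on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical three-step workflow: each task maps to the set of
# tasks that must finish before it can start.
workflow = {
    "retrieve": set(),
    "reformat": {"retrieve"},
    "analyze": {"reformat"},
}

def execution_order(dag):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())

print(execution_order(workflow))
```

A real workflow management system additionally decides where each task runs and stages the data it needs; the sketch covers only the ordering step.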
Funding Agency: NSF
Repository and Workflows for Accelerating Circuit Realization
RACE will enable researchers and design experts to expand the state of the art in ASIC design through novel cyberinfrastructure and workflow tools that accelerate every phase of discovery, creation, adoption, and use by linking and computing around a repository of user-generated data, including new tools, new IP blocks/libraries, new design flows, training modules, and an experience base documenting best practices to adopt (and pitfalls to avoid).
Funding Agency: DARPA
Predictive Modeling and Diagnostic Monitoring of Extreme Science Workflows
Scientific workflows are now used in a number of scientific domains, including astronomy, bioinformatics, climate modeling, earth science, civil engineering, physics, and many others. Unlike monolithic applications, workflows often run across heterogeneous resources distributed across wide area networks. Some workflow tasks may require high performance computing resources, while others can run efficiently on high throughput computing systems. Workflows also access data from potentially different data repositories and use data, often represented as files, to communicate between the workflow components. As a result of these data access patterns, workflow performance can be greatly influenced by the performance of networks and storage devices.
Funding Agency: DOE
Secure and Resilient Architecture: Scientific Workflow Integrity with Pegasus (SWIP)
The Scientific Workflow Integrity with Pegasus project strengthens cybersecurity controls in the Pegasus Workflow Management System in order to provide assurances with respect to the integrity of computational scientific methods. These strengthened controls enhance both Pegasus’ handling of science data and its orchestration of software-defined networks and infrastructure. The result is increased trust in computational science and increased assurance in our ability to reproduce the science by allowing scientists to validate that data has not been changed since a workflow completed and that the results from multiple workflows are consistent.
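One common way to check that data has not changed since a workflow completed is to record a cryptographic checksum for each output and compare it later. The sketch below illustrates that general technique only; it is not the SWIP implementation, and the function names and in-memory "files" are hypothetical.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def record_checksums(outputs: dict) -> dict:
    """Build a manifest mapping each output name to its digest."""
    return {name: sha256_of(content) for name, content in outputs.items()}

def changed_files(outputs: dict, manifest: dict) -> list:
    """List outputs that are missing or whose content no longer matches."""
    return sorted(
        name
        for name, digest in manifest.items()
        if name not in outputs or sha256_of(outputs[name]) != digest
    )

# Hypothetical outputs captured when the workflow finished.
finished = {"result.csv": b"1,2,3\n", "log.txt": b"done\n"}
manifest = record_checksums(finished)

# Later, one file has been altered.
later = {"result.csv": b"1,2,4\n", "log.txt": b"done\n"}
print(changed_files(later, manifest))
```

The same manifest comparison could be applied to the outputs of two separate runs to check whether their results are consistent.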
Funding Agency: NSF
In Situ Data Analytics for Next Generation Molecular Dynamics Workflows
Molecular dynamics simulations, which study the classical time evolution of a molecular system at atomic resolution, are widely used in chemistry, materials science, molecular biology, and drug design; they are among the most common simulations run on supercomputers. Next-generation supercomputers will have dramatically higher performance than current systems, generating more data that needs to be analyzed. The coordination of data generation and analysis cannot rely on manual, centralized approaches as it does now. This project aims to transform the centralized nature of molecular dynamics analysis into a distributed approach that is predominantly performed in situ. Specifically, this effort combines machine learning and data analytics approaches, workflow management methods, and high performance computing techniques to analyze molecular dynamics data as it is generated, save to disk only what is really needed for future analysis, and annotate molecular dynamics trajectories to drive the next steps in increasingly complex simulation workflows.
Funding Agency: NSF
Population Architecture using Genomics and Epidemiology (PAGE)
In recent years, genome-wide association studies (GWAS) have allowed researchers to uncover hundreds of genetic variants associated with common diseases. However, the discovery of genetic variants through GWAS research represents just the first step in the challenging process of piecing together the complex biological picture of common diseases. To help speed the process, the National Human Genome Research Institute is supporting new research in existing large epidemiology studies, all with a rich range of measures of health and potential disease, and many with long-term follow-up. The focus of the new research is on how genetic variants initially identified through GWAS research are related to a person’s biological and physical characteristics, such as weight, cholesterol levels, blood sugar levels, or bone density. Scientists will also examine how non-genetic factors, such as diet, medications, and smoking, may interact with genetic factors or each other to influence health outcomes.
Funding Agency: NIH
Center for Collaborative Genetic Studies on Mental Disorders (CSGSMD)
The Center for Collaborative Genetic Studies on Mental Disorders is a collaboration of Rutgers University RUCDR, Washington University in St. Louis, and the University of Southern California’s Information Sciences Institute. It is funded by a grant from the National Institute of Mental Health. The Center produces, stores, and distributes clinical data and biomaterials (DNA samples and cell lines) available in the NIMH Human Genetics Initiative. The Center creates and distributes computational tools that support investigation and analysis of the clinical data. In addition, the Center creates tools that enable researchers to determine which samples or data might be of use to them, so that they may request access from NIMH.
Funding Agency: NIH
WRENCH: Workflow Management System Simulation Workbench
Capitalizing on recent advances in distributed application and platform simulation technology, WRENCH makes it possible to (1) quickly prototype workflows, workflow management system (WMS) implementations, and decision-making algorithms; and (2) evaluate and compare alternative options scalably and accurately for arbitrary, and often hypothetical, experimental scenarios. This project will define a generic and foundational software architecture that is informed by current state-of-the-art WMS designs and planned future designs. The implementations of the components in this architecture, taken together, form a generic “scientific instrument” that can be used by workflow users, developers, and researchers. This scientific instrument will be instantiated for several real-world WMSs and used for a range of real-world workflow applications.
Funding Agency: NSF
Boutiques: A cross-platform application repository for science gateways
Porting applications to science gateways is a critical step to enable their execution on distributed computing infrastructures and their sharing among scientific communities. However, application porting remains a costly human effort that consists of 1) installing the application on the target execution platform, 2) describing the application in a format compatible with the science gateway, and 3) generating proper user interfaces. Because science gateways vary so widely, porting efforts are often repeated several times, whereas pooling them would reduce cost and improve the quality of the ported applications. Boutiques is an application repository that allows automatic import and exchange of applications in science gateways. Compared to previous initiatives, our repository relies on Linux containers to solve the problem of application installation in a lightweight manner. In addition, it adopts a flexible application description format that is versatile enough to be used in various science gateways.