Current Research Projects

 


Pegasus Workflow Management System

The Pegasus project encompasses a set of technologies that help workflow-based applications execute in a number of different environments, including desktops, campus clusters, grids, and now clouds. Scientific workflows allow users to easily express multi-step computations, for example retrieving data from a database, reformatting the data, and running an analysis. Once an application is formalized as a workflow, the Pegasus Workflow Management System can map it onto available compute resources and execute the steps in the appropriate order. Pegasus can easily handle workflows with several million computational tasks.
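As a concrete illustration of such a multi-step computation, the sketch below expresses a three-step workflow (query a database, reformat the result, run an analysis) using the Pegasus 5.x Python API; the executable paths, file names, and site name are illustrative assumptions, not part of any real application.

```python
# Minimal sketch of a three-step workflow (query -> reformat -> analyze),
# assuming the Pegasus 5.x Python API; executables and file names are illustrative.
from Pegasus.api import Workflow, Job, File, Transformation, TransformationCatalog

raw     = File("records.csv")        # produced by the first job
cleaned = File("records.clean.csv")
report  = File("analysis.txt")

# Register the (hypothetical) executables that implement each step.
tc = TransformationCatalog()
query    = Transformation("query_db", site="local", pfn="/opt/bin/query_db", is_stageable=False)
reformat = Transformation("reformat", site="local", pfn="/opt/bin/reformat", is_stageable=False)
analyze  = Transformation("analyze",  site="local", pfn="/opt/bin/analyze",  is_stageable=False)
tc.add_transformations(query, reformat, analyze)

wf = Workflow("example-analysis", infer_dependencies=True)
wf.add_transformation_catalog(tc)
wf.add_jobs(
    Job(query).add_args("-o", raw).add_outputs(raw),
    Job(reformat).add_args("-i", raw, "-o", cleaned).add_inputs(raw).add_outputs(cleaned),
    Job(analyze).add_args("-i", cleaned, "-o", report).add_inputs(cleaned).add_outputs(report),
)

# Pegasus infers the execution order from the file dependencies above; the planner
# then maps this abstract workflow onto whatever compute resources are available.
wf.write("workflow.yml")
```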

Learn more | Source Code

Funding Agency: NSF

 


Repository and Workflows for Accelerating Circuit Realization

RACE will enable researchers and design experts to expand the state of the art in ASIC design through novel cyberinfrastructure and workflow tools that accelerate every phase of discovery, creation, adoption, and use. These tools link to, and compute around, a repository of user-generated data that includes new tools, new IP blocks and libraries, new design flows, training modules, and an experience base documenting best practices to adopt (and pitfalls to avoid).

Learn more

Funding Agency: DARPA

 


 

Panorama 360: Performance Data Capture and Analysis for End-to-end Scientific Workflows

Scientific workflows are now being used in a number of scientific domains, including astronomy, bioinformatics, climate modeling, earth science, civil engineering, physics, and many others. Unlike monolithic applications, workflows often run across heterogeneous resources distributed over wide area networks. Some workflow tasks may require high-performance computing resources, while others can run efficiently on high-throughput computing systems. Workflows also access data from potentially different data repositories and use data, often represented as files, to communicate between workflow components. As a result of these data access patterns, workflow performance can be greatly influenced by the performance of networks and storage devices.
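To illustrate the kind of end-to-end performance data such analysis relies on, the sketch below wraps a single workflow task and records its wall-clock time and the volume of file data it reads and writes. The record layout and field names are invented for this example and are not Panorama 360's actual schema.

```python
# Illustrative only: capture simple per-task performance data (duration, bytes in/out)
# as one JSON record per task; the schema here is invented, not the Panorama 360 format.
import json, os, subprocess, time

def run_and_measure(cmd, inputs, outputs, record_file="task-perf.jsonl"):
    bytes_in = sum(os.path.getsize(f) for f in inputs)
    start = time.time()
    subprocess.run(cmd, check=True)            # run the workflow task itself
    duration = time.time() - start
    bytes_out = sum(os.path.getsize(f) for f in outputs)
    record = {
        "cmd": cmd,
        "duration_s": round(duration, 3),
        "bytes_read": bytes_in,
        "bytes_written": bytes_out,
        "read_mb_per_s": round(bytes_in / duration / 1e6, 2) if duration > 0 else None,
    }
    with open(record_file, "a") as f:           # append one record per task execution
        f.write(json.dumps(record) + "\n")
    return record
```

Aggregating records like these across all tasks, and correlating them with network and storage monitoring, is in essence what end-to-end workflow performance analysis requires.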

Learn more

Funding Agency: DOE

 

Secure and Resilient Architecture: Scientific Workflow Integrity with Pegasus (SWIP)

The Scientific Workflow Integrity with Pegasus project strengthens cybersecurity controls in the Pegasus Workflow Management System in order to provide assurances about the integrity of computational scientific methods. These strengthened controls enhance both Pegasus’ handling of science data and its orchestration of software-defined networks and infrastructure. The result is increased trust in computational science and greater assurance that the science can be reproduced, because scientists can validate that data has not changed since a workflow completed and that the results from multiple workflows are consistent.
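The core idea, checksum-based integrity validation of workflow data, can be sketched as follows; the manifest layout and file names are invented for this example and are not SWIP's actual implementation.

```python
# Illustrative sketch of checksum-based integrity validation; the manifest
# format is invented for this example and is not SWIP's actual mechanism.
import hashlib, json
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record_manifest(outputs, manifest="integrity-manifest.json"):
    """Record checksums of workflow outputs when the workflow completes."""
    Path(manifest).write_text(
        json.dumps({str(p): sha256(Path(p)) for p in outputs}, indent=2))

def verify_manifest(manifest="integrity-manifest.json"):
    """Later, confirm that no output has changed since the workflow completed."""
    recorded = json.loads(Path(manifest).read_text())
    return {p: sha256(Path(p)) == digest for p, digest in recorded.items()}
```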

Learn more

Funding Agency: NSF

 

In Situ Data Analytics for Next Generation Molecular Dynamics Workflows

Molecular dynamics simulations, which study the classical time evolution of a molecular system at atomic resolution, are widely used in chemistry, materials science, molecular biology, and drug design; they are among the most common simulations run on supercomputers. Next-generation supercomputers will have dramatically higher performance than current systems, generating more data that needs to be analyzed. The coordination of data generation and analysis cannot rely on manual, centralized approaches as it does now. This project aims to transform the centralized nature of molecular dynamics analysis into a distributed approach performed predominantly in situ. Specifically, the effort combines machine learning and data analytics approaches, workflow management methods, and high-performance computing techniques to analyze molecular dynamics data as it is generated, save to disk only what is really needed for future analysis, and annotate molecular dynamics trajectories to drive the next steps in increasingly complex simulation workflows.
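To make the in situ idea concrete, the toy sketch below analyzes each trajectory frame as it is produced, persists only the frames that satisfy an analysis criterion, and writes a lightweight annotation for every step; the deviation-based criterion and file layout are invented purely for illustration.

```python
# Toy illustration of in situ trajectory analysis: examine each frame as it is
# generated, keep only "interesting" frames, and log an annotation per step.
# The deviation criterion and file layout are invented for this example.
import json
import numpy as np

def in_situ_filter(frame_stream, reference, threshold=2.0,
                   kept_file="kept_frames.npy", log_file="annotations.jsonl"):
    kept = []
    with open(log_file, "w") as log:
        for step, coords in frame_stream:          # coords: (n_atoms, 3) array
            # Simple per-frame metric: mean displacement from a reference structure.
            deviation = float(np.sqrt(((coords - reference) ** 2).sum(axis=1).mean()))
            interesting = deviation > threshold
            if interesting:
                kept.append(coords)                 # save to disk only what is needed
            log.write(json.dumps({"step": step,
                                  "deviation": round(deviation, 3),
                                  "kept": interesting}) + "\n")
    np.save(kept_file, np.array(kept))
```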

Learn more

Funding Agency: NSF

 

Model Integration through Knowledge-Rich Data and Process Composition (MINT)

Major societal and environmental challenges require forecasting how natural processes and human activities affect one another. There are many areas of the globe where climate affects water resources and therefore food availability, with major economic and social implications. Today, such analyses require significant effort to integrate highly heterogeneous models from separate disciplines, including geosciences, agriculture, economics, and social sciences. Model integration requires resolving semantic, spatio-temporal, and execution mismatches, a process that is largely done by hand today and may take more than two years. The Model INTegration (MINT) project will develop a modeling environment that significantly reduces the time needed to develop new integrated models while ensuring their utility and accuracy. Research topics to be addressed include: 1) new principle-based semi-automatic ontology generation tools for modeling variables, used to ground the analytic graphs that describe models and data; 2) a novel workflow compiler using abductive reasoning to hypothesize new models and data transformation steps; 3) a new data discovery and integration framework that finds new sources of data, learns to extract information from both online sources and remote sensing data, and transforms the data into the format required by the models; 4) a new methodology for spatio-temporal scale selection; 5) new knowledge-guided machine learning algorithms for model parameterization to improve accuracy; 6) a novel framework for multi-modal scalable workflow execution; and 7) novel composable agroeconomic models.

Learn more

Funding Agency: DARPA

 


Population Architecture using Genomics and Epidemiology (PAGE)

Over recent years, genome-wide association studies (GWAS) have allowed researchers to uncover hundreds of genetic variants associated with common diseases. However, the discovery of genetic variants through GWAS research represents just the first step in the challenging process of piecing together the complex biological picture of common diseases. To help speed the process, the National Human Genome Research Institute is supporting new research in existing large epidemiology studies, all with a rich range of measures of health and potential disease, and many with long-term follow-up. The focus of the new research is on how genetic variants initially identified through GWAS research are related to a person’s biological and physical characteristics, such as weight, cholesterol levels, blood sugar levels, or bone density. Scientists will also examine how non-genetic factors, such as diet, medications, and smoking, may interact with genetic factors or each other to influence health outcomes.

Learn more

Funding Agency: NIH

 


Center for Collaborative Genetic Studies on Mental Disorders (CGSMD)

The Center for Collaborative Genetic Studies on Mental Disorders is a collaboration of Rutgers University’s RUCDR, Washington University in St. Louis, and the University of Southern California’s Information Sciences Institute, funded by a grant from the National Institute of Mental Health. The Center produces, stores, and distributes the clinical data and biomaterials (DNA samples and cell lines) available in the NIMH Human Genetics Initiative. The Center creates and distributes computational tools that support investigation and analysis of the clinical data. In addition, the Center creates tools that enable researchers to determine which samples or data might be of use to them, so that they may request access from NIMH.

Learn more

Funding Agency: NIH

 

 

SimCenter: Center for Computational Modeling and Simulation

The SimCenter will provide modeling and simulation tools using a new open-source framework that: (1) addresses various natural hazards, such as windstorms, storm surge, tsunamis, and earthquakes; (2) tackles complex scientific questions of concern to disciplines involved in natural hazards research, including earth sciences, geotechnical and structural engineering, architecture, urban planning, risk management, social sciences, public policy, and finance; (3) utilizes machine learning to facilitate and improve modeling and simulation using data obtained from experimental tests, field investigations, and previous simulations; (4) quantifies the uncertainties associated with simulation results; (5) utilizes high-performance parallel computing, data assimilation, and related capabilities to easily combine software applications into workflows of unprecedented sophistication and complexity; (6) extends and refines software tools for carrying out performance-based engineering evaluations and supporting decisions that enhance the resilience of communities susceptible to multiple natural hazards; and (7) utilizes existing applications that already provide many of the pieces of the desired computational workflows.

Learn more

Funding Agency: NSF

 

XSEDE: Integrating, Enabling and Enhancing National Cyberinfrastructure with Expanding Community Involvement

Scientists, engineers, social scientists, and humanists around the world – many of them at colleges and universities – use advanced digital resources and services every day. Things like supercomputers, collections of data, and new tools are critical to the success of those researchers, who use them to make our lives healthier, safer, and better. XSEDE is an NSF-funded virtual organization that integrates and coordinates the sharing of advanced digital services – including supercomputers and high-end visualization and data analysis resources – with researchers nationally to support science. These digital services give users seamless access to NSF’s high-performance computing and data resources. XSEDE’s integrated, comprehensive suite of advanced digital services, combined with other high-end facilities and campus-based resources, serves as the foundation for a national cyberinfrastructure ecosystem.

Learn more

Funding Agency: NSF

 

Open Science Grid

The OSG provides common services and support for resource providers and scientific institutions using a distributed fabric of high-throughput computational services. The OSG does not own resources but provides software and services to users and resource providers alike to enable the opportunistic usage and sharing of resources. The OSG is jointly funded by the Department of Energy and the National Science Foundation. It is primarily used as a high-throughput grid, where scientific problems are solved by breaking them down into a very large number of individual jobs that can run independently.
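The high-throughput pattern can be sketched as follows: a large study is decomposed into many small, independent jobs, each defined only by its own parameters. The parameter names and job-list format below are hypothetical and used only to illustrate the idea.

```python
# Hypothetical illustration of the high-throughput pattern: one large study
# split into many independent jobs, each described only by its own parameters.
import itertools, json

seeds    = range(1000)            # e.g., 1000 independent Monte Carlo runs
energies = [10, 50, 100, 500]     # hypothetical parameter sweep

jobs = [
    {"id": i, "args": {"seed": seed, "energy_gev": energy}}
    for i, (seed, energy) in enumerate(itertools.product(seeds, energies))
]

# Each entry can be submitted as a separate job because no job depends on
# the output of any other; this is what lets them run opportunistically.
with open("jobs.json", "w") as f:
    json.dump(jobs, f, indent=2)
print(f"Generated {len(jobs)} independent jobs")
```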

Learn more

Funding Agency: NSF and DOE

 

 


WRENCH: Workflow Management System Simulation Workbench

Capitalizing on recent advances in distributed application and platform simulation technology, WRENCH makes it possible to (1) quickly prototype workflows, WMS implementations, and decision-making algorithms; and (2) evaluate and compare alternative options scalably and accurately for arbitrary, and often hypothetical, experimental scenarios. The project will define a generic and foundational software architecture that is informed by current state-of-the-art WMS designs and planned future designs. The implementations of the components in this architecture, taken together, form a generic “scientific instrument” that can be used by workflow users, developers, and researchers. This scientific instrument will be instantiated for several real-world WMSs and used for a range of real-world workflow applications.

Learn more | Source Code

Funding Agency: NSF

 

 
