NSF Award: Secure and Resilient Architecture: Scientific Workflow Integrity with Pegasus (SWIP)


Scientists use computer systems to analyze and store their scientific data, sometimes in a complex process that spans multiple machines. This process can be tedious and error-prone, which has led to the development of software known as "workflow management systems". A workflow management system allows scientists to describe their process in a human-friendly way; the software then handles the details of the processing, taking care of tedious and repetitive steps and handling errors. One popular workflow management system is Pegasus, which scientists have used over the past three years to run more than 700,000 workflows in domains including astronomy, bioinformatics, earthquake science, gravitational wave physics, ocean science, and neuroscience. The "Scientific Workflow Integrity with Pegasus" project enhances Pegasus with additional security features. The scientist's description of the desired work is protected from tampering, and the data processed by Pegasus is checked to ensure it has not been accidentally or maliciously modified. This tamper protection is achieved with cryptographic techniques that ensure data integrity. These changes allow scientists, and our society, to be more confident in scientific findings based on collected data.
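As a rough illustration of the kind of integrity check described above, the following Python sketch records a SHA-256 digest of a data file and later compares it against a freshly computed one. It is only a minimal sketch under stated assumptions: the function names (sha256_of_file, is_unmodified, demo) and the file name are hypothetical, and nothing here is taken from Pegasus itself or claims to show how the project implements its checks.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path, recorded_digest):
    """Compare the file's current digest with a previously recorded one."""
    return hmac.compare_digest(sha256_of_file(path), recorded_digest)


def demo():
    """Hypothetical example: record a digest, then detect a later change."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "result.dat")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"measurement: 42\n")

        recorded = sha256_of_file(path)        # digest stored when data is produced
        before = is_unmodified(path, recorded)  # file untouched so far

        with open(path, "ab") as handle:       # simulate accidental or malicious change
            handle.write(b"tampered\n")
        after = is_unmodified(path, recorded)

        return before, after
```

A bare digest like this detects changes only if the recorded digest is itself kept somewhere an attacker cannot alter; protecting against deliberate tampering would additionally need a keyed or signed record, which this sketch does not attempt.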

The Scientific Workflow Integrity with Pegasus project strengthens cybersecurity controls in the Pegasus Workflow Management System in order to provide assurances about the integrity of computational scientific methods. These strengthened controls enhance both Pegasus' handling of science data and its orchestration of software-defined networks and infrastructure. The result is increased trust in computational science and greater assurance that the science can be reproduced: scientists can validate that data has not been changed since a workflow completed and that the results from multiple workflows are consistent. The project focuses on Pegasus because of its popularity in the scientific community as a means of automating computation and data management. For example, LIGO, the NSF-funded gravitational-wave physics project, recently used the Pegasus Workflow Management System to structure and execute the analyses that confirmed and quantified its historic detection of a gravitational wave, bearing out the prediction made by Einstein 100 years ago. The project has established collaborations with LIGO and with additional key NSF infrastructure providers and science projects to ensure that its results are broadly applied.


Investigators: Von Welch (PI), Steven Myers (Co-PI), Ewa Deelman (Co-PI), Ilya Baldin (Co-PI)


Start Date: September 1, 2016
End Date: August 31, 2019