Government agencies, healthcare organizations, and research institutions often need to combine information from different databases to solve important problems like tracking disease outbreaks, improving public services, or conducting medical research. However, these organizations currently cannot share their data because it contains sensitive personal information protected by privacy laws and regulations. This creates a significant barrier to research and policy development that could otherwise improve public health, enhance government services, and advance scientific discovery. This project addresses this challenge by developing a secure system that allows organizations to answer important questions using combined datasets without actually sharing the sensitive information itself. This work serves the national interest by enabling evidence-based policymaking through secure data collaboration, advancing public health research while protecting individual privacy, strengthening government efficiency through improved data-driven decision making, and supporting American leadership in privacy-preserving data science technologies.

This project develops the Trusted Integration Data Exchange System, a platform that enables multiple organizations to perform complex data analysis on jointly held sensitive datasets while maintaining compliance with data sharing policies and federal, state, and local regulations. The research activities include developing formal languages that capture regulations, dataset structures, and study objectives; specifications written in these languages are then processed using novel cryptographic database operators. The project introduces two new models of multiparty computation. The first is helper-assisted two-party computation, in which a helper that relies on confidential computing for integrity serves as an additional party and simplifies the privacy-preserving operators for data integration. The second is a novel paradigm for performing secure multiparty computations when regulations prevent data sharing even in encrypted form.
The Trusted Integration Data Exchange System will include an automated compliance checker that confirms whether an integration study is legally viable. Additionally, the project will develop user-friendly tools, including development environments and debugging utilities, to help data scientists and engineers write policies and queries, diagnose potential platform issues, and debug study results. The platform will be evaluated across multiple domains, including healthcare, education, and government services, and students, scientists, and engineers will receive comprehensive training in privacy-preserving data analysis techniques. This award reflects NSF’s statutory mission and has been deemed worthy of support through evaluation using the Foundation’s intellectual merit and broader impacts review criteria.

Faculty:

  • Sebastian Angel
  • Brett Falk
  • Andreas Haeberlen
  • Ryan Marcus
  • Pratyush Mishra