Large, diverse datasets hold tremendous promise, if only we can derive statistical insights from them. Often, however, these datasets are siloed and withheld because of privacy concerns. Differential privacy can mitigate these concerns: it provides a strong, attractive privacy guarantee that protects data owners from risks associated with the disclosure of their data. But this guarantee can be hard to achieve in practical settings while preserving the utility of the data. This project is developing foundational theory and practical tools for private data analysis that can be used by non-experts in a variety of applications, allowing important new analyses of datasets that may be distributed across multiple owners. The project also includes substantial outreach activities, in the form of course development and workshop organization, and will train PhD students to be future leaders in the development of privacy technologies. The project's main thrusts are:

  1. To extend the theory of differential privacy to work with relaxed guarantees, amenable to high-accuracy analyses when there are only a relatively small number of parties with limited ability to collude;
  2. To develop programming languages and automatic verification tools capable of automatically certifying the differential privacy properties of distributed systems, in which each party has only partial access to the data; and
  3. To develop tool chains to implement differentially private algorithms in distributed settings, using (among other technologies) secure multi-party computation as a computational substrate.
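As background for the thrusts above, the core building block of differential privacy is noise calibrated to a query's sensitivity. The sketch below (not code from this project; the function names `laplace_noise` and `private_count` are illustrative) shows the classic Laplace mechanism applied to a counting query:

```python
import math
import random

def laplace_noise(scale):
    # Sample from Laplace(0, scale) via the inverse CDF.
    u = random.random() - 0.5  # uniform in [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon):
    # A counting query has sensitivity 1: adding or removing one
    # record changes the count by at most 1. Laplace noise with
    # scale 1/epsilon therefore gives an epsilon-differentially
    # private answer.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller values of epsilon mean more noise and a stronger privacy guarantee; the project's thrusts study how to keep the added noise small (relaxed guarantees), verify that mechanisms like this are implemented correctly, and run them when the records are split across mutually distrusting parties.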

Read more about differential privacy in this article and a second article in the Penn News, and in our textbook on differential privacy!


Investigators:

  • Andreas Haeberlen (CIS)
  • Benjamin Pierce (CIS)
  • Aaron Roth (CIS)
  • Michael Kearns (CIS)


Funding: NSF, Sloan Foundation


PhD Students:

  • Matthew Joseph
  • Seth Neel
  • Edo Roth
  • Hengchu Zhang