Project Overview

DPComp: Realistic Data Mining Under Differential Privacy

Faculty Sponsor

Michael Hay (mhay@colgate.edu)

Department(s)

Computer Science

Abstract

Privacy concerns are a major obstacle to deriving the scientific insights now possible from increasing data collection and powerful new analysis techniques. The goal of privacy technology is to permit data mining and analysis to be carried out over a collection of sensitive records donated by individuals. Ideally, individuals receive a guarantee that the analysis does not lead to harmful disclosures about them. At the same time, data miners and scientists hope to study the data with little disruption to their methods and results. Differential privacy has emerged as an important standard for protecting individuals' sensitive information. It guarantees that the output the analyst receives is statistically indistinguishable from the output the analyst would have received if any one individual had opted out of the collection. Differentially private algorithms have been developed to support many common data mining tasks. However, many of these have yet to see widespread adoption in real-world systems. The goal of this research project is to work towards closing the gap between theory and practice. This summer project is part of a larger, multi-year NSF-sponsored project. A website describing that effort can be found at https://www.dpcomp.org/. Interested students are strongly encouraged to review this website before applying.
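For applicants new to the topic, the guarantee described in the abstract can be illustrated with a minimal sketch of the Laplace mechanism applied to a counting query. This is a standard textbook technique, not necessarily what the project itself uses. The function names (laplace_noise, private_count), the sample data, and the choice of epsilon are all hypothetical, chosen only for illustration. Because one person opting out changes a count by at most 1, adding Laplace noise with scale 1/epsilon makes the two outputs hard to tell apart.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials.

    The difference of two independent Exponential variables with rate
    1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Assumption: removing one individual's record changes the true count
    by at most 1 (sensitivity 1), so the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical sensitive records: ages of survey participants.
    ages = [23, 35, 41, 29, 52, 61, 38]
    noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0)
    print(f"Noisy count of participants aged 40+: {noisy:.2f}")
```

The sketch is for intuition only. A deployed system would also need to track the total privacy budget across queries and to guard against known weaknesses of floating-point noise sampling.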

Student Qualifications

Responsibilities: Duties will include software development and data analysis. Research assistants are expected to be flexible and willing to learn new skills as needed. The specific job description will be crafted based on the current needs of the project and the skill set of the student.

Student qualifications: Sufficient background in computer science is required (the equivalent of COSC 101 and 102; more advanced courses such as COSC 301 and 302 are preferred). Strong computer programming skills (in any language) are desired. These can come from course work or from projects completed outside of courses, such as internships or individual projects. Additional beneficial skills include data analysis, statistics, and prior research experience.

Number of Student Researchers

2 students

Project Length

8 weeks


Applications open on 01/15/2017 and close on 02/07/2017


If you have questions, please contact Karyn Belanger (kgbelanger@colgate.edu).