{"646312":{"#nid":"646312","#data":{"type":"event","title":"PhD Defense by Wanrong Zhang","body":[{"value":"\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EThesis\u003C\/strong\u003E\u0026nbsp;\u003Cstrong\u003ETitle:\u003C\/strong\u003E\u0026nbsp;Privacy-preserving Statistical Tools: Differential Privacy and Beyond\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EAdvisors:\u003C\/strong\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Rachel Cummings, Industrial Engineering and Operations Research, Columbia University\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Yajun Mei, School of Industrial and Systems Engineering, Georgia Tech\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ECommittee members:\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Mark Davenport, School of Electrical and Computer Engineering, Georgia Tech\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. Sara Krehbiel, Department of Mathematics and Computer Science, Santa Clara University\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDr. 
Jeff Wu, School of Industrial and Systems Engineering, Georgia Tech\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EDate and Time:\u0026nbsp;\u003C\/strong\u003E1:00 - 3:00 pm ET, Wednesday, April 21, 2021\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EMeeting URL:\u003C\/strong\u003E\u0026nbsp;\u003Ca href=\u0022https:\/\/bluejeans.com\/960559571\u0022\u003Ehttps:\/\/bluejeans.com\/960559571\u003C\/a\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EMeeting ID:\u003C\/strong\u003E\u0026nbsp;960 559 571 (BlueJeans)\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EAbstract:\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EDifferential privacy has emerged as the de facto gold standard for protecting the privacy of individuals when processing sensitive data, because of its powerful formal guarantees. Several companies, including Google, Apple, and Microsoft, have deployed differentially private tools, but barriers remain between such systems and full-featured privacy-preserving data analytics. This thesis focuses on two main challenges: private online decision-making and privacy of dataset-level properties.\u0026nbsp;\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EMost existing differentially private tools are designed for static databases with non-adaptive analysis. However, modern data analysts interact with online datasets adaptively. The first part of this thesis studies private algorithms for two classical statistical online decision-making problems.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EIn Chapter 2, we study the statistical problem of change-point detection through the lens of differential privacy. 
This problem appears in many important practical settings involving personal data, such as identifying disease outbreaks based on hospital records, or IoT devices detecting activity within a home. We give the first private algorithms for both online and offline change-point detection. We prove a differential privacy guarantee and accuracy guarantees for our algorithms. We also give the first finite-sample accuracy guarantees for the standard non-private MLE. Additionally, we provide empirical validation showing that our algorithms perform well in practice.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EIn Chapter 3, we study False Discovery Rate (FDR) control in multiple hypothesis testing under the constraint of differential privacy. Unlike previous work in this direction, we focus on the online setting, meaning that a decision about each hypothesis must be made immediately after the test is performed, rather than waiting for the output of all tests as in the offline setting. We provide new private algorithms based on state-of-the-art results in non-private online FDR control. Our algorithms have strong provable guarantees for privacy and statistical performance as measured by FDR and power. We also provide experimental results demonstrating the efficacy of our algorithms in a variety of data environments.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESecond, classical differential privacy does not protect sensitive global properties of a dataset, such as the distribution of race and gender among users in the training set or proprietary information. 
We demonstrate a new dataset-level privacy vulnerability and introduce new privacy notions beyond individual privacy in the second part of this thesis.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EIn Chapter 4, we study the leakage of dataset properties at the population level. Our primary focus in this work is on attacks that infer dataset properties in the centralized multi-party machine learning setting, where the model is securely trained on several parties\u0026#39; data, and parties only have black-box access to the final model. We propose an effective attack strategy that requires only a few hundred queries to the model and relies on a simple attack architecture that even a computationally bounded attacker can use. Our attack succeeds on different types of datasets, including tabular, text, and graph data, and leakage occurs even if the sensitive attribute is not included in the training data and has a low correlation with other attributes and the target variable.\u0026nbsp;\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EIn Chapter 5, we depart from individual privacy to initiate the study of attribute privacy, where a data owner is concerned about revealing sensitive properties of a whole dataset during analysis. We propose definitions to capture attribute privacy in two relevant cases where global attributes may need to be protected: (1) properties of a specific dataset and (2) parameters of the underlying distribution from which the dataset is sampled. 
We also provide two efficient mechanisms and one inefficient mechanism that satisfy attribute privacy for these settings.\u0026nbsp;\u0026nbsp;\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Privacy-preserving Statistical Tools: Differential Privacy and Beyond"}],"uid":"27707","created_gmt":"2021-04-09 17:19:50","changed_gmt":"2021-04-09 17:19:50","author":"Tatianna Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2021-04-21T14:00:00-04:00","event_time_end":"2021-04-21T16:00:00-04:00","event_time_end_last":"2021-04-21T16:00:00-04:00","gmt_time_start":"2021-04-21 18:00:00","gmt_time_end":"2021-04-21 20:00:00","gmt_time_end_last":"2021-04-21 20:00:00","rrule":null,"timezone":"America\/New_York"},"extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"78771","name":"Public"},{"id":"174045","name":"Graduate students"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}