DC-Area Anonymity, Privacy, and Security Seminar
Winter 2019 Seminar
Monday, February 25th, 2019
1:00 p.m. - 4:30 p.m.
Lunch at 12 p.m. at NuVegan Café (8150 Baltimore Ave)
Location: AV Williams Building, Room 4172
University of Maryland, College Park
Host: Jonathan Katz
1:00 p.m. - 1:25 p.m.
Speaker: Christine Task (NIST PSCR De-ID Challenge Technical Lead, Knexus Research)
Title: Designing a National Challenge for Differential Privacy
Abstract: The National Institute of Standards and Technology (NIST) is currently sponsoring the first national challenge event in differential privacy. The NIST Differentially Private Synthetic Data Challenge began in the summer of 2018 with a concept-building phase, in which contestants submitted concept papers proposing a mechanism to enable the protection of personally identifiable information while maintaining a dataset's utility for analysis. The second phase consists of a sequence of empirical matches throughout fiscal year 2019, in which participants with implemented systems compete to produce high-quality synthetic data from real data sets. The third and final match begins March 10th (registration is still open). The NIST challenge has enabled exciting progress on several important open research problems, including both the development of algorithms to generate synthetic data and the development of metrics to evaluate synthetic data. This talk will cover some of the unique challenges encountered in designing an empirically scored challenge that included a theoretical proof requirement, as well as some of the long-range expected outcomes.
1:25 p.m. - 1:50 p.m.
Speaker: Bargav Jayaraman (University of Virginia)
Title: Differential Privacy for Machine Learning
Abstract: Differential privacy has been widely accepted as a standard privacy notion for performing analytics over sensitive data. Since its inception, it has been adopted by the machine learning community for performing privacy-preserving machine learning. There has been a proliferation of works in this space, ranging from simple empirical risk minimization (such as regularized logistic regression) to more complex non-convex optimization problems (such as deep learning). These works can be further categorized based on the learning scenario they consider — from binary classification in low-dimensional settings to multi-class classification in high-dimensional or distributed settings. While it is well understood that a low privacy budget guarantees stronger privacy at the cost of utility, and conversely a higher budget sacrifices privacy for better utility, it is still an open question what concrete value of the privacy budget should be used in practice. As a consequence, the above works use arbitrary budget values to justify their model utility — ranging from small fractions (for ERM algorithms) to several thousands or even millions (for deep learning).
Recent efforts have gone into "relaxing" the stringent definition of differential privacy to drastically reduce the privacy budget while maintaining good model utility, which has made private learning of complex tasks more feasible. In this talk, I will discuss how these "relaxations" of the differential privacy notion can lead to unintended privacy leakage. This leakage can be quantified by measuring the vulnerability of private models against membership and attribute inference attacks. Finally, I will conclude the talk with open questions and future research directions in the field.
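To make the privacy-budget trade-off discussed above concrete, here is a minimal sketch of the standard Laplace mechanism for pure epsilon-differential privacy (not any specific mechanism from the talk): the noise scale is sensitivity/epsilon, so a smaller budget means more noise and lower utility.

```python
import math
import random

def laplace_noise(scale):
    """Sample from Laplace(0, scale) by inverse-transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def laplace_mechanism(true_value, sensitivity, epsilon):
    """epsilon-DP release of a numeric query: the noise scale
    sensitivity/epsilon grows as the privacy budget epsilon shrinks."""
    return true_value + laplace_noise(sensitivity / epsilon)

# A counting query has sensitivity 1: adding or removing one person
# changes the count by at most 1.
true_count = 1000
for eps in (0.1, 1.0, 10.0):
    print(eps, laplace_mechanism(true_count, 1.0, eps))
```

With epsilon = 0.1 the released count is typically off by tens, while with epsilon = 10 it is usually within a fraction of a unit — the utility cost of a tighter budget is visible directly in the noise scale.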
1:50 p.m. - 2:20 p.m.
2:20 p.m. - 2:45 p.m.
Speaker: Paul Syverson (U.S. Naval Research Laboratory)
Title: Self-Authenticating Subdomains
Abstract: We present self-authenticating subdomains, which embed public keys in domain names, effectively weaving security into the fabric of the Web. Self-authenticating subdomains can be used to preclude certificate hijack, providing site owners more control over the security of their sites than other existing mechanisms. We present an implementation of our self-authenticating subdomains and a corresponding browser extension that validates connections to them. Use of this extension and deployment of self-authenticating subdomains on servers are both backwards compatible with existing browsers and authentication infrastructure: TLS handshakes, certificate issuance guidelines, and Certificate Transparency logging and auditing. Similarly, connecting to self-authenticating subdomains in a browser that does not understand them does not affect existing infrastructure protections of those connections.
The public keys we embed in domain names are in the format of Tor onion service keys, but the client can be ignorant of Tor and need not direct traffic over any onion routing network to obtain our protections. Our extension also works in Tor Browser with .onion alternative services, which, as currently deployed, undermine usable security and assurance in site authentication. We describe how our extension improves usable security and provides hijack-resistant authentication for these alternative services.
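The abstract notes that the embedded keys use the Tor onion service key format. As one concrete illustration (not necessarily the exact scheme in the talk), here is how Tor's v3 onion addressing packs a 32-byte ed25519 public key into a DNS-safe label, per Tor's rend-spec-v3; the resulting 56-character string fits within DNS's 63-character label limit and could serve as a self-authenticating subdomain.

```python
import base64
import hashlib
import os

def onion_v3_label(pubkey: bytes) -> str:
    """Encode a 32-byte ed25519 public key in the v3 onion-address
    format: base32(pubkey || checksum || version), per rend-spec-v3.
    The checksum binds the key bytes to the version byte."""
    assert len(pubkey) == 32
    version = b"\x03"
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
    # 35 bytes encode to exactly 56 base32 characters (no padding).
    return base64.b32encode(pubkey + checksum + version).decode().lower()

pubkey = os.urandom(32)  # stand-in for a real ed25519 public key
print(onion_v3_label(pubkey) + ".example.com")  # hypothetical parent domain
```

A validating client (such as the browser extension described above) can decode the label, recompute the checksum, and then verify that the server proves possession of the corresponding private key, independent of the certificate authority system.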
2:45 p.m. - 3:10 p.m.
Speaker: Jonathan Katz (University of Maryland, College Park)
Title: How to Hash: Efficient and Secure Multiparty Computation from Fixed-Key Block Ciphers
Abstract: Many implementations of secure computation use fixed-key AES; this results in substantial performance benefits due to hardware support for AES and the ability to avoid recomputing the AES key schedule. Surveying these implementations, however, we find that most utilize AES in a heuristic fashion; in the best case this leaves a gap in the security proof, but in many cases we show it allows for explicit attacks.
Motivated by this unsatisfactory state of affairs, we initiate a comprehensive study of how to use fixed-key block ciphers for secure computation — in particular for OT extension and circuit garbling — efficiently and securely. Our results provide end-to-end security proofs for implementations of secure-computation protocols based on fixed-key block ciphers (modeled as random permutations). Perhaps surprisingly, at the same time our work also results in noticeable performance improvements over the state-of-the-art.
Work by Chun Guo, Jonathan Katz, Xiao Wang, and Yu Yu.
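To illustrate the general shape of hashing from a fixed-key permutation (not the paper's actual constructions), the sketch below builds a toy 128-bit permutation from a 4-round Feistel network and applies a Matyas-Meyer-Oseas-style feed-forward, H(x) = pi(x) XOR x. In real implementations the permutation is fixed-key AES; the toy Feistel here is only a stand-in and is not secure.

```python
import hashlib

BLOCK = 16  # bytes; mimics a 128-bit block cipher

def _round(half: bytes, i: int) -> bytes:
    # Toy Feistel round function: hash of the half-block and round
    # index, truncated to half a block. The "key" is a fixed constant.
    return hashlib.sha256(b"fixed-key" + bytes([i]) + half).digest()[:BLOCK // 2]

def pi(x: bytes) -> bytes:
    """A toy fixed-key permutation on 128-bit blocks (4-round Feistel),
    standing in for fixed-key AES. A Feistel network is invertible by
    construction, so this really is a permutation."""
    assert len(x) == BLOCK
    left, right = x[:BLOCK // 2], x[BLOCK // 2:]
    for i in range(4):
        left, right = right, bytes(a ^ b for a, b in zip(left, _round(right, i)))
    return left + right

def mmo_hash(x: bytes) -> bytes:
    """MMO-style compression: H(x) = pi(x) XOR x. Without the
    feed-forward XOR, pi alone is invertible and trivially not one-way;
    the XOR is what destroys invertibility."""
    return bytes(a ^ b for a, b in zip(pi(x), x))
```

The gap the abstract refers to lies in how such constructions are composed inside OT extension and garbling: using pi heuristically (e.g., dropping the feed-forward or reusing the permutation across roles) can admit explicit attacks, which is what motivates proving security with pi modeled as a random permutation.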
3:10 p.m. - 3:40 p.m.
3:40 p.m. - 4:05 p.m.
Speaker: Kelsey Fulton (University of Maryland, College Park)
Title: Understanding security mistakes developers make: Qualitative analysis from Build It, Break It, Fix It
Abstract: Secure development is a challenging task that requires developers to consider several possible threats and build appropriate mitigations. Prior work has studied the types of vulnerabilities that developers are most likely to introduce. However, this work has only provided limited insights into the reasons for these vulnerabilities. In this paper, we build on prior work through an in-depth investigation of 76 programs developed in a quasi-controlled environment designed to mimic real-world constraints — correctness, performance, and security. In addition to writing secure code, participants were also asked to search for vulnerabilities in other teams' programs; in total, teams submitted 866 exploits related to 166 vulnerabilities. We analyzed and characterized each submitted program and vulnerability according to a number of variables to determine the types of vulnerabilities developers were most likely to introduce and find and the factors that contributed to each. Based on our results, we provide recommendations for automation and process to improve secure development.
4:05 p.m. - 4:30 p.m.
Speaker: Daniel Votipka (University of Maryland, College Park)
Title: An Observational Investigation of Reverse Engineers' Processes and Mental Models
Abstract: Reverse engineering is a complex task essential to several software security jobs like vulnerability discovery and malware analysis. While traditional program comprehension tasks (e.g., program maintenance or debugging) have been thoroughly studied, reverse engineering diverges from these tasks as reverse engineers do not have access to developers, source code, comments, or internal documentation. Further, reverse engineers often have to overcome countermeasures employed by the developer to make the task harder (e.g., symbol stripping, packing, obfuscation). Significant research effort has gone into providing program analysis tools to support reverse engineers. However, little work has been done to understand the way they think and analyze programs, leading to a lack of adoption of these tools among practitioners. This talk reports on a first step toward better understanding the reverse engineer's process and mental models and provides directions for improving program analysis tools to better fit their users. We present the initial results of a semi-structured, observational interview study of reverse engineers (n=16). Each interview investigated the questions they asked while probing the program, how they answered these questions, and the decisions made throughout. Our initial observations suggest that reverse engineers rely on a variety of reference points in both the program text and structure as well as its dynamic behavior to build hypotheses about the program's function and identify points of interest for future exploration. In most cases, our reverse engineers used a mix of static and dynamic analysis — mostly manually — to verify these hypotheses. Throughout, they rely on intuition built up over past experience. From these observations, we provide recommendations for user interface and program analysis improvements to support the reverse engineer.
Driving: The closest visitor parking is in the Regents Drive Garage or the Xfinity Visitor Lot (both about a 5-minute walk from the A.V. Williams Building). You need to pay at the meter. See the campus map for the locations of those lots.
Metro: Take the Metro to the College Park stop on the Green Line, then take the free College Park Metro campus shuttle (#104) and get off at the Glenn L. Martin Wind Tunnel stop. A detailed schedule and map for shuttle #104 are available online. Several public buses also bring you quite close to the A.V. Williams Building, and they all take SmarTrip cards.