Mini-Workshop on New Directions in Robustness in ML

About the Series

The IDEAL workshop series brings in experts on topics related to the foundations of data science to present their perspectives and research on a common theme. This virtual workshop will feature five talks and discussion sessions.

Synopsis

As machine learning systems are being deployed in almost every aspect of decision-making, it is vital for them to be reliable and secure against adversarial corruptions and perturbations of various kinds. This workshop will explore newer notions of robustness and the different challenges that arise in designing reliable ML algorithms. Topics include test-time robustness, adversarial perturbations, and distribution shifts, as well as connections between robustness and other areas. The workshop speakers are Aleksander Madry, Gautam Kamath, Kamalika Chaudhuri, Pranjal Awasthi, and Sebastien Bubeck.

The workshop is part of the Fall 2021 Special Quarter on Robustness in High-dimensional Statistics and Machine Learning, co-organized by Professors Yu Cheng (University of Illinois at Chicago), Chao Gao (University of Chicago), and Aravindan Vijayaraghavan (Northwestern University).

Logistics

  • Date: Tuesday, November 16, 2021, 11:00am-3:30pm CST (Chicago time)
  • Location: Virtual IDEAL (on Gather.Town); watch the full event here
  • Free Registration: Attendees must register to receive joining information. Login information for Gather.Town will be provided via email. (People who already indicated on the Fall 2021 special quarter participation form that they would attend this workshop do not need to fill in this registration form.)

Schedule

Titles and Abstracts

 
Speaker: Kamalika Chaudhuri
Title: Theoretically Understanding the Origins of Adversarial Robustness
Abstract: As machine learning is increasingly deployed, there is a need for reliable and robust methods that go beyond simple test accuracy. In this talk, we will discuss one such challenge: robustness to adversarial examples. These examples are small, imperceptible perturbations of legitimate test inputs that cause machine learning classifiers to misclassify. While recent work has proposed many attacks and defenses, exactly why adversarial examples arise remains a mystery. In this talk, we will take a closer, principled look at this question in the context of non-parametric methods, linear classifiers, and finally neural networks. We will show that in most cases, the problem lies in the training algorithms we use.
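Adversarial examples of the sort described above can be generated with surprisingly simple gradient-based attacks. Below is a minimal PyTorch sketch of one standard attack, the fast gradient sign method (FGSM); the model, loss, and perturbation budget eps are illustrative assumptions, not the speaker's setup.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.03):
    """Fast gradient sign method: an l_inf-bounded perturbation of x
    chosen in the direction that locally increases the loss the most."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step by eps in the sign of the gradient, then clamp back to
    # the valid pixel range so the input stays legitimate.
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Even for eps far below human perceptibility, a single such step often flips the prediction of an undefended classifier.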
 
Speaker: Pranjal Awasthi
Title: Finding Adversarially Robust Low Dimensional Representations
Abstract: Adversarial robustness measures the susceptibility of a machine learning algorithm to small perturbations made to the input, either at test time or at training time. Our current theoretical understanding of adversarial robustness is limited, and has mostly focused on supervised learning tasks.

In this talk, I will consider a natural extension of Principal Component Analysis (PCA) where the goal is to find a low dimensional subspace that represents the given data with minimum projection error, and that is in addition robust to small perturbations (measured in, say, the ℓ_∞ norm). Unlike PCA, which is solvable in polynomial time, our formulation is computationally intractable to optimize, as it captures the well-studied sparse PCA objective. I will present polynomial time algorithms that give bicriteria constant factor approximations in the worst case.

Our algorithmic techniques will also be robust to corruptions at training time. I will then discuss theoretical and practical applications of the main result.
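To make the objective concrete, here is a small NumPy sketch contrasting the classical PCA projection error with a Monte-Carlo estimate of its worst case under ℓ_∞-bounded perturbations. This is one natural reading of the robust objective, for illustration only; the function names are my own, and the talk's polynomial-time bicriteria algorithms are not shown here.

```python
import numpy as np

def pca_subspace(X, k):
    """Top-k right singular vectors of X: the best k-dimensional
    subspace for the classical (non-robust) projection error."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T  # d x k orthonormal basis

def projection_error(X, V):
    """Sum of squared distances from the rows of X to span(V)."""
    residual = X - (X @ V) @ V.T
    return float(np.sum(residual ** 2))

def robust_error_estimate(X, V, eps=0.1, trials=200, seed=0):
    """Monte-Carlo lower bound on the worst-case projection error over
    perturbations Delta with every entry bounded by eps (l_inf ball)."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        # Sign matrices are the extreme points of the l_inf ball.
        delta = eps * rng.choice([-1.0, 1.0], size=X.shape)
        worst = max(worst, projection_error(X + delta, V))
    return worst
```

The subspace that is best for the nominal error need not be best for the robust one, which is part of what makes the optimization problem hard.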
 
Speaker: Sebastien Bubeck
Title: A universal law of robustness via isoperimetry
Abstract: I will present a theorem that tentatively explains why robustness necessitates very large neural networks. Joint work with Mark Sellke.
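For context, the law can be stated roughly as follows (my informal paraphrase of Bubeck and Sellke, 2021, with constants suppressed):

```latex
% Universal law of robustness (informal).  For a broad class of
% p-parameter function families, fitting n generic points in d
% dimensions below the noise level forces a large Lipschitz constant:
\operatorname{Lip}(f) \;\gtrsim\; \sqrt{\frac{nd}{p}}.
% So an O(1)-Lipschitz (robust) interpolating model needs p on the
% order of nd parameters, a factor of d more than the p = n needed
% to merely fit the data.
```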
 
Speaker: Aleksander Madry
Title: ML Model Debugging: A Data Perspective
Abstract: Machine learning models tend to rely on an abundance of training data. Yet, understanding the underlying structure of this data—and models’ exact dependence on it—remains a challenge.

In this talk, we will present a framework for directly modeling predictions as functions of training data. This framework, given a dataset and a learning algorithm, pinpoints—at varying levels of granularity—the relationships between train and test point pairs through the lens of the corresponding model class. Even in its most basic version, our framework enables many applications, including discovering subpopulations, quantifying model brittleness via counterfactuals, and identifying train-test leakage.
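One concrete instantiation of this idea (a simplified sketch in the spirit of the datamodels line of work from Madry's group, not an exact reproduction of it) is to train many models on random subsets of the data and regress a fixed test point's output on the 0/1 mask of which training examples were included; the ridge penalty below stands in for the sparse regression the actual work uses.

```python
import numpy as np

def fit_datamodel(masks, outputs, lam=1.0):
    """Fit a linear surrogate mapping training-subset membership to a
    model's output on one fixed test point.

    masks:   (m, n) 0/1 array; row i marks the training subset of run i.
    outputs: (m,) array; e.g. the test point's margin after training
             on that subset.
    Returns one weight per training example, read as that example's
    influence on this test point.
    """
    n = masks.shape[1]
    # Ridge regression in closed form: (M^T M + lam I)^{-1} M^T y.
    A = masks.T @ masks + lam * np.eye(n)
    return np.linalg.solve(A, masks.T @ outputs)
```

Large positive weights flag the training examples that drive the prediction, which is what enables applications such as discovering subpopulations or identifying train-test leakage.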

Speaker: Gautam Kamath
Title: Statistical Estimation with Differential Privacy
Abstract: Naively implemented, statistical procedures are prone to leaking information about their training data, which can be problematic if the data is sensitive. To address these issues, one can consider differential privacy, a rigorous notion of data privacy which is applicable to algorithms in a variety of statistical settings. In this talk, I will survey recent results in differentially private statistical estimation, offering a few vignettes showing that privacy constraints can raise novel challenges for even the most fundamental problems. Along the way, I’ll mention connections to tools and techniques in a number of fields, including modern advances in robust statistics.
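As a minimal example of the kind of vignette the abstract alludes to, here is a sketch of ε-differentially private mean estimation via the standard Laplace mechanism; the assumption that the data lie in a known range [lo, hi] must be supplied by the analyst, and relaxing such assumptions is exactly where the novel challenges arise.

```python
import numpy as np

def private_mean(x, eps, lo=0.0, hi=1.0, seed=None):
    """eps-differentially private mean of values assumed to lie in [lo, hi].

    Changing one record moves the clipped empirical mean by at most
    (hi - lo) / n, so Laplace noise of scale sensitivity / eps suffices.
    """
    rng = np.random.default_rng(seed)
    x = np.clip(np.asarray(x, dtype=float), lo, hi)  # enforce the promised bounds
    sensitivity = (hi - lo) / len(x)
    return x.mean() + rng.laplace(scale=sensitivity / eps)
```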